The Lakehouse runtime catalog is a fully managed, serverless service that provides a single source of truth for your data lakehouse. It enables multiple engines, including Apache Spark, Apache Flink, and BigQuery, to share tables and metadata without copying files.
The Lakehouse runtime catalog supports storage access delegation (credential vending), which improves security by removing the need for direct Cloud Storage bucket access. It also integrates with Knowledge Catalog for unified governance, lineage, and data quality.
Key capabilities
As a component of Google Cloud Lakehouse, the Lakehouse runtime catalog provides several advantages for data management and analysis, including a serverless architecture, engine interoperability through open APIs, a unified user experience, and high-performance analytics, streaming, and AI when used with BigQuery. For more information about these benefits, see What is Google Cloud Lakehouse?
Supported engines
The Lakehouse runtime catalog is compatible with several query engines, including Apache Spark, Apache Flink, and Trino. The following table links to the documentation for each engine:
| Engine | Documentation |
|---|---|
| Apache Spark | Quickstart: Use with Spark |
| Apache Flink | Use with Apache Flink |
| Trino | Use with Trino |
Configuration options
You can configure the Lakehouse runtime catalog in one of two ways: with the Apache Iceberg REST catalog endpoint or with the custom Apache Iceberg catalog for BigQuery endpoint. The best option depends on your use case, as shown in the following table:
| Use case | Recommendation |
|---|---|
| New Lakehouse runtime catalog users who want their open source engine to access data in Cloud Storage and need interoperability with other engines, including BigQuery and AlloyDB for PostgreSQL. | Use the Apache Iceberg REST catalog endpoint. |
| Existing Lakehouse runtime catalog users who have existing tables in the custom Apache Iceberg catalog for BigQuery. | Continue using the custom Apache Iceberg catalog for BigQuery endpoint, but use the Apache Iceberg REST catalog for new workflows. Tables created with the custom Apache Iceberg catalog for BigQuery endpoint are visible through the Apache Iceberg REST catalog by using BigQuery catalog federation. |
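To illustrate the REST catalog option, a Spark session is typically pointed at an Iceberg REST catalog by using the standard Apache Iceberg catalog properties. The following is a minimal sketch; the catalog name (lakehouse), the endpoint URI, and the bucket path are placeholders, not documented values:

```properties
# Register an Iceberg catalog named "lakehouse" backed by a REST endpoint.
spark.sql.catalog.lakehouse=org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.lakehouse.type=rest
# Placeholder URI; substitute the Lakehouse runtime catalog REST endpoint.
spark.sql.catalog.lakehouse.uri=https://example.com/iceberg/v1/restcatalog
# Placeholder warehouse location in Cloud Storage.
spark.sql.catalog.lakehouse.warehouse=gs://YOUR_BUCKET/warehouse
```

These property names (`spark.sql.catalog.<name>`, `.type`, `.uri`, `.warehouse`) come from the open source Apache Iceberg Spark integration; consult the engine-specific quickstarts linked earlier for the exact endpoint URI and any authentication settings required by the service.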
Differences with Google Cloud Lakehouse metastore (classic)
The Lakehouse runtime catalog is the recommended metastore on Google Cloud, while Google Cloud Lakehouse metastore (classic) is considered a legacy feature.
The core differences between the Lakehouse runtime catalog and Google Cloud Lakehouse metastore (classic) include the following:
- The Lakehouse runtime catalog supports a direct integration with open source engines like Spark, which helps reduce redundancy when you store metadata and run jobs. Tables in the Lakehouse runtime catalog are directly accessible from multiple open source engines and BigQuery.
- The Lakehouse runtime catalog supports the Apache Iceberg REST catalog endpoint, while Google Cloud Lakehouse metastore (classic) does not.
Lakehouse runtime catalog limitations
The following limitations apply to tables in the Lakehouse runtime catalog:
Table management
- You can't create or modify Lakehouse Iceberg REST catalog tables with BigQuery data definition language (DDL) or data manipulation language (DML) statements. You can modify Lakehouse Iceberg REST catalog tables using the BigQuery API (with the bq command-line tool or client libraries), but doing so risks making changes that are incompatible with the external engine.
- Lakehouse runtime catalog tables don't support renaming operations or the ALTER TABLE ... RENAME TO Spark SQL statement.
- Lakehouse runtime catalog tables don't support clustering.
- Lakehouse runtime catalog tables don't support flexible column names.
- The Lakehouse runtime catalog doesn't support Apache Iceberg views.
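Because BigQuery DDL and DML statements can't create or modify these tables, table management is done through the open source engine instead. The following Spark SQL sketch assumes a session configured with the Iceberg REST catalog under the hypothetical catalog name lakehouse; the namespace and table names are placeholders:

```sql
-- Assumes a Spark session with an Iceberg REST catalog registered as `lakehouse`.
CREATE NAMESPACE IF NOT EXISTS lakehouse.sales;

CREATE TABLE lakehouse.sales.orders (
  order_id BIGINT,
  amount   DOUBLE,
  ts       TIMESTAMP
) USING iceberg;

INSERT INTO lakehouse.sales.orders VALUES (1, 9.99, current_timestamp());
```

After creation through the engine, the table's data remains readable from BigQuery through the catalog, as described earlier.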
Querying
- Query performance for Lakehouse runtime catalog tables from the BigQuery engine might be slower than querying data in standard BigQuery tables. In general, query performance is equivalent to reading the same data directly from Cloud Storage.
- A BigQuery dry run of a query that uses a Lakehouse runtime catalog table might report a lower bound of 0 bytes of data, even if rows are returned. This result occurs because the amount of data that is processed from the table can't be determined until the full query is run. Running the query incurs a cost for processing this data.
- You can't reference a Lakehouse runtime catalog table in a wildcard table query.
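The dry-run behavior described above can be observed from the bq command-line tool. The --dry_run flag is a standard bq flag; the dataset and table names below are hypothetical:

```
# Hypothetical table name. A dry run estimates bytes processed without
# running the query; for a Lakehouse runtime catalog table, the estimate
# might be reported as 0 bytes even though the query returns rows.
bq query --use_legacy_sql=false --dry_run \
  'SELECT COUNT(*) FROM mydataset.lakehouse_table'
```

Remember that actually running the query still incurs a processing cost, even when the dry run reports 0 bytes.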
API and metadata
- You can't use the tabledata.list method to retrieve data from Lakehouse runtime catalog tables. Instead, you can save query results to a BigQuery table, and then use the tabledata.list method on that table.
- Displaying table storage statistics for Lakehouse runtime catalog tables isn't supported.
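The tabledata.list workaround can be sketched with the bq command-line tool. The flags shown are standard bq flags, but the dataset and table names are hypothetical:

```
# Hypothetical names. Materialize the query results into a standard
# BigQuery table, then read rows from that table.
bq query --use_legacy_sql=false \
  --destination_table=mydataset.results \
  'SELECT * FROM mydataset.lakehouse_table LIMIT 100'

# Read the first 10 rows from the materialized results table.
bq head -n 10 mydataset.results
```

The same pattern applies through the client libraries: set a destination table on the query job, then list rows from the destination table rather than from the catalog table.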
Quotas and limits
- Lakehouse runtime catalog tables in BigQuery are subject to the same quotas and limits as standard tables.
What's next
- Understand the Apache Iceberg REST catalog endpoint.
- Understand the custom Apache Iceberg catalog for BigQuery endpoint.

