Your use case might require you to connect an external Iceberg REST Catalog (IRC) table to an existing Google Cloud Lakehouse table. Dataflow's job builder UI lets you build a pipeline that migrates your external open source Iceberg catalog tables into Lakehouse in a low-code or no-code way. This process lets you consolidate data into a unified Lakehouse-managed Iceberg format for cross-engine analytics.
Use the following connection details to import data from external Iceberg catalogs.
Before you begin
To import data, you need the following:
- Connection information for the external Iceberg REST Catalog. For example: catalog name, namespace, table name, account URI, and role to access the catalog.
- A Lakehouse Iceberg catalog, namespace, and table to import the data into.
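As a concrete illustration, the connection details listed above can be collected into a single set of properties before you fill in the job builder form. The following is a minimal Python sketch; every value (catalog name, namespace, table, URI, and role) is a placeholder, not a real endpoint:

```python
# Hypothetical connection details for an external Iceberg REST Catalog.
# All values are placeholders; substitute the details from your provider.
external_catalog = {
    "catalog_name": "my_external_catalog",      # catalog to read from
    "namespace": "analytics",                   # namespace containing the table
    "table": "events",                          # table to import
    "uri": "https://irc.example.com/catalog",   # account URI of the REST catalog
    "role": "iceberg_reader",                   # role with access to the catalog
}

# Fully qualified identifier of the source table, in the form you later
# enter in the job builder's Iceberg table field.
table_identifier = "{namespace}.{table}".format(**external_catalog)
print(table_identifier)  # → analytics.events
```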
Support and limitations
Importing data from external Iceberg catalogs to Google Cloud Lakehouse using Dataflow has the following limitations:
- This feature supports reading into Lakehouse from external Iceberg providers that support the Iceberg REST Catalog (IRC) protocol. Other Iceberg catalog types aren't supported.
- This feature supports batch and streaming pipelines.
Import an external Iceberg catalog table
To import an external Iceberg catalog table into Google Cloud Lakehouse, complete the following steps:
- In the Google Cloud console, go to the Google Cloud Lakehouse Metastore page.
- Select the catalog, namespace, and table that you want to import data into.
- On the Table details page, click Import table.
- In the Import configuration dialog, select Import a table from an Apache Iceberg REST Catalog into Lakehouse (Batch).

  The Dataflow Job builder page opens.
- In the Sources section:
  - To expand the Iceberg table source panel, click the expander arrow.
  - In the Iceberg table field, enter the identifier of the Apache Iceberg table.
  - In the Catalog name field, enter the name of the catalog.
  - In the Filter field, enter the Iceberg filter to use. For example, `id > 5`.
  - Optional: To specify source table column changes, use the Keep columns or Drop columns sections.
  - In the Catalog type list of the Catalog properties section, select the type of catalog.
  - In the Catalog URI field, enter the URI of the catalog. For example, `http://localhost:8181`.
  - In the Warehouse name field, enter the catalog name.

    For some external Iceberg REST Catalog providers, the warehouse is abstracted, and the catalog name is provided as the warehouse name.
  - In the Authentication type list, select the authentication type. For example, `OAUTH2`.
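The filter entered in the Filter field acts as a row-level predicate: only rows that match it are read from the source table. As a rough sketch of the semantics (not of how Dataflow evaluates it internally), a filter like `id > 5` keeps rows whose `id` column exceeds 5; the sample rows and column names below are hypothetical:

```python
# Hypothetical sample rows from the source Iceberg table.
rows = [
    {"id": 3, "value": "a"},
    {"id": 6, "value": "b"},
    {"id": 9, "value": "c"},
]

# The filter `id > 5` keeps only matching rows, analogous to a SQL WHERE clause.
filtered = [row for row in rows if row["id"] > 5]
print([row["id"] for row in filtered])  # → [6, 9]
```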
- Optional: In the Transforms section, add any transforms to the source data.
- In the Sink section, optionally review the Lakehouse table sink panel. The information in this panel, such as the Lakehouse table, catalog name, and warehouse location, is typically prepopulated.
- In the Dataflow options section, click Run job.
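For reference, when you select `OAUTH2` as the authentication type, the pipeline typically authenticates with an OAuth 2.0 client-credentials grant against the catalog's token endpoint. The sketch below only constructs such a request with the Python standard library and never sends it; the endpoint path (`/v1/oauth2/tokens`, as defined in the Iceberg REST Catalog specification, although some providers use a separate OAuth server) and the client credentials are assumptions:

```python
from urllib import parse, request

# Placeholder values; use your provider's catalog URI and credentials.
catalog_uri = "http://localhost:8181"
client_id = "my-client-id"
client_secret = "my-client-secret"

# Client-credentials form body for the assumed token endpoint.
token_url = catalog_uri + "/v1/oauth2/tokens"
form = parse.urlencode({
    "grant_type": "client_credentials",
    "client_id": client_id,
    "client_secret": client_secret,
})

# Build (but do not send) the POST request; sending it would require a
# reachable catalog endpoint.
req = request.Request(
    token_url,
    data=form.encode("utf-8"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)
print(req.full_url)  # → http://localhost:8181/v1/oauth2/tokens
```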
What's next
- Learn more about how to Create a custom job with the job builder UI.
- Learn more in the Introduction to Google Cloud Lakehouse tables for Apache Iceberg in BigQuery.
- Read the blog post Lakehouse evolved: Build open, high-performance, enterprise Iceberg-native lakehouses.

