To read from Apache Iceberg to Dataflow, use the managed I/O connector.
Managed I/O supports the following capabilities for Apache Iceberg:

Catalogs (see the configuration sketch after this list):
- Hadoop
- Hive
- REST-based catalogs
- BigQuery metastore (requires Apache Beam SDK 2.62.0 or later if not using Runner v2)

Read capabilities:
- Batch read

Write capabilities:
- Batch write
- Streaming write
- Dynamic destinations
- Dynamic table creation
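In the Managed I/O configuration, the catalog is selected through the catalog_properties map, whose entries are passed through as Apache Iceberg's standard catalog properties. The following is an illustrative sketch only; the table name, catalog names, URI, and warehouse path are placeholders, not values from this page.

Java

import com.google.common.collect.ImmutableMap;

public class IcebergCatalogConfigs {
  // REST-based catalog: uses Iceberg's standard "type" and "uri"
  // catalog properties. The URI is a placeholder.
  static final ImmutableMap<String, Object> REST_CONFIG =
      ImmutableMap.<String, Object>builder()
          .put("table", "db.table1")
          .put("catalog_name", "rest_catalog")
          .put("catalog_properties", ImmutableMap.of(
              "type", "rest",
              "uri", "https://CATALOG_HOST/v1"))
          .build();

  // Hadoop catalog backed by a Cloud Storage warehouse path.
  static final ImmutableMap<String, Object> HADOOP_CONFIG =
      ImmutableMap.<String, Object>builder()
          .put("table", "db.table1")
          .put("catalog_name", "local")
          .put("catalog_properties", ImmutableMap.of(
              "type", "hadoop",
              "warehouse", "gs://BUCKET/warehouse"))
          .build();
}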
For BigQuery tables for Apache Iceberg, use the BigQueryIO connector with the BigQuery Storage API. The table must already exist; dynamic table creation is not supported.
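As an illustration of that path, the following minimal sketch reads with BigQueryIO using the Storage Read API (DIRECT_READ). The project, dataset, and table names are placeholders, and the snippet assumes the beam-sdks-java-io-google-cloud-platform dependency is on the classpath.

Java

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.TypedRead.Method;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;

public class ReadIcebergBigQueryTable {
  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(
        PipelineOptionsFactory.fromArgs(args).withValidation().create());

    // DIRECT_READ selects the BigQuery Storage Read API.
    // "PROJECT:DATASET.TABLE" is a placeholder table reference.
    PCollection<TableRow> rows = pipeline.apply(
        BigQueryIO.readTableRows()
            .from("PROJECT:DATASET.TABLE")
            .withMethod(Method.DIRECT_READ));

    pipeline.run().waitUntilFinish();
  }
}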
Dependencies
Add the following dependencies to your project:
Java
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-managed</artifactId>
  <version>${beam.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-sdks-java-io-iceberg</artifactId>
  <version>${beam.version}</version>
</dependency>
Example
The following example reads from an Apache Iceberg table and writes the data to text files.
Java
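The pipeline below is a minimal sketch rather than a complete reference: it assumes a Hadoop catalog with a Cloud Storage warehouse and a table whose schema contains an id (int64) column and a name (string) column. The bucket, catalog name, and table name are placeholders to replace with your own values.

import com.google.common.collect.ImmutableMap;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.TextIO;
import org.apache.beam.sdk.managed.Managed;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.Row;
import org.apache.beam.sdk.values.TypeDescriptors;

public class ApacheIcebergRead {
  public static void main(String[] args) {
    Pipeline pipeline = Pipeline.create(
        PipelineOptionsFactory.fromArgs(args).withValidation().create());

    // Managed I/O source configuration. The table, catalog name, and
    // warehouse path below are placeholders.
    ImmutableMap<String, Object> config = ImmutableMap.<String, Object>builder()
        .put("table", "db.table1")
        .put("catalog_name", "local")
        .put("catalog_properties", ImmutableMap.of(
            "type", "hadoop",
            "warehouse", "gs://BUCKET/warehouse"))
        .build();

    pipeline
        // Read the Iceberg table into a PCollection of Beam Rows.
        .apply(Managed.read(Managed.ICEBERG).withConfig(config))
        .getSinglePCollection()
        // Format each record as a string 'id:name'; assumes the table has
        // an id (int64) column and a name (string) column.
        .apply(MapElements
            .into(TypeDescriptors.strings())
            .via((Row row) -> String.format("%d:%s",
                row.getInt64("id"), row.getString("name"))))
        // Write the formatted records to text files.
        .apply(TextIO.write()
            .to("gs://BUCKET/output")
            .withNumShards(1)
            .withSuffix(".txt"));

    pipeline.run().waitUntilFinish();
  }
}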
To authenticate to Dataflow, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.
What's next
- Write to Apache Iceberg.
- Streaming Write to Apache Iceberg with BigLake REST Catalog.
- Learn more about Managed I/O.

