This document explains how to use the entry link import utility to bulk import entry links (relationships between glossary terms and data assets, including the definition, related, and synonym entry link types) from a Google Sheet into Knowledge Catalog (formerly Dataplex Universal Catalog).
Before you begin
Before you import entry links into Knowledge Catalog, complete the following prerequisites.
Set up the service account
To run the import utility using Google Sheets, you must set up a service account with the necessary permissions to access the Google Sheets API and impersonate your user credentials:
- Identify or create a service account. Select an existing service account or create a new one in the project where you run the import utility. For more information, see Create service accounts. Note the service account email (for example, SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com).
- Configure service account impersonation. To run the import utility script locally, your user account must have permission to impersonate the service account. Grant your user account the Service Account Token Creator role (roles/iam.serviceAccountTokenCreator) on the service account, as shown in the sketch after this list. For more information, see Manage access to service accounts.
- Grant the service account Editor access to the Google Sheet. Open the Google Sheet you want to use for the import process, click Share, and add the service account email as an Editor. This permission enables the service account to read from or write data to your sheet.
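If you prefer the command line, the following is a minimal sketch of the first two steps using the gcloud CLI. The service account name sheet-importer, the project my-project-id, and the user you@example.com are hypothetical placeholders, not values from this guide.

```bash
# Hypothetical names; replace the project, service account, and user with your own.
gcloud iam service-accounts create sheet-importer \
    --project=my-project-id \
    --display-name="Entry link import utility"

# Allow your user account to impersonate the service account by granting
# roles/iam.serviceAccountTokenCreator on the service account itself.
gcloud iam service-accounts add-iam-policy-binding \
    sheet-importer@my-project-id.iam.gserviceaccount.com \
    --member="user:you@example.com" \
    --role="roles/iam.serviceAccountTokenCreator"
```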
Create a Cloud Storage bucket
Create a Cloud Storage bucket to serve as a staging area for import files.
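For example, a staging bucket could be created with the gcloud CLI as in the following sketch; the bucket name, project, and location are hypothetical placeholders and should match where you run the import.

```bash
# Hypothetical bucket name, project, and location; choose your own values.
gcloud storage buckets create gs://my-import-staging-bucket \
    --project=my-project-id \
    --location=us-central1
```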
Required roles
To ensure that the service account has the necessary permissions to import entry links from a Google Sheet, ask your administrator to grant the following IAM roles to the service account:
- Dataplex Administrator (roles/dataplex.admin) on the project
- Dataplex Catalog Admin (roles/dataplex.catalogAdmin) on the project
- Dataplex Catalog Editor (roles/dataplex.catalogEditor) on the project
- Storage Object Admin (roles/storage.objectAdmin) on the Cloud Storage bucket
- Storage Object Creator (roles/storage.objectCreator) on the Cloud Storage bucket
For more information about granting roles, see Manage access to projects, folders, and organizations.
Your administrator might also be able to give the service account the required permissions through custom roles or other predefined roles.
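As an illustration, the grants might look like the following gcloud commands, reusing the hypothetical service account and bucket names from earlier; repeat each command for every role in the list above.

```bash
# Grant a project-level role to the service account (repeat for each project role).
gcloud projects add-iam-policy-binding my-project-id \
    --member="serviceAccount:sheet-importer@my-project-id.iam.gserviceaccount.com" \
    --role="roles/dataplex.catalogEditor"

# Grant a bucket-level role on the staging bucket (repeat for each bucket role).
gcloud storage buckets add-iam-policy-binding gs://my-import-staging-bucket \
    --member="serviceAccount:sheet-importer@my-project-id.iam.gserviceaccount.com" \
    --role="roles/storage.objectAdmin"
```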
Enable APIs
To import entry links, enable the following APIs in your project:
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM role (roles/serviceusage.serviceUsageAdmin), which contains the serviceusage.services.enable permission. Learn how to grant roles.
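A minimal sketch of enabling APIs with the gcloud CLI follows. The services shown (Dataplex, Google Sheets, and Cloud Storage) are an assumption based on what the utility touches, not an authoritative list; confirm the exact APIs required for your project.

```bash
# Assumed services: Dataplex, Google Sheets, and Cloud Storage APIs.
gcloud services enable \
    dataplex.googleapis.com \
    sheets.googleapis.com \
    storage.googleapis.com \
    --project=my-project-id
```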
Set up the git repository
Clone the dataplex-labs
repository:
git clone https://github.com/GoogleCloudPlatform/dataplex-labs.git
cd dataplex-labs/dataplex-quickstart-labs/00-resources/scripts/python/business-glossary-import/dataplex-glossary/import
Install dependencies
Install the required Python dependencies:
pip3 install -r requirements.txt
cd dataplex-glossary
If you encounter any issues with the package installation, set up a new Python development environment.
Authenticate and configure service account impersonation
Initialize the Google Cloud CLI and authenticate using Application Default Credentials (ADC) with service account impersonation:
# Set your service account email address
SA_EMAIL="SERVICE_ACCOUNT_EMAIL"

# Authenticate ADC using service account impersonation and required scopes
gcloud init
gcloud auth login
gcloud auth application-default login \
  --impersonate-service-account="${SA_EMAIL}" \
  --scopes="https://www.googleapis.com/auth/spreadsheets"
Replace SERVICE_ACCOUNT_EMAIL with the service account email ID. For example: SERVICE_ACCOUNT_NAME@PROJECT_ID.iam.gserviceaccount.com
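As an optional sanity check (not part of the original steps), you can confirm that Application Default Credentials are working before you continue:

```bash
# Prints an access token if ADC with impersonation is configured correctly.
gcloud auth application-default print-access-token
```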
Cross-project import requirements
To import entry links across multiple Google Cloud projects, ensure your configuration meets the following requirements before running the import utility:
- Configure cross-project IAM permissions: The service account executing the import script must have sufficient permissions in all target projects.
- Verify entry existence: The target entries must already exist in their respective Knowledge Catalog projects before you run the import.
- Grant the Cloud Storage bucket access to Knowledge Catalog service agents: The Knowledge Catalog service agents in each target project require access to your Cloud Storage buckets, as shown in the example after this list.
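For the last requirement, a bucket-level grant to a target project's service agent might look like the following sketch. The project number 123456789012 and the roles/storage.objectViewer role are hypothetical, and the member address assumes the service-PROJECT_NUMBER@gcp-sa-dataplex.iam.gserviceaccount.com service agent format.

```bash
# Hypothetical: grant a target project's service agent access to the staging bucket.
gcloud storage buckets add-iam-policy-binding gs://my-import-staging-bucket \
    --member="serviceAccount:service-123456789012@gcp-sa-dataplex.iam.gserviceaccount.com" \
    --role="roles/storage.objectViewer"
```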
Create and structure the Google Sheet for import
To successfully run a bulk import, you must create a new Google Sheet using a
precise column schema so that the import utility can successfully
parse and validate your metadata fields. Ensure that you have granted the
service account Editor
access to the Google Sheet.
The first row of the sheet must contain these exact, case-sensitive schema headers:
| Column header | Required or Optional | Description |
|---|---|---|
| entry_link_type | Required | Value must be definition, related, or synonym. |
| source_entry | Required | The full resource path of the source entry in the format: projects/PROJECT_ID/locations/LOCATION/entryGroups/ENTRYGROUP_NAME/entries/ENTRY_NAME |
| target_entry | Required | The full resource path of the target entry in the format: projects/PROJECT_ID/locations/LOCATION/entryGroups/ENTRYGROUP_NAME/entries/ENTRY_NAME |
| source_path | Optional | Column or field path for definition links (for example, Schema.column_name). |
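The following is a hypothetical illustration of rows that match this schema, shown as CSV for readability; the project, entry group, entry, and column names are made up, and the direction of source and target depends on your glossary setup.

```bash
# Hypothetical sheet rows, echoed as CSV. The header row must match the schema exactly.
cat <<'EOF'
entry_link_type,source_entry,target_entry,source_path
definition,projects/my-project/locations/us-central1/entryGroups/my-entry-group/entries/my-table,projects/my-project/locations/global/entryGroups/my-glossary/entries/customer-id-term,Schema.customer_id
synonym,projects/my-project/locations/global/entryGroups/my-glossary/entries/client-term,projects/my-project/locations/global/entryGroups/my-glossary/entries/customer-term,
EOF
```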
Set up environment variables
Set up the following environment variables:
# Set your Google Sheet URL
export SPREADSHEET_URL="GOOGLE_SHEET_URL"

# Set your bucket name
export BUCKETS="COMMA_SEPARATED_LIST_OF_BUCKETS"

# Set the project ID
export USER_PROJECT="USER_PROJECT"
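For instance, with hypothetical values (a made-up sheet ID, two staging buckets, and a placeholder project ID), the variables might look like this:

```bash
# Hypothetical example values.
export SPREADSHEET_URL="https://docs.google.com/spreadsheets/d/EXAMPLE_SHEET_ID/edit"
export BUCKETS="import-staging-1,import-staging-2"
export USER_PROJECT="my-project-id"
```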
Import entry links from the Google Sheet
To import the entry links from the Google Sheet into Knowledge Catalog,
run the entrylinks-import.py
script:
cd import
python3 entrylinks-import.py \
  --spreadsheet-url="$SPREADSHEET_URL" \
  --buckets="$BUCKETS" \
  --user-project="$USER_PROJECT"
To run multiple import jobs in parallel, specify multiple Cloud Storage
buckets in the --buckets
parameter. The script splits the metadata into
smaller batches and processes them concurrently across the buckets, reducing
total ingestion time.
You can review the execution logs in the logs/
directory in your local
execution path. These logs help you audit the transfer process and identify
skipped entries or formatting warnings.
What's next
- Learn how to manage a business glossary.
- Learn how to export entry links to a Google Sheet.
- Learn how to import glossaries from a Google Sheet.
- Learn more about metadata management.

