As of April 10, 2026, Dataplex Universal Catalog is now called Knowledge Catalog. The API, client library, CLI, and IAM names remain unchanged. For more information, see Introducing the Google Cloud Knowledge Catalog.
Search multi-region lineage using server-side automation
This document describes how to look up multi-level, cross-regional data
lineage by using the searchLineageStreaming API.
The searchLineageStreaming API performs a breadth-first search in a specified direction (upstream or
downstream) starting from a defined set of root entities, and returns a unified
lineage graph as a real-time streaming response.
Unlike standard lineage lookup APIs that might time out on massive multi-project graphs, searchLineageStreaming delivers real-time, chunked responses. Use this API when building tools that need to traverse broad, deep, or cross-regional data architectures without request timeouts.
The searchLineageStreaming API includes the following capabilities:
Breadth-first search: Traverses the lineage graph layer by layer,
accurately calculating the depth of each connected asset.
Streaming response: Returns subgraphs and lineage links as they are
discovered by the backend system. This is highly efficient for broad or deep
lineage graphs and prevents request timeouts.
Multi-location and multi-project traversal: Although you specify only one
billing project in the request path, the API automatically discovers and
traverses lineage links across multiple Google Cloud projects and geographical
locations, provided you have the required permissions.
Fine-grained column-level lineage: Supports searching for column-level
dependencies between assets.
Wildcard lookups: Lets you retrieve all column-level lineage for a
specific entity by suffixing the fully qualified name (FQN) with *.
Pipeline insights: Optionally retrieves metadata about the transformation
pipelines (processes) that created the lineage links.
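The layer-by-layer traversal described above can be illustrated with a minimal local sketch. This is not the backend's implementation. It only shows how a breadth-first search assigns a depth to each connected asset. The graph, entity names, and the `bfs_depths` helper are hypothetical.

```python
from collections import deque


def bfs_depths(links, roots):
    """Return {entity: depth} for every entity reachable from the roots.

    `links` maps an entity name to the entities directly connected to it in
    the chosen search direction. All names here are toy inputs.
    """
    depths = {root: 0 for root in roots}
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for neighbor in links.get(current, []):
            if neighbor not in depths:  # the first visit is the shortest path
                depths[neighbor] = depths[current] + 1
                queue.append(neighbor)
    return depths


# Toy downstream graph: raw feeds staging, which feeds two reporting tables.
toy_links = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["reports.daily", "reports.monthly"],
    "reports.daily": ["reports.monthly"],
}

print(bfs_depths(toy_links, ["raw.orders"]))
```

Because each layer is fully processed before the next one starts, `reports.monthly` gets depth 2 (through `staging.orders`) rather than depth 3 (through `reports.daily`).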
Before you begin
Before you make requests to the API, ensure that you have met the following
security and environmental prerequisites:
Required roles
To get the permissions that
you need to search for data lineage links,
ask your administrator to grant you the Data Lineage Viewer (roles/datalineage.viewer) IAM role on the projects where the lineage links and processes are stored.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains
the permissions required to search for data lineage links. The exact permissions that are
required are listed in the Required permissions section:
Required permissions
The following permissions are required to search for data lineage links:
Search entity-level lineage: datalineage.events.get on the project where the link is stored
Search column-level lineage: datalineage.events.getFields on the project where the link is stored
Retrieve full pipeline process details: datalineage.processes.get on the project where the process is stored
When you configure your API request, you must distinguish between the resource
used for administrative billing and the actual locations scanned by the API:
Billing parent path: The parent path in the URL request must use the
format projects/project/locations/location.
This specific project-location pair is used exclusively to evaluate billing
quotas and API rate limits.
Target locations: Explicitly define the regions you want the
backend to scan in the locations array inside the request body.
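The distinction between the billing parent and the scanned locations can be sketched as follows. The helper names (`build_parent`, `build_location_scope`) and the example project ID are hypothetical. Only the `projects/project/locations/location` format and the `locations` body field come from this document.

```python
def build_parent(billing_project, billing_location):
    """Build the parent path used only for billing quotas and rate limits."""
    return f"projects/{billing_project}/locations/{billing_location}"


def build_location_scope(billing_project, billing_location, target_locations):
    """Keep the billing parent and the scanned locations clearly separate."""
    return {
        # Goes in the URL path.
        "parent": build_parent(billing_project, billing_location),
        # Goes in the request body.
        "body": {"locations": list(target_locations)},
    }


scope = build_location_scope("my-billing-project", "us", ["us", "us-east1"])
print(scope["parent"])
print(scope["body"])
```

Keeping the two values in separate fields makes it harder to assume that the billing location is the only location scanned.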
Authentication setup
Initialize an environment variable with a Google Cloud access token to
authenticate your curl commands.
The following examples use the endpoint datalineage.googleapis.com.
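If you call the endpoint from a script instead of curl, the same access token can be passed as a bearer token. The following is a minimal sketch. The `ACCESS_TOKEN` variable name and the `auth_headers` helper are assumptions for illustration, not names defined by this document.

```python
import os


def auth_headers(token=None):
    """Build HTTP headers for a JSON request authenticated with a bearer token.

    If no token is passed, fall back to the ACCESS_TOKEN environment variable
    (a hypothetical name chosen for this sketch).
    """
    token = token or os.environ.get("ACCESS_TOKEN", "")
    if not token:
        raise ValueError("no access token provided")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


print(auth_headers("example-token")["Authorization"])
```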
Search multi-level, multi-project lineage
To execute a deep lineage search that traverses across multiple depths of the
graph and scans across distinct Google Cloud projects, define the following
variables:
Set limits.maxDepth to your target traversal depth (accepts values from 1 to 100).
Populate the locations array with the target regions you want the backend
to cross-reference (for example, ["us", "us-east1"]).
Java
import com.google.api.gax.rpc.ServerStream;
import com.google.cloud.datacatalog.lineage.v1.LineageClient;
import com.google.cloud.datacatalog.lineage.v1.LocationName;
import com.google.cloud.datacatalog.lineage.v1.SearchLineageStreamingRequest;
import com.google.cloud.datacatalog.lineage.v1.SearchLineageStreamingResponse;
import java.util.ArrayList;

public class AsyncSearchLineageStreaming {

  public static void main(String[] args) throws Exception {
    asyncSearchLineageStreaming();
  }

  public static void asyncSearchLineageStreaming() throws Exception {
    // This snippet has been automatically generated and should be regarded as a code template only.
    // It will require modifications to work:
    // - It may require correct/in-range values for request initialization.
    // - It may require specifying regional endpoints when creating the service client as shown in
    // https://cloud.google.com/java/docs/setup#configure_endpoints_for_the_client_library
    try (LineageClient lineageClient = LineageClient.create()) {
      SearchLineageStreamingRequest request =
          SearchLineageStreamingRequest.newBuilder()
              .setParent(LocationName.of("[PROJECT]", "[LOCATION]").toString())
              .addAllLocations(new ArrayList<String>())
              .setRootCriteria(SearchLineageStreamingRequest.RootCriteria.newBuilder().build())
              .setFilters(SearchLineageStreamingRequest.SearchFilters.newBuilder().build())
              .setLimits(SearchLineageStreamingRequest.SearchLimits.newBuilder().build())
              .build();
      ServerStream<SearchLineageStreamingResponse> stream =
          lineageClient.searchLineageStreamingCallable().call(request);
      for (SearchLineageStreamingResponse response : stream) {
        // Do something when a response is received.
      }
    }
  }
}
Node.js
/**
 * This snippet has been automatically generated and should be regarded as a code template only.
 * It will require modifications to work.
 * It may require correct/in-range values for request initialization.
 * TODO(developer): Uncomment these variables before running the sample.
 */
/**
 * Required. The project and location to initiate the search from.
 */
// const parent = 'abc123'
/**
 * Required. The locations to search in.
 */
// const locations = ['abc','def']
/**
 * Required. Criteria for the root of the search.
 */
// const rootCriteria = {}
/**
 * Required. Direction of the search.
 */
// const direction = {}
/**
 * Optional. Filters for the search.
 */
// const filters = {}
/**
 * Optional. Limits for the search.
 */
// const limits = {}

// Imports the Lineage library
const {LineageClient} = require('@google-cloud/lineage').v1;

// Instantiates a client
const lineageClient = new LineageClient();

async function callSearchLineageStreaming() {
  // Construct request
  const request = {
    parent,
    locations,
    rootCriteria,
    direction,
  };

  // Run request
  const stream = await lineageClient.searchLineageStreaming(request);
  stream.on('data', (response) => { console.log(response) });
  stream.on('error', (err) => { throw(err) });
  stream.on('end', () => { /* API call completed */ });
}

callSearchLineageStreaming();
Python
# This snippet has been automatically generated and should be regarded as a
# code template only.
# It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
#   client as shown in:
#   https://googleapis.dev/python/google-api-core/latest/client_options.html
from google.cloud import datacatalog_lineage_v1


def sample_search_lineage_streaming():
    # Create a client
    client = datacatalog_lineage_v1.LineageClient()

    # Initialize request argument(s)
    request = datacatalog_lineage_v1.SearchLineageStreamingRequest(
        parent="parent_value",
        locations=["locations_value1", "locations_value2"],
        direction="UPSTREAM",
    )

    # Make the request
    stream = client.search_lineage_streaming(request=request)

    # Handle the response
    for response in stream:
        print(response)
Ruby
require "google/cloud/data_catalog/lineage/v1"

##
# Snippet for the search_lineage_streaming call in the Lineage service
#
# This snippet has been automatically generated and should be regarded as a code
# template only. It will require modifications to work:
# - It may require correct/in-range values for request initialization.
# - It may require specifying regional endpoints when creating the service
# client as shown in https://cloud.google.com/ruby/docs/reference.
#
# This is an auto-generated example demonstrating basic usage of
# Google::Cloud::DataCatalog::Lineage::V1::Lineage::Client#search_lineage_streaming.
#
def search_lineage_streaming
  # Create a client object. The client can be reused for multiple calls.
  client = Google::Cloud::DataCatalog::Lineage::V1::Lineage::Client.new

  # Create a request. To set request fields, pass in keyword arguments.
  request = Google::Cloud::DataCatalog::Lineage::V1::SearchLineageStreamingRequest.new

  # Call the search_lineage_streaming method to start streaming.
  output = client.search_lineage_streaming request

  # The returned object is a streamed enumerable yielding elements of type
  # ::Google::Cloud::DataCatalog::Lineage::V1::SearchLineageStreamingResponse
  output.each do |current_response|
    p current_response
  end
end
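The generated samples leave the depth and location values empty. As a hedged sketch, the following shows how a JSON request body with `limits.maxDepth` and `locations` might be assembled and validated before sending. The `build_search_body` helper is hypothetical, and the `DOWNSTREAM` value is assumed by analogy with the `UPSTREAM` value in the Python sample. The required `rootCriteria` block is left out to keep the sketch short.

```python
import json


def build_search_body(locations, max_depth, direction="DOWNSTREAM"):
    """Assemble a request body and reject values outside the documented range."""
    if not 1 <= max_depth <= 100:
        raise ValueError("limits.maxDepth must be between 1 and 100")
    if not locations:
        raise ValueError("at least one target location is required")
    return {
        "locations": list(locations),
        "direction": direction,
        "limits": {"maxDepth": max_depth},
    }


body = build_search_body(["us", "us-east1"], max_depth=3)
print(json.dumps(body, indent=2))
```

Validating the depth on the client avoids a round trip for a request that falls outside the accepted range of 1 to 100.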
By default, the API omits process information (maxProcessPerLink defaults to 0). To retrieve the resource names of the pipelines that created
your data links, set limits.maxProcessPerLink to a positive
integer.
Response behavior: The resulting stream populates the links[].processes field
with process messages containing only their absolute system resource name
(such as projects/my-project/locations/us/processes/my-process).
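The following sketch shows one way to gather process resource names from streamed response chunks that have been parsed into dictionaries. The exact nesting is an assumption: each entry in `links[].processes` is taken to carry a `process` object with a `name`, inferred from the `links.processes.process` field path used in this document. Verify the shape against a real response before relying on it. The `process_names` helper is hypothetical.

```python
def process_names(chunks):
    """Collect unique process resource names, preserving first-seen order.

    Assumed chunk shape (not confirmed by this document):
    {"links": [{"processes": [{"process": {"name": "..."}}]}]}
    """
    names = []
    for chunk in chunks:
        for link in chunk.get("links", []):
            for entry in link.get("processes", []):
                name = entry.get("process", {}).get("name")
                if name and name not in names:
                    names.append(name)
    return names


sample_chunks = [
    {"links": [{"processes": [
        {"process": {"name": "projects/my-project/locations/us/processes/my-process"}}
    ]}]},
    {"links": [{"processes": []}]},  # a link with no process information
]
print(process_names(sample_chunks))
```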
Retrieve full process details using a FieldMask
If you need full structural metadata about a pipeline (such as its displayName,
system attributes, or execution origin) instead of just its resource name,
you must use an API FieldMask:
Provide a non-zero value to limits.maxProcessPerLink.
Append a fields query parameter to the URL path, specifying links.processes.process along with other required fields.
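The two steps above can be sketched as a URL builder. The `/v1/{parent}:searchLineageStreaming` path is an assumption based on common Google API URL conventions and is not confirmed by this document. The set of other required fields is also not listed here, so `unreachable` is included only as an illustrative second entry. The `build_search_url` helper is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "https://datalineage.googleapis.com"


def build_search_url(parent, field_paths):
    """Return a request URL with a comma-separated `fields` query parameter."""
    # Keep commas and dots unescaped so the mask stays readable.
    query = urlencode({"fields": ",".join(field_paths)}, safe=",.")
    # The method path below is assumed, not taken from this document.
    return f"{ENDPOINT}/v1/{parent}:searchLineageStreaming?{query}"


url = build_search_url(
    "projects/my-project/locations/us",
    ["links.processes.process", "unreachable"],
)
print(url)
```

Remember to also set `limits.maxProcessPerLink` to a non-zero value in the request body, otherwise the masked process fields stay empty.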
You can search for both table-level (asset-level) and column-level (field-level)
lineage in a single request by providing multiple entities in the rootCriteria.entities.entities list:
For table-level lineage, omit the field array.
For column-level lineage, specify a single column in the field array.
To search for all available column-level lineage for a specific table without
listing every column individually, use the wildcard character * as the single
value in the field array.
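As a sketch of how table-level, column-level, and wildcard roots might be combined in one request, the following builds a `rootCriteria` block. The `fullyQualifiedName` key and the BigQuery-style names are assumptions. The `field` array and the `rootCriteria.entities.entities` nesting come from this document. The `entity` and `build_root_criteria` helpers are hypothetical.

```python
def entity(fqn, column=None):
    """Describe one root entity; omit `field` for table-level lineage."""
    ref = {"fullyQualifiedName": fqn}  # key name is an assumption
    if column is not None:
        ref["field"] = [column]  # a single column, or "*" for all columns
    return ref


def build_root_criteria(entities):
    """Wrap entity references in the rootCriteria.entities.entities list."""
    return {"entities": {"entities": list(entities)}}


root_criteria = build_root_criteria([
    entity("bigquery:my-project.sales.orders"),               # table-level
    entity("bigquery:my-project.sales.customers", "email"),   # one column
    entity("bigquery:my-project.sales.payments", "*"),        # all columns
])
print(root_criteria)
```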
You can refine your lineage search results by using the filters block in the
request body.
Filter by dependency type
To restrict results to specific dependency types, such as direct copies
(EXACT_COPY) or transformations like filtering and grouping (OTHER), use
the dependencyTypes filter.
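A minimal sketch of adding the filter to an existing request body follows. The `with_dependency_types` helper is hypothetical. Only the `filters` block, the `dependencyTypes` key, and the `EXACT_COPY` and `OTHER` values come from this document, and other values might exist.

```python
import copy


def with_dependency_types(body, dependency_types):
    """Return a copy of the request body with a dependencyTypes filter set."""
    updated = copy.deepcopy(body)  # leave the caller's body untouched
    updated.setdefault("filters", {})["dependencyTypes"] = list(dependency_types)
    return updated


base_body = {"locations": ["us"], "limits": {"maxDepth": 2}}
filtered_body = with_dependency_types(base_body, ["EXACT_COPY"])
print(filtered_body)
```

Returning a copy lets you reuse the same base body for several searches with different filters.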
Troubleshooting: Handle unreachable locations and partial graphs
Because the streaming API scans across a distributed set of projects and
locations simultaneously, some remote regions might be temporarily down,
uncommunicative, or misconfigured during execution.
Symptom: The returned lineage graph layout appears incomplete or
is missing expected regional hops.
Diagnostic: To protect data integrity, the searchLineageStreaming response stream populates a dedicated unreachable field (repeated string) with problematic locations, using the format projects/PROJECT_NUMBER/locations/LOCATION (for example, projects/123456789/locations/us-east1).
Best practice: Always build your client applications
to inspect the unreachable field to verify data completeness before
processing the graph.
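The completeness check recommended above can be sketched as follows, assuming that each streamed chunk has been parsed into a dictionary and that `unreachable` appears as a list of strings on any chunk that reports a problem. The `collect_unreachable` and `is_complete` helpers are hypothetical.

```python
def collect_unreachable(chunks):
    """Return every unreachable location reported across all chunks."""
    seen = []
    for chunk in chunks:
        for location in chunk.get("unreachable", []):
            if location not in seen:  # report each location once
                seen.append(location)
    return seen


def is_complete(chunks):
    """A graph is treated as complete only if no location was unreachable."""
    return not collect_unreachable(chunks)


streamed_chunks = [
    {"links": []},
    {"links": [], "unreachable": ["projects/123456789/locations/us-east1"]},
]

missing = collect_unreachable(streamed_chunks)
if missing:
    print("Partial graph; unreachable locations:", missing)
```

Running the check after the stream ends, and before any downstream processing, keeps a partial graph from being mistaken for a full one.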
Last updated 2026-06-18 UTC.