Preview
This product is subject to the "Pre-GA Offerings Terms" in the General Service Terms section
of the Service Specific Terms.
You can process personal data for this product as outlined in the Cloud Data Processing
Addendum, subject to the obligations and restrictions described in the agreement under
which you access Google Cloud.
Pre-GA products are available "as is" and might have limited support.
For more information, see the launch stage descriptions.
Understand the result
Enterprise Knowledge Graph writes results into a new BigQuery table for every job. This table is a snapshot of the data at the time the job is executed. By default, every job generates a random cluster_id for each entity cluster. To keep the ID stable across job runs, use the previous BigQuery result table advanced option.
Key Point: The reconciliation confidence score can help you evaluate records that are less likely to belong in the same entity cluster. For information about how the confidence scores are calculated, see Understand the reconciliation confidence score.
Output Schema
| Field name | Type | Description |
|---|---|---|
| cluster_id | STRING | A private knowledge graph machine ID (MID) assigned to this cluster of records. It can be used to uniquely identify the record in your dataset. Use the Previous BigQuery table setting in the Advanced Options to keep this cluster_id stable and consistent across multiple runs. |
| source_name | STRING | The source name specified in the input configuration, to help you join datasets together. |
| source_key | STRING | The unique key in your source table, to help you join datasets together. |
| confidence | FLOAT | Confidence score that indicates how strongly these records belong to this cluster. |
| assignment_age | INTEGER | Used internally for cluster_id (MID) stabilization across different jobs. |
| cloud_kg_mid | STRING | The MID of the linked Google Cloud Knowledge Graph entity. You can use this MID as your permanent ID or look up additional details with the Cloud Knowledge Graph API. |

Key Point: Use the Cloud Knowledge Graph API to look up the cloud_kg_mid, which can help you further disambiguate the entity.
Note: The Cloud Knowledge Graph API only links places and points of interest today. The linking will expand to more entities soon.
Use SQL to join the dataset together
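Enterprise Knowledge Graph outputs grouped entities by cluster ID. The simplest way to view the result is to group it by cluster ID. The following example performs a quick sanity check by joining the output table with the original table.

    # Get all entity clusters.
    SELECT DISTINCT cluster_id
    FROM `ekg-test.<dataset>.clusters_9425187210682344597`
    ORDER BY cluster_id
    LIMIT 1000;

    # Join the result with the original table.
    SELECT confidence, RS.*, SRC.*
    FROM `ekg-test.<dataset>.clusters_9425187210682344597` AS RS
    JOIN `ekg-api-test.demo.organization` AS SRC
      ON RS.source_key = SRC.source_key
    WHERE cluster_id = "r-02b72jsgrbws18";

The entity cluster returned by the second query represents two different records that belong to the same cluster. The shared cluster_id signals that these two records should be joined and merged.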
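Once the result rows have been exported from BigQuery, the same grouping can be done in client code. The following is a minimal Python sketch, not an official client example. The rows, the source names (crm, erp), and the 0.7 review threshold are made-up assumptions; only the field names come from the output schema.

```python
from collections import defaultdict

# Hypothetical rows shaped like the output schema; all values are made up.
rows = [
    {"cluster_id": "r-aaa", "source_name": "crm", "source_key": "1", "confidence": 0.95},
    {"cluster_id": "r-aaa", "source_name": "erp", "source_key": "7", "confidence": 0.62},
    {"cluster_id": "r-bbb", "source_name": "crm", "source_key": "2", "confidence": 0.99},
]


def group_by_cluster(rows):
    """Map each cluster_id to the list of rows assigned to it."""
    clusters = defaultdict(list)
    for row in rows:
        clusters[row["cluster_id"]].append(row)
    return dict(clusters)


def low_confidence(rows, threshold):
    """Return rows whose confidence is below an assumed review threshold."""
    return [row for row in rows if row["confidence"] < threshold]


clusters = group_by_cluster(rows)

# Clusters with more than one record are candidates for merging.
merge_candidates = {cid: members for cid, members in clusters.items() if len(members) > 1}

# Records below the threshold may deserve a manual look before merging.
to_review = low_confidence(rows, threshold=0.7)
```

The threshold is a choice to tune against your own data. The sketch does not assume any particular scale for the confidence field beyond higher meaning stronger membership.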
Measure success
Pair-wise
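Precision: Of the record pairs identified as similar, the ratio that truly belong to the same entity. Precision drops with false positives, that is, distinct entities incorrectly identified as similar. These are easier to detect by manual inspection.
Recall: Of the record pairs that truly belong to the same entity, the ratio that is identified as similar. Recall drops with false negatives, that is, similar entities that aren't identified. These are harder to detect.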
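The pair-wise metrics can be sketched in Python as follows, assuming you have a hand-labeled ground truth for a sample of records. The record names, cluster labels, and entity labels are hypothetical, and returning 1.0 when there are no pairs to score is a convention chosen for this sketch.

```python
from itertools import combinations


def same_label_pairs(labels):
    """Return the set of unordered record pairs that share a label."""
    by_label = {}
    for record, label in labels.items():
        by_label.setdefault(label, []).append(record)
    pairs = set()
    for members in by_label.values():
        pairs.update(frozenset(pair) for pair in combinations(members, 2))
    return pairs


def pairwise_precision_recall(predicted, truth):
    """Compare pairs placed in the same cluster with pairs that share a true entity."""
    predicted_pairs = same_label_pairs(predicted)
    true_pairs = same_label_pairs(truth)
    true_positives = len(predicted_pairs & true_pairs)
    # Assumed convention: an empty denominator scores 1.0.
    precision = true_positives / len(predicted_pairs) if predicted_pairs else 1.0
    recall = true_positives / len(true_pairs) if true_pairs else 1.0
    return precision, recall


# Hypothetical sample: record -> predicted cluster_id, record -> true entity.
predicted = {"a": "c1", "b": "c1", "c": "c1", "d": "c2"}
truth = {"a": "e1", "b": "e1", "c": "e2", "d": "e2"}

precision, recall = pairwise_precision_recall(predicted, truth)
```

In this sample, three pairs are predicted (a-b, a-c, b-c) and two pairs are true (a-b, c-d). Only a-b is in both sets, so precision is 1/3 and recall is 1/2.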
Cluster V-measure
Cluster V-measure: (1 + beta) * homogeneity * completeness / (beta * homogeneity + completeness), where beta = 1.
Cluster Homogeneity: Ratio of clusters that have entities belonging to the same entity.
Cluster Completeness: Ratio of clusters in which all entities belonging to the same entity are placed into the same cluster.