User-provided data matching

User-provided data matching (UPDM) joins first-party data that you've collected about a user—such as information from your websites, apps, or physical stores—with that same user's signed-in activity across all Google ad data, including Google owned & operated data. This includes data bought through Google Marketing Platform (GMP) products, for example, YouTube inventory bought using Display & Video 360. GMP products that aren't Google owned & operated aren't supported.

To be eligible for user-provided data matching, the ad event must be linked to a signed-in user in Google ad data.

This document describes the user-provided data matching feature, and provides guidance on setup and use.

Connections tab overview

Gaining valuable advertising insights often requires stitching together data from multiple sources. Building your own solution to this data-pipeline problem requires a significant investment of time and engineering effort. The Connections page in Ads Data Hub streamlines this process by providing a step-by-step, guided interface for importing, transforming, and matching your ads data in BigQuery, so that you can use it in your Ads Data Hub queries, or in any other product that reads from BigQuery. Enriching your queries with first-party data can deliver richer customer experiences, and is more resistant to industry-wide ad-tracking changes.

The Connections page is built with tools that allow you to encrypt and share personally identifiable information (PII) with partners in a privacy-centric manner. After you select which columns contain PII, Ads Data Hub encrypts the data, ensuring that your first-party data can only be exported or read by people who have permission to do so. Knowing which first-party data is needed for your measurement or activation use case can be difficult, so Ads Data Hub provides a comprehensive list of predefined use cases, then guides you through the entire experience of extracting, transforming, and loading your data. While you can create multiple types of connections, this document assumes that you are using the Connections page for user-provided data matching.

Supported first-party data sources

You can import data from these data sources:

  • BigQuery
  • Cloud Storage
  • Secure FTP (sFTP)
  • Snowflake
  • MySQL
  • PostgreSQL
  • Amazon Redshift
  • Amazon S3

Because user-provided data matching is only available on Google owned and operated inventory for signed-in users, it isn't impacted by the upcoming deprecation of third-party cookies. And because first-party data is more resistant to industry changes than third-party data, it can provide richer insights, which can lead to higher customer engagement.

Learn the terminology

  • User-provided data connection: Set up a user-provided data connection to import and match your data, schedule data imports, transform data, and match your ads data using a user ID. The ad event must be linked to a signed-in user in Google ad data. Requires multiple Google Cloud projects.
  • First-party data connection: Set up a first-party data connection as a data preparation tool, to schedule data imports and transform data without the advanced features of UPDM. This type of connection requires only one Google Cloud project.
  • Data source: A connected product, imported file, or third-party integration—for example, BigQuery.
  • Destination: A use case; typically a Google product or product feature where imported data is activated—for example, Ads Data Hub user-provided data matching.
  • Admin project: The Google Cloud project that contains your proprietary advertising data in its raw format.
  • Output dataset: The BigQuery dataset that Ads Data Hub writes to. By default, this is a dataset under your admin project. To change it to another Google Cloud project, see Configure service accounts.

Process summary

  1. Set up data ingestion and matching
    • You grant the required permissions to the service accounts on your admin project. See Set up data ingestion.
  2. First-party data ingestion and matching
    • You format and upload your first-party data to your BigQuery dataset. For the simplest setup, use your admin project. However, you can use any BigQuery dataset that you own. (A minimal sketch of an input table follows this list.)
    • You initiate a data-matching request by creating a connection and setting an import schedule.
    • Google joins data between your project and Google-owned data containing Google's user ID and hashed user-provided data to build and update match tables. See Ingest first-party data.
  3. Ongoing queries in Ads Data Hub, based on matched data
    • You run queries against the match tables in the same way you run regular queries in Ads Data Hub. See Query matched data.
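
For reference, here's a minimal sketch of such an input table in Google Standard SQL. The project, dataset, table, and column names are the placeholders assumed by the SQL hashing samples later in this document; your own schema may differ.

  -- A minimal sketch of an unhashed first-party input table. All names are
  -- placeholders; use your own project, dataset, table, and column names.
  CREATE TABLE `your_project_name.your_dataset_name.input_unhashed_table_name` (
    UserID STRING,
    Email STRING,
    Phone STRING,
    FirstName STRING,
    LastName STRING,
    PostalCode STRING,
    CountryCode STRING
  );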

Learn about privacy requirements

Collecting customer data

When using user-provided data matching, you must upload first-party data. This could be information you collected from your websites, apps, physical stores, or any information that a customer shared with you directly.

You must:

  • Ensure that your privacy policy discloses that you share customer data with third parties to perform services on your behalf, and that you obtain consent for such sharing where legally required
  • Only use Google's approved API or interface to upload customer data
  • Comply with all applicable laws and regulations, including any self-regulatory or industry codes that may apply

First-party consent acknowledgement

To ensure you are able to use your first-party data in Ads Data Hub, you must confirm that you have obtained proper consent to share data from EEA end users with Google per the EU user consent policy and Ads Data Hub policy. This requirement applies to each Ads Data Hub account, and must be updated every time you upload new first-party data. Any one user can make this acknowledgement on behalf of the entire account.

Note that the same Google service query rules that apply to analysis queries also apply to UPDM queries. For example, you can't run cross-service queries on users in the EEA when you create a match table.

To learn how to acknowledge consent in Ads Data Hub, see Consent requirements for the European Economic Area.

Data size

To protect end-user privacy, user-provided data matching enforces these requirements regarding the size of your data:

  • You must upload at least 1,000 records in your user list.
  • Every successful update of your match table must include a minimum number of newly matched users. This behavior is similar to difference checks.
  • Your list must not exceed the maximum number of records. To learn about the maximum data limit, reach out to your Google representative.

Set up data ingestion

Before you start, you need to configure your Ads Data Hub account to create data connections, which is how you'll establish your data-matching pipeline. You only need to perform these steps once.

From the Connections page, click Begin setup to open the account setup wizard at the UPDM enablement stage.

Go to Connections

What permissions are granted for BigQuery and Cloud Storage?

If you set up UPDM for use with BigQuery or Cloud Storage, use this reference to understand the permissions that are granted to the Ads Data Hub service accounts.

BigQuery

Datafusion service account
  • Purpose: The datafusion service account is used to display a list of source fields in the Ads Data Hub UI.
  • Format: service-some-number@gcp-sa-datafusion.iam.gserviceaccount.com
  • Required access:
    • BigQuery Data Viewer (roles/bigquery.dataViewer) for specific datasets in the Data Source and Destination projects
    • Storage Admin (roles/storage.admin) for the Data Source project, or a dedicated storage bucket

Dataproc service account
  • Purpose: The dataproc service account is responsible for running the data pipelines in the background.
  • Format: some-number-compute@developer.gserviceaccount.com
  • Required access:
    • BigQuery Data Viewer (roles/bigquery.dataViewer) for specific datasets in the Data Source and Destination projects
    • BigQuery Data Editor (roles/bigquery.dataEditor) for specific datasets in the Destination project
    • BigQuery Job User (roles/bigquery.jobUser) for both the Data Source and Destination projects
    • Storage Admin (roles/storage.admin) for both the Data Source and Destination projects, or a dedicated storage bucket

UPDM service account
  • Purpose: The UPDM service account is used to run the matching job.
  • Format: service-some-number@gcp-sa-adsdataconnector.iam.gserviceaccount.com
  • Required access:
    • BigQuery Data Viewer (roles/bigquery.dataViewer) for the Destination project
    • BigQuery Job User (roles/bigquery.jobUser) for the Destination project

Cloud Storage

Datafusion service account
  • Purpose: The datafusion service account is used to display a list of source fields in the Ads Data Hub UI.
  • Format: service-some-number@gcp-sa-datafusion.iam.gserviceaccount.com
  • Required access:
    • Storage Object Viewer (roles/storage.objectViewer) for specific storage buckets in the Data Source project
    • BigQuery Data Viewer (roles/bigquery.dataViewer) for the Data Source project, or a dedicated storage bucket
    • Storage Admin (roles/storage.admin) for the Data Source project, or a dedicated storage bucket

Dataproc service account
  • Purpose: The dataproc service account is responsible for running the data pipelines in the background.
  • Format: some-number-compute@developer.gserviceaccount.com
  • Required access:
    • Storage Admin (roles/storage.admin) for both the Data Source and Destination projects, or a dedicated storage bucket
    • BigQuery Job User (roles/bigquery.jobUser) for the Destination project

UPDM service account
  • Purpose: The UPDM service account is used to run the matching job.
  • Format: service-some-number@gcp-sa-adsdataconnector.iam.gserviceaccount.com
  • Required access:
    • BigQuery Data Viewer (roles/bigquery.dataViewer) for the Destination project
    • BigQuery Job User (roles/bigquery.jobUser) for the Destination project

Other data sources

No additional permission grants are necessary for other data sources.

Ingest and match first-party data

Format data for input

Your data must adhere to these formatting requirements to be correctly matched:

  • Where indicated in the following input field descriptions, you must hash values using SHA256.
  • Input fields must be formatted as strings. For example, if you're using BigQuery's SHA256 hash function with the Base16 encoding function ( TO_HEX ), use the following transformation: TO_HEX(SHA256(user_data)) .
  • UPDM supports both Base16 and Base64 encoding. You must align the encoding of your first-party data with the decoding used in your Ads Data Hub query: if you change your first-party data encoding, update your Ads Data Hub query to decode from the same base. The following examples use Base16 encoding, and a short sketch of this alignment follows this list.
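
To illustrate the alignment requirement, this Google Standard SQL sketch shows that the Base16 and Base64 renderings of the same SHA256 digest differ as strings, but decode back to identical BYTES—so the encoding you upload with and the decoding in your query must agree on the base:

  -- Runs on BigQuery. The two encodings differ as strings but decode to the
  -- same BYTES, so your upload encoding and query decoding must match.
  SELECT
    TO_HEX(SHA256('test@gmail.com')) AS base16_hash,
    TO_BASE64(SHA256('test@gmail.com')) AS base64_hash,
    FROM_HEX(TO_HEX(SHA256('test@gmail.com')))
      = FROM_BASE64(TO_BASE64(SHA256('test@gmail.com'))) AS same_bytes;  -- true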

User ID

  • Formatting: Plain text
  • Hashing: None

Email

  • Remove leading and trailing whitespace
  • Lowercase all characters
  • Include a domain name for all email addresses, such as gmail.com or hotmail.co.jp
  • Remove accents—for example, change è, é, ê, or ë to e
  • Remove all periods (.) that precede the domain name in gmail.com and googlemail.com email addresses
  • Hashing: Base16 encoded SHA256

Valid: TO_HEX(SHA256("jeffersonloveshiking@gmail.com"))

Invalid: TO_HEX(SHA256("JéffersonLôvesHiking@gmail.com"))
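
As an illustration, here's one way to express these email rules in Google Standard SQL. The accent-stripping step (NFD normalization plus a \pM regular expression) is an assumption about how you might implement "remove accents", not a prescribed method; verify it against your data.

  -- Normalizes one sample email per the rules above, then hashes it as Base16.
  WITH input AS (
    SELECT LOWER(TRIM('JéffersonLôvesHiking@gmail.com')) AS email
  ), unaccented AS (
    -- One possible accent-removal approach: decompose, then drop combining marks.
    SELECT REGEXP_REPLACE(NORMALIZE(email, NFD), r'\pM', '') AS email FROM input
  ), cleaned AS (
    -- Removes periods before the domain for gmail.com and googlemail.com addresses.
    SELECT
      IF(REGEXP_CONTAINS(email, r'@(gmail|googlemail)\.com$'),
         CONCAT(REPLACE(SPLIT(email, '@')[OFFSET(0)], '.', ''),
                '@', SPLIT(email, '@')[OFFSET(1)]),
         email) AS email
    FROM unaccented
  )
  SELECT TO_HEX(SHA256(email)) AS hashed_email FROM cleaned;
  -- Yields the hash of 'jeffersonloveshiking@gmail.com', the valid example above.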

Phone

  • Strip whitespace
  • Format in E.164 format—for example, +14155552671 (US) or +442071838750 (UK)
  • Remove all special characters except the "+" before the country code
  • Hashing: Base16 encoded SHA256

Valid: TO_HEX(SHA256("+18005550101"))

Invalid: TO_HEX(SHA256("(800) 555-0101"))
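
Here's a sketch of these phone rules in Google Standard SQL: keep digits and the leading "+", drop everything else, then hash. It assumes the country code is already present in the raw value.

  -- Removes whitespace and special characters except '+', then hashes as Base16.
  SELECT
    TO_HEX(SHA256(REGEXP_REPLACE('+1 (800) 555-0101', r'[^\d+]', ''))) AS hashed_phone;
  -- Normalizes to '+18005550101', matching the valid example above.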

First name

  • Strip whitespace
  • Lowercase all characters
  • Remove all prefixes—for example, Mrs., Mr., Ms., Dr.
  • Don't remove accents—for example, è, é, ê, or ë
  • Hashing: Base16 encoded SHA256

Valid: TO_HEX(SHA256("daní"))

Invalid: TO_HEX(SHA256("Daní"))

Last name

  • Strip whitespace
  • Lowercase all characters
  • Remove all suffixes—for example, Jr., Sr., 2nd, 3rd, II, III, PHD, MD
  • Don't remove accents—for example, è, é, ê, or ë
  • Hashing: Base16 encoded SHA256

Valid: TO_HEX(SHA256("delacruz"))

Invalid: TO_HEX(SHA256("de la Cruz, Jr."))
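
The following Google Standard SQL sketch covers the first- and last-name rules. The prefix and suffix patterns are illustrative assumptions, not exhaustive lists; extend them to cover your data. (The valid examples above show that whitespace inside names is removed, which the final regex handles.)

  -- Lowercases, strips an illustrative set of prefixes/suffixes, removes
  -- whitespace, keeps accents, then hashes as Base16.
  SELECT
    TO_HEX(SHA256(REGEXP_REPLACE(
      REGEXP_REPLACE(LOWER('Mrs. Daní'), r'^(mrs|mr|ms|dr)\.?\s*', ''),
      r'\s', ''))) AS hashed_first_name,  -- hash of 'daní'
    TO_HEX(SHA256(REGEXP_REPLACE(
      REGEXP_REPLACE(LOWER('de la Cruz, Jr.'), r',?\s*(jr|sr|2nd|3rd|ii|iii|phd|md)\.?$', ''),
      r'\s', ''))) AS hashed_last_name;   -- hash of 'delacruz'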

Country

  • Include the country code even if all of your customer data is from the same country
  • Don't hash country data
  • Use ISO 3166-1 alpha-2 country codes
  • Hashing: None

Valid: US

Invalid: United States of America or USA

Zip code

  • Don't hash zip code data
  • Both US and international zip and postal codes are allowed
  • For US:
    • 5-digit codes are allowed—for example, 94043
    • 5 digits followed by a 4-digit extension are also allowed—for example, 94043-1351 or 940431351
  • For all other countries:
    • No formatting needed (no need to lowercase, or to remove spaces and special characters)
    • Leave out postal code extensions
  • Hashing: None

Hash validation and data encoding

You can use the following hash validation scripts to ensure that your data is correctly formatted.

JavaScript

Base16

  /**
   * @fileoverview Provides the hashing algorithm for User-Provided Data Match, as
   * well as some valid hashes of sample data for testing.
   */

  async function hash(token) {
    // Removes leading or trailing spaces and converts all characters to lowercase.
    const formattedToken = token.trim().toLowerCase();
    // Hashes the formatted string using the SHA-256 hashing algorithm.
    const hashArrayBuffer = await crypto.subtle.digest(
        'SHA-256', (new TextEncoder()).encode(formattedToken));
    // Converts the hash buffer to a hexadecimal string.
    return Array.from(new Uint8Array(hashArrayBuffer))
        .map((b) => b.toString(16).padStart(2, '0'))
        .join('');
  }

  function main() {
    // Expected hash for test@gmail.com:
    // 87924606b4131a8aceeeae8868531fbb9712aaa07a5d3a756b26ce0f5d6ca674
    hash('test@gmail.com').then(result => console.log(result));

    // Expected hash for +18005551212:
    // 61d9111bed3e6d9cfc1bc3b5cb35a402687c4f1546bee061a2bd444fbdd64c44
    hash('+18005551212').then(result => console.log(result));

    // Expected hash for John:
    // 96d9632f363564cc3032521409cf22a852f2032eec099ed5967c0d000cec607a
    hash('John').then(result => console.log(result));

    // Expected hash for Doe:
    // 799ef92a11af918e3fb741df42934f3b568ed2d93ac1df74f1b8d41a27932a6f
    hash('Doe').then(result => console.log(result));
  }

  main();

Base64

  /**
   * @fileoverview Provides the hashing algorithm, as well as some valid hashes of
   * sample data for testing.
   */

  async function hash(token) {
    // Removes leading or trailing spaces and converts all characters to lowercase.
    const formattedToken = token.trim().toLowerCase();
    // Hashes the formatted string using the SHA-256 hashing algorithm.
    const hashBuffer = await crypto.subtle.digest(
        'SHA-256', (new TextEncoder()).encode(formattedToken));
    // Converts the hash buffer to a base64-encoded string and returns it.
    const base64Str = btoa(String.fromCharCode(...new Uint8Array(hashBuffer)));
    return base64Str;
  }

  function main() {
    // Expected hash for test@gmail.com:
    // h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
    hash('test@gmail.com').then(result => console.log(result));

    // Expected hash for +18005551212:
    // YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
    hash('+18005551212').then(result => console.log(result));

    // Expected hash for John: ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
    hash('John').then(result => console.log(result));

    // Expected hash for Doe: eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=
    hash('Doe').then(result => console.log(result));
  }

  main();

Python

Base16

  """Provides the hashing algorithm, as well as some valid hashes of sample data for testing. 
 Supports: Python 2, Python 3 
 Sample hashes: 
 - Email 'test@gmail.com': 87924606b4131a8aceeeae8868531fbb9712aaa07a5d3a756b26ce0f5d6ca674 
 - Phone '+18005551212':   61d9111bed3e6d9cfc1bc3b5cb35a402687c4f1546bee061a2bd444fbdd64c44 
 - First name 'John':      96d9632f363564cc3032521409cf22a852f2032eec099ed5967c0d000cec607a 
 - Last name 'Doe':        799ef92a11af918e3fb741df42934f3b568ed2d93ac1df74f1b8d41a27932a6f 
 """ 
 import 
  
 base64 
 import 
  
 hashlib 
 def 
  
 updm_hash 
 ( 
 token 
 ): 
 # Generates a SHA-256 hash of the input token after normalization. 
 return 
 hashlib 
 . 
 sha256 
 ( 
 token 
 . 
 strip 
 () 
 . 
 lower 
 () 
 . 
 encode 
 ( 
 'utf-8' 
 )) 
 . 
 hexdigest 
 () 
 def 
  
 print_updm_hash 
 ( 
 token 
 ): 
 # Prints the SHA-256 hash and the original token. 
 print 
 ( 
 'Hash: " 
 {} 
 " 
 \t 
 (Token: 
 {} 
 )' 
 . 
 format 
 ( 
 updm_hash 
 ( 
 token 
 ), 
 token 
 )) 
 def 
  
 main 
 (): 
 # Hashes and prints sample tokens. 
 print_updm_hash 
 ( 
 'test@gmail.com' 
 ) 
 print_updm_hash 
 ( 
 '+18005551212' 
 ) 
 print_updm_hash 
 ( 
 'John' 
 ) 
 print_updm_hash 
 ( 
 'Doe' 
 ) 
 if 
 __name__ 
 == 
 '__main__' 
 : 
 main 
 () 
 

Base64

  """Provides the hashing algorithm, as well as some valid hashes of sample data for testing. 
 Supports: Python 2, Python 3 
 Sample hashes: 
 - Email 'test@gmail.com': h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ= 
 - Phone '+18005551212':   YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ= 
 - First name 'John':      ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo= 
 - Last name 'Doe':        eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8= 
 """ 
 import 
  
 base64 
 import 
  
 hashlib 
 def 
  
 hash 
 ( 
 token 
 ): 
 # Generates a base64-encoded SHA-256 hash of a normalized input string. 
 return 
 base64 
 . 
 b64encode 
 ( 
 hashlib 
 . 
 sha256 
 ( 
 token 
 . 
 strip 
 () 
 . 
 lower 
 () 
 . 
 encode 
 ( 
 'utf-8' 
 )) 
 . 
 digest 
 ()) 
 . 
 decode 
 ( 
 'utf-8' 
 ) 
 def 
  
 print_hash 
 ( 
 token 
 , 
 expected 
 = 
 None 
 ): 
 # Computes and displays the hash of a token, with optional validation. 
 hashed 
 = 
 hash 
 ( 
 token 
 ) 
 if 
 expected 
 is 
 not 
 None 
 and 
 hashed 
 != 
 expected 
 : 
 print 
 ( 
 'ERROR: Incorrect hash for token " 
 {} 
 ". Expected " 
 {} 
 ", got " 
 {} 
 "' 
 . 
 format 
 ( 
 token 
 , 
 expected 
 , 
 hashed 
 )) 
 return 
 print 
 ( 
 'Hash: " 
 {} 
 " 
 \t 
 (Token: 
 {} 
 )' 
 . 
 format 
 ( 
 hashed 
 , 
 token 
 )) 
 def 
  
 main 
 (): 
 # Tests the hash function with sample tokens and expected results. 
 print_hash 
 ( 
 'test@gmail.com' 
 , 
 expected 
 = 
 'h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=' 
 ) 
 print_hash 
 ( 
 '+18005551212' 
 , 
 expected 
 = 
 'YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=' 
 ) 
 print_hash 
 ( 
 'John' 
 , 
 expected 
 = 
 'ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=' 
 ) 
 print_hash 
 ( 
 'Doe' 
 , 
 expected 
 = 
 'eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=' 
 ) 
 if 
 __name__ 
 == 
 '__main__' 
 : 
 main 
 () 
 

Go

Base16

  /*
  Provides the hashing algorithm, as well as some valid hashes of sample data for testing.

  Sample hashes:
  - Email 'test@gmail.com': 87924606b4131a8aceeeae8868531fbb9712aaa07a5d3a756b26ce0f5d6ca674
  - Phone '+18005551212':   61d9111bed3e6d9cfc1bc3b5cb35a402687c4f1546bee061a2bd444fbdd64c44
  - First name 'John':      96d9632f363564cc3032521409cf22a852f2032eec099ed5967c0d000cec607a
  - Last name 'Doe':        799ef92a11af918e3fb741df42934f3b568ed2d93ac1df74f1b8d41a27932a6f
  */
  package main

  import (
  	"crypto/sha256"
  	"fmt"
  	"strings"
  )

  // Hash hashes an email, phone, first name, or last name into the correct format.
  func Hash(token string) string {
  	formatted := strings.TrimSpace(strings.ToLower(token))
  	hashed := sha256.Sum256([]byte(formatted))
  	encoded := fmt.Sprintf("%x", hashed[:])
  	return encoded
  }

  // PrintHash prints the hash for a token.
  func PrintHash(token string) {
  	fmt.Printf("Hash: \"%s\"\t(Token: %s)\n", Hash(token), token)
  }

  func main() {
  	PrintHash("test@gmail.com")
  	PrintHash("+18005551212")
  	PrintHash("John")
  	PrintHash("Doe")
  }

Base64

  /*
  Provides the hashing algorithm, as well as some valid hashes of sample data for testing.

  Sample hashes:
  - Email 'test@gmail.com': h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
  - Phone '+18005551212':   YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
  - First name 'John':      ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
  - Last name 'Doe':        eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=
  */
  package main

  import (
  	"crypto/sha256"
  	"encoding/base64"
  	"fmt"
  	"strings"
  )

  // Hash hashes an email, phone, first name, or last name into the correct format.
  func Hash(token string) string {
  	formatted := strings.TrimSpace(strings.ToLower(token))
  	hashed := sha256.Sum256([]byte(formatted))
  	encoded := base64.StdEncoding.EncodeToString(hashed[:])
  	return encoded
  }

  // PrintHash prints the hash for a token.
  func PrintHash(token string) {
  	fmt.Printf("Hash: \"%s\"\t(Token: %s)\n", Hash(token), token)
  }

  func main() {
  	PrintHash("test@gmail.com")
  	PrintHash("+18005551212")
  	PrintHash("John")
  	PrintHash("Doe")
  }

Java

Base16

  package updm.hashing;

  import static java.nio.charset.StandardCharsets.UTF_8;

  import com.google.common.base.Ascii;
  import com.google.common.hash.Hashing;

  /**
   * Example of the UPDM hashing algorithm using hex-encoded SHA-256.
   *
   * <p>This uses Guava's Hashing to generate the hash: https://github.com/google/guava
   *
   * <p>Sample valid hashes:
   *
   * <ul>
   *   <li>Email "test@gmail.com": "87924606b4131a8aceeeae8868531fbb9712aaa07a5d3a756b26ce0f5d6ca674"
   *   <li>Phone "+18005551212": "61d9111bed3e6d9cfc1bc3b5cb35a402687c4f1546bee061a2bd444fbdd64c44"
   *   <li>First name "John": "96d9632f363564cc3032521409cf22a852f2032eec099ed5967c0d000cec607a"
   *   <li>Last name "Doe": "799ef92a11af918e3fb741df42934f3b568ed2d93ac1df74f1b8d41a27932a6f"
   * </ul>
   */
  public final class HashExample {

    private HashExample() {}

    public static String hash(String token) {
      // Normalizes and hashes the input token.
      String formattedToken = Ascii.toLowerCase(token).strip();
      return Hashing.sha256().hashString(formattedToken, UTF_8).toString();
    }

    public static void printHash(String token) {
      // Calculates and prints the token's hash.
      System.out.printf("Hash: \"%s\"\t(Token: %s)\n", hash(token), token);
    }

    public static void main(String[] args) {
      // Executes hash calculations and prints results for sample tokens.
      printHash("test@gmail.com");
      printHash("+18005551212");
      printHash("John");
      printHash("Doe");
    }
  }

Base64

  package updm.hashing;

  import static java.nio.charset.StandardCharsets.UTF_8;

  import java.security.MessageDigest;
  import java.security.NoSuchAlgorithmException;
  import java.util.Base64;

  /**
   * Example of the hashing algorithm.
   *
   * <p>Sample hashes:
   *
   * <ul>
   *   <li>Email 'test@gmail.com': h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
   *   <li>Phone '+18005551212': YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
   *   <li>First name 'John': ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
   *   <li>Last name 'Doe': eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=
   * </ul>
   */
  public final class HashExample {

    private HashExample() {}

    public static String hash(String token) {
      // Normalizes and hashes the input token using SHA-256 and Base64 encoding.
      String formattedToken = token.toLowerCase().strip();
      byte[] hash;
      try {
        hash = MessageDigest.getInstance("SHA-256").digest(formattedToken.getBytes(UTF_8));
      } catch (NoSuchAlgorithmException e) {
        throw new IllegalStateException("SHA-256 not supported", e);
      }
      return Base64.getEncoder().encodeToString(hash);
    }

    public static void printHash(String token) {
      // Calculates and prints the hash for the given token.
      System.out.printf("Hash: \"%s\"\t(Token: %s)\n", hash(token), token);
    }

    public static void main(String[] args) {
      // Executes hash calculations and prints results for sample tokens.
      printHash("test@gmail.com");
      printHash("+18005551212");
      printHash("John");
      printHash("Doe");
    }
  }

SQL

Base16

  /*
  Provides the hashing algorithm, as well as some valid hashes of sample data for testing.
  The following code uses Google Standard SQL and can be run on BigQuery to generate match tables from unhashed data.

  Sample hashes:
  - Email 'test@gmail.com': 87924606b4131a8aceeeae8868531fbb9712aaa07a5d3a756b26ce0f5d6ca674
  - Phone '+18005551212':   61d9111bed3e6d9cfc1bc3b5cb35a402687c4f1546bee061a2bd444fbdd64c44
  - First name 'John':      96d9632f363564cc3032521409cf22a852f2032eec099ed5967c0d000cec607a
  - Last name 'Doe':        799ef92a11af918e3fb741df42934f3b568ed2d93ac1df74f1b8d41a27932a6f

  The unhashed input table schema is assumed to be:
  - Column name: UserID, Type: String
  - Column name: Email, Type: String
  - Column name: Phone, Type: String
  - Column name: FirstName, Type: String
  - Column name: LastName, Type: String
  - Column name: PostalCode, Type: String
  - Column name: CountryCode, Type: String
  */

  -- Creates a new table with hashed versions of specified columns from the input table.
  CREATE TABLE `your_project_name.your_dataset_name.output_hashed_table_name` AS
  SELECT
    UserID,
    TO_HEX(SHA256(LOWER(Email))) AS Email,
    TO_HEX(SHA256(Phone)) AS Phone,
    TO_HEX(SHA256(LOWER(FirstName))) AS FirstName,
    TO_HEX(SHA256(LOWER(LastName))) AS LastName,
    PostalCode,
    CountryCode,
  FROM `your_project_name.your_dataset_name.input_unhashed_table_name`;

Base64

  /*
  Provides the hashing algorithm, as well as some valid hashes of sample data for testing.
  The following code uses Google Standard SQL and can be run on BigQuery to generate match tables from unhashed data.

  Sample hashes:
  - Email 'test@gmail.com': h5JGBrQTGorO7q6IaFMfu5cSqqB6XTp1aybOD11spnQ=
  - Phone '+18005551212':   YdkRG+0+bZz8G8O1yzWkAmh8TxVGvuBhor1ET73WTEQ=
  - First name 'John':      ltljLzY1ZMwwMlIUCc8iqFLyAy7sCZ7VlnwNAAzsYHo=
  - Last name 'Doe':        eZ75KhGvkY4/t0HfQpNPO1aO0tk6wd908bjUGieTKm8=

  The unhashed input table schema is assumed to be:
  - Column name: UserID, Type: String
  - Column name: Email, Type: String
  - Column name: Phone, Type: String
  - Column name: FirstName, Type: String
  - Column name: LastName, Type: String
  - Column name: PostalCode, Type: String
  - Column name: CountryCode, Type: String
  */

  -- Creates a new table with Base64-encoded SHA-256 hashes of specified columns.
  CREATE TABLE `your_project_name.your_dataset_name.output_hashed_table_name` AS
  SELECT
    UserID,
    TO_BASE64(SHA256(LOWER(Email))) AS Email,
    TO_BASE64(SHA256(Phone)) AS Phone,
    TO_BASE64(SHA256(LOWER(FirstName))) AS FirstName,
    TO_BASE64(SHA256(LOWER(LastName))) AS LastName,
    PostalCode,
    CountryCode,
  FROM `your_project_name.your_dataset_name.input_unhashed_table_name`;

Join keys

Some combinations of user-provided data are stronger than others. The following user-provided data combinations are ranked by relative strength. If you use an address, you must include first name, last name, country, and zip code.

  1. Email, Phone, Address (strongest)
  2. Phone, Address
  3. Email, Address
  4. Email, Phone
  5. Address
  6. Phone
  7. Email (weakest)

Create a match table

  1. Click Connections > Create connection > User-provided data matching.
  2. Choose a data source, then click Connect.
  3. Authenticate, if prompted, then click Next:

    BigQuery

    Click Apply to grant access to BigQuery.

    Cloud Storage

    Click Apply to grant access to Cloud Storage.

    MySQL

    Enter your MySQL database location, port, username, and password.

    S3

    Enter your Amazon S3 secret access key.

    PostgreSQL

    Enter your PostgreSQL database location, port, username, password, and database.

    Redshift

    Enter your Redshift database location, port, username, password, and database.

    sFTP

    Enter your sFTP server location, username, and password.

    Snowflake

    Enter your Snowflake account identifier, username, and password.

  4. Configure your data source, then click Next:

    BigQuery

    Select the BigQuery table to import.

    Cloud Storage

    Enter the gsutil path, such as gs://my-bucket/folder/, and select your file's formatting.

    If this is the first time you have connected this resource, an alert appears. Click Apply to grant access, then click Next. Note: you must have a role with permission to delegate storage.buckets.setIamPolicy for the relevant bucket.

    MySQL

    Select the MySQL database and table that you want to use.

    S3

    Enter the URI to the file you want to upload, relative to the host address.

    PostgreSQL

    Enter the PostgreSQL schema and table (or view) name.

    Redshift

    Enter the Redshift schema and table (or view) name. By default, Redshift uses database location URLs that follow this template: cluster-identifier.account-number.aws-region.redshift.amazonaws.com .

    sFTP

    Enter the file path and name, formatted as /PATH/FILENAME.csv

    Snowflake

    Enter the Snowflake database, schema, and table (or view) that you want to use.

  5. Select a BigQuery dataset to use as an intermediary destination, then click Next. This step ensures that your data is correctly formatted.
  6. Optional: Modify the format of your data. Transformations include compute hash, lower/upper case formatting, and merging/splitting fields.
    1. Click Action > Transform.
    2. In the panel that appears, click Add transformation or Add another transformation.
    3. Choose a transformation type from the dropdown menu and enter the requirements.
    4. Click Save.
  7. Choose at least one join key and map the fields that you will use. Ads Data Hub automatically maps fields with identical names, indicated by an icon. Make any necessary edits, then click Next.
  8. Set a schedule:
    1. Name your connection.
    2. Set a frequency, dictating how often data will be imported into the dataset that you selected in the previous step. Each run will overwrite data in the destination table.
    3. Specify how you'd like user ID collisions to be handled. You can choose between keeping the existing match or overwriting with new data.
  9. Click Finish. Match tables are generally ready to be queried 12 hours after they are created.

View connection details

The connection details page gives you information on a given connection's recent runs and errors. To view details for a specific connection:

  1. Click Connections.
  2. Click the name of the connection to view its details.
  3. You can now see the connection's details and recent runs. Each run shows two possible types of errors: connection-level errors (the connection didn't run) and row-level errors (a row wasn't imported).
    1. A Failed status indicates that the entire connection failed to run (for example, due to a service account permission issue). Click the error status to see which errors impacted the connection.
    2. A Completed status indicates that the connection ran successfully. However, there may still be row-level errors—indicated by a non-zero value in the "Rows with errors" column. Click the value to learn more about which records failed.

Edit a connection

The following details can be edited:

  • Connection name
  • Schedule
  • Destination table
  • Field mapping

Editing the data source isn't supported. To change a data source, create a new connection and delete the old one.

To edit connection details:

  1. Click Connections.
  2. Click the name of the connection you want to edit.
  3. Edit the details you want to change:
    • Connection name: Click Edit, enter the new name, then press Enter.
    • Schedule: Click Edit, set the new schedule, then click Save.
    • Destination table: Click Edit, enter the new destination name, then click Save.
    • Field mapping: Click Edit, make changes to fields, then click Save.
  4. Click the close icon.

Query matched data

Query the match tables

When your match tables contain enough data to satisfy privacy checks, you're ready to run queries against the tables.

The original table for first-party data (1PD) is represented by my_data. This table includes both personally identifiable information (PII) and non-PII data. Because it represents all of the 1PD data in scope, using the original table, rather than only a match table, can enrich your reports with more insights.

Each table in the Ads Data Hub schema containing a user_id field is accompanied by a match table. For example, for the adh.google_ads_impressions table, Ads Data Hub also generates a match table called adh.google_ads_impressions_updm containing your user IDs. Separate match tables are created for policy-isolated network tables. For example, for the adh.google_ads_impressions_policy_isolated_network table, Ads Data Hub also generates a match table called adh.google_ads_impressions_policy_isolated_network_updm containing your user IDs.

These tables contain a subset of the users available in the original tables, where there is a match on the user_id . For example, if the original table contains data for User A and User B, but only User A is matched, then User B won't be in the match table.

The match tables contain an additional column called customer_data_user_id , which stores the user identifier as BYTES.

It's important to consider the field's type when writing your queries. SQL comparison operators expect that the literals you're comparing are of the same type. Depending on how the user_id is stored in your table of first-party data, you may need to encode the values in the table before matching the data. You need to cast your join key into BYTES for successful matches:

  JOIN ON
    adh.google_ads_impressions_updm.customer_data_user_id =
      CAST(my_data.user_id AS BYTES)

Additionally, string comparisons in SQL are case-sensitive, so you may need to normalize the strings on both sides of your comparison to ensure that they can be accurately compared.
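
For example, because TO_HEX produces lowercase hex strings, hashes stored with different casing won't match until both sides are normalized. A minimal illustration in Google Standard SQL:

  -- String equality is case-sensitive; normalize both sides before comparing.
  SELECT
    'ABC123' = 'abc123' AS naive_match,                     -- false
    LOWER('ABC123') = LOWER('abc123') AS normalized_match;  -- true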

Sample queries

Count matched users

This query counts the number of matched users in your Google Ads impressions table.

  /* Count matched users in Google Ads impressions table */
  SELECT COUNT(DISTINCT user_id)
  FROM adh.google_ads_impressions_updm

Calculate match rate

Not all users are eligible for matching. For example, signed-out users, children, and unconsented users are not matched through UPDM. You can use the is_updm_eligible field to calculate more accurate UPDM match rates. Note that the is_updm_eligible field was available beginning on October 1, 2024. You cannot use this field to calculate match rates before that date.

  /* Calculate the UPDM match rate */
  CREATE TEMP TABLE total_events
  OPTIONS(privacy_checked_export=TRUE) AS
  SELECT
    customer_id,
    COUNT(*) AS n
  FROM adh.google_ads_impressions
  WHERE is_updm_eligible
  GROUP BY 1;

  CREATE TEMP TABLE matched_events
  OPTIONS(privacy_checked_export=TRUE) AS
  SELECT
    customer_id,
    COUNT(*) AS n
  FROM adh.google_ads_impressions_updm
  GROUP BY 1;

  SELECT
    customer_id,
    SAFE_DIVIDE(matched_events.n, total_events.n) AS match_rate
  FROM total_events
  LEFT JOIN matched_events
  USING (customer_id)

Join first-party and Google Ads data

This query shows how to join first-party data with Google Ads data:

  /* Join first-party data with Google Ads data. The customer_data_user_id field
  contains your ID as BYTES. You need to cast your join key into BYTES for
  successful matches. */
  SELECT
    inventory_type,
    COUNT(*) AS impressions
  FROM
    adh.yt_reserve_impressions_updm AS google_data_imp
  LEFT JOIN
    `my_data`
  ON
    google_data_imp.customer_data_user_id = CAST(my_data.user_id AS BYTES)
  GROUP BY inventory_type

UPDM FAQs

For a list of FAQs related to UPDM, see UPDM FAQs.
