BigQuery external connections
An external data source is a data source that you can query directly from
BigQuery, even though the data is not stored in
BigQuery storage. For example, you might have data in a
different Google Cloud database, in files in Cloud Storage, or in a
different cloud product altogether that you would like to analyze in
BigQuery, but that you aren't prepared to migrate.
Use cases for external data sources include the following:
- For extract-load-transform (ELT) workloads, loading and cleaning your data
  in one pass and writing the cleaned result into BigQuery
  storage, by using a `CREATE TABLE ... AS SELECT` query.
- Joining BigQuery tables with frequently changing data from
  an external data source. By querying the external data source directly, you
  don't need to reload the data into BigQuery storage every
  time it changes.
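As a sketch of the ELT pattern, the following hypothetical query reads from a table backed by an external data source and writes the cleaned result into BigQuery storage. The dataset and table names are illustrative, not from this guide:

```sql
-- Read from a table backed by an external data source, clean the rows,
-- and materialize the result as a native BigQuery table.
CREATE TABLE mydataset.clean_events AS
SELECT
  user_id,
  LOWER(country_code) AS country_code,  -- normalize casing
  event_timestamp
FROM mydataset.raw_events_external      -- table over the external source
WHERE user_id IS NOT NULL;              -- drop incomplete rows
```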
As an Ads Data Hub for Marketers customer, you can leverage this BigQuery feature
to easily bring in first-party data from other sources, such as S3 and Azure,
and join it to Google advertising data in your queries.
For complete details on connecting external data sources to BigQuery, see
Introduction to external data sources.

Limitations

- The following locations are supported. If your AWS or Azure data is in an
  unsupported region, you could also consider using BigQuery Data Transfer Service.
  - AWS - US East (N. Virginia) (aws-us-east-1)
  - Azure - East US 2 (azure-eastus2)
- Jobs that are run on data from BigQuery connections:
  - are subject to the same aggregation requirements as other jobs in Ads Data Hub
  - must adhere to Google's policies
Amazon S3

The following is a high-level overview of the steps required to export data from
Amazon S3 to BigQuery for use in Ads Data Hub. Refer to Connect to Amazon S3 for full details.
Create an AWS IAM policy for BigQuery. After the policy is created, the
Amazon Resource Name (ARN) can be found on the Policy details page.
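The shape of such a policy is sketched below. The bucket name is a placeholder; grant only the access BigQuery needs, roughly listing the bucket and reading its objects:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::YOUR_BUCKET"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": ["arn:aws:s3:::YOUR_BUCKET/*"]
    }
  ]
}
```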
Create an AWS IAM role for BigQuery, using the policy created in the
previous step.
Create a connection in BigQuery. Create a connection in a BigQuery project
that Ads Data Hub has access to, for example, your admin project. The
BigQuery Google identity, which will be used in the next step, is shown on
the Connection info page.
Add a trust relationship to the AWS role. On the AWS IAM page, edit the
role created in the earlier step:
Modify the maximum session duration to 12 hours.
Add a trust policy to the AWS role using the BigQuery Google identity
created in the previous step.
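A trust policy of this kind typically lets the BigQuery Google identity assume the role through web identity federation. A sketch, with the identity value as a placeholder, might look like the following; confirm the exact form in the Connect to Amazon S3 guide:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"Federated": "accounts.google.com"},
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "accounts.google.com:sub": "BIGQUERY_GOOGLE_IDENTITY"
        }
      }
    }
  ]
}
```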
Load data into the BigQuery dataset.
Query the data in Ads Data Hub. Learn about joining first-party data.
Optional: Schedule continuous data load in BigQuery.
Azure Blob Storage
The following is a high-level overview of the steps required to export data from
Azure Blob Storage to BigQuery for use in Ads Data Hub. Refer to Connect to Blob Storage for full details.
Create an application in your Azure tenant.
Create a connection in BigQuery.
- Tenant ID is the directory ID from the previous step.
- Federated Application (Client) ID is the Application (client) ID
  from the previous step.
- BigQuery Google identity will be used in the next step.
Add a federated credential in Azure.
For Subject identifier, use the BigQuery Google identity from the
previous step.
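As an illustration, the federated identity credential on the Azure application might look roughly like the following; the field values here are placeholders, and the Connect to Blob Storage guide gives the exact values to use:

```json
{
  "name": "bigquery-federated-credential",
  "issuer": "https://accounts.google.com",
  "subject": "BIGQUERY_GOOGLE_IDENTITY",
  "audiences": ["api://AzureADTokenExchange"]
}
```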
Assign a role to BigQuery's Azure applications, granting Storage Blob Data
Reader access.
Load data into the BigQuery dataset.
Query the data in Ads Data Hub. Learn about joining first-party data.
Optional: Schedule continuous data load in BigQuery.