Stay organized with collectionsSave and categorize content based on your preferences.
Handle quota errors by calling ML.GENERATE_EMBEDDING iteratively
This tutorial shows you how to use the BigQuerybqutil.procedure.bqml_generate_embeddingspublic stored procedure to iterate
through calls to theML.GENERATE_EMBEDDINGfunction.
Calling the function iteratively lets you address any retryable errors that
occur due to exceeding thequotas and limitsthat apply to
the function.
To review the source code for thebqutil.procedure.bqml_generate_embeddingsstored procedure in GitHub, seebqml_generate_embeddings.sqlx.
For more information about the stored procedure parameters and usage, see theREADME file.
This tutorial guides you through the following tasks:
Iterating through calls to theML.GENERATE_EMBEDDINGfunction, using the
remote model and thebigquery-public-data.bbc_news.fulltextpublic data
table with thebqutil.procedure.bqml_generate_embeddingsstored procedure.
Required permissions
To run this tutorial, you need the following Identity and Access Management (IAM)
roles:
Create and use BigQuery datasets, connections, and models:
BigQuery Admin (roles/bigquery.admin).
Grant permissions to the connection's service account: Project IAM Admin
(roles/resourcemanager.projectIamAdmin).
These predefined roles contain the permissions required to perform the tasks in
this document. To see the exact permissions that are required, expand theRequired permissionssection:
Required permissions
Create a dataset:bigquery.datasets.create
Create, delegate, and use a connection:bigquery.connections.*
Set the default connection:bigquery.config.*
Set service account permissions:resourcemanager.projects.getIamPolicyandresourcemanager.projects.setIamPolicy
In the Google Cloud console, on the project selector page,
select or create a Google Cloud project.
Roles required to select or create a project
Select a project: Selecting a project doesn't require a specific
IAM role—you can select any project that you've been
granted a role on.
Create a project: To create a project, you need the Project Creator
(roles/resourcemanager.projectCreator), which contains theresourcemanager.projects.createpermission.Learn how to grant
roles.
Enable the BigQuery, BigQuery Connection, and Vertex AI APIs.
Roles required to enable APIs
To enable APIs, you need the Service Usage Admin IAM
role (roles/serviceusage.serviceUsageAdmin), which
contains theserviceusage.services.enablepermission.Learn how to grant
roles.
The query takes several seconds to complete, after which theembeddingmodel appears in thesampledataset in theExplorerpane.
Because the query uses aCREATE MODELstatement to create a model, there
are no query results.
Run the stored procedure
Run thebqutil.procedure.bqml_generate_embeddingsstored procedure, which
iterates through calls to theML.GENERATE_EMBEDDINGfunction
using thetarget_dataset.embedding_modelmodel and thebigquery-public-data.bbc_news.fulltextpublic data table:
In the Google Cloud console, go to theBigQuerypage.
CALL`bqutil.procedure.bqml_generate_embeddings`("bigquery-public-data.bbc_news.fulltext",-- source table"PROJECT_ID.target_dataset.news_body_embeddings",-- destination table"PROJECT_ID.target_dataset.embedding_model",-- model"body",-- content column["filename"],-- key columns'{}'-- optional arguments encoded as a JSON string);
ReplacePROJECT_IDwith the project ID of the
project you are using for this tutorial.
The stored procedure creates atarget_dataset.news_body_embeddingstable
to contain the output of theML.GENERATE_EMBEDDINGfunction.
When the query is finished running, confirm that there are no rows
in thetarget_dataset.news_body_embeddingstable that contain a retryable
error. In the query editor, run the following statement:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-09 UTC."],[[["\u003cp\u003eThis tutorial demonstrates how to use the \u003ccode\u003ebqutil.procedure.bqml_generate_embeddings\u003c/code\u003e stored procedure to iteratively call the \u003ccode\u003eML.GENERATE_EMBEDDING\u003c/code\u003e function in BigQuery, helping to handle errors caused by exceeding usage quotas.\u003c/p\u003e\n"],["\u003cp\u003eThe tutorial guides you through creating a remote model over a \u003ccode\u003etext-embedding-005\u003c/code\u003e model and utilizing the \u003ccode\u003ebqutil.procedure.bqml_generate_embeddings\u003c/code\u003e stored procedure with a public dataset to generate embeddings.\u003c/p\u003e\n"],["\u003cp\u003eKey steps involve creating a dataset and a connection, enabling the necessary APIs (BigQuery, BigQuery Connection, and Vertex AI), granting permissions to the connection's service account, and running the stored procedure to manage the iterative process of the model.\u003c/p\u003e\n"],["\u003cp\u003eThe document outlines the costs associated with using BigQuery ML and Vertex AI, as well as the necessary setup and permissions required to complete the tutorial.\u003c/p\u003e\n"]]],[],null,["Handle quota errors by calling ML.GENERATE_EMBEDDING iteratively\n\nThis tutorial shows you how to use the BigQuery\n`bqutil.procedure.bqml_generate_embeddings` public stored procedure to iterate\nthrough calls to the\n[`ML.GENERATE_EMBEDDING` function](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-embedding).\nCalling the function iteratively lets you address any retryable errors that\noccur due to exceeding the\n[quotas and limits](/bigquery/quotas#cloud_ai_service_functions) that apply to\nthe function.\n\nTo review the source code for the `bqutil.procedure.bqml_generate_embeddings`\nstored procedure in GitHub, see\n[`bqml_generate_embeddings.sqlx`](https://github.com/GoogleCloudPlatform/bigquery-utils/blob/master/stored_procedures/definitions/bqml_generate_embeddings.sqlx).\nFor more information about the stored procedure parameters and usage, see the\n[README file](https://github.com/GoogleCloudPlatform/bigquery-utils/blob/master/stored_procedures/README.md#bqml_generate_embeddings-source_table-string-target_table-string-ml_model-string-content_column-string-key_columns-array-options_string-string).\n\nThis tutorial guides you through the following tasks:\n\n- Creating a [remote model over a `text-embedding-005` model](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model).\n- Iterating through calls to the `ML.GENERATE_EMBEDDING` function, using the remote model and the `bigquery-public-data.bbc_news.fulltext` public data table with the `bqutil.procedure.bqml_generate_embeddings` stored procedure.\n\nRequired permissions\n\nTo run this tutorial, you need the following Identity and Access Management (IAM)\nroles:\n\n- Create and use BigQuery datasets, connections, and models: BigQuery Admin (`roles/bigquery.admin`).\n- Grant permissions to the connection's service account: Project IAM Admin (`roles/resourcemanager.projectIamAdmin`).\n\nThese predefined roles contain the permissions required to perform the tasks in\nthis document. To see the exact permissions that are required, expand the\n**Required permissions** section: \n\nRequired permissions\n\n- Create a dataset: `bigquery.datasets.create`\n- Create, delegate, and use a connection: `bigquery.connections.*`\n- Set the default connection: `bigquery.config.*`\n- Set service account permissions: `resourcemanager.projects.getIamPolicy` and `resourcemanager.projects.setIamPolicy`\n- Create a model and run inference:\n - `bigquery.jobs.create`\n - `bigquery.models.create`\n - `bigquery.models.getData`\n - `bigquery.models.updateData`\n - `bigquery.models.updateMetadata`\n\nYou might also be able to get these permissions with\n[custom roles](/iam/docs/creating-custom-roles) or other\n[predefined roles](/iam/docs/understanding-roles).\n\nCosts\n\n\nIn this document, you use the following billable components of Google Cloud:\n\n\n- **BigQuery ML**: You incur costs for the data that you process in BigQuery.\n- **Vertex AI**: You incur costs for calls to the Vertex AI model.\n\n\nTo generate a cost estimate based on your projected usage,\nuse the [pricing calculator](/products/calculator). \nNew Google Cloud users might be eligible for a [free trial](/free). \n\n\u003cbr /\u003e\n\nFor more information about BigQuery pricing, see\n[BigQuery pricing](/bigquery/pricing).\n\nFor more information about Vertex AI pricing, see\n[Vertex AI pricing](/vertex-ai/generative-ai/pricing).\n\nBefore you begin\n\n1. In the Google Cloud console, on the project selector page,\n select or create a Google Cloud project.\n\n | **Note**: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.\n\n [Go to project selector](https://console.cloud.google.com/projectselector2/home/dashboard)\n2.\n [Verify that billing is enabled for your Google Cloud project](/billing/docs/how-to/verify-billing-enabled#confirm_billing_is_enabled_on_a_project).\n\n3.\n\n\n Enable the BigQuery, BigQuery Connection, and Vertex AI APIs.\n\n\n [Enable the APIs](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com,bigqueryconnection.googleapis.com,aiplatform.googleapis.com)\n\nCreate a dataset\n\nCreate a BigQuery dataset to store your models and sample data:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to the **BigQuery** page](https://console.cloud.google.com/bigquery)\n2. In the **Explorer** pane, click your project name.\n\n3. Click more_vert **View actions \\\u003e Create dataset**.\n\n4. On the **Create dataset** page, do the following:\n\n 1. For **Dataset ID** , enter `target_dataset`.\n\n 2. For **Location type** , select **Multi-region** , and then select\n **US (multiple regions in United States)**.\n\n 3. Leave the remaining default settings as they are, and click\n **Create dataset**.\n\nCreate the text embedding generation model\n\nCreate a remote model that represents a hosted Vertex AI\n`text-embedding-005` model:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to BigQuery](https://console.cloud.google.com/bigquery)\n2. In the query editor, run the following statement:\n\n ```googlesql\n CREATE OR REPLACE MODEL `target_dataset.embedding_model`\n REMOTE WITH CONNECTION DEFAULT\n OPTIONS (ENDPOINT = 'text-embedding-005');\n ```\n\n The query takes several seconds to complete, after which the\n `embedding` model appears in the `sample` dataset in the **Explorer** pane.\n Because the query uses a `CREATE MODEL` statement to create a model, there\n are no query results.\n\nRun the stored procedure\n\nRun the `bqutil.procedure.bqml_generate_embeddings` stored procedure, which\niterates through calls to the `ML.GENERATE_EMBEDDING` function\nusing the `target_dataset.embedding_model` model and the\n`bigquery-public-data.bbc_news.fulltext` public data table:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to BigQuery](https://console.cloud.google.com/bigquery)\n2. In the query editor, run the following statement:\n\n ```googlesql\n CALL `bqutil.procedure.bqml_generate_embeddings`(\n \"bigquery-public-data.bbc_news.fulltext\", -- source table\n \"\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e.target_dataset.news_body_embeddings\", -- destination table\n \"\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e.target_dataset.embedding_model\", -- model\n \"body\", -- content column\n [\"filename\"], -- key columns\n '{}' -- optional arguments encoded as a JSON string\n );\n ```\n\n Replace \u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e with the project ID of the\n project you are using for this tutorial.\n\n The stored procedure creates a `target_dataset.news_body_embeddings` table\n to contain the output of the `ML.GENERATE_EMBEDDING` function.\n3. When the query is finished running, confirm that there are no rows\n in the `target_dataset.news_body_embeddings` table that contain a retryable\n error. In the query editor, run the following statement:\n\n ```googlesql\n SELECT *\n FROM `target_dataset.news_body_embeddings`\n WHERE ml_generate_embedding_status LIKE '%A retryable error occurred%';\n ```\n\n The query returns the message `No data to display`.\n\nClean up\n\n| **Caution** : Deleting a project has the following effects:\n|\n| - **Everything in the project is deleted.** If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.\n| - **Custom project IDs are lost.** When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an `appspot.com` URL, delete selected resources inside the project instead of deleting the whole project.\n|\n|\n| If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects\n| can help you avoid exceeding project quota limits.\n1. In the Google Cloud console, go to the **Manage resources** page.\n\n [Go to Manage resources](https://console.cloud.google.com/iam-admin/projects)\n2. In the project list, select the project that you want to delete, and then click **Delete**.\n3. In the dialog, type the project ID, and then click **Shut down** to delete the project.\n\n\u003cbr /\u003e"]]