Stay organized with collectionsSave and categorize content based on your preferences.
Handle quota errors by calling ML.GENERATE_EMBEDDING iteratively
This tutorial shows you how to use the BigQuerybqutil.procedure.bqml_generate_embeddingspublic stored procedure to iterate
through calls to theML.GENERATE_EMBEDDINGfunction.
Calling the function iteratively lets you address any retryable errors that
occur due to exceeding thequotas and limitsthat apply to
the function.
To review the source code for thebqutil.procedure.bqml_generate_embeddingsstored procedure in GitHub, seebqml_generate_embeddings.sqlx.
For more information about the stored procedure parameters and usage, see theREADME file.
This tutorial guides you through the following tasks:
Iterating through calls to theML.GENERATE_EMBEDDINGfunction, using the
remote model and thebigquery-public-data.bbc_news.fulltextpublic data
table with thebqutil.procedure.bqml_generate_embeddingsstored procedure.
Required permissions
To run this tutorial, you need the following Identity and Access Management (IAM)
roles:
Create and use BigQuery datasets, connections, and models:
BigQuery Admin (roles/bigquery.admin).
Grant permissions to the connection's service account: Project IAM Admin
(roles/resourcemanager.projectIamAdmin).
These predefined roles contain the permissions required to perform the tasks in
this document. To see the exact permissions that are required, expand theRequired permissionssection:
Required permissions
Create a dataset:bigquery.datasets.create
Create, delegate, and use a connection:bigquery.connections.*
Set the default connection:bigquery.config.*
Set service account permissions:resourcemanager.projects.getIamPolicyandresourcemanager.projects.setIamPolicy
The query takes several seconds to complete, after which theembeddingmodel appears in thesampledataset in theExplorerpane.
Because the query uses aCREATE MODELstatement to create a model, there
are no query results.
Run the stored procedure
Run thebqutil.procedure.bqml_generate_embeddingsstored procedure, which
iterates through calls to theML.GENERATE_EMBEDDINGfunction
using thetarget_dataset.embedding_modelmodel and thebigquery-public-data.bbc_news.fulltextpublic data table:
In the Google Cloud console, go to theBigQuerypage.
CALL`bqutil.procedure.bqml_generate_embeddings`("bigquery-public-data.bbc_news.fulltext",-- source table"PROJECT_ID.target_dataset.news_body_embeddings",-- destination table"PROJECT_ID.target_dataset.embedding_model",-- model"body",-- content column["filename"],-- key columns'{}'-- optional arguments encoded as a JSON string);
ReplacePROJECT_IDwith the project ID of the
project you are using for this tutorial.
The stored procedure creates atarget_dataset.news_body_embeddingstable
to contain the output of theML.GENERATE_EMBEDDINGfunction.
When the query is finished running, confirm that there are no rows
in thetarget_dataset.news_body_embeddingstable that contain a retryable
error. In the query editor, run the following statement:
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-04 UTC."],[[["\u003cp\u003eThis tutorial demonstrates how to use the \u003ccode\u003ebqutil.procedure.bqml_generate_embeddings\u003c/code\u003e stored procedure to iteratively call the \u003ccode\u003eML.GENERATE_EMBEDDING\u003c/code\u003e function in BigQuery, helping to handle errors caused by exceeding usage quotas.\u003c/p\u003e\n"],["\u003cp\u003eThe tutorial guides you through creating a remote model over a \u003ccode\u003etext-embedding-005\u003c/code\u003e model and utilizing the \u003ccode\u003ebqutil.procedure.bqml_generate_embeddings\u003c/code\u003e stored procedure with a public dataset to generate embeddings.\u003c/p\u003e\n"],["\u003cp\u003eKey steps involve creating a dataset and a connection, enabling the necessary APIs (BigQuery, BigQuery Connection, and Vertex AI), granting permissions to the connection's service account, and running the stored procedure to manage the iterative process of the model.\u003c/p\u003e\n"],["\u003cp\u003eThe document outlines the costs associated with using BigQuery ML and Vertex AI, as well as the necessary setup and permissions required to complete the tutorial.\u003c/p\u003e\n"]]],[],null,["Handle quota errors by calling ML.GENERATE_EMBEDDING iteratively\n\nThis tutorial shows you how to use the BigQuery\n`bqutil.procedure.bqml_generate_embeddings` public stored procedure to iterate\nthrough calls to the\n[`ML.GENERATE_EMBEDDING` function](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-generate-embedding).\nCalling the function iteratively lets you address any retryable errors that\noccur due to exceeding the\n[quotas and limits](/bigquery/quotas#cloud_ai_service_functions) that apply to\nthe function.\n\nTo review the source code for the `bqutil.procedure.bqml_generate_embeddings`\nstored procedure in GitHub, see\n[`bqml_generate_embeddings.sqlx`](https://github.com/GoogleCloudPlatform/bigquery-utils/blob/master/stored_procedures/definitions/bqml_generate_embeddings.sqlx).\nFor more information about the stored procedure parameters and usage, see the\n[README file](https://github.com/GoogleCloudPlatform/bigquery-utils/blob/master/stored_procedures/README.md#bqml_generate_embeddings-source_table-string-target_table-string-ml_model-string-content_column-string-key_columns-array-options_string-string).\n\nThis tutorial guides you through the following tasks:\n\n- Creating a [remote model over a `text-embedding-005` model](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-remote-model).\n- Iterating through calls to the `ML.GENERATE_EMBEDDING` function, using the remote model and the `bigquery-public-data.bbc_news.fulltext` public data table with the `bqutil.procedure.bqml_generate_embeddings` stored procedure.\n\nRequired permissions\n\nTo run this tutorial, you need the following Identity and Access Management (IAM)\nroles:\n\n- Create and use BigQuery datasets, connections, and models: BigQuery Admin (`roles/bigquery.admin`).\n- Grant permissions to the connection's service account: Project IAM Admin (`roles/resourcemanager.projectIamAdmin`).\n\nThese predefined roles contain the permissions required to perform the tasks in\nthis document. To see the exact permissions that are required, expand the\n**Required permissions** section: \n\nRequired permissions\n\n- Create a dataset: `bigquery.datasets.create`\n- Create, delegate, and use a connection: `bigquery.connections.*`\n- Set the default connection: `bigquery.config.*`\n- Set service account permissions: `resourcemanager.projects.getIamPolicy` and `resourcemanager.projects.setIamPolicy`\n- Create a model and run inference:\n - `bigquery.jobs.create`\n - `bigquery.models.create`\n - `bigquery.models.getData`\n - `bigquery.models.updateData`\n - `bigquery.models.updateMetadata`\n\nYou might also be able to get these permissions with\n[custom roles](/iam/docs/creating-custom-roles) or other\n[predefined roles](/iam/docs/understanding-roles).\n\nCosts\n\n\nIn this document, you use the following billable components of Google Cloud:\n\n\n- **BigQuery ML**: You incur costs for the data that you process in BigQuery.\n- **Vertex AI**: You incur costs for calls to the Vertex AI model.\n\n\nTo generate a cost estimate based on your projected usage,\nuse the [pricing calculator](/products/calculator). \nNew Google Cloud users might be eligible for a [free trial](/free). \n\n\u003cbr /\u003e\n\nFor more information about BigQuery pricing, see\n[BigQuery pricing](/bigquery/pricing).\n\nFor more information about Vertex AI pricing, see\n[Vertex AI pricing](/vertex-ai/generative-ai/pricing).\n\nBefore you begin\n\n1. In the Google Cloud console, on the project selector page,\n select or create a Google Cloud project.\n\n | **Note**: If you don't plan to keep the resources that you create in this procedure, create a project instead of selecting an existing project. After you finish these steps, you can delete the project, removing all resources associated with the project.\n\n [Go to project selector](https://console.cloud.google.com/projectselector2/home/dashboard)\n2.\n [Verify that billing is enabled for your Google Cloud project](/billing/docs/how-to/verify-billing-enabled#confirm_billing_is_enabled_on_a_project).\n\n3.\n\n\n Enable the BigQuery, BigQuery Connection, and Vertex AI APIs.\n\n\n [Enable the APIs](https://console.cloud.google.com/flows/enableapi?apiid=bigquery.googleapis.com,bigqueryconnection.googleapis.com,aiplatform.googleapis.com)\n\nCreate a dataset\n\nCreate a BigQuery dataset to store your models and sample data:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to the **BigQuery** page](https://console.cloud.google.com/bigquery)\n2. In the **Explorer** pane, click your project name.\n\n3. Click more_vert **View actions \\\u003e Create dataset**.\n\n4. On the **Create dataset** page, do the following:\n\n 1. For **Dataset ID** , enter `target_dataset`.\n\n 2. For **Location type** , select **Multi-region** , and then select\n **US (multiple regions in United States)**.\n\n 3. Leave the remaining default settings as they are, and click\n **Create dataset**.\n\nCreate the text embedding generation model\n\nCreate a remote model that represents a hosted Vertex AI\n`text-embedding-005` model:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to BigQuery](https://console.cloud.google.com/bigquery)\n2. In the query editor, run the following statement:\n\n ```googlesql\n CREATE OR REPLACE MODEL `target_dataset.embedding_model`\n REMOTE WITH CONNECTION DEFAULT\n OPTIONS (ENDPOINT = 'text-embedding-005');\n ```\n\n The query takes several seconds to complete, after which the\n `embedding` model appears in the `sample` dataset in the **Explorer** pane.\n Because the query uses a `CREATE MODEL` statement to create a model, there\n are no query results.\n\nRun the stored procedure\n\nRun the `bqutil.procedure.bqml_generate_embeddings` stored procedure, which\niterates through calls to the `ML.GENERATE_EMBEDDING` function\nusing the `target_dataset.embedding_model` model and the\n`bigquery-public-data.bbc_news.fulltext` public data table:\n\n1. In the Google Cloud console, go to the **BigQuery** page.\n\n [Go to BigQuery](https://console.cloud.google.com/bigquery)\n2. In the query editor, run the following statement:\n\n ```googlesql\n CALL `bqutil.procedure.bqml_generate_embeddings`(\n \"bigquery-public-data.bbc_news.fulltext\", -- source table\n \"\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e.target_dataset.news_body_embeddings\", -- destination table\n \"\u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e.target_dataset.embedding_model\", -- model\n \"body\", -- content column\n [\"filename\"], -- key columns\n '{}' -- optional arguments encoded as a JSON string\n );\n ```\n\n Replace \u003cvar translate=\"no\"\u003ePROJECT_ID\u003c/var\u003e with the project ID of the\n project you are using for this tutorial.\n\n The stored procedure creates a `target_dataset.news_body_embeddings` table\n to contain the output of the `ML.GENERATE_EMBEDDING` function.\n3. When the query is finished running, confirm that there are no rows\n in the `target_dataset.news_body_embeddings` table that contain a retryable\n error. In the query editor, run the following statement:\n\n ```googlesql\n SELECT *\n FROM `target_dataset.news_body_embeddings`\n WHERE ml_generate_embedding_status LIKE '%A retryable error occurred%';\n ```\n\n The query returns the message `No data to display`.\n\nClean up\n\n| **Caution** : Deleting a project has the following effects:\n|\n| - **Everything in the project is deleted.** If you used an existing project for the tasks in this document, when you delete it, you also delete any other work you've done in the project.\n| - **Custom project IDs are lost.** When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an `appspot.com` URL, delete selected resources inside the project instead of deleting the whole project.\n|\n|\n| If you plan to explore multiple architectures, tutorials, or quickstarts, reusing projects\n| can help you avoid exceeding project quota limits.\n1. In the Google Cloud console, go to the **Manage resources** page.\n\n [Go to Manage resources](https://console.cloud.google.com/iam-admin/projects)\n2. In the project list, select the project that you want to delete, and then click **Delete**.\n3. In the dialog, type the project ID, and then click **Shut down** to delete the project.\n\n\u003cbr /\u003e"]]