The ML.TRANSFORM function
This document describes the ML.TRANSFORM function, which you can use
to preprocess feature data. This function processes input data by
applying the data transformations captured in the TRANSFORM clause of an existing model. The statistics that were calculated for data
transformation during model training are applied to the input data of the function.
For more information about which models support feature preprocessing, see
End-to-end user journey for each model.
Syntax
```sql
ML.TRANSFORM(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`,
  { TABLE `PROJECT_ID.DATASET.TABLE_NAME` | (QUERY_STATEMENT) }
)
```
Arguments
ML.TRANSFORM takes the following arguments:
PROJECT_ID: the project that contains the resource.
DATASET: the BigQuery dataset that contains the resource.
MODEL_NAME: the name of a model. The model
must have been created by using a CREATE MODEL statement that includes a TRANSFORM clause to manually preprocess feature data. You can check whether a model uses a TRANSFORM clause by using the bq show command to look at the model's metadata.
If the model was trained using a TRANSFORM clause, the model metadata
contains a section about the transform columns. The function returns an error
if you specify a model that was trained without a TRANSFORM clause.
TABLE_NAME: the name of the input table that
contains the feature data to preprocess.
If you specify a value for the TABLE_NAME argument, the input column names
in the table must match the input column names in the model's TRANSFORM clause, and their types should be compatible according to
BigQuery implicit coercion rules.
You can get the input column names and data types from the model's metadata,
in the section about the feature columns.
QUERY_STATEMENT: a query that generates the feature
data to preprocess. For the supported SQL syntax of the QUERY_STATEMENT clause, see GoogleSQL query syntax.
If you specify a value for the QUERY_STATEMENT argument, the input column
names from the query must match the input column names in the model's TRANSFORM clause, and their types should be compatible according to
BigQuery implicit coercion rules.
You can get the input column names and data types from the model's metadata,
in the section about the feature columns.
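When the input needs filtering or column selection first, the QUERY_STATEMENT form can be used in place of TABLE. The following is a minimal sketch, not a definitive recipe. It assumes a model named `mydataset.mymodel` whose TRANSFORM clause takes species, island, culmen_length_mm, flipper_length_mm, sex, and body_mass_g as input columns, and it reads from the public penguins table; substitute your own model and source.

```sql
-- Sketch: pass a query instead of a table to ML.TRANSFORM.
-- Assumption: `mydataset.mymodel` was trained with a TRANSFORM clause
-- over the six columns selected below.
SELECT
  *
FROM
  ML.TRANSFORM(
    MODEL `mydataset.mymodel`,
    (
      SELECT
        species,
        island,
        culmen_length_mm,
        flipper_length_mm,
        sex,
        body_mass_g
      FROM `bigquery-public-data.ml_datasets.penguins`
      -- Filter rows before they are preprocessed.
      WHERE body_mass_g IS NOT NULL
    ));
```

The query's output column names are what must match the model's input column names, so alias any columns that are named differently in your source data.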
Output
ML.TRANSFORM returns the columns specified in the model's TRANSFORM clause.
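Because the function is called in a FROM clause, its output can be stored like any other query result. The following sketch saves the preprocessed columns to a table for reuse. The destination name `mydataset.penguins_transformed` is hypothetical, and the model `mydataset.mymodel` is assumed to exist and to have been trained with a TRANSFORM clause.

```sql
-- Sketch: materialize the preprocessed feature data.
-- Assumptions: `mydataset.mymodel` exists and has a TRANSFORM clause;
-- `mydataset.penguins_transformed` is a hypothetical destination table.
CREATE OR REPLACE TABLE `mydataset.penguins_transformed` AS
SELECT
  *
FROM
  ML.TRANSFORM(
    MODEL `mydataset.mymodel`,
    TABLE `bigquery-public-data.ml_datasets.penguins`);
```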
Example
The following example returns feature data that has been preprocessed by
using the TRANSFORM clause included in the model named mydataset.mymodel in your default project.
Create the model that contains the TRANSFORM clause:
```sql
CREATE OR REPLACE MODEL `mydataset.mymodel`
  TRANSFORM(
    species,
    island,
    ML.MAX_ABS_SCALER(culmen_length_mm) OVER () AS culmen_length_mm,
    ML.MAX_ABS_SCALER(flipper_length_mm) OVER () AS flipper_length_mm,
    sex,
    body_mass_g)
  OPTIONS (
    model_type = 'linear_reg',
    input_label_cols = ['body_mass_g'])
AS (
  SELECT *
  FROM `bigquery-public-data.ml_datasets.penguins`
  WHERE body_mass_g IS NOT NULL
);
```
Return feature data preprocessed by the model's TRANSFORM clause:
```sql
SELECT
  *
FROM
  ML.TRANSFORM(
    MODEL `mydataset.mymodel`,
    TABLE `bigquery-public-data.ml_datasets.penguins`);
```
The result is similar to the following:
```
+-------------------------------------+--------+---------------------+---------------------+--------+-----------------+-------------+
| species                             | island | culmen_length_mm    | flipper_length_mm   | sex    | culmen_depth_mm | body_mass_g |
+-------------------------------------+--------+---------------------+---------------------+--------+-----------------+-------------+
| Adelie Penguin (Pygoscelis adeliae) | Dream  | 0.61409395973154368 | 0.79653679653679654 | Female | 18.4            | 3475.0      |
| Adelie Penguin (Pygoscelis adeliae) | Dream  | 0.66778523489932884 | 0.79653679653679654 | Male   | 19.1            | 4650.0      |
+-------------------------------------+--------+---------------------+---------------------+--------+-----------------+-------------+
```
What's next
For information about feature preprocessing, see Feature preprocessing overview.
Last updated 2025-08-29 UTC.