The ML.WEIGHTS function
This document describes the ML.WEIGHTS
function, which lets you see the
underlying weights that a model uses during prediction. This function applies to linear and logistic regression models
and matrix factorization models
.
For matrix factorization models, you can use the ML.GENERATE_EMBEDDING
function
as an alternative to the ML.WEIGHTS
function. ML.GENERATE_EMBEDDING
generates the same factor weights and intercept data as ML.WEIGHTS
as an array in a single column, rather than in two columns.
Having all of the embeddings in a single column lets you directly use the VECTOR_SEARCH
function
on the ML.GENERATE_EMBEDDING
output.
Syntax
ML.WEIGHTS( MODEL ` PROJECT_ID . DATASET . MODEL `, STRUCT([, STANDARDIZE AS standardize]))
Arguments
ML.WEIGHTS
takes the following arguments:
-
PROJECT_ID: your project ID. -
DATASET: the BigQuery dataset that contains the model. -
MODEL: the name of the model. -
STANDARDIZE: aBOOLvalue that specifies whether the model weights should be standardized to assume that all features have a mean of0and a standard deviation of1. Standardizing the weights allows the absolute magnitude of the weights to be compared to each other. The default value isFALSE. This argument only applies to linear and logistic regression models.
Output
ML.WEIGHTS
has different output columns for different model types.
Linear and logistic regression models
For linear and logistic regression models, ML.WEIGHTS
returns the
following columns:
-
trial_id: anINT64value that contains the hyperparameter tuning trial ID. This column is only returned if you ran hyperparameter tuning when creating the model. -
processed_input: aSTRINGvalue that contains the name of the feature input column. The value of this column matches the name of the feature column provided in thequery_statementclause that was used when the model was trained. -
weight: if the column identified by theprocessed_inputvalue is numerical,weightcontains aFLOAT64value and thecategory_weightscolumn containsNULLvalues. If the column identified by theprocessed_inputvalue is non-numerical and has been converted to one-hot encoding, theweightcolumn isNULLand thecategory_weightscolumn contains the category names and weights for each category. -
category_weights.category: aSTRINGvalue that contains the category name if the column identified by theprocessed_inputvalue is non-numeric. -
category_weights.weight: aFLOAT64that contains the category's weight if the column identified by theprocessed_inputvalue is non-numeric. -
class_label: aSTRINGvalue that contains the label for a given weight. Only used for multiclass models. The output includes one row per<class_label, processed_input>combination.
If you used the TRANSFORM
clause
in the CREATE MODEL
statement that created the model, ML.WEIGHTS
outputs
the weights of TRANSFORM
output features. The weights are denormalized by
default, with the option to get normalized weights, exactly like models that
are created without TRANSFORM
.
Matrix factorization models
For matrix factorization models, ML.WEIGHTS
returns the following columns:
-
trial_id: anINT64value that contains the hyperparameter tuning trial ID. This column is only returned if you ran hyperparameter tuning when creating the model. -
processed_input: aSTRINGvalue that contains the name of the user or item column. The value of this column matches the name of the user or item column provided in thequery_statementclause that was used when the model was trained. -
feature: aSTRINGvalue that contains the names of the specific users or items used during training. -
factor_weights: anARRAY<STRUCT>value that contains the factors and the weights for each factor.-
factor_weights.factor: anINT64value that contains the latent factor from training. This value can be between1and the value of theNUM_FACTORSoption . -
factor_weights.weight: aFLOAT64value that contains the weight of the respective factor and feature.
-
-
intercept: aFLOAT64value that contains the intercept or bias term for a feature.
There is an additional row in the output that contains the global__intercept__
value calculated from the input data. This row has NULL
values for the processed_input
and factor_weights
columns. For implicit feedback
models, global__intercept__
is always 0.
Examples
The following examples show how to use ML.WEIGHTS
with and without the standardize
argument.
Without standardization
The following example retrieves weight information from mymodel
in mydataset
. The dataset is in your default project. It returns the weights
that are associated with each one-hot encoded category for the input column input_col
.
SELECT category , weight FROM UNNEST (( SELECT category_weights FROM ML . WEIGHTS ( MODEL ` mydataset . mymodel ` ) WHERE processed_input = 'input_col' ))
This command uses the UNNEST
function because the category_weights
column is a nested repeated column.
With standardization
The following example retrieves weight information from mymodel
in mydataset
. The dataset is in your default project. It retrieves standardized
weights, which assume all features have a mean of 0
and a standard deviation
of 1
.
SELECT * FROM ML . WEIGHTS ( MODEL ` mydataset . mymodel ` , STRUCT ( true AS standardize ))
What's next
- For more information about model weights support in BigQuery ML, see BigQuery ML model weights overview .
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models .

