The ML.CONFUSION_MATRIX function
This document describes the `ML.CONFUSION_MATRIX` function, which you can use
to return a confusion matrix for the input classification model and input data.
PROJECT_ID: the project that contains the
resource.
DATASET: the dataset that contains the
resource.
MODEL: the name of the model.
TABLE: the name of the input table that contains
the evaluation data.
If `TABLE` is specified, the input column names in the table must match the
column names in the model, and their types should be compatible according to
BigQuery implicit coercion rules.
The input must have a column that matches the
label column name provided during training. This value is provided using the `input_label_cols` option. If `input_label_cols` is unspecified, the column
named `label` in the training data is used.
If you don't specify either `TABLE` or `QUERY_STATEMENT`, `ML.CONFUSION_MATRIX` computes the confusion matrix results as follows:
If the data is split during training, the split evaluation data is used to
compute the confusion matrix results.
If the data is not split during training, the entire training input is
used to compute the confusion matrix results.
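
The two input modes described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: `mydataset.mymodel` and `mydataset.mytable` are placeholder names, and the second query assumes a model type that doesn't require input data.

```sql
-- Evaluate against an explicit table (placeholder names).
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`);

-- Omit the input data: the function uses the evaluation split from
-- training, or the entire training input if the data wasn't split.
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(MODEL `mydataset.mymodel`);
```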
QUERY_STATEMENT: a GoogleSQL query that is
used to generate the evaluation data. For the supported SQL syntax of the `QUERY_STATEMENT` clause in GoogleSQL, see Query syntax.
If `QUERY_STATEMENT` is specified, the input column names from the query
must match the column names in the model, and their types should be
compatible according to BigQuery implicit coercion rules.
The input must have a column that matches the label column name provided
during training. This value is provided using the `input_label_cols` option.
If `input_label_cols` is unspecified, the column named `label` in the
training data is used. Any extra columns are ignored.
If you used the `TRANSFORM` clause in the `CREATE MODEL` statement that created the model, then only the input
columns present in the `TRANSFORM` clause must appear in `QUERY_STATEMENT`.
If you don't specify either `TABLE` or `QUERY_STATEMENT`, `ML.CONFUSION_MATRIX` computes the confusion matrix results as follows:
If the data is split during training, the split evaluation data is used to
compute the confusion matrix results.
If the data is not split during training, the entire training input is
used to compute the confusion matrix results.
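
A query statement is useful when the evaluation data needs reshaping before it matches the model's columns. The sketch below assumes a hypothetical model trained on columns `feature_a` and `feature_b` with the default label column `label`, and a source table that stores the label in a column named `outcome`; all of these names, along with `event_date`, are illustrative assumptions.

```sql
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    (
      SELECT
        feature_a,
        feature_b,
        -- Rename so the column matches the label name used in training.
        outcome AS label
      FROM
        `mydataset.mytable`
      WHERE
        -- Hypothetical filter that restricts evaluation to recent rows.
        event_date >= DATE '2024-01-01'
    ));
```

Selecting only the needed columns is optional, because extra columns are ignored, but it makes the expected schema explicit.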
THRESHOLD: a `FLOAT64` value that specifies a custom
threshold for the binary-class classification model to use for evaluation. The
default value is `0.5`.
A `0` value for precision or recall means that the selected threshold
produced no true positive labels. A `NaN` value for precision means that the
selected threshold produced no positive labels, neither true positives nor
false positives.
If both `TABLE` and `QUERY_STATEMENT` are unspecified, you can't use a
threshold.
You can't use `THRESHOLD` with multiclass classification models.
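
The `0` and `NaN` cases follow from the standard definitions, precision = TP / (TP + FP) and recall = TP / (TP + FN). The following standalone sketch doesn't call `ML.CONFUSION_MATRIX`; it uses made-up counts and the GoogleSQL `IEEE_DIVIDE` function only to show how the two cases arise arithmetically.

```sql
SELECT
  scenario,
  IEEE_DIVIDE(tp, tp + fp) AS precision,  -- 0 / 0 evaluates to NaN
  IEEE_DIVIDE(tp, tp + fn) AS recall
FROM
  UNNEST([
    -- Positives were predicted, but none were correct: precision = 0, recall = 0.
    STRUCT('false positives only' AS scenario, 0.0 AS tp, 3.0 AS fp, 5.0 AS fn),
    -- No positives were predicted at all: precision = NaN, recall = 0.
    STRUCT('no positive predictions' AS scenario, 0.0 AS tp, 0.0 AS fp, 5.0 AS fn)
  ]);
```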
TRIAL_ID: an `INT64` value that identifies the
hyperparameter tuning trial that you want the function to evaluate. The
function uses the optimal trial by default. Only specify this argument if you
ran hyperparameter tuning when creating the model.
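
As a hedged sketch of evaluating a specific trial: the query below assumes that `mydataset.mymodel` (a placeholder name) was created with hyperparameter tuning and that a trial with ID `2` exists. The trial number is purely illustrative.

```sql
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    (
      SELECT
        *
      FROM
        `mydataset.mytable`),
    -- Evaluate trial 2 instead of the optimal trial (hypothetical ID).
    STRUCT(2 AS trial_id));
```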
Output
The output columns of the `ML.CONFUSION_MATRIX` function depend on the model.
The first output column is always `expected_label`. There are `N` additional
columns, one for each class in the trained model. The names of the additional
columns depend on the class labels used to train the model.
If the training class labels all conform to BigQuery column naming rules, the labels are used
as the column names. Labels that don't conform to the naming rules are altered so that the
resulting column names conform and are unique. For example, if the labels
are `0` and `1`, the output column names are `_0` and `_1`.
The columns are ordered based on the class labels in ascending order. If the
labels in the evaluation data match those in the training data, the true positives are shown on the diagonal from top left to bottom right. The expected (or
actual) labels are listed one per row, and the predicted labels are listed one
per column.
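
To make the layout concrete, the following sketch reads the matrix for a hypothetical binary model whose training labels are `0` and `1`, so that the prediction columns are named `_0` and `_1` as described above. The model and table names are placeholders, and the aliases are only for readability.

```sql
SELECT
  expected_label,      -- one row per actual label
  _0 AS predicted_0,   -- rows of this label that the model predicted as 0
  _1 AS predicted_1    -- rows of this label that the model predicted as 1
FROM
  ML.CONFUSION_MATRIX(
    MODEL `mydataset.mymodel`,
    TABLE `mydataset.mytable`)
ORDER BY
  expected_label;
```

In this layout, correct predictions sit on the diagonal: `predicted_0` in the row where `expected_label` is `0`, and `predicted_1` in the row where `expected_label` is `1`.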
The values in the `expected_label` column are the exact values and type passed
into `ML.CONFUSION_MATRIX` in the label column of the evaluation data. This is
true even if they don't exactly match the values or type used during training.

Syntax

```sql
ML.CONFUSION_MATRIX(
  MODEL `PROJECT_ID.DATASET.MODEL`
  [, { TABLE `PROJECT_ID.DATASET.TABLE` | (QUERY_STATEMENT) }]
  [, STRUCT(
    [THRESHOLD AS threshold]
    [, TRIAL_ID AS trial_id])])
```

Note: `ML.CONFUSION_MATRIX` requires input data with some models, and returns an error if it is absent. If this occurs, provide input data when using `ML.CONFUSION_MATRIX` with these models.

Limitations

`ML.CONFUSION_MATRIX` doesn't support
[imported TensorFlow models](/bigquery/docs/reference/standard-sql/bigqueryml-syntax-create-tensorflow).

Examples

The following examples demonstrate the use of the `ML.CONFUSION_MATRIX` function.

`ML.CONFUSION_MATRIX` with a query statement

The following example returns the confusion matrix for a logistic
regression model named `mydataset.mymodel` in your default project:

```sql
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(MODEL `mydataset.mymodel`,
    (
    SELECT
      *
    FROM
      `mydataset.mytable`))
```

`ML.CONFUSION_MATRIX` with a custom threshold

The following example returns the confusion matrix for a logistic
regression model named `mydataset.mymodel` in your default project, using a
custom threshold of `0.6` instead of the default `0.5`:

```sql
SELECT
  *
FROM
  ML.CONFUSION_MATRIX(MODEL `mydataset.mymodel`,
    (
    SELECT
      *
    FROM
      `mydataset.mytable`),
    STRUCT(0.6 AS threshold))
```

What's next

- For information about model evaluation, see [BigQuery ML model evaluation overview](/bigquery/docs/evaluate-overview).
- For information about the supported SQL statements and functions for each model type, see [End-to-end user journey for each model](/bigquery/docs/e2e-journey).

Last updated 2025-08-29 UTC.