The ML.FEATURE_IMPORTANCE function
This document describes the ML.FEATURE_IMPORTANCE function, which lets you see the feature importance score. This score indicates how useful or valuable each feature was in the construction of a boosted tree or a random forest model during training. For more information, see the feature_importances property in the XGBoost library.
Syntax
ML.FEATURE_IMPORTANCE(MODEL `PROJECT_ID.DATASET.MODEL`)
Arguments
ML.FEATURE_IMPORTANCE takes the following arguments:
- PROJECT_ID: your project ID.
- DATASET: the BigQuery dataset that contains the model.
- MODEL: the name of the model.
Output
ML.FEATURE_IMPORTANCE returns the following columns:
- feature: a STRING value that contains the name of the feature column in the input training data.
- importance_weight: a FLOAT64 value that contains the number of times a feature is used to split the data across all trees.
- importance_gain: a FLOAT64 value that contains the average gain across all splits the feature is used in.
- importance_cover: a FLOAT64 value that contains the average coverage across all splits the feature is used in.
If the TRANSFORM clause was used in the CREATE MODEL statement that created the model, ML.FEATURE_IMPORTANCE returns information for the pre-transform columns from the query_statement clause of the CREATE MODEL statement.
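As a rough illustration of the TRANSFORM behavior described above, the following sketch trains a boosted tree model with a TRANSFORM clause and then queries its feature importance. The table mydataset.training_table and the columns f1, f2, and label are hypothetical names chosen for this example, and the use of ML.STANDARD_SCALER is only one possible transformation.

```sql
-- Hypothetical names: mydataset.training_table, columns f1, f2, and label.
CREATE OR REPLACE MODEL `mydataset.mymodel`
  TRANSFORM(
    ML.STANDARD_SCALER(f1) OVER () AS f1_scaled,  -- transformed feature
    f2,                                           -- passed through unchanged
    label
  )
  OPTIONS (
    model_type = 'BOOSTED_TREE_REGRESSOR',
    input_label_cols = ['label']
  )
AS
SELECT f1, f2, label
FROM `mydataset.training_table`;

-- Per the rule above, the feature column is expected to contain the
-- pre-transform names from the query statement (f1, f2), not f1_scaled.
SELECT feature, importance_gain
FROM ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`);
```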
Permissions
You must have the bigquery.models.create and bigquery.models.getData Identity and Access Management (IAM) permissions in order to run ML.FEATURE_IMPORTANCE.
Limitations
ML.FEATURE_IMPORTANCE is only supported with boosted tree models and random forest models.
Example
This example retrieves feature importance from mymodel in mydataset. The dataset is in your default project.
SELECT * FROM ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
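Because the function returns an ordinary table, its output can be sorted and filtered like any other query result. The following variation is a minimal sketch, assuming the same mymodel in mydataset. Sorting by importance_gain and limiting the result to 10 rows are illustrative choices, not a recommendation from this document.

```sql
-- List the features with the highest average gain first.
SELECT
  feature,
  importance_weight,
  importance_gain,
  importance_cover
FROM ML.FEATURE_IMPORTANCE(MODEL `mydataset.mymodel`)
ORDER BY importance_gain DESC
LIMIT 10;
```

Sorting by importance_weight or importance_cover instead may rank features differently, since each column measures a different aspect of how a feature is used in the trees.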
What's next
- For more information about Explainable AI, see BigQuery Explainable AI overview.
- For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models.

