The ML.PRINCIPAL_COMPONENTS function

This document describes the ML.PRINCIPAL_COMPONENTS function, which lets you see the principal components of a principal component analysis (PCA) model. Principal components and eigenvectors are the same concepts in PCA models.

Syntax

ML.PRINCIPAL_COMPONENTS(
  MODEL ` PROJECT_ID 
. DATASET 
. MODEL 
`
)

Arguments

ML.PRINCIPAL_COMPONENTS takes the following arguments:

PROJECT_ID : your project ID.
DATASET : the BigQuery dataset that contains the model.
MODEL : the name of the model.

Output

ML.PRINCIPAL_COMPONENTS returns the following columns:

principal_component_id : an INT64 that contains the principal component ID.
feature : a STRING value that contains the feature column name.
numerical_value : a FLOAT64 value that contains the feature value for the principal component that principal_component_id identifies if the column identified by the feature value is numeric. Otherwise, numerical_value is NULL .
categorical_value : an ARRAY<STRUCT> value that contains information about categorical features. Each struct contains the following fields:
- categorical_value.category : a STRING value that contains the name of each category.
- categorical_value.value : a FLOAT64 value that contains the value of categorical_value.category for the principal component that principal_component_id identifies.

The output is in descending order by the eigenvalues of the principal components, which you can get by using the ML.PRINCIPAL_COMPONENT_INFO function .

Example

The following example retrieves the principal components from the model mydataset.mymodel in your default project:

 SELECT 
  
 * 
 FROM 
  
 ML 
 . 
 PRINCIPAL_COMPONENTS 
 ( 
 MODEL 
  
 ` 
 mydataset 
 . 
 mymodel 
 ` 
 )

What's next

For more information about model weights support in BigQuery ML, see BigQuery ML model weights overview .
For more information about supported SQL statements and functions for ML models, see End-to-end user journeys for ML models .