The ML.TRAINING_INFO function

This document describes the ML.TRAINING_INFO function, which lets you see information about the training iterations of a model.

You can run ML.TRAINING_INFO while the CREATE MODEL statement for the target model is running, or you can wait until after the CREATE MODEL statement completes. If you run ML.TRAINING_INFO before the first training iteration of the CREATE MODEL statement completes, the query returns a Not found error.

Syntax

ML.TRAINING_INFO(
  MODEL `PROJECT_ID.DATASET.MODEL_NAME`
)

Arguments

ML.TRAINING_INFO takes the following arguments:

  • PROJECT_ID: your project ID.
  • DATASET: the BigQuery dataset that contains the model.
  • MODEL_NAME: the name of the model.
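
As an illustration of how the three arguments combine into one model reference, the following sketch uses a fully qualified path. The names myproject, mydataset, and mymodel are hypothetical placeholders, not values from this document:

```sql
-- Assumed names: myproject (project ID), mydataset (dataset), mymodel (model).
SELECT
  *
FROM
  ML.TRAINING_INFO(MODEL `myproject.mydataset.mymodel`)
```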

Output

ML.TRAINING_INFO returns the following columns:

  • training_run: an INT64 value that contains the training run identifier for the model. The value in this column is 0 for a newly created model. If you retrain the model using the warm_start argument of the CREATE MODEL statement, this value is incremented.
  • iteration: an INT64 value that contains the iteration number within the training run. The value for the first iteration is 0. This value is incremented for each subsequent iteration.
  • loss: a FLOAT64 value that contains the loss metric calculated after an iteration on the training data:

    • For logistic regression models, this is log loss.
    • For linear regression models, this is mean squared error.
    • For multiclass logistic regressions, this is cross-entropy log loss.
    • For explicit matrix factorization models, this is mean squared error calculated over the seen input ratings.
    • For implicit matrix factorization models, the loss is calculated using the following formula:
    $$ \text{Loss} = \sum_{u, i} c_{ui}(p_{ui} - x_u^T y_i)^2 + \lambda\left(\sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2\right) $$

    For more information about what the variables mean, see Feedback types.

  • eval_loss: a FLOAT64 value that contains the loss metric calculated on the holdout data. For k-means models, ML.TRAINING_INFO doesn't return an eval_loss column. If the DATA_SPLIT_METHOD argument is NO_SPLIT, then all entries in the eval_loss column are NULL.

  • learning_rate: a FLOAT64 value that contains the learning rate in this iteration.

  • duration_ms: an INT64 value that contains how long the iteration took, in milliseconds.

  • cluster_info: an ARRAY<STRUCT> value that contains the fields centroid_id, cluster_radius, and cluster_size. ML.TRAINING_INFO computes cluster_radius and cluster_size with standardized features. This column is returned only for k-means models.
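
To show how these columns are typically read together, the following sketch lists the training and holdout loss for each iteration in order. It assumes a model named mydataset.mymodel whose type returns the loss, eval_loss, and learning_rate columns (for example, a linear or logistic regression model). For other model types, some of these columns might be absent.

```sql
-- Assumption: mydataset.mymodel is a model type that reports
-- loss, eval_loss, and learning_rate (not k-means or time series).
SELECT
  training_run,
  iteration,
  loss,
  eval_loss,
  learning_rate,
  duration_ms
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
ORDER BY
  training_run,
  iteration
```

Ordering by training_run and then iteration keeps retraining runs started with warm_start separate from the initial run.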

Permissions

You must have the bigquery.models.create and bigquery.models.getData Identity and Access Management (IAM) permissions to run ML.TRAINING_INFO.

Limitations

ML.TRAINING_INFO is subject to the following limitations:

  • ML.TRAINING_INFO doesn't support imported TensorFlow models.
  • For time series models, ML.TRAINING_INFO returns only three columns: training_run, iteration, and duration_ms. It doesn't expose training information per iteration, or per time series when multiple time series are forecasted at once. The duration_ms value is the total time taken by the entire process.

Example

The following example retrieves training information from the model mydataset.mymodel in your default project:

SELECT
  *
FROM
  ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
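
For k-means models, the cluster_info column described in the Output section is an array, so it usually needs to be flattened before its fields can be read. The following is a minimal sketch, assuming a hypothetical k-means model named mydataset.my_kmeans_model:

```sql
-- Assumption: mydataset.my_kmeans_model is a k-means model, so
-- ML.TRAINING_INFO returns the cluster_info column for it.
WITH info AS (
  SELECT
    iteration,
    cluster_info
  FROM
    ML.TRAINING_INFO(MODEL `mydataset.my_kmeans_model`)
)
SELECT
  info.iteration,
  c.centroid_id,
  c.cluster_radius,
  c.cluster_size
FROM
  info,
  UNNEST(info.cluster_info) AS c  -- one output row per cluster per iteration
ORDER BY
  info.iteration,
  c.centroid_id
```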
