Create a clustering model with BigQuery DataFrames

Create a k-means clustering model on the culmen length and sex of penguins by using the BigQuery DataFrames API.

Code sample

Python

Before trying this sample, follow the Python setup instructions in the BigQuery quickstart using client libraries. For more information, see the BigQuery Python API reference documentation.

To authenticate to BigQuery, set up Application Default Credentials. For more information, see Set up authentication for client libraries.

    from bigframes.ml.cluster import KMeans
    import bigframes.pandas as bpd

    # Load data from BigQuery
    query_or_table = "bigquery-public-data.ml_datasets.penguins"
    bq_df = bpd.read_gbq(query_or_table)

    # Create the KMeans model
    cluster_model = KMeans(n_clusters=10)
    cluster_model.fit(bq_df["culmen_length_mm"], bq_df["sex"])

    # Predict using the model
    result = cluster_model.predict(bq_df)

    # Score the model
    score = cluster_model.score(bq_df)
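
As a rough illustration of what a k-means model does with a numeric column such as culmen_length_mm, the following plain-Python sketch runs the standard assign-and-update loop on a handful of values. It does not use BigQuery and is not how BigQuery DataFrames implements KMeans; the function name kmeans_1d, the starting centroids, and the sample lengths are all made up for this example.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster 1-D values with Lloyd's algorithm from given starting centroids."""
    centroids = list(centroids)
    labels = []
    for _ in range(iterations):
        # Assignment step: label each value with the index of its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: abs(v - centroids[k]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its assigned values.
        for k in range(len(centroids)):
            members = [v for v, label in zip(values, labels) if label == k]
            if members:  # leave an empty cluster's centroid where it is
                centroids[k] = sum(members) / len(members)
    return centroids, labels


# Hypothetical culmen lengths in millimeters (not taken from the penguins table).
lengths = [36.0, 37.0, 38.0, 45.0, 46.0, 47.0]
centroids, labels = kmeans_1d(lengths, centroids=[36.0, 47.0])
print(centroids)  # [37.0, 46.0]
print(labels)     # [0, 0, 0, 1, 1, 1]
```

In this sketch the returned labels play a role loosely comparable to the cluster assignments that predict produces in the sample above, with each value mapped to the index of its nearest centroid.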
 

What's next

To search and filter code samples for other Google Cloud products, see the Google Cloud sample browser.
