Getting Started with Image Classification Using Vertex AI and BigQuery

This guide provides an end-to-end workflow for classifying imagery assets using Google Cloud's Vertex AI platform with Gemini 2.5 Flash. You'll learn to integrate BigQuery for data retrieval, Cloud Storage for asset management, and Vertex AI for machine learning inference in a Python Colab environment.

Configuration

Set the following project-specific variables before running the code samples:

    PROJECT_ID = "PROJECT_ID"
    REGION = "REGION"            # e.g., "us-central1"
    LOCATION = "LOCATION"        # e.g., "us"
    CUSTOMER_ID = "CUSTOMER_ID"  # required to subscribe to the dataset

Environment Setup

Install required dependencies and configure authentication to access Google Cloud services:

    # Install Google Cloud SDK dependencies for AI Platform integration
    !pip install google-cloud-aiplatform google-cloud-storage google-cloud-bigquery google-cloud-bigquery-data-exchange -q

    # Import core libraries for cloud services and machine learning operations
    import json
    import os

    from google.cloud import bigquery
    import vertexai
    from vertexai.generative_models import GenerativeModel, Part

    # Configure authentication for Google Cloud service access.
    # Initiates an OAuth flow in a new browser tab if authentication is required.
    if os.environ.get("VERTEX_PRODUCT") != "COLAB_ENTERPRISE":
        from google.colab import auth
        auth.authenticate_user(project_id=PROJECT_ID)

    # Initialize the Vertex AI client with project configuration
    vertexai.init(project=PROJECT_ID, location=REGION)
    print(f"Vertex AI initialized for project: {PROJECT_ID} in region: {REGION}")

You must also subscribe to the Analytics Hub dataset before you can query it:

    from google.cloud import bigquery_data_exchange_v1beta1

    ah_client = bigquery_data_exchange_v1beta1.AnalyticsHubServiceClient()

    HUB_PROJECT_ID = "maps-platform-analytics-hub"
    DATA_EXCHANGE_ID = f"imagery_insights_exchange_{LOCATION}"
    LINKED_DATASET_NAME = f"imagery_insights___preview___{LOCATION}"

    # Subscribe to the listing (creates a linked dataset in your consumer project)
    destination_dataset = bigquery_data_exchange_v1beta1.DestinationDataset()
    destination_dataset.dataset_reference.dataset_id = LINKED_DATASET_NAME
    destination_dataset.dataset_reference.project_id = PROJECT_ID
    destination_dataset.location = LOCATION

    LISTING_ID = f"imagery_insights_{CUSTOMER_ID.replace('-', '_')}__{LOCATION}"
    published_listing = (
        f"projects/{HUB_PROJECT_ID}/locations/{LOCATION}"
        f"/dataExchanges/{DATA_EXCHANGE_ID}/listings/{LISTING_ID}"
    )

    request = bigquery_data_exchange_v1beta1.SubscribeListingRequest(
        destination_dataset=destination_dataset,
        name=published_listing,
    )

    # Request the subscription
    ah_client.subscribe_listing(request=request)
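If you want to confirm that the subscription succeeded, you can look up the linked dataset with the BigQuery client. A minimal sketch, assuming the google-cloud-bigquery library installed earlier; verification_client is just an illustrative name, and get_dataset raises NotFound if the linked dataset is missing:

    from google.cloud import bigquery

    # Look up the linked dataset created by the subscription
    verification_client = bigquery.Client(project=PROJECT_ID)
    linked_dataset = verification_client.get_dataset(LINKED_DATASET_NAME)
    print(f"Linked dataset available: {linked_dataset.full_dataset_id}")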

Data Extraction with BigQuery

Execute a BigQuery query to extract Google Cloud Storage URIs from the latest_observations table. These URIs will be passed directly to the Vertex AI model for classification.

    # Initialize the BigQuery client
    bigquery_client = bigquery.Client(project=PROJECT_ID)

    # Define a SQL query to retrieve observation records from the imagery dataset
    query = f"""
    SELECT
      *
    FROM
      `{PROJECT_ID}.imagery_insights___preview___{LOCATION}.latest_observations`
    LIMIT 10;
    """

    print(f"Executing BigQuery query:\n{query}")

    # Submit the query job to BigQuery and await completion
    query_job = bigquery_client.query(query)

    # Convert BigQuery Row objects to dictionaries for easier downstream processing
    query_response_data = []
    for row in query_job:
        query_response_data.append(dict(row))

    # Extract Cloud Storage URIs from the result set, filtering out null values
    gcs_uris = [item.get("gcs_uri") for item in query_response_data if item.get("gcs_uri")]

    print(f"BigQuery query returned {len(query_response_data)} records.")
    print(f"Extracted {len(gcs_uris)} GCS URIs:")
    for uri in gcs_uris:
        print(uri)

Image Classification Function

This helper function classifies images using Vertex AI's Gemini 2.5 Flash model:

    def classify_image_with_gemini(gcs_uri: str, prompt: str = "What is in this image?") -> str:
        """Performs multimodal image classification using Vertex AI's Gemini 2.5 Flash model.

        Leverages direct Cloud Storage integration to process image assets without
        downloading them locally, enabling scalable batch processing workflows.

        Args:
            gcs_uri (str): Fully qualified Google Cloud Storage URI
                (format: gs://bucket-name/path/to/image.jpg).
            prompt (str): Natural language instruction for the classification task.

        Returns:
            str: Generated textual description from the generative model, or an
                error message if the classification pipeline fails; service-level
                exceptions are caught rather than raised.
        """
        try:
            # Instantiate the Gemini 2.5 Flash model for inference
            model = GenerativeModel("gemini-2.5-flash")

            # Construct a multimodal Part object from the Cloud Storage reference.
            # Note: the MIME type may need dynamic inference for mixed image formats.
            image_part = Part.from_uri(uri=gcs_uri, mime_type="image/jpeg")

            # Execute the multimodal inference request with combined visual and textual inputs
            responses = model.generate_content([image_part, prompt])
            return responses.text
        except Exception as e:
            print(f"Error classifying image from URI {gcs_uri}: {e}")
            return "Classification failed."
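The MIME type above is hard-coded to image/jpeg. If your imagery assets mix formats (for example PNG alongside JPEG), one option is to guess the type from the URI's file extension with the standard library's mimetypes module. A minimal sketch; infer_mime_type is a hypothetical helper, not part of the Vertex AI SDK:

    import mimetypes

    def infer_mime_type(gcs_uri: str, default: str = "image/jpeg") -> str:
        """Guesses an image MIME type from a GCS URI's file extension."""
        mime_type, _ = mimetypes.guess_type(gcs_uri)
        # Fall back to the default when the extension is missing or unrecognized
        return mime_type or default

    # Usage inside classify_image_with_gemini:
    # image_part = Part.from_uri(uri=gcs_uri, mime_type=infer_mime_type(gcs_uri))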

Batch Image Classification

Process all extracted URIs and generate classifications:

    classification_results = []

    # Execute the batch classification pipeline across all extracted GCS URIs
    for uri in gcs_uris:
        print(f"\nProcessing: {uri}")

        # Classification prompt for detailed feature extraction
        classification_prompt = (
            "Describe this image in detail, focusing on any objects, "
            "signs, or features visible."
        )

        # Invoke the Gemini model for multimodal inference on the current asset
        result = classify_image_with_gemini(uri, classification_prompt)

        # Aggregate structured results for downstream analytics and reporting
        classification_results.append({"gcs_uri": uri, "classification": result})

        print(f"Classification for {uri}:\n{result}")
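Because classification_results is a list of plain dictionaries, it can be persisted directly for later analysis. A minimal sketch that writes the batch output to a local JSON file using the json module imported during setup (the filename is arbitrary):

    # Persist batch results to disk for downstream analysis
    with open("classification_results.json", "w") as f:
        json.dump(classification_results, f, indent=2)
    print(f"Saved {len(classification_results)} results to classification_results.json")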
 

Next Steps

With your images classified, consider these advanced workflows:

  • Model Fine-tuning: Use classification results to train custom models.
  • Automated Processing: Set up Cloud Functions to classify new images automatically.
  • Data Analysis: Perform statistical analysis on classification patterns.
  • Integration: Connect results to downstream applications (a minimal BigQuery load sketch follows this list).
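
As an example of the integration step, the results can be loaded into a BigQuery table so they can be joined back against the imagery dataset. A minimal sketch, assuming a destination dataset named imagery_analysis already exists in your project (the dataset and table names here are hypothetical):

    # Load classification results into a BigQuery table for analysis.
    # The "imagery_analysis" dataset is a hypothetical example and must
    # already exist in your project.
    table_id = f"{PROJECT_ID}.imagery_analysis.classification_results"
    job_config = bigquery.LoadJobConfig(autodetect=True)
    load_job = bigquery_client.load_table_from_json(
        classification_results, table_id, job_config=job_config
    )
    load_job.result()  # Wait for the load job to complete
    print(f"Loaded {load_job.output_rows} rows into {table_id}")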

Troubleshooting

Common issues and solutions:

  • Authentication errors: Ensure proper IAM roles and API enablement.
  • Rate limiting: Implement exponential backoff for large batches (see the sketch after this list).
  • Memory constraints: Process images in smaller batches for large datasets.
  • URI format errors: Verify GCS URIs follow the format gs://bucket-name/path/to/image.
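
For the rate-limiting item, one approach is a retry wrapper with exponential backoff around the classification helper. A minimal sketch; since classify_image_with_gemini returns the sentinel string "Classification failed." instead of raising, the wrapper retries on that value (the delay schedule is illustrative, not tuned):

    import random
    import time

    def classify_with_backoff(gcs_uri: str, prompt: str, max_retries: int = 5) -> str:
        """Retries classification with exponential backoff plus jitter."""
        for attempt in range(max_retries):
            result = classify_image_with_gemini(gcs_uri, prompt)
            if result != "Classification failed.":
                return result
            if attempt < max_retries - 1:
                # Back off 1s, 2s, 4s, ... plus random jitter before retrying
                delay = (2 ** attempt) + random.uniform(0, 1)
                print(f"Attempt {attempt + 1} failed; retrying in {delay:.1f}s")
                time.sleep(delay)
        return "Classification failed after retries."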

For additional support, refer to the Vertex AI documentation and BigQuery documentation.
