Analyze multimodal data in Python with BigQuery DataFrames

This tutorial shows you how to analyze multimodal data in a Python notebook by using BigQuery DataFrames classes and methods.

This tutorial uses the product catalog from the public Cymbal pet store dataset.

To upload a notebook already populated with the tasks covered in this tutorial, see BigFrames Multimodal DataFrame.

Objectives

  • Create multimodal DataFrames.
  • Combine structured and unstructured data in a DataFrame.
  • Transform images.
  • Generate text and embeddings based on image data.
  • Chunk PDFs for further analysis.

Costs

In this document, you use the following billable components of Google Cloud:

  • BigQuery: you incur costs for the data that you process in BigQuery.
  • BigQuery Python UDFs: you incur costs for using BigQuery DataFrames image transformation and chunk PDF methods.
  • Cloud Storage: you incur costs for the objects stored in Cloud Storage.
  • Vertex AI: you incur costs for calls to Vertex AI models.

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

For more information, see the pricing pages for BigQuery, Cloud Storage, and Vertex AI.

Before you begin

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  2. Verify that billing is enabled for your Google Cloud project.

  3. Enable the BigQuery, BigQuery Connection, Cloud Storage, and Vertex AI APIs.

    Enable the APIs

Required roles

To get the permissions that you need to complete this tutorial, ask your administrator to grant you the required IAM roles on your project.

For more information about granting roles, see Manage access to projects, folders, and organizations.

You might also be able to get the required permissions through custom roles or other predefined roles.

Set up

In this section, you create the Cloud Storage bucket, connection, and notebook used in this tutorial.

Create a bucket

Create a Cloud Storage bucket for storing transformed objects:

  1. In the Google Cloud console, go to the Buckets page.

    Go to Buckets

  2. Click Create.

  3. On the Create a bucket page, in the Get started section, enter a globally unique name that meets the bucket name requirements.

  4. Click Create.

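If you prefer to create the bucket programmatically, the following is a minimal sketch using the google-cloud-storage Python client library; the bucket name is a placeholder that you must replace with your own globally unique name:

      # Sketch: create the bucket with the Cloud Storage Python client.
      from google.cloud import storage

      client = storage.Client()
      bucket = client.create_bucket("your-unique-bucket-name")  # placeholder name
      print(f"Created bucket {bucket.name}")
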
Create a connection

Create a Cloud resource connection and get the connection's service account. BigQuery uses the connection to access objects in Cloud Storage.

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the Explorer pane, click Add data.

    The Add data dialog opens.

  3. In the Filter By pane, in the Data Source Type section, select Business Applications.

    Alternatively, in the Search for data sources field, you can enter Vertex AI.

  4. In the Featured data sources section, click Vertex AI.

  5. Click the Vertex AI Models: BigQuery Federation solution card.

  6. In the Connection type list, select Vertex AI remote models, remote functions and BigLake (Cloud Resource).

  7. In the Connection ID field, type bigframes-default-connection.

  8. Click Create connection.

  9. Click Go to connection.

  10. In the Connection info pane, copy the service account ID for use in a later step.

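If you prefer to script this step, the following is a minimal sketch using the BigQuery Connection API Python client; PROJECT_ID is a placeholder, and the location is assumed to be the US multi-region:

      # Sketch: create a Cloud resource connection and read its service account.
      from google.cloud import bigquery_connection_v1

      client = bigquery_connection_v1.ConnectionServiceClient()
      parent = client.common_location_path("PROJECT_ID", "us")
      connection = bigquery_connection_v1.Connection(
          cloud_resource=bigquery_connection_v1.CloudResourceProperties()
      )
      response = client.create_connection(
          parent=parent,
          connection_id="bigframes-default-connection",
          connection=connection,
      )
      print(response.cloud_resource.service_account_id)
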
Grant permissions to the connection's service account

Grant the connection's service account the roles that it needs to access Cloud Storage and Vertex AI. You must grant these roles in the same project you created or selected in the Before you begin section.

To grant the role, follow these steps:

  1. Go to the IAM & Admin page.

    Go to IAM & Admin

  2. Click Grant access.

  3. In the New principals field, enter the service account ID that you copied earlier.

  4. In the Select a role field, choose Cloud Storage, and then select Storage Object User.

  5. Click Add another role.

  6. In the Select a role field, select Vertex AI, and then select Vertex AI User.

  7. Click Save.

Create a notebook

Create a notebook where you can run Python code:

  1. Go to the BigQuery page.

    Go to BigQuery

  2. In the tab bar of the editor pane, click the drop-down arrow next to SQL query, and then click Notebook.

  3. In the Start with a template pane, click Close.

  4. Click Connect > Connect to a runtime.

  5. If you have an existing runtime, accept the default settings and click Connect. If you don't have an existing runtime, select Create new Runtime, and then click Connect.

    It might take several minutes for the runtime to get set up.

Create a multimodal DataFrame

Create a multimodal DataFrame that integrates structured and unstructured data by using the from_glob_path method of the Session class:

  1. In the notebook, create a code cell and copy the following code into it:
      import bigframes

      # Flag to control preview image/video size
      bigframes.options.display.blob_display_width = 300

      import bigframes.pandas as bpd

      # Create blob columns from wildcard path.
      df_image = bpd.from_glob_path(
          "gs://cloud-samples-data/bigquery/tutorials/cymbal-pets/images/*", name="image"
      )

      # Other ways are: from a string uri column
      # df = bpd.DataFrame({"uri": ["gs://<my_bucket>/<my_file_0>", "gs://<my_bucket>/<my_file_1>"]})
      # df["blob_col"] = df["uri"].str.to_blob()

      # From an existing object table
      # df = bpd.read_gbq_object_table("<my_object_table>", name="blob_col")

      # Take only the first 5 images. Preview the content of the multimodal DataFrame.
      df_image = df_image.head(5)
      df_image
  2. Click Run.

The final call to df_image returns the images that have been added to the DataFrame. Alternatively, you can call the .display method on the blob accessor, as shown in the sketch that follows.

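A minimal sketch of the .display alternative, using the blob accessor on the image column:

      # Render the image blobs directly instead of relying on the DataFrame preview.
      df_image["image"].blob.display()
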
Combine structured and unstructured data in the DataFrame

Combine text and image data in the multimodal DataFrame:

  1. In the notebook, create a code cell and copy the following code into it:
      # Combine unstructured data with structured data
      df_image["author"] = ["alice", "bob", "bob", "alice", "bob"]  # type: ignore
      df_image["content_type"] = df_image["image"].blob.content_type()
      df_image["size"] = df_image["image"].blob.size()
      df_image["updated"] = df_image["image"].blob.updated()
      df_image
  2. Click Run.

    The code returns the DataFrame data.

  3. In the notebook, create a code cell and copy the following code into it:

      # Filter images and display. You can also display audio and video types.
      # Use width/height parameters to constrain window sizes.
      df_image[df_image["author"] == "alice"]["image"].blob.display()
    
  2. Click Run.

    The code returns images from the DataFrame where the author column value is alice.

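Because the blob metadata lands in ordinary DataFrame columns, you can also filter on it like any other structured column. A minimal sketch that keeps only images larger than 100 KB (the threshold is arbitrary):

      # Filter on the blob metadata columns populated earlier (sketch).
      large_images = df_image[df_image["size"] > 100_000]
      large_images
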
Perform image transformations

Transform image data by using the image_blur, image_resize, and image_normalize methods of the Series.BlobAccessor class. The transformed images are written to Cloud Storage.

Transform images:

  1. In the notebook, create a code cell and copy the following code into it:

      df_image["blurred"] = df_image["image"].blob.image_blur(
          (20, 20), dst=f"{dst_bucket}/image_blur_transformed/", engine="opencv"
      )
      df_image["resized"] = df_image["image"].blob.image_resize(
          (300, 200), dst=f"{dst_bucket}/image_resize_transformed/", engine="opencv"
      )
      df_image["normalized"] = df_image["image"].blob.image_normalize(
          alpha=50.0,
          beta=150.0,
          norm_type="minmax",
          dst=f"{dst_bucket}/image_normalize_transformed/",
          engine="opencv",
      )

      # You can also chain functions together
      df_image["blur_resized"] = df_image["blurred"].blob.image_resize(
          (300, 200), dst=f"{dst_bucket}/image_blur_resize_transformed/", engine="opencv"
      )
      df_image
  2. Update all references to dst_bucket to refer to the bucket that you created, in the format gs://mybucket.

  3. Click Run.

    The code returns the original images as well as all of their transformations.

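To spot-check a transformation, you can render one of the transformed columns the same way that you displayed the originals. A minimal sketch:

      # Render the blurred copies to verify the transformation (sketch).
      df_image["blurred"].blob.display()
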
Generate text

Generate text from multimodal data by using the predict method of the GeminiTextGenerator class:

  1. In the notebook, create a code cell and copy the following code into it:

      from bigframes.ml import llm

      gemini = llm.GeminiTextGenerator(model_name="gemini-2.0-flash-001")

      # Deal with the first 2 images as an example
      df_image = df_image.head(2)

      # Ask the same question about both images
      answer = gemini.predict(df_image, prompt=["what item is it?", df_image["image"]])
      answer[["ml_generate_text_llm_result", "image"]]
  2. Click Run.

    The code returns the first two images in df_image, along with text generated in response to the question what item is it? for both images.

  3. In the notebook, create a code cell and copy the following code into it:

      # Ask different questions
      df_image["question"] = [  # type: ignore
          "what item is it?",
          "what color is the picture?",
      ]
      answer_alt = gemini.predict(
          df_image, prompt=[df_image["question"], df_image["image"]]
      )
      answer_alt[["ml_generate_text_llm_result", "image"]]
      
  4. Click Run.

    The code returns the first two images in df_image, with text generated in response to the question what item is it? for the first image, and text generated in response to the question what color is the picture? for the second image.

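If you want to work with the generated answers outside of BigQuery, you can materialize them locally. A minimal sketch:

      # Pull just the generated answers into a local pandas Series (sketch).
      answer_alt["ml_generate_text_llm_result"].to_pandas()
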
Generate embeddings

Generate embeddings for multimodal data by using the predict method of the MultimodalEmbeddingGenerator class:

  1. In the notebook, create a code cell and copy the following code into it:

      # Generate embeddings on images
      embed_model = llm.MultimodalEmbeddingGenerator()
      embeddings = embed_model.predict(df_image["image"])
      embeddings
      
  2. Click Run.

    The code returns the embeddings generated by a call to an embedding model.

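The embeddings come back in their own DataFrame. A minimal sketch of joining them back to the images, assuming the default BigQuery ML output column name ml_generate_embedding_result:

      # Attach the embedding vectors to the image DataFrame (sketch; the column
      # name ml_generate_embedding_result is the BigQuery ML default).
      df_image["embedding"] = embeddings["ml_generate_embedding_result"]
      df_image
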
Chunk PDFs

Chunk PDF objects by using the pdf_chunk method of the Series.BlobAccessor class:

  1. In the notebook, create a code cell and copy the following code into it:

      # PDF chunking
      df_pdf = bpd.from_glob_path(
          "gs://cloud-samples-data/bigquery/tutorials/cymbal-pets/documents/*", name="pdf"
      )
      df_pdf["chunked"] = df_pdf["pdf"].blob.pdf_chunk(engine="pypdf")
      chunked = df_pdf["chunked"].explode()
      chunked
      
  2. Click Run.

    The code returns the chunked PDF data.

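The chunks can feed further analysis. For example, a minimal sketch that embeds each chunk with the same multimodal embedding model, assuming the model accepts plain-text input:

      # Generate an embedding for each PDF chunk (sketch; assumes the multimodal
      # embedding model accepts text input).
      chunk_embeddings = embed_model.predict(chunked)
      chunk_embeddings
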
Clean up

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.

  3. In the dialog, type the project ID, and then click Shut down to delete the project.