Get started with media recommendations

You can quickly build a state-of-the-art media recommendations app. Media recommendations enable your audiences to discover more personalized content, like what to watch or read next, with Google-quality results that are customized by optimization objectives.

For general information about Vertex AI Search for media, see Introduction to media search and recommendations.

In this getting-started tutorial, you will use the Movielens dataset to demonstrate how to upload your media content catalog and user events into Vertex AI Search and train a personalized movie recommendation model. The Movielens dataset contains a catalog of movies (documents) and user movie ratings (user events).

In this tutorial, you train a recommendation model of type Others You May Like optimized for click-through-rate (CTR). After training, the model can recommend movies based on a user ID and on a seed movie.

To meet the minimum data requirements for the model, each positive movie rating (4 or higher) is treated as a view-item event.

Estimated time to complete this tutorial:

  • Initial steps to start training the model: ~1.5 hours.
  • Waiting for the model to train: ~24 hours (see Train the recommendation model).
  • Evaluating the model predictions and cleaning up: ~30 minutes (see Preview recommendations).

If you completed the Get started with media search tutorial and you still have the data store (suggested name quickstart-media-data-store), then you can use that data store instead of creating another. In this case, begin the tutorial at Create an app for media recommendations.

Objectives

  • Learn how to import media documents and user events data from BigQuery into Vertex AI Search.
  • Train and evaluate recommendation models.

Before following this tutorial, make sure you have completed the steps in Before you begin.




Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the AI Applications, Cloud Storage, and BigQuery APIs.

    Enable the APIs


Prepare the dataset

You use the Cloud Shell to import the Movielens dataset and restructure the dataset for Vertex AI Search for media.

Open the Cloud Shell

  1. Open the Google Cloud console.
  2. Select your Google Cloud project.
  3. Take note of the project ID in the Project info card on the dashboard page. You need the project ID for the following procedures.
  4. Click the Activate Cloud Shell button at the top of the console. A Cloud Shell session opens inside a new frame at the bottom of the Google Cloud console and displays a command-line prompt. For other ways to launch the Cloud Shell, see Launch Cloud Shell.

Import the dataset

The Movielens dataset is available in a public Cloud Storage bucket to make it easier to import.

  1. Run the following command, using your project ID, to set the default project for the command line:

     gcloud config set project PROJECT_ID
    
  2. Create a BigQuery dataset:

     bq mk movielens
    
  3. Load movies.csv into a new movies BigQuery table:

     bq load --skip_leading_rows=1 movielens.movies \
       gs://cloud-samples-data/gen-app-builder/media-recommendations/movies.csv \
       movieId:integer,title,genres
    
  4. Load ratings.csv into a new ratings BigQuery table:

     bq load --skip_leading_rows=1 movielens.ratings \
       gs://cloud-samples-data/gen-app-builder/media-recommendations/ratings.csv \
       userId:integer,movieId:integer,rating:float,time:timestamp
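
The last argument of each bq load command above is an inline schema: a comma-separated list of column:type pairs. As a rough reading aid, the following Python sketch splits those two strings into columns. It assumes that a column with no explicit type (title and genres here) is loaded as STRING. The helper name parse_inline_schema is hypothetical and is not part of the bq tool.

```python
def parse_inline_schema(schema: str) -> list[tuple[str, str]]:
    """Split a bq-style inline schema string into (column, type) pairs."""
    columns = []
    for field in schema.split(","):
        name, _, col_type = field.strip().partition(":")
        # Assumption: a field without an explicit type is treated as STRING.
        columns.append((name, (col_type or "string").upper()))
    return columns


# The two schema strings used in the load commands above.
movies_schema = parse_inline_schema("movieId:integer,title,genres")
ratings_schema = parse_inline_schema(
    "userId:integer,movieId:integer,rating:float,time:timestamp"
)

for name, col_type in movies_schema + ratings_schema:
    print(f"{name}: {col_type}")
```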
    

Create BigQuery views

In this step, you restructure the Movielens dataset so that it follows the expected format for media recommendations. Media recommendations require user event data to create a model. For this guide, you create fake view-item events, spread over the past 90 days, from positive ratings (4 or higher).

  1. Create a view that converts the movies table into the Google-defined Document schema :

     bq mk --project_id=PROJECT_ID \
      --use_legacy_sql=false \
      --view '
      WITH t AS (
        SELECT
          CAST(movieId AS string) AS id,
          SUBSTR(title, 0, 128) AS title,
          SPLIT(genres, "|") AS categories
        FROM `PROJECT_ID.movielens.movies`)
      SELECT
        id,
        "default_schema" as schemaId,
        null as parentDocumentId,
        TO_JSON_STRING(STRUCT(title as title, categories as categories,
          CONCAT("http://mytestdomain.movie/content/", id) as uri,
          "2023-01-01T00:00:00Z" as available_time,
          "2033-01-01T00:00:00Z" as expire_time,
          "movie" as media_type)) as jsonData
      FROM t;' \
      movielens.movies_view
     
    

    Now, the new view has the schema that the AI Applications API expects.

  2. Go to the BigQuery page in the Google Cloud console.

    Go to BigQuery

  3. In the Explorer pane, expand your project name, expand the movielens dataset, and click movies_view to open the query page for this view.

  4. Go to the Table explorer tab.

  5. In the Generated query pane, click the Copy to query button. The query editor opens.

  6. Click Run to see movie data in the view that you created.

  7. Create fictitious user events from movie ratings by running the following Cloud Shell command:

     bq mk --project_id=PROJECT_ID \
      --use_legacy_sql=false \
      --view '
      WITH t AS (
        SELECT
          MIN(UNIX_SECONDS(time)) AS old_start,
          MAX(UNIX_SECONDS(time)) AS old_end,
          UNIX_SECONDS(TIMESTAMP_SUB(
            CURRENT_TIMESTAMP(), INTERVAL 90 DAY)) AS new_start,
          UNIX_SECONDS(CURRENT_TIMESTAMP()) AS new_end
        FROM `PROJECT_ID.movielens.ratings`)
      SELECT
        CAST(userId AS STRING) AS userPseudoId,
        "view-item" AS eventType,
        FORMAT_TIMESTAMP("%Y-%m-%dT%X%Ez",
          TIMESTAMP_SECONDS(CAST(
            (t.new_start + (UNIX_SECONDS(time) - t.old_start) *
              (t.new_end - t.new_start) / (t.old_end - t.old_start))
            AS int64))) AS eventTime,
        [STRUCT(movieId AS id, null AS name)] AS documents,
      FROM `PROJECT_ID.movielens.ratings`, t
      WHERE rating >= 4;' \
      movielens.user_events
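
The user_events view does two things: it keeps only ratings of 4 or higher, and it linearly rescales each rating's timestamp from the original date range of the dataset into the last 90 days. The following Python sketch mirrors that logic on three made-up ratings so the arithmetic is easier to follow. The function names, the sample rows, and the fixed "now" value are illustrative assumptions, and rounding to the nearest second stands in for the CAST to int64 in the view.

```python
import math
from datetime import datetime, timedelta, timezone

UTC = timezone.utc


def rescale(ts, old_start, old_end, new_start, new_end):
    """Linearly map a Unix time from [old_start, old_end] to [new_start, new_end]."""
    scaled = new_start + (ts - old_start) * (new_end - new_start) / (old_end - old_start)
    # Assumption: round to the nearest second, standing in for CAST(... AS int64).
    return math.floor(scaled + 0.5)


def ratings_to_events(ratings, now):
    """Turn positive ratings into view-item events within the 90 days before `now`."""
    new_end = int(now.timestamp())
    new_start = int((now - timedelta(days=90)).timestamp())
    # Like the view, the old range comes from all ratings, not only positive ones.
    times = [int(r["time"].timestamp()) for r in ratings]
    old_start, old_end = min(times), max(times)

    events = []
    for rating, ts in zip(ratings, times):
        if rating["rating"] < 4:
            continue  # only ratings of 4 or higher become events
        shifted = rescale(ts, old_start, old_end, new_start, new_end)
        events.append({
            "userPseudoId": str(rating["userId"]),
            "eventType": "view-item",
            "eventTime": datetime.fromtimestamp(shifted, tz=UTC).isoformat(),
            "documents": [{"id": rating["movieId"], "name": None}],
        })
    return events


# Made-up sample rows shaped like the ratings table.
sample_ratings = [
    {"userId": 1, "movieId": 4993, "rating": 5.0, "time": datetime(2000, 1, 1, tzinfo=UTC)},
    {"userId": 2, "movieId": 1, "rating": 3.0, "time": datetime(2010, 1, 1, tzinfo=UTC)},
    {"userId": 3, "movieId": 2, "rating": 4.0, "time": datetime(2020, 1, 1, tzinfo=UTC)},
]
events = ratings_to_events(sample_ratings, now=datetime(2024, 1, 1, tzinfo=UTC))
for event in events:
    print(event["userPseudoId"], event["eventTime"])
```

In this sample, the oldest rating maps to the start of the 90-day window and the newest maps to "now". The rating of 3.0 is dropped but still counts toward the original time range, as it does in the view.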
     
    

Activate AI Applications

  1. In the Google Cloud console, go to the AI Applications page.

    AI Applications

  2. Optional: Click Allow Google to selectively sample model input and responses.

  3. Click Continue and activate the API.

Create an app for media recommendations

The procedures in this section guide you through creating and deploying a media recommendations app.

  1. In the Google Cloud console, go to the AI Applications page.

    AI Applications

  2. Click Create app.

  3. On the Create app page, under Media recommendations, click Create.

  4. In the App name field, enter a name for your app, such as quickstart-media-recommendations. Your app ID appears under the app name.

  5. Under Recommendations type, make sure Others you may like is selected.

  6. Under Business Objective, make sure Click-through rate (CTR) is selected.

  7. Click Continue.

  8. Create a data store.

    1. On the Data Stores page, click Create data store.

    2. Enter a display name for your data store, such as quickstart-media-data-store, and then click Create.

  9. Select the data store you just created, and then click Create to create your app.

Import data

Next, import the movies and user events data that were formatted earlier.

Import documents

Import the movies_view documents created in the Create BigQuery views section to your quickstart-media-data-store data store.

  1. Under Native sources on the Import documents page, select BigQuery.

  2. Enter the name of the movies BigQuery view that you created and click Import.

      PROJECT_ID.movielens.movies_view
    
  3. Wait until all documents have been imported, which should take about 15 minutes. There should be 86537 documents when complete.

    You can check the Activity tab for the import operation status. When the import is complete, the import operation status changes to Completed.
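
For orientation, each row that the import reads from movies_view has four columns: id, schemaId, parentDocumentId, and jsonData. The following Python sketch builds one such row the same way the view's SQL does, using a sample movie row as input. The function name movie_to_document is hypothetical, and slicing to 128 characters is an assumption about how SUBSTR(title, 0, 128) behaves (a start position of 0 is treated as the first character).

```python
import json


def movie_to_document(movie_id: int, title: str, genres: str) -> dict:
    """Build one document row shaped like a row of movies_view."""
    doc_id = str(movie_id)
    struct = {
        "title": title[:128],             # mirrors SUBSTR(title, 0, 128)
        "categories": genres.split("|"),  # mirrors SPLIT(genres, "|")
        "uri": "http://mytestdomain.movie/content/" + doc_id,
        "available_time": "2023-01-01T00:00:00Z",
        "expire_time": "2033-01-01T00:00:00Z",
        "media_type": "movie",
    }
    return {
        "id": doc_id,
        "schemaId": "default_schema",
        "parentDocumentId": None,
        # The view stores the struct as a JSON string, not as nested columns.
        "jsonData": json.dumps(struct),
    }


# Sample input row in the movieId,title,genres layout of movies.csv.
document = movie_to_document(
    1, "Toy Story (1995)", "Adventure|Animation|Children|Comedy|Fantasy"
)
print(json.dumps(document, indent=2))
```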

Import user events

Import the user_events records created in the Create BigQuery views section to your data store.

  1. On the Events tab, click Import Events.

  2. Under Native sources on the Import documents page, select BigQuery.

  3. Enter the name of the user_events BigQuery view that you created and click Import.

      PROJECT_ID.movielens.user_events
    
  4. Wait until at least one million events have been imported before proceeding to the next step. This is needed to meet the data requirements for training a new model.

    You can check the Activity tab for the operation status. The process takes about an hour to complete because you are importing millions of rows.

  5. To see if the requirements have been met, go to the Data quality > Requirements tab. Even after the user events have been imported, it can take some time for the Requirements tab to update its status to Data requirements met.

Train the recommendation model

  1. Go to the Configurations page.

  2. Click the Serving tab. A serving config has already been created.

    If you want to adjust the Recommendation demotion or Result diversification settings, you can do so on this page.

  3. Click the Training tab.

    After the data requirements have been met, the model begins training automatically. You can view the training and tuning status on this page.

    It might take a couple of days for the model to train and become ready to query. The Ready to query field indicates Yes when the process is complete. You need to refresh the page to see the change from No to Yes.

Preview recommendations

After the model is ready to query:

  1. In the navigation menu, click Preview.

  2. Click the Document ID field. A list of document IDs appears.

  3. Enter a seed document (movie) ID, such as 4993 for "The Lord of the Rings: The Fellowship of the Ring (2001)".

  4. Select the Serving config name from the drop-down menu.

  5. Click Get recommendations. A list of recommended documents appears.

Deploy your app for structured data

There is no recommendations widget for deploying your app. To test your app before deployment:

  1. Go to the Data page, Documents tab, and copy a document ID.

  2. Go to the Integration page. This page includes a sample command for the servingConfigs.recommend method in the REST API.

  3. Paste the document ID you copied earlier into the Document ID field.

  4. Leave the User Pseudo ID field as is.

  5. Copy the example request and run it in Cloud Shell.

For help integrating the recommendations app into your web app, see the code samples at Get media recommendations.
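
To give a sense of what the sample servingConfigs.recommend request contains, the following Python sketch assembles a URL and JSON body without sending anything. Treat it as an illustration under assumptions: the API version, the resource path layout, the page size, and the helper name build_recommend_request are not taken from this tutorial, and the Integration page remains the authoritative source for the exact request. APP_ID and SERVING_CONFIG_ID are placeholders.

```python
import json

# Assumption: API host and version; check the sample on the Integration page.
API_ROOT = "https://discoveryengine.googleapis.com/v1"


def build_recommend_request(project_id, app_id, serving_config_id,
                            document_id, user_pseudo_id, page_size=5):
    """Return (url, json_body) for a recommend call seeded with one document."""
    # Assumption: the serving config is addressed through the app (engine).
    url = (
        f"{API_ROOT}/projects/{project_id}/locations/global"
        f"/collections/default_collection/engines/{app_id}"
        f"/servingConfigs/{serving_config_id}:recommend"
    )
    body = {
        "userEvent": {
            # Same event type as the imported user events in this tutorial.
            "eventType": "view-item",
            "userPseudoId": user_pseudo_id,
            "documents": [{"id": document_id}],
        },
        "pageSize": page_size,
    }
    return url, json.dumps(body)


url, body = build_recommend_request(
    project_id="PROJECT_ID",
    app_id="APP_ID",
    serving_config_id="SERVING_CONFIG_ID",
    document_id="4993",        # seed movie used in Preview recommendations
    user_pseudo_id="user-123", # hypothetical pseudo ID
)
print(url)
print(body)
```

A real call would also need an OAuth access token in an Authorization header, which the request copied from the Integration page already includes.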

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

You can reuse the data store you created here in the Get started with media search tutorial. If you plan to do that, try that tutorial before you follow this cleanup procedure.

  1. To avoid unnecessary Google Cloud charges, use the Google Cloud console to delete your project if you don't need it.
  2. If you created a new project to learn about AI Applications and you no longer need the project, delete the project.
  3. If you used an existing Google Cloud project, delete the resources you created to avoid incurring charges to your account. For more information, see Delete an app.
  4. Follow the steps in Turn off Vertex AI Search.
  5. If you created a BigQuery dataset, delete it in Cloud Shell:

     bq rm --recursive --dataset movielens
    

What's next
