Learn how to get started with the Gen AI evaluation service using the Google Cloud console.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
- Verify that billing is enabled for your Google Cloud project.
- Make sure that you have the following role or roles on the project: Storage Admin. (This role also lets you create the Cloud Storage output bucket used later; a sketch follows this list.)
Check for the roles
- In the Google Cloud console, go to the IAM page.
- Select the project.
- In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
- For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
Grant the roles
- In the Google Cloud console, go to the IAM page.
- Select the project.
- Click Grant access.
- In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
- In the Select a role list, select a role.
- To grant additional roles, click Add another role and add each additional role.
- Click Save.
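If you prefer to prepare resources programmatically, the following is a minimal sketch of creating the Cloud Storage bucket that you later point the evaluation's output path to. It assumes the google-cloud-storage Python client library and Application Default Credentials; the project ID, bucket name, and location are placeholders.

```python
# Minimal sketch: create the Cloud Storage bucket used as the evaluation output path.
# Assumes the google-cloud-storage client library and Application Default Credentials.
# "your-project-id", "your-eval-output-bucket", and the location are placeholders.
from google.cloud import storage

client = storage.Client(project="your-project-id")
bucket = client.create_bucket("your-eval-output-bucket", location="us-central1")
print(f"Created bucket: {bucket.name}")
```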
Evaluate your model
To evaluate your model:
- In the Google Cloud console, go to the Gen AI Evaluation page.
- Click New evaluation to open the evaluation page.
- For Define evaluation dataset, select an option:
- Upload file: Click Upload to upload a CSV or JSONL file. The dataset should contain either prompts or records to use in a prompt template and, optionally, model responses, with a maximum of 200 rows. (A sketch of a JSONL file in this shape follows these steps.)
- Generate data: Enter a Prompt template to guide the Gen AI evaluation service in generating a dataset. Variables you define in your prompt template are generated and populated in the dataset. For more information, see Use prompt templates. (A sketch of how template variables map to dataset values follows these steps.)
- Define variables to generate: Specify the variables to generate and a description of each variable to guide generation. If needed, click Add another variable description.
- Enter a Number of samples to generate.
- Click Generate and preview dataset to display a generated dataset based on your prompt template and variables. To adjust the dataset, you can add more details to the variable descriptions and click Re-generate.
Use model logs: Use the snapshot of prompts and responses from the logged traffic of the selected model. You can only use this option if you have request-response logs enabled on a deployed model in Vertex AI. If you just enabled logging, allow time for sufficient samples to accumulate.
- Select the Model and the Region you want to log traffic from. You must have already enabled logging on your selected model and region.
- Enter a Sampling count.
- (Optional) Enable Filter by prompt template to use only logs that match your Prompt template. This can be useful if you use your selected model for a variety of use cases and want to evaluate one specific use case.
- For Define model responses to evaluate, select an option:
- From dataset (only available if you selected Upload file for Define evaluation dataset): If you want to use one of the fields in the uploaded dataset as your response, select a Response column.
- From model (only available if you selected Use model logs for Define evaluation dataset): If you're using model logs as the evaluation dataset, the Gen AI evaluation service uses the model responses from the model logs.
- Call model: Select a model. The Gen AI evaluation service runs the prompts on the selected model and uses the responses for evaluation.
- (Optional) For Auto-generated evaluation metrics, you can Specify custom instructions to guide the rubrics generated from each prompt. For example, Evaluate the dataset on cultural sensitivity to the countries {name}. For more information, see Define your evaluation metrics.
- For Name and output directory, enter the following:
- Evaluation name: Enter a name for your evaluation.
- Output private data path: Enter the name of a Cloud Storage bucket where you want to store your evaluation, or click Browse to choose the bucket.
- Click Evaluate.
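For the Upload file option, the following is a minimal sketch of writing a JSONL evaluation dataset with plain Python. The rows and the field names ("prompt", "response") are illustrative assumptions; use whatever fields you plan to select in the console, and keep the file within the 200-row limit.

```python
import json

# Hypothetical example rows. The field names "prompt" and "response" are
# illustrative; match them to the columns you intend to use in the console.
rows = [
    {"prompt": "Summarize the benefits of unit testing in two sentences.",
     "response": "Unit tests catch regressions early and document expected behavior."},
    {"prompt": "Explain what a load balancer does in one sentence.",
     "response": "A load balancer distributes incoming traffic across multiple backends."},
]

# Write one JSON object per line (JSONL), the format accepted for uploaded datasets.
with open("eval_dataset.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```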
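To make the Generate data and custom-instruction steps more concrete, the sketch below shows the general idea of a prompt template whose {variables} are filled in per row. The template text and variable names are made up for illustration and don't reflect the console's exact behavior; the console generates and populates the variable values for you.

```python
# Illustrative only: how a prompt template with {variables} expands per row.
template = "Write a short travel guide for {city} aimed at {audience}."

rows = [
    {"city": "Lisbon", "audience": "families"},
    {"city": "Kyoto", "audience": "solo travelers"},
]

# Each row's values are substituted into the template to form a concrete prompt.
prompts = [template.format(**row) for row in rows]
for p in prompts:
    print(p)
```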
View your evaluation results
To view an evaluation result:
- In the Google Cloud console, go to the Gen AI Evaluation page.
- Click the evaluation name.
- For each prompt in your evaluation dataset, the model's response is displayed along with the evaluation results.
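If you also want to inspect what was written to the Cloud Storage output path, the sketch below simply lists the objects in the bucket using the google-cloud-storage client. The project ID and bucket name are placeholders, and the exact layout of the exported files isn't described here, so treat this only as a starting point for locating them.

```python
# Minimal sketch: list the objects written under the evaluation's output path.
# Assumes the google-cloud-storage client library and Application Default Credentials;
# "your-project-id" and "your-eval-output-bucket" are placeholders.
from google.cloud import storage

client = storage.Client(project="your-project-id")
for blob in client.list_blobs("your-eval-output-bucket"):
    print(blob.name)
```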