Learn how to get started with the Gen AI evaluation service using the Google Cloud console.
Before you begin
- Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
- In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  Roles required to select or create a project:
  - Select a project: Selecting a project doesn't require a specific IAM role. You can select any project that you've been granted a role on.
  - Create a project: To create a project, you need the Project Creator role (roles/resourcemanager.projectCreator), which contains the resourcemanager.projects.create permission. Learn how to grant roles.
- Verify that billing is enabled for your Google Cloud project.
- Make sure that you have the following role on the project: Storage Admin
  Check for the roles:
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - In the Principal column, find all rows that identify you or a group that you're included in. To learn which groups you're included in, contact your administrator.
  - For all rows that specify or include you, check the Role column to see whether the list of roles includes the required roles.
  Grant the roles:
  - In the Google Cloud console, go to the IAM page.
  - Select the project.
  - Click Grant access.
  - In the New principals field, enter your user identifier. This is typically the email address for a Google Account.
  - In the Select a role list, select a role.
  - To grant additional roles, click Add another role and add each additional role.
  - Click Save.
 
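The role check described above can also be sketched in code. The following is a minimal illustration, assuming you have exported the project's IAM policy as JSON (for example with `gcloud projects get-iam-policy PROJECT_ID --format=json`). The sample policy, the principal names, and the helper `roles_for` are hypothetical and exist only for this example. The sketch mirrors the console steps: find every binding that names you or one of your groups, then check whether the Storage Admin role (`roles/storage.admin`) is among the roles held.

```python
import json

# Role ID for the Storage Admin role required on the project.
REQUIRED_ROLE = "roles/storage.admin"


def roles_for(policy: dict, principals: set[str]) -> set[str]:
    """Return every role that at least one of the given principals holds."""
    held = set()
    for binding in policy.get("bindings", []):
        # A binding applies if any of its members is you or one of your groups.
        if principals & set(binding.get("members", [])):
            held.add(binding["role"])
    return held


# Hypothetical policy export; real policies have the same "bindings" layout.
policy = json.loads("""
{
  "bindings": [
    {"role": "roles/storage.admin", "members": ["group:ml-team@example.com"]},
    {"role": "roles/viewer", "members": ["user:alex@example.com", "user:sam@example.com"]},
    {"role": "roles/owner", "members": ["user:sam@example.com"]}
  ]
}
""")

# Hypothetical identity: your account plus a group you belong to.
me = {"user:alex@example.com", "group:ml-team@example.com"}

held = roles_for(policy, me)
has_required_role = REQUIRED_ROLE in held
print("Roles held:", sorted(held))
print("Storage Admin granted:", has_required_role)
```

This only inspects direct bindings in one exported policy. It does not resolve nested group membership or roles inherited from a folder or organization, so treat a negative result as a prompt to check with your administrator.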
Evaluate your model
To evaluate your model:
-  In the Google Cloud console, go to the Gen AI Evaluation page. 
- Click New evaluation to open the evaluation page.
- For Define evaluation dataset, select an option:
  - Upload file: Click Upload to upload a CSV or JSONL file. The dataset should contain either prompts or records to use in a prompt template, and optionally model responses. The dataset can have a maximum of 200 rows.
  - Generate data: Enter a Prompt template to guide the Gen AI evaluation service in generating a dataset. Variables you define in your prompt template are generated and populated in the dataset. For more information, see Use prompt templates.
    - Define variables to generate: Specify the variables to generate and a description of each variable to guide generation. If needed, click Add another variable description.
    - Enter a Number of samples to generate.
    - Click Generate and preview dataset to display a generated dataset based on your prompt template and variables. To adjust the dataset, add more details to the variable descriptions and click Re-generate.
  - Use model logs: Use a snapshot of prompts and responses from the logged traffic of the selected model. You can only use this option if you have request-response logs enabled on a deployed model in Vertex AI. If you just enabled logging, allow time for sufficient samples to accumulate.
    - Select the Model and the Region whose logged traffic you want to use. You must have already enabled logging on your selected model and region.
    - Enter a Sampling count.
    - (Optional) Enable Filter by prompt template to use only logs that match your Prompt template. This can be useful if you use your selected models for a variety of use cases and want to evaluate one specific use case.
- For Define model responses to evaluate, select an option:
  - From dataset (only available if you selected Upload file for Define evaluation dataset): To use one of the fields in the uploaded dataset as your response, select a Response column.
  - From model (only available if you selected Use model logs for Define evaluation dataset): The Gen AI evaluation service uses the model responses from the model logs.
  - Call model: Select a model. The Gen AI evaluation service runs prompts on the selected model and uses the responses for evaluation.
- (Optional) For Auto-generated evaluation metrics, you can Specify custom instructions to guide the rubrics generated from each prompt. For example: Evaluate the dataset on cultural sensitivity to the countries {name}. For more information, see Define your evaluation metrics.
- For Name and output directory, enter the following:
  - Evaluation name: Enter a name for your evaluation.
  - Output private data path: Enter the name of a Cloud Storage bucket where you want to store your evaluation, or click Browse to choose the bucket.
- Click Evaluate.
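If you plan to use the Upload file option, it can help to prepare the dataset locally first. The following is a minimal sketch of one way to fill a prompt template from records and serialize the result as JSONL while respecting the 200-row limit. The field name `prompt`, the sample template and records, and the helpers `template_variables`, `build_rows`, and `to_jsonl` are assumptions made for illustration. They are not part of the Gen AI evaluation service, so check the field names that the console expects before you upload.

```python
import json
import string

MAX_ROWS = 200  # Row limit stated for uploaded datasets.


def template_variables(template: str) -> list[str]:
    """Return the named {variables} that appear in a prompt template."""
    return [name for _, name, _, _ in string.Formatter().parse(template) if name]


def build_rows(template: str, records: list[dict]) -> list[dict]:
    """Fill the template once per record and return one dataset row per record."""
    if len(records) > MAX_ROWS:
        raise ValueError(f"Dataset has {len(records)} rows; the limit is {MAX_ROWS}.")
    needed = template_variables(template)
    rows = []
    for index, record in enumerate(records):
        missing = [name for name in needed if name not in record]
        if missing:
            raise KeyError(f"Record {index} is missing variables: {missing}")
        # "prompt" is an assumed field name, not one confirmed by the source.
        rows.append({"prompt": template.format(**record)})
    return rows


def to_jsonl(rows: list[dict]) -> str:
    """Serialize rows as JSONL: one JSON object per line."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


# Hypothetical template and records for illustration.
template = "Describe a traditional holiday celebrated in {name}."
records = [{"name": "Japan"}, {"name": "Brazil"}]

rows = build_rows(template, records)
jsonl_text = to_jsonl(rows)
print(jsonl_text, end="")
```

Save the resulting text to a file with a `.jsonl` extension to upload it. Failing early on a missing variable or an oversized dataset is a deliberate choice here, since those problems are easier to fix before the upload than after an evaluation run.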
View your evaluation results
To view an evaluation result:
-  In the Google Cloud console, go to the Gen AI Evaluation page. 
-  Click the evaluation name. 
-  For each prompt in your evaluation dataset, the model's response displays along with the evaluation results. 

