Full name: projects.locations.evaluateInstances
Evaluates instances based on a given metric.
Endpoint
POST https:
Where {service-endpoint} is one of the supported service endpoints.
Path parameters
location: string
Required. The resource name of the Location to evaluate the instances. Format: projects/{project}/locations/{location}
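As a minimal sketch of the documented format, the `location` path parameter can be assembled from a project ID and a region. The helper name `location_name` and the sample values are illustrative assumptions, not part of the API.

```python
def location_name(project: str, location: str) -> str:
    """Build the `location` path parameter: projects/{project}/locations/{location}."""
    if not project or not location:
        raise ValueError("project and location must be non-empty")
    return f"projects/{project}/locations/{location}"


# Hypothetical project ID and region, used only for illustration.
print(location_name("my-project", "us-central1"))
```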
Request body
The request body contains data with the following structure:
metric_inputs: Union type
metric_inputs can be only one of the following:
exactMatchInput: object (ExactMatchInput)
Auto metric instances. Instances and metric spec for exact match metric.
bleuInput: object (BleuInput)
Instances and metric spec for bleu metric.
rougeInput: object (RougeInput)
Instances and metric spec for rouge metric.
fluencyInput: object (FluencyInput)
LLM-based metric instance. General text generation metrics, applicable to other categories. Input for fluency metric.
coherenceInput: object (CoherenceInput)
Input for coherence metric.
safetyInput: object (SafetyInput)
Input for safety metric.
groundednessInput: object (GroundednessInput)
Input for groundedness metric.
fulfillmentInput: object (FulfillmentInput)
Input for fulfillment metric.
summarizationQualityInput: object (SummarizationQualityInput)
Input for summarization quality metric.
pairwiseSummarizationQualityInput: object (PairwiseSummarizationQualityInput)
Input for pairwise summarization quality metric.
summarizationHelpfulnessInput: object (SummarizationHelpfulnessInput)
Input for summarization helpfulness metric.
summarizationVerbosityInput: object (SummarizationVerbosityInput)
Input for summarization verbosity metric.
questionAnsweringQualityInput: object (QuestionAnsweringQualityInput)
Input for question answering quality metric.
pairwiseQuestionAnsweringQualityInput: object (PairwiseQuestionAnsweringQualityInput)
Input for pairwise question answering quality metric.
questionAnsweringRelevanceInput: object (QuestionAnsweringRelevanceInput)
Input for question answering relevance metric.
questionAnsweringHelpfulnessInput: object (QuestionAnsweringHelpfulnessInput)
Input for question answering helpfulness metric.
questionAnsweringCorrectnessInput: object (QuestionAnsweringCorrectnessInput)
Input for question answering correctness metric.
pointwiseMetricInput: object (PointwiseMetricInput)
Input for pointwise metric.
pairwiseMetricInput: object (PairwiseMetricInput)
Input for pairwise metric.
toolCallValidInput: object (ToolCallValidInput)
Tool call metric instances. Input for tool call valid metric.
toolNameMatchInput: object (ToolNameMatchInput)
Input for tool name match metric.
toolParameterKeyMatchInput: object (ToolParameterKeyMatchInput)
Input for tool parameter key match metric.
toolParameterKvMatchInput: object (ToolParameterKVMatchInput)
Input for tool parameter key value match metric.
cometInput: object (CometInput)
Translation metrics. Input for Comet metric.
metricxInput: object (MetricxInput)
Input for MetricX metric.
trajectoryExactMatchInput: object (TrajectoryExactMatchInput)
Input for trajectory exact match metric.
trajectoryInOrderMatchInput: object (TrajectoryInOrderMatchInput)
Input for trajectory in order match metric.
trajectoryAnyOrderMatchInput: object (TrajectoryAnyOrderMatchInput)
Input for trajectory any order match metric.
trajectoryPrecisionInput: object (TrajectoryPrecisionInput)
Input for trajectory precision metric.
trajectoryRecallInput: object (TrajectoryRecallInput)
Input for trajectory recall metric.
trajectorySingleToolUseInput: object (TrajectorySingleToolUseInput)
Input for trajectory single tool use metric.
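Because metric_inputs is a union, a request body should carry exactly one of the fields listed above. The following is a minimal client-side sketch of that check; the helper name `selected_metric_input` is hypothetical, and the service itself may report violations differently.

```python
# Field names copied from the metric_inputs union in this reference.
METRIC_INPUT_FIELDS = frozenset({
    "exactMatchInput", "bleuInput", "rougeInput", "fluencyInput",
    "coherenceInput", "safetyInput", "groundednessInput", "fulfillmentInput",
    "summarizationQualityInput", "pairwiseSummarizationQualityInput",
    "summarizationHelpfulnessInput", "summarizationVerbosityInput",
    "questionAnsweringQualityInput", "pairwiseQuestionAnsweringQualityInput",
    "questionAnsweringRelevanceInput", "questionAnsweringHelpfulnessInput",
    "questionAnsweringCorrectnessInput", "pointwiseMetricInput",
    "pairwiseMetricInput", "toolCallValidInput", "toolNameMatchInput",
    "toolParameterKeyMatchInput", "toolParameterKvMatchInput", "cometInput",
    "metricxInput", "trajectoryExactMatchInput", "trajectoryInOrderMatchInput",
    "trajectoryAnyOrderMatchInput", "trajectoryPrecisionInput",
    "trajectoryRecallInput", "trajectorySingleToolUseInput",
})


def selected_metric_input(body: dict) -> str:
    """Return the single metric input field set in `body`, or raise ValueError."""
    present = sorted(key for key in body if key in METRIC_INPUT_FIELDS)
    if len(present) != 1:
        raise ValueError(f"expected exactly one metric input, found {present}")
    return present[0]
```

Running this check before sending a request catches a body that mixes, say, `bleuInput` and `rougeInput` without a round trip to the service.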
Response body
Response message for EvaluationService.EvaluateInstances.
If successful, the response body contains data with the following structure:
evaluation_results: Union type
evaluation_results can be only one of the following:
exactMatchResults: object (ExactMatchResults)
Auto metric evaluation results. Results for exact match metric.
bleuResults: object (BleuResults)
Results for bleu metric.
rougeResults: object (RougeResults)
Results for rouge metric.
fluencyResult: object (FluencyResult)
LLM-based metric evaluation result. General text generation metrics, applicable to other categories. Result for fluency metric.
coherenceResult: object (CoherenceResult)
Result for coherence metric.
safetyResult: object (SafetyResult)
Result for safety metric.
groundednessResult: object (GroundednessResult)
Result for groundedness metric.
fulfillmentResult: object (FulfillmentResult)
Result for fulfillment metric.
summarizationQualityResult: object (SummarizationQualityResult)
Summarization only metrics. Result for summarization quality metric.
pairwiseSummarizationQualityResult: object (PairwiseSummarizationQualityResult)
Result for pairwise summarization quality metric.
summarizationHelpfulnessResult: object (SummarizationHelpfulnessResult)
Result for summarization helpfulness metric.
summarizationVerbosityResult: object (SummarizationVerbosityResult)
Result for summarization verbosity metric.
questionAnsweringQualityResult: object (QuestionAnsweringQualityResult)
Question answering only metrics. Result for question answering quality metric.
pairwiseQuestionAnsweringQualityResult: object (PairwiseQuestionAnsweringQualityResult)
Result for pairwise question answering quality metric.
questionAnsweringRelevanceResult: object (QuestionAnsweringRelevanceResult)
Result for question answering relevance metric.
questionAnsweringHelpfulnessResult: object (QuestionAnsweringHelpfulnessResult)
Result for question answering helpfulness metric.
questionAnsweringCorrectnessResult: object (QuestionAnsweringCorrectnessResult)
Result for question answering correctness metric.
pointwiseMetricResult: object (PointwiseMetricResult)
Generic metrics. Result for pointwise metric.
pairwiseMetricResult: object (PairwiseMetricResult)
Result for pairwise metric.
toolCallValidResults: object (ToolCallValidResults)
Tool call metrics. Results for tool call valid metric.
toolNameMatchResults: object (ToolNameMatchResults)
Results for tool name match metric.
toolParameterKeyMatchResults: object (ToolParameterKeyMatchResults)
Results for tool parameter key match metric.
toolParameterKvMatchResults: object (ToolParameterKVMatchResults)
Results for tool parameter key value match metric.
cometResult: object (CometResult)
Translation metrics. Result for Comet metric.
metricxResult: object (MetricxResult)
Result for MetricX metric.
trajectoryExactMatchResults: object (TrajectoryExactMatchResults)
Results for trajectory exact match metric.
trajectoryInOrderMatchResults: object (TrajectoryInOrderMatchResults)
Results for trajectory in order match metric.
trajectoryAnyOrderMatchResults: object (TrajectoryAnyOrderMatchResults)
Results for trajectory any order match metric.
trajectoryPrecisionResults: object (TrajectoryPrecisionResults)
Results for trajectory precision metric.
trajectoryRecallResults: object (TrajectoryRecallResults)
Results for trajectory recall metric.
trajectorySingleToolUseResults: object (TrajectorySingleToolUseResults)
Results for trajectory single tool use metric.
| JSON representation | 
|---|
| { // evaluation_results "exactMatchResults" : { object ( | 
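Since evaluation_results is a union, a parsed response carries a single result field whose name ends in `Result` or `Results`. The sketch below picks that field out of a decoded JSON response. The helper name `extract_result` is hypothetical, and the contents of each result object are not described in this section, so the code treats them as opaque.

```python
from typing import Any, Dict, Tuple


def extract_result(response: Dict[str, Any]) -> Tuple[str, Any]:
    """Return (field_name, payload) for the one result field set in `response`."""
    matches = [
        (key, value)
        for key, value in response.items()
        if key.endswith(("Result", "Results"))
    ]
    if len(matches) != 1:
        names = [key for key, _ in matches]
        raise ValueError(f"expected exactly one result field, found {names}")
    return matches[0]
```

A caller can then dispatch on the returned field name, for example handling `bleuResults` and `fluencyResult` with different code paths.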
ExactMatchInput
Input for exact match metric.
metricSpec: object (ExactMatchSpec)
Required. Spec for exact match metric.
instances[]: object (ExactMatchInstance)
Required. Repeated exact match instances.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
ExactMatchSpec
This type has no fields.
Spec for exact match metric. Returns 1 if the prediction and reference match exactly, otherwise 0.
ExactMatchInstance
Spec for exact match instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Required. Ground truth used to compare against the prediction.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string } | 
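To make the exact match metric concrete, here is a small sketch that builds an `exactMatchInput` body from prediction/reference pairs and computes the documented 1-or-0 score locally. Both helper names are hypothetical, and the local scorer uses plain string equality; whether the service normalizes whitespace or case before comparing is not stated here.

```python
from typing import Iterable, Tuple


def exact_match_score(prediction: str, reference: str) -> int:
    """Local approximation: 1 if the strings are identical, otherwise 0."""
    return 1 if prediction == reference else 0


def build_exact_match_input(pairs: Iterable[Tuple[str, str]]) -> dict:
    """Build a request body with one ExactMatchInstance per (prediction, reference)."""
    return {
        "exactMatchInput": {
            "metricSpec": {},  # ExactMatchSpec has no fields
            "instances": [
                {"prediction": prediction, "reference": reference}
                for prediction, reference in pairs
            ],
        }
    }


body = build_exact_match_input([("Paris", "Paris"), ("paris", "Paris")])
scores = [
    exact_match_score(item["prediction"], item["reference"])
    for item in body["exactMatchInput"]["instances"]
]
```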
BleuInput
Input for bleu metric.
metricSpec: object (BleuSpec)
Required. Spec for bleu score metric.
instances[]: object (BleuInstance)
Required. Repeated bleu instances.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
BleuSpec
Spec for bleu score metric. Calculates the precision of n-grams in the prediction as compared to the reference and returns a score between 0 and 1.
useEffectiveOrder: boolean
Optional. Whether to use effective order when computing the bleu score.
| JSON representation | 
|---|
| { "useEffectiveOrder" : boolean } | 
BleuInstance
Spec for bleu instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Required. Ground truth used to compare against the prediction.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string } | 
RougeInput
Input for rouge metric.
metricSpec: object (RougeSpec)
Required. Spec for rouge score metric.
instances[]: object (RougeInstance)
Required. Repeated rouge instances.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
RougeSpec
Spec for rouge score metric. Calculates the recall of n-grams in the prediction as compared to the reference and returns a score between 0 and 1.
rougeType: string
Optional. Supported rouge types are rougen[1-9], rougeL, and rougeLsum.
useStemmer: boolean
Optional. Whether to use a stemmer to compute the rouge score.
splitSummaries: boolean
Optional. Whether to split summaries while using rougeLsum.
| JSON representation | 
|---|
| { "rougeType" : string , "useStemmer" : boolean , "splitSummaries" : boolean } | 
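A client might validate `rougeType` before sending a request. The sketch below assumes that "rougen[1-9]" means the values `rouge1` through `rouge9`; that reading, the helper name `build_rouge_spec`, and the choice to omit unset optional fields are all assumptions.

```python
import re
from typing import Optional

# Assumed reading of the supported types: rouge1..rouge9, rougeL, rougeLsum.
_ROUGE_TYPE = re.compile(r"rouge[1-9]|rougeL|rougeLsum")


def build_rouge_spec(
    rouge_type: Optional[str] = None,
    use_stemmer: Optional[bool] = None,
    split_summaries: Optional[bool] = None,
) -> dict:
    """Build a RougeSpec dict, leaving out optional fields that were not set."""
    if rouge_type is not None and not _ROUGE_TYPE.fullmatch(rouge_type):
        raise ValueError(f"unsupported rougeType: {rouge_type!r}")
    spec = {
        "rougeType": rouge_type,
        "useStemmer": use_stemmer,
        "splitSummaries": split_summaries,
    }
    return {key: value for key, value in spec.items() if value is not None}
```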
RougeInstance
Spec for rouge instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Required. Ground truth used to compare against the prediction.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string } | 
FluencyInput
Input for fluency metric.
metricSpec: object (FluencySpec)
Required. Spec for fluency score metric.
instance: object (FluencyInstance)
Required. Fluency instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
FluencySpec
Spec for fluency score metric.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "version" : integer } | 
FluencyInstance
Spec for fluency instance.
prediction: string
Required. Output of the evaluated model.
| JSON representation | 
|---|
| { "prediction" : string } | 
CoherenceInput
Input for coherence metric.
metricSpec: object (CoherenceSpec)
Required. Spec for coherence score metric.
instance: object (CoherenceInstance)
Required. Coherence instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
CoherenceSpec
Spec for coherence score metric.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "version" : integer } | 
CoherenceInstance
Spec for coherence instance.
prediction: string
Required. Output of the evaluated model.
| JSON representation | 
|---|
| { "prediction" : string } | 
SafetyInput
Input for safety metric.
metricSpec: object (SafetySpec)
Required. Spec for safety metric.
instance: object (SafetyInstance)
Required. Safety instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
SafetySpec
Spec for safety metric.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "version" : integer } | 
SafetyInstance
Spec for safety instance.
prediction: string
Required. Output of the evaluated model.
| JSON representation | 
|---|
| { "prediction" : string } | 
GroundednessInput
Input for groundedness metric.
metricSpec: object (GroundednessSpec)
Required. Spec for groundedness metric.
instance: object (GroundednessInstance)
Required. Groundedness instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
GroundednessSpec
Spec for groundedness metric.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "version" : integer } | 
GroundednessInstance
Spec for groundedness instance.
prediction: string
Required. Output of the evaluated model.
context: string
Required. Background information provided in context used to compare against the prediction.
| JSON representation | 
|---|
| { "prediction" : string , "context" : string } | 
FulfillmentInput
Input for fulfillment metric.
metricSpec: object (FulfillmentSpec)
Required. Spec for fulfillment score metric.
instance: object (FulfillmentInstance)
Required. Fulfillment instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
FulfillmentSpec
Spec for fulfillment metric.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "version" : integer } | 
FulfillmentInstance
Spec for fulfillment instance.
prediction: string
Required. Output of the evaluated model.
instruction: string
Required. Inference instruction prompt to compare prediction with.
| JSON representation | 
|---|
| { "prediction" : string , "instruction" : string } | 
SummarizationQualityInput
Input for summarization quality metric.
metricSpec: object (SummarizationQualitySpec)
Required. Spec for summarization quality score metric.
instance: object (SummarizationQualityInstance)
Required. Summarization quality instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
SummarizationQualitySpec
Spec for summarization quality score metric.
useReference: boolean
Optional. Whether to use instance.reference to compute summarization quality.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
SummarizationQualityInstance
Spec for summarization quality instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Required. Text to be summarized.
instruction: string
Required. Summarization prompt for LLM.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
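The spec and instance combine into a `summarizationQualityInput` body roughly as sketched below. The helper name is hypothetical, and rejecting `useReference` without a `reference` is a client-side assumption inferred from the field descriptions, not documented service behavior.

```python
from typing import Optional


def build_summarization_quality_input(
    prediction: str,
    context: str,
    instruction: str,
    reference: Optional[str] = None,
    use_reference: bool = False,
    version: Optional[int] = None,
) -> dict:
    """Build a summarizationQualityInput body from the documented fields."""
    # Assumption: useReference only makes sense when instance.reference is set.
    if use_reference and reference is None:
        raise ValueError("use_reference=True requires a reference")
    instance = {
        "prediction": prediction,
        "context": context,
        "instruction": instruction,
    }
    if reference is not None:
        instance["reference"] = reference
    spec = {"useReference": use_reference}
    if version is not None:
        spec["version"] = version
    return {"summarizationQualityInput": {"metricSpec": spec, "instance": instance}}
```

The same shape applies to the other summarization and question answering inputs, with the top-level field name and the required or optional status of each instance field adjusted per type.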
PairwiseSummarizationQualityInput
Input for pairwise summarization quality metric.
metricSpec: object (PairwiseSummarizationQualitySpec)
Required. Spec for pairwise summarization quality score metric.
instance: object (PairwiseSummarizationQualityInstance)
Required. Pairwise summarization quality instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
PairwiseSummarizationQualitySpec
Spec for pairwise summarization quality score metric.
useReference: boolean
Optional. Whether to use instance.reference to compute pairwise summarization quality.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
PairwiseSummarizationQualityInstance
Spec for pairwise summarization quality instance.
prediction: string
Required. Output of the candidate model.
baselinePrediction: string
Required. Output of the baseline model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Required. Text to be summarized.
instruction: string
Required. Summarization prompt for LLM.
| JSON representation | 
|---|
| { "prediction" : string , "baselinePrediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
SummarizationHelpfulnessInput
Input for summarization helpfulness metric.
metricSpec: object (SummarizationHelpfulnessSpec)
Required. Spec for summarization helpfulness score metric.
instance: object (SummarizationHelpfulnessInstance)
Required. Summarization helpfulness instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
SummarizationHelpfulnessSpec
Spec for summarization helpfulness score metric.
useReference: boolean
Optional. Whether to use instance.reference to compute summarization helpfulness.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
SummarizationHelpfulnessInstance
Spec for summarization helpfulness instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Required. Text to be summarized.
instruction: string
Optional. Summarization prompt for LLM.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
SummarizationVerbosityInput
Input for summarization verbosity metric.
metricSpec: object (SummarizationVerbositySpec)
Required. Spec for summarization verbosity score metric.
instance: object (SummarizationVerbosityInstance)
Required. Summarization verbosity instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
SummarizationVerbositySpec
Spec for summarization verbosity score metric.
useReference: boolean
Optional. Whether to use instance.reference to compute summarization verbosity.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
SummarizationVerbosityInstance
Spec for summarization verbosity instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Required. Text to be summarized.
instruction: string
Optional. Summarization prompt for LLM.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
QuestionAnsweringQualityInput
Input for question answering quality metric.
metricSpec: object (QuestionAnsweringQualitySpec)
Required. Spec for question answering quality score metric.
instance: object (QuestionAnsweringQualityInstance)
Required. Question answering quality instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
QuestionAnsweringQualitySpec
Spec for question answering quality score metric.
useReference: boolean
Optional. Whether to use instance.reference to compute question answering quality.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
QuestionAnsweringQualityInstance
Spec for question answering quality instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Required. Text to answer the question.
instruction: string
Required. Question Answering prompt for LLM.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
PairwiseQuestionAnsweringQualityInput
Input for pairwise question answering quality metric.
metricSpec: object (PairwiseQuestionAnsweringQualitySpec)
Required. Spec for pairwise question answering quality score metric.
instance: object (PairwiseQuestionAnsweringQualityInstance)
Required. Pairwise question answering quality instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
PairwiseQuestionAnsweringQualitySpec
Spec for pairwise question answering quality score metric.
useReference: boolean
Optional. Whether to use instance.reference to compute pairwise question answering quality.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
PairwiseQuestionAnsweringQualityInstance
Spec for pairwise question answering quality instance.
prediction: string
Required. Output of the candidate model.
baselinePrediction: string
Required. Output of the baseline model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Required. Text to answer the question.
instruction: string
Required. Question Answering prompt for LLM.
| JSON representation | 
|---|
| { "prediction" : string , "baselinePrediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
QuestionAnsweringRelevanceInput
Input for question answering relevance metric.
metricSpec: object (QuestionAnsweringRelevanceSpec)
Required. Spec for question answering relevance score metric.
instance: object (QuestionAnsweringRelevanceInstance)
Required. Question answering relevance instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
QuestionAnsweringRelevanceSpec
Spec for question answering relevance metric.
useReference: boolean
Optional. Whether to use instance.reference to compute question answering relevance.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
QuestionAnsweringRelevanceInstance
Spec for question answering relevance instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Optional. Text provided as context to answer the question.
instruction: string
Required. The question asked and other instruction in the inference prompt.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
QuestionAnsweringHelpfulnessInput
Input for question answering helpfulness metric.
metricSpec: object (QuestionAnsweringHelpfulnessSpec)
Required. Spec for question answering helpfulness score metric.
instance: object (QuestionAnsweringHelpfulnessInstance)
Required. Question answering helpfulness instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
QuestionAnsweringHelpfulnessSpec
Spec for question answering helpfulness metric.
useReference: boolean
Optional. Whether to use instance.reference to compute question answering helpfulness.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
QuestionAnsweringHelpfulnessInstance
Spec for question answering helpfulness instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Optional. Text provided as context to answer the question.
instruction: string
Required. The question asked and other instruction in the inference prompt.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
QuestionAnsweringCorrectnessInput
Input for question answering correctness metric.
metricSpec: object (QuestionAnsweringCorrectnessSpec)
Required. Spec for question answering correctness score metric.
instance: object (QuestionAnsweringCorrectnessInstance)
Required. Question answering correctness instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
QuestionAnsweringCorrectnessSpec
Spec for question answering correctness metric.
useReference: boolean
Optional. Whether to use instance.reference to compute question answering correctness.
version: integer
Optional. Which version to use for evaluation.
| JSON representation | 
|---|
| { "useReference" : boolean , "version" : integer } | 
QuestionAnsweringCorrectnessInstance
Spec for question answering correctness instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Optional. Ground truth used to compare against the prediction.
context: string
Optional. Text provided as context to answer the question.
instruction: string
Required. The question asked and other instruction in the inference prompt.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "context" : string , "instruction" : string } | 
PointwiseMetricInput
Input for pointwise metric.
metricSpec: object (PointwiseMetricSpec)
Required. Spec for pointwise metric.
instance: object (PointwiseMetricInstance)
Required. Pointwise metric instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
PointwiseMetricSpec
Spec for pointwise metric.
metricPromptTemplate: string
Required. Metric prompt template for pointwise metric.
| JSON representation | 
|---|
| { "metricPromptTemplate" : string } | 
PointwiseMetricInstance
Pointwise metric instance. Usually one instance corresponds to one row in an evaluation dataset.
instance: Union type
instance can be only one of the following:
jsonInstance: string
Instance specified as a json string. String key-value pairs are expected in the jsonInstance to render PointwiseMetricSpec.instance_prompt_template.
| JSON representation | 
|---|
| { // instance "jsonInstance" : string // Union type } | 
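Because `jsonInstance` is a JSON string, not a nested object, the key-value pairs need to be serialized before they go into the request. The sketch below shows one way to do that. The helper name and the `{response}` placeholder syntax in the sample template are assumptions made for illustration.

```python
import json
from typing import Dict


def build_pointwise_metric_input(metric_prompt_template: str, fields: Dict[str, str]) -> dict:
    """Build a pointwiseMetricInput body, encoding `fields` as a JSON string."""
    for key, value in fields.items():
        # The reference says string key-value pairs are expected.
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("jsonInstance expects string keys and string values")
    return {
        "pointwiseMetricInput": {
            "metricSpec": {"metricPromptTemplate": metric_prompt_template},
            "instance": {"jsonInstance": json.dumps(fields)},
        }
    }


# Hypothetical template and row; the placeholder syntax is assumed.
body = build_pointwise_metric_input(
    "Rate the fluency of this text: {response}",
    {"response": "The cat sat on the mat."},
)
```

The same serialization step applies to `PairwiseMetricInstance.jsonInstance`.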
PairwiseMetricInput
Input for pairwise metric.
metricSpec: object (PairwiseMetricSpec)
Required. Spec for pairwise metric.
instance: object (PairwiseMetricInstance)
Required. Pairwise metric instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
PairwiseMetricSpec
Spec for pairwise metric.
metricPromptTemplate: string
Required. Metric prompt template for pairwise metric.
| JSON representation | 
|---|
| { "metricPromptTemplate" : string } | 
PairwiseMetricInstance
Pairwise metric instance. Usually one instance corresponds to one row in an evaluation dataset.
instance: Union type
instance can be only one of the following:
jsonInstance: string
Instance specified as a json string. String key-value pairs are expected in the jsonInstance to render PairwiseMetricSpec.instance_prompt_template.
| JSON representation | 
|---|
| { // instance "jsonInstance" : string // Union type } | 
ToolCallValidInput
Input for tool call valid metric.
metricSpec: object (ToolCallValidSpec)
Required. Spec for tool call valid metric.
instances[]: object (ToolCallValidInstance)
Required. Repeated tool call valid instances.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
ToolCallValidSpec
This type has no fields.
Spec for tool call valid metric.
ToolCallValidInstance
Spec for tool call valid instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Required. Ground truth used to compare against the prediction.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string } | 
ToolNameMatchInput
Input for tool name match metric.
metricSpec: object (ToolNameMatchSpec)
Required. Spec for tool name match metric.
instances[]: object (ToolNameMatchInstance)
Required. Repeated tool name match instances.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
ToolNameMatchSpec
This type has no fields.
Spec for tool name match metric.
ToolNameMatchInstance
Spec for tool name match instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Required. Ground truth used to compare against the prediction.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string } | 
ToolParameterKeyMatchInput
Input for tool parameter key match metric.
metricSpec: object (ToolParameterKeyMatchSpec)
Required. Spec for tool parameter key match metric.
instances[]: object (ToolParameterKeyMatchInstance)
Required. Repeated tool parameter key match instances.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
ToolParameterKeyMatchSpec
This type has no fields.
Spec for tool parameter key match metric.
ToolParameterKeyMatchInstance
Spec for tool parameter key match instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Required. Ground truth used to compare against the prediction.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string } | 
ToolParameterKVMatchInput
Input for tool parameter key value match metric.
metricSpec: object (ToolParameterKVMatchSpec)
Required. Spec for tool parameter key value match metric.
instances[]: object (ToolParameterKVMatchInstance)
Required. Repeated tool parameter key value match instances.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
ToolParameterKVMatchSpec
Spec for tool parameter key value match metric.
useStrictStringMatch: boolean
Optional. Whether to use strict string matching on parameter values.
| JSON representation | 
|---|
| { "useStrictStringMatch" : boolean } | 
ToolParameterKVMatchInstance
Spec for tool parameter key value match instance.
prediction: string
Required. Output of the evaluated model.
reference: string
Required. Ground truth used to compare against the prediction.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string } | 
CometInput
Input for Comet metric.
metricSpec: object (CometSpec)
Required. Spec for Comet metric.
instance: object (CometInstance)
Required. Comet instance.
| JSON representation | 
|---|
| { "metricSpec" : { object ( | 
CometSpec
Spec for Comet metric.
sourceLanguage: string
Optional. Source language in BCP-47 format.
targetLanguage: string
Optional. Target language in BCP-47 format. Covers both prediction and reference.
version: enum (CometVersion)
Required. Which version to use for evaluation.
| JSON representation | 
|---|
| { "sourceLanguage" : string , "targetLanguage" : string , "version" : enum ( | 
CometVersion
Comet version options.
| Enums | |
|---|---|
| COMET_VERSION_UNSPECIFIED | Comet version unspecified. | 
| COMET_22_SRC_REF | Comet 22 for translation + source + reference (source-reference-combined). | 
CometInstance
Spec for Comet instance. The fields used for evaluation depend on the Comet version.
prediction: string
Required. Output of the evaluated model.
reference: string
Optional. Ground truth used to compare against the prediction.
source: string
Optional. Source text in original language.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "source" : string } | 
MetricxInput
Input for MetricX metric.
metricSpec 
 
  object (  MetricxSpec 
 
) 
 
Required. Spec for Metricx metric.
instance 
 
  object (  MetricxInstance 
 
) 
 
Required. MetricX instance.
| JSON representation | 
|---|
| { "metricSpec" : { object (MetricxSpec) } , "instance" : { object (MetricxInstance) } } | 
MetricxSpec
Spec for MetricX metric.
sourceLanguage 
 
  string 
 
Optional. Source language in BCP-47 format.
targetLanguage 
 
  string 
 
Optional. Target language in BCP-47 format. Covers both prediction and reference.
version 
 
  enum (  MetricxVersion 
 
) 
 
Required. Which version to use for evaluation.
| JSON representation | 
|---|
| { "sourceLanguage" : string , "targetLanguage" : string , "version" : enum (MetricxVersion) } | 
MetricxVersion
MetricX version options.
| Enums | |
|---|---|
| METRICX_VERSION_UNSPECIFIED | MetricX version unspecified. | 
| METRICX_24_REF | MetricX 2024 (2.6) for translation + reference (reference-based). | 
| METRICX_24_SRC | MetricX 2024 (2.6) for translation + source (quality estimation, QE). | 
| METRICX_24_SRC_REF | MetricX 2024 (2.6) for translation + source + reference (source-reference-combined). | 
MetricxInstance
Spec for MetricX instance. The fields used for evaluation depend on the MetricX version.
prediction 
 
  string 
 
Required. Output of the evaluated model.
reference 
 
  string 
 
Optional. Ground truth used to compare against the prediction.
source 
 
  string 
 
Optional. Source text in original language.
| JSON representation | 
|---|
| { "prediction" : string , "reference" : string , "source" : string } | 
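Because the fields used for evaluation depend on the MetricX version, a client-side check before sending a request can be useful. The sketch below infers the needed fields from the MetricxVersion enum descriptions (reference-based, source-based, or both); this mapping and the helper name `missing_fields` are assumptions, and the service may validate differently.

```python
# Fields each version appears to use, inferred from the enum descriptions.
# This mapping is an assumption, not documented validation behavior.
FIELDS_BY_VERSION = {
    "METRICX_24_REF": ("prediction", "reference"),
    "METRICX_24_SRC": ("prediction", "source"),
    "METRICX_24_SRC_REF": ("prediction", "reference", "source"),
}


def missing_fields(version, instance):
    """Return the field names that `version` uses but `instance` lacks."""
    if version not in FIELDS_BY_VERSION:
        raise ValueError(f"unsupported or unspecified version: {version}")
    return [name for name in FIELDS_BY_VERSION[version]
            if not instance.get(name)]


instance = {"prediction": "Bonjour le monde"}
print(missing_fields("METRICX_24_SRC", instance))
```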
TrajectoryExactMatchInput
Instances and metric spec for TrajectoryExactMatch metric.
metricSpec 
 
  object (  TrajectoryExactMatchSpec 
 
) 
 
Required. Spec for TrajectoryExactMatch metric.
instances[] 
 
  object (  TrajectoryExactMatchInstance 
 
) 
 
Required. Repeated TrajectoryExactMatch instance.
| JSON representation | 
|---|
| { "metricSpec" : { object (TrajectoryExactMatchSpec) } , "instances" : [ { object (TrajectoryExactMatchInstance) } ] } | 
TrajectoryExactMatchSpec
This type has no fields.
Spec for TrajectoryExactMatch metric - returns 1 if tool calls in the reference trajectory exactly match the predicted trajectory, else 0.
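To make the scoring rule concrete, here is a minimal local sketch of one reading of it. It assumes a trajectory can be treated as an ordered list of tool calls, each a dict with `toolName` and an optional `toolInput`, and that both fields take part in the comparison. It is not the service's implementation.

```python
def call_key(tool_call):
    """Reduce a tool call dict to a comparable (name, input) pair."""
    return (tool_call["toolName"], tool_call.get("toolInput"))


def trajectory_exact_match(predicted, reference):
    """Return 1 if both trajectories hold the same calls in the same order."""
    same = [call_key(c) for c in predicted] == [call_key(c) for c in reference]
    return 1 if same else 0


search = {"toolName": "search", "toolInput": '{"q": "weather"}'}
book = {"toolName": "book", "toolInput": '{"id": 7}'}

print(trajectory_exact_match([search, book], [search, book]))
print(trajectory_exact_match([book, search], [search, book]))
```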
TrajectoryExactMatchInstance
Spec for TrajectoryExactMatch instance.
predictedTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for predicted tool call trajectory.
referenceTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for reference tool call trajectory.
| JSON representation | 
|---|
| { "predictedTrajectory" : { object (Trajectory) } , "referenceTrajectory" : { object (Trajectory) } } | 
Trajectory
ToolCall
Spec for tool call.
toolName 
 
  string 
 
Required. Spec for tool name.
toolInput 
 
  string 
 
Optional. Spec for tool input.
| JSON representation | 
|---|
| { "toolName" : string , "toolInput" : string } | 
TrajectoryInOrderMatchInput
Instances and metric spec for TrajectoryInOrderMatch metric.
metricSpec 
 
  object (  TrajectoryInOrderMatchSpec 
 
) 
 
Required. Spec for TrajectoryInOrderMatch metric.
instances[] 
 
  object (  TrajectoryInOrderMatchInstance 
 
) 
 
Required. Repeated TrajectoryInOrderMatch instance.
| JSON representation | 
|---|
| { "metricSpec" : { object (TrajectoryInOrderMatchSpec) } , "instances" : [ { object (TrajectoryInOrderMatchInstance) } ] } | 
TrajectoryInOrderMatchSpec
This type has no fields.
Spec for TrajectoryInOrderMatch metric - returns 1 if tool calls in the reference trajectory appear in the predicted trajectory in the same order, else 0.
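One way to read this rule is as a subsequence check: the reference calls must show up in the predicted trajectory in order, with extra predicted calls allowed in between. The sketch below implements that reading, assuming a trajectory is a list of dicts with `toolName` and optional `toolInput`; the service's exact matching rules may differ.

```python
def call_key(tool_call):
    """Reduce a tool call dict to a comparable (name, input) pair."""
    return (tool_call["toolName"], tool_call.get("toolInput"))


def trajectory_in_order_match(predicted, reference):
    """Return 1 if `reference` is an ordered subsequence of `predicted`."""
    remaining = iter(call_key(c) for c in predicted)
    # `key in remaining` advances the iterator past the first match, so each
    # reference call has to be found after the previous one.
    found_all = all(call_key(c) in remaining for c in reference)
    return 1 if found_all else 0


a = {"toolName": "lookup", "toolInput": "x"}
b = {"toolName": "summarize", "toolInput": "y"}
extra = {"toolName": "log"}

print(trajectory_in_order_match([a, extra, b], [a, b]))
print(trajectory_in_order_match([b, a], [a, b]))
```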
TrajectoryInOrderMatchInstance
Spec for TrajectoryInOrderMatch instance.
predictedTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for predicted tool call trajectory.
referenceTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for reference tool call trajectory.
| JSON representation | 
|---|
| { "predictedTrajectory" : { object (Trajectory) } , "referenceTrajectory" : { object (Trajectory) } } | 
TrajectoryAnyOrderMatchInput
Instances and metric spec for TrajectoryAnyOrderMatch metric.
metricSpec 
 
  object (  TrajectoryAnyOrderMatchSpec 
 
) 
 
Required. Spec for TrajectoryAnyOrderMatch metric.
instances[] 
 
  object (  TrajectoryAnyOrderMatchInstance 
 
) 
 
Required. Repeated TrajectoryAnyOrderMatch instance.
| JSON representation | 
|---|
| { "metricSpec" : { object (TrajectoryAnyOrderMatchSpec) } , "instances" : [ { object (TrajectoryAnyOrderMatchInstance) } ] } | 
TrajectoryAnyOrderMatchSpec
This type has no fields.
Spec for TrajectoryAnyOrderMatch metric - returns 1 if all tool calls in the reference trajectory appear in the predicted trajectory in any order, else 0.
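The any-order rule can be sketched as a multiset containment test. The version below assumes a trajectory is a list of dicts with `toolName` and optional `toolInput`, and that a reference call repeated twice must also be predicted at least twice. That duplicate handling is an assumption, not documented behavior.

```python
from collections import Counter


def call_key(tool_call):
    """Reduce a tool call dict to a hashable (name, input) pair."""
    return (tool_call["toolName"], tool_call.get("toolInput"))


def trajectory_any_order_match(predicted, reference):
    """Return 1 if every reference call occurs in `predicted`, order ignored."""
    predicted_counts = Counter(call_key(c) for c in predicted)
    reference_counts = Counter(call_key(c) for c in reference)
    # Counter subtraction keeps only positive counts, so an empty result
    # means nothing in the reference is missing from the prediction.
    missing = reference_counts - predicted_counts
    return 0 if missing else 1


a = {"toolName": "lookup", "toolInput": "x"}
b = {"toolName": "summarize", "toolInput": "y"}
extra = {"toolName": "log"}

print(trajectory_any_order_match([b, extra, a], [a, b]))
print(trajectory_any_order_match([a], [a, b]))
```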
TrajectoryAnyOrderMatchInstance
Spec for TrajectoryAnyOrderMatch instance.
predictedTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for predicted tool call trajectory.
referenceTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for reference tool call trajectory.
| JSON representation | 
|---|
| { "predictedTrajectory" : { object (Trajectory) } , "referenceTrajectory" : { object (Trajectory) } } | 
TrajectoryPrecisionInput
Instances and metric spec for TrajectoryPrecision metric.
metricSpec 
 
  object (  TrajectoryPrecisionSpec 
 
) 
 
Required. Spec for TrajectoryPrecision metric.
instances[] 
 
  object (  TrajectoryPrecisionInstance 
 
) 
 
Required. Repeated TrajectoryPrecision instance.
| JSON representation | 
|---|
| { "metricSpec" : { object (TrajectoryPrecisionSpec) } , "instances" : [ { object (TrajectoryPrecisionInstance) } ] } | 
TrajectoryPrecisionSpec
This type has no fields.
Spec for TrajectoryPrecision metric - returns a float score based on average precision of individual tool calls.
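The description does not spell out the formula, so the following is only a sketch of a common reading of precision: the fraction of predicted tool calls that also occur in the reference trajectory. It assumes a trajectory is a list of dicts with `toolName` and optional `toolInput`, and it returns 0.0 for an empty prediction. Both choices are assumptions.

```python
def call_key(tool_call):
    """Reduce a tool call dict to a hashable (name, input) pair."""
    return (tool_call["toolName"], tool_call.get("toolInput"))


def trajectory_precision(predicted, reference):
    """Fraction of predicted calls that also appear in the reference."""
    if not predicted:
        return 0.0  # assumed convention for an empty prediction
    reference_keys = {call_key(c) for c in reference}
    hits = sum(1 for c in predicted if call_key(c) in reference_keys)
    return hits / len(predicted)


a = {"toolName": "lookup", "toolInput": "x"}
b = {"toolName": "summarize", "toolInput": "y"}
extra = {"toolName": "log"}

print(trajectory_precision([a, extra], [a, b]))
```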
TrajectoryPrecisionInstance
Spec for TrajectoryPrecision instance.
predictedTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for predicted tool call trajectory.
referenceTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for reference tool call trajectory.
| JSON representation | 
|---|
| { "predictedTrajectory" : { object (Trajectory) } , "referenceTrajectory" : { object (Trajectory) } } | 
TrajectoryRecallInput
Instances and metric spec for TrajectoryRecall metric.
metricSpec 
 
  object (  TrajectoryRecallSpec 
 
) 
 
Required. Spec for TrajectoryRecall metric.
instances[] 
 
  object (  TrajectoryRecallInstance 
 
) 
 
Required. Repeated TrajectoryRecall instance.
| JSON representation | 
|---|
| { "metricSpec" : { object (TrajectoryRecallSpec) } , "instances" : [ { object (TrajectoryRecallInstance) } ] } | 
TrajectoryRecallSpec
This type has no fields.
Spec for TrajectoryRecall metric - returns a float score based on average recall of individual tool calls.
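As with precision, the exact formula is not given here. A common reading of recall is the fraction of reference tool calls that the predicted trajectory contains, sketched below under the assumption that a trajectory is a list of dicts with `toolName` and optional `toolInput`. Returning 0.0 for an empty reference is also an assumption.

```python
def call_key(tool_call):
    """Reduce a tool call dict to a hashable (name, input) pair."""
    return (tool_call["toolName"], tool_call.get("toolInput"))


def trajectory_recall(predicted, reference):
    """Fraction of reference calls that also appear in the prediction."""
    if not reference:
        return 0.0  # assumed convention for an empty reference
    predicted_keys = {call_key(c) for c in predicted}
    hits = sum(1 for c in reference if call_key(c) in predicted_keys)
    return hits / len(reference)


a = {"toolName": "lookup", "toolInput": "x"}
b = {"toolName": "summarize", "toolInput": "y"}
extra = {"toolName": "log"}

print(trajectory_recall([a, extra], [a, b]))
```

Under this reading, extra predicted calls lower precision but leave recall unchanged.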
TrajectoryRecallInstance
Spec for TrajectoryRecall instance.
predictedTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for predicted tool call trajectory.
referenceTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for reference tool call trajectory.
| JSON representation | 
|---|
| { "predictedTrajectory" : { object (Trajectory) } , "referenceTrajectory" : { object (Trajectory) } } | 
TrajectorySingleToolUseInput
Instances and metric spec for TrajectorySingleToolUse metric.
metricSpec 
 
  object (  TrajectorySingleToolUseSpec 
 
) 
 
Required. Spec for TrajectorySingleToolUse metric.
instances[] 
 
  object (  TrajectorySingleToolUseInstance 
 
) 
 
Required. Repeated TrajectorySingleToolUse instance.
| JSON representation | 
|---|
| { "metricSpec" : { object (TrajectorySingleToolUseSpec) } , "instances" : [ { object (TrajectorySingleToolUseInstance) } ] } | 
TrajectorySingleToolUseSpec
Spec for TrajectorySingleToolUse metric - returns 1 if tool is present in the predicted trajectory, else 0.
toolName 
 
  string 
 
Required. Spec for tool name to be checked for in the predicted trajectory.
| JSON representation | 
|---|
| { "toolName" : string } | 
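The single-tool check is simple enough to sketch locally. The function below assumes the predicted trajectory is a list of dicts carrying a `toolName` key and compares names by exact string equality; the function name and sample tool names are hypothetical.

```python
def trajectory_single_tool_use(tool_name, predicted):
    """Return 1 if any predicted call uses `tool_name`, else 0."""
    used = any(call["toolName"] == tool_name for call in predicted)
    return 1 if used else 0


predicted = [
    {"toolName": "lookup", "toolInput": "x"},
    {"toolName": "summarize"},
]

print(trajectory_single_tool_use("summarize", predicted))
print(trajectory_single_tool_use("translate", predicted))
```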
TrajectorySingleToolUseInstance
Spec for TrajectorySingleToolUse instance.
predictedTrajectory 
 
  object (  Trajectory 
 
) 
 
Required. Spec for predicted tool call trajectory.
| JSON representation | 
|---|
| { "predictedTrajectory" : { object (Trajectory) } } | 
ExactMatchResults
Results for exact match metric.
exactMatchMetricValues[] 
 
  object (  ExactMatchMetricValue 
 
) 
 
Output only. Exact match metric values.
| JSON representation | 
|---|
| { "exactMatchMetricValues" : [ { object (ExactMatchMetricValue) } ] } | 
ExactMatchMetricValue
Exact match metric value for an instance.
score 
 
  number 
 
Output only. Exact match score.
| JSON representation | 
|---|
| { "score" : number } | 
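Results objects of this shape hold one metric value per instance, so a caller will often want an aggregate. The sketch below averages the `score` fields of a results dict; the helper name `mean_score` and the sample payload are hypothetical, and the same pattern would apply to the other `...MetricValues[]` arrays in this reference.

```python
from statistics import mean


def mean_score(results, values_key):
    """Average the `score` of every metric value under `values_key`."""
    values = results.get(values_key, [])
    scores = [v["score"] for v in values if "score" in v]
    # Return None rather than raising when there is nothing to average.
    return mean(scores) if scores else None


# Made-up payload shaped like ExactMatchResults.
exact_match_results = {
    "exactMatchMetricValues": [{"score": 1}, {"score": 0}, {"score": 1}, {"score": 1}]
}

print(mean_score(exact_match_results, "exactMatchMetricValues"))
```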
BleuResults
Results for bleu metric.
bleuMetricValues[] 
 
  object (  BleuMetricValue 
 
) 
 
Output only. Bleu metric values.
| JSON representation | 
|---|
| { "bleuMetricValues" : [ { object (BleuMetricValue) } ] } | 
BleuMetricValue
Bleu metric value for an instance.
score 
 
  number 
 
Output only. Bleu score.
| JSON representation | 
|---|
| { "score" : number } | 
RougeResults
Results for rouge metric.
rougeMetricValues[] 
 
  object (  RougeMetricValue 
 
) 
 
Output only. Rouge metric values.
| JSON representation | 
|---|
| { "rougeMetricValues" : [ { object (RougeMetricValue) } ] } | 
RougeMetricValue
Rouge metric value for an instance.
score 
 
  number 
 
Output only. Rouge score.
| JSON representation | 
|---|
| { "score" : number } | 
FluencyResult
Spec for fluency result.
explanation 
 
  string 
 
Output only. Explanation for fluency score.
score 
 
  number 
 
Output only. Fluency score.
confidence 
 
  number 
 
Output only. Confidence for fluency score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
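Many of the LLM-based result types in this reference share the `explanation` / `score` / `confidence` shape shown above. A small typed wrapper can make such payloads easier to handle; the sketch below is one possible approach, with the class and function names being hypothetical and the sample values made up (score ranges are not documented in this section).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PointwiseResult:
    explanation: str
    score: float
    confidence: Optional[float] = None


def parse_pointwise_result(payload):
    """Convert a result dict such as FluencyResult into a PointwiseResult."""
    confidence = payload.get("confidence")
    return PointwiseResult(
        explanation=payload.get("explanation", ""),
        score=float(payload["score"]),
        confidence=float(confidence) if confidence is not None else None,
    )


fluency = parse_pointwise_result(
    {"explanation": "Reads naturally.", "score": 4, "confidence": 0.9}
)
print(fluency)
```

Keeping `confidence` optional lets the same wrapper handle result types that carry only `explanation` and `score`.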
CoherenceResult
Spec for coherence result.
explanation 
 
  string 
 
Output only. Explanation for coherence score.
score 
 
  number 
 
Output only. Coherence score.
confidence 
 
  number 
 
Output only. Confidence for coherence score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
SafetyResult
Spec for safety result.
explanation 
 
  string 
 
Output only. Explanation for safety score.
score 
 
  number 
 
Output only. Safety score.
confidence 
 
  number 
 
Output only. Confidence for safety score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
GroundednessResult
Spec for groundedness result.
explanation 
 
  string 
 
Output only. Explanation for groundedness score.
score 
 
  number 
 
Output only. Groundedness score.
confidence 
 
  number 
 
Output only. Confidence for groundedness score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
FulfillmentResult
Spec for fulfillment result.
explanation 
 
  string 
 
Output only. Explanation for fulfillment score.
score 
 
  number 
 
Output only. Fulfillment score.
confidence 
 
  number 
 
Output only. Confidence for fulfillment score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
SummarizationQualityResult
Spec for summarization quality result.
explanation 
 
  string 
 
Output only. Explanation for summarization quality score.
score 
 
  number 
 
Output only. Summarization Quality score.
confidence 
 
  number 
 
Output only. Confidence for summarization quality score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
PairwiseSummarizationQualityResult
Spec for pairwise summarization quality result.
pairwiseChoice 
 
  enum (  PairwiseChoice 
 
) 
 
Output only. Pairwise summarization prediction choice.
explanation 
 
  string 
 
Output only. Explanation for summarization quality score.
confidence 
 
  number 
 
Output only. Confidence for summarization quality score.
| JSON representation | 
|---|
| { "pairwiseChoice" : enum (PairwiseChoice) , "explanation" : string , "confidence" : number } | 
PairwiseChoice
Pairwise prediction autorater preference.
| Enums | |
|---|---|
| PAIRWISE_CHOICE_UNSPECIFIED | Unspecified prediction choice. | 
| BASELINE | Baseline prediction wins | 
| CANDIDATE | Candidate prediction wins | 
| TIE | Winner cannot be determined | 
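When many pairwise results are collected, the enum values above can be tallied into a simple summary. The sketch below counts each choice and computes the candidate's win rate over decided comparisons only (ties and unspecified choices excluded). That definition of win rate, the helper name, and the sample data are assumptions made for illustration.

```python
from collections import Counter


def summarize_pairwise(results):
    """Count PairwiseChoice values and compute the candidate win rate."""
    counts = Counter(
        r.get("pairwiseChoice", "PAIRWISE_CHOICE_UNSPECIFIED") for r in results
    )
    decided = counts["BASELINE"] + counts["CANDIDATE"]
    # None signals that no comparison produced a winner.
    win_rate = counts["CANDIDATE"] / decided if decided else None
    return counts, win_rate


# Made-up results shaped like the pairwise result objects in this reference.
results = [
    {"pairwiseChoice": "CANDIDATE"},
    {"pairwiseChoice": "BASELINE"},
    {"pairwiseChoice": "CANDIDATE"},
    {"pairwiseChoice": "TIE"},
]

counts, win_rate = summarize_pairwise(results)
print(dict(counts), win_rate)
```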
SummarizationHelpfulnessResult
Spec for summarization helpfulness result.
explanation 
 
  string 
 
Output only. Explanation for summarization helpfulness score.
score 
 
  number 
 
Output only. Summarization Helpfulness score.
confidence 
 
  number 
 
Output only. Confidence for summarization helpfulness score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
SummarizationVerbosityResult
Spec for summarization verbosity result.
explanation 
 
  string 
 
Output only. Explanation for summarization verbosity score.
score 
 
  number 
 
Output only. Summarization Verbosity score.
confidence 
 
  number 
 
Output only. Confidence for summarization verbosity score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
QuestionAnsweringQualityResult
Spec for question answering quality result.
explanation 
 
  string 
 
Output only. Explanation for question answering quality score.
score 
 
  number 
 
Output only. Question Answering Quality score.
confidence 
 
  number 
 
Output only. Confidence for question answering quality score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
PairwiseQuestionAnsweringQualityResult
Spec for pairwise question answering quality result.
pairwiseChoice 
 
  enum (  PairwiseChoice 
 
) 
 
Output only. Pairwise question answering prediction choice.
explanation 
 
  string 
 
Output only. Explanation for question answering quality score.
confidence 
 
  number 
 
Output only. Confidence for question answering quality score.
| JSON representation | 
|---|
| { "pairwiseChoice" : enum (PairwiseChoice) , "explanation" : string , "confidence" : number } | 
QuestionAnsweringRelevanceResult
Spec for question answering relevance result.
explanation 
 
  string 
 
Output only. Explanation for question answering relevance score.
score 
 
  number 
 
Output only. Question Answering Relevance score.
confidence 
 
  number 
 
Output only. Confidence for question answering relevance score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
QuestionAnsweringHelpfulnessResult
Spec for question answering helpfulness result.
explanation 
 
  string 
 
Output only. Explanation for question answering helpfulness score.
score 
 
  number 
 
Output only. Question Answering Helpfulness score.
confidence 
 
  number 
 
Output only. Confidence for question answering helpfulness score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
QuestionAnsweringCorrectnessResult
Spec for question answering correctness result.
explanation 
 
  string 
 
Output only. Explanation for question answering correctness score.
score 
 
  number 
 
Output only. Question Answering Correctness score.
confidence 
 
  number 
 
Output only. Confidence for question answering correctness score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number , "confidence" : number } | 
PointwiseMetricResult
Spec for pointwise metric result.
explanation 
 
  string 
 
Output only. Explanation for pointwise metric score.
score 
 
  number 
 
Output only. Pointwise metric score.
| JSON representation | 
|---|
| { "explanation" : string , "score" : number } | 
PairwiseMetricResult
Spec for pairwise metric result.
pairwiseChoice 
 
  enum (  PairwiseChoice 
 
) 
 
Output only. Pairwise metric choice.
explanation 
 
  string 
 
Output only. Explanation for pairwise metric score.
| JSON representation | 
|---|
| { "pairwiseChoice" : enum (PairwiseChoice) , "explanation" : string } | 
ToolCallValidResults
Results for tool call valid metric.
toolCallValidMetricValues[] 
 
  object (  ToolCallValidMetricValue 
 
) 
 
Output only. Tool call valid metric values.
| JSON representation | 
|---|
| { "toolCallValidMetricValues" : [ { object (ToolCallValidMetricValue) } ] } | 
ToolCallValidMetricValue
Tool call valid metric value for an instance.
score 
 
  number 
 
Output only. Tool call valid score.
| JSON representation | 
|---|
| { "score" : number } | 
ToolNameMatchResults
Results for tool name match metric.
toolNameMatchMetricValues[] 
 
  object (  ToolNameMatchMetricValue 
 
) 
 
Output only. Tool name match metric values.
| JSON representation | 
|---|
| { "toolNameMatchMetricValues" : [ { object (ToolNameMatchMetricValue) } ] } | 
ToolNameMatchMetricValue
Tool name match metric value for an instance.
score 
 
  number 
 
Output only. Tool name match score.
| JSON representation | 
|---|
| { "score" : number } | 
ToolParameterKeyMatchResults
Results for tool parameter key match metric.
toolParameterKeyMatchMetricValues[] 
 
  object (  ToolParameterKeyMatchMetricValue 
 
) 
 
Output only. Tool parameter key match metric values.
| JSON representation | 
|---|
| { "toolParameterKeyMatchMetricValues" : [ { object (ToolParameterKeyMatchMetricValue) } ] } | 
ToolParameterKeyMatchMetricValue
Tool parameter key match metric value for an instance.
score 
 
  number 
 
Output only. Tool parameter key match score.
| JSON representation | 
|---|
| { "score" : number } | 
ToolParameterKVMatchResults
Results for tool parameter key value match metric.
toolParameterKvMatchMetricValues[] 
 
  object (  ToolParameterKVMatchMetricValue 
 
) 
 
Output only. Tool parameter key value match metric values.
| JSON representation | 
|---|
| { "toolParameterKvMatchMetricValues" : [ { object (ToolParameterKVMatchMetricValue) } ] } | 
ToolParameterKVMatchMetricValue
Tool parameter key value match metric value for an instance.
score 
 
  number 
 
Output only. Tool parameter key value match score.
| JSON representation | 
|---|
| { "score" : number } | 
CometResult
Spec for Comet result - calculates the comet score for the given instance using the version specified in the spec.
score 
 
  number 
 
Output only. Comet score. Range depends on version.
| JSON representation | 
|---|
| { "score" : number } | 
MetricxResult
Spec for MetricX result - calculates the MetricX score for the given instance using the version specified in the spec.
score 
 
  number 
 
Output only. MetricX score. Range depends on version.
| JSON representation | 
|---|
| { "score" : number } | 
TrajectoryExactMatchResults
Results for TrajectoryExactMatch metric.
trajectoryExactMatchMetricValues[] 
 
  object (  TrajectoryExactMatchMetricValue 
 
) 
 
Output only. TrajectoryExactMatch metric values.
| JSON representation | 
|---|
| { "trajectoryExactMatchMetricValues" : [ { object (TrajectoryExactMatchMetricValue) } ] } | 
TrajectoryExactMatchMetricValue
TrajectoryExactMatch metric value for an instance.
score 
 
  number 
 
Output only. TrajectoryExactMatch score.
| JSON representation | 
|---|
| { "score" : number } | 
TrajectoryInOrderMatchResults
Results for TrajectoryInOrderMatch metric.
trajectoryInOrderMatchMetricValues[] 
 
  object (  TrajectoryInOrderMatchMetricValue 
 
) 
 
Output only. TrajectoryInOrderMatch metric values.
| JSON representation | 
|---|
| { "trajectoryInOrderMatchMetricValues" : [ { object (TrajectoryInOrderMatchMetricValue) } ] } | 
TrajectoryInOrderMatchMetricValue
TrajectoryInOrderMatch metric value for an instance.
score 
 
  number 
 
Output only. TrajectoryInOrderMatch score.
| JSON representation | 
|---|
| { "score" : number } | 
TrajectoryAnyOrderMatchResults
Results for TrajectoryAnyOrderMatch metric.
trajectoryAnyOrderMatchMetricValues[] 
 
  object (  TrajectoryAnyOrderMatchMetricValue 
 
) 
 
Output only. TrajectoryAnyOrderMatch metric values.
| JSON representation | 
|---|
| { "trajectoryAnyOrderMatchMetricValues" : [ { object (TrajectoryAnyOrderMatchMetricValue) } ] } | 
TrajectoryAnyOrderMatchMetricValue
TrajectoryAnyOrderMatch metric value for an instance.
score 
 
  number 
 
Output only. TrajectoryAnyOrderMatch score.
| JSON representation | 
|---|
| { "score" : number } | 
TrajectoryPrecisionResults
Results for TrajectoryPrecision metric.
trajectoryPrecisionMetricValues[] 
 
  object (  TrajectoryPrecisionMetricValue 
 
) 
 
Output only. TrajectoryPrecision metric values.
| JSON representation | 
|---|
| { "trajectoryPrecisionMetricValues" : [ { object (TrajectoryPrecisionMetricValue) } ] } | 
TrajectoryPrecisionMetricValue
TrajectoryPrecision metric value for an instance.
score 
 
  number 
 
Output only. TrajectoryPrecision score.
| JSON representation | 
|---|
| { "score" : number } | 
TrajectoryRecallResults
Results for TrajectoryRecall metric.
trajectoryRecallMetricValues[] 
 
  object (  TrajectoryRecallMetricValue 
 
) 
 
Output only. TrajectoryRecall metric values.
| JSON representation | 
|---|
| { "trajectoryRecallMetricValues" : [ { object (TrajectoryRecallMetricValue) } ] } | 
TrajectoryRecallMetricValue
TrajectoryRecall metric value for an instance.
score 
 
  number 
 
Output only. TrajectoryRecall score.
| JSON representation | 
|---|
| { "score" : number } | 
TrajectorySingleToolUseResults
Results for TrajectorySingleToolUse metric.
trajectorySingleToolUseMetricValues[] 
 
  object (  TrajectorySingleToolUseMetricValue 
 
) 
 
Output only. TrajectorySingleToolUse metric values.
| JSON representation | 
|---|
| { "trajectorySingleToolUseMetricValues" : [ { object (TrajectorySingleToolUseMetricValue) } ] } | 
TrajectorySingleToolUseMetricValue
TrajectorySingleToolUse metric value for an instance.
score 
 
  number 
 
Output only. TrajectorySingleToolUse score.
| JSON representation | 
|---|
| { "score" : number } | 

