Google Cloud Managed Service for Prometheus supports Prometheus-compatible rule evaluation and alerting. This document describes how to set up self-deployed rule evaluation, including the standalone rule-evaluator component.
You only need to follow these instructions if you want to execute rules and alerts against the global datastore.
Rule evaluation for self-deployed collection
After you have deployed Managed Service for Prometheus, you can continue to evaluate rules locally in each deployed instance by using the rule_files field of your Prometheus configuration file. However, the maximum query window for the rules is constrained by how long the server keeps local data.
Most rules execute only over the last few minutes of data, so running rules on each local server is often a valid strategy. In that case, no further setup is necessary.
However, sometimes it's useful to be able to evaluate rules against the global metric backend, for example, when all data for a rule is not co-located on a given Prometheus instance. For these cases, Managed Service for Prometheus also provides a rule-evaluator component.
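For the local-evaluation case, the relevant part of a Prometheus configuration file might look like the following sketch. The file path and the interval are hypothetical placeholders, not values this document prescribes.

```yaml
# prometheus.yml (fragment) for one self-deployed Prometheus instance.
global:
  # How often rules are evaluated; 30s is an illustrative value.
  evaluation_interval: 30s
rule_files:
# Hypothetical location of your rule files; glob patterns are accepted.
- /etc/prometheus/rules/*.yaml
```

Rules loaded this way see only the data held by that one server, so a rule whose range selector is longer than the server's local retention evaluates over partial data.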
Before you begin
This section describes the configuration needed for the tasks described in this document.
Configure your environment
To avoid repeatedly entering your project ID or cluster name, perform the following configuration:
- Configure the command-line tools as follows:
  - Configure the gcloud CLI to refer to the ID of your Google Cloud project:
    gcloud config set project PROJECT_ID
  - Configure the kubectl CLI to use your cluster:
    kubectl config set-cluster CLUSTER_NAME
Set up a namespace
Create the NAMESPACE_NAME Kubernetes namespace for resources you create as part of the example application:
kubectl create ns NAMESPACE_NAME
Verify service account credentials
If your Kubernetes cluster has Workload Identity Federation for GKE enabled, then you can skip this section.
When running on GKE, Managed Service for Prometheus automatically retrieves credentials from the environment based on the Compute Engine default service account. The default service account has the necessary permissions, monitoring.metricWriter and monitoring.viewer, by default. If you don't use Workload Identity Federation for GKE, and you have previously removed either of those roles from the default node service account, you will have to re-add those missing permissions before continuing.
Configure a service account for Workload Identity Federation for GKE
If your Kubernetes cluster doesn't have Workload Identity Federation for GKE enabled, then you can skip this section.
Managed Service for Prometheus captures metric data by using the Cloud Monitoring API. If your cluster is using Workload Identity Federation for GKE, you must grant your Kubernetes service account permission to the Monitoring API. This section describes the following:
- Creating a dedicated Google Cloud service account, gmp-test-sa.
- Binding the Google Cloud service account to the default Kubernetes service account in a test namespace, NAMESPACE_NAME.
- Granting the necessary permission to the Google Cloud service account.
Create and bind the service account
This step appears in several places in the Managed Service for Prometheus documentation. If you have already performed this step as part of a prior task, then you don't need to repeat it. Skip ahead to Authorize the service account.
The following command sequence creates the gmp-test-sa service account and binds it to the default Kubernetes service account in the NAMESPACE_NAME namespace:
gcloud config set project PROJECT_ID \
&& gcloud iam service-accounts create gmp-test-sa \
&& gcloud iam service-accounts add-iam-policy-binding \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:PROJECT_ID.svc.id.goog[NAMESPACE_NAME/default]" \
  gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
&& kubectl annotate serviceaccount \
  --namespace NAMESPACE_NAME \
  default \
  iam.gke.io/gcp-service-account=gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com
If you are using a different GKE namespace or service account, adjust the commands appropriately.
Authorize the service account
Groups of related permissions are collected into roles, and you grant the roles to a principal, in this example, the Google Cloud service account. For more information about Monitoring roles, see Access control.
The following command grants the Google Cloud service account, gmp-test-sa, the Monitoring API roles it needs to read and write metric data.
If you have already granted the Google Cloud service account a specific role as part of a prior task, then you don't need to do it again.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/monitoring.viewer \
&& \
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
  --role=roles/monitoring.metricWriter
Debug your Workload Identity Federation for GKE configuration
If you are having trouble getting Workload Identity Federation for GKE to work, see the documentation for verifying your Workload Identity Federation for GKE setup and the Workload Identity Federation for GKE troubleshooting guide.
As typos and partial copy-pastes are the most common sources of errors when configuring Workload Identity Federation for GKE, we strongly recommend using the editable variables and clickable copy-paste icons embedded in the code samples in these instructions.
Workload Identity Federation for GKE in production environments
The example described in this document binds the Google Cloud service account to the default Kubernetes service account and gives the Google Cloud service account all necessary permissions to use the Monitoring API.
In a production environment, you might want to use a finer-grained approach, with a service account for each component, each with minimal permissions. For more information on configuring service accounts for workload-identity management, see Using Workload Identity Federation for GKE.
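As one illustration of the finer-grained approach, a component such as the rule evaluator could run under its own Kubernetes service account rather than default. The following is a sketch only: the names rule-evaluator-ksa and gmp-rule-eval-sa are hypothetical, and the annotation key is the one used by the kubectl annotate command earlier in this document.

```yaml
# Dedicated Kubernetes service account for one component (hypothetical names).
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rule-evaluator-ksa
  namespace: NAMESPACE_NAME
  annotations:
    # Links this Kubernetes service account to a Google Cloud service account.
    iam.gke.io/gcp-service-account: gmp-rule-eval-sa@PROJECT_ID.iam.gserviceaccount.com
```

With this layout, the roles/iam.workloadIdentityUser binding would name PROJECT_ID.svc.id.goog[NAMESPACE_NAME/rule-evaluator-ksa] as the member, and the pod template of the component's Deployment would set serviceAccountName to the new account.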
Deploy the standalone rule evaluator
The Managed Service for Prometheus rule evaluator evaluates Prometheus alerting and recording rules against the Managed Service for Prometheus HTTP API and writes the results back to Monarch. It accepts the same configuration-file format and rule-file format as Prometheus. The flags are mostly identical, as well.
- Create an example deployment of the rule evaluator that is pre-configured to evaluate an alerting and a recording rule:
  kubectl apply -n NAMESPACE_NAME -f https://raw.githubusercontent.com/GoogleCloudPlatform/prometheus-engine/v0.15.3/manifests/rule-evaluator.yaml
- Verify that the pods for the rule-evaluator deployed successfully:
  kubectl -n NAMESPACE_NAME get pod
  If the deployment was successful, then you see output similar to the following:
  NAME                              READY   STATUS    RESTARTS   AGE
  ...
  rule-evaluator-64475b696c-95z29   2/2     Running   0          1m
After you verify that the rule-evaluator deployed successfully, you can make adjustments to the installed manifests to do the following:
- Add your custom rules files.
- Configure the rule-evaluator to send alerts to a self-deployed Prometheus Alertmanager by using the alertmanager_config field of the configuration file.
If your Alertmanager is located in a different cluster than your rule-evaluator, then you might need to set up an Endpoints resource. For example, if your OperatorConfig specifies that Alertmanager endpoints can be found in Endpoints object ns=alertmanager/name=alertmanager, then you can manually or programmatically create this object yourself and populate it with reachable IPs from the other cluster.
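The two adjustments listed above might look like the following in the rule-evaluator configuration file. This is a minimal sketch that assumes the standard Prometheus configuration format, which this document says the rule evaluator accepts; the rule-file path and the Alertmanager address are hypothetical placeholders.

```yaml
# Rule-evaluator configuration file (fragment), standard Prometheus format.
rule_files:
# Hypothetical path: wherever your custom rule files are mounted.
- /etc/rules/*.yaml
alerting:
  alertmanagers:
  # Each list entry here is one alertmanager_config.
  - static_configs:
    - targets:
      # Hypothetical in-cluster address of a self-deployed Alertmanager.
      - alertmanager.monitoring.svc:9093
```

For an Alertmanager in another cluster, an Endpoints object matching the ns=alertmanager/name=alertmanager example could be created by hand along these lines. The IP address, port name, and port number are illustrative only.

```yaml
apiVersion: v1
kind: Endpoints
metadata:
  namespace: alertmanager
  name: alertmanager
subsets:
- addresses:
  # Hypothetical IP of the remote Alertmanager, reachable from this cluster.
  - ip: 10.0.0.10
  ports:
  - name: web
    port: 9093
```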
Provide credentials explicitly
When running on GKE, the rule-evaluator
automatically retrieves credentials from the environment based on the
node's service account or the Workload Identity Federation for GKE setup.
In non-GKE Kubernetes clusters, credentials must be explicitly
provided to the rule-evaluator by using flags or the GOOGLE_APPLICATION_CREDENTIALS 
environment variable.
- Set the context to your target project:
  gcloud config set project PROJECT_ID
- Create a service account:
  gcloud iam service-accounts create gmp-test-sa
  This step creates the service account that you might have already created in the Workload Identity Federation for GKE instructions.
- Grant the required permissions to the service account:
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
    --role=roles/monitoring.viewer \
  && \
  gcloud projects add-iam-policy-binding PROJECT_ID \
    --member=serviceAccount:gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com \
    --role=roles/monitoring.metricWriter
- Create and download a key for the service account:
  gcloud iam service-accounts keys create gmp-test-sa-key.json \
    --iam-account=gmp-test-sa@PROJECT_ID.iam.gserviceaccount.com
- Add the key file as a secret to your non-GKE cluster:
  kubectl -n NAMESPACE_NAME create secret generic gmp-test-sa \
    --from-file=key.json=gmp-test-sa-key.json
- Open the rule-evaluator Deployment resource for editing:
  kubectl -n NAMESPACE_NAME edit deploy rule-evaluator
- Add the two credentials-file arguments, the volume mount, and the volume shown in the following excerpt to the resource:
  apiVersion: apps/v1
  kind: Deployment
  metadata:
    namespace: NAMESPACE_NAME
    name: rule-evaluator
  spec:
    template:
      spec:
        containers:
        - name: evaluator
          args:
          - --query.credentials-file=/gmp/key.json
          - --export.credentials-file=/gmp/key.json
  ...
          volumeMounts:
          - name: gmp-sa
            mountPath: /gmp
            readOnly: true
  ...
        volumes:
        - name: gmp-sa
          secret:
            secretName: gmp-test-sa
  ...
-  Save the file and close the editor. After the change is applied, the pods are re-created and start authenticating to the metric backend with the given service account. 
Alternatively, you can provide the key file path by using the GOOGLE_APPLICATION_CREDENTIALS environment variable.
Multi-project and global rule evaluation
We recommend that you run one instance of the rule evaluator in each Google Cloud project and region rather than running one instance that evaluates against many projects and regions. However, we do support multi-project rule evaluation for scenarios that require it.
When deployed on Google Kubernetes Engine, the rule evaluator uses the Google Cloud project associated with the cluster, which it automatically detects. To evaluate rules that span projects, you can override the queried project by using the --query.project-id flag and specifying a project with a multi-project metrics scope. If your metrics scope contains all your projects, then your rules evaluate globally. For more information, see Metrics scopes.
You must also update the permissions of the service account used by the rule evaluator so the service account can read from the scoping project and write to all monitored projects in the metrics scope.
Preserve labels when writing rules
For data the evaluator writes back to Managed Service for Prometheus, the evaluator supports the same --export.* flags and external_labels-based configuration as the Managed Service for Prometheus server binary. We strongly recommend that you write rules so that the project_id, location, cluster, and namespace labels are preserved appropriately for their aggregation level; otherwise, query performance might decline and you might encounter cardinality limits.
The project_id and location labels are mandatory. If these labels are missing, then their values in rule-evaluation results are set based on the configuration of the rule evaluator. Missing cluster or namespace labels are not given values.
Self-observability
The rule-evaluator emits Prometheus metrics on a configurable port, which you set by using the --web.listen-address flag.
For example, if the pod rule-evaluator-64475b696c-95z29 is exposing these metrics on port 9092, the metrics can be viewed manually by using kubectl:
# Port forward the metrics endpoint.
kubectl port-forward rule-evaluator-64475b696c-95z29 9092
# Then query in a separate terminal.
curl localhost:9092/metrics
You can configure your Prometheus stack to collect these metrics so that you have visibility into the performance of the rule-evaluator.
High-availability deployments
The rule evaluator can run in a highly available setup by following the same approach as documented for the Prometheus server.
Alerting using Cloud Monitoring metrics
You can configure the rule evaluator to alert on Google Cloud system metrics using PromQL. For instructions on how to create a valid query, see PromQL for Cloud Monitoring metrics.
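A rule file for the rule evaluator might combine label preservation and an alert on a Cloud Monitoring metric, as in the following sketch in the standard Prometheus rule-file format. The group, rule, and application metric names are hypothetical. The Cloud Monitoring metric name assumes the naming convention described in PromQL for Cloud Monitoring metrics, and the threshold and durations are illustrative.

```yaml
groups:
- name: example-global-rules
  interval: 30s
  rules:
  # Recording rule that aggregates per namespace while keeping the
  # project_id, location, cluster, and namespace labels.
  - record: namespace:http_requests:rate5m
    expr: sum by (project_id, location, cluster, namespace) (rate(http_requests_total[5m]))
  # Alerting rule on a Cloud Monitoring system metric. The PromQL name is
  # assumed to map to compute.googleapis.com/instance/cpu/utilization.
  - alert: HighInstanceCpuUtilization
    expr: compute_googleapis_com:instance_cpu_utilization > 0.9
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "Instance CPU utilization has stayed above 90% for 10 minutes."
```

Because the recording rule groups by all four labels, its output stays attributable to a project, location, cluster, and namespace instead of relying on the evaluator's configured defaults.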