Application observability with Prometheus on GKE


This tutorial shows you how to set up liveness probes for application microservices deployed to Google Kubernetes Engine (GKE) by using open source Prometheus.

This tutorial uses open source Prometheus. However, each GKE Autopilot cluster automatically deploys Managed Service for Prometheus, Google Cloud's fully managed, multi-cloud, cross-project solution for Prometheus metrics. Managed Service for Prometheus lets you globally monitor and alert on your workloads using Prometheus, without having to manually manage and operate Prometheus at scale.

You can also use open source tools like Grafana to visualize metrics collected by Prometheus.

Objectives

  • Create a cluster.
  • Deploy Prometheus.
  • Deploy the sample application, Bank of Anthos.
  • Configure Prometheus liveness probes.
  • Configure Prometheus alerts.
  • Configure Alertmanager to send notifications to a Slack channel.
  • Simulate an outage to test Prometheus.

Costs

In this document, you use the following billable components of Google Cloud:

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

When you finish the tasks that are described in this document, you can avoid continued billing by deleting the resources that you created. For more information, see Clean up.

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, click Create project to begin creating a new Google Cloud project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the GKE API.

    Enable the API

  5. Install Helm.

Prepare the environment

In this tutorial, you use Cloud Shell to manage resources hosted on Google Cloud.

  1. Set the default environment variables:

      gcloud config set project PROJECT_ID
      gcloud config set compute/region CONTROL_PLANE_LOCATION
     
    

    Replace the following:

    • PROJECT_ID: your Google Cloud project ID.
    • CONTROL_PLANE_LOCATION: the Compute Engine region of the control plane of your cluster. For this tutorial, the region is us-central1. Typically, you want a region that is close to you.
  2. Clone the sample repository used in this tutorial:

      git clone https://github.com/GoogleCloudPlatform/bank-of-anthos.git
      cd bank-of-anthos/
    
  3. Create a cluster:

      gcloud container clusters create-auto CLUSTER_NAME \
          --release-channel=CHANNEL_NAME \
          --location=CONTROL_PLANE_LOCATION
     
    

    Replace the following:

    • CLUSTER_NAME: a name for the new cluster.
    • CHANNEL_NAME: the name of a release channel.

Deploy Prometheus

Install Prometheus using the sample Helm chart:

    helm repo add bitnami https://charts.bitnami.com/bitnami
    helm install tutorial bitnami/kube-prometheus \
        --version 8.2.2 \
        --values extras/prometheus/oss/values.yaml \
        --wait

This command installs Prometheus with the following components:

  • Prometheus Operator: a popular way to deploy and configure open source Prometheus.
  • Alertmanager: handles alerts sent by the Prometheus server and routes them to applications, such as Slack.
  • Blackbox exporter: lets Prometheus probe endpoints using HTTP, HTTPS, DNS, TCP, ICMP, and gRPC.

Deploy Bank of Anthos

Deploy the Bank of Anthos sample application:

    kubectl apply -f extras/jwt/jwt-secret.yaml
    kubectl apply -f kubernetes-manifests

Slack notifications

To set up Slack notifications, you must create a Slack application, activate Incoming Webhooks for the application, and install the application to a Slack workspace.

Create the Slack application

  1. Join a Slack workspace, either by registering with your email or by using an invitation sent by a Workspace Admin.

  2. Sign in to Slack using your workspace name and your Slack account credentials.

  3. Create a new Slack app:

    1. In the Create an app dialog, click From scratch.
    2. Specify an App Name and choose your Slack workspace.
    3. Click Create App.
    4. Under Add features and functionality, click Incoming Webhooks.
    5. Click the Activate Incoming Webhooks toggle.
    6. In the Webhook URLs for Your Workspace section, click Add New Webhook to Workspace.
    7. On the authorization page that opens, select a channel to receive notifications.
    8. Click Allow.
    9. A webhook for your Slack application is displayed in the Webhook URLs for Your Workspace section. Save the URL for later.
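
Before wiring the webhook into Alertmanager, it can help to confirm that it accepts messages. The following Python sketch assumes the standard Slack incoming-webhook contract (an HTTP POST with a JSON body that contains a `text` field). The function names and the default message text are illustrative and are not part of the tutorial repository.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_message(webhook_url: str,
                      text: str = "Test message from the Prometheus tutorial") -> int:
    """Send one message to the webhook and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `send_test_message` with the URL that you saved should post a single message to the channel that you selected.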

Configure Alertmanager

Create a Kubernetes Secret to store the webhook URL, and then apply the Alertmanager configuration:

    kubectl create secret generic alertmanager-slack-webhook --from-literal webhookURL=SLACK_WEBHOOK_URL
    kubectl apply -f extras/prometheus/oss/alertmanagerconfig.yaml

Replace SLACK_WEBHOOK_URL with the URL of the webhook from the previous section.
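
For reference, `kubectl create secret generic` stores the literal under the `webhookURL` key of an Opaque Secret in base64-encoded form. The following Python sketch builds an equivalent manifest as a dictionary so that the encoding is visible. The helper name and the example URL are hypothetical, and base64 is an encoding, not encryption.

```python
import base64
import json


def secret_manifest(name: str, key: str, value: str) -> dict:
    """Return a Secret manifest whose data field holds the base64-encoded value."""
    encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        "data": {key: encoded},
    }


# Hypothetical placeholder URL; use the webhook URL saved from Slack instead.
EXAMPLE_URL = "https://hooks.slack.com/services/EXAMPLE"
manifest = secret_manifest("alertmanager-slack-webhook", "webhookURL", EXAMPLE_URL)
print(json.dumps(manifest, indent=2))
```

Because anyone who can read the Secret can decode the value, access to it is worth restricting in the same way as access to the Slack channel itself.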

Configure Prometheus

  1. Review the following manifest:

      # Copyright 2023 Google LLC
      #
      # Licensed under the Apache License, Version 2.0 (the "License");
      # you may not use this file except in compliance with the License.
      # You may obtain a copy of the License at
      #
      #      http://www.apache.org/licenses/LICENSE-2.0
      #
      # Unless required by applicable law or agreed to in writing, software
      # distributed under the License is distributed on an "AS IS" BASIS,
      # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      # See the License for the specific language governing permissions and
      # limitations under the License.
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: Probe
      metadata:
        name: frontend-probe
      spec:
        jobName: frontend
        prober:
          url: tutorial-kube-prometheus-blackbox-exporter:19115
          path: /probe
        module: http_2xx
        interval: 60s
        scrapeTimeout: 30s
        targets:
          staticConfig:
            labels:
              app: bank-of-anthos
            static:
              - frontend:80
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: Probe
      metadata:
        name: userservice-probe
      spec:
        jobName: userservice
        prober:
          url: tutorial-kube-prometheus-blackbox-exporter:19115
          path: /probe
        module: http_2xx
        interval: 60s
        scrapeTimeout: 30s
        targets:
          staticConfig:
            labels:
              app: bank-of-anthos
            static:
              - userservice:8080/ready
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: Probe
      metadata:
        name: balancereader-probe
      spec:
        jobName: balancereader
        prober:
          url: tutorial-kube-prometheus-blackbox-exporter:19115
          path: /probe
        module: http_2xx
        interval: 60s
        scrapeTimeout: 30s
        targets:
          staticConfig:
            labels:
              app: bank-of-anthos
            static:
              - balancereader:8080/ready
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: Probe
      metadata:
        name: contacts-probe
      spec:
        jobName: contacts
        prober:
          url: tutorial-kube-prometheus-blackbox-exporter:19115
          path: /probe
        module: http_2xx
        interval: 60s
        scrapeTimeout: 30s
        targets:
          staticConfig:
            labels:
              app: bank-of-anthos
            static:
              - contacts:8080/ready
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: Probe
      metadata:
        name: ledgerwriter-probe
      spec:
        jobName: ledgerwriter
        prober:
          url: tutorial-kube-prometheus-blackbox-exporter:19115
          path: /probe
        module: http_2xx
        interval: 60s
        scrapeTimeout: 30s
        targets:
          staticConfig:
            labels:
              app: bank-of-anthos
            static:
              - ledgerwriter:8080/ready
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: Probe
      metadata:
        name: transactionhistory-probe
      spec:
        jobName: transactionhistory
        prober:
          url: tutorial-kube-prometheus-blackbox-exporter:19115
          path: /probe
        module: http_2xx
        interval: 60s
        scrapeTimeout: 30s
        targets:
          staticConfig:
            labels:
              app: bank-of-anthos
            static:
              - transactionhistory:8080/ready
     
    

    This manifest describes Prometheus liveness probes and includes the following fields:

    • spec.jobName: the job name assigned to scraped metrics.
    • spec.prober.url: the Service URL of the blackbox exporter. This includes the default port for the blackbox exporter, which is defined in the Helm chart.
    • spec.prober.path: the metrics collection path.
    • spec.targets.staticConfig.labels: the labels assigned to all metrics scraped from the targets.
    • spec.targets.staticConfig.static: the list of hosts to probe.
  2. Apply the manifest to your cluster:

      kubectl apply -f extras/prometheus/oss/probes.yaml
    
  3. Review the following manifest:

      # Copyright 2023 Google LLC
      #
      # Licensed under the Apache License, Version 2.0 (the "License");
      # you may not use this file except in compliance with the License.
      # You may obtain a copy of the License at
      #
      #      http://www.apache.org/licenses/LICENSE-2.0
      #
      # Unless required by applicable law or agreed to in writing, software
      # distributed under the License is distributed on an "AS IS" BASIS,
      # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      # See the License for the specific language governing permissions and
      # limitations under the License.
      ---
      apiVersion: monitoring.coreos.com/v1
      kind: PrometheusRule
      metadata:
        name: uptime-rule
      spec:
        groups:
        - name: Micro services uptime
          interval: 60s
          rules:
          - alert: BalancereaderUnavaiable
            expr: probe_success{app="bank-of-anthos",job="balancereader"} == 0
            for: 1m
            annotations:
              summary: Balance Reader Service is unavailable
              description: Check Balance Reader pods and it's logs
            labels:
              severity: 'critical'
          - alert: ContactsUnavaiable
            expr: probe_success{app="bank-of-anthos",job="contacts"} == 0
            for: 1m
            annotations:
              summary: Contacs Service is unavailable
              description: Check Contacs pods and it's logs
            labels:
              severity: 'warning'
          - alert: FrontendUnavaiable
            expr: probe_success{app="bank-of-anthos",job="frontend"} == 0
            for: 1m
            annotations:
              summary: Frontend Service is unavailable
              description: Check Frontend pods and it's logs
            labels:
              severity: 'critical'
          - alert: LedgerwriterUnavaiable
            expr: probe_success{app="bank-of-anthos",job="ledgerwriter"} == 0
            for: 1m
            annotations:
              summary: Ledger Writer Service is unavailable
              description: Check Ledger Writer pods and it's logs
            labels:
              severity: 'critical'
          - alert: TransactionhistoryUnavaiable
            expr: probe_success{app="bank-of-anthos",job="transactionhistory"} == 0
            for: 1m
            annotations:
              summary: Transaction History Service is unavailable
              description: Check Transaction History pods and it's logs
            labels:
              severity: 'critical'
          - alert: UserserviceUnavaiable
            expr: probe_success{app="bank-of-anthos",job="userservice"} == 0
            for: 1m
            annotations:
              summary: User Service is unavailable
              description: Check User Service pods and it's logs
            labels:
              severity: 'critical'
     
    

    This manifest describes a PrometheusRule and includes the following fields:

    • spec.groups[*].name: the name of the rule group.
    • spec.groups[*].interval: how often rules in the group are evaluated.
    • spec.groups[*].rules[*].alert: the name of the alert.
    • spec.groups[*].rules[*].expr: the PromQL expression to evaluate.
    • spec.groups[*].rules[*].for: how long the expression must continuously return results before the alert changes from pending to firing.
    • spec.groups[*].rules[*].annotations: the annotations to add to each alert. This field is only valid for alerting rules.
    • spec.groups[*].rules[*].labels: the labels to add to or overwrite on each alert.
  4. Apply the manifest to your cluster:

      kubectl apply -f extras/prometheus/oss/rules.yaml
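
To see what the Probe resources in this section ask the blackbox exporter to do, the following Python sketch rebuilds the HTTP request that corresponds to one probe target. It assumes the blackbox exporter's usual `/probe?module=...&target=...` query interface and the `http` scheme. The function name is hypothetical, and Prometheus issues the equivalent request itself on every `interval`.

```python
from urllib.parse import urlencode

# Values copied from the Probe manifests in this section.
PROBER_URL = "tutorial-kube-prometheus-blackbox-exporter:19115"
PROBER_PATH = "/probe"
MODULE = "http_2xx"


def probe_scrape_url(target: str,
                     prober_url: str = PROBER_URL,
                     path: str = PROBER_PATH,
                     module: str = MODULE) -> str:
    """Return the URL scraped on the blackbox exporter for one static target."""
    # Keep ':' and '/' unescaped so the target stays readable in the output.
    query = urlencode({"module": module, "target": target}, safe=":/")
    return f"http://{prober_url}{path}?{query}"


for target in ["frontend:80", "contacts:8080/ready"]:
    print(probe_scrape_url(target))
```

The exporter's response includes the `probe_success` metric, which is 1 when the probe succeeded and 0 otherwise. The alert rules in this section compare that metric to 0.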
    

Simulate an outage

  1. Simulate an outage by scaling the contacts Deployment to zero:

      kubectl scale deployment contacts --replicas 0
    

    You should see a notification message in your Slack workspace channel. GKE might take up to 5 minutes to scale the Deployment.

  2. Restore the contacts Deployment:

      kubectl scale deployment contacts --replicas 1
    

    You should see an alert resolution notification message in your Slack workspace channel. GKE might take up to 5 minutes to scale the Deployment.
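
The pending-then-firing behavior behind these notifications can be sketched in a few lines of Python. The sketch is a simplified model, not Prometheus code: it assumes one `probe_success` sample per rule evaluation, and the function name and the sample timeline are made up for illustration.

```python
from typing import Iterable, List, Optional, Tuple


def alert_states(samples: Iterable[Tuple[int, int]], hold_seconds: int) -> List[str]:
    """Return the alert state after each evaluation.

    samples: (timestamp_seconds, probe_success) pairs in time order.
    hold_seconds: the rule's `for` duration, in seconds.
    """
    states: List[str] = []
    active_since: Optional[int] = None  # when `probe_success == 0` first matched
    for timestamp, probe_success in samples:
        if probe_success == 0:
            if active_since is None:
                active_since = timestamp
            held = timestamp - active_since
            states.append("firing" if held >= hold_seconds else "pending")
        else:
            active_since = None  # the condition cleared, so the alert resets
            states.append("inactive")
    return states


# Evaluations every 60 seconds: healthy, two failed probes, then recovery.
timeline = [(0, 1), (60, 0), (120, 0), (180, 1)]
print(alert_states(timeline, hold_seconds=60))
```

With the tutorial's settings (a 60-second probe interval, a 60-second evaluation interval, and `for: 1m`), the alert in this model only fires on the second consecutive failed evaluation, so the Slack message can lag the outage by a few minutes in addition to the time GKE needs to scale the Deployment.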

Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this tutorial, either delete the project that contains the resources, or keep the project and delete the individual resources.

Delete the project

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete individual resources

  1. Delete the Kubernetes resources:

      kubectl delete -f kubernetes-manifests
    
  2. Uninstall Prometheus:

      helm uninstall tutorial
    
  3. Delete the GKE cluster:

      gcloud container clusters delete CLUSTER_NAME --quiet
    
