Symptom
When starting up, the telemetry pods cycle in and out of the CrashLoopBackOff state. This can cause periodic gaps in your metrics or graphs while the pods restart. You may also see discrepancies in analytics data because some sections of data are missing.
Error messages
When you use kubectl to view the pod states, you will see one or more metrics pods in the CrashLoopBackOff state. Run the following command:
kubectl get pods -n APIGEE_NAMESPACE
Where APIGEE_NAMESPACE is the Kubernetes namespace for your Apigee hybrid components. For more information, see Create the apigee namespace.
Sample Output
NAME                                                        READY   STATUS             RESTARTS   AGE
apigee-metrics-default-telemetry-proxy-1104-hvwoo-zlmlw     0/1     CrashLoopBackOff   10         10m
apigee-metrics-adapter-apigee-telemetry-1104-7fyff-tts65    0/1     CrashLoopBackOff   10         10m
apigee-metrics-default-telemetry-proxy-1104-hvwoo-zlmlw     0/1     FailedScheduling   0          12m
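When the namespace holds many pods, it can help to filter the listing down to the affected ones. The following is a minimal sketch, not part of the Apigee tooling: crashlooping_pods is a hypothetical helper name, and it assumes the default kubectl get pods column layout, in which STATUS is the third column.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the names of pods whose STATUS column (third field) is CrashLoopBackOff.
crashlooping_pods() {
  # NR > 1 skips the header row.
  awk 'NR > 1 && $3 == "CrashLoopBackOff" { print $1 }'
}

# Usage (APIGEE_NAMESPACE is a placeholder for your namespace):
#   kubectl get pods -n APIGEE_NAMESPACE | crashlooping_pods
```

The printed names can then be passed as POD_NAME to the describe command in the diagnosis steps.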
Common diagnosis steps
- Check the events for issues with telemetry pods with the following command:
kubectl -n apigee get event
Sample Output
LAST SEEN   TYPE     REASON             OBJECT                                                              MESSAGE
53m         Normal   SuccessfulCreate   job/apigee-cassandra-schema-val-jghunt-20250709-0820206-29251940   Created pod: apigee-cassandra-schema-val-jghunt-20250709-0820206-292519fkt7j
53m         Normal   Completed          job/apigee-cassandra-schema-val-jghunt-20250709-0820206-29251940   Job completed
43m         Normal   SuccessfulCreate   job/apigee-cassandra-schema-val-jghunt-20250709-0820206-29251950   Created pod: apigee-cassandra-schema-val-jghunt-20250709-0820206-292519l87m8
43m         Normal   Completed          job/apigee-cassandra-schema-val-jghunt-20250709-0820206-29251950   Job completed
33m         Normal   SuccessfulCreate   job/apigee-cassandra-schema-val-jghunt-20250709-0820206-29251960   Created pod: apigee-cassandra-schema-val-jghunt-20250709-0820206-29251962ncc
- You can also check the events of telemetry pods in the CrashLoopBackOff state using the following command:
kubectl -n apigee describe pod POD_NAME
Where POD_NAME is the name of the pod that is in the CrashLoopBackOff state.
Sample Output
apigee-metrics-apigee-telemetry-app-1101-qc36n-dxzrv
- You can also check the cpu status of the pods with the following command:
kubectl -n apigee get hpa | grep unknown
Sample Output
apigee-metrics-apigee-telemetry-app-1101-qc36n-dxzrv   ReplicaSet/apigee-metrics-apigee-telemetry-app-1101-qc36n-dxzrv   <unknown>/80%   2   10   2   8h
Possible causes
| Cause | Description | Troubleshooting instructions applicable for |
|---|---|---|
| metrics.app.resources.requests.cpu and metrics.app.resources.limits.cpu are missing | The cpu must be specified in the overrides.yaml file. | Apigee hybrid |
Cause
cpu is not specified in the overrides.yaml file, so cpu gets an undefined value.
Diagnosis
Check your overrides.yaml file to see if both cpu values are defined: metrics.app.resources.requests.cpu and metrics.app.resources.limits.cpu.
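One way to perform this check from the command line is sketched below. It is only an illustration: metrics_cpu_lines is a hypothetical helper name, and it assumes that metrics is a top-level key in the overrides file and that the file is indented with spaces. It prints every cpu line found anywhere under metrics, so confirm by line number that the two lines you expect sit under app, resources, requests and app, resources, limits.

```shell
# Hypothetical helper: prints the cpu lines found inside the top-level
# "metrics:" block of an overrides file, prefixed with their line numbers.
metrics_cpu_lines() {
  awk '
    # A line starting in column one (not a comment) is a top-level key;
    # remember whether that key is "metrics:".
    /^[^ #]/ { in_metrics = ($1 == "metrics:") }
    # Inside the metrics block, print any indented cpu setting.
    in_metrics && /^[ ]+cpu:/ { print NR ": " $0 }
  ' "$1"
}

# Usage:
#   metrics_cpu_lines overrides.yaml
```

If the helper prints nothing, or only one cpu line under metrics.app, the cpu values are missing and the resolution below applies.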
Resolution
If the cpu settings for metrics are missing from your overrides.yaml file, provide both cpu values in the file.
- Add the following configuration under the metrics section in your overrides.yaml file:
metrics:
  app: # The apigee-prometheus-app container in the "app" pod
    resources:
      requests:
        memory: 512Mi # Default value: 512Mi
        cpu: 500m # Default value: 500m
      limits:
        memory: 2Gi # Default value: 1Gi
        cpu: 500m # Default value: 500m
- Apply changes using the following command:
helm upgrade ENV_RELEASE_NAME apigee-env/ \
  --install \
  --namespace APIGEE_NAMESPACE \
  --set env=ENV_NAME \
  -f OVERRIDES_FILE
Where:
  - ENV_RELEASE_NAME is a unique name used to track installation and upgrade of the apigee-env chart. While it's typically the same as the ENV_NAME, it must be different if your environment has the same name as your environment group. For example, if both are named dev, you would use dev-env-release and dev-envgroup-release to distinguish them.
  - APIGEE_NAMESPACE is the Kubernetes namespace for your Apigee hybrid components. For more information, see Create the apigee namespace.
  - ENV_NAME is the name you used when you created the environment in the UI.
  - OVERRIDES_FILE is the overrides.yaml file that is used during upgrades or install.
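To see how the placeholders fit together, the sketch below prints the command with example values filled in so that it can be reviewed before it is run. The helper name print_env_upgrade_cmd is hypothetical, and the values (an environment named dev, the release name dev-env-release, the apigee namespace, and a file named overrides.yaml) are assumptions chosen for illustration; substitute your own.

```shell
# Hypothetical helper: prints the helm command for review instead of running it.
# Arguments: ENV_RELEASE_NAME ENV_NAME APIGEE_NAMESPACE OVERRIDES_FILE
print_env_upgrade_cmd() {
  printf 'helm upgrade %s apigee-env/ --install --namespace %s --set env=%s -f %s\n' \
    "$1" "$3" "$2" "$4"
}

# Example values only: environment "dev" tracked under release "dev-env-release".
print_env_upgrade_cmd dev-env-release dev apigee overrides.yaml
```

Printing first makes it easy to spot a release name that collides with the environment group release before anything is applied.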
Must gather diagnostic information
If the problem persists even after following the above instructions, gather the following diagnostic information and then contact Google Cloud Customer Care:
- The overrides.yaml file.
- The output from the Apigee hybrid must-gather script.

