Troubleshoot etcd for GKE on AWS

This page shows you how to resolve issues with etcd for GKE on AWS.

If you need additional assistance, reach out to Cloud Customer Care.

etcd data disk is full

When the etcd data disk is full, you might observe the problem in any of the following ways:

  • The etcd logs might show timeout errors for write requests:

      rafthttp: failed to save KV snapshot (write /var/etcd/data/member/snap/tmp720030520: no space left on device)

    You might also see timeout errors for connections to peers:

      rafthttp: health check for peer [peer-id] could not connect: dial tcp [peer-ip]:2380: i/o timeout

  • The etcd server doesn't start, and the serial port logs indicate a lack of space:

     failed on file /dev/stdout (No space left on device) 
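To confirm from the node itself that the data disk is full, a quick check like the following can help. This is a minimal sketch, not an official procedure: the function name `disk_usage_pct` is hypothetical, and the path /var/etcd/data is taken from the log message shown earlier and might differ on your nodes.

```shell
# Print the used-space percentage (without the % sign) of the file system
# that holds the given path. `df -P` is the POSIX output format, which keeps
# each file system on a single line so the fifth column is the capacity.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, `disk_usage_pct /var/etcd/data` prints a number between 0 and 100. A value at or near 100 is consistent with the errors described above.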
    

To determine the size of your etcd instance, use one of the following methods:

SSH

  1. Connect to one of the master nodes using SSH and run the following command:

      ETCDCTL_API=3 etcdctl --write-out=table endpoint status

    The DB SIZE column indicates the size used by each endpoint, as shown in the following condensed example output:

      +------------------+------------------+---------+---------+
      |     ENDPOINT     |        ID        | VERSION | DB SIZE |
      +------------------+------------------+---------+---------+
      | 10.240.0.17:2379 | 4917a7ab173fabe7 |  3.5.0  |   45 kB |
      | 10.240.0.18:2379 | 59796ba9cd1bcd72 |  3.5.0  |   45 kB |
      | 10.240.0.19:2379 | 94df724b66343e6c |  3.5.0  |   45 kB |
      +------------------+------------------+---------+---------+
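If you want only the endpoint and size values, for example to log them periodically, the table output can be filtered. The following is a rough sketch: the function name `db_sizes` is hypothetical, and it assumes the table layout shown in the example output, locating the ENDPOINT and DB SIZE columns by their header names.

```shell
# Read `etcdctl --write-out=table endpoint status` output on stdin and print
# one "endpoint<TAB>db size" line per member.
db_sizes() {
  awk -F'|' '
    # Strip leading and trailing blanks from a table cell.
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    # Only rows that start with "|" carry data; "+---+" borders are skipped.
    /^[ \t]*\|/ {
      if (!col) {
        # Header row: find the ENDPOINT and DB SIZE columns by name.
        for (i = 1; i <= NF; i++) {
          name = trim($i)
          if (name == "ENDPOINT") ep = i
          if (name == "DB SIZE") col = i
        }
        next
      }
      print trim($ep) "\t" trim($col)
    }
  '
}
```

One way to use it is `ETCDCTL_API=3 etcdctl --write-out=table endpoint status | db_sizes`.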
     
    

Console

  1. In the console, go to the Cloud Monitoring page.

  2. Select Metrics explorer.

  3. Select the etcd_mvcc_db_total_size_in_bytes metric.

To resolve this issue, resize the data disk for etcd using the appropriate procedure for your storage provider and operating system. Add enough additional space to account for future etcd growth.

  1. After the disk is resized, check if there's still a warning on disk space:

      ETCDCTL_API=3 etcdctl alarm list

  2. If the last column reports NOSPACE, disarm the alarm as follows:

      ETCDCTL_API=3 etcdctl alarm disarm
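The two steps above can be combined so that the alarm is disarmed only when it is actually raised. This is a minimal sketch under stated assumptions: the function names `has_nospace_alarm` and `disarm_if_nospace` are hypothetical, and etcdctl is assumed to be on the PATH and already configured to reach the cluster.

```shell
# Succeed (exit status 0) if the `etcdctl alarm list` output on stdin
# mentions a NOSPACE alarm.
has_nospace_alarm() {
  grep -q 'NOSPACE'
}

# List the alarms and disarm them only when NOSPACE is reported.
# Assumption: etcdctl needs no extra endpoint or certificate flags here.
disarm_if_nospace() {
  if ETCDCTL_API=3 etcdctl alarm list | has_nospace_alarm; then
    ETCDCTL_API=3 etcdctl alarm disarm
  else
    echo "No NOSPACE alarm reported"
  fi
}
```

Run `disarm_if_nospace` only after the disk has been resized. If the disk is still full, the alarm is likely to be raised again.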
    

What's next

If you need additional assistance, reach out to Cloud Customer Care.