Ensuring compatibility of webhook certificates before upgrading to v1.23


Starting from version 1.23, Kubernetes no longer supports server identity validation using the X.509 Common Name (CN) field in certificates. Instead, Kubernetes will only rely on information in the X.509 Subject Alternative Name (SAN) fields.

To prevent impact to your clusters, you must replace incompatible certificates without SANs for backends of webhooks and aggregated API servers before upgrading your clusters to Kubernetes version 1.23.

Why Kubernetes no longer supports backend certificates without SANs

GKE operates open-source Kubernetes, which uses the kube-apiserver component to contact your webhook and aggregated API server backends using Transport Layer Security (TLS). The kube-apiserver component is written in the Go programming language.

Before Go 1.15, TLS clients validated the identity of the servers they connected to using a two-step process:

  1. Check if the DNS name (or IP address) of the server is present as one of the SANs on the server's certificate.
  2. As a fallback, check if the DNS name (or IP address) of the server is equal to the CN on the server's certificate.

RFC 6125 fully deprecated server identity validation based on the CN field in 2011. Browsers and other security-critical applications no longer use the field.

To align with the wider TLS ecosystem, Go 1.15 removed Step 2 from its validation process, but left a debug switch (x509ignoreCN=0) that re-enables the old behavior to ease the migration process. Kubernetes version 1.19 was the first version built using Go 1.15. GKE clusters on versions 1.19 through 1.22 enabled the debug switch by default to give customers more time to replace the certificates for the affected webhook and aggregated API server backends.

Kubernetes version 1.23 is built with Go 1.17, which removes the debug switch. Once GKE upgrades your clusters to version 1.23, calls from your cluster's control plane to webhooks or aggregated API services will fail to connect if the backend does not provide a valid X.509 certificate with an appropriate SAN.

Identifying affected clusters

For clusters running patch version 1.21.9 or later, or 1.22.3 or later

For clusters on patch version 1.21.9 or later, or 1.22.3 or later, with Cloud Logging enabled, GKE provides a Cloud Audit Logs log to identify calls to affected backends from your cluster. You can use the following filter to search for the logs:

    logName =~ "projects/.*/logs/cloudaudit.googleapis.com%2Factivity"
    resource.type = "k8s_cluster"
    operation.producer = "k8s.io"
    "invalid-cert.webhook.gke.io"

If your clusters have not called backends with affected certificates, you won't see any logs. If you do see such an audit log, it will include the hostname of the affected backend.
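
The same filter can also be run from a terminal. The following is a minimal sketch, assuming the gcloud CLI is installed and authenticated; the function name find_invalid_cert_logs, the 30-day lookback window, and the JSON output format are illustrative choices, not part of the original guidance. The filter lines are joined with explicit AND operators so that the query fits in one string.

```shell
# Sketch: search Cloud Audit Logs for calls to backends whose certificates lack SANs.
# Usage (the project ID is a placeholder): find_invalid_cert_logs example-project
find_invalid_cert_logs() {
  local project_id="$1"
  # Same conditions as the filter above, joined with explicit AND.
  local filter='logName =~ "projects/.*/logs/cloudaudit.googleapis.com%2Factivity" AND resource.type = "k8s_cluster" AND operation.producer = "k8s.io" AND "invalid-cert.webhook.gke.io"'

  # --freshness widens the lookback window (assumed here to be 30 days).
  gcloud logging read "$filter" \
    --project="$project_id" \
    --freshness=30d \
    --format=json
}
```

An empty result means that no calls to affected backends were logged in the chosen window.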

The following is an example of the log entry for a webhook backend hosted by a Service named example-webhook in the default namespace:

    {
      ...
      resource {
        type: "k8s_cluster",
        "labels": {
          "location": "us-central1-c",
          "cluster_name": "example-cluster",
          "project_id": "example-project"
        }
      },
      labels: {
        invalid-cert.webhook.gke.io/example-webhook.default.svc: "No subjectAltNames returned from example-webhook.default.svc:8443",
        ...
      },
      logName: "projects/example-project/logs/cloudaudit.googleapis.com%2Factivity",
      operation: {
        ...
        producer: "k8s.io",
        ...
      },
      ...
    }

The hostnames of the affected services (for example, example-webhook.default.svc) are included as suffixes in the label names that start with invalid-cert.webhook.gke.io/. You can also get the name of the cluster that made the call from the resource.labels.cluster_name label, which has the value example-cluster in this example.
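
When many entries are returned, the affected hostnames can be pulled out of the label names mechanically. The following is a small sketch that assumes the log entries are available as text on standard input (for example, JSON exported from Cloud Logging); the function name extract_affected_hosts is hypothetical.

```shell
# Sketch: print each affected backend hostname once.
# Reads log entries on stdin and looks for label names that start with
# invalid-cert.webhook.gke.io/, keeping only the hostname suffix.
extract_affected_hosts() {
  grep -o 'invalid-cert\.webhook\.gke\.io/[^":[:space:]]*' \
    | sed 's|^invalid-cert\.webhook\.gke\.io/||' \
    | sort -u
}
```

The match stops at a quote, colon, or whitespace, so it works whether or not the label name is quoted in the exported text.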

Deprecation insights

You can learn which clusters use incompatible certificates from deprecation insights. Insights are available for clusters running version 1.22.6-gke.1000 or later.

Other cluster versions

If you have a cluster on a patch version earlier than 1.22.3 on the 1.22 minor version, or any patch version earlier than 1.21.9, you have two options for determining whether your cluster is affected by this deprecation:

Option 1 (recommended): Upgrade your cluster to a patch version that supports identifying affected certificates with logs. Make sure that Cloud Logging is enabled for your cluster. After your cluster has been upgraded, the identifying Cloud Audit Logs logs are produced each time the cluster attempts to call a Service that does not provide a certificate with an appropriate SAN. Because the logs are only produced on a call attempt, we recommend waiting for 30 days after an upgrade to allow enough time for all call paths to be invoked.

Using logs to identify impacted services is recommended because this approach minimizes manual effort by automatically producing logs to show the affected services.

Option 2: Inspect the certificates used by Webhooks or Aggregated API Servers in your clusters to determine whether they are affected because of not having SANs:

  1. Get the list of Webhooks and Aggregated API Servers in your cluster and identify their backends (Services or URLs).
  2. Inspect the certificates used by the backend services.

Given the manual effort required to inspect all certificates in this way, this method should only be followed if you need to assess the impact of the deprecations in Kubernetes version 1.23 before upgrading your cluster to version 1.21. If you can upgrade your cluster to 1.21, you should upgrade it first and then follow the instructions in Option 1 to avoid the manual effort.

Identifying backend services to inspect

To identify backends that might be affected by the deprecation, get the list of Webhooks and Aggregated API Services and their associated backends in the cluster.

To list all relevant webhooks in the cluster, use the following kubectl commands:

    kubectl get mutatingwebhookconfigurations -A    # mutating admission webhooks
    kubectl get validatingwebhookconfigurations -A  # validating admission webhooks

You can get the associated backend Service or URL for a given Webhook by examining the webhooks.clientConfig.service field or the webhooks.clientConfig.url field in the Webhook's configuration:

    kubectl get mutatingwebhookconfigurations example-webhook -o yaml

The output of this command is similar to the following:

    apiVersion: admissionregistration.k8s.io/v1
    kind: MutatingWebhookConfiguration
    webhooks:
    - admissionReviewVersions:
      clientConfig:
        service:
          name: example-service
          namespace: default
          port: 443

Note that clientConfig can specify its backend as a Kubernetes Service (clientConfig.service), or as a URL (clientConfig.url).
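
To avoid opening each configuration one at a time, the backends of all webhooks can be summarized in a single table. The following is a sketch using kubectl's custom-columns output; the function name list_webhook_backends and the column names are illustrative assumptions. Fields that a webhook does not set are typically shown as <none>.

```shell
# Sketch: list the Service or URL backend of every admission webhook.
# Each row is one webhook configuration; a configuration with several
# webhooks shows their values comma-separated in the same cell.
list_webhook_backends() {
  local kind
  for kind in mutatingwebhookconfigurations validatingwebhookconfigurations; do
    kubectl get "$kind" \
      -o custom-columns='NAME:.metadata.name,SERVICE:.webhooks[*].clientConfig.service.name,NAMESPACE:.webhooks[*].clientConfig.service.namespace,URL:.webhooks[*].clientConfig.url'
  done
}
```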

To list all relevant Aggregated API Services in the cluster, use the following kubectl command:

    kubectl get apiservices -A | grep -v Local  # aggregated API services

The output of this command is similar to the following:

    NAME                     SERVICE                      AVAILABLE   AGE
    v1beta1.metrics.k8s.io   kube-system/metrics-server   True        237d

This example returns the metrics-server Service from the kube-system namespace.

You can get the associated Service for a given Aggregated API by examining the spec.service field:

    kubectl get apiservices v1beta1.metrics.k8s.io -o yaml

The output of this command is similar to the following:

    ...
    apiVersion: apiregistration.k8s.io/v1
    kind: APIService
    spec:
      service:
        name: metrics-server
        namespace: kube-system
        port: 443

Inspecting the certificate of a Service

Once you have identified relevant backend Services to inspect, you can inspect the certificate of each specific Service, such as example-service:

  1. Find the selector and target port of the service:

        kubectl describe service example-service

    The output of this command is similar to the following:

        Name:        example-service
        Namespace:   default
        Labels:      run=nginx
        Selector:    run=nginx
        Type:        ClusterIP
        IP:          172.21.xxx.xxx
        Port:        443
        TargetPort:  444

    In this example, example-service has the selector run=nginx and the target port 444.

  2. Find a pod matching the selector:

        kubectl get pods --selector=run=nginx

    The output of the command is similar to the following:

        NAME          READY   STATUS    RESTARTS   AGE
        example-pod   1/1     Running   0          21m
    
  3. Set up a port forward from your local machine to the pod:

        kubectl port-forward pods/example-pod LOCALHOST_PORT:TARGET_PORT &  # port forwarding in background

    Replace the following in the command:

    • LOCALHOST_PORT: the local port to listen on.
    • TARGET_PORT: the TargetPort from Step 1.
  4. Use openssl to print the certificate used by the Service:

        openssl s_client -connect localhost:LOCALHOST_PORT </dev/null | openssl x509 -noout -text

    This example output shows a valid certificate (with SAN entries):

        Subject: CN = example-service.default.svc
        X509v3 extensions:
          X509v3 Subject Alternative Name:
            DNS:example-service.default.svc

    This example output shows a certificate with a missing SAN:

        Subject: CN = example-service.default.svc
        X509v3 extensions:
          X509v3 Key Usage: critical
            Digital Signature, Key Encipherment
          X509v3 Extended Key Usage:
            TLS Web Server Authentication
          X509v3 Authority Key Identifier:
            keyid:1A:5F:29:D8:E9:3C:54:3C:35:CC:D8:AB:D1:21:FD:C3:56:25:C0:74
    
  5. Stop the port forward that is running in the background with the following commands:

        $ jobs
        [1]+  Running                 kubectl port-forward pods/example-pod 8888:444 &
        $ kill %1
        [1]+  Terminated              kubectl port-forward pods/example-pod 8888:444
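
When inspecting several Services, reading each openssl text dump by eye becomes tedious. The following is a minimal sketch of an automated check, assuming the text produced in Step 4 is piped to it; the function name has_san is hypothetical. It only looks for a Subject Alternative Name extension followed by a DNS or IP Address entry, and does not verify that the entry matches the Service's hostname.

```shell
# Sketch: succeed (exit 0) if the certificate text on stdin has a
# Subject Alternative Name extension with at least one DNS or IP entry.
has_san() {
  awk '
    /X509v3 Subject Alternative Name/ {
      # The SAN values are printed on the line after the extension name.
      if ((getline line) > 0 && line ~ /(DNS|IP Address):/) found = 1
    }
    END { exit !found }
  '
}

# Example use with the port forward from Step 3 (8888 is a placeholder port):
#   openssl s_client -connect localhost:8888 </dev/null 2>/dev/null \
#     | openssl x509 -noout -text | has_san && echo "SAN present" || echo "no SAN"
```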
    

Inspecting the certificate of a URL backend

If the webhook uses a url backend, directly connect to the hostname specified in the URL. For example, if the URL is https://example.com:123/foo/bar, use the following openssl command to print the certificate used by the backend:

    openssl s_client -connect example.com:123 </dev/null | openssl x509 -noout -text
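
The host and port passed to -connect can be derived from the webhook URL rather than typed by hand. The following is a sketch under a few assumptions: the URL uses the https scheme, port 443 is used when the URL names no port, and IPv6 literal hosts are not handled. The function names url_to_hostport and print_backend_cert are hypothetical.

```shell
# Sketch: turn an https URL into the HOST:PORT form that openssl s_client expects.
url_to_hostport() {
  local rest="${1#https://}"    # strip the scheme
  local hostport="${rest%%/*}"  # drop the path, if any
  case "$hostport" in
    *:*) printf '%s\n' "$hostport" ;;
    *)   printf '%s\n' "$hostport:443" ;;  # assumed default port for https
  esac
}

# Print the certificate served by a URL backend.
print_backend_cert() {
  local hostport
  hostport="$(url_to_hostport "$1")"
  openssl s_client -connect "$hostport" </dev/null 2>/dev/null \
    | openssl x509 -noout -text
}
```

For the URL in the example above, url_to_hostport prints example.com:123.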

Mitigating the risk of 1.23 upgrade

Once you have identified affected clusters and their backend services using certificates without SANs, you must update the webhooks and aggregated API server backends to use certificates with appropriate SANs prior to upgrading the clusters to version 1.23.
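
How a replacement certificate is issued depends on the tooling that manages each backend, so the following is only an illustrative sketch. It creates a self-signed certificate that carries the Service hostname both as the CN and as a DNS SAN, assuming OpenSSL 1.1.1 or later (for the -addext option). The function name make_san_cert, the file names tls.key and tls.crt, the key size, and the validity period are assumptions; in practice the certificate must chain to the CA that the webhook or APIService configuration trusts.

```shell
# Sketch: write a self-signed key and certificate for HOST into DIR,
# with HOST present as a DNS entry in the Subject Alternative Name.
make_san_cert() {
  local host="$1" dir="$2"
  openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
    -subj "/CN=$host" \
    -addext "subjectAltName=DNS:$host" \
    -keyout "$dir/tls.key" -out "$dir/tls.crt" 2>/dev/null
}

# Example (the hostname and directory are placeholders):
#   make_san_cert example-service.default.svc ./certs
#   openssl x509 -in ./certs/tls.crt -noout -text   # should now list the DNS SAN
```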

GKE will not automatically upgrade clusters on versions 1.22.6-gke.1000 or later with backends using incompatible certificates until you replace the certificates or until version 1.22 reaches end of standard support.

If your cluster is on a GKE version earlier than 1.22.6-gke.1000, you can temporarily prevent automatic upgrades by configuring a maintenance exclusion to prevent minor upgrades.
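
A maintenance exclusion of this kind can be created with the gcloud CLI. The following is a sketch, not a definitive procedure: the function name block_minor_upgrades and the exclusion name hold-before-1-23 are hypothetical, and the start and end timestamps and the cluster location are values you supply.

```shell
# Sketch: add a maintenance exclusion that blocks minor version upgrades.
# Arguments: CLUSTER START END, followed by any extra gcloud flags
# (for example, the flag that selects the cluster's zone or region).
block_minor_upgrades() {
  local cluster="$1" start="$2" end="$3"
  shift 3
  gcloud container clusters update "$cluster" \
    --add-maintenance-exclusion-name=hold-before-1-23 \
    --add-maintenance-exclusion-start="$start" \
    --add-maintenance-exclusion-end="$end" \
    --add-maintenance-exclusion-scope=no_minor_upgrades \
    "$@"
}

# Example (all values are placeholders):
#   block_minor_upgrades example-cluster START_TIME END_TIME --zone=us-central1-c
```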

Resources

See the following resources for additional information on this change:

  • Kubernetes 1.23 release notes
    • Kubernetes is built using Go 1.17. This version of Go removes the ability to use a GODEBUG=x509ignoreCN=0 environment setting to re-enable deprecated legacy behavior of treating the CN of X.509 serving certificates as a host name.
  • Kubernetes 1.19 and Kubernetes 1.20 release notes
    • The deprecated, legacy behavior of treating the CN field on X.509 serving certificates as a host name when no SANs are present is now disabled by default.
  • Go 1.17 release notes
    • The temporary GODEBUG=x509ignoreCN=0 flag has been removed.
  • Go 1.15 release notes
    • The deprecated, legacy behavior of treating the CN field on X.509 certificates as a host when no SANs are present is now disabled by default.
  • RFC 6125 (page 46)
    • Although the use of the CN value is existing practice, it is deprecated, and Certificate Authorities are encouraged to provide subjectAltName values instead.
  • Admission webhooks