
Set up Multi-Cluster Mesh Failover

This page shows you how to design and implement a high-availability traffic routing strategy using Cloud Service Mesh in a multi-cluster environment. The following table describes the expected behavior:

Cluster State	Traffic Behavior
Both clusters healthy	50% of traffic to Cluster A, 50% to Cluster B
Cluster A becomes unavailable	100% of traffic to Cluster B
Cluster A recovers	50/50 split is restored automatically

Prerequisites

As a starting point, this guide assumes that you have already:

Created two GKE clusters in two different regions, registered to the same fleet host project, and configured for Cloud Service Mesh.
Set up a multi-cluster mesh on Cloud Service Mesh.
Installed and configured the Istio control plane in both clusters.
Deployed and exposed istio-ingressgateway in at least one cluster (Cluster A).
Deployed the hello-world application in both clusters with sidecar injection enabled.

This guide uses the following regions:

Cluster A: europe-west1
Cluster B: us-central1

Set up multi-cluster mesh failover

Deploy and apply the public ingress gateway using the sample manifest from the Cloud Service Mesh repository:

cat <<EOF > istio-ingressgateway.yaml
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: public-gateway
  namespace: default
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 80
      name: http
      protocol: HTTP
    hosts:
    - '*'
EOF

kubectl apply -f istio-ingressgateway.yaml

This Gateway accepts HTTP traffic on port 80 for all hosts and is the external entry point for the hello-world service.

Create and apply a VirtualService in Cluster A to route traffic to the hello-world service:

cat <<EOF > virtual-service.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: hello-world
  namespace: default
spec:
  hosts:
  - '*'
  gateways:
  - public-gateway
  http:
  - route:
    - destination:
        host: hello-world.default.svc.cluster.local
EOF

kubectl apply -f virtual-service.yaml

This configuration forwards HTTP requests from the gateway to the service.

Configure and apply a DestinationRule for locality-based failover:

cat <<EOF > destination-rule.yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: hello-world
  namespace: default
spec:
  host: hello-world.default.svc.cluster.local
  trafficPolicy:
    connectionPool:
      http:
        http2MaxRequests: 100
    outlierDetection:
      consecutive5xxErrors: 1
      interval: 1s
      baseEjectionTime: 30s
      maxEjectionPercent: 100
    loadBalancer:
      localityLbSetting:
        enabled: true
        distribute:
        - from: europe-west1
          to:
            europe-west1: 50
            us-central1: 50
        - from: us-central1
          to:
            us-central1: 50
            europe-west1: 50
EOF

kubectl apply -f destination-rule.yaml

Note the following:

The localityLbSetting block in the DestinationRule enables the even traffic split and automatic failover.
maxEjectionPercent: 100 allows Istio to fail over all traffic if every endpoint in a locality is unhealthy.
distribute ensures an even 50/50 split between the clusters, based on the source cluster's region.
Failover is handled implicitly: when one locality becomes unavailable, Istio routes 100% of traffic to the healthy region.
outlierDetection ejects an endpoint after a single 5xx error (consecutive5xxErrors: 1), evaluates endpoints every second (interval: 1s), and keeps an ejected endpoint out for at least 30 seconds (baseEjectionTime: 30s).
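
Istio validation expects the weights in each "to" map of a distribute entry to add up to 100, so a typo in one weight causes the rule to be rejected. The following is a minimal sketch of a pre-apply sanity check. The helper name weights_sum_to_100 is hypothetical and not part of any Istio or Cloud Service Mesh tooling; it only adds up the integers you pass to it.

```shell
# Hypothetical helper: succeeds only if the given integer weights add up to 100.
weights_sum_to_100() {
  total=0
  for w in "$@"; do
    total=$((total + w))
  done
  [ "$total" -eq 100 ]
}

# Example usage with the weights from the DestinationRule above:
#   weights_sum_to_100 50 50 && kubectl apply -f destination-rule.yaml
```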

Validate

You can now validate this behavior by:

Sending requests through the Ingress Gateway in Cluster A.
Scaling down hello-world pods in europe-west1 to 0.
Observing traffic failover to us-central1.
Scaling pods back up in europe-west1 and verifying traffic split resumes.
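
The validation steps above can be sketched as a few shell helpers. This is an illustration under stated assumptions, not a definitive procedure: the kubectl context name (cluster-a), the gateway namespace (istio-system), the Deployment name (hello-world), and the request path (/) are all assumptions, so override them to match your environment. The tally helper groups identical response bodies, which is only useful if your application's response identifies the pod or cluster that served it.

```shell
# Validation sketch. Every default below is an assumption; override via environment.
CTX_A="${CTX_A:-cluster-a}"                      # hypothetical kubectl context for Cluster A
GATEWAY_NS="${GATEWAY_NS:-istio-system}"         # assumed namespace of the istio-ingressgateway Service
APP_NS="${APP_NS:-default}"                      # namespace used by the manifests in this guide
APP_DEPLOYMENT="${APP_DEPLOYMENT:-hello-world}"  # assumed Deployment name

# Print the external IP of the ingress gateway Service in Cluster A.
gateway_ip() {
  kubectl --context "$CTX_A" -n "$GATEWAY_NS" get service istio-ingressgateway \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Send COUNT requests to http://IP/ and print one response body per line.
send_requests() {
  ip="$1"
  count="$2"
  i=0
  while [ "$i" -lt "$count" ]; do
    curl -s --max-time 5 "http://${ip}/" | tr -d '\n'
    echo
    i=$((i + 1))
  done
}

# Read response lines on stdin; print "<line> <count> <percent>%" per distinct line.
tally() {
  awk 'NF { count[$0]++; total++ }
       END { for (k in count) printf "%s %d %.0f%%\n", k, count[k], 100 * count[k] / total }' | sort
}

# Scale the application Deployment in Cluster A to the given replica count.
scale_cluster_a() {
  kubectl --context "$CTX_A" -n "$APP_NS" scale deployment "$APP_DEPLOYMENT" --replicas="$1"
}

# Example usage (requires access to the clusters):
#   IP="$(gateway_ip)"
#   send_requests "$IP" 100 | tally   # both clusters healthy
#   scale_cluster_a 0                 # simulate Cluster A becoming unavailable
#   send_requests "$IP" 100 | tally   # responses should now come from us-central1 only
#   scale_cluster_a 1                 # restore your original replica count
```

Because outlierDetection keeps an ejected endpoint out for at least 30 seconds, allow some time after scaling back up before expecting the split to resume.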