Locality Weighted Load Balancing

This example demonstrates the locality weighted load balancing feature in Envoy proxy. The demo simulates a scenario that a backend service resides in two local zones and one remote zone.

The components used in this demo are as follows:

  • A client container: runs Envoy proxy

  • Backend container in the same locality as the client, with priority set to 0, referred to as local-1.

  • Backend container in the same locality as the client, with priority set to 1, referred to as local-2.

  • Backend container in the the remote locality, with priority set to 1, referred to as remote-1.

  • Backend container in the the remote locality, with priority set to 2, referred to as remote-2.

The client Envoy proxy configures the 4 backend containers in the same Envoy cluster, so that Envoy handles load balancing to those backend servers. From here we can see, we have localities with 3 different priorities:

  • priority 0: local-1

  • priority 1: local-2 and remote-1

  • priority 2: remote-2

In Envoy, when the healthiness of a given locality drops below a threshold (71% by default), the next priority locality will start to share the request loads. The demo below will show this behavior.

Step 1: Start all of our containers

In terminal, move to the examples/locality_load_balancing directory.

To build this sandbox example and start the example services, run the following commands:

# Start demo
$ docker-compose up --build -d

The locality configuration is set in the client container via static Envoy configuration file. Please refer to the cluster section of the proxy configuration file.

Step 2: Scenario with one replica in the highest priority locality

In this scenario, each locality has 1 healthy replica running and all the requests should be sent to the locality with the highest priority (i.e. lowest integer set for priority - 0), which is local-1.

# all requests to local-1
$ docker-compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
Hello from backend-local-1!: 100, 100.0%
Failed: 0

If locality local-1 becomes unhealthy (i.e. fails the Envoy health check), the requests should be load balanced among the subsequent priority localities, which are local-2 and remote-1. They both have priority 1. We then send 100 requests to the backend cluster, and check the responders.

# bring down local-1
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_1:8000/unhealthy
[backend-local-1] Set to unhealthy

# local-2 and remote-1 localities split the traffic 50:50
$ docker-compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
Hello from backend-remote-1!: 51, 51.0%
Hello from backend-local-2!: 49, 49.0%
Failed: 0

Now if local-2 becomes unhealthy also, priority 1 locality is only 50% healthy. Thus priority 2 locality starts to share the request load. Requests will be sent to both remote-1 and remote-2.

# bring down local-2
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-2_1:8000/unhealthy

# remote-1 locality receive 100% of the traffic
$ docker-compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
Hello from backend-remote-1!: actual weight 69.0%
Hello from backend-remote-2!: actual weight 31.0%
Failed: 0

Step 3: Recover servers

Before moving on, we need to server local-1 and local-2 first.

# recover local-1 and local-2 after the demo
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_1:8000/healthy
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-2_1:8000/healthy

Step 4: Scenario with multiple replicas in the highest priority locality

To demonstrate how locality based load balancing works in multiple replicas setup, let’s now scale up the local-1 locality to 5 replicas.

$ docker-compose up --scale backend-local-1=5 -d

We are going to show the scenario that local-1 is just partially healthy. So let’s bring down 4 of the replicas in local-1.

# bring down local-1 replicas
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_2:8000/unhealthy
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_3:8000/unhealthy
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_4:8000/unhealthy
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_5:8000/unhealthy

Then we check the endpoints again:

# check healthiness
$ docker-compose exec -T client-envoy curl -s localhost:8001/clusters | grep health_flags

backend::172.28.0.4:8000::health_flags::/failed_active_hc
backend::172.28.0.2:8000::health_flags::/failed_active_hc
backend::172.28.0.5:8000::health_flags::/failed_active_hc
backend::172.28.0.6:8000::health_flags::/failed_active_hc
backend::172.28.0.7:8000::health_flags::healthy
backend::172.28.0.8:8000::health_flags::healthy
backend::172.28.0.3:8000::health_flags::healthy

We can confirm that 4 backend endpoints become unhealthy.

Now we send the 100 requests again.

# watch traffic change
$ docker-compose exec -T client-envoy python3 client.py http://localhost:3000/ 100

Hello from backend-remote-1!: actual weight 37.0%
Hello from backend-local-2!: actual weight 36.0%
Hello from backend-local-1!: actual weight 27.0%
Failed: 0

As local-1 does not have enough healthy workloads, requests are partially shared by secondary localities.

If we bring down all the servers in priority 1 locality, it will make priority 1 locality 0% healthy. The traffic should split between priority 0 and priority 2 localities.

$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-local-2_1:8000/unhealthy
$ docker-compose exec -T client-envoy curl -s locality-load-balancing_backend-remote-1_1:8000/unhealthy
$ docker-compose exec -T client-envoy python3 client.py http://localhost:3000/ 100

Hello from backend-remote-2!: actual weight 77.0%
Hello from backend-local-1!: actual weight 23.0%
Failed: 0