Locality Weighted Load Balancing
This example demonstrates the locality weighted load balancing feature in Envoy proxy. The demo simulates a scenario in which a backend service resides in two local zones and one remote zone.
The components used in this demo are as follows:

- A client container: runs Envoy proxy.
- A backend container in the same locality as the client, with priority set to 0, referred to as local-1.
- A backend container in the same locality as the client, with priority set to 1, referred to as local-2.
- A backend container in the remote locality, with priority set to 1, referred to as remote-1.
- A backend container in the remote locality, with priority set to 2, referred to as remote-2.
The client Envoy proxy configures the 4 backend containers in the same Envoy cluster, so that Envoy handles load balancing to those backend servers. This gives us localities with 3 different priorities:

- priority 0: local-1
- priority 1: local-2 and remote-1
- priority 2: remote-2
In Envoy, when the healthiness of a given priority level drops below a threshold (71% by default), the next priority level will start to share the request load. The demo below shows this behavior.
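The 71% figure follows from Envoy's default overprovisioning factor of 1.4: a priority level is offered up to its health percentage multiplied by 1.4 (capped at 100%), so it stops absorbing all of the traffic once its health drops below 1 / 1.4 ≈ 71%, and the excess spills over to the next priority. The traffic splits observed in the steps below can be derived from this rule.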
Step 1: Start all of our containers
In terminal, move to the examples/locality_load_balancing
directory.
To build this sandbox example and start the example services, run the following commands:
# Start demo
$ docker compose up --build -d
The locality configuration is set in the client container via a static Envoy configuration file. Please refer to the cluster section of the proxy configuration file.
Note

The locality_weighted_lb_config must be set in common_lb_config for the load_balancing_weight to be used.
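For orientation, a minimal sketch of what such a cluster definition might look like is shown below. The field names are Envoy's, but the zone names, endpoint addresses, and weights are illustrative; refer to the example's actual configuration file for the exact values.

clusters:
- name: backend
  # locality_weighted_lb_config is required so that the per-locality
  # load_balancing_weight values below are honored.
  common_lb_config:
    locality_weighted_lb_config: {}
  load_assignment:
    cluster_name: backend
    endpoints:
    # Highest priority locality, served by local-1.
    - locality:
        zone: local-zone
      priority: 0
      load_balancing_weight: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: backend-local-1
              port_value: 8080
    # A lower priority locality, served by remote-1; local-2 and remote-2
    # follow the same pattern with their own priorities and weights.
    - locality:
        zone: remote-zone
      priority: 1
      load_balancing_weight: 1
      lb_endpoints:
      - endpoint:
          address:
            socket_address:
              address: backend-remote-1
              port_value: 8080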
Step 2: Scenario with one replica in the highest priority locality
In this scenario, each locality has 1 healthy replica running, and all of the requests should be sent to the locality with the highest priority (i.e. the lowest priority value, 0), which is local-1.
# all requests to local-1
$ docker compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
Hello from backend-local-1!: 100, 100.0%
Failed: 0
If locality local-1 becomes unhealthy (i.e. fails the Envoy health check), the requests should be load balanced among the subsequent priority localities, which are local-2 and remote-1. They both have priority 1. We then send 100 requests to the backend cluster and check the responders.
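The /unhealthy and /healthy endpoints used below simply make a backend start failing or passing Envoy's active health check, which is what triggers the failover. A minimal sketch of how an HTTP health checker might be attached to the backend cluster is shown below; the path and timing values are illustrative rather than the example's exact settings.

# Inside the backend cluster definition (illustrative values).
health_checks:
- timeout: 1s
  interval: 2s
  unhealthy_threshold: 1
  healthy_threshold: 1
  http_health_check:
    path: /healthcheck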
# bring down local-1
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_1:8080/unhealthy
[backend-local-1] Set to unhealthy
# local-2 and remote-1 localities split the traffic 50:50
$ docker compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
Hello from backend-remote-1!: 51, 51.0%
Hello from backend-local-2!: 49, 49.0%
Failed: 0
Now if local-2 also becomes unhealthy, the priority 1 locality is only 50% healthy and can absorb at most 50% × 1.4 = 70% of the traffic. Thus the priority 2 locality starts to share the request load, receiving the remaining ~30%. Requests will be sent to both remote-1 and remote-2.
# bring down local-2
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-2_1:8080/unhealthy
# remote-1 and remote-2 localities share the traffic roughly 70:30
$ docker compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
Hello from backend-remote-1!: actual weight 69.0%
Hello from backend-remote-2!: actual weight 31.0%
Failed: 0
Step 3: Recover servers
Before moving on, we need to recover local-1 and local-2 first.
# recover local-1 and local-2 after the demo
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_1:8080/healthy
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-2_1:8080/healthy
Step 4: Scenario with multiple replicas in the highest priority locality
To demonstrate how locality based load balancing works in a multi-replica setup, let's now scale up the local-1 locality to 5 replicas.
$ docker compose up --scale backend-local-1=5 -d
We are going to show a scenario in which local-1 is only partially healthy, so let's bring down 4 of the replicas in local-1.
# bring down local-1 replicas
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_2:8080/unhealthy
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_3:8080/unhealthy
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_4:8080/unhealthy
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-1_5:8080/unhealthy
Then we check the endpoints again:
# check healthiness
$ docker compose exec -T client-envoy curl -s localhost:8001/clusters | grep health_flags
backend::172.28.0.4:8080::health_flags::/failed_active_hc
backend::172.28.0.2:8080::health_flags::/failed_active_hc
backend::172.28.0.5:8080::health_flags::/failed_active_hc
backend::172.28.0.6:8080::health_flags::/failed_active_hc
backend::172.28.0.7:8080::health_flags::healthy
backend::172.28.0.8:8080::health_flags::healthy
backend::172.28.0.3:8080::health_flags::healthy
We can confirm that 4 backend endpoints have become unhealthy.
Now we send 100 requests again.
# watch traffic change
$ docker compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
Hello from backend-remote-1!: actual weight 37.0%
Hello from backend-local-2!: actual weight 36.0%
Hello from backend-local-1!: actual weight 27.0%
Failed: 0
As local-1 does not have enough healthy endpoints (only 1 of 5 is healthy, i.e. 20%), it can absorb at most 20% × 1.4 = 28% of the traffic, and the rest is shared by the lower priority localities.
If we bring down all of the servers in the priority 1 localities as well, priority 1 becomes 0% healthy. The traffic should then be split between the priority 0 and priority 2 localities.
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-local-2_1:8080/unhealthy
$ docker compose exec -T client-envoy curl -s locality-load-balancing_backend-remote-1_1:8080/unhealthy
$ docker compose exec -T client-envoy python3 client.py http://localhost:3000/ 100
Hello from backend-remote-2!: actual weight 77.0%
Hello from backend-local-1!: actual weight 23.0%
Failed: 0