Overview
The Envoy v2 APIs are defined as proto3 Protocol Buffers in the api tree. They support:
- Streaming delivery of xDS API updates via gRPC. This reduces resource requirements and can lower the update latency.
- A new REST-JSON API in which the JSON/YAML formats are derived mechanically via the proto3 canonical JSON mapping.
- Delivery of updates via the filesystem, REST-JSON or gRPC endpoints.
- Advanced load balancing through an extended endpoint assignment API and load and resource utilization reporting to management servers.
- Stronger consistency and ordering properties when needed. The v2 APIs still maintain a baseline eventual consistency model.
See the xDS protocol description for further details on aspects of v2 message exchange between Envoy and the management server.
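As a minimal sketch of the filesystem delivery option listed above, a config source can point at a file on disk rather than at a management server. The file paths below are hypothetical placeholders, and each file is assumed to hold a DiscoveryResponse for the corresponding resource type.

```yaml
dynamic_resources:
  # Hypothetical paths; Envoy watches each path for updates.
  lds_config:
    path: /etc/envoy/lds.yaml
  cds_config:
    path: /etc/envoy/cds.yaml
```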
Bootstrap configuration
To use the v2 API, it’s necessary to supply a bootstrap configuration file. This provides static server configuration and configures Envoy to access dynamic configuration if needed. This is supplied on the command line via the -c flag, i.e.:
    ./envoy -c <path to config>.{json,yaml,pb,pb_text}
where the extension reflects the underlying v2 config representation.
The Bootstrap message is the root of the configuration. A key concept in the Bootstrap message is the distinction between static and dynamic resources. Resources such as a Listener or Cluster may be supplied either statically in static_resources or have an xDS service such as LDS or CDS configured in dynamic_resources.
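The static/dynamic split can be sketched as a bootstrap skeleton. This is an outline rather than a complete config: the resource lists are left empty, and xds_cluster is an assumed name for a statically defined management server cluster.

```yaml
static_resources:
  # Resources supplied directly in the bootstrap file.
  listeners: []
  clusters: []        # would include the cluster that reaches the management server
dynamic_resources:
  # Resources fetched from an xDS service (here LDS) instead.
  lds_config:
    api_config_source:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: xds_cluster
```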
Example
Below we use the YAML representation of the config protos and a running example of a service proxying HTTP from 127.0.0.1:10000 to 127.0.0.2:1234.
Static
A minimal fully static bootstrap config is provided below:
    admin:
      access_log_path: /tmp/admin_access.log
      address:
        socket_address: { address: 127.0.0.1, port_value: 9901 }
    static_resources:
      listeners:
      - name: listener_0
        address:
          socket_address: { address: 127.0.0.1, port_value: 10000 }
        filter_chains:
        - filters:
          - name: envoy.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
              stat_prefix: ingress_http
              codec_type: AUTO
              route_config:
                name: local_route
                virtual_hosts:
                - name: local_service
                  domains: ["*"]
                  routes:
                  - match: { prefix: "/" }
                    route: { cluster: some_service }
              http_filters:
              - name: envoy.router
      clusters:
      - name: some_service
        connect_timeout: 0.25s
        type: STATIC
        lb_policy: ROUND_ROBIN
        load_assignment:
          cluster_name: some_service
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 1234
Mostly static with dynamic EDS
A bootstrap config that continues from the above example with dynamic endpoint discovery via an EDS gRPC management server listening on 127.0.0.1:5678 is provided below:
    admin:
      access_log_path: /tmp/admin_access.log
      address:
        socket_address: { address: 127.0.0.1, port_value: 9901 }
    static_resources:
      listeners:
      - name: listener_0
        address:
          socket_address: { address: 127.0.0.1, port_value: 10000 }
        filter_chains:
        - filters:
          - name: envoy.http_connection_manager
            typed_config:
              "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
              stat_prefix: ingress_http
              codec_type: AUTO
              route_config:
                name: local_route
                virtual_hosts:
                - name: local_service
                  domains: ["*"]
                  routes:
                  - match: { prefix: "/" }
                    route: { cluster: some_service }
              http_filters:
              - name: envoy.router
      clusters:
      - name: some_service
        connect_timeout: 0.25s
        lb_policy: ROUND_ROBIN
        type: EDS
        eds_cluster_config:
          eds_config:
            api_config_source:
              api_type: GRPC
              grpc_services:
                envoy_grpc:
                  cluster_name: xds_cluster
      - name: xds_cluster
        connect_timeout: 0.25s
        type: STATIC
        lb_policy: ROUND_ROBIN
        http2_protocol_options: {}
        upstream_connection_options:
          # configure a TCP keep-alive to detect and reconnect to the management
          # server in the event of a TCP socket half open connection
          tcp_keepalive: {}
        load_assignment:
          cluster_name: xds_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 5678
Notice above that xds_cluster is defined to point Envoy at the management server. Even in an otherwise completely dynamic configuration, some static resources need to be defined to point Envoy at its xDS management server(s).
It’s important to set appropriate TCP Keep-Alive options in the tcp_keepalive block. This will help detect TCP half open connections to the xDS management server and re-establish a full connection.
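As a rough illustration, an empty tcp_keepalive block enables keep-alive with the operating system defaults, and the individual settings can be made explicit. The values below are illustrative assumptions, not recommendations; suitable values depend on the network between Envoy and the management server.

```yaml
upstream_connection_options:
  tcp_keepalive:
    keepalive_probes: 3     # unacknowledged probes before the connection is considered dead
    keepalive_time: 30      # idle seconds before the first probe is sent
    keepalive_interval: 5   # seconds between probes
```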
In the above example, the EDS management server could then return a proto encoding of a DiscoveryResponse:
    version_info: "0"
    resources:
    - "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
      cluster_name: some_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.2
                port_value: 1234
The versioning and type URL scheme that appear above are explained in more detail in the streaming gRPC subscription protocol documentation.
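For context, the request side of this exchange might look like the following DiscoveryRequest, shown in the same YAML form. This is a sketch: the node id and cluster values are hypothetical, and a real request carries more node information.

```yaml
version_info: ""          # empty on the first request of a stream
node:
  id: node_0              # hypothetical node identifier
  cluster: cluster_0      # hypothetical node cluster
resource_names:
- some_service
type_url: type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
```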
Dynamic
A fully dynamic bootstrap configuration, in which all resources other than those belonging to the management server are discovered via xDS, is provided below:
    admin:
      access_log_path: /tmp/admin_access.log
      address:
        socket_address: { address: 127.0.0.1, port_value: 9901 }
    dynamic_resources:
      lds_config:
        api_config_source:
          api_type: GRPC
          grpc_services:
            envoy_grpc:
              cluster_name: xds_cluster
      cds_config:
        api_config_source:
          api_type: GRPC
          grpc_services:
            envoy_grpc:
              cluster_name: xds_cluster
    static_resources:
      clusters:
      - name: xds_cluster
        connect_timeout: 0.25s
        type: STATIC
        lb_policy: ROUND_ROBIN
        http2_protocol_options: {}
        upstream_connection_options:
          # configure a TCP keep-alive to detect and reconnect to the management
          # server in the event of a TCP socket half open connection
          tcp_keepalive: {}
        load_assignment:
          cluster_name: xds_cluster
          endpoints:
          - lb_endpoints:
            - endpoint:
                address:
                  socket_address:
                    address: 127.0.0.1
                    port_value: 5678
The management server could respond to LDS requests with:
    version_info: "0"
    resources:
    - "@type": type.googleapis.com/envoy.api.v2.Listener
      name: listener_0
      address:
        socket_address:
          address: 127.0.0.1
          port_value: 10000
      filter_chains:
      - filters:
        - name: envoy.http_connection_manager
          typed_config:
            "@type": type.googleapis.com/envoy.config.filter.network.http_connection_manager.v2.HttpConnectionManager
            stat_prefix: ingress_http
            codec_type: AUTO
            rds:
              route_config_name: local_route
              config_source:
                api_config_source:
                  api_type: GRPC
                  grpc_services:
                    envoy_grpc:
                      cluster_name: xds_cluster
            http_filters:
            - name: envoy.router
The management server could respond to RDS requests with:
    version_info: "0"
    resources:
    - "@type": type.googleapis.com/envoy.api.v2.RouteConfiguration
      name: local_route
      virtual_hosts:
      - name: local_service
        domains: ["*"]
        routes:
        - match: { prefix: "/" }
          route: { cluster: some_service }
The management server could respond to CDS requests with:
    version_info: "0"
    resources:
    - "@type": type.googleapis.com/envoy.api.v2.Cluster
      name: some_service
      connect_timeout: 0.25s
      lb_policy: ROUND_ROBIN
      type: EDS
      eds_cluster_config:
        eds_config:
          api_config_source:
            api_type: GRPC
            grpc_services:
              envoy_grpc:
                cluster_name: xds_cluster
The management server could respond to EDS requests with:
    version_info: "0"
    resources:
    - "@type": type.googleapis.com/envoy.api.v2.ClusterLoadAssignment
      cluster_name: some_service
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: 127.0.0.2
                port_value: 1234
xDS API endpoints
A v2 xDS management server implements the endpoints below as required for gRPC and/or REST serving. In both the streaming gRPC and REST-JSON cases, a DiscoveryRequest is sent and a DiscoveryResponse is received, following the xDS protocol.
gRPC streaming endpoints
- POST /envoy.api.v2.ClusterDiscoveryService/StreamClusters
See cds.proto for the service definition. This is used by Envoy as a client when
    cds_config:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: some_xds_cluster
is set in the dynamic_resources of the Bootstrap config.
- POST /envoy.api.v2.EndpointDiscoveryService/StreamEndpoints
See eds.proto for the service definition. This is used by Envoy as a client when
    eds_config:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: some_xds_cluster
is set in the eds_cluster_config field of the Cluster config.
- POST /envoy.api.v2.ListenerDiscoveryService/StreamListeners
See lds.proto for the service definition. This is used by Envoy as a client when
    lds_config:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: some_xds_cluster
is set in the dynamic_resources of the Bootstrap config.
- POST /envoy.api.v2.RouteDiscoveryService/StreamRoutes
See rds.proto for the service definition. This is used by Envoy as a client when
    route_config_name: some_route_name
    config_source:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: some_xds_cluster
is set in the rds field of the HttpConnectionManager config.
- POST /envoy.api.v2.ScopedRoutesDiscoveryService/StreamScopedRoutes
See srds.proto for the service definition. This is used by Envoy as a client when
    name: some_scoped_route_name
    scoped_rds:
      config_source:
        api_config_source:
          api_type: GRPC
          grpc_services:
            envoy_grpc:
              cluster_name: some_xds_cluster
is set in the scoped_routes field of the HttpConnectionManager config.
- POST /envoy.service.discovery.v2.SecretDiscoveryService/StreamSecrets
See sds.proto for the service definition. This is used by Envoy as a client when
    name: some_secret_name
    config_source:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: some_xds_cluster
is set inside a SdsSecretConfig message. This message is used in various places such as the CommonTlsContext.
- POST /envoy.service.discovery.v2.RuntimeDiscoveryService/StreamRuntime
See rtds.proto for the service definition. This is used by Envoy as a client when
    name: some_runtime_layer_name
    config_source:
      api_config_source:
        api_type: GRPC
        grpc_services:
          envoy_grpc:
            cluster_name: some_xds_cluster
is set inside the rtds_layer field.
REST endpoints
- POST /v2/discovery:clusters
See cds.proto for the service definition. This is used by Envoy as a client when
    cds_config:
      api_config_source:
        api_type: REST
        cluster_names: [some_xds_cluster]
is set in the dynamic_resources of the Bootstrap config.
- POST /v2/discovery:endpoints
See eds.proto for the service definition. This is used by Envoy as a client when
    eds_config:
      api_config_source:
        api_type: REST
        cluster_names: [some_xds_cluster]
is set in the eds_cluster_config field of the Cluster config.
- POST /v2/discovery:listeners
See lds.proto for the service definition. This is used by Envoy as a client when
    lds_config:
      api_config_source:
        api_type: REST
        cluster_names: [some_xds_cluster]
is set in the dynamic_resources of the Bootstrap config.
- POST /v2/discovery:routes
See rds.proto for the service definition. This is used by Envoy as a client when
    route_config_name: some_route_name
    config_source:
      api_config_source:
        api_type: REST
        cluster_names: [some_xds_cluster]
is set in the rds field of the HttpConnectionManager config.
Note
The management server responding to these endpoints must respond with a DiscoveryResponse along with an HTTP status of 200. Additionally, if the configuration that would be supplied has not changed (as indicated by the version supplied by the Envoy client), then the management server can respond with an empty body and an HTTP status of 304.
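Because the REST variant is polled rather than streamed, the config source typically also carries polling parameters. The sketch below adds the refresh_delay and request_timeout fields of ApiConfigSource to the CDS example; the durations are illustrative assumptions.

```yaml
cds_config:
  api_config_source:
    api_type: REST
    cluster_names: [some_xds_cluster]
    refresh_delay: 5s      # delay between successive polls
    request_timeout: 1s    # timeout for each poll request
```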
Aggregated Discovery Service
While Envoy fundamentally employs an eventual consistency model, ADS provides an opportunity to sequence API update pushes and ensure affinity of a single management server for an Envoy node for API updates. ADS allows one or more APIs and their resources to be delivered on a single, bidirectional gRPC stream by the management server. Without this, some APIs such as RDS and EDS may require the management of multiple streams and connections to distinct management servers.
ADS will allow for hitless updates of configuration by appropriate sequencing. For example, suppose foo.com was mapped to cluster X. We wish to change the mapping in the route table to point foo.com at cluster Y. In order to do this, a CDS/EDS update must first be delivered containing both clusters X and Y.
Without ADS, the CDS/EDS/RDS streams may point at distinct management servers, or when on the same management server at distinct gRPC streams/connections that require coordination. The EDS resource requests may be split across two distinct streams, one for X and one for Y. ADS allows these to be coalesced to a single stream to a single management server, avoiding the need for distributed synchronization to correctly sequence the update. With ADS, the management server would deliver the CDS, EDS and then RDS updates on a single stream.
ADS is only available for gRPC streaming (not REST) and is described more fully in the xDS protocol document. The gRPC endpoint is:
- POST /envoy.service.discovery.v2.AggregatedDiscoveryService/StreamAggregatedResources
See discovery.proto for the service definition. This is used by Envoy as a client when
    ads_config:
      api_type: GRPC
      grpc_services:
        envoy_grpc:
          cluster_name: some_ads_cluster
is set in the dynamic_resources of the Bootstrap config.
When this is set, any of the configuration sources above can be set to use the ADS channel. For example, an LDS config could be changed from
    lds_config:
      api_config_source:
        api_type: REST
        cluster_names: [some_xds_cluster]
to
    lds_config: {ads: {}}
with the effect that the LDS stream will be directed to some_ads_cluster over the shared ADS channel.
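Putting the pieces together, a dynamic_resources block that carries both LDS and CDS over the ADS channel might look like the sketch below. It assumes some_ads_cluster is defined under static_resources, in the same way xds_cluster is in the earlier examples.

```yaml
dynamic_resources:
  ads_config:
    api_type: GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: some_ads_cluster   # assumed to exist in static_resources
  lds_config: {ads: {}}
  cds_config: {ads: {}}
```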
Delta endpoints
The REST, filesystem, and original gRPC xDS implementations all deliver “state of the world” updates: every CDS update must contain every cluster, with the absence of a cluster from an update implying that the cluster is gone. For Envoy deployments with huge amounts of resources and even a trickle of churn, these state-of-the-world updates can be cumbersome.
As of 1.12.0, Envoy supports a “delta” variant of xDS (including ADS), where updates only contain resources added/changed/removed. Delta xDS is a gRPC (only) protocol. Delta uses different request/response protos than SotW (DeltaDiscovery{Request,Response}); see discovery.proto. Conceptually, delta should be viewed as a new xDS transport type: there is static, filesystem, REST, gRPC-SotW, and now gRPC-delta. (Envoy’s implementation of the gRPC-SotW/delta client happens to share most of its code between the two, and something similar is likely possible on the server side. However, they are in fact incompatible protocols. The behavior of the delta xDS protocol is specified in the xDS protocol documentation.)
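To make the contrast with state-of-the-world updates concrete, a delta response that adds one cluster and removes another could be sketched as follows. The field names follow DeltaDiscoveryResponse in discovery.proto; the cluster names, version strings and nonce are hypothetical.

```yaml
system_version_info: "1"
type_url: type.googleapis.com/envoy.api.v2.Cluster
resources:
- name: cluster_y                 # added or changed resource
  version: "1"
  resource:
    "@type": type.googleapis.com/envoy.api.v2.Cluster
    name: cluster_y
    connect_timeout: 0.25s
    type: EDS
    eds_cluster_config:
      eds_config: {ads: {}}
removed_resources:
- cluster_x                       # removed resources are listed by name only
nonce: "a1"
```

Clusters that appear in neither list are left as they are, which is what distinguishes this from a state-of-the-world CDS update.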
To use delta, simply set the api_type field of your ApiConfigSource proto(s) to DELTA_GRPC. That works for both xDS and ADS; for ADS, it’s the api_type field of DynamicResources.ads_config, as described in the previous section.
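The two cases can be sketched as separate fragments, shown here as two YAML documents. The cluster names are the assumed ones used in the earlier examples.

```yaml
# Delta for a single xDS API: set api_type on its ApiConfigSource.
cds_config:
  api_config_source:
    api_type: DELTA_GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: xds_cluster
---
# Delta over ADS: set api_type on DynamicResources.ads_config.
ads_config:
  api_type: DELTA_GRPC
  grpc_services:
    envoy_grpc:
      cluster_name: some_ads_cluster
```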
Management Server Unreachability
When an Envoy instance loses connectivity with the management server, Envoy will latch on to the previous configuration while actively retrying in the background to reestablish the connection with the management server.
Each time a connection attempt to the management server fails, Envoy logs the failure at debug level.
The connected_state statistic provides a signal for monitoring this behavior.
Statistics
The management server has a statistics tree rooted at control_plane. with the following statistics:
Name | Type | Description |
---|---|---|
connected_state | Gauge | A boolean (1 for connected and 0 for disconnected) that indicates the current connection state with management server |
rate_limit_enforced | Counter | Total number of times rate limit was enforced for management server requests |
pending_requests | Gauge | Total number of pending requests when the rate limit was enforced |
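The rate limit statistics presumably matter only when rate limiting is configured on the config source. A sketch, assuming the rate_limit_settings field of ApiConfigSource and using illustrative values:

```yaml
cds_config:
  api_config_source:
    api_type: GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: xds_cluster
    rate_limit_settings:
      max_tokens: 100    # token bucket size
      fill_rate: 10      # tokens added per second
```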
xDS subscription statistics
Envoy discovers its various dynamic resources via discovery services referred to as xDS. Resources are requested via subscriptions, by specifying a filesystem path to watch, initiating gRPC streams or polling a REST-JSON URL.
The following statistics are generated for all subscriptions.
Name | Type | Description |
---|---|---|
config_reload | Counter | Total API fetches that resulted in a config reload due to a different config |
init_fetch_timeout | Counter | Total initial fetch timeouts |
update_attempt | Counter | Total API fetches attempted |
update_success | Counter | Total API fetches completed successfully |
update_failure | Counter | Total API fetches that failed because of network errors |
update_rejected | Counter | Total API fetches that failed because of schema/validation errors |
version | Gauge | Hash of the contents from the last successful API fetch |
control_plane.connected_state | Gauge | A boolean (1 for connected and 0 for disconnected) that indicates the current connection state with management server |
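The init_fetch_timeout counter relates to the initial_fetch_timeout setting on a config source. As a sketch with an illustrative value, and assuming the xds_cluster name from the earlier examples:

```yaml
cds_config:
  initial_fetch_timeout: 10s   # illustrative bound on the wait for the first response
  api_config_source:
    api_type: GRPC
    grpc_services:
      envoy_grpc:
        cluster_name: xds_cluster
```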