Rate Limit Quota Service (RLQS) (proto)
Warning
This API feature is currently work-in-progress. API features marked as work-in-progress are not considered stable, are not covered by the threat model, are not supported by the security team, and are subject to breaking changes. Do not use this feature without understanding each of the previous points.
The Rate Limit Quota Service (RLQS) is an Envoy global rate limiting service that delegates rate limit decisions to a remote service. The service aggregates the usage reports from multiple data plane instances and distributes Rate Limit Assignments to each instance based on its business logic. The logic is outside the scope of the protocol API.
The protocol is designed as a streaming-first API. It uses a watch-like subscription model. The data plane groups requests into Quota Buckets as directed by the filter config, and periodically reports them to the RLQS server along with the bucket identifier, BucketId. Once the RLQS server has collected enough reports to make a decision, it sends back the assignment with the rate limiting instructions.
The first report sent by the data plane is interpreted by the RLQS server as a “watch” request, indicating that the data plane instance is interested in receiving further updates for the BucketId. From then on, the RLQS server may push assignments to this instance at will, even if the instance is not sending usage reports. It is the responsibility of the RLQS server to determine when the data plane instance has not sent BucketId reports for too long, and to respond with the AbandonAction, indicating that the server has stopped sending quota assignments for the BucketId bucket and that the data plane instance should abandon it.
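The watch semantics can be sketched from the server's side roughly as follows. This is a minimal illustration, not the RLQS server implementation: the class SubscriptionTracker, its method names, and the idle_timeout policy are all hypothetical, and how long "too long" is remains up to the server's business logic.

```python
from dataclasses import dataclass, field


@dataclass
class SubscriptionTracker:
    """Hypothetical server-side bookkeeping for one data plane stream."""

    idle_timeout: float  # assumed policy: seconds of silence before abandoning
    last_report: dict = field(default_factory=dict)  # bucket key -> last report time

    def on_usage_report(self, bucket_key, now: float) -> bool:
        """Record a report; return True if it starts a new subscription."""
        is_new = bucket_key not in self.last_report
        self.last_report[bucket_key] = now
        return is_new

    def collect_abandoned(self, now: float) -> list:
        """Return buckets silent for longer than idle_timeout and stop tracking them.

        The caller would answer each returned bucket with an AbandonAction.
        """
        stale = [k for k, t in self.last_report.items() if now - t > self.idle_timeout]
        for key in stale:
            del self.last_report[key]
        return stale
```

Under these assumptions, a report that arrives for a bucket after it was abandoned is treated as a fresh subscription again.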
If for any reason the RLQS client doesn’t receive the initial assignment for the reported bucket, the data plane will limit the time such a bucket is retained, in order to prevent memory exhaustion. The exact time to wait for the initial assignment is chosen by the filter, and may vary based on the implementation. Once the duration ends, the data plane will stop reporting bucket usage, reject any enqueued requests, and purge the bucket from memory. Subsequent requests matched into the bucket will re-initialize the bucket in the “no assignment” state, restarting the reports.
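The retention limit for buckets without an initial assignment could look roughly like the sketch below. The names PendingBucket, purge_unassigned, and max_wait are hypothetical; the actual wait time and bookkeeping are chosen by the filter implementation.

```python
from dataclasses import dataclass


@dataclass
class PendingBucket:
    """Hypothetical client-side record of a reported bucket."""

    created_at: float            # time the first request was matched into the bucket
    has_assignment: bool = False


def purge_unassigned(buckets: dict, now: float, max_wait: float) -> list:
    """Drop buckets that never received an initial assignment within max_wait seconds.

    Returns the purged keys; a real filter would also stop reporting usage for
    them and reject any requests still enqueued.
    """
    expired = [
        key
        for key, bucket in buckets.items()
        if not bucket.has_assignment and now - bucket.created_at >= max_wait
    ]
    for key in expired:
        del buckets[key]
    return expired
```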
Refer to Rate Limit Quota configuration overview for further details.
service.rate_limit_quota.v3.RateLimitQuotaUsageReports
[service.rate_limit_quota.v3.RateLimitQuotaUsageReports proto]
{
"domain": ...,
"bucket_quota_usages": []
}
- domain
(string, REQUIRED) All quota requests must specify the domain. This enables sharing the quota server between different applications without fear of overlap. E.g., “envoy”.
Should only be provided in the first report; all subsequent messages on the same stream are considered to be in the same domain. If the domain needs to be changed, close the stream and open a new one with the different domain.
- bucket_quota_usages
(repeated service.rate_limit_quota.v3.RateLimitQuotaUsageReports.BucketQuotaUsage, REQUIRED) A list of quota usage reports. The list is processed by the RLQS server in the same order it’s provided by the client.
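As an illustration of the domain rule, the sketch below builds the sequence of report messages for one stream as plain dictionaries that mirror the field names above. The helper build_reports is hypothetical, and a real client would use the generated protobuf classes rather than dictionaries.

```python
def build_reports(domain: str, batches):
    """Yield report messages for one stream; only the first one carries the domain.

    `batches` is an iterable of lists of BucketQuotaUsage-shaped dicts. The
    order inside each list is preserved, since the server processes usages in
    the order provided.
    """
    for index, usages in enumerate(batches):
        message = {"bucket_quota_usages": list(usages)}
        if index == 0:
            message["domain"] = domain
        yield message


usage = {
    "bucket_id": {"bucket": {"name": "my_bucket", "env": "staging"}},
    "time_elapsed": "5s",  # Duration in its proto3 JSON string form
    "num_requests_allowed": 120,
    "num_requests_denied": 3,
}
messages = list(build_reports("envoy", [[usage], [usage]]))
```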
service.rate_limit_quota.v3.RateLimitQuotaUsageReports.BucketQuotaUsage
[service.rate_limit_quota.v3.RateLimitQuotaUsageReports.BucketQuotaUsage proto]
The usage report for a bucket.
Note
The first report sent for a BucketId indicates to the RLQS server that the RLQS client is subscribing to future assignments for this BucketId.
{
"bucket_id": {...},
"time_elapsed": {...},
"num_requests_allowed": ...,
"num_requests_denied": ...
}
- bucket_id
(service.rate_limit_quota.v3.BucketId, REQUIRED) BucketId for which request quota usage is reported.
- time_elapsed
(Duration, REQUIRED) Time elapsed since the last report.
- num_requests_allowed
(uint64) Requests the data plane has allowed through.
- num_requests_denied
(uint64) Requests throttled.
service.rate_limit_quota.v3.RateLimitQuotaResponse
[service.rate_limit_quota.v3.RateLimitQuotaResponse proto]
{
"bucket_action": []
}
- bucket_action
(repeated service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction, REQUIRED) An ordered list of actions to be applied to the buckets. The actions are applied in the given order, from top to bottom.
service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction
[service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction proto]
Commands the data plane to apply one of the actions to the bucket with the bucket_id.
{
"bucket_id": {...},
"quota_assignment_action": {...},
"abandon_action": {...}
}
- bucket_id
(service.rate_limit_quota.v3.BucketId, REQUIRED) BucketId of the bucket to which the action is applied.
- quota_assignment_action
(service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.QuotaAssignmentAction) Apply the quota assignment to the bucket.
Commands the data plane to apply a rate limiting strategy to the bucket. The process of applying and expiring the rate limiting strategy is detailed in the QuotaAssignmentAction message.
Precisely one of quota_assignment_action, abandon_action must be set.
- abandon_action
(service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.AbandonAction) Abandon the bucket.
Commands the data plane to abandon the bucket. The process of abandoning the bucket is described in the AbandonAction message.
Precisely one of quota_assignment_action, abandon_action must be set.
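The "precisely one of" constraint and the ordering of bucket_action can be checked on the receiving side along these lines. This sketch works on dictionaries shaped like the messages above; the helper names action_kind and plan_actions are hypothetical, and generated protobuf code would normally expose the oneof directly.

```python
ACTION_FIELDS = ("quota_assignment_action", "abandon_action")


def action_kind(bucket_action: dict) -> str:
    """Return the name of the action that is set, enforcing exactly one."""
    present = [name for name in ACTION_FIELDS if bucket_action.get(name) is not None]
    if len(present) != 1:
        raise ValueError(f"expected exactly one action, got {present}")
    return present[0]


def plan_actions(response: dict) -> list:
    """Return (kind, bucket_id) pairs in the order they must be applied."""
    return [(action_kind(a), a["bucket_id"]) for a in response["bucket_action"]]
```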
service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.QuotaAssignmentAction
[service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.QuotaAssignmentAction proto]
Quota assignment for the bucket. Configures the rate limiting strategy and the duration for the given bucket_id.
Applying the first assignment to the bucket
Once the data plane receives the QuotaAssignmentAction, it must send the current usage report for the bucket, and start rate limiting requests matched into the bucket using the strategy configured in the rate_limit_strategy field. The assignment becomes the bucket’s active assignment.
Expiring the assignment
The duration of the assignment is defined in the assignment_time_to_live field. When the duration runs out, the assignment is expired and no longer active. The data plane should stop applying the rate limiting strategy to the bucket, and transition the bucket to the “expired assignment” state. This activates the behavior configured in the expired_assignment_behavior field.
Replacing the assignment
- If the rate limiting strategy is different from the bucket’s active assignment, or the current bucket assignment is expired, the data plane must immediately end the current assignment, report the bucket usage, and apply the new assignment. The new assignment becomes the bucket’s active assignment.
- If the rate limiting strategy is the same as the bucket’s active (not expired) assignment, the data plane should extend the duration of the active assignment for the duration of the new assignment provided in the assignment_time_to_live field. The active assignment is considered unchanged.
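The apply, expire, and replace rules can be summarized in a small state sketch. Everything named here (BucketState, apply_assignment, the placeholder strategy dictionaries) is hypothetical. Reading "extend the duration" as restarting the time to live from the moment the new assignment arrives is an interpretation, not something the protocol text spells out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BucketState:
    """Hypothetical client-side assignment state for one bucket."""

    strategy: Optional[dict] = None
    expires_at: Optional[float] = None  # None means no expiration
    has_assignment: bool = False

    def is_active(self, now: float) -> bool:
        """An assignment is active if one was applied and it has not expired."""
        return self.has_assignment and (self.expires_at is None or now < self.expires_at)


def apply_assignment(state: BucketState, strategy: dict, ttl: Optional[float], now: float) -> bool:
    """Apply a QuotaAssignmentAction; return True if a usage report must be sent."""
    new_expiry = None if ttl is None else now + ttl
    if state.is_active(now) and state.strategy == strategy:
        # Same strategy as the active assignment: only the duration is extended.
        state.expires_at = new_expiry
        return False
    # First assignment, different strategy, or expired assignment:
    # end the current one, report usage, and apply the new one.
    state.strategy = strategy
    state.expires_at = new_expiry
    state.has_assignment = True
    return True
```

With this model, a time to live of 0 yields an assignment that is never active, which matches the "expires immediately" behavior of assignment_time_to_live.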
{
"assignment_time_to_live": {...},
"rate_limit_strategy": {...}
}
- assignment_time_to_live
(Duration) A duration after which the assignment is considered expired. The process of the expiration is described above.
If unset, the assignment has no expiration date.
If set to 0, the assignment expires immediately, forcing the client into the “expired assignment” state. This may be used by the RLQS server when it needs clients to proactively fall back to the pre-configured ExpiredAssignmentBehavior, e.g. before the server restarts.
Attention
Note that expiring the assignment is not the same as abandoning the assignment. Expiring the assignment only transitions the bucket to the “expired assignment” state, while abandoning the assignment completely erases the bucket from the data plane memory and stops the usage reports.
- rate_limit_strategy
(type.v3.RateLimitStrategy) Configures the local rate limiter for the requests matched to the bucket. If not set, all requests are allowed.
service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.AbandonAction
[service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.AbandonAction proto]
Abandon action for the bucket. Indicates that the RLQS server will no longer be sending updates for the given bucket_id.
If no requests are reported for a bucket, after some time the server considers the bucket inactive. The server stops tracking the bucket, and instructs the data plane to abandon the bucket via this message.
Abandoning the assignment
The data plane is to erase the bucket (including its usage data) from memory. It should stop tracking the bucket, and stop reporting its usage. This effectively resets the data plane to the state prior to matching the first request into the bucket.
Restarting the subscription
If a new request is matched into a previously abandoned bucket, the data plane must behave as if it has never tracked the bucket and this is the first request matched into it:
- The process of subscription and reporting starts from the beginning.
- The bucket transitions to the “no assignment” state.
- Once the new assignment is received, it’s applied per the “Applying the first assignment to the bucket” section of the QuotaAssignmentAction.
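The abandon and restart cycle amounts to deleting the bucket and recreating it lazily on the next matched request. The sketch below is a simplified model with hypothetical names (BucketCache and its methods); the state strings follow the wording used in this document.

```python
class BucketCache:
    """Hypothetical data plane cache of tracked buckets."""

    def __init__(self):
        self.buckets = {}

    def on_request(self, key) -> str:
        """Match a request into a bucket, creating it if needed; return its state."""
        if key not in self.buckets:
            # First request (or first after abandon): start from scratch.
            self.buckets[key] = {"state": "no assignment", "allowed": 0, "denied": 0}
        return self.buckets[key]["state"]

    def on_assignment(self, key) -> None:
        """Mark the bucket as having an active assignment, if it is tracked."""
        if key in self.buckets:
            self.buckets[key]["state"] = "active"

    def on_abandon(self, key) -> None:
        """Erase the bucket and its usage data; reporting stops with it."""
        self.buckets.pop(key, None)
```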
service.rate_limit_quota.v3.BucketId
[service.rate_limit_quota.v3.BucketId proto]
The identifier for the bucket. Used to match the bucket between the control plane (RLQS server) and the data plane (RLQS client), e.g.:
- the data plane sends a usage report for requests matched into the bucket with BucketId to the control plane
- the control plane sends an assignment for the bucket with BucketId to the data plane
Example:
bucket:
  name: my_bucket
  env: staging
Note
The order of BucketId keys does not matter. Buckets { a: 'A', b: 'B' } and { b: 'B', a: 'A' } are identical.
{
"bucket": {...}
}
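Because key order is irrelevant, an implementation that indexes buckets by BucketId needs an order-independent key. One possible approach, shown here with a hypothetical helper bucket_key, is to convert the bucket map into a frozenset of key/value pairs, which is hashable and ignores insertion order.

```python
def bucket_key(bucket_id: dict) -> frozenset:
    """Build a hashable, order-independent key from a BucketId-shaped dict."""
    return frozenset(bucket_id["bucket"].items())


first = bucket_key({"bucket": {"a": "A", "b": "B"}})
second = bucket_key({"bucket": {"b": "B", "a": "A"}})
different = bucket_key({"bucket": {"a": "A", "b": "C"}})

# Both orderings map to the same entry when used as dictionary keys.
assignments = {first: "active"}
```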