Rate Limit Quota Service (RLQS) (proto)

Warning

This API feature is currently work-in-progress. API features marked as work-in-progress are not considered stable, are not covered by the threat model, are not supported by the security team, and are subject to breaking changes. Do not use this feature without understanding each of the previous points.

The Rate Limit Quota Service (RLQS) is a Envoy global rate limiting service that allows to delegate rate limit decisions to a remote service. The service will aggregate the usage reports from multiple data plane instances, and distribute Rate Limit Assignments to each instance based on its business logic. The logic is outside of the scope of the protocol API.

The protocol is designed as a streaming-first API. It utilizes watch-like subscription model. The data plane groups requests into Quota Buckets as directed by the filter config, and periodically reports them to the RLQS server along with the Bucket identifier, BucketId. Once RLQS server has collected enough reports to make a decision, it’ll send back the assignment with the rate limiting instructions.

The first report sent by the data plane is interpreted by the RLQS server as a “watch” request, indicating that the data plane instance is interested in receiving further updates for the BucketId. From then on, RLQS server may push assignments to this instance at will, even if the instance is not sending usage reports. It’s the responsibility of the RLQS server to determine when the data plane instance didn’t send BucketId reports for too long, and to respond with the AbandonAction, indicating that the server has now stopped sending quota assignments for the BucketId bucket, and the data plane instance should abandon it.

If for any reason the RLQS client doesn’t receive the initial assignment for the reported bucket, in order to prevent memory exhaustion, the data plane will limit the time such bucket is retained. The exact time to wait for the initial assignment is chosen by the filter, and may vary based on the implementation. Once the duration ends, the data plane will stop reporting bucket usage, reject any enqueued requests, and purge the bucket from the memory. Subsequent requests matched into the bucket will re-initialize the bucket in the “no assignment” state, restarting the reports.

Refer to Rate Limit Quota configuration overview for further details.

service.rate_limit_quota.v3.RateLimitQuotaUsageReports

[service.rate_limit_quota.v3.RateLimitQuotaUsageReports proto]

{
  "domain": ...,
  "bucket_quota_usages": []
}

domain

(string, REQUIRED) All quota requests must specify the domain. This enables sharing the quota server between different applications without fear of overlap. E.g., “envoy”.

Should only be provided in the first report, all subsequent messages on the same stream are considered to be in the same domain. In case the domain needs to be changes, close the stream, and reopen a new one with the different domain.

bucket_quota_usages: (repeated service.rate_limit_quota.v3.RateLimitQuotaUsageReports.BucketQuotaUsage, REQUIRED) A list of quota usage reports. The list is processed by the RLQS server in the same order it’s provided by the client.

service.rate_limit_quota.v3.RateLimitQuotaUsageReports.BucketQuotaUsage

[service.rate_limit_quota.v3.RateLimitQuotaUsageReports.BucketQuotaUsage proto]

The usage report for a bucket.

Note

Note that the first report sent for a BucketId indicates to the RLQS server that the RLQS client is subscribing for the future assignments for this BucketId.

{
  "bucket_id": {...},
  "time_elapsed": {...},
  "num_requests_allowed": ...,
  "num_requests_denied": ...
}

bucket_id: (service.rate_limit_quota.v3.BucketId, REQUIRED) BucketId for which request quota usage is reported.

time_elapsed: (Duration, REQUIRED) Time elapsed since the last report.

num_requests_allowed: (uint64) Requests the data plane has allowed through.

num_requests_denied: (uint64) Requests throttled.

service.rate_limit_quota.v3.RateLimitQuotaResponse

[service.rate_limit_quota.v3.RateLimitQuotaResponse proto]

{
  "bucket_action": []
}

bucket_action: (repeated service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction, REQUIRED) An ordered list of actions to be applied to the buckets. The actions are applied in the given order, from top to bottom.

service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction

[service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction proto]

Commands the data plane to apply one of the actions to the bucket with the bucket_id.

{
  "bucket_id": {...},
  "quota_assignment_action": {...},
  "abandon_action": {...}
}

bucket_id: (service.rate_limit_quota.v3.BucketId, REQUIRED) BucketId for which request the action is applied.

quota_assignment_action

(service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.QuotaAssignmentAction) Apply the quota assignment to the bucket.

Commands the data plane to apply a rate limiting strategy to the bucket. The process of applying and expiring the rate limiting strategy is detailed in the QuotaAssignmentAction message.

Precisely one of quota_assignment_action, abandon_action must be set.

abandon_action

(service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.AbandonAction) Abandon the bucket.

Commands the data plane to abandon the bucket. The process of abandoning the bucket is described in the AbandonAction message.

Precisely one of quota_assignment_action, abandon_action must be set.

service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.QuotaAssignmentAction

[service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.QuotaAssignmentAction proto]

Quota assignment for the bucket. Configures the rate limiting strategy and the duration for the given bucket_id.

Applying the first assignment to the bucket

Once the data plane receives the QuotaAssignmentAction, it must send the current usage report for the bucket, and start rate limiting requests matched into the bucket using the strategy configured in the rate_limit_strategy field. The assignment becomes bucket’s active assignment.

Expiring the assignment

The duration of the assignment defined in the assignment_time_to_live field. When the duration runs off, the assignment is expired, and no longer active. The data plane should stop applying the rate limiting strategy to the bucket, and transition the bucket to the “expired assignment” state. This activates the behavior configured in the expired_assignment_behavior field.

Replacing the assignment

If the rate limiting strategy is different from bucket’s active assignment, or the current bucket assignment is expired, the data plane must immediately end the current assignment, report the bucket usage, and apply the new assignment. The new assignment becomes bucket’s active assignment.
If the rate limiting strategy is the same as the bucket’s active (not expired) assignment, the data plane should extend the duration of the active assignment for the duration of the new assignment provided in the assignment_time_to_live field. The active assignment is considered unchanged.

{
  "assignment_time_to_live": {...},
  "rate_limit_strategy": {...}
}

assignment_time_to_live

(Duration) A duration after which the assignment is be considered expired. The process of the expiration is described above.

If unset, the assignment has no expiration date.
If set to 0, the assignment expires immediately, forcing the client into the “expired assignment” state. This may be used by the RLQS server in cases when it needs clients to proactively fall back to the pre-configured ExpiredAssignmentBehavior, f.e. before the server going into restart.

Attention

Note that expiring the assignment is not the same as abandoning the assignment. While expiring the assignment just transitions the bucket to the “expired assignment” state; abandoning the assignment completely erases the bucket from the data plane memory, and stops the usage reports.

rate_limit_strategy: (type.v3.RateLimitStrategy) Configures the local rate limiter for the request matched to the bucket. If not set, allow all requests.

service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.AbandonAction

[service.rate_limit_quota.v3.RateLimitQuotaResponse.BucketAction.AbandonAction proto]

Abandon action for the bucket. Indicates that the RLQS server will no longer be sending updates for the given bucket_id.

If no requests are reported for a bucket, after some time the server considers the bucket inactive. The server stops tracking the bucket, and instructs the the data plane to abandon the bucket via this message.

Abandoning the assignment

The data plane is to erase the bucket (including its usage data) from the memory. It should stop tracking the bucket, and stop reporting its usage. This effectively resets the data plane to the state prior to matching the first request into the bucket.

Restarting the subscription

If a new request is matched into a bucket previously abandoned, the data plane must behave as if it has never tracked the bucket, and it’s the first request matched into it:

The process of subscription and reporting starts from the beginning.
The bucket transitions to the “no assignment” state.
Once the new assignment is received, it’s applied per “Applying the first assignment to the bucket” section of the QuotaAssignmentAction.

service.rate_limit_quota.v3.BucketId

[service.rate_limit_quota.v3.BucketId proto]

The identifier for the bucket. Used to match the bucket between the control plane (RLQS server), and the data plane (RLQS client), f.e.:

the data plane sends a usage report for requests matched into the bucket with BucketId to the control plane
the control plane sends an assignment for the bucket with BucketId to the data plane Bucket ID.

Example:

bucket:
  name: my_bucket
  env: staging

Note

The order of BucketId keys do not matter. Buckets { a: 'A', b: 'B' } and { b: 'B', a: 'A' } are identical.

{
  "bucket": {...}
}

bucket: (repeated map<string, string>)