Rate Limiting¶
Ocelot implements rate limiting [1] for upstream requests to prevent downstream services from being overwhelmed.
RateLimitOptions Schema¶
Class: FileRateLimitByHeaderRule
As you may already know from the Configuration chapter and the Route Schema and Dynamic Route Schema sections, there is a special RateLimitOptions object schema for routes:
"RateLimitOptions": {
// rule, partition by
"ClientIdHeader": "",
"ClientWhitelist": [""],
// management opts
"EnableRateLimiting": true,
"EnableHeaders": true,
// algorithm
"Limit": 1,
"Period": "",
"Wait": "",
// extended opts
"StatusCode": 1,
"QuotaMessage": "",
"KeyPrefix": ""
}
Additionally, the Global Configuration Schema allows configuring global Rate Limiting options.
Note 1: The complete route-level RateLimitOptions schema, including all available properties, is defined in the C# FileRateLimitByHeaderRule class. The global RateLimitOptions schema includes an additional RouteKeys array option, which allows grouping routes to which the global options will apply (refer to the C# FileGlobalRateLimitByHeaderRule class for details). If the RouteKeys option is not defined in the global RateLimitOptions, the global settings will apply to all routes.
Note 2: You do not need to set all of these options due to default values, but the following rule options are required: Limit and Period. If these required options are undefined and no global configuration is present, Ocelot will fail to start due to an internally generated validation error, which will be visible in the logs.
Note 3: Several deprecated options originating from version 24.0 and earlier (see old schema) are retained for one release cycle. Both introduced and deprecated options are detailed in the Configuration [2] table below.
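The interplay between route-level options, the global RouteKeys group, and the required Limit and Period options can be sketched as follows. This is an illustrative Python model, not Ocelot's C# implementation: the function name `resolve_rate_limit_options` is hypothetical, and the merge order (route-level values overriding global ones) is an assumption that this section does not state explicitly.

```python
def resolve_rate_limit_options(route, global_options=None):
    """Return the effective options for one route, or raise ValueError.

    Assumption: when an option is set at both levels, the route-level
    value wins. Returns None if nothing is configured for the route.
    """
    effective = {}
    if global_options:
        route_keys = global_options.get("RouteKeys") or []
        # An undefined or empty RouteKeys array means "all routes".
        if not route_keys or route.get("Key") in route_keys:
            effective.update(
                {k: v for k, v in global_options.items() if k != "RouteKeys"}
            )
    effective.update(route.get("RateLimitOptions") or {})
    if not effective:
        return None
    # Limit and Period are the two required rule options.
    missing = [n for n in ("Limit", "Period") if effective.get(n) in (None, "")]
    if missing:
        raise ValueError(f"Missing required option(s): {', '.join(missing)}")
    return effective
```

With a global group of `["R1"]`, a route keyed "R2" that defines only Limit raises an error in this model, which mirrors the startup validation failure described in Note 2.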
Configuration [2]¶
A complete configuration consists of both route-level and global Rate Limiting.
You can configure the following options in the GlobalConfiguration section of ocelot.json:
"Routes": [
{
"Key": "R1",
"RateLimitOptions": {
"ClientWhitelist": ["ocelot-client1-preshared-key"],
"Limit": 1000,
"Period": "20s", // (milli)seconds, minutes, hours, days
"Wait": "1.5m", // (milli)seconds, minutes, hours, days
"StatusCode": 418, // I'm a teapot -> this is special status
"QuotaMessage": "Out of coffee! Our bar can only serve up to {0} cups of coffee every {1}. In the meantime, why not grab some tea and relax for Retry-After seconds until we're ready to serve again?"
}
}
],
"GlobalConfiguration": {
"BaseUrl": "https://api.ocelot.net",
"RateLimitOptions": {
"RouteKeys": ["R1"], // if undefined or empty array, opts will apply to all routes
"ClientIdHeader": "Oc-Client", // std (default) header name
"Limit": 100,
"Period": "30s", // ms, s, m, h, d
"Wait": "1m", // ms, s, m, h, d
"StatusCode": 429, // Too Many Requests -> standard status
"QuotaMessage": "Ocelot API calls quota exceeded! Maximum admitted {0} per {1}.", // standard template with 2 parameters
"KeyPrefix": "ocelot-rate-limiting" // for caching key
}
}
| Schema Option | Description |
|---|---|
| ClientIdHeader | Specifies the header used to identify clients, with “Oc-Client” set as the default. |
| ClientWhitelist | An array that contains the clients exempt from rate limiting. |
| EnableRateLimiting | Enables or disables rate limiting. Defaults to … |
| EnableHeaders | Specifies whether the X-RateLimit-* headers are included in the response. |
| Limit | The maximum number of requests a client can make within a given time Period. |
| Period | Rate limiting period (fixed window) can be expressed as milliseconds (1ms), as seconds (1s), minutes (1m), hours (1h), or days (1d). If the exact … |
| Wait | Rate limiting wait window (no servicing period) can be expressed as milliseconds (1ms), as seconds (1s), minutes (1m), hours (1h), or days (1d). This option can have shorter or longer durations compared to the fixed window duration specified as Period. |
| StatusCode | The rejection status code returned during the Quota Exceeded period*. Default value: 429 (Too Many Requests). |
| QuotaMessage | Specifies the message displayed when the quota is exceeded. The value is used as the format template for the Quota Exceeded* response message. If none is specified, an informative default message is used. |
| KeyPrefix | The counter prefix, used to compose the rate limiting counter caching key to be used by the … |
“Quota Exceeded period” term
The Quota Exceeded period refers to the Wait window, if defined, or the remaining duration of the fixed Period following the moment the request Limit is exceeded.
During this time, the configured rejection StatusCode is returned, and the formatted QuotaMessage is written to the response body.
To determine when this period ends, clients should inspect the Retry-After header, which provides a floating-point value indicating the number of seconds until the next allocated fixed window begins.
The X-RateLimit-* headers are included in the response during the Quota Exceeded period, provided that headers are enabled via the EnableHeaders option.
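As an illustration of the rejection artifacts described above, the following Python sketch assembles the status code, the formatted QuotaMessage body, and the Retry-After header. The function name `build_rejection` and the three-decimal formatting of Retry-After are assumptions made for the example; the default template is the "standard template with 2 parameters" from the configuration sample, and the X-RateLimit-* headers are left out because their exact names are not given in this section.

```python
DEFAULT_TEMPLATE = "Ocelot API calls quota exceeded! Maximum admitted {0} per {1}."


def build_rejection(limit, period, retry_after_seconds,
                    status_code=429, quota_message=DEFAULT_TEMPLATE):
    """Return (status, headers, body) for a request rejected over quota."""
    # {0} receives the Limit and {1} the Period, as in the sample templates.
    body = quota_message.format(limit, period)
    # Retry-After carries a floating-point number of seconds.
    headers = {"Retry-After": f"{retry_after_seconds:.3f}"}
    return status_code, headers, body
```

Calling `build_rejection(1000, "20s", 90.0, status_code=418, quota_message="Only {0} cups every {1}.")` models a route that overrides both the status code and the message.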
Note 1: If the RouteKeys option is not defined or the array is empty in the global RateLimitOptions, the global settings will apply to all routes. If the array contains route keys, it defines a single group of routes to which the global options apply. Routes excluded from this group must specify their own route-level RateLimitOptions.
Note 2: The string values for the Period and Wait options must contain a floating-point number followed by one of the supported time units: ‘ms’, ‘s’, ‘m’, ‘h’, or ‘d’. If no unit is specified, the value defaults to milliseconds. For example, “333.5” is interpreted as 333 milliseconds and 500 microseconds (equivalent to “333.5ms”). The floating-point component may be omitted; for example, “10.0s” is equivalent to “10s”. These values are parsed dynamically at runtime, so the required Period option in ocelot.json is validated early through fluent validation when the Ocelot app starts. If an invalid value is provided, the Rate Limiting middleware will throw a FormatException, which is logged accordingly.
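The parsing rules for Period and Wait values can be modeled in a few lines. The sketch below is a Python approximation of the described behavior, not Ocelot's parser: `parse_duration` is a hypothetical name, the regular expression is an assumption about what counts as valid input, and a Python `ValueError` stands in for the .NET FormatException.

```python
import re
from datetime import timedelta

# Maps each supported unit suffix to the matching timedelta keyword.
_UNITS = {"ms": "milliseconds", "s": "seconds", "m": "minutes",
          "h": "hours", "d": "days"}
_PATTERN = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s|m|h|d)?\s*$")


def parse_duration(text):
    """Parse values such as "20s", "1.5m", or "333.5" into a timedelta."""
    match = _PATTERN.match(text)
    if match is None:
        raise ValueError(f"Invalid duration: {text!r}")
    number, unit = match.groups()
    # A missing unit defaults to milliseconds.
    return timedelta(**{_UNITS[unit or "ms"]: float(number)})
```

Under these assumptions, `parse_duration("333.5")` yields 333 milliseconds and 500 microseconds, and `parse_duration("10.0s")` equals `parse_duration("10s")`.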
Deprecated options [3]¶
Warning
Here are the deprecated options from the old schema:
| Deprecated and Introduced Options | Description |
|---|---|
| DisableRateLimitHeaders -> EnableHeaders | Specifies whether the X-RateLimit-* headers are included in the response. |
| PeriodTimespan -> Wait | This parameter specifies the time, in seconds, after which a retry is allowed. During this interval, the … |
| HttpStatusCode -> StatusCode | Specifies the HTTP status code returned during rate limiting, with a default value of 429 (Too Many Requests). |
| QuotaExceededMessage -> QuotaMessage | Specifies the message displayed when the quota is exceeded. This option is optional, and the default message is informative. |
| RateLimitCounterPrefix -> KeyPrefix | Specifies the counter prefix used to construct the rate limiting counter cache key. |
Notes¶
Note
1. Prior to version 24.1, global options were only accessible in the special Dynamic Routing mode. Since version 24.1, global configuration has been available for both static and dynamic routes. As a team, we would consider the idea of implementing such a global configuration for aggregated routes. However, an aggregated route is essentially a combination of static routes.
2. Global rate limiting options may not be practical as they apply limits to all routes. In a microservices architecture, it is unusual to enforce the same limitations across all routes. Configuring per-route rate limiting could offer a more tailored solution. However, global rate limiting can be logical if all routes share the same downstream hosts, thereby restricting the usage of a single service or a single product.
3. The DisableRateLimitHeaders option is deprecated as of version 24.1.
Use EnableHeaders instead, applying boolean value negation as needed.
If DisableRateLimitHeaders is defined, it takes precedence; otherwise, EnableHeaders will be used.
Do not define both options.
This setting is retained for backward compatibility but is subject to change.
Therefore, the DisableRateLimitHeaders option will be removed in the upcoming major release, version 25.0.
The same applies to other deprecated options.
4. Ocelot’s own rate limiting does not utilize built-in ASP.NET Core features, so it is not based on the “Rate limiting middleware in ASP.NET Core” described in the Roadmap below. The Ocelot team believes that the ASP.NET Core rate limiting middleware enables global limitations through its rate-limiting policies.
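The precedence rule from note 3 (a defined DisableRateLimitHeaders wins over EnableHeaders) can be expressed as a small Python model. The function name `headers_enabled` is hypothetical, and the fallback `default=True` is an assumption, since the default value of EnableHeaders is not stated in this section.

```python
def headers_enabled(options, default=True):
    """Decide whether rate limiting headers are written to the response."""
    # The deprecated option takes precedence whenever it is defined.
    if "DisableRateLimitHeaders" in options:
        return not options["DisableRateLimitHeaders"]
    return options.get("EnableHeaders", default)
```

Migrating a configuration therefore means replacing `"DisableRateLimitHeaders": true` with `"EnableHeaders": false`, and removing the deprecated key so that only one of the two options remains.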
Algorithms¶
The currently implemented rate limiter algorithms in Ocelot are:
Fixed window: Based on the Period option, without the Wait option (previously known as the deprecated PeriodTimespan).
Hybrid fixed window: The combination of Period and Wait enables fixed-window-like behavior with additional control over the duration and handling of the “Quota Exceeded period”.
Historically, Ocelot’s rate limiting algorithm was a hybrid, combining the classic “fixed window” approach with a waiting no-service period. Since version 24.1, the hybrid algorithm has been split into two distinct algorithms, allowing the classic “fixed window” to be used independently.
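To make the difference between the two algorithms concrete, here is a minimal Python model of a single counter. It is a sketch under stated assumptions rather than Ocelot's implementation: the class name `FixedWindowLimiter` is hypothetical, times are plain seconds passed in by the caller, the window opens at the first request, and the Wait window is assumed to start at the first rejected request.

```python
class FixedWindowLimiter:
    """One counter; with wait=None it is a classic fixed window."""

    def __init__(self, limit, period, wait=None):
        self.limit, self.period, self.wait = limit, period, wait
        self.window_start = None
        self.count = 0
        self.blocked_until = None

    def try_acquire(self, now):
        """Return (allowed, retry_after_seconds) for a request at `now`."""
        if self.blocked_until is not None:
            if now < self.blocked_until:
                return False, self.blocked_until - now
            # The Quota Exceeded period is over: begin a fresh window.
            self.blocked_until = None
            self.window_start = None
        if self.window_start is None or now >= self.window_start + self.period:
            self.window_start, self.count = now, 0
        if self.count < self.limit:
            self.count += 1
            return True, 0.0
        # Limit exceeded: block for Wait if defined, else the rest of Period.
        if self.wait is not None:
            self.blocked_until = now + self.wait
        else:
            self.blocked_until = self.window_start + self.period
        return False, self.blocked_until - now
```

With `limit=2, period=10`, a third request at second 2 is rejected for the remaining 8 seconds of the window. Adding `wait=30` rejects it for 30 seconds instead, regardless of how much of the window is left.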
To understand the terminology, please refer to the Handy Articles listed at the beginning of this chapter. For beginners, here is a quick link: Announcing ASP.NET Core rate limiting algorithms. For professionals, we recommend reading the official Microsoft Learn article “Rate limiting middleware in ASP.NET Core”, especially the Rate Limiter Algorithms section, and/or searching the internet for additional resources.
Note 1: Ocelot’s own rate limiter does not implement other classic algorithms such as “Sliding Window”, “Token Bucket”, or “Concurrency”. However, these algorithms are outlined in the Roadmap.
Note 2: Ocelot’s own rate limiter does not manage concurrent HTTP requests via a queue. Therefore, all concurrency handling and decision-making should be implemented on the client side using classic retry patterns to ensure quality of service. The management strategy is deliberately simple: First-In means First Wins. If the first request acquires a virtual lease from the limiting quota and the quota is immediately exceeded, the second request will be rejected with a 429 Too Many Requests response.
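Because rejected requests are not queued, the retry decision falls to the client. The following Python sketch shows one classic retry pattern that honors the Retry-After header. Everything here is illustrative: `call_with_retry` and the `send` callable (returning a status code and a header dictionary) are hypothetical, and `rejection_status` should match the configured StatusCode if it differs from 429.

```python
import time


def call_with_retry(send, max_attempts=3, rejection_status=429,
                    sleep=time.sleep):
    """Call send() until it is not rejected or attempts run out.

    send() must return (status_code, headers); the final status is returned.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers = send()
        if status != rejection_status or attempt == max_attempts:
            return status
        # Retry-After is a floating-point number of seconds; fall back
        # to one second if the header is absent.
        sleep(float(headers.get("Retry-After", 1.0)))
```

Injecting `sleep` keeps the function easy to test and lets callers add jitter or a maximum delay on top of the server-provided value.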
Rules (Partitions)¶
Ocelot’s rate limiting rule is a superset of the configuration options used to set up rate-limited access to a route. It enables partitioned rate limiting by processing the following artifacts through distinct stages: the client’s identifier, a dedicated partition counter (quota), rate limiter algorithms, and the quota-exceeded response behavior.
By Client’s Header¶
Class: FileRateLimitByHeaderRule
JSON: RateLimitOptions Schema
Currently, Ocelot’s own rate limiting middleware supports and processes only the “By Client’s Header” rule (partition), commonly referred to as the “API Key partition” in ASP.NET Core terminology. Ocelot’s rate limiting architecture provides dedicated subpartitions for each route, each with an independent counter for the rate limiter algorithm. Therefore, when client traffic enters the Ocelot pipeline, the current request is processed as follows:
1. Ocelot identifies the route by matching the URL path to the upstream route path, and allows the rate limiting middleware to process the client as part of the route partition.
2. Ocelot’s rate limiting middleware creates the client’s identity based on the configured “By Client’s Header” rule and assigns a dedicated rate limiter counter to that client.
3. The rate limiting middleware executes the configured rate limiter algorithm, specifically the (hybrid) fixed window. Refer to the currently implemented Algorithms for details.
4. If the quota is exceeded, the rate limiting middleware returns appropriate “Quota Exceeded period” artifacts in the response, such as the status code, body message, and headers including Retry-After.
Note
If the client is not successfully identified, the rate limiting middleware blocks the request with a 503 Service Unavailable status and writes an appropriate error message to the response body.
Possible reasons for an empty identity include a missing header or an invalid ClientIdHeader value, as explained in the warning below.
Whitelisted clients (defined via the ClientWhitelist option) are processed without limitation.
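The processing stages and the note above can be summarized in a compact Python model. It is a simplified sketch, not the middleware itself: the function `handle`, the tuple used as counter key, and the reason strings are hypothetical. Window resets are omitted for brevity, a 200 result stands for "forwarded downstream", and header lookup is case-sensitive here although HTTP header names are not.

```python
def handle(route_key, request_headers, options, counters):
    """Return (status_code, reason) for one request.

    `counters` is a plain dict shared across calls.
    """
    header_name = options.get("ClientIdHeader") or "Oc-Client"
    client_id = (request_headers.get(header_name) or "").strip()
    if not client_id:
        # Unidentified clients are blocked with 503 Service Unavailable.
        return 503, "client not identified"
    if client_id in options.get("ClientWhitelist", []):
        return 200, "whitelisted, not counted"
    # One independent counter per route and per client (subpartition).
    key = (options.get("KeyPrefix", ""), route_key, client_id)
    counters[key] = counters.get(key, 0) + 1
    if counters[key] > options["Limit"]:
        return options.get("StatusCode", 429), "quota exceeded"
    return 200, "within quota"
```

Because the key includes both the route and the client identifier, the same client exhausting its quota on one route does not affect its quota on another.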
Warning
Ocelot’s rate limiting middleware is not responsible for validating API keys, also known as client header values, to be read from the configured header (ClientIdHeader option).
Users and developers must register these header values as pre-shared API keys on Ocelot’s side and ensure they are validated before handing control over to the RateLimitingMiddleware.
We recommend implementing a custom middleware to validate API keys (client header values) and injecting it into the Ocelot pipeline using the Middleware Injection feature.
Specifically, the PreErrorResponderMiddleware (position 3) should be overridden, as it is invoked before the RateLimitingMiddleware at position 10.
A more advanced solution may involve using the SecurityMiddleware at position 7, but in this case, users must implement their own ISecurityPolicy service and replace it in the Dependency Injection (DI) container.
To understand the Ocelot pipeline and its middleware positions, refer to the “Ocelot Pipeline Builder” documentation.
Roadmap¶
Feature label: Rate Limiting
Development history: Rate Limiting [4]
Rules: The Ocelot team is considering a redesign of the Rate Limiting feature in light of the “Announcing Rate Limiting for .NET” article by Brennan Conroy, published on July 13th, 2022.
Note
Discover the new rate limiting functionality in ASP.NET Core:
The RateLimiter Class, available since ASP.NET Core 7.0
The System.Threading.RateLimiting NuGet package
The Rate limiting middleware in ASP.NET Core article by Arvin Kahbazi, Maarten Balliauw, and Rick Anderson
As of now, the decision has been made to retain Ocelot’s own RateLimitingMiddleware and extend it with an additional rule that will reference the attached ASP.NET Core rate limiting policy. This new rule is highly likely to be delivered in version 25.0, following the opening of pull request 2188.
Algorithms: In addition to the currently implemented hybrid “Fixed window” algorithm, which is built into Ocelot, the team plans to introduce other industry-standard algorithms, such as “Sliding window”, “Token bucket”, and “Concurrency”, with “Sliding window” to be delivered first. These lightweight algorithms should be easily configurable via JSON by end users who are not .NET developers, in order to avoid writing additional C# source code. Other interesting algorithms are welcome for discussion.
We encourage you to share your thoughts with us in the Discussions of the repository.
Filter the current discussions by the Rate Limiting label.