Pattern: Rate Limit
a.k.a. Quota, Usage Limitation
Context
An API endpoint and the API contract defining operations, messages, and data representations have been established. If required, an API Description has been defined that specifies message exchange patterns and protocol. Clients of the API might have signed up with the provider and, if required, have agreed to the terms and conditions that govern the usage of the endpoint and operations. Alternatively, the offering might not require any contractual relationship, e.g., when offered as an open government data service or during a trial period.
Problem
How can the API provider prevent API clients from excessive API usage?[1]
Forces
When preventing excessive API usage that may harm provider operations or other clients, solutions to the following design issues have to be found:
- Economic aspects
- Performance
- Reliability
- Impact and severity of risks of API abuse
- Client awareness
Pattern forces are explained in depth in the book.
Solution
Introduce and enforce a Rate Limit to safeguard against API clients that overuse the API.
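To make the idea of enforcing a limit concrete, here is a minimal sketch of one possible counting strategy, a fixed-window counter per client. The pattern does not prescribe this strategy; the class name `FixedWindowRateLimit`, its parameters, and the injectable `clock` are assumptions made for illustration only.

```python
import time
from collections import defaultdict


class FixedWindowRateLimit:
    """Allow at most `limit` requests per client in each window of `window_seconds`.

    Hypothetical helper for illustration; not taken from any particular framework.
    """

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the sketch can be exercised without sleeping
        # client_id -> [window_index, requests_counted_in_that_window]
        self._counters = defaultdict(lambda: [None, 0])

    def allow(self, client_id):
        """Return True if the request is within the limit, False if it exceeds it."""
        window = int(self.clock() // self.window_seconds)
        counter = self._counters[client_id]
        if counter[0] != window:
            # A new window has started: reset this client's count.
            counter[0], counter[1] = window, 0
        if counter[1] >= self.limit:
            return False  # the provider would reject the request at this point
        counter[1] += 1
        return True
```

A provider would call `allow(client_id)` before processing each request and reject the request when it returns `False`. Fixed windows are simple but permit bursts around window boundaries, so other counting strategies may suit some APIs better.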
Sketch
A solution sketch for this pattern from pre-book times is:
Example
GitHub uses this pattern to control access to its RESTful HTTP API: Once a Rate Limit is exceeded, subsequent requests are answered with HTTP status code 429 Too Many Requests. To inform clients about the current state of each Rate Limit and to help clients manage their allowance of tokens, custom HTTP headers are sent with each rate-limited response.
The following code listing shows an excerpt of such a rate-limited response from the GitHub API. The API has a limit of 60 requests per hour, of which 59 remain:
GET https://api.github.com/users/misto
HTTP/1.1 200 OK
...
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 59
X-RateLimit-Reset: 1498811560
The X-RateLimit-Reset header indicates the time at which the limit will be reset, expressed as a Unix timestamp.[2]
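A client could use these headers to decide whether to keep sending requests or to wait. The following sketch reads the three headers from the listing above; the function name `rate_limit_state` and the shape of its return value are assumptions for illustration, and the value of `now` is a made-up point in time.

```python
def rate_limit_state(headers, now):
    """Summarize the X-RateLimit-* headers of a response.

    `headers` maps header names to string values; `now` is the current
    time as a Unix timestamp in seconds.
    """
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])
    return {
        "limit": limit,
        "remaining": remaining,
        "exhausted": remaining == 0,
        # Never report a negative wait if the reset time has already passed.
        "seconds_until_reset": max(0, reset_at - now),
    }


# Header values copied from the listing above; `now` is hypothetical.
state = rate_limit_state(
    {
        "X-RateLimit-Limit": "60",
        "X-RateLimit-Remaining": "59",
        "X-RateLimit-Reset": "1498811560",
    },
    now=1498811500,
)
```

With these inputs the client still has 59 requests left and the window resets 60 seconds after the assumed `now`.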
Are you missing implementation hints? Our publications provide them (for selected patterns).
Consequences
The resolution of pattern forces and other consequences are discussed in our book.
Known Uses
Rate Limits are implemented in many public Web APIs:
- The GitHub API v3 limits authenticated requests to 5000 per hour per user. Clients can also make unauthenticated requests, but these are limited to just 60 requests per hour (as can be seen in the example above). In the new GraphQL-based GitHub v4 API, the Rate Limit has become more sophisticated and takes into account the number of queried nodes.
- Open Weather Map calls its rate limits "access limitation" and restricts clients to a certain number of calls per minute, depending on the subscription.
- Rate Limits in Quandl depend on the subscription level; Quandl also limits the number of concurrent requests.
- The Twitter REST API only allows authenticated clients and has Rate Limits divided into 15-minute intervals.
- The Swiss Federal Administration’s registry of companies (“UID-Register”) has a public web service API. The API is free to use but is limited to 20 requests per minute. If the limit is exceeded, a Request_limit_exceeded error is returned.
- The Opendata.ch Transport API and the [timetable.search.ch API](https://timetable.search.ch/api/help) use the pattern too.
- Many API Gateways, such as MuleSoft API Manager, allow developers to introduce Rate Limits. API gateways often also support throttling to further protect the exposed APIs.
- The open Certificate Authority (CA) Let’s Encrypt limits the weekly number of certificates issued per registered domain, but also provides a renewal exemption. Its Automatic Certificate Management Environment (ACME) API also limits the number of accounts that can be registered by a given IP address every hour.
Some Web frameworks provide Rate Limit as an optional feature. For example, the Play-Guard library for the Java/Scala Play Framework provides a basic implementation.
More Information
Related Patterns
The details of a Rate Limit can be part of a Service Level Agreement. A Rate Limit can be dependent on the client’s subscription level, which is further described in the Pricing Plan pattern. In such cases the Rate Limit is used to enforce different billing levels of the Pricing Plan.
To observe individual clients and manage their allowances, the service provider needs to identify the client making a request. Therefore, clients need to present some form of identification (e.g., an API Key, an IP address, or another authentication practice) so that the API provider can do the bookkeeping.
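As a small illustration of this bookkeeping step, the sketch below derives the key under which a request would be counted. It assumes, purely for illustration, that the API Key arrives in a header named `X-API-Key` and that the provider falls back to the client's IP address otherwise; both the header name and the function name `client_identity` are hypothetical.

```python
def client_identity(headers, remote_addr):
    """Return the key under which a request is counted against a Rate Limit."""
    api_key = headers.get("X-API-Key")  # hypothetical header name
    if api_key:
        return "key:" + api_key
    # No API Key presented: fall back to the network address of the caller.
    return "ip:" + remote_addr
```

Prefixing the key with its kind keeps API Keys and IP addresses from colliding in the same counter table.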
A Wish List and a Wish Template can help to ensure that data-bound Rate Limits are not violated.
The current state of the Rate Limit, e.g., how many requests remain in the current billing period, can be communicated via a Context Representation.
The systems management patterns published by Hohpe and Woolf (2003) can help to implement metering and can thus also be used as enforcement points. For example, a Control Bus can be used to increase or decrease certain limits dynamically at runtime.
The Leaky Bucket Counter pattern (Hanmer 2007) offers a possible implementation variant for Rate Limit.
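The following is a minimal sketch of one common reading of the leaky bucket idea: each request raises a counter by one, the counter drains at a constant rate, and a request is rejected when it would push the counter above a capacity. It is not necessarily Hanmer's exact formulation; the class name, parameters, and injectable `clock` are assumptions for illustration.

```python
import time


class LeakyBucketCounter:
    """Counter that fills by one per request and leaks at a constant rate."""

    def __init__(self, capacity, leak_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.leak_per_second = leak_per_second
        self.clock = clock  # injectable so the sketch can be exercised without sleeping
        self._level = 0.0
        self._last = clock()

    def allow(self):
        """Return True if the request fits into the bucket, False otherwise."""
        now = self.clock()
        # Drain the bucket according to the time elapsed since the last call.
        leaked = (now - self._last) * self.leak_per_second
        self._level = max(0.0, self._level - leaked)
        self._last = now
        if self._level + 1 > self.capacity:
            return False  # bucket is full: the request exceeds the Rate Limit
        self._level += 1
        return True
```

Compared with counting requests in fixed intervals, this variant smooths bursts: capacity frees up gradually as the bucket drains rather than all at once at a reset time.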
References
[1] What exactly is deemed excessive needs to be defined by the API provider. A flat rate subscription typically imposes different limitations than a free billing plan. See the Pricing Plan pattern for a detailed discussion of the trade-offs of different subscription models.
[2] Unix timestamps count the number of seconds since January 1st, 1970.