Adam Cassar

Co-Founder

Rate limits are a useful control for protecting a web application from abuse. When setting them for a web application, the key elements to consider are:

  • What endpoints on the web application require protecting?
  • Do different endpoints require separate handling?
  • What is the normal request rate for the entire application over a time period?
  • How many concurrent connections are typically used by your clients?
  • What errors does your API endpoint return in response to requests?

Before setting those policies, it helps to understand how rate limits protect an application from abuse or misuse, the types of attacks they can reduce, and how the rate limiting algorithm makes decisions.

What kinds of attacks are stopped by rate limiting?

When protecting an application with rate limiting, common attack scenarios include:

  • Brute force and enumeration attacks
  • Denial of Service (DoS) and Distributed Denial of Service (DDoS)
  • Site scraping

What else can rate limiting protect?

Public and authenticated APIs can be subject to both abuse and misuse. Sensible rate limit policies applied to these endpoints help prevent attacks and maintain service availability.

How does rate limiting work with user logins?

A well-designed web application should allow only a limited number of failed login attempts before locking an account and requiring a password reset. This protects individual accounts against brute force attacks. Bots commonly attempt to brute force logins to WordPress and other popular web applications, and determined attackers can also target API login endpoints.

Rate limiting on a login page can be applied to the IP address of a user attempting to log in. By rate limiting by IP address, you can limit both password brute force attacks and simpler username enumeration attempts.

Using Peakhour.IO rate limiting, responses to requests can be monitored and IPs blocked for administrator-defined periods. This saves origin server resources and stops repeated attempts before they reach the application.
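The idea of watching responses and blocking an IP for a set period can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Peakhour's implementation: the class name LoginFailureLimiter, its parameters, and the choice of HTTP 401 and 403 as the signal for a failed login are all hypothetical.

```python
import time
from collections import defaultdict, deque


class LoginFailureLimiter:
    """Block an IP for `block_seconds` after too many failed logins.

    Hypothetical sketch: a failed login is assumed to be any response
    with status 401 or 403.
    """

    FAILURE_STATUSES = (401, 403)  # assumption, adjust to your application

    def __init__(self, max_failures, window_seconds, block_seconds,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self.block_seconds = block_seconds
        self.clock = clock  # injectable so the sketch can be tested
        self.failures = defaultdict(deque)  # ip -> recent failure times
        self.blocked_until = {}             # ip -> time the block expires

    def is_blocked(self, ip):
        """Check this before forwarding a request to the origin."""
        until = self.blocked_until.get(ip)
        if until is None:
            return False
        if self.clock() >= until:
            del self.blocked_until[ip]  # block has expired
            return False
        return True

    def record_response(self, ip, status_code):
        """Call this with the status code of each login response."""
        if status_code not in self.FAILURE_STATUSES:
            return
        now = self.clock()
        recent = self.failures[ip]
        recent.append(now)
        # Forget failures that have aged out of the window.
        while recent and recent[0] <= now - self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self.blocked_until[ip] = now + self.block_seconds
            recent.clear()
```

A proxy in front of the application would call is_blocked(ip) before forwarding a login request and record_response(ip, status) once the origin replies, so blocked attempts never reach the application.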

How could API rate limiting work?

APIs are ubiquitous across the modern web. Single Page Applications (SPAs) can be built almost entirely on REST or GraphQL APIs, while legacy applications often use form submits. Even when browsing this blog, you have consumed a range of APIs.

Because APIs are often publicly available, rate limits are commonly used to reduce abuse. An attacker could script a bot to make a large volume of API calls and leave the service unavailable for other users, causing unplanned downtime. This is a layer 7 DoS or DDoS attack.

APIs

Public and private APIs can be subject to abuse or misuse. Public APIs are discoverable by anyone and can be scripted for data mining or attacks. Rate limiting these endpoints based on fair use policies is commonplace. Tracking request counts inside the endpoint itself can be expensive, so handling it through Peakhour can offload that work from developers.

Overzealous 'good bots'

Peakhour has seen websites where up to 65% of requests come from automated bots. These bots are typically indiscriminate when mining information, and they do not carry the operational cost when your site slows down or fails. Rate limiting good bots separately from your main users helps ensure these crawlers do not stop your site from generating revenue.
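One way to picture separate limits for crawlers is a lookup that picks a policy per class of client. The sketch below is hypothetical throughout: the Policy type, the policy_for function, and the numbers are illustrative assumptions, and matching on the User-Agent header alone is a simplification because that header can be spoofed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    name: str
    limit: int           # requests allowed per window
    window_seconds: int


# Hypothetical numbers, chosen only for illustration.
CRAWLER_POLICY = Policy("crawler", limit=60, window_seconds=60)
DEFAULT_POLICY = Policy("default", limit=600, window_seconds=60)

# Substrings that some well-known crawlers include in their User-Agent.
CRAWLER_MARKERS = ("googlebot", "bingbot")


def policy_for(user_agent):
    """Pick a rate limit policy from the User-Agent header.

    Assumption: the header is trusted as-is. A real deployment would
    verify that the client really is the crawler it claims to be.
    """
    ua = (user_agent or "").lower()
    if any(marker in ua for marker in CRAWLER_MARKERS):
        return CRAWLER_POLICY
    return DEFAULT_POLICY
```

Keeping crawler traffic under its own, tighter policy means a burst of indexing does not consume the allowance intended for regular visitors.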

How is rate limiting implemented?

Rate limiting is typically implemented using one of several common algorithms:

Fixed window

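Window-based rate limiting is the simplest to understand. Fixed window limits are easy to define, such as 5,000 requests per 60 minutes. The drawback is that nothing spreads requests across the window: all 5,000 requests could arrive in the first 5 minutes and overwhelm a service. Spikes can also straddle the edges of a window, because a client that uses its full allowance at the end of one window receives a fresh allowance as soon as the next window starts.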
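A fixed window counter can be sketched as follows. This is a minimal in-memory illustration, not a production design: the class name FixedWindowLimiter and its parameters are hypothetical, and windows are aligned to the injected clock rather than to wall-clock hours.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per client in each fixed window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the sketch can be tested
        self.state = {}     # client -> (window index, request count)

    def allow(self, client):
        # Every timestamp maps to exactly one window index.
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self.state.get(client, (window, 0))
        if seen_window != window:
            count = 0  # a new window has started, so reset the counter
        if count >= self.limit:
            self.state[client] = (window, count)
            return False
        self.state[client] = (window, count + 1)
        return True
```

With a limit of 2 requests per 60 seconds, two requests at second 59 and two more at second 60 are all allowed, because they fall into different windows. That is the edge spike in miniature.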

Sliding window

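A sliding window keeps much of the simplicity of a fixed window, but measures the limit over a rolling period that ends at the current request instead of over fixed intervals. Because the count never resets all at once, bursts at window edges are smoothed out.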
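One common way to build a sliding window is to keep a log of recent request times per client. The sketch below assumes that approach; the class name SlidingWindowLimiter is hypothetical, and other variants exist, such as blending the counts of two adjacent fixed windows.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client in any rolling window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the sketch can be tested
        self.hits = defaultdict(deque)  # client -> accepted request times

    def allow(self, client):
        now = self.clock()
        hits = self.hits[client]
        # Drop requests that have slid out of the rolling window.
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

The log costs one stored timestamp per accepted request, which is the price paid for removing the reset at a window edge.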

Token bucket

A token bucket is an algorithm where tokens are added to a fixed-capacity bucket at a steady rate, and tokens that arrive when the bucket is full are discarded. A token could represent a byte transferred or a hit to an API. When a request is considered for rate limiting, the tokens it needs are removed from the bucket. If the bucket holds enough tokens, the request can proceed. If it does not, the request is considered non-conforming and is dropped. The bucket capacity sets the largest burst that can pass at once, while the refill rate sets the sustained request rate.
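A token bucket for a single client can be sketched as follows. The class name TokenBucket and the parameters capacity and refill_rate are illustrative assumptions, and the sketch refills lazily on each call instead of running a timer, which is a common simplification.

```python
import time


class TokenBucket:
    """Token bucket: spend tokens per request, refill at a steady rate."""

    def __init__(self, capacity, refill_rate, clock=time.monotonic):
        self.capacity = capacity
        self.refill_rate = refill_rate  # tokens added per second
        self.clock = clock              # injectable so the sketch can be tested
        self.tokens = float(capacity)   # start with a full bucket
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Add the tokens earned since the last call, capped at capacity.
        earned = (now - self.last) * self.refill_rate
        self.tokens = min(self.capacity, self.tokens + earned)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # non-conforming request
```

Passing a larger cost lets one bucket meter bytes transferred as well as request counts.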

Leaky bucket

Leaky buckets are a mirror image of token buckets. Instead of removing tokens from a bucket, each request adds tokens to it, and tokens drain (leak) out of the bucket at a fixed rate. When a request is considered for rate limiting, the tokens it would add are checked against the space left in the bucket. If the bucket is full, the request is considered non-conforming and is dropped.
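The leaky bucket can be sketched the same way, here used as a meter that accepts or drops requests and does not queue them. The class name LeakyBucket and the parameters capacity and leak_rate are illustrative assumptions, and the leak is calculated lazily on each call.

```python
import time


class LeakyBucket:
    """Leaky bucket: requests fill the bucket, which drains at a fixed rate."""

    def __init__(self, capacity, leak_rate, clock=time.monotonic):
        self.capacity = capacity
        self.leak_rate = leak_rate  # tokens drained per second
        self.clock = clock          # injectable so the sketch can be tested
        self.level = 0.0            # start with an empty bucket
        self.last = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Drain what has leaked out since the last call, never below empty.
        leaked = (now - self.last) * self.leak_rate
        self.level = max(0.0, self.level - leaked)
        self.last = now
        if self.level + cost > self.capacity:
            return False  # bucket would overflow: non-conforming request
        self.level += cost
        return True
```

In this form the two buckets enforce the same kind of limit: capacity bounds the burst and leak_rate bounds the sustained rate.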

If rate limiting is something you need to do to protect and secure your website, reach out to see how we can help.

