Protecting Your APIs with Rate Limiting in ASP.NET


It’s 3 a.m., and your phone lights up with a flood of alerts. Your API is under siege — a single IP address is hammering it with 10,000 requests every second. Servers buckle under the load, paying customers are suddenly locked out, and every passing minute translates into potential revenue loss.

Scenarios like this are unfortunately common, and the frustrating part is that many could have been prevented with one simple feature: rate limiting.

When configured properly, rate limiting protects your service from abuse while ensuring that genuine users continue to enjoy a smooth experience. It acts as both a shield and a gatekeeper — filtering out excessive or malicious traffic before it reaches your core application.

In this article, we’ll explore what rate limiting is, why it matters, the algorithms behind it, and most importantly, how to implement it in ASP.NET Core using the built-in middleware available in .NET 7 and later.

What Is Rate Limiting?

At its core, rate limiting defines how many requests a given client can make to your API within a certain period. Think of it like the doorman at an exclusive club. Everyone is welcome — but only at a controlled pace. Step outside the limits, and you’ll be asked to wait.

This simple mechanism provides tremendous benefits:

  • Prevents abuse: Stops denial-of-service attacks before they take down your systems.

  • Protects resources: Keeps servers responsive, even under pressure.

  • Fair allocation: Ensures no single client hogs all the bandwidth.

  • Improves reliability: Smooths out traffic spikes, creating a more predictable experience.

Of course, no solution is without drawbacks. Overly strict limits may accidentally block real users during traffic surges, distributed deployments require extra complexity, and there’s always a slight performance cost (though in practice, modern implementations are highly efficient).

Common Rate Limiting Algorithms

Not all rate limiting strategies are alike. Here are the most common approaches you’ll encounter:

  • Fixed Window: Allows X requests per fixed time frame. Easy to implement but prone to bursts at window boundaries.

  • Sliding Window: Continuously evaluates activity over the last N seconds, avoiding sharp cutoff issues. Smoother but requires more memory.

  • Token Bucket: Clients “spend” tokens with each request. Tokens replenish gradually, making it ideal for workloads that see short bursts of activity.

  • Concurrency Limiting: Focuses on the number of concurrent requests rather than totals. Particularly useful when protecting expensive operations such as database-heavy endpoints.
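To make the token bucket idea concrete, here is a simplified, single-threaded sketch in C#. It is an illustration of the algorithm, not the production implementation used by ASP.NET Core; the class and parameter names are invented for this example.

```csharp
using System;

// Simplified token bucket: each request spends one token;
// tokens replenish at a fixed rate, capped at a maximum capacity.
public class TokenBucket
{
    private readonly int _capacity;
    private readonly double _refillPerSecond;
    private double _tokens;
    private DateTime _lastRefill;

    public TokenBucket(int capacity, double refillPerSecond)
    {
        _capacity = capacity;
        _refillPerSecond = refillPerSecond;
        _tokens = capacity;          // start full
        _lastRefill = DateTime.UtcNow;
    }

    public bool TryAcquire(DateTime now)
    {
        // Replenish tokens for the elapsed time, capped at capacity.
        var elapsed = (now - _lastRefill).TotalSeconds;
        _tokens = Math.Min(_capacity, _tokens + elapsed * _refillPerSecond);
        _lastRefill = now;

        if (_tokens >= 1)
        {
            _tokens -= 1;
            return true;  // request allowed
        }
        return false;     // bucket empty: reject
    }
}

public class Program
{
    public static void Main()
    {
        var start = DateTime.UtcNow;
        var bucket = new TokenBucket(capacity: 5, refillPerSecond: 1);

        // A burst of 5 requests succeeds, the 6th is rejected...
        for (int i = 0; i < 5; i++)
            Console.WriteLine(bucket.TryAcquire(start)); // True, five times
        Console.WriteLine(bucket.TryAcquire(start));     // False

        // ...but after 2 seconds, 2 tokens have been replenished.
        Console.WriteLine(bucket.TryAcquire(start.AddSeconds(2))); // True
    }
}
```

Note how the burst capacity (the bucket size) is decoupled from the sustained rate (the refill speed) — exactly the property that makes this algorithm suit bursty workloads.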

Rate Limiting in Action

To illustrate the difference, let’s compare a server under heavy load:

  • Without limits: The server is overwhelmed, latency skyrockets, and the system may crash.

  • With limits: Only a controlled portion of requests is allowed through, while the rest are rejected with a clear 429 Too Many Requests response. The service remains stable and responsive.

Setting Up Rate Limiting in ASP.NET Core

The good news is that starting with ASP.NET Core 7, rate limiting is available out of the box — no third-party packages required.

Basic Fixed Window Example

Here’s how to add a simple fixed window limiter:

using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Add rate limiting services
builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("MyApiPolicy", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 100; // 100 requests per minute
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;
    });

    options.OnRejected = async (context, token) =>
    {
        context.HttpContext.Response.StatusCode = 429;
        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Try again later.", token);
    };
});

var app = builder.Build();

// Enable rate limiting middleware
app.UseRateLimiter();

// Apply rate limiting to specific endpoints
app.MapGet("/api/products", () => "Here are your products!")
    .RequireRateLimiting("MyApiPolicy");

app.Run();

This setup restricts clients to 100 requests per minute and ensures excess requests are politely rejected.

Advanced Scenarios

The built-in middleware allows more sophisticated strategies:

  • Sliding Window: Smooths out traffic spikes by breaking a window into segments.

  • Token Bucket: Perfect for APIs that need to allow short bursts of legitimate traffic.

  • Per-User Policies: Assign different rules to anonymous, standard, or premium users.

  • Controller-Level Policies: Apply limits directly via attributes on controllers or actions.

These options make it easy to tailor limits to your unique needs.

Sliding Window for Smoother Traffic Example

builder.Services.AddRateLimiter(options =>
{
    options.AddSlidingWindowLimiter("MySlidingPolicy", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 100;
        opt.SegmentsPerWindow = 6; // 10-second segments
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;
    });
});

Token Bucket for Burst Traffic

builder.Services.AddRateLimiter(options =>
{
    options.AddTokenBucketLimiter("MyBurstPolicy", opt =>
    {
        opt.TokenLimit = 100;
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;
        opt.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        opt.TokensPerPeriod = 20; // Add 20 tokens every 10 seconds
        opt.AutoReplenishment = true;
    });
});
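The concurrency limiting approach from the algorithms list is also supported by the built-in middleware. A minimal sketch (the policy name and numbers are illustrative):

```csharp
builder.Services.AddRateLimiter(options =>
{
    options.AddConcurrencyLimiter("MyConcurrencyPolicy", opt =>
    {
        opt.PermitLimit = 5;  // at most 5 requests in flight at once
        opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        opt.QueueLimit = 10;  // queue up to 10 more before rejecting
    });
});
```

Unlike the time-based limiters, this one cares only about how many requests are executing simultaneously, which is why it suits expensive, long-running endpoints.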

Per-User Rate Limiting

You might want different limits for different types of users:

builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("MyPerUserPolicy", httpContext =>
    {
        var userId = httpContext.User?.FindFirst("sub")?.Value ?? "anonymous";

        return RateLimitPartition.GetFixedWindowLimiter(userId, _ =>
            new FixedWindowRateLimiterOptions
            {
                PermitLimit = GetUserLimit(userId),
                Window = TimeSpan.FromMinutes(1)
            });
    });
});

static int GetUserLimit(string userId)
{
    return userId switch
    {
        "anonymous" => 10,                      // Anonymous users: 10 req/min
        var id when IsPremiumUser(id) => 1000,  // Premium: 1000 req/min
        _ => 100                                // Regular users: 100 req/min
    };
}

// Placeholder: look up the user's tier in your own user store
static bool IsPremiumUser(string userId) => false;
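Attaching the per-user policy to an endpoint works the same way as before; for example (the endpoint path is illustrative):

```csharp
app.MapGet("/api/orders", () => "Here are your orders!")
    .RequireRateLimiting("MyPerUserPolicy");
```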

Controller-Level Rate Limiting

You can also apply rate limits at the controller level:

using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.RateLimiting;

[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("MyApiPolicy")]
public class ProductsController : ControllerBase
{
    [HttpGet]
    public IActionResult GetProducts()
    {
        return Ok(new { Message = "Here are your products!" });
    }

    [HttpPost]
    [EnableRateLimiting("MyStrictPolicy")] // Different policy for POST
    public IActionResult CreateProduct([FromBody] Product product)
    {
        return Ok(new { Message = "Product created!" });
    }
}

// Minimal model for illustration
public record Product(string Name, decimal Price);
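The "MyStrictPolicy" referenced on the POST action is assumed to be registered alongside the other policies; a hypothetical registration with a much tighter fixed window might look like:

```csharp
builder.Services.AddRateLimiter(options =>
{
    // Stricter policy for write operations (name and limits are illustrative)
    options.AddFixedWindowLimiter("MyStrictPolicy", opt =>
    {
        opt.Window = TimeSpan.FromMinutes(1);
        opt.PermitLimit = 10; // far fewer writes than reads
    });
});
```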

Real-World Benchmarks

Testing under load highlights the benefits. In one simulation with a 100-requests-per-minute cap:

  • Without limits: Success rate dropped to ~20%, and the API crashed within 30 seconds.

  • With limits: 95% of requests succeeded, 5% were rejected gracefully, and the system stayed stable.

Different algorithms also show varying performance characteristics. Fixed Window tends to use the least memory, Sliding Window offers smoother fairness, Token Bucket handles bursts best, and Concurrency Limiting keeps latency lowest for resource-intensive calls.

Distributed Scenarios with Redis

In cloud environments, or when running multiple instances of your API behind a load balancer, rate limiting must be distributed. Otherwise, each instance keeps its own counters, and a client can exceed the intended limit simply by having its requests spread across servers. A common solution is to store counters in Redis. With StackExchange.Redis, you can implement logic to track requests across instances using atomic scripts.

This ensures that all clients face consistent enforcement, no matter which server instance they hit.

Add the NuGet package:

dotnet add package StackExchange.Redis

And consume it like this:

using StackExchange.Redis;

public class RedisRateLimitService
{
    private readonly IDatabase _database;
    
    public RedisRateLimitService(IConnectionMultiplexer redis)
    {
        _database = redis.GetDatabase();
    }
    
    public async Task<bool> IsAllowedAsync(string key, int limit, TimeSpan window)
    {
        var script = @"
            local current = redis.call('GET', KEYS[1])
            if current == false then
                redis.call('SET', KEYS[1], 1)
                redis.call('EXPIRE', KEYS[1], ARGV[2])
                return 1
            else
                local count = tonumber(current)
                if count < tonumber(ARGV[1]) then
                    redis.call('INCR', KEYS[1])
                    return 1
                else
                    return 0
                end
            end";
        
        var result = await _database.ScriptEvaluateAsync(
            script, 
            new RedisKey[] { key }, 
            new RedisValue[] { limit, (int)window.TotalSeconds }
        );
        
        return result.ToString() == "1";
    }
}
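To wire this into the pipeline, the multiplexer and service can be registered with DI and consulted from a small piece of middleware. This is a hedged sketch: the connection string, key scheme, and limits are illustrative, and a production version would also set rate limit headers as shown later.

```csharp
using StackExchange.Redis;

builder.Services.AddSingleton<IConnectionMultiplexer>(
    _ => ConnectionMultiplexer.Connect("localhost:6379")); // adjust for your environment
builder.Services.AddSingleton<RedisRateLimitService>();

// Simple middleware: 100 requests per minute per client IP, enforced across all instances
app.Use(async (context, next) =>
{
    var limiter = context.RequestServices.GetRequiredService<RedisRateLimitService>();
    var key = $"rl:{context.Connection.RemoteIpAddress}";

    if (!await limiter.IsAllowedAsync(key, limit: 100, window: TimeSpan.FromMinutes(1)))
    {
        context.Response.StatusCode = 429;
        await context.Response.WriteAsync("Too many requests. Try again later.");
        return;
    }

    await next();
});
```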

Enhancing the User Experience

Good rate limiting is not just about blocking traffic — it’s also about communicating clearly. Adding headers such as X-RateLimit-Limit, X-RateLimit-Remaining, and Retry-After helps developers understand their current status and when they can safely retry:

 

options.OnRejected = async (context, token) =>
{
    var response = context.HttpContext.Response;
    response.StatusCode = 429;
    response.Headers["X-RateLimit-Limit"] = "100";
    response.Headers["X-RateLimit-Remaining"] = "0";
    response.Headers["X-RateLimit-Reset"] =
        DateTimeOffset.UtcNow.AddMinutes(1).ToUnixTimeSeconds().ToString();
    response.Headers["Retry-After"] = "60";

    await response.WriteAsync(
        "Rate limit exceeded. Try again in 60 seconds.", token);
};

You can also apply IP-based policies for public APIs or create user-tiered limits to reward premium subscribers with higher allowances:

options.AddPolicy("MyIpPolicy", httpContext =>
{
    var ipAddress = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";

    return RateLimitPartition.GetFixedWindowLimiter(ipAddress, _ =>
        new FixedWindowRateLimiterOptions
        {
            PermitLimit = 100,
            Window = TimeSpan.FromMinutes(1)
        });
});

Pitfalls to Avoid

While powerful, rate limiting can backfire if misused. Common mistakes include:

  • Setting limits too aggressively, frustrating real users.

  • Forgetting to account for multiple user tiers.

  • Not implementing distributed coordination in multi-instance setups.

  • Providing vague or unhelpful error responses.

Always balance protection with usability. Rate limiting should enhance reliability, not become an obstacle.

Key Takeaways

Rate limiting is a fundamental safeguard for any production API. Thanks to the middleware included in ASP.NET Core 7 and later, implementation has never been simpler.

Here’s a quick guide to choosing the right algorithm:

  • Fixed Window: Best when you need simplicity and minimal memory usage.

  • Sliding Window: Use when fairness and smooth traffic flow matter.

  • Token Bucket: Ideal when your users occasionally generate bursts of traffic.

  • Concurrency: Protects heavy operations by limiting simultaneous calls.

Finally, remember to:

  • Enable distributed coordination for multi-server deployments behind a load balancer.

  • Provide clear feedback to clients through error messages and headers.

  • Test thoroughly under realistic loads.

  • Monitor the effect on both performance and user experience.

By applying these principles, you can protect your systems, keep your customers happy, and sleep soundly — even when your API faces its next 3 a.m. traffic surge.