Rate limiting in Next.js — the correct way in 2026

API endpoints with no rate limit are how $150K OpenAI bills happen. Here is how to add proper rate limiting to a Next.js app using Vercel Edge Middleware, Upstash, or your existing Redis.

If any of your Next.js routes costs money per call (LLM inference, SMS, transactional email, image processing), rate limiting is the difference between a normal month and a $50K bill from an angry user or a compromised key. This guide covers three implementations in order of complexity.

What it is

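Rate limiting restricts how many requests a caller can make in a time window. Proper rate limiting has both a global limit (for fairness) and a per-user / per-IP limit (for abuse). A plain fixed-window counter is cheap but lets a caller send up to twice the limit across a window boundary, so modern implementations often approximate a sliding window from two fixed-window counters, weighting the previous window's count by how much of it still overlaps.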
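To make the sliding-window idea concrete, here is a minimal sketch of a sliding-window counter. The names SlidingWindowLimiter and allow are hypothetical, and the state is held in process memory purely for illustration; it is not how any particular library implements it, and production state belongs in a shared store.

```typescript
// Sliding-window counter: estimates the request count over the last
// `windowMs` by weighting the previous fixed window's count by the
// fraction of that window that still overlaps the sliding window.
// Illustration only: state is in memory and entries are never evicted.

interface WindowState {
  windowStart: number; // start of the current fixed window (ms)
  current: number;     // requests counted in the current window
  previous: number;    // requests counted in the window before it
}

class SlidingWindowLimiter {
  private readonly state = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and counts the request if the caller is under the limit.
  allow(key: string, now: number = Date.now()): boolean {
    const windowStart = Math.floor(now / this.windowMs) * this.windowMs;
    let s = this.state.get(key);

    if (!s) {
      s = { windowStart, current: 0, previous: 0 };
      this.state.set(key, s);
    } else if (s.windowStart !== windowStart) {
      // Rolled into a new window. The old count carries over only if the
      // old window is the one immediately before the new one.
      const adjacent = windowStart - s.windowStart === this.windowMs;
      s.previous = adjacent ? s.current : 0;
      s.current = 0;
      s.windowStart = windowStart;
    }

    // Fraction of the previous window still inside the sliding window.
    const overlap = 1 - (now - windowStart) / this.windowMs;
    const estimated = s.previous * overlap + s.current;

    if (estimated >= this.limit) return false;
    s.current += 1;
    return true;
  }
}
```

With a limit of 3 per second, a caller who used all 3 requests in one window gets only 2 more halfway through the next window, because half of the earlier 3 still count against them.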

Vulnerable example

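// No rate limit. Anyone who finds this route can burn $20K of OpenAI spend overnight.
import OpenAI from "openai";

// Reads OPENAI_API_KEY from the environment.
const openai = new OpenAI();
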
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const resp = await openai.chat.completions.create({
    model: "gpt-5",
    messages: [{ role: "user", content: prompt }],
  });
  return Response.json(resp);
}

Fixed example

// Upstash Rate Limit: 20 requests per IP per minute.
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(20, "1 m"),
  analytics: true,
});

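export async function POST(req: Request) {
  // Leftmost x-forwarded-for entry; trust it only when your platform or proxy sets the header.
  // Callers without the header all share the "anon" bucket.
  const ip = req.headers.get("x-forwarded-for")?.split(",")[0]?.trim() || "anon";
  const { success, reset } = await ratelimit.limit(ip);
  if (!success) {
    // reset is a Unix timestamp in milliseconds; Retry-After expects seconds.
    const retryAfter = Math.max(1, Math.ceil((reset - Date.now()) / 1000));
    return new Response("slow down", {
      status: 429,
      headers: { "Retry-After": String(retryAfter) },
    });
  }
  // ...continue
}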

How Securie catches it

Securie's cost-firewall agent identifies every call-site that reaches a paid API (OpenAI, Stripe, Twilio, SendGrid, Resend, and 50+ others). Any call-site without an upstream rate limit becomes a finding.

Checklist

  • Every paid-API endpoint is rate-limited per IP AND per authenticated user
  • Global caps exist per API (daily spend ceiling)
  • Rate-limit state lives in Redis, not in process memory (memory resets on deploy and is not shared across serverless instances)
  • 429 responses include a Retry-After header
  • Rate limiting applies at the CDN edge where possible (Cloudflare, Vercel Edge)
  • Burst allowance is explicit, not accidental
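The global daily spend ceiling from the checklist can be sketched as a small counter that refuses a charge once the day's budget is used up. DailySpendCap and tryCharge are hypothetical names, the per-call cost is assumed to be known (or estimated) in cents before the paid API is called, and the in-memory state is for illustration only.

```typescript
// Global daily spend ceiling, tracked in integer cents to avoid float drift.
// Illustration only: per the checklist, real state belongs in Redis so it
// survives deploys and is shared across instances.
class DailySpendCap {
  private day = "";
  private spentCents = 0;
  private readonly ceilingCents: number;

  constructor(ceilingCents: number) {
    this.ceilingCents = ceilingCents;
  }

  // Returns true and records the charge if it fits under today's ceiling.
  tryCharge(costCents: number, now: Date = new Date()): boolean {
    const today = now.toISOString().slice(0, 10); // UTC date, YYYY-MM-DD
    if (today !== this.day) {
      // First charge of a new UTC day: start the budget over.
      this.day = today;
      this.spentCents = 0;
    }
    if (this.spentCents + costCents > this.ceilingCents) return false;
    this.spentCents += costCents;
    return true;
  }
}
```

A route would call tryCharge before the paid API call and return an error when it comes back false. In Redis the same idea would typically map to an atomic increment on a per-day key with an expiry.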

FAQ

What limit should I pick?

Start with 20 requests per IP per minute for anonymous users and 120 per minute for authenticated users. Adjust based on legitimate usage observed in the first week.

Is IP-based rate limiting enough?

No. Layer per-IP + per-user + per-API-key limits. IP alone is defeated by anyone with a residential-proxy botnet.
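One way to picture the layering is as a list of independent checks, one per identifier, that must all pass before the request proceeds. The sketch below only builds that list; Caller, LimitCheck, and limitChecks are hypothetical names. The 20 and 120 per-minute values are the starting points suggested above, and applying 120 to the API-key layer is an assumption.

```typescript
interface Caller {
  ip: string;
  userId?: string;   // set once the request is authenticated
  apiKeyId?: string; // an identifier for the key, not the secret itself
}

interface LimitCheck {
  key: string;       // identifier handed to the rate limiter
  perMinute: number; // requests allowed per minute for this layer
}

// Builds one check per layer. Prefixes keep the layers from colliding
// in a shared store (an IP and a user id could otherwise look alike).
function limitChecks(caller: Caller): LimitCheck[] {
  const authenticated = caller.userId !== undefined;
  const checks: LimitCheck[] = [
    // Anonymous traffic gets the tighter per-IP limit.
    { key: `ip:${caller.ip}`, perMinute: authenticated ? 120 : 20 },
  ];
  if (caller.userId !== undefined) {
    checks.push({ key: `user:${caller.userId}`, perMinute: 120 });
  }
  if (caller.apiKeyId !== undefined) {
    // Assumption: API keys share the authenticated limit.
    checks.push({ key: `key:${caller.apiKeyId}`, perMinute: 120 });
  }
  return checks;
}
```

Each entry would then be passed to the limiter, and the request rejected with a 429 if any layer fails. A proxy botnet can rotate IPs, but it still hits the per-user and per-key layers.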