Rate limiting

Rate limiting is the primary defence against three threats: credential stuffing on the AR-user sign-in surface, scraping of the principal-side read endpoints, and brute-forcing of the step-up TOTP path on terminal actions. Lending Agent Oversight exposes a fixed API surface, and each route has a ceiling that accommodates human pace and rejects automation.

This page sets the ceilings, the sign-in lockout policy, the recommended storage backend, and the middleware that enforces them.

Ceilings (production)

Limits use a token bucket, keyed per session for authenticated routes and per IP for unauthenticated routes. Each number in the table is the steady-state ceiling; the bucket absorbs bursts of up to twice the ceiling and refills at the ceiling rate.
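The bucket behaviour described above can be sketched in memory as follows. This is a minimal illustration, not the production limiter: the TokenBucket class, its take method, and the caller-supplied clock are hypothetical names introduced here, and a real deployment would hold the state in the shared store rather than in a process-local Map.

```typescript
// Hypothetical in-memory token bucket; all names are illustrative.
interface BucketState {
  tokens: number;
  updatedAt: number; // milliseconds, supplied by the caller
}

class TokenBucket {
  private buckets = new Map<string, BucketState>();
  private ceiling: number;
  private windowMs: number;

  // ceiling: steady-state requests per window; windowMs: window length.
  constructor(ceiling: number, windowMs: number) {
    this.ceiling = ceiling;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed and consumes one token.
  take(key: string, now: number): boolean {
    // Capacity is twice the ceiling, matching the burst allowance.
    const capacity = this.ceiling * 2;
    const state = this.buckets.get(key) ?? { tokens: capacity, updatedAt: now };
    // Refill at the ceiling rate, never above capacity.
    const refill = ((now - state.updatedAt) * this.ceiling) / this.windowMs;
    state.tokens = Math.min(capacity, state.tokens + refill);
    state.updatedAt = now;
    const allowed = state.tokens >= 1;
    if (allowed) state.tokens -= 1;
    this.buckets.set(key, state);
    return allowed;
  }
}
```

With a ceiling of 5 per minute, a fresh key can spend 10 tokens at once and then earns one token back every 12 seconds.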

Method	Path	Limit	Window	Keyed on	Notes
POST	/api/auth/session	5	1 min	IP	Sign-in. Lockout at 10 fails per email per hour, see below.
DELETE	/api/auth/session	30	1 min	session	Sign-out.
POST	/api/auth/session/refresh	60	1 min	session	Slides session expiry.
POST	/api/auth/step-up	10	1 min	session	Re-enter password + TOTP for terminal actions. 3 consecutive TOTP failures revoke the session, see below.
GET	/api/me	60	1 min	session	Current user.
GET	/api/ars	120	1 min	session	Paged list, filtered.
POST	/api/ars	30	1 min	session	Onboard new AR (principal-admin).
GET	/api/ars/:id	240	1 min	session	Higher ceiling for AR detail page polling.
PATCH	/api/ars/:id	30	1 min	session	Optimistic concurrency on `If-Match`.
GET	/api/ars/:id/risk	120	1 min	session	Risk trajectory data.
POST	/api/breaches	30	1 min	session	AR-user files; principal can file on behalf.
GET	/api/breaches	120	1 min	session	Triage queue.
PATCH	/api/breaches/:id	30	1 min	session	Resolution status changes.
POST	/api/breaches/:id/notify-fca	10	1 min	session	Step-up required. SUP 15 notification recorded.
GET	/api/reviews	120	1 min	session	File review list.
POST	/api/reviews	30	1 min	session	Schedule a review (compliance).
PATCH	/api/reviews/:id	60	1 min	session	Inline saving.
POST	/api/reviews/:id/complete	10	1 min	session	Locks findings, recomputes AR score.
GET	/api/mi-returns	120	1 min	session	List.
POST	/api/mi-returns	10	1 min	session	Idempotent on `(arId, period)`.
POST	/api/mi-returns/:id/submit	10	1 min	session	Recomputes anomaly score.
GET	/api/annual-reviews	60	1 min	session	List.
POST	/api/annual-reviews/:id/sign-off	10	1 min	session	Step-up required. Terminal.
GET	/api/audit	60	1 min	session	Read-only chain.
POST	/api/audit/export	5	1 hour	session	Bounded so a compromised session cannot exfiltrate the entire tenant in seconds.
GET	/api/health	600	1 min	IP	-
GET	/api/version	600	1 min	IP	-

Limit-exceeded responses return HTTP 429 with Retry-After set to the seconds remaining in the window. The 429 is also emitted as an audit-log entry carrying the route and the keyed identifier (truncated to 12 characters). A spike of 429s on POST /api/auth/session is the signature of credential stuffing; repeated failures against one email trigger the lockout below.

Sign-in lockout

Two lockout layers sit on top of the rate ceilings:

Per-email lockout. 10 failed sign-ins for one email address within a rolling hour lock the account. The locked response is the generic “email or password incorrect, please try again later”, which does not reveal whether the email exists. A successful sign-in resets the counter.
Per-session step-up lockout. 3 consecutive TOTP failures on POST /api/auth/step-up revoke the session and force a fresh sign-in.

Lockout state is stored in Redis with a 1-hour TTL. Each lockout transition is itself an audit event (auth.lockout and auth.unlock).
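The per-email rule can be sketched as a rolling-window failure counter. This is an illustrative in-memory version under stated assumptions: the EmailLockout class and its methods are hypothetical, the email is lower-cased before counting (an assumption not stated above), and the production state would live in Redis rather than a Map.

```typescript
// Hypothetical rolling-hour lockout counter; all names are illustrative.
class EmailLockout {
  private failures = new Map<string, number[]>();
  private maxFailures: number;
  private windowMs: number;

  constructor(maxFailures: number = 10, windowMs: number = 60 * 60 * 1000) {
    this.maxFailures = maxFailures;
    this.windowMs = windowMs;
  }

  // Assumption: emails are compared case-insensitively.
  private normalise(email: string): string {
    return email.trim().toLowerCase();
  }

  // Drops failures older than the rolling window and returns the rest.
  private recent(email: string, now: number): number[] {
    const key = this.normalise(email);
    const kept = (this.failures.get(key) ?? []).filter((t) => t > now - this.windowMs);
    this.failures.set(key, kept);
    return kept;
  }

  // Records a failed sign-in; returns true if the account is now locked.
  recordFailure(email: string, now: number): boolean {
    const kept = this.recent(email, now);
    kept.push(now);
    return kept.length >= this.maxFailures;
  }

  isLocked(email: string, now: number): boolean {
    return this.recent(email, now).length >= this.maxFailures;
  }

  // A successful sign-in resets the counter.
  recordSuccess(email: string): void {
    this.failures.delete(this.normalise(email));
  }
}
```

The caller would return the same generic error whether isLocked is true or the password is simply wrong, so the response never distinguishes the two cases.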

Storage backend

The limiter needs an atomic counter with a TTL. Two options on Vercel:

Upstash Ratelimit with Upstash Redis. First-class on Vercel, sub-millisecond reads in-region. Fixed-window, sliding-window, and token-bucket algorithms.
Vercel KV (Redis-backed) with a hand-rolled INCR-with-EXPIRE pattern.

Upstash Ratelimit is the default recommendation because its API is purpose-built and avoids the race conditions that creep into hand-rolled patterns.
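To make the race concrete, here is a sketch of the hand-rolled INCR-with-EXPIRE fixed window. FakeCounterStore is a hypothetical synchronous stand-in for Redis, used only so the example runs on its own; real Redis calls are asynchronous and the clock would not be passed in by the caller.

```typescript
// Hypothetical in-memory stand-in that models Redis INCR and EXPIRE.
class FakeCounterStore {
  private entries = new Map<string, { value: number; expiresAt: number | null }>();

  // Increments the counter, creating it at 1 if absent or expired.
  incr(key: string, nowSec: number): number {
    const entry = this.entries.get(key);
    if (!entry || (entry.expiresAt !== null && entry.expiresAt <= nowSec)) {
      this.entries.set(key, { value: 1, expiresAt: null });
      return 1;
    }
    entry.value += 1;
    return entry.value;
  }

  // Sets a time-to-live on an existing key.
  expire(key: string, ttlSec: number, nowSec: number): void {
    const entry = this.entries.get(key);
    if (entry) entry.expiresAt = nowSec + ttlSec;
  }
}

function allowFixedWindow(
  store: FakeCounterStore,
  key: string,
  limit: number,
  windowSec: number,
  nowSec: number,
): boolean {
  const count = store.incr(key, nowSec);
  // Race: INCR and EXPIRE are two separate commands. If the caller dies
  // between them, the key has no TTL and the counter never resets.
  // Against real Redis the two steps need to run atomically, for
  // example inside a single Lua script.
  if (count === 1) store.expire(key, windowSec, nowSec);
  return count <= limit;
}
```

A purpose-built limiter library handles that atomicity internally, which is the reason given above for preferring it.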

Middleware shape

A single Next.js middleware is applied to the API routes. The routeFor function resolves a request to a limiter name; the keyFor function picks the right key (session id, IP, or email).

import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { NextRequest, NextResponse } from "next/server";

const redis = Redis.fromEnv();

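// One limiter per ceiling class; the prefix keeps each class's Redis keys separate.
// Note: slidingWindow enforces the ceiling strictly. The 2x burst allowance described
// under Ceilings would need Ratelimit.tokenBucket(refillRate, interval, maxTokens).
const limiters = {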
  signin:       new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(5,   "1 m"), prefix: "rl:si"   }),
  stepUp:       new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(10,  "1 m"), prefix: "rl:su"   }),
  meRead:       new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(60,  "1 m"), prefix: "rl:me"   }),
  arList:       new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(120, "1 m"), prefix: "rl:arl"  }),
  arDetail:     new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(240, "1 m"), prefix: "rl:ard"  }),
  write30:      new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(30,  "1 m"), prefix: "rl:w30"  }),
  terminal:     new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(10,  "1 m"), prefix: "rl:t10"  }),
  auditRead:    new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(60,  "1 m"), prefix: "rl:ar"   }),
  auditExport:  new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(5,   "1 h"), prefix: "rl:ae"   }),
} as const;

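// routeFor, keyFor, and emitAudit are application helpers defined elsewhere.
export async function middleware(req: NextRequest) {
  // routeFor returns a key of `limiters`, or a falsy value for routes without a ceiling.
  const route = routeFor(req);
  if (!route) return NextResponse.next();

  // keyFor returns the session id, client IP, or email, depending on the route.
  const key = keyFor(route, req);
  const { success, reset } = await limiters[route].limit(key);
  if (!success) {
    // `reset` is a Unix timestamp in milliseconds; Retry-After is whole seconds, minimum 1.
    const retryAfter = Math.max(1, Math.ceil((reset - Date.now()) / 1000));
    // Only the first 12 characters of the key reach the audit log.
    await emitAudit({ action: "rate.exceeded", route, keyHint: key.slice(0, 12) });
    return new NextResponse("Too many requests", {
      status: 429,
      headers: { "Retry-After": retryAfter.toString() },
    });
  }

  return NextResponse.next();
}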

keyFor picks the right scope: session id from the cookie for authenticated routes, IP from x-forwarded-for for unauthenticated routes, email from the request body for the per-email sign-in counter (kept on a separate prefix).
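One possible shape for the two helpers is sketched below. It is an assumption-heavy illustration: the signatures take plain strings instead of a NextRequest so the example runs on its own, the route table covers only a few rows of the ceilings table, the session id is passed in because the cookie name is not stated here, and the first x-forwarded-for entry is assumed to be the client address as set by the platform.

```typescript
// Hypothetical helpers; simplified signatures, partial route table.
type LimiterName =
  | "signin" | "stepUp" | "meRead" | "arList" | "arDetail"
  | "write30" | "terminal" | "auditRead" | "auditExport";

const routeTable: Array<{ method: string; pattern: RegExp; limiter: LimiterName }> = [
  { method: "POST", pattern: /^\/api\/auth\/session$/, limiter: "signin" },
  { method: "POST", pattern: /^\/api\/auth\/step-up$/, limiter: "stepUp" },
  { method: "GET", pattern: /^\/api\/me$/, limiter: "meRead" },
  { method: "GET", pattern: /^\/api\/ars$/, limiter: "arList" },
  { method: "GET", pattern: /^\/api\/ars\/[^/]+$/, limiter: "arDetail" },
  { method: "PATCH", pattern: /^\/api\/ars\/[^/]+$/, limiter: "write30" },
  { method: "POST", pattern: /^\/api\/breaches\/[^/]+\/notify-fca$/, limiter: "terminal" },
  { method: "GET", pattern: /^\/api\/audit$/, limiter: "auditRead" },
  { method: "POST", pattern: /^\/api\/audit\/export$/, limiter: "auditExport" },
];

// Resolves a method and pathname to a limiter name, or null if no ceiling applies.
function routeFor(method: string, pathname: string): LimiterName | null {
  const hit = routeTable.find((r) => r.method === method && r.pattern.test(pathname));
  return hit ? hit.limiter : null;
}

// Picks the limiter key: IP for sign-in, session id for authenticated routes.
function keyFor(
  limiter: LimiterName,
  sessionId: string | null,
  forwardedFor: string | null,
): string {
  // Assumption: the first x-forwarded-for entry is the client address.
  const ip = (forwardedFor ?? "").split(",")[0]?.trim() || "unknown";
  if (limiter === "signin") return ip;
  // Assumption: fall back to the IP if an authenticated route has no session.
  return sessionId ?? ip;
}
```

Keeping the table as data makes it straightforward to compare against the ceilings table when a route is added.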

Operational notes

The limit keys deliberately use the session id rather than IP for authenticated routes. A user behind a corporate NAT shares an IP with thousands of colleagues; an IP-keyed limit on the principal-side read endpoints would punish legitimate users on a shared egress. The session cookie binds to the right scope.

For unauthenticated routes (sign-in, health, version), the IP is the right key. The sign-in surface combines an IP limit with a per-email lockout because credential stuffing takes two shapes: an attacker can rotate email/password pairs from one IP, or rotate IPs against one email. The IP limit bounds the first shape and the per-email lockout bounds the second.

A separate global circuit breaker is not in scope. If the per-key limits are correct, the global rate is bounded by the per-key rate multiplied by the number of legitimate keys, which scales linearly with the principal’s user estate. Vercel’s edge protection handles the volumetric DDoS layer underneath.

The audit log records 429 events at a sampled rate (1 in 10 within a single window) so that the audit log itself cannot be flooded by an attacker. The full count is captured in the limiter’s metrics, which the operator dashboard surfaces.
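The 1-in-10 sampling could be done deterministically, as in the sketch below. The AuditSampler class and the windowId parameter are hypothetical; the choice to emit the 1st, 11th, 21st event per key and window (rather than sampling at random) is an assumption, and old window counters would need evicting in a long-running process.

```typescript
// Hypothetical deterministic sampler for 429 audit events.
class AuditSampler {
  private counts = new Map<string, number>();
  private sampleEvery: number;

  constructor(sampleEvery: number = 10) {
    this.sampleEvery = sampleEvery;
  }

  // windowId identifies the current limiter window, e.g. Math.floor(nowMs / 60000).
  // Returns true for the 1st event of each key and window, then every Nth after it.
  shouldEmit(key: string, windowId: number): boolean {
    const bucket = key + ":" + windowId;
    const seen = (this.counts.get(bucket) ?? 0) + 1;
    this.counts.set(bucket, seen);
    return (seen - 1) % this.sampleEvery === 0;
  }
}
```

Emitting the first event of each window means a single 429 is always visible in the audit log, while a sustained flood costs one entry per ten rejections.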
