Rate limiting
Rate limiting is the primary defence against credential stuffing on the AR-user sign-in surface, scraping of the principal-side read endpoints, and brute-forcing of the step-up TOTP path on terminal actions. Lending Agent Oversight exposes a fixed API surface, and each route has a ceiling that accommodates human pace and rejects automation.
This page sets the ceilings, the sign-in lockout policy, the recommended storage backend, and the middleware that enforces them.
Ceilings (production)
Limits are token-bucket per session for authenticated routes and per IP for unauthenticated routes. All numbers are the steady-state ceiling; bursts up to twice the ceiling are absorbed by the bucket and replenished at the limit rate.
| Method | Path | Limit | Window | Keyed on | Notes |
|---|---|---|---|---|---|
| POST | /api/auth/session | 5 | 1 min | IP | Sign-in. Lockout at 10 fails per email per hour, see below. |
| DELETE | /api/auth/session | 30 | 1 min | session | Sign-out. |
| POST | /api/auth/session/refresh | 60 | 1 min | session | Slides session expiry. |
| POST | /api/auth/step-up | 10 | 1 min | session | Re-enter password + TOTP for terminal actions. 3 consecutive TOTP fails revoke the session and force a fresh sign-in, see below. |
| GET | /api/me | 60 | 1 min | session | Current user. |
| GET | /api/ars | 120 | 1 min | session | Paged list, filtered. |
| POST | /api/ars | 30 | 1 min | session | Onboard new AR (principal-admin). |
| GET | /api/ars/:id | 240 | 1 min | session | Higher ceiling for AR detail page polling. |
| PATCH | /api/ars/:id | 30 | 1 min | session | Optimistic concurrency on If-Match. |
| GET | /api/ars/:id/risk | 120 | 1 min | session | Risk trajectory data. |
| POST | /api/breaches | 30 | 1 min | session | AR-user files; principal can file on behalf. |
| GET | /api/breaches | 120 | 1 min | session | Triage queue. |
| PATCH | /api/breaches/:id | 30 | 1 min | session | Resolution status changes. |
| POST | /api/breaches/:id/notify-fca | 10 | 1 min | session | Step-up required. SUP 15 notification recorded. |
| GET | /api/reviews | 120 | 1 min | session | File review list. |
| POST | /api/reviews | 30 | 1 min | session | Schedule a review (compliance). |
| PATCH | /api/reviews/:id | 60 | 1 min | session | Inline saving. |
| POST | /api/reviews/:id/complete | 10 | 1 min | session | Locks findings, recomputes AR score. |
| GET | /api/mi-returns | 120 | 1 min | session | List. |
| POST | /api/mi-returns | 10 | 1 min | session | Idempotent on (arId, period). |
| POST | /api/mi-returns/:id/submit | 10 | 1 min | session | Recomputes anomaly score. |
| GET | /api/annual-reviews | 60 | 1 min | session | List. |
| POST | /api/annual-reviews/:id/sign-off | 10 | 1 min | session | Step-up required. Terminal. |
| GET | /api/audit | 60 | 1 min | session | Read-only chain. |
| POST | /api/audit/export | 5 | 1 hour | session | Bounded so a compromised session cannot exfiltrate the entire tenant in seconds. |
| GET | /api/health | 600 | 1 min | IP | - |
| GET | /api/version | 600 | 1 min | IP | - |
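
To make the burst rule concrete, here is a minimal in-memory sketch of the token-bucket arithmetic described above: capacity is twice the ceiling, and tokens refill at the ceiling rate. The `Bucket`, `BucketConfig`, `newBucket`, and `take` names are hypothetical and exist only for illustration; this is not the production limiter.

```typescript
// Hypothetical in-memory token bucket; illustrative only, not the production limiter.
interface Bucket {
  tokens: number;    // tokens currently available
  updatedAt: number; // ms timestamp of the last refill
}

interface BucketConfig {
  limit: number;    // steady-state ceiling per window (the "Limit" column)
  windowMs: number; // window length in ms (the "Window" column)
}

function newBucket(cfg: BucketConfig, now: number): Bucket {
  // Capacity is twice the ceiling, so a fresh bucket absorbs a 2x burst.
  return { tokens: cfg.limit * 2, updatedAt: now };
}

function take(bucket: Bucket, cfg: BucketConfig, now: number): boolean {
  const capacity = cfg.limit * 2;
  const elapsed = Math.max(0, now - bucket.updatedAt);
  // Refill at the limit rate: `limit` tokens per window, capped at capacity.
  const refilled = (elapsed * cfg.limit) / cfg.windowMs;
  bucket.tokens = Math.min(capacity, bucket.tokens + refilled);
  bucket.updatedAt = now;
  if (bucket.tokens < 1) return false; // caller should answer 429
  bucket.tokens -= 1;
  return true;
}

// Sign-in row: 5 per minute, keyed on IP.
const signinCfg: BucketConfig = { limit: 5, windowMs: 60000 };
const signinBucket = newBucket(signinCfg, 0);
let burstAccepted = 0;
for (let i = 0; i < 11; i++) {
  if (take(signinBucket, signinCfg, 0)) burstAccepted += 1;
}
```

With the sign-in ceiling of 5 per minute, an idle key can make 10 requests at once, the 11th is rejected, and one further request becomes available every 12 seconds.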
Limit-exceeded responses return HTTP 429 with Retry-After set to the number of seconds remaining in the window. The 429 is also recorded as an audit-log entry (sampled, see Operational notes) with the route and the keyed identifier, truncated to 12 characters. A spike of 429s on POST /api/auth/session is a signature of credential stuffing; repeated failures against a single email trigger the lockout below.
Sign-in lockout
Two layers above the per-IP ceiling on POST /api/auth/session:
- Per-email lockout. 10 failed sign-ins for one email address within a rolling hour lock the account. The locked-out response is a generic “email or password incorrect, please try again later” message that does not reveal whether the email exists. A successful sign-in resets the counter.
- Per-session step-up lockout. 3 consecutive TOTP failures on POST /api/auth/step-up revoke the session and force a fresh sign-in.
Lockout counters are stored in Redis with a 1-hour TTL. Each lockout state change is itself an audit event (auth.lockout and auth.unlock).
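
The per-email rule above can be sketched as a rolling-window failure counter. This is a minimal in-memory illustration, assuming emails are normalised by trimming and lower-casing; the `recordFailure`, `recordSuccess`, and `isLocked` helpers and the in-memory `failures` store are hypothetical stand-ins for whatever the Redis-backed implementation uses.

```typescript
// Hypothetical per-email lockout tracker; an in-memory stand-in for Redis.
const MAX_FAILURES = 10;
const LOCKOUT_WINDOW_MS = 60 * 60 * 1000; // rolling hour

// key -> timestamps (ms) of recent failed sign-ins
const failures: Record<string, number[]> = {};

function lockKey(email: string): string {
  // Assumption: emails are compared case-insensitively.
  return "lock:" + email.trim().toLowerCase();
}

function recentFailures(email: string, now: number): number[] {
  const k = lockKey(email);
  // Drop failures that have aged out of the rolling window.
  const kept = (failures[k] || []).filter((t) => now - t < LOCKOUT_WINDOW_MS);
  failures[k] = kept;
  return kept;
}

function isLocked(email: string, now: number): boolean {
  return recentFailures(email, now).length >= MAX_FAILURES;
}

// Returns true when this failure puts (or keeps) the account in lockout.
function recordFailure(email: string, now: number): boolean {
  const kept = recentFailures(email, now);
  kept.push(now);
  return kept.length >= MAX_FAILURES;
}

// A successful sign-in clears the counter.
function recordSuccess(email: string): void {
  delete failures[lockKey(email)];
}
```

A sign-in handler would check `isLocked` before verifying the password, so a locked account returns the generic message without touching the credential store.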
Storage backend
The limiter needs an atomic counter with a TTL. Two options on Vercel:
- Upstash Ratelimit with Upstash Redis. First-class on Vercel, low-latency reads in-region. Fixed-window, sliding-window, or token-bucket algorithms.
- Vercel KV (Redis-backed) with a hand-rolled INCR-with-EXPIRE pattern.
Upstash Ratelimit is the default recommendation because its API is purpose-built and avoids the race conditions that creep into hand-rolled patterns.
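
For context, the hand-rolled pattern and its best-known race look roughly like the sketch below. `CounterStore`, `MemoryStore`, and `fixedWindowAllow` are hypothetical names; the store is a synchronous in-memory stand-in with a fake clock, whereas a real Redis client is asynchronous.

```typescript
// Hypothetical store shape mirroring Redis INCR and EXPIRE.
interface CounterStore {
  incr(key: string): number;                  // increment, return the new value
  expire(key: string, seconds: number): void; // set a TTL on the key
}

// In-memory stand-in with a fake clock (seconds), advanced by the caller.
class MemoryStore implements CounterStore {
  now = 0;
  private counts: Record<string, number> = {};
  private expiresAt: Record<string, number> = {};

  incr(key: string): number {
    const at = this.expiresAt[key];
    if (at !== undefined && this.now >= at) {
      // The TTL has elapsed: behave as if Redis had evicted the key.
      delete this.counts[key];
      delete this.expiresAt[key];
    }
    const next = (this.counts[key] || 0) + 1;
    this.counts[key] = next;
    return next;
  }

  expire(key: string, seconds: number): void {
    this.expiresAt[key] = this.now + seconds;
  }
}

function fixedWindowAllow(
  store: CounterStore,
  key: string,
  limit: number,
  windowSeconds: number
): boolean {
  const count = store.incr(key);
  // The TTL is set only on the first hit of a window. INCR and EXPIRE are two
  // separate commands: if the process dies between them, the key never
  // expires and the caller stays rate-limited indefinitely.
  if (count === 1) store.expire(key, windowSeconds);
  return count <= limit;
}
```

Against real Redis, the usual remedies are to run both commands atomically (a Lua script or a transaction) or to rely on a library that already does so, which is the reasoning behind the recommendation above.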
Middleware shape
A single Next.js middleware applied to the API routes. The routeFor function resolves a route to a limiter; the keyFor function picks the right key (session id, IP, or email).
```ts
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";
import { NextRequest, NextResponse } from "next/server";

// routeFor, keyFor, and emitAudit are application helpers defined elsewhere.

const redis = Redis.fromEnv();

// One limiter per ceiling class; the prefix keeps each counter namespace separate.
const limiters = {
  signin: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(5, "1 m"), prefix: "rl:si" }),
  stepUp: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(10, "1 m"), prefix: "rl:su" }),
  meRead: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(60, "1 m"), prefix: "rl:me" }),
  arList: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(120, "1 m"), prefix: "rl:arl" }),
  arDetail: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(240, "1 m"), prefix: "rl:ard" }),
  write30: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(30, "1 m"), prefix: "rl:w30" }),
  terminal: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(10, "1 m"), prefix: "rl:t10" }),
  auditRead: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(60, "1 m"), prefix: "rl:ar" }),
  auditExport: new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(5, "1 h"), prefix: "rl:ae" }),
} as const;

export async function middleware(req: NextRequest) {
  // Routes without a limiter pass straight through.
  const route = routeFor(req);
  if (!route) return NextResponse.next();

  const key = keyFor(route, req);
  const { success, reset } = await limiters[route].limit(key);
  if (!success) {
    // reset is a Unix timestamp in milliseconds; Retry-After is in whole seconds.
    const retryAfter = Math.max(1, Math.ceil((reset - Date.now()) / 1000));
    await emitAudit({ action: "rate.exceeded", route, keyHint: key.slice(0, 12) });
    return new NextResponse("Too many requests", {
      status: 429,
      headers: { "Retry-After": retryAfter.toString() },
    });
  }

  return NextResponse.next();
}
```

keyFor picks the right scope: session id from the cookie for authenticated routes, IP from x-forwarded-for for unauthenticated routes, email from the request body for the per-email sign-in counter (kept on a separate prefix).
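
The page does not show routeFor or keyFor, so the following is only one possible shape, written against a plain `RequestLike` object rather than `NextRequest` so it can run on its own. The rule table covers a subset of routes, and the fallback to the IP when no session is present is an assumption. In a real middleware the fields would presumably come from `req.nextUrl.pathname`, `req.headers.get("x-forwarded-for")`, and the session cookie.

```typescript
// Hypothetical sketch of routeFor / keyFor; names and shapes are assumptions.
type RouteName = "signin" | "stepUp" | "meRead" | "arList" | "arDetail" | "auditExport";

interface RequestLike {
  method: string;
  pathname: string;
  forwardedFor: string | null; // raw x-forwarded-for header value
  sessionId: string | null;    // value of the session cookie, if any
}

interface RouteRule {
  method: string;
  pattern: RegExp;
  route: RouteName;
}

// Subset of the ceiling table, for illustration.
const RULES: RouteRule[] = [
  { method: "POST", pattern: /^\/api\/auth\/session$/, route: "signin" },
  { method: "POST", pattern: /^\/api\/auth\/step-up$/, route: "stepUp" },
  { method: "GET", pattern: /^\/api\/me$/, route: "meRead" },
  { method: "GET", pattern: /^\/api\/ars$/, route: "arList" },
  { method: "GET", pattern: /^\/api\/ars\/[^/]+$/, route: "arDetail" },
  { method: "POST", pattern: /^\/api\/audit\/export$/, route: "auditExport" },
];

// Routes keyed on IP because no session exists yet.
const IP_KEYED: RouteName[] = ["signin"];

function routeFor(req: RequestLike): RouteName | null {
  const match = RULES.filter(
    (rule) => rule.method === req.method && rule.pattern.test(req.pathname)
  )[0];
  return match ? match.route : null;
}

function clientIp(req: RequestLike): string {
  // x-forwarded-for can be a comma-separated chain. This takes the first entry,
  // assuming the hosting platform sets the header so it can be trusted.
  const first = (req.forwardedFor || "").split(",")[0] || "";
  return first.trim() || "unknown";
}

function keyFor(route: RouteName, req: RequestLike): string {
  if (IP_KEYED.indexOf(route) !== -1 || !req.sessionId) return clientIp(req);
  return req.sessionId;
}
```

The per-email sign-in counter is left out here because it needs the request body and sits on its own prefix.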
Operational notes
The limit keys deliberately use the session id rather than IP for authenticated routes. A user behind a corporate NAT shares an IP with thousands of colleagues; an IP-keyed limit on the principal-side read endpoints would punish legitimate users on a shared egress. The session cookie binds to the right scope.
For unauthenticated routes (sign-in, health, version), the IP is the right key. The sign-in surface combines an IP limit with a per-email lockout because attackers either rotate email/password pairs from one IP or rotate IPs against one email; the two controls together bound both shapes of attack.
A separate global circuit breaker is not in scope. If the per-key limits are correct, the global rate is the per-key rate multiplied by the number of legitimate keys, which scales linearly with the principal’s user estate. Vercel’s edge protection handles the volumetric DDoS layer underneath.
The audit log records 429 events but at a sampled rate (1 in 10 within a single window) to avoid the audit log itself becoming a target. The full count is captured in the limiter’s metrics, which the operator dashboard surfaces.
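
One way to read the 1-in-10 sampling is to record the 1st, 11th, 21st, and so on exceed event for each route and key inside a window. The sketch below follows that reading, which is an assumption; `shouldAudit` and the in-memory `exceeded` counter are hypothetical, and a real deployment would presumably keep the counter alongside the limiter state.

```typescript
// Hypothetical sampler for rate.exceeded audit events; in-memory for illustration.
const SAMPLE_EVERY = 10;

interface WindowCount {
  windowStart: number; // ms timestamp at which the current window began
  count: number;       // exceed events seen in that window
}

const exceeded: Record<string, WindowCount> = {};

function shouldAudit(route: string, key: string, now: number, windowMs: number): boolean {
  const id = route + "|" + key;
  const windowStart = Math.floor(now / windowMs) * windowMs;
  const current = exceeded[id];
  // Start a fresh count when the window rolls over.
  const entry: WindowCount =
    current !== undefined && current.windowStart === windowStart
      ? current
      : { windowStart, count: 0 };
  entry.count += 1;
  exceeded[id] = entry;
  // Emit for the 1st, 11th, 21st... event so the first breach is always recorded.
  return (entry.count - 1) % SAMPLE_EVERY === 0;
}

// 25 exceed events for one key inside a single one-minute window.
let emitted = 0;
for (let i = 0; i < 25; i++) {
  if (shouldAudit("signin", "203.0.113.7", 1000 + i, 60000)) emitted += 1;
}
```

The middleware would wrap its audit call in this check while still incrementing the limiter metrics on every 429, so the dashboard keeps the full count.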