Feature

API rate limiting on every privileged surface, enforced in code, returning a uniform 429

Per-endpoint rate limits sit in front of authentication, scans, code scans, AI chat, contact submission, bulk import, document upload, domain verification, credential storage, repository connection, finding writes, client creation, invoice creation, and the public scanner tools. The limiter is Redis-backed in production with a per-instance in-memory fallback, and every limited surface returns a uniform 429 (with a Retry-After header on the authentication and AI surfaces) so the client always knows when the gate refused the request.

No credit card required. Free plan available forever.

API rate limiting on every privileged surface, enforced in code

Enterprise security architects, AppSec leads, GRC owners running vendor questionnaires, and CISOs reviewing platform security posture all open with the same questions when they review a SaaS security platform: which endpoints are rate-limited, what the limits are, and what the response looks like when the gate refuses a request. SecPortal answers those questions in code rather than in marketing prose. A single rate-limiter primitive sits in front of every privileged surface, the failure path returns a uniform 429 (with a Retry-After header on the authentication and AI surfaces), and the key strategy is chosen per surface based on whether the right axis to protect is the host, the account, the workspace user, or the email.

The limiter ships in two flavours: an optimistic synchronous check with an asynchronous Redis sync (the default, used on most surfaces) and a strict awaited check (used on the contact form and on any surface where a single burst before Redis catches up would be unacceptable). Both modes use Upstash Redis when the connection environment variables are set and fall back to a per-instance in-memory map otherwise. The two modes return the same result shape, so the API route never has to know which backend served the decision.

Two enforcement modes, one result contract

rateLimit (optimistic with async Redis sync)

Synchronous in-memory check, fire-and-forget Redis increment. Adds zero blocking latency to the hit path.

Returns the in-memory decision immediately so the API route does not pay a Redis round-trip on every call. The Redis INCR runs in the background and synchronises the local counter, so the next request on that instance is refused when the distributed view says the key is over the limit. The trade is a small enforcement lag in exchange for sub-millisecond hit latency on the common path. Used by login, signup, reset-password, scans, code-scans, AI chat, AI workspace chat, document upload, credential writes, repo writes, git OAuth start, git provider config, finding writes, bulk finding import, client creation, invoice creation, domain creation, domain verification, and the public scanner tools.
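The optimistic mode can be sketched as follows. This is a minimal illustration under stated assumptions, not the production limiter: the names `RateLimitResult`, `SharedIncrement`, and `localCounters` are hypothetical, and the shared counter is passed in as a plain function so the example stays self-contained.

```typescript
// Hypothetical names throughout; only the behaviour follows the description above.
interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetAt: number; // epoch milliseconds at which the current window ends
}

interface Entry {
  count: number;
  resetAt: number;
}

// Assumed minimal view of the shared counter (for example a thin wrapper
// around a Redis INCR). Resolves to the distributed count for the key.
type SharedIncrement = (key: string, windowMs: number) => Promise<number>;

const localCounters = new Map<string, Entry>();

function rateLimit(
  key: string,
  limit: number,
  windowMs: number,
  sharedIncrement?: SharedIncrement,
  now: number = Date.now(),
): RateLimitResult {
  // Synchronous fixed-window check against the per-instance map.
  let entry = localCounters.get(key);
  if (!entry || entry.resetAt <= now) {
    entry = { count: 0, resetAt: now + windowMs };
    localCounters.set(key, entry);
  }
  entry.count += 1;
  const current = entry;

  if (sharedIncrement) {
    // Fire and forget: the caller never awaits this promise.
    sharedIncrement(key, windowMs)
      .then((sharedCount) => {
        // Adopt the distributed view when it is ahead of the local one, so
        // the next request on this instance sees the higher count.
        if (sharedCount > current.count) current.count = sharedCount;
      })
      .catch(() => {
        // Shared store unreachable: keep enforcing from the local counter.
      });
  }

  return {
    allowed: current.count <= limit,
    remaining: Math.max(0, limit - current.count),
    resetAt: current.resetAt,
  };
}
```

The decision is returned before the background promise settles, which is where the small enforcement lag comes from: a burst that lands on several instances at once is only seen in full after the shared count has been read back.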

rateLimitStrict (await Redis on every call)

Awaits the Redis round-trip on every call. Adds roughly five to fifteen milliseconds but guarantees the distributed counter is checked before the response.

Used on the contact form endpoint, where the trade flips: a few milliseconds of latency are acceptable, but allowing a burst before Redis catches up is not. The strict variant is appropriate for public-facing surfaces that need a tight ceiling and a deterministic reject the first time the limit is exceeded.
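For contrast, the strict mode might look like the sketch below. The `CounterStore` interface is an assumption standing in for whatever Redis client the deployment uses, and the `fallback` parameter is a hypothetical hook for the in-memory path; neither name comes from the SecPortal codebase.

```typescript
interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetAt: number; // epoch milliseconds
}

// Assumed minimal counter store. In production this could wrap the Redis
// INCR, EXPIRE, and TTL commands; the interface itself is hypothetical.
interface CounterStore {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<void>;
  ttlSeconds(key: string): Promise<number>;
}

async function rateLimitStrict(
  store: CounterStore,
  key: string,
  limit: number,
  windowSeconds: number,
  fallback: () => RateLimitResult,
  now: number = Date.now(),
): Promise<RateLimitResult> {
  const namespaced = `rl:${key}`;
  try {
    // The distributed counter is awaited before any decision is returned.
    const count = await store.incr(namespaced);
    // Only the request that created the key sets the window length.
    if (count === 1) await store.expire(namespaced, windowSeconds);
    const ttl = await store.ttlSeconds(namespaced);
    return {
      allowed: count <= limit,
      remaining: Math.max(0, limit - count),
      resetAt: now + Math.max(0, ttl) * 1000,
    };
  } catch {
    // Store unreachable: degrade to the per-instance decision instead of
    // failing closed.
    return fallback();
  }
}
```

Because every call awaits the store, the first request over the ceiling is refused deterministically, at the cost of one round trip per call.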

Every limited surface, by route, key, and budget

The table below names every surface that runs through the rate limiter, the key strategy it uses, the budget it carries, and the reason the surface needs the gate. The list is sourced from the API routes themselves, so the page and the production code never disagree.

| Surface | Key | Budget | Reason |
| --- | --- | --- | --- |
| POST /api/auth/login | login:<ip> and login:<email> after a failed attempt | 10 attempts per IP, 5 attempts per email, both per 15 minutes | IP gate stops a single host from running a credential spray. Email gate stops a credential-stuffing attacker who is rotating IPs from grinding one account. |
| POST /api/auth/signup | signup:<ip> | 5 signups per IP per 15 minutes | Stops automated workspace creation that would otherwise be used to mine free-tier scan quota or pre-position accounts for abuse. |
| POST /api/auth/reset-password | reset:<ip> | 5 resets per IP per 15 minutes | Stops bulk password-reset spam against known emails. The reset endpoint also returns the same generic response for known and unknown emails, so the limiter is the only feedback signal a probing client gets. |
| POST /api/contact | contact:<ip> (rateLimitStrict) | 3 submissions per IP, awaited against Redis | The contact form is unauthenticated and writes to Resend, so spam carries a per-message cost. Strict awaited mode stops a burst before any email leaves the server. |
| POST /api/scans | scan:<user.id> | Per-user per 15 minutes (composes with plan-based per-month scan quota) | Per-user gate stops a runaway script in a workspace from saturating the scan worker queue. The monthly plan quota stops a workspace from exhausting the tier ceiling; the rate gate stops a single user from spamming requests inside the monthly budget. |
| POST /api/code-scans | code-scan:<user.id> | 10 code scans per user per hour (composes with plan-based per-month code scan quota) | Semgrep runs are CPU and storage heavy. Per-hour gate stops a developer with a noisy script from monopolising the worker for the workspace. |
| POST /api/ai/chat and POST /api/ai/workspace-chat | ai-workspace:<user.id> and ai-chat:<user.id> | 40 messages per user per 15 minutes | Anthropic API spend is metered per token. Per-user gate protects the workspace from a single chatty session draining the AI budget that other members rely on. |
| POST /api/findings/bulk | bulk-import:<user.id> | 5 bulk imports per user per 15 minutes | Bulk import parses third-party scanner output (Nessus, Burp Suite, OWASP ZAP, Semgrep, CSV) and writes many finding rows in one call. Per-user gate stops accidental double-submits and stops a script from importing the same file in a tight loop. |
| POST /api/findings | create-finding:<user.id> | 60 findings per user per 15 minutes | Loose enough for a security engineer to triage a real backlog, tight enough to stop a runaway script that writes findings at machine speed. |
| POST /api/documents/upload | doc-upload:<user.id> | 30 uploads per user per 15 minutes | Stops a runaway client from filling workspace storage in one burst. Composes with the per-plan workspace storage ceiling so the gate that runs first is whichever ceiling is closer. |
| POST /api/credentials | add-credential:<user.id> | Per-user per 15 minutes | Stops bulk credential-write attempts. Credentials are encrypted at rest with AES-256-GCM, but the limiter still gates the upstream write surface so the encrypted storage layer never sees a credential-spam burst. |
| POST /api/repos | add-repo:<user.id> | Per-user per 15 minutes | Each repo write triggers a provider API call against GitHub, GitLab, or Bitbucket. Per-user gate respects the upstream OAuth provider rate limits. |
| POST /api/git/connect/[provider] | git-connect:<user.id> | Per-user per 15 minutes | OAuth dance kicks off the upstream provider redirect flow. Per-user gate stops a script from cycling the redirect endlessly. |
| POST /api/git/provider-config and POST /api/git/provider-config/test | git-provider-config:<user.id> and git-provider-test:<user.id> | Per-user per 15 minutes | Provider credential writes and the test-the-credential round-trip both run against upstream APIs, so the gate respects the upstream surface. |
| POST /api/domains and POST /api/domains/[id]/verify | add-domain:<user.id> and verify-domain:<user.id> | 5 domain writes and verifications per user per 15 minutes | Stops accidental verification-spam against DNS and HTTP probes, and stops a script from cycling the verification endpoint while DNS propagates. |
| POST /api/clients | create-client:<user.id> | 20 clients per user per 15 minutes | Loose enough for an operator migrating a portfolio onto the workspace, tight enough to stop a runaway script from creating thousands of client rows. |
| POST /api/invoices | create-invoice:<user.id> | 20 invoices per user per 15 minutes | Each invoice write hits Stripe Connect. Per-user gate keeps the workspace inside reasonable Stripe write rates and stops accidental double-submits. |
| POST /api/tools/security-headers, POST /api/tools/ssl-check, POST /api/tools/dns-analyze | security-headers:<ip>, ssl-check:<ip>, dns-analyze:<ip> | 10 requests per IP per minute on each tool | The public scanner tools are unauthenticated. Per-IP gate stops a script from using the free utility surface as a DNS or SSL probe rig. |

Key strategy, by surface intent

The right key is the axis the attacker would otherwise rotate. Per-IP keys protect unauthenticated and pre-login surfaces. Per-user keys protect authenticated workspace writes. Per-email keys catch credential-stuffing rotation against one account. Composed keys layer the rate gate with the role gate and the plan quota. The four strategies below cover every surface in the limited-surfaces table above.

Per-IP keys (login, signup, reset, contact, public tools)

Used on unauthenticated surfaces and on credential-stuffing-sensitive surfaces where the attacker has not yet logged in. The IP is read from x-forwarded-for, falling back to the literal string "unknown" if the header is missing, in which case every header-less request shares one bucket. Per-IP gates pair well with strict ceilings because the surface is anonymous and a single host should never need a high burst.
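A per-IP key can be derived along these lines. The helper names are hypothetical, and taking the first comma-separated entry of the header is an assumption about how the proxy populates it; the page only states that the header is read and that the fallback is "unknown".

```typescript
// Structural type so the helper accepts a Fetch API Headers object or any
// test double with a compatible get() method.
interface HeaderReader {
  get(name: string): string | null;
}

// Assumption: the left-most entry of x-forwarded-for is the client address.
function clientIpFromHeaders(headers: HeaderReader): string {
  const forwarded = headers.get("x-forwarded-for");
  const first = forwarded ? forwarded.split(",")[0].trim() : "";
  // A missing or empty header collapses to the shared "unknown" bucket.
  return first !== "" ? first : "unknown";
}

// Builds a key such as "login:203.0.113.7" or "signup:unknown".
function perIpKey(prefix: string, headers: HeaderReader): string {
  return `${prefix}:${clientIpFromHeaders(headers)}`;
}
```

Behind a proxy that appends rather than overwrites the header, the left-most entry is client-controlled, which is why header propagation is worth confirming before relying on per-IP enforcement outside the hosting platform's defaults.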

Per-user keys (workspace write paths)

Used on every authenticated write that has a user identity. The user.id from Supabase Auth becomes the key suffix so the limit follows the user across IPs (laptop at home, laptop on conference Wi-Fi, mobile phone) and across browser sessions. Pairs well with looser ceilings because the user is identifiable and the workspace owner can hold the operator accountable.

Per-email keys (login fallback against credential stuffing)

Used on the login endpoint after a failed credential attempt. If the IP gate is too loose to catch a stuffing attacker who rotates IPs, the per-email gate catches the second axis: the same email tried from many IPs in a short window.
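One way to read the two-axis login gate is sketched below. It is an interpretation rather than the production route: `WindowCounter`, `createLoginGate`, and the choice to count only failed attempts against the email are assumptions, while the budgets (10 per IP, 5 per email, 15 minutes) come from the limited-surfaces table.

```typescript
interface Entry {
  count: number;
  resetAt: number;
}

// Minimal fixed-window counter used only for this illustration.
class WindowCounter {
  private readonly entries = new Map<string, Entry>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Records one hit and reports whether the key is still within budget.
  hit(key: string, now: number): boolean {
    let entry = this.entries.get(key);
    if (!entry || entry.resetAt <= now) {
      entry = { count: 0, resetAt: now + this.windowMs };
      this.entries.set(key, entry);
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }

  // Reports whether the budget is already spent, without recording a hit.
  isBlocked(key: string, now: number): boolean {
    const entry = this.entries.get(key);
    return !!entry && entry.resetAt > now && entry.count >= this.limit;
  }
}

type LoginOutcome = "ok" | "invalid_credentials" | "rate_limited";

function createLoginGate() {
  const windowMs = 15 * 60 * 1000;
  const ipGate = new WindowCounter(10, windowMs);
  const emailGate = new WindowCounter(5, windowMs);

  return function attemptLogin(
    ip: string,
    email: string,
    verify: () => boolean, // stands in for the real credential check
    now: number = Date.now(),
  ): LoginOutcome {
    // First axis: every attempt from this host counts, whatever the account.
    if (!ipGate.hit(`login:${ip}`, now)) return "rate_limited";
    // Second axis: refuse once this email has used up its failure budget.
    const emailKey = `login:${email.toLowerCase()}`;
    if (emailGate.isBlocked(emailKey, now)) return "rate_limited";
    if (verify()) return "ok";
    emailGate.hit(emailKey, now); // only failed attempts count against the email
    return "invalid_credentials";
  };
}
```

Rotating IPs resets the first counter but not the second, so the sixth attempt against the same email inside the window is refused regardless of which host sends it.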

Composed keys (rate gate plus plan quota plus role gate)

The rate gate runs first as a request-shaping primitive, the role gate runs second as an authorisation primitive, and the plan quota runs third as a capacity primitive. The three layers compose without one replacing another: a member with manage_scans can still hit the per-user rate gate on /api/scans, and a Pro-tier workspace can still hit the per-month scan ceiling after the rate gate lets the request through.
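The ordering can be made concrete with a short sketch. The `runGates` function and its callback names are hypothetical; the point is only that each layer is consulted in turn and the first refusal short-circuits the rest.

```typescript
type GateDecision =
  | { ok: true }
  | { ok: false; refusedBy: "rate" | "role" | "quota" };

interface GateChecks {
  rateAllowed: () => boolean; // request shaping, for example the per-user rate gate
  roleAllowed: () => boolean; // authorisation, for example a manage_scans permission
  quotaRemaining: () => number; // capacity, for example scans left this month
}

function runGates(checks: GateChecks): GateDecision {
  // 1. The rate gate runs first, so a refused caller still spends rate budget.
  if (!checks.rateAllowed()) return { ok: false, refusedBy: "rate" };
  // 2. The role gate runs second.
  if (!checks.roleAllowed()) return { ok: false, refusedBy: "role" };
  // 3. The plan quota runs last, so only requests that would otherwise be
  //    allowed pay for the capacity lookup.
  if (checks.quotaRemaining() <= 0) return { ok: false, refusedBy: "quota" };
  return { ok: true };
}
```

Passing the checks as callbacks keeps the later, more expensive lookups from running once an earlier layer has already refused the request.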

Backend behaviour

The limiter is small, deterministic, and resilient. The Redis path uses atomic primitives so concurrent first-hits cannot create a key without a TTL. The in-memory fallback exists so a Redis outage degrades enforcement rather than disabling the API.

  • Production deployments set UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN, and the limiter creates a lazy singleton Redis client on first use
  • Each rate-limit key is namespaced rl: in Redis so the rate counters are easy to grep against other Upstash keyspaces in the same project
  • The Redis path uses atomic INCR and sets EXPIRE only on the first request in a window, so two concurrent first-hits cannot create a key without a TTL
  • The optimistic mode falls back to in-memory enforcement if Redis is unreachable; the strict mode also falls back to in-memory rather than failing closed so a Redis outage does not lock every workspace out of the API
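  • The in-memory fallback is per-instance, so it is exact for single-instance development and degrades gracefully under multi-instance scale-out: each pod enforces the window independently, so the effective ceiling during a Redis outage is the per-key budget multiplied by the number of pods serving that key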
  • Cleanup of expired in-memory entries runs at most every 60 seconds inside the next hit on the limiter, so the in-memory map does not grow unbounded in a long-running pod
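The throttled cleanup described in the last bullet could be implemented roughly as below. This is a sketch with hypothetical names (`createMemoryWindow`, `CLEANUP_INTERVAL_MS`), assuming a fixed-window counter per key.

```typescript
interface Entry {
  count: number;
  resetAt: number; // epoch milliseconds at which the window ends
}

const CLEANUP_INTERVAL_MS = 60 * 1000;

function createMemoryWindow() {
  const entries = new Map<string, Entry>();
  let lastCleanup = 0;

  // Runs inside a hit, never on a timer, and at most once per interval.
  function sweep(now: number): void {
    if (now - lastCleanup < CLEANUP_INTERVAL_MS) return;
    lastCleanup = now;
    entries.forEach((entry, key) => {
      if (entry.resetAt <= now) entries.delete(key);
    });
  }

  // Records one hit and reports whether the key is still within budget.
  function hit(key: string, limit: number, windowMs: number, now: number = Date.now()): boolean {
    sweep(now);
    let entry = entries.get(key);
    if (!entry || entry.resetAt <= now) {
      entry = { count: 0, resetAt: now + windowMs };
      entries.set(key, entry);
    }
    entry.count += 1;
    return entry.count <= limit;
  }

  return { hit, size: () => entries.size };
}
```

Piggybacking the sweep on incoming hits avoids a background timer, which suits serverless instances that may be frozen between requests.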

Failure response contract

Every limited surface returns the same failure shape when the gate refuses a request. The client only has to learn the contract once.

  • 429 status code returned to the client on every limited surface
  • JSON body carries an error string the dashboard can render directly to the user
  • Retry-After header carries the wait time in whole seconds on the authentication and AI endpoints, derived from the remaining TTL on the Redis key or the resetAt timestamp on the in-memory entry
  • No internal counter, no per-window debug fields, and no Redis key names are leaked back to the client
  • Per-route error copy stays consistent ("Too many login attempts. Please try again later." on auth, "Too many AI requests. Please wait a moment before trying again." on AI, and so on) so the dashboard does not need a special case for every endpoint
  • The dashboard, the SDK consumer, and the procurement reviewer all parse the same 429 contract regardless of which limited surface returned it
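The contract above can be expressed as a small builder. The function name and the plain-object return shape are illustrative assumptions; in a route handler the same fields would map onto whatever response type the framework provides.

```typescript
interface RefusalResponse {
  status: 429;
  headers: Record<string, string>;
  body: { error: string };
}

function buildRateLimitRefusal(
  error: string, // per-route copy, safe to show to the user
  resetAt: number, // epoch milliseconds at which the window ends
  includeRetryAfter: boolean, // true on the authentication and AI surfaces
  now: number = Date.now(),
): RefusalResponse {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (includeRetryAfter) {
    // Whole seconds, rounded up so the client never retries early.
    const seconds = Math.max(1, Math.ceil((resetAt - now) / 1000));
    headers["Retry-After"] = String(seconds);
  }
  // No counters, debug fields, or key names are included in the body.
  return { status: 429, headers, body: { error } };
}
```

For example, a login refusal with 90.5 seconds left in the window yields a Retry-After value of 91, while a refusal on a surface without the header carries only the status and the error string.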

Where the rate gate sits in the platform security model

The rate gate is one layer in a composed stack of security primitives. The layers run in a deliberate order, and each layer answers a different question. Reading them as a stack helps the security architect understand which primitive is responsible for which behaviour.

Rate gate (this page)

Request shaping. Caps the rate of calls into a surface so other primitives downstream are not asked to handle a burst they were never sized for.

Role gate

Authorisation. Asks whether the actor is allowed to take the action at all. Runs after the rate gate so a denied actor still pays the same per-IP or per-user budget.

Plan quota gate

Capacity. Asks whether the workspace has remaining monthly budget for the action. Runs after the rate gate and the role gate so the capacity check is only paid by requests that were going to be allowed.

Scan authorization guards

Target validation on scan endpoints. Asks whether the workspace owns the domain being scanned. Layered on top of the rate gate so an unauthorised target is rejected after the per-user scan budget is paid.

MFA gate

Session assurance. Asks whether the session is AAL2 before any privileged action runs. Composes with the rate gate so a session that has not promoted still pays the same rate budget on the surface it tried to hit.

Failure modes the limiter defends against

Each entry below names an abuse mode and the rate-gate primitive that catches it. The list is the practical defence map a security architect reads in a platform security review.

Single host running a credential spray against many accounts

Per-IP gate on POST /api/auth/login at 10 attempts per 15 minutes stops the spray after the first ten guesses from that IP, regardless of which account each guess targets.

Distributed credential stuffing rotating IPs against one account

Per-email gate on POST /api/auth/login at 5 failed attempts per 15 minutes catches the rotation because the email is the same even when the IPs vary.

Free-tier signup abuse to mine scan quota

Per-IP gate on POST /api/auth/signup at 5 signups per 15 minutes makes single-IP free-tier farming uneconomic. Workspace creation also runs through Stripe billing on paid tiers, so the abuse path is limited to the Starter quota.

Runaway script in a workspace flooding /api/findings

Per-user gate at 60 findings per 15 minutes catches a script that writes findings at machine speed without blocking a human security engineer doing a real triage.

Accidental double-submit on bulk finding import

Per-user gate at 5 imports per 15 minutes catches the second click on the import button and the second invocation of the import script, both of which are common operator mistakes when an import takes longer than expected.

Burst of AI chat messages draining the Anthropic token budget

Per-user gate at 40 messages per 15 minutes caps the per-user spend on the AI surface. Pairs with the plan-tier AI report monthly quota for the longer-cycle protection.

Spam on the unauthenticated public scanner tools

Per-IP gate at 10 calls per minute on each of security-headers, ssl-check, and dns-analyze stops a third party from using the free utility surface as a probe rig.

Reset-password spam against known emails

Per-IP gate at 5 resets per 15 minutes stops bulk password-reset abuse. The reset endpoint also returns the same generic response for known and unknown emails so the limiter is the only feedback the probing client sees.

Procurement questions answered up front

Most vendor security questionnaires (SIG, CAIQ, custom enterprise variants) ask the same rate-limiting questions. The Q&A grid below covers the questions that show up on almost every review.

Which endpoints are rate-limited?

Every authenticated write path, every authentication surface, every AI surface, every public unauthenticated surface, and every scan surface. The full list is on the limited-surfaces section of this page, named by HTTP path and key strategy.

What does the failure response look like?

HTTP 429 with a per-route error string in the JSON body. The authentication and AI endpoints additionally carry a Retry-After header with the wait time in whole seconds. The shape is identical on the Redis path and the in-memory fallback path so a client parsing the contract does not need to know which backend served the response.

Is the limiter shared across serverless instances?

Yes when UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN are set. The limiter creates a lazy singleton Redis client and uses atomic INCR with EXPIRE to share counters across every Vercel function instance.
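A lazy singleton of this kind might be wired as in the sketch below. The `RedisLike` interface, the `RedisFactory` type, and the resolver function are hypothetical; the environment is passed in as a plain record so the example does not depend on any runtime global, and the factory stands in for constructing the real Redis client.

```typescript
// Assumed minimal view of a Redis client; only what a counter needs.
interface RedisLike {
  incr(key: string): Promise<number>;
}

type Env = Record<string, string | undefined>;
type RedisFactory = (url: string, token: string) => RedisLike;

function createBackendResolver(env: Env, factory: RedisFactory): () => RedisLike | null {
  let client: RedisLike | null = null;

  return () => {
    const url = env["UPSTASH_REDIS_REST_URL"];
    const token = env["UPSTASH_REDIS_REST_TOKEN"];
    // Either variable missing: the caller uses the per-instance map instead.
    if (!url || !token) return null;
    // Created on first use and then reused for the life of the instance.
    if (!client) client = factory(url, token);
    return client;
  };
}
```

Deferring construction to the first call keeps instances that never hit a limited surface from opening a connection they do not need.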

What happens if Redis is unreachable?

The limiter falls back to per-instance in-memory enforcement rather than failing closed. A Redis outage degrades distributed enforcement to per-pod enforcement; it does not lock the workspace out of the API.

How does the limiter pair with multi-factor authentication?

MFA enforcement is a separate gate that requires session promotion to AAL2 before any privileged action. The rate limiter sits in front of the MFA verify path too, so a credential-stuffing attacker who has guessed a password still cannot brute-force the second factor through unlimited guesses.

How does the limiter pair with role-based access control?

The role check answers whether the actor is allowed to take the action. The rate limiter answers how often the actor can ask the question. A member with manage_scans hits the per-user rate gate on /api/scans the same as any other actor; the role gate decides allowed, the rate gate decides how fast.

How does the limiter pair with plan-based quotas?

Plan quotas are monthly capacity ceilings (50 scans, 20 code scans, 5 verified domains on Pro). Rate limits are short-window request-shaping caps (10 code scans per hour, 5 domain writes per 15 minutes). The two layers compose: a workspace can hit the rate gate inside a month they still have plan budget for, or hit the plan ceiling inside a window they still have rate budget for, whichever comes first.

Does the limiter protect against credential stuffing?

Yes. The login endpoint runs a per-IP gate first, then runs a per-email gate after a failed attempt. The combination defeats both single-IP credential spray and multi-IP credential rotation against one account.

How the limiter composes with the rest of the platform

The rate limiter is one of several composed primitives that protect the workspace surface. Reading them together gives the procurement reviewer the architectural picture rather than a single isolated feature.

The role-based access control layer answers whether the actor is allowed to take the action. The rate gate sits in front of the role gate so a denied actor still pays the same per-IP or per-user budget. The two gates compose without one replacing the other: a member with manage_scans still hits the per-user rate cap on /api/scans, and an unauthorised actor still pays the per-IP rate cap on /api/auth/login before any credential is even compared.

The plan-based limits and quotas layer answers whether the workspace has remaining monthly capacity for the action. Rate limits are the short-window request-shaping cap; plan quotas are the long-window capacity ceiling. A Pro-tier workspace can hit the per-user rate cap on /api/scans inside a month it still has 30 scans of plan budget, or hit the plan ceiling inside a 15-minute window it still has rate budget for, whichever comes first.

The multi-factor authentication layer enforces a session promotion to AAL2 before any privileged action runs. The rate limiter sits in front of the MFA verify path too, so a credential-stuffing attacker who has guessed a password still cannot brute-force the second factor through unlimited guesses.

The scan authorization guards primitive runs after the rate gate on /api/scans and /api/code-scans. The rate gate caps the rate of scan requests; the authorization guard refuses unauthorised targets. The two layers compose so an attacker who burns through the per-user scan budget still cannot scan a domain the workspace does not own, and an authorised operator targeting their own domain still respects the per-user scan rate cap.

Who reads this page

Six audiences arrive at this page with different reading goals. The cards below name the audience and the practical use the page serves for each role.

Internal security architects evaluating SaaS

Read the limited-surfaces section against the SaaS security checklist and the third-party risk assessment template. The named keys, named budgets, and named windows answer the abuse-control question without a sales call.

AppSec leads owning the application security review

Read the failure modes section against the OWASP API Security Top 10. Map each defence to the corresponding API risk and capture the page as evidence in the security architecture review.

GRC and compliance owners

Read the failure contract and the procurement Q&A against the vendor security questionnaire. The 429 plus Retry-After shape and the named keys answer the "how does the vendor protect their API" question that shows up on most questionnaires.

CISOs and security directors

Read the composition-layers section against the security architecture review. The rate gate, role gate, plan quota gate, scan authorization guards, and MFA gate are five composed primitives, and reading them as a stack helps the leadership review understand the platform security model.

Procurement and vendor risk teams

Use the procurement Q&A as a copy-paste answer set for the standard rate-limiting questions on the vendor security questionnaire. The named limits, windows, and Redis behaviour answer most rows on the SIG, CAIQ, and custom questionnaires without follow-up.

Security engineering teams running the SDK

Read the failure-contract section to understand the 429 parse contract. The Retry-After header is set on authentication and AI surfaces in whole seconds; other limited surfaces return a 429 with the error string and the client should respect the standard back-off the SDK wrapper applies.
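On the client side, the parse contract reduces to a small delay calculation. The sketch below is illustrative only: the exponential back-off with a one-second base and a sixty-second cap is an assumed policy, not the behaviour of the SDK wrapper, which this page does not specify.

```typescript
const BASE_DELAY_MS = 1000; // assumed back-off base when no header is present
const MAX_DELAY_MS = 60 * 1000; // assumed cap

// Returns how long to wait before retrying, or null when the response was
// not a rate-limit refusal. `attempt` is the zero-based retry count.
function retryDelayMs(status: number, retryAfter: string | null, attempt: number): number | null {
  if (status !== 429) return null;

  // Authentication and AI surfaces: Retry-After carries whole seconds.
  const raw = retryAfter === null ? "" : retryAfter.trim();
  if (raw !== "") {
    const seconds = Number(raw);
    if (isFinite(seconds) && seconds >= 0) return seconds * 1000;
  }

  // Other limited surfaces: no header, so fall back to exponential back-off.
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
}
```

A caller would read the status and the Retry-After header from the response, wait for the returned number of milliseconds, and then retry.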

Operational sizing for the limiter itself

The limiter has two operational dimensions worth sizing during a deployment review: Redis latency under load, and the in-memory fallback footprint.

  • Redis round-trip latency: the optimistic mode does not block on Redis, so the hit path latency is unaffected by Upstash region or load. The strict mode pays the round trip per call (approximately 5 to 15 milliseconds on a healthy Upstash region in the same continent as the function).
  • In-memory map footprint: each unique key holds a small counter plus a resetAt timestamp. Cleanup runs at most every 60 seconds inside the next hit, so the map size is bounded by the count of active unique keys in the current window rather than by lifetime traffic.
  • Redis outage behaviour: optimistic mode keeps serving traffic from the in-memory fallback. Strict mode also falls back to in-memory rather than failing closed, so a short Redis outage degrades distributed enforcement to per-pod enforcement without locking workspaces out of the API.
  • Per-IP keys read x-forwarded-for and fall back to the literal string "unknown" if the header is missing. This is the right shape on Vercel where the platform writes the header on every inbound request; behind a different proxy the operator should confirm the header propagation before relying on per-IP enforcement.

For the workflow that operationalises the limiter alongside the security testing programme, see the security testing programme management use case. For the audit-evidence layer that records when the limiter refused a privileged action, see the activity log feature. For the tenant boundary the limiter runs under, see the tenant subdomain isolation feature.

For the audience pages that frame the platform-security review in each buyer's own language, see the internal security teams page, the CISO page, and the security architects page.

Honest scope: SecPortal does not currently ship configurable rate-limit policies through the dashboard, customer-owned rate-limit overrides, per-tenant rate-limit isolation beyond the per-user keying already in place, integration with external WAF or API gateway products, native Cloudflare or AWS WAF rule sync, custom Retry-After payloads beyond the standard header, or formal SLA commitments on rate-limit decisions. Rate limits are defined in code, gated in the API, and uniformly enforced through the optimistic Redis path or the in-memory fallback when Redis is unreachable.

Read the rate-limit posture before signing the RFP

Every limited surface is named, keyed, and budgeted in code. Procurement reads the limits and windows once and stops asking the abuse-control question in every follow-up call.
