
Vulnerability Scanner Blocking and WAF Allowlisting

A scan that gets silently blocked at the WAF, CDN, or origin produces a clean-looking report with hidden coverage gaps. The fix is not removing the protective layer; it is agreeing scanner identification, scope, and a rate envelope before the scan runs. External scans, authenticated scans, and scheduled rescans all reach the asset through one or more protective layers. Each layer can deny the scanner without producing an obvious failure. The result is a scan that completes, generates output, and looks healthy while the actual asset surface was never tested at depth.

This guide covers how scanners get blocked, where the blocks land in a typical stack, how to allowlist a scanner without weakening production protections, what a defensible preflight checklist looks like, and how SecPortal pairs scanner identification with the engagement record so the allowlist conversation has a single source of truth.

Why scanners get blocked

Protective layers cannot tell from one request whether the traffic is hostile or authorised. The signals they use (request rate, payload variation, repeated probes against scanned paths, fingerprintable user-agents) match both attack patterns and legitimate scanning. The conservative default is denial. Without an allowlist that identifies the scanner explicitly, the scan races against rate limits, signature rules, and IP reputation systems that were tuned to stop exactly the kind of traffic the scanner is generating.

Rate-based denials

A scanner sweeping an asset at hundreds of requests per second hits the same per-IP rate limits a credential-stuffing tool would. The protective layer responds with 429 status codes, captcha challenges, or progressive backoff. The scanner still receives responses, so the run looks alive, but it cannot complete the test plan inside the time budget the rate limit allows.
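The time-budget arithmetic is worth running before the window opens. A minimal sketch, with illustrative function and parameter names rather than any SecPortal API:

```python
# Sketch: does the full test plan fit inside the rate envelope the
# protective layer will tolerate? Numbers and names are illustrative.

def plan_fits_window(request_count: int, allowed_rps: float, window_s: float) -> bool:
    """True when the request plan completes inside the scan window at the
    agreed per-IP rate; False predicts a truncated, partially-covered scan."""
    return request_count / allowed_rps <= window_s
```

For example, a 100,000-request plan at 5 requests per second needs over five hours; squeezing it into a one-hour window guarantees the rate limit wins.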

Signature-based denials

Many scanner payloads (SQL injection probes, XSS payloads, traversal sequences) match WAF signatures by design. The WAF blocks at the request body level. The scanner records the request as sent but never sees the application response that would have indicated whether the underlying issue exists.

Reputation and behaviour denials

Bot-management layers score traffic on header consistency, TLS fingerprints, cookie behaviour, and source reputation. A scanner without browser fidelity scores low on most of those checks and gets routed to challenge or block, often without any signal in the response body that this is what happened.

Origin denials

Some applications maintain their own per-IP throttling or auth-aware lockouts that the CDN and WAF do not see. A scan that gets past the edge can still trip an origin lockout, especially during authenticated testing where repeated session creation or password attempts look like account abuse.

Where the blocks land in a typical stack

Mapping the layers before the scan is the cheapest way to avoid blocked production scans later. Most internet-facing assets sit behind at least three layers; many sit behind five. Each layer needs the allowlist record explicitly, because protective layers do not share allowlists with each other.

Layer | What it sees | Common denial signal
CDN edge | TLS, source IP, cached vs origin route, request rate per IP, geographic origin. | Edge 403, captcha challenge, IP block, geo block.
WAF | Request URL, headers, body, payload signatures, anomaly score per request. | Signature block (403/406), rule-based denial, body inspection block.
Bot management | Header consistency, TLS fingerprint (JA3/JA4), cookie behaviour, source reputation. | Challenge interstitial, JavaScript challenge, silent throttle.
Reverse proxy | Connection count per IP, concurrent request limit, TLS version requirements. | Connection drops, 429 responses, TLS handshake failures.
Application server | Authenticated session behaviour, account lockout policy, per-IP throttling. | Account lockout, session reset, authenticated-route denials.

The allowlist record has to apply at every layer that can deny the scan. A rule on the WAF that does not propagate to the CDN, or a rule on the CDN that the bot-management layer overrides, produces partial blocking that is hard to detect and harder to debug once the scan output already exists.
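The every-layer requirement is cheap to check mechanically. A minimal sketch, assuming the layer names and rule-presence flags are tracked on the engagement record (both are illustrative, not a real API):

```python
# Sketch: flag protective layers that have no explicit allowlist rule.
# Layer names are from the table above; the rules dict is an assumed
# input maintained on the engagement record.

LAYERS = ["cdn", "waf", "bot_management", "reverse_proxy", "app_server"]

def missing_layers(rules: dict[str, bool]) -> list[str]:
    """Return every layer with no explicit allowlist rule. A non-empty
    result predicts exactly the partial-block pattern described above."""
    return [layer for layer in LAYERS if not rules.get(layer, False)]
```

An empty result is the precondition for opening the scan window; any entry here is an asymmetric rule waiting to produce a silent coverage gap.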

Three identification options that survive review

Allowlist rules are only as strong as the identifier they pin against. A rule that allows a generic class of traffic (any request with the word scanner in the user-agent, any connection from a residential IP) is the kind of rule auditors flag because it permits traffic that was not authorised. The pattern that survives review names the scanner specifically and pairs the identifier to the engagement record.

Identifier 1: user-agent plus verified ownership

The scanner sends a documented user-agent (for SecPortal scans, that is SecPortal-Scanner/1.0 and SecPortal-Verifier/1.0 with a public reference URL). The allowlist rule matches the user-agent and is bound to the asset that the workspace has verified through domain ownership. This is the durable identifier because the user-agent does not drift between scans and the ownership record is auditable.
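The matching logic can be sketched in a few lines. This is an illustration of the decision, not a rule in any particular WAF's syntax; the host names are placeholders and the verified-asset set is assumed to come from the workspace's ownership records:

```python
# Sketch: narrow allowlist decision pinning the documented user-agent
# to the verified asset list. Exact match only: substring rules
# ("scanner" anywhere in the UA) are the generic-rule anti-pattern.

ALLOWED_AGENTS = {"SecPortal-Scanner/1.0", "SecPortal-Verifier/1.0"}

def allow_request(user_agent: str, host: str, verified_assets: set[str]) -> bool:
    """True only when both the scanner identity and the target asset
    match the engagement record."""
    return user_agent in ALLOWED_AGENTS and host in verified_assets
```

Binding both sides (who is scanning, and what they are verified to scan) is what makes the rule narrow enough to survive audit.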

Identifier 2: source IP allowlist

Source IP allowlisting works when the scanning infrastructure publishes a stable IP range and the security team is willing to maintain that list. The advantage is deterministic matching at the network layer; the disadvantage is that IP ranges drift over time and an out-of-date list silently blocks legitimate scans. IP allowlists work best as a second factor on top of user-agent matching, not as the only factor.
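A sketch of the second-factor check, with the renewal date enforced so a stale list fails loudly instead of silently blocking (the ranges here use documentation address space and are purely illustrative):

```python
import ipaddress
from datetime import date

# Sketch: IP allowlist as a second factor, with an explicit renewal date.
# A list past renewal refuses to match rather than trusting drifted ranges.

def ip_allowed(source_ip: str, ranges: list[str], renewal: date, today: date) -> bool:
    if today > renewal:
        raise ValueError("IP allowlist past renewal date; refresh published ranges")
    addr = ipaddress.ip_address(source_ip)
    return any(addr in ipaddress.ip_network(r) for r in ranges)
```

The deliberate design choice is the exception: an out-of-date list is an operational failure that should surface at preflight, not a silent deny during the production scan.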

Identifier 3: signed header or token

The scanner sends a custom header containing a signed token issued for the engagement. The protective layer verifies the signature and matches against the engagement record. This is the strongest identifier because forgery requires the signing key, but it requires header inspection at every layer that can deny. Useful for high-sensitivity assets where user-agent or IP matching is not enough.
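The mechanics can be sketched with a standard HMAC construction. The header name, key handling, and engagement-id format are assumptions for illustration; the signing key would be shared between the scanning platform and the protective layer out of band:

```python
import hashlib
import hmac

# Sketch: engagement-scoped signed token carried in a custom header.
# Forgery requires the signing key; verification is a keyed hash check.

def sign_engagement(key: bytes, engagement_id: str) -> str:
    return hmac.new(key, engagement_id.encode(), hashlib.sha256).hexdigest()

def verify_header(key: bytes, engagement_id: str, header_value: str) -> bool:
    expected = sign_engagement(key, engagement_id)
    # Constant-time comparison so the check itself does not leak the token.
    return hmac.compare_digest(expected, header_value)
```

Because the token is bound to the engagement identifier, a leaked or replayed token from one engagement fails verification against any other.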

Detecting partial blocks before they ship

Partial blocks are the dangerous case because the scan completes and the output looks normal. Three checks against the scan log catch most partial-block patterns before the report goes out.

  • Request volume sanity: compare the scanner request count against the asset surface. A hundred-page application should produce far more than a few thousand requests; a low ratio means most pages were never reached. Scope coverage is a function of request volume, not scan duration.
  • Response body inspection: check the recorded responses for block-page patterns (Cloudflare, Akamai, AWS WAF interstitials, captcha forms, generic 403/406 pages). Any cluster of those responses indicates the protective layer denied the request before it reached the application.
  • Cross-log reconciliation: compare the scanner request log against the WAF and CDN access logs over the same window. If the two sources disagree on what was permitted, the difference is the gap; the scan report has to acknowledge that gap rather than ignore it.
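The response-body inspection above can be automated against the scan log. A minimal sketch, where the status codes and text markers are a starting point rather than a complete catalogue of vendor interstitials:

```python
import re

# Sketch: triage recorded responses for block-page patterns. The marker
# list is illustrative; extend it with the block pages your stack emits.

BLOCK_STATUS = {403, 406, 429}
BLOCK_PATTERNS = re.compile(r"captcha|access denied|request blocked|challenge", re.I)

def blocked_fraction(responses: list[tuple[int, str]]) -> float:
    """Fraction of (status_code, body) pairs that look edge-denied.
    Anything well above zero means the report must state the coverage gap."""
    if not responses:
        return 0.0
    blocked = sum(
        1 for status, body in responses
        if status in BLOCK_STATUS or BLOCK_PATTERNS.search(body)
    )
    return blocked / len(responses)
```

A run that returns 0.3 here is the "30% blocked" case: the scan has 70% coverage and the report has to say so.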

The discipline is treating partial blocks as a coverage finding in their own right rather than as a technical inconvenience. A scan that ran 30% blocked is a scan with 70% coverage, and the report has to say so. The downstream scanner coverage and limits guide covers what each scanner class actually finds when it does reach the asset, which is a different question from whether the scan reached the asset at all.

A scanner allowlist preflight checklist

The preflight runs before the production scan. Each step is cheap individually and saves a blocked production scan that has to be rerun. The scanning team owns the preflight; the security team owning the asset owns the allowlist rule. Both records close on the engagement so the verification trail is durable.

Step 1: identify and record the scanner

  • User-agent string and reference URL recorded on the engagement.
  • Source IP range (if used) listed with renewal date.
  • Custom token (if used) signed and bound to the engagement identifier.
  • Identifier copied to the security team owning the asset before the scan window opens.

Step 2: map the protective layers

  • CDN, WAF, bot-management, reverse proxy, application throttling listed by name.
  • Owning team named for each layer so the allowlist rule lands with the right reviewer.
  • Allowlist rule type and lifetime (engagement window or fixed expiry) agreed in writing.
  • Allowlist rule scope confirmed against the verified asset list, not the broader estate.

Step 3: agree the rate envelope

  • Maximum requests per second documented on the engagement.
  • Concurrency limits agreed at each protective layer.
  • Rate envelope kept inside the published authentication and rate limits of the asset.
  • Backoff policy defined so the scanner reduces rate when 429 responses appear.
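The backoff policy in the last item can be sketched as a deterministic schedule; the doubling factor and ceiling are the kind of values agreed on the engagement, not recommendations:

```python
# Sketch: exponential backoff on consecutive 429 responses. The send
# interval doubles per 429 and is capped so the scan still progresses.

def next_interval(base_interval: float, consecutive_429s: int, ceiling: float = 30.0) -> float:
    """Seconds to wait before the next request, given how many 429s
    the scanner has seen in a row. Resets to base_interval on success."""
    return min(base_interval * (2 ** consecutive_429s), ceiling)
```

Documenting the schedule on the engagement lets the team owning the protective layer verify the scanner's observed behaviour against what was agreed.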

Step 4: run a confirmation scan

  • A small, low-rate scan runs against a subset of in-scope paths to confirm allowlisting.
  • The scan log is reviewed for block patterns before the production scan starts.
  • WAF and CDN logs are checked for denials over the confirmation window.
  • Any layer that denied is corrected before the production scan runs, not after.
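Checking the edge logs for denials reduces to a set difference over the confirmation window. A minimal sketch, where request IDs stand in for whatever correlatable key both logs share (URL plus timestamp bucket, or a trace ID):

```python
# Sketch: reconcile the scanner's request log against the edge access log.
# A non-empty gap means some layer still denies and needs fixing before
# the production scan runs.

def log_gap(scanner_sent: set[str], edge_permitted: set[str]) -> set[str]:
    """Requests the scanner recorded as sent that the edge log never
    shows as permitted."""
    return scanner_sent - edge_permitted
```

Run on the small confirmation scan, this is cheap; run after a blocked production scan, the same check only tells you what has to be rerun.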

Step 5: close the allowlist after the engagement

  • The allowlist rule is removed or expired at the end of the engagement window.
  • The closure record is attached to the engagement so the audit trail is durable.
  • Renewals require a new attestation, not extension of the original rule.
  • Persistent allowlists are flagged as a recurring audit finding rather than a permanent state.

Allowlist anti-patterns that show up at audit

Three patterns recur across allowlist records that auditors flag. Each pattern starts as a convenience and turns into a control gap that survives years of engagements because nobody owns the cleanup.

  • The forever rule: an allowlist created for one engagement that never expires. It survives the engagement, the tester, the firm, and three rounds of WAF migrations. The rule still exists; nobody can find the attestation that authorised it. A finding waiting to happen.
  • The generic rule: an allowlist that matches any user-agent containing the word scanner or any IP from a residential range. Anyone can match it; the rule no longer represents the original authorisation. The audit trail does not survive review.
  • The asymmetric rule: the allowlist exists at the WAF but not at the CDN, or at the CDN but not at the bot-management layer. The scan partially completes; the report partially covers the asset; nobody catches the gap until the next scan reproduces it. The fix is mapping the layers upfront, not debugging the gap after the report has shipped.

How SecPortal pairs scanner identification with the engagement

SecPortal scans are bound to verified assets. A workspace cannot run an external scan against a domain it has not proven ownership of through DNS TXT record, file upload, or HTML meta tag. Each scan attaches to an attestation that records who authorised the test, against which assets, and over which window. The audit trail captures who triggered each scan, when it ran, and what the scanner output said.

The external scanning feature documents the user-agent strings used by the scanner so the security team can write a narrow allowlist rule. The domain verification feature holds the ownership record so the allowlist rule has a verified counterpart on the engagement. The authenticated scanning feature extends that pairing into the application layer for tests that go past unauthenticated coverage.

For the broader workflow, the external security assessment use case covers how the engagement record, the allowlist record, and the report deliverable stay synchronised. The domain verification and responsible scanning guide covers the upstream ownership step that the allowlist rule depends on.

Once the scan is allowlisted and runs cleanly, the next discipline is interpreting output. The vulnerability scanner false positives guide covers triage. The scan scoping and target selection guide covers the upstream scope decision the allowlist record has to mirror. The scanner output deduplication guide covers consolidation across tools once the scans complete. The scanner rate limiting and throttling guide covers the rate decision that has to sit under the WAF baseline, so the allowlisted scan still operates inside the asset budget rather than tripping rules from a different angle.

Scope and limitations

Allowlisting is a coordination problem, not a technical one. The reason scans get blocked is that the team running the scan and the team owning the protective layers are separated by an organisational boundary. The fix is paperwork (attestation, engagement record, allowlist lifetime) rather than network configuration. Programmes that try to solve allowlisting by removing WAF rules or weakening CDN protections inherit the cost the next time a real attacker reaches the asset under those weakened protections.

Allowlisting also does not protect against the scanner running outside its scope. The allowlist permits the scanner to reach the in-scope assets; it does not prevent the scanner from reaching out-of-scope hosts if the scope is wrong. The discipline is keeping the engagement scope and the allowlist rule synchronised, which is what the preflight checklist is for.

Run scanner allowlisting as part of the engagement, not as an afterthought

SecPortal records scan attestation, verified ownership, and scanner identification on the engagement so the allowlist conversation has a single source of truth and the audit trail survives the engagement window.