Vulnerability Remediation SLA Policy Guide: Design, Codify, and Operationalise the Policy

A vulnerability remediation SLA policy is the rule that turns a queue of confirmed findings into a calendar of accountable actions. It names the severity tiers, sets the target time per tier, codifies the clock rules, defines the breach behaviour, draws the exception path, names the owner of each step, and produces the evidence the audit, the board, and the customer reviewer read. This guide is for AppSec, vulnerability management, GRC, security engineering, and CISOs who are establishing a first formal SLA policy, rationalising an inherited one, or evidencing a policy that operates in practice but has never been written down. It walks through the policy structure that holds up under engineering load, the clock rules that make the SLA reportable, the exception path that prevents the policy from collapsing under exceptional findings, the escalation rules that survive role changes, and the evidence base that satisfies the framework reads from SOC 2, ISO 27001, PCI DSS, NIST SP 800-53, and the CISA KEV catalogue.

Why a Vulnerability Remediation SLA Policy Is More Than a Document

Most organisations have an SLA document somewhere. It is usually a one-pager that lists four severity tiers and four target windows, signed off by the security director in a previous quarter, and stored in a folder almost nobody opens during day-to-day work. The document is not the policy. The policy is the rule the team applies on every finding, the timestamp on the record, the breach behaviour that fires when the window passes, the exception that lands on a register with an expiry, and the evidence that holds up when the auditor reads the trail. Documents that nobody operates are the most common cause of SLA breach reports that nobody trusts; policies that nobody can find on the finding record are the most common cause of audit findings against vulnerability management programmes.

The shift from document to policy is what separates programmes that report defensible SLA compliance from programmes that report numbers nobody can reconcile. The shift is operational, not editorial. The SLA target moves from the PDF onto the finding record. The confirmation timestamp moves from the analyst's memory to a field on the record. The breach behaviour moves from a verbal escalation in chat to a written escalation on the activity log. The accepted-risk exception moves from a director's email to a register with an expiry and a review trigger. The reporting moves from a quarterly slide that gets rebuilt every cycle to a running view that reads from the same record the team operates on.

The remainder of this guide is the structure that supports the operational shift. Each section is a part of the policy that needs an explicit decision, a written rule, and an operating mechanism. Programmes that skip the sections tend to find the gap during the first audit; the programmes that work through them tend to find that the policy stops being a document and starts being a rhythm.

Severity Tiers and Target Windows

The first decision is the tier set. The defensible starting point is the CVSS band, calibrated against the asset tier list and the regulatory regime. CVSS gives an objective seed; the calibration produces a defensible target. Anchoring on CVSS alone produces a policy that is hard to defend on findings where exploitability, exposure, or asset criticality push the actual risk up or down the band.

Critical (CVSS 9.0 to 10.0): 7 days

Active exploitation risk, public exposure, or no compensating control in place. The clock runs through weekends and holidays. The breach behaviour is immediate escalation to the engagement lead and the named senior approver. Critical SLAs are the line PCI DSS, ISO 27001, and CISA KEV evidence packs read first.

High (CVSS 7.0 to 8.9): 30 days

Significant impact that is mitigated by a control, requires authentication to exploit, or is limited to internal exposure. The window aligns with a normal release cycle, so engineering can slot the fix into the next planned deployment rather than treating it as out-of-band. Breach routes to the application owner and the security lead together.

Medium (CVSS 4.0 to 6.9): 90 days

Material risk that can be addressed with normal release cadence. Tracked but does not interrupt the roadmap. Most remediation programmes ship medium fixes in batches; the SLA tier creates the deadline that keeps the batch from drifting into the next quarter.

Low (CVSS 0.1 to 3.9): 180 days

Hardening or hygiene items. Bundled into a remediation sprint or shipped as a side effect of other work. Low SLAs are the tier most likely to age silently because they rarely escalate; surfacing them on the queue keeps the long tail visible rather than buried.

Informational: best effort, no breach state

No directly exploitable risk. Tracked for context, often closed with an accepted-risk decision and a written rationale. Informational findings do not get a breach state, but they do get a review cadence so the queue does not fill with stale records.

Tighter windows are appropriate for KEV-listed exploited vulnerabilities, internet-facing assets, and findings on systems that process regulated data; wider windows are appropriate for internal-only, fully mitigated, or low-data findings. The calibration lives on the policy, not in the analyst's judgement. A practical vulnerability SLA policy template is the starting artefact; the policy is the calibrated version your programme operates.
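The tier set above can be held as data rather than prose, so the target is computed the same way on every finding. The following is a minimal Python sketch, assuming the bands and windows listed in this guide. The function names and the one-tier tightening rule for KEV-listed or internet-facing findings are illustrative assumptions, not a prescribed calibration.

```python
from typing import Optional, Tuple

# (tier, lowest CVSS score in the band, target window in days), most severe first.
# Bands and windows follow the starting set in this guide.
TIERS = [
    ("critical", 9.0, 7),
    ("high", 7.0, 30),
    ("medium", 4.0, 90),
    ("low", 0.1, 180),
]


def tier_for(cvss: float) -> Tuple[str, Optional[int]]:
    """Map a CVSS base score to (tier, target days); informational has no target."""
    score = round(cvss, 1)
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {cvss}")
    for name, floor, days in TIERS:
        if score >= floor:
            return name, days
    return "informational", None


def calibrated_target_days(cvss: float, kev_listed: bool = False,
                           internet_facing: bool = False) -> Optional[int]:
    """Hypothetical calibration: tighten by one tier when exposure raises the risk."""
    name, days = tier_for(cvss)
    if days is None or not (kev_listed or internet_facing):
        return days
    index = [tier[0] for tier in TIERS].index(name)
    return TIERS[max(index - 1, 0)][2]
```

Keeping the calibration in a function like this is one way to make sure it lives on the policy rather than in the analyst's judgement; a programme would replace the tightening rule with whatever its own asset tier list and regulatory regime require.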

Clock Rules: Start, Pause, Resume, Stop

The clock rules decide whether the SLA reporting is trustworthy. The four rules are start, pause, resume, and stop. Each rule has a defined trigger, a recorded timestamp, and a written owner.

Start: at confirmation, not discovery

Discovery is the raw scanner output or pentest finding. Confirmation is the moment the triage owner has validated that the finding is a true positive, scoped to the affected asset, and ready for remediation. Starting at discovery is the most common cause of SLA metrics nobody trusts, because the queue is full of false positives, duplicates, and findings pending triage. Starting at confirmation aligns the clock with the work engineering can actually do.

Pause: written reason, named approver, expiry

Defensible pause reasons include a vendor patch dependency, a planned third-party deployment window, a documented compensating control under review, or a finding moved to in-progress on a confirmed change ticket. Every pause needs an expiry; pauses without expiry quietly become permanent and corrupt the metrics. The pause record lives on the finding and on the activity log so the audit can read it.

Resume: at trigger, with the remaining window preserved

The resume trigger is the same trigger the pause was conditioned on (the vendor patch released, the deployment window closed, the change ticket completed). The remaining window is preserved rather than reset so the pause cannot be used to game the metric. The resume timestamp lands on the record.

Stop: at verified close, not claimed resolution

Resolved is the engineering claim that the fix has shipped. Verified is the security confirmation that the fix actually closed the vulnerability. The SLA stops at verified. Separating the two timestamps is the practical foundation of retesting workflows; treating resolved as verified produces SLA reports that overstate compliance and audit findings that surface the following quarter.
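The four clock rules can be sketched as a small data structure. This is an illustrative Python sketch, not a reference implementation: the class and field names are hypothetical, and it assumes that a pause stops counting once its expiry passes even if nobody recorded a resume.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Pause:
    paused_at: datetime
    reason: str
    approver: str
    expires_at: datetime                  # every pause carries an expiry
    resumed_at: Optional[datetime] = None


@dataclass
class SlaClock:
    confirmed_at: datetime                # start: confirmation, not discovery
    target: timedelta
    pauses: List[Pause] = field(default_factory=list)
    verified_at: Optional[datetime] = None  # stop: verified close, not resolved

    def pause(self, at: datetime, reason: str, approver: str,
              expires_at: datetime) -> None:
        if expires_at <= at:
            raise ValueError("pause expiry must be after the pause start")
        self.pauses.append(Pause(at, reason, approver, expires_at))

    def resume(self, at: datetime) -> None:
        self.pauses[-1].resumed_at = at

    def elapsed(self, now: datetime) -> timedelta:
        """Running time between confirmation and verification (or now), minus pauses."""
        end = self.verified_at or now
        running = end - self.confirmed_at
        for p in self.pauses:
            # Assumption: an unresumed pause ends at its expiry, never later.
            pause_end = min(p.resumed_at or p.expires_at, p.expires_at, end)
            if pause_end > p.paused_at:
                running -= pause_end - p.paused_at
        return running

    def remaining(self, now: datetime) -> timedelta:
        # The remaining window is preserved across a pause, not reset.
        return self.target - self.elapsed(now)

    def breached(self, now: datetime) -> bool:
        return self.elapsed(now) > self.target
```

Because the elapsed time is derived from recorded timestamps rather than stored as a counter, the same calculation can be rerun later from the finding record and the activity log.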

Base Severity, Residual Severity, and Overrides

Base CVSS captures the inherent severity of the vulnerability under the assumption that no compensating control is in place. Residual severity reflects the effect of controls that are already present: a WAF rule on the perimeter, network segmentation that limits blast radius, an upstream IAM policy that restricts the exploit precondition, a missing prerequisite the attacker would need. A mature SLA policy uses base severity to seed the tier and residual severity to adjust the target, with the adjustment written down.

The adjustment lives in three structured forms. A severity override changes the working severity for SLA purposes when the residual is materially different from the base. A false positive override removes a finding from the SLA when triage confirms it was not exploitable in context. An accepted-risk override moves the finding out of the active queue with an expiry and a review trigger. Each override has a recorded reason, an actor, and a timestamp. SecPortal supports the pattern through finding overrides, which preserve the decision across scan cycles so the next scan diff does not re-raise the same finding.

The override is not the same as the exception. The override changes how the SLA is computed for a specific finding for documented reasons (false positive, accepted residual, recalibrated severity). The exception, covered below, is the path for findings the programme has decided not to fix on the original timeline.
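The three override types can be sketched as a record keyed on a stable finding identity, so the decision is looked up again on every scan cycle. The Python below is a hypothetical illustration: the key shape (asset, rule identifier), the field names, and the function are assumptions for the example and do not describe SecPortal's data model.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Dict, Optional, Tuple


class OverrideType(Enum):
    SEVERITY = "severity"
    FALSE_POSITIVE = "false_positive"
    ACCEPTED_RISK = "accepted_risk"


@dataclass(frozen=True)
class Override:
    kind: OverrideType
    reason: str
    actor: str
    recorded_at: datetime
    new_severity: Optional[str] = None     # used by severity overrides
    expires_at: Optional[datetime] = None  # used by accepted-risk overrides


FindingKey = Tuple[str, str]  # hypothetical stable key: (asset, rule identifier)


def effective_state(key: FindingKey, base_severity: str,
                    overrides: Dict[FindingKey, Override],
                    now: datetime) -> Tuple[str, bool]:
    """Return (working severity, whether the finding is on the active SLA)."""
    override = overrides.get(key)
    if override is None:
        return base_severity, True
    if override.kind is OverrideType.FALSE_POSITIVE:
        return base_severity, False
    if override.kind is OverrideType.ACCEPTED_RISK:
        still_accepted = override.expires_at is not None and now < override.expires_at
        return base_severity, not still_accepted
    return override.new_severity or base_severity, True
```

An accepted-risk override with no expiry is treated here as not in force, which mirrors the rule that every acceptance carries an expiry and a review trigger.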

Exceptions and the Accepted-Risk Path

The policy collapses without an exception path. Programmes that try to force every finding to the same target time produce shadow workarounds: findings marked resolved without verification, findings closed and reopened on the next scan, findings reassigned to a different owner to reset the perceived clock. The exception path is the formal alternative.

A defensible exception record names the finding, the residual risk after compensating controls, the business justification, the named risk owner, the named approver, the expiry, and the review trigger. Findings that remain on the queue past the expiry without renewal return to the active SLA. The exception lives in an exception register that the programme owner reviews on a documented cadence (monthly for high-tier exceptions, quarterly for the rest). The security exception register template is one practical starting point for the register format.
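The register entry described above can be sketched as a record plus two queries: which exceptions have expired and return to the active SLA, and which are due for review. This is a minimal Python sketch with hypothetical field names; it approximates the monthly and quarterly cadences as 30 and 90 days, which is an assumption for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class ExceptionRecord:
    finding_id: str
    tier: str
    residual_risk: str
    justification: str
    risk_owner: str
    approver: str
    expires_on: date
    review_trigger: str
    last_reviewed_on: date


def review_interval(tier: str) -> timedelta:
    """Monthly for high-tier exceptions, quarterly for the rest (approximated in days)."""
    return timedelta(days=30) if tier in ("critical", "high") else timedelta(days=90)


def expired(register: List[ExceptionRecord], today: date) -> List[ExceptionRecord]:
    """Exceptions past expiry; these findings return to the active SLA."""
    return [r for r in register if r.expires_on <= today]


def due_for_review(register: List[ExceptionRecord], today: date) -> List[ExceptionRecord]:
    """Unexpired exceptions whose last review is older than the tier's cadence."""
    return [
        r for r in register
        if r.expires_on > today
        and today - r.last_reviewed_on >= review_interval(r.tier)
    ]
```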

Exception governance keeps the policy honest. When the exception count for a given tier rises faster than the burn-down, the signal is not that the policy is wrong; it is that the capacity is mismatched to the target or that a structural problem in the engineering organisation is absorbing the slack. The policy review then turns to either the target, the capacity, or the backlog rather than to a longer accepted-risk list.

The renewal cadence on the register is a deliberate cost decision rather than a calendar inheritance from the audit cycle. The exception renewal cadence economics research covers the four per-cycle cost components, the three avoided-cost components, the severity-banded cadence defaults, and the approver bandwidth constraint that bounds the design. Pair the SLA policy described here with that cadence model so the exception register stays current between audit cycles and the team is not scrambling to reconcile it at fieldwork time.

Breach Behaviour and Escalation Rules

A breach is a trigger for a written action, not a failure state in itself. The standing options are explicit per tier so the engagement lead does not improvise the response in chat.

Escalate

Route to the named approver and the application owner. The escalation message includes the finding, the target date, the current state, the proposed remediation, and the proposed next review date. The escalation lands on the finding and on the activity log so the trail survives role changes. The response window for a critical-tier escalation is shorter than the original SLA window, so the escalation itself cannot sit unanswered.

Accept with rationale

Record an accepted-risk override with an expiry and a review trigger. The accept option is gated by tier; critical and high accepts require the named senior approver, medium and low accepts can be delegated. Accept is not the same as defer; defer extends the target, accept removes the finding from the active queue with a written reason.

Extend with justification

Move the target date and record the reason. Extensions are limited per finding and per tier so the option does not silently become the default. Each extension has a written review trigger and a named owner. The extension trail lands on the activity log.

Close as no longer applicable

Asset retired, finding superseded by a different remediation, environment decommissioned, or duplicate detected. The close reason is recorded; the finding stays on the audit trail rather than being deleted so the history of the queue remains reconstructible.

The four options are exhaustive at the policy level. A side-channel resolution agreed in chat does not count until it is recorded on the finding as one of the four. The vulnerability SLA breach escalation workflow is the operational form of the breach behaviour.
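The four breach options and their gating rules can be sketched as a small validation step that runs before the action is written to the finding. The Python below is illustrative only: the extension caps per tier are hypothetical numbers chosen for the example, and the function name and parameters are assumptions.

```python
from enum import Enum
from typing import List


class BreachAction(Enum):
    ESCALATE = "escalate"
    ACCEPT = "accept"
    EXTEND = "extend"
    CLOSE_NOT_APPLICABLE = "close_not_applicable"


# Hypothetical caps; the policy sets the real limits per finding and per tier.
MAX_EXTENSIONS = {"critical": 1, "high": 1, "medium": 2, "low": 2}
SENIOR_APPROVAL_TIERS = {"critical", "high"}


def validate_breach_action(action: BreachAction, tier: str, reason: str,
                           approver_is_senior: bool = False,
                           extensions_used: int = 0) -> List[str]:
    """Return the list of policy problems; an empty list means the action may be recorded."""
    problems = []
    if not reason.strip():
        problems.append("a written reason is required")
    if (action is BreachAction.ACCEPT and tier in SENIOR_APPROVAL_TIERS
            and not approver_is_senior):
        problems.append("critical and high accepts require the named senior approver")
    if action is BreachAction.EXTEND and extensions_used >= MAX_EXTENSIONS[tier]:
        problems.append("extension limit reached for this tier")
    return problems
```

Returning the problems rather than raising on the first one lets the engagement lead see every gap in a single pass before resubmitting the action.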

Ownership: Who Owns Each Step

An SLA policy without named owners produces breaches that nobody owns and escalations that nobody answers. The policy lists the role for each step rather than the person, so the policy survives organisational changes. The eight roles below cover most programmes.

  • Triage owner. Validates the finding, confirms severity, sets the SLA target, assigns the remediation owner. Often the AppSec engineer or vulnerability management analyst.
  • Remediation owner. Engineers the fix, ships the change, records the resolved-at timestamp with the proposed evidence. Often the application owner or platform owner of the affected asset.
  • Verification owner. Confirms the fix actually closed the vulnerability, records the verified-at timestamp. Often the AppSec engineer who handled triage.
  • Escalation approver. Signs off on extensions, accepts, and tier-driven escalations. Often the security director or the head of engineering.
  • Risk owner. Signs off on accepted-risk exceptions with documented rationale and expiry. Often the business owner of the affected asset.
  • Programme owner. Owns the policy itself, reviews the operating metrics, defends the budget, runs the periodic review. Often the head of vulnerability management or the security programme manager.
  • Audit liaison. Produces the evidence pack on demand, reconciles the policy against the framework reads. Often GRC or compliance.
  • Communications owner. Maintains the reporting cadence to leadership, the board, and where applicable, the customer reviewer. Often the CISO or the security programme manager.

The named roles map to RBAC on the platform that holds the work. SecPortal supports the ownership pattern through team management with four roles (owner, admin, member, viewer), where the workspace owner and admins handle policy administration and approvals, members operate the remediation and verification work, and viewers consume the reporting without write access.

Evidence the Policy Produces

The evidence pack is the part of the policy that holds up under audit, customer review, and board scrutiny. Six categories cover most read paths.

The policy document

Versioned, approver-named, effective-date stamped. Reviewed on the documented cadence. Stored alongside the security programme's other policies so the auditor reads it in context.

The finding record

Severity, SLA target, assigned owner, confirmed_at, resolved_at, verified_at, plus any pause and resume events. Findings management holds the canonical record with CVSS 3.1 vectors and 300+ finding templates so the severity seed is auditable.

The activity log

Attributed, timestamped entries for every state change. The activity log is the workspace-wide audit trail with CSV export so the auditor reads the trail rather than reconstructs it from chat threads. The activity log is on by default and records finding state transitions, override decisions, document uploads, and team changes.
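The shape of an attributed, timestamped, exportable trail can be sketched in a few lines. This Python example is a generic illustration using the standard library; the column names and class are hypothetical and do not describe SecPortal's export format.

```python
import csv
import io
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class LogEntry:
    at: str            # ISO 8601 timestamp
    actor: str
    finding_id: str
    event: str         # e.g. "confirmed", "paused", "escalated", "verified"
    detail: str


class ActivityLog:
    """Append-only trail: entries are added, never edited or removed."""

    def __init__(self) -> None:
        self._entries: List[LogEntry] = []

    def record(self, actor: str, finding_id: str, event: str,
               detail: str = "", at: Optional[datetime] = None) -> LogEntry:
        stamp = (at or datetime.now(timezone.utc)).isoformat()
        entry = LogEntry(stamp, actor, finding_id, event, detail)
        self._entries.append(entry)
        return entry

    def to_csv(self) -> str:
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=["at", "actor", "finding_id", "event", "detail"])
        writer.writeheader()
        for entry in self._entries:
            writer.writerow(asdict(entry))
        return buffer.getvalue()
```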

The override record

Three structured types (false positive, accepted risk, severity) with reason, actor, timestamp, and target scope. The upsert key preserves the decision across scan cycles so the next scan does not re-raise the same false positive.

The exception register

The consolidated list of accepted-risk decisions with their expiry and review schedule. Reviewed on a documented cadence so expired exceptions return to the active queue rather than aging silently.

The reporting pack

SLA compliance by tier, breach distribution, exception count, trend lines, capacity utilisation, top-aging findings. Generated on the cadence the programme operates (often weekly for the operating team, monthly for leadership, quarterly for the board). AI report generation in SecPortal accelerates the first draft from the underlying findings data; the reviewer keeps the final word.

Framework Alignment

A well-designed SLA policy is read across frameworks rather than rewritten per audit. The policy is the single source; the framework read points to the relevant clauses.

SOC 2

CC7.1 (detection and monitoring procedures that identify configuration changes and newly discovered vulnerabilities) and CC8.1 (change management). The auditor reads the policy, the finding record, the activity log, and the exception register.

ISO 27001

Annex A 8.8 (management of technical vulnerabilities), 5.7 (threat intelligence), and 8.32 (change management). The certification body reads the policy, the operating evidence, and the review minutes. The ISO 27001 audit checklist covers the broader Annex A walk-through.

PCI DSS

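Requirement 6.3.3 sets explicit patching windows: under v4.0, critical or high-security patches are installed within one month of release, and other applicable patches within a time frame the entity defines (the standard gives three months as an example). Requirement 11.4 ties penetration test findings into the same remediation workflow.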

NIST SP 800-53

RA-5 (vulnerability monitoring and scanning) and SI-2 (flaw remediation). The control catalogue expects documented response times and recorded operating evidence.

CISA KEV catalogue

Binding Operational Directive 22-01 names due dates for federal civilian agencies; many commercial programmes align critical-tier windows to the KEV due date once CISA adds the entry to the catalogue. The CISA KEV catalogue guide covers the cataloguing and reading workflow.
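Aligning to the KEV deadline can be sketched as taking the earlier of the policy due date and the catalogue due date. The Python below is a minimal sketch under assumptions: the lookup table is assumed to have been built beforehand from the catalogue's `cveID` and `dueDate` fields, the function name is hypothetical, and the CVE identifier in the usage note is a placeholder, not a real entry.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def effective_due_date(confirmed_on: date, target_days: int,
                       cve_id: Optional[str],
                       kev_due_dates: Dict[str, date]) -> date:
    """Return the earlier of the SLA due date and the KEV due date, if one exists."""
    sla_due = confirmed_on + timedelta(days=target_days)
    kev_due = kev_due_dates.get(cve_id) if cve_id else None
    return min(sla_due, kev_due) if kev_due else sla_due
```

For example, a finding confirmed on 1 March 2024 with a 30-day target and a placeholder entry `{"CVE-0000-0001": date(2024, 3, 15)}` would be due on 15 March rather than 31 March; a finding with no catalogue entry keeps its policy date.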

CIS Controls v8

Control 7 (continuous vulnerability management) names the operating expectations. The framework expects a documented SLA, an operating queue, and evidence the queue is worked on the documented cadence.

Operating Metrics That Make the Policy Reportable

A policy without metrics is unevaluable. Six metric families hold up under leadership and audit review.

  • SLA compliance by tier. Percentage of findings closed within the target window, per tier, per period. Trending the rate over multiple periods catches drift before it becomes systemic.
  • Breach distribution. Count and severity of breaches in the period, grouped by application owner and asset tier. The distribution catches structural breaches that are masked by an aggregate compliance rate.
  • Time to remediate by severity. Median and 90th percentile remediation duration per tier, reported alongside or in place of the mean. Median is the operating signal; the 90th catches the long tail the median hides.
  • Exception register state. Total count, expiry distribution, count of expired-but-not-renewed entries. A rising exception count without rising programme capacity is a structural signal, not a clerical one.
  • Reopen rate. Findings reopened after being marked resolved or verified. A high reopen rate points to verification quality or fix completeness rather than the SLA itself.
  • Capacity utilisation. Active findings per remediation owner and per application team. Pairs the SLA compliance rate with the underlying workload so leadership can read the capacity story alongside the compliance story.
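Two of the metric families above, SLA compliance by tier and the median and 90th percentile remediation duration, reduce to a few lines of arithmetic. The Python below is a sketch under assumptions: the input is a list of remediation durations in days for one tier and one period (confirmation to verified close, net of pauses), and the percentile uses the nearest-rank method, which is one of several reasonable definitions.

```python
from statistics import median
from typing import Dict, List


def percentile_nearest_rank(values: List[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceiling division on integers
    return ordered[rank - 1]


def tier_metrics(durations_days: List[float], target_days: int) -> Dict[str, float]:
    """Compliance rate, median, and 90th percentile for one tier in one period."""
    if not durations_days:
        raise ValueError("no closed findings in the period")
    within = sum(1 for d in durations_days if d <= target_days)
    return {
        "compliance_pct": round(100 * within / len(durations_days), 1),
        "median_days": median(durations_days),
        "p90_days": percentile_nearest_rank(durations_days, 90),
    }
```

With five critical findings closed in 5, 6, 7, 20, and 40 days against a 7-day target, this gives 60.0 percent compliance, a median of 7 days, and a 90th percentile of 40 days, which is the long tail the median hides.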

The same metrics are the foundation of board reporting and customer assurance. The board reads a subset (often six to eight metrics across the policy and the broader programme); customers reading the security questionnaire read the policy, the operating metrics, and the audit evidence pack as a coherent set. The broader security programme KPIs and metrics framework covers the metric-design conventions.

Policy Review and Renewal

The policy is reviewed on three triggers. The first is an annual scheduled review against the operating metrics, the regulatory landscape, the asset tier list, and the threat picture. The second is an ad-hoc review when a material event shifts the assumptions: a KEV addition that affects a tier-zero asset, a regulator action in the sector, a new framework adoption, a major incident, a tool replacement, or a structural change in the engineering organisation. The third is a continuous review through the operating metrics: if the critical-tier breach rate exceeds a threshold for two consecutive quarters, the policy is reviewed against the target or the underlying capacity rather than patched with a one-off exception list.
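The continuous trigger can be expressed as a simple check over the quarterly breach rates. This is a minimal Python sketch: the function name is hypothetical, the threshold is whatever value the programme sets, and the check looks only at the most recent quarters.

```python
from typing import Sequence


def needs_policy_review(quarterly_breach_rates: Sequence[float],
                        threshold: float, consecutive: int = 2) -> bool:
    """True when the most recent `consecutive` quarters all exceed the threshold."""
    recent = list(quarterly_breach_rates)[-consecutive:]
    return len(recent) == consecutive and all(rate > threshold for rate in recent)
```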

The renewal output is a versioned policy document, a list of changes from the previous version, the operating metrics that drove the change, and a re-approval by the named approver. The renewal lands in the same evidence path the auditor reads so the review history is reconstructible from the trail.

Common Failure Modes

The policy is the PDF

The SLA target lives in a document nobody opens during day-to-day work. The fix is moving the target onto the finding record so the runway is visible at a glance.

The clock starts at discovery

The queue is full of false positives, duplicates, and findings pending triage. Compliance reports become unbelievable. Starting at confirmation aligns the clock with the work engineering can do.

Resolved is treated as verified

The SLA report overstates compliance; the audit catches it the next quarter. Separating resolved_at and verified_at on the finding record fixes the mismatch.

Pauses without expiry

Pauses quietly become permanent. The metrics drift. The fix is requiring an expiry and a review trigger on every pause.

Exception count rises without capacity review

The exception register absorbs the slack of an under-resourced programme. The signal is structural; the response is a capacity or target review, not a longer accepted-risk list.

Escalation happens in chat

The escalation is invisible six months later. Auditors cannot read it; replacement engagement leads cannot reconstruct it. The fix is escalation that lands on the finding and on the activity log.

Closing

A vulnerability remediation SLA policy is the part of the vulnerability management programme that turns a queue of confirmed findings into a calendar of accountable actions. The policy sits on the finding record, not in a PDF; the clock starts at confirmation, not discovery; pauses have an expiry; resolved is not the same as verified; breaches trigger one of four written actions; exceptions land on a register with an expiry; ownership is named per step; evidence is produced as a side effect of operating, not as a quarterly project. Programmes that operate the policy in this shape tend to report SLA compliance that reconciles to the audit, defend the budget with metrics leadership trusts, and renew customer assurance on better terms.

The substance of the policy is calibrated to the asset tier list, the regulatory regime, the engineering capacity, and the threat picture the programme operates against. The structure in this guide is the spine. The calibration is the work.

Frequently Asked Questions

What is a vulnerability remediation SLA policy?

The written rule that ties the severity of a confirmed finding to the time the organisation has to remediate, mitigate, or formally accept it. It names tiers, target windows, clock rules, breach behaviour, exception paths, owners, and the evidence the policy produces.

How should remediation SLA tiers be defined?

Seed the tiers on a CVSS band and calibrate with exploitability, exposure, and asset criticality. A defensible starting set is 7, 30, 90, and 180 days for critical, high, medium, and low. Tighten the windows for KEV-listed, internet-facing, or regulated-data findings.

When does the SLA clock start?

At confirmation, not discovery. Discovery is the raw scanner output; confirmation is the triage decision that the finding is a true positive and ready for remediation. Starting at discovery produces SLA metrics nobody trusts.

Can the SLA clock be paused?

Yes, with a written reason, a named approver, and an expiry. Defensible pause reasons include vendor patch dependencies, planned third-party deployment windows, and findings under documented compensating-control review. Pauses without expiry quietly become permanent.

What happens when an SLA breaches?

Escalate to the named approver, accept the risk with rationale and an expiry, extend the target with justification, or close as no longer applicable. The four options are exhaustive; side-channel resolutions in chat fall back into one of the four when the policy is applied.

How does residual severity interact with the SLA?

Base CVSS seeds the tier; residual severity after compensating controls adjusts the target. The adjustment lives in a structured override (false positive, accepted risk, severity) with a reason, an actor, and a timestamp.

How does the policy map to compliance frameworks?

SOC 2 CC7.1 and CC8.1; ISO 27001 Annex A 8.8, 5.7, and 8.32; PCI DSS 6.3.3 and 11.4; NIST SP 800-53 RA-5 and SI-2; CISA KEV catalogue and BOD 22-01; CIS Controls v8 Control 7. The policy is written once and read across frameworks rather than rewritten per audit.

What evidence does the policy produce?

The policy document; the finding record with severity, target, owner, and timestamps; the activity log with attributed state changes; the override record; the exception register; and the reporting pack with SLA compliance, breach distribution, and trend lines.

How often should the policy be reviewed?

Annually on a scheduled review; ad-hoc when a material event shifts the assumptions; and continuously through the operating metrics. Sustained tier-level breach rates trigger a target or capacity review, not a longer accepted-risk list.
