XPath Injection
detect, understand, remediate
XPath injection manipulates queries against XML data stores by injecting filter syntax into unsanitised input. Login forms backed by XML user files, configuration lookups, and legacy enterprise apps that rely on XQuery or XPath expressions can leak the full document, bypass authentication, or surface admin-only nodes when input is concatenated into the query string.
No credit card required. Free plan available forever.
What is XPath injection?
XPath injection (CWE-643) is an injection attack against applications that build XPath or XQuery expressions from user input without proper escaping. When an application stores data in XML files or queries an XML database, and constructs the lookup expression by concatenating untrusted input into the query string, an attacker can inject XPath syntax to alter the query logic, bypass authentication, or read parts of the document the application never intended to expose.
XPath is the query language for navigating XML documents, similar in spirit to how SQL queries a relational database. Where SQL injection manipulates a relational query, XPath injection manipulates the path expression that selects nodes from an XML tree. The same root cause applies: unparameterised string concatenation gives an attacker partial control over the syntax of the query, and the data store evaluates whatever the application hands it.
XPath injection still appears in pentests against legacy enterprise applications, XML-backed authentication forms, configuration management systems, content publishing pipelines, and integrations with SOAP services that lean on XPath inside business logic. The class is closely related to LDAP injection and NoSQL injection: the syntax differs but the root cause and the impact pattern are the same.
How it works
Find an XML-backed input
The attacker identifies endpoints that consume XML, expose user search, run XML-based authentication, or lean on configuration lookups against an XML file or database.
Probe with quote and predicate breaks
Single quotes, double quotes, brackets, and parentheses are sent in form fields and parameters. Verbose XPath errors, blank responses, or shifted result sets reveal that the input is concatenated into a query rather than escaped.
Inject the predicate manipulation
A payload such as a closing quote followed by an always-true condition (for example, or '1'='1) rewrites the predicate so that it holds for every node the path selects, regardless of the original filter.
Bypass auth or extract the document
The reshaped query authenticates as the first user in the file, extracts admin-only nodes, or, in blind cases, lets the attacker walk the document one character at a time using string-length and substring comparisons.
A worked example
The pattern is easiest to read against an XML user file used for login. The application stores users in a document like the one below.
<users>
  <user>
    <name>alice</name>
    <password>s3cret</password>
    <role>user</role>
  </user>
  <user>
    <name>admin</name>
    <password>r00tpw</password>
    <role>admin</role>
  </user>
</users>
The login handler concatenates the submitted username and password into an XPath expression that fetches a matching user node.
// Vulnerable: string-concatenated XPath
const expr = "//user[name/text()='" + username +
"' and password/text()='" + password + "']";
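const match = doc.evaluate(expr, doc, null, XPathResult.ANY_TYPE, null);
An attacker submits ' or '1'='1 as both the username and the password. The query is rebuilt as:
//user[name/text()='' or '1'='1' and password/text()='' or '1'='1']
Because and binds more tightly than or in XPath, the predicate reads as name='' or ('1'='1' and password='') or '1'='1', and the trailing clause makes it true for every user node. The expression selects every user, and an application that takes the first match treats the first user node in the document as the authenticated user. Injecting only the username with an arbitrary password is not enough here, because the and clause would still require the password to match. The same shape covers blind XPath injection, where the attacker uses string-length and substring functions to enumerate nodes one character at a time when the application only confirms whether a query matched anything.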
Common causes
String concatenation in XPath expressions
Building XPath or XQuery expressions by gluing user input directly into the query string. The same anti-pattern that produces SQL injection produces XPath injection in XML-backed apps.
XML files used as auth or config stores
Legacy applications that authenticate against an XML user file, or that read business rules from XML configuration, often pre-date the move to parameterised queries and have never been refactored.
XQuery and SOAP endpoints
XQuery run through XQuery processors and native XML databases (Saxon, BaseX, MarkLogic) and SOAP services that evaluate XPath inside their handlers expose the same class of bug under a different name.
Verbose XPath error messages
Default error pages that surface the raw XPath expression and the parser exception give attackers the exact syntax to manipulate, accelerating exploitation from probe to working payload.
How to detect it
Automated detection
- SecPortal's code scanning flags string-concatenated XPath, XQuery, and SOAP query construction in source so the bug is caught before the application ships.
- Authenticated DAST sends quote-breaking, predicate-injecting, and boolean-shifting payloads against form inputs and parameters, then compares responses for length, status, and timing differences that suggest filter manipulation.
- Error fingerprinting catches XPath-specific exceptions (parser errors, expected token messages, namespace warnings) leaking through default error handlers, which usually means the input reaches an unparameterised query.
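The response comparison described in the DAST bullet can be sketched as a small pure function. This is an illustrative sketch rather than SecPortal's implementation: the function name looksInjectable, the { status, length } response shape, and the 10% length tolerance are all assumptions chosen for the example.

```javascript
// Hypothetical helper: decides whether an always-true and an always-false
// payload produced meaningfully different responses.
// Each response is a plain object of the assumed shape { status, length }.
function looksInjectable(trueResp, falseResp, lengthTolerance = 0.1) {
  // A status code change (for example 200 vs 401) is the clearest signal.
  if (trueResp.status !== falseResp.status) return true;

  // Otherwise compare body lengths relative to the larger body.
  const larger = Math.max(trueResp.length, falseResp.length);
  if (larger === 0) return false; // two empty bodies: nothing to compare
  const delta = Math.abs(trueResp.length - falseResp.length) / larger;
  return delta > lengthTolerance;
}
```

A positive result is a lead to confirm manually, not proof: pages with dynamic content can differ in length between two identical requests, so a baseline of repeated benign requests helps set the tolerance.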
Manual testing
- Submit a single quote, then a double quote, in every input that drives a lookup or login. A 500 response, a verbose XPath error, or a shifted result set is the first signal the input is concatenated into the query.
- Try an always-true predicate such as ' or '1'='1 and an always-false predicate such as ' or '1'='2. A diverging response between the two confirms the predicate is reaching the parser.
- For blind cases, use string-length() and substring() in the predicate to walk a target node one character at a time, validating each guess against a true or false response shape.
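The blind technique in the last bullet can be sketched as a loop over a true/false oracle. Everything named here is an assumption for illustration: extractString, DEFAULT_ALPHABET, and the oracle callback, which is expected to return true when the application responds as if the supplied XPath condition held. How that condition is wrapped into a real request depends on the target and is left out.

```javascript
// Assumed character set for guesses; it contains no quote characters, so
// each guess can be written as a single-quoted XPath literal.
const DEFAULT_ALPHABET =
  "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

// Hypothetical sketch of blind extraction for use during an authorised test.
// oracle(condition) -> boolean, targetExpr is the XPath of the node to read.
function extractString(oracle, targetExpr, maxLength = 64, alphabet = DEFAULT_ALPHABET) {
  // Step 1: find the value's length with string-length().
  let length = -1;
  for (let n = 0; n <= maxLength; n++) {
    if (oracle(`string-length(${targetExpr})=${n}`)) {
      length = n;
      break;
    }
  }
  if (length < 0) return null; // longer than maxLength, or oracle unreliable

  // Step 2: recover each character with substring(); XPath positions start at 1.
  let out = "";
  for (let pos = 1; pos <= length; pos++) {
    const ch = [...alphabet].find((c) =>
      oracle(`substring(${targetExpr},${pos},1)='${c}'`)
    );
    if (ch === undefined) return null; // character outside the assumed alphabet
    out += ch;
  }
  return out;
}
```

The linear scan keeps the sketch readable; it costs up to one request per candidate character, so a real test would usually throttle requests and record each confirmed character as evidence.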
How to fix it
Use parameterised XPath APIs
Many XPath implementations support variable bindings. In Java, javax.xml.xpath accepts an XPathVariableResolver through XPath.setXPathVariableResolver. In .NET, XPathExpression.SetContext accepts a custom XsltContext whose ResolveVariable method returns an IXsltContextVariable. In XQuery, declare external variables and bind them at execution time. Pass user input as a bound variable rather than embedding it in the expression string.
Escape on the way in if you cannot parameterise
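When a legacy library does not support binding, neutralise quote characters before the input reaches the expression. XPath 1.0 string literals have no escape sequence, and a numeric character reference such as &#39; is decoded by the XML layer before the XPath parser sees it, so it does not help inside a concatenated expression. Instead, delimit the value with the quote type it does not contain, or use XPath concat() to assemble strings that contain both quote characters. XPath 2.0 and later also accept a doubled quote inside a literal. Strict allowlists on input characters are a defensible last resort, but escaping or parameterising is preferable.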
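The concat() approach can be sketched as a small helper that turns any string into a valid XPath 1.0 string literal. The names toXPathLiteral and buildLoginExpr are hypothetical; the expression shape is the one from the worked example above.

```javascript
// Hypothetical helper: returns an XPath 1.0 string literal for any input.
function toXPathLiteral(value) {
  const s = String(value);
  // No single quote present: a single-quoted literal is safe.
  if (!s.includes("'")) return "'" + s + "'";
  // No double quote present: a double-quoted literal is safe.
  if (!s.includes('"')) return '"' + s + '"';
  // Both quote types present: split on single quotes and rejoin with
  // concat(), writing each single quote as the double-quoted literal "'".
  const parts = s.split("'").map((part) => "'" + part + "'");
  return "concat(" + parts.join(`, "'", `) + ")";
}

// The login expression rebuilt so input can only ever be a string value.
function buildLoginExpr(username, password) {
  return (
    "//user[name/text()=" + toXPathLiteral(username) +
    " and password/text()=" + toXPathLiteral(password) + "]"
  );
}
```

With this helper the payload ' or '1'='1 ends up inside a double-quoted literal and is compared as data rather than parsed as syntax. It remains a fallback: a bound variable avoids building the literal at all.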
Validate input against a tight schema
Reject input that violates an explicit type or length contract before it reaches the query. A username field that accepts XPath syntax characters is a missing input contract, not a query bug. Validation belongs at the input boundary, and it complements safe query construction rather than replacing it.
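An input contract of this kind can be as small as one pattern check. The rule below (3 to 32 characters drawn from letters, digits, dot, underscore, and hyphen) is an assumed example, not a recommendation for every application; the names USERNAME_PATTERN and isValidUsername are hypothetical.

```javascript
// Assumed contract for a username: 3 to 32 characters, limited to
// letters, digits, dot, underscore, and hyphen.
const USERNAME_PATTERN = /^[A-Za-z0-9._-]{3,32}$/;

// Returns true only for strings that satisfy the contract.
function isValidUsername(input) {
  return typeof input === "string" && USERNAME_PATTERN.test(input);
}
```

Quotes, brackets, parentheses, and whitespace all fall outside the allowed set, so the probe characters used against XPath predicates are rejected before any query is built.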
Suppress verbose error output
Default exception handlers should not return parser errors, namespace mismatches, or evaluation traces to the user. Log them server-side. Verbose XPath errors are often the difference between a probe and a working exploit.
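One way to keep parser detail out of responses is to wrap the lookup so that failures are logged in full on the server while the caller receives only a generic message. This is a minimal sketch: safeLookup, the runQuery callback, the log parameter, and the message text are all hypothetical.

```javascript
// Hypothetical wrapper: runs a lookup, keeps failure detail server-side,
// and returns only a generic error to the caller.
function safeLookup(runQuery, log = console.error) {
  try {
    return { ok: true, result: runQuery() };
  } catch (err) {
    // Full detail (expression, parser message, stack) goes to the server log.
    log("XPath lookup failed:", err);
    // The caller sees no expression, parser message, or stack trace.
    return { ok: false, error: "Request could not be processed." };
  }
}
```

Returning the same generic message for every failure also removes the error text as a side channel, although status and timing differences can still leak information.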
Move sensitive data off plain XML files
Authentication and authorisation rules stored as flat XML are a long-standing anti-pattern. Migrating to a hashed credential store (or, at minimum, a database with parameterised queries) eliminates the exposure XPath injection trades on, and removes the legacy file from the threat model entirely.
Reporting an XPath injection finding
XPath injection findings carry the most weight when the report names the exact endpoint, the input that reached the query, the payload that demonstrated the predicate manipulation, and the data the manipulated query returned. A finding written as "XPath injection in login form" with no payload and no extracted node is hard for engineering to reproduce and easy to dispute. A finding written with the request, the response, and the proof-of-concept extraction reads as a working exploit.
On a SecPortal engagement, the finding sits on the engagement record with the affected endpoint, the CVSS 3.1 vector (typically high to critical depending on what the document holds), the CWE-643 mapping, the request and response evidence, and the remediation guidance. Pentest engagement records keep the proof of concept and the retest verification on the same finding, so the close-out conversation references the original payload rather than reconstructing it from email. The finding triage workflow covers how to separate scanner-derived signals from manually validated XPath findings during the testing window.
Compliance impact
OWASP Top 10
A03:2021 Injection
OWASP ASVS
V5.3 Output Encoding and Injection Prevention
PCI DSS
Req. 6.2 Secure Development Practices
ISO 27001
A.8.28 Secure Coding Practices
A pentester checklist for XPath injection
The list below is the minimum coverage a tester should walk before declaring an XML-backed surface clear of XPath injection.
- Inventory every input that reaches an XML data store, an XQuery endpoint, or a SOAP service that uses XPath inside its handlers.
- Probe each input with quote and bracket characters, then look for parser errors, blank responses, status changes, or shifted result sets.
- Test always-true and always-false predicates side by side. A diverging response shape is the strongest signal for an exploitable filter.
- For blind cases, build a string-length and substring oracle and walk the document one character at a time. Capture the full extracted node as evidence.
- Where authentication is XML-backed, verify whether the predicate manipulation also returns the password node, the role, or the session attributes the application uses for authorisation.
- Record the CVSS vector, the CWE-643 mapping, the request and response evidence, and the affected file path or service so the finding is reproducible at retest.
Catch XPath injection before the report ships
SecPortal probes XML-backed inputs with quote-breaking and predicate-injection payloads, flags string-concatenated XPath in source, and keeps proof of concept attached to the finding through retest. Start free.