CWE-611: XML External Entity (XXE) Injection - PHP
Overview
PHP XML security depends on the PHP and libxml versions and the parse flags used. Modern PHP/libxml disables entity substitution by default, but XXE risk is reintroduced when code enables entity substitution, DTD loading, DTD validation, or external subsets for untrusted XML.
Primary Defence: For untrusted XML, avoid LIBXML_NOENT, LIBXML_DTDLOAD, LIBXML_DTDATTR, and LIBXML_DTDVALID; use LIBXML_NONET; reject DOCTYPE/ENTITY declarations unless explicitly required; on PHP < 8.0 call libxml_disable_entity_loader(true) before parsing, and on PHP 8.4+/libxml 2.13+ use LIBXML_NO_XXE if entity substitution or DTD features are unavoidable.
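As a minimal sketch of how these rules might be combined, the helper below builds a flag set for untrusted XML and adds LIBXML_NO_XXE only when the running PHP defines it. The names untrusted_xml_flags() and load_untrusted_xml() are hypothetical and not part of PHP; the flags and libxml functions are the ones named above.

```php
<?php
// Hypothetical helper (not part of PHP): flag set for untrusted XML.
function untrusted_xml_flags(): int
{
    // LIBXML_NONET forbids network access while loading documents.
    $flags = LIBXML_NONET;
    // LIBXML_NO_XXE is only defined on PHP 8.4+ built against libxml 2.13+.
    if (defined('LIBXML_NO_XXE')) {
        $flags |= constant('LIBXML_NO_XXE');
    }
    return $flags;
}

// Hypothetical wrapper: parses untrusted XML with the flags above.
function load_untrusted_xml(string $xml): DOMDocument
{
    if ($xml === '') {
        throw new InvalidArgumentException('Empty XML');
    }
    if (PHP_VERSION_ID < 80000) {
        // Deprecated since PHP 8.0, so only call it on older versions.
        libxml_disable_entity_loader(true);
    }
    $previous = libxml_use_internal_errors(true);
    try {
        $dom = new DOMDocument();
        $ok = $dom->loadXML($xml, untrusted_xml_flags());
    } finally {
        // Restore the caller's error-handling mode whatever happens.
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
    if (!$ok) {
        throw new InvalidArgumentException('Invalid XML');
    }
    return $dom;
}

$dom = load_untrusted_xml('<root><name>alice</name></root>');
echo $dom->getElementsByTagName('name')->item(0)->nodeValue, PHP_EOL;
```

Keeping the flag choice in one function means a later PHP upgrade only needs to be reflected in one place.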
Common Vulnerable Patterns
simplexml_load_string with Unsafe Flags
<?php
// VULNERABLE - entity substitution and DTD loading are enabled
$xml = $_POST['xml'];
$data = simplexml_load_string(
$xml,
'SimpleXMLElement',
LIBXML_NOENT | LIBXML_DTDLOAD
);
// Attacker sends:
// <?xml version="1.0"?>
// <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
// <root><name>&xxe;</name></root>
Why this is vulnerable:
- LIBXML_NOENT substitutes entities and LIBXML_DTDLOAD loads external subsets.
- Enables file disclosure, SSRF, and entity expansion DoS.
DOMDocument::loadXML with Unsafe Flags
<?php
// VULNERABLE - DOMDocument configured to load DTDs and substitute entities
$xml = file_get_contents('php://input');
$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NOENT | LIBXML_DTDLOAD); // DANGEROUS!
$name = $dom->getElementsByTagName('name')->item(0)->nodeValue;
Why this is vulnerable:
- Entity substitution and external DTD loading are explicitly enabled.
- Enables file disclosure and SSRF.
XMLReader with Unsafe Flags
<?php
// VULNERABLE - XMLReader configured to substitute entities
$xml = $_POST['xml'];
$reader = new XMLReader();
$reader->XML($xml, null, LIBXML_NOENT | LIBXML_DTDLOAD);
while ($reader->read()) {
if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'name') {
echo $reader->readString();
}
}
Why this is vulnerable:
- DTD loading and entity substitution are explicitly enabled.
- Enables file disclosure and SSRF via XML input.
SimpleXMLElement with Entity Substitution
<?php
// VULNERABLE - SimpleXMLElement with unsafe parser flags
$xml = $_POST['xml'];
$element = new SimpleXMLElement($xml, LIBXML_NOENT | LIBXML_DTDLOAD);
echo $element->name;
Why this is vulnerable:
- Entity substitution and DTD loading are enabled.
- Enables file disclosure and blind SSRF.
Secure Patterns
DOMDocument with Safe Flags
<?php
// SECURE - Avoid entity substitution and DTD loading
function parse_xml_secure($xml) {
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
// Do not pass LIBXML_NOENT, LIBXML_DTDLOAD, LIBXML_DTDATTR, or LIBXML_DTDVALID
$success = $dom->loadXML($xml, LIBXML_NONET);
libxml_use_internal_errors($previous);
if (!$success) {
throw new Exception('Invalid XML');
}
return $dom;
}
// Usage:
try {
$dom = parse_xml_secure($_POST['xml']);
$name = $dom->getElementsByTagName('name')->item(0)->nodeValue;
} catch (Exception $e) {
http_response_code(400);
echo json_encode(['error' => $e->getMessage()]);
}
Why this works:
- Avoids the flags that enable entity substitution or external DTD loading.
- LIBXML_NONET blocks network access if a future change introduces external loading.
- The PHP < 8.0 compatibility call disables external entity loading for older deployments.
simplexml_load_string with Safe Flags
<?php
// SECURE - Avoid entity substitution and DTD loading
function parse_simplexml_secure($xml) {
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
// Clear previous errors
libxml_clear_errors();
libxml_use_internal_errors(true);
// Parse XML
$data = simplexml_load_string(
$xml,
'SimpleXMLElement',
LIBXML_NONET | LIBXML_NOCDATA // Block network access; no entity substitution
);
if ($data === false) {
$errors = libxml_get_errors();
libxml_clear_errors();
throw new Exception('XML parse error: ' . json_encode($errors));
}
return $data;
}
// Usage:
try {
$xml_obj = parse_simplexml_secure($_POST['xml']);
$name = (string)$xml_obj->name;
$email = (string)$xml_obj->email;
} catch (Exception $e) {
http_response_code(400);
echo json_encode(['error' => $e->getMessage()]);
}
Why this works:
- Avoids entity substitution and DTD loading while blocking network access.
- SimpleXML can be used on untrusted input as long as no entity-substitution or DTD flags are passed.
XMLReader with Safe Flags
<?php
// SECURE - XMLReader without entity substitution or DTD loading
function parse_xmlreader_secure($xml) {
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
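$reader = new XMLReader();
if (!$reader->XML($xml, null, LIBXML_NONET)) {
throw new Exception('Invalid XML');
}
$data = [];
// Track the enclosing element so text is captured whether or not
// elements are separated by whitespace
$current = null;
while ($reader->read()) {
if ($reader->nodeType === XMLReader::ELEMENT) {
$current = $reader->name;
} elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
$current = null;
} elseif ($reader->nodeType === XMLReader::TEXT && $current !== null) {
$data[$current] = $reader->value;
}
}
$reader->close();
return $data;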
}
// Usage:
try {
$data = parse_xmlreader_secure($_POST['xml']);
echo json_encode($data);
} catch (Exception $e) {
http_response_code(400);
echo json_encode(['error' => $e->getMessage()]);
}
Why this works:
- Avoids entity substitution and blocks network access for streaming parsing.
- Walks the document node by node instead of building a full tree, which keeps memory use low for large inputs.
Validating XML Before Parsing
<?php
// SECURE - Block DOCTYPE entirely
function validate_and_parse_xml($xml) {
// Block if contains DOCTYPE or ENTITY declarations
if (stripos($xml, '<!DOCTYPE') !== false) {
throw new Exception('DOCTYPE not allowed');
}
if (stripos($xml, '<!ENTITY') !== false) {
throw new Exception('ENTITY declarations not allowed');
}
// Check for entity references (except safe ones)
if (preg_match('/&(?!amp;|lt;|gt;|quot;|apos;)[a-zA-Z0-9_]+;/', $xml)) {
throw new Exception('Custom entity references not allowed');
}
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
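$dom = new DOMDocument();
// loadXML() returns false on malformed input; do not hand back a half-built document
if (!$dom->loadXML($xml, LIBXML_NONET)) {
throw new Exception('Invalid XML');
}
return $dom;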
}
Why this works:
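- Rejects DOCTYPE/ENTITY declarations and custom entity references before the parser sees them.
- The string checks assume an ASCII-compatible encoding such as UTF-8; a UTF-16 document would not match them, so parser hardening remains the essential second safety layer.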
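A complementary check, sketched here under stated assumptions, is to ask the parser itself whether the document carried a DOCTYPE. Because it inspects the parsed document rather than the raw bytes, it does not depend on the input encoding. The function name parse_without_doctype() is hypothetical. The parser still reads the internal subset before the document is rejected, so this is meant to sit alongside the safe flags, not replace them.

```php
<?php
// Hypothetical helper: parse with safe flags, then refuse any document
// that declared a DOCTYPE.
function parse_without_doctype(string $xml): DOMDocument
{
    if ($xml === '') {
        throw new InvalidArgumentException('Empty XML');
    }
    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException('Invalid XML');
    }
    // DOMDocument::$doctype is null when no DOCTYPE declaration was present.
    if ($dom->doctype !== null) {
        throw new InvalidArgumentException('DOCTYPE not allowed');
    }
    return $dom;
}

try {
    parse_without_doctype(
        '<?xml version="1.0"?><!DOCTYPE user [<!ENTITY x "y">]><user><name>&x;</name></user>'
    );
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```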
Framework-Specific Guidance
Use these patterns as starting points for common PHP frameworks:
- Enforce application/xml content types before parsing.
- Centralize secure XML parsing helpers and reuse them.
- Validate extracted fields with framework validators.
Laravel
<?php
// SECURE - Laravel controller with secure XML parsing
namespace App\Http\Controllers;
use Illuminate\Http\Request;
use Illuminate\Http\JsonResponse;
use App\Models\User; // assumes the default Laravel 8+ model location
class XmlController extends Controller
{
public function processXml(Request $request): JsonResponse
{
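// Validate content type, ignoring parameters such as "; charset=utf-8"
$mediaType = strtolower(trim(explode(';', (string) $request->header('Content-Type'))[0]));
if ($mediaType !== 'application/xml') {
return response()->json(['error' => 'Invalid content type'], 400);
}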
try {
$xml = $request->getContent();
// Parse securely
$dom = $this->parseXmlSecure($xml);
// Extract data
$name = $dom->getElementsByTagName('name')->item(0)?->nodeValue;
$email = $dom->getElementsByTagName('email')->item(0)?->nodeValue;
// Validate
$validated = validator([
'name' => $name,
'email' => $email
], [
'name' => 'required|string|max:100',
'email' => 'required|email'
])->validate();
// Create user
$user = User::create($validated);
return response()->json($user, 201);
} catch (\Exception $e) {
return response()->json(['error' => $e->getMessage()], 400);
}
}
private function parseXmlSecure(string $xml): \DOMDocument
{
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
libxml_use_internal_errors(true);
$dom = new \DOMDocument();
if (!$dom->loadXML($xml, LIBXML_NONET)) {
$errors = libxml_get_errors();
libxml_clear_errors();
throw new \Exception('Invalid XML: ' . json_encode($errors));
}
return $dom;
}
}
Why this works:
- Validates application/xml and uses a hardened parser.
- Uses framework validation before persistence.
Symfony
<?php
// SECURE - Symfony controller with XML parsing
namespace App\Controller;
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\Validator\Validator\ValidatorInterface;
class ApiController extends AbstractController
{
public function processXml(Request $request, ValidatorInterface $validator): JsonResponse
{
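// getContentType() maps application/xml and text/xml to 'xml';
// Symfony 6.2+ deprecates it in favour of getContentTypeFormat()
if ($request->getContentType() !== 'xml') {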
return new JsonResponse(['error' => 'Content-Type must be application/xml'], 400);
}
try {
$xml = $request->getContent();
// Parse with security
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
libxml_use_internal_errors(true);
$dom = new \DOMDocument();
if (!$dom->loadXML($xml, LIBXML_NONET)) {
throw new \Exception('Invalid XML');
}
// Extract and create entity
$user = new User();
$user->setName($dom->getElementsByTagName('name')->item(0)?->nodeValue ?? '');
$user->setEmail($dom->getElementsByTagName('email')->item(0)?->nodeValue ?? '');
// Validate entity
$errors = $validator->validate($user);
if (count($errors) > 0) {
return new JsonResponse(['errors' => (string)$errors], 400);
}
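// Save (getDoctrine() exists only up to Symfony 5.4; on Symfony 6+,
// inject Doctrine\ORM\EntityManagerInterface into the action instead)
$entityManager = $this->getDoctrine()->getManager();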
$entityManager->persist($user);
$entityManager->flush();
return new JsonResponse([
'id' => $user->getId(),
'name' => $user->getName(),
'email' => $user->getEmail()
], 201);
} catch (\Exception $e) {
return new JsonResponse(['error' => $e->getMessage()], 400);
}
}
}
Why this works:
- Enforces XML content type and avoids entity substitution or DTD loading.
- Validates the entity before saving.
RSS/Atom Feed Parsing
<?php
// SECURE - Parse RSS feed safely
function parse_rss_feed($feed_url) {
// Fetch feed
$xml = file_get_contents($feed_url);
if ($xml === false) {
throw new Exception('Failed to fetch feed');
}
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
// Parse with SimpleXML
$feed = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NONET);
if ($feed === false) {
throw new Exception('Invalid RSS feed');
}
$items = [];
foreach ($feed->channel->item as $item) {
$items[] = [
'title' => (string)$item->title,
'link' => (string)$item->link,
'description' => (string)$item->description
];
}
return $items;
}
// Usage:
try {
$items = parse_rss_feed('https://example.com/feed.xml');
foreach ($items as $item) {
echo htmlspecialchars($item['title']) . "<br>";
}
} catch (Exception $e) {
error_log($e->getMessage());
}
SOAP Client
<?php
// SECURE - SOAP client with XXE protection
class SecureSoapClient extends SoapClient {
public function __doRequest($request, $location, $action, $version, $one_way = 0) {
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
return parent::__doRequest($request, $location, $action, $version, $one_way);
}
}
// Usage:
$wsdl = 'https://example.com/service?wsdl';
$options = [
'trace' => 1,
'exceptions' => true,
'features' => SOAP_SINGLE_ELEMENT_ARRAYS
];
$client = new SecureSoapClient($wsdl, $options);
try {
$result = $client->someMethod(['param' => 'value']);
} catch (SoapFault $e) {
error_log('SOAP error: ' . $e->getMessage());
}
Input Validation
<?php
// Validate XML structure and content
function validate_user_xml($xml) {
// Pre-validation: Block dangerous patterns
if (preg_match('/<!DOCTYPE|<!ENTITY/i', $xml)) {
throw new Exception('DOCTYPE and ENTITY declarations not allowed');
}
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
$dom = new DOMDocument();
if (!$dom->loadXML($xml, LIBXML_NONET)) {
throw new Exception('Invalid XML format');
}
// Validate structure
$root = $dom->documentElement;
if ($root->tagName !== 'user') {
throw new Exception('Root element must be <user>');
}
// Extract elements
$name = $dom->getElementsByTagName('name')->item(0);
$email = $dom->getElementsByTagName('email')->item(0);
// Validate presence
if (!$name || !$name->nodeValue) {
throw new Exception('Name is required');
}
if (!$email || !$email->nodeValue) {
throw new Exception('Email is required');
}
// Validate content
$nameValue = trim($name->nodeValue);
$emailValue = trim($email->nodeValue);
if (strlen($nameValue) > 100) {
throw new Exception('Name too long');
}
if (!filter_var($emailValue, FILTER_VALIDATE_EMAIL)) {
throw new Exception('Invalid email format');
}
return [
'name' => $nameValue,
'email' => $emailValue
];
}
PHP Configuration
; php.ini security notes
; No php.ini directive controls external entity loading; it is governed by
; libxml defaults and by the flags passed at parse time.
; PHP 8.0+: external entity loading is disabled by default, and
; libxml_disable_entity_loader() is deprecated.
; For older PHP versions, always call in code:
; libxml_disable_entity_loader(true);
Remediation Steps
- Locate every XML parsing path, including DOMDocument, SimpleXML, XMLReader, SOAP clients, RSS/Atom feed parsing, uploaded XML/SVG files, and framework helpers.
- Identify whether untrusted XML can reach DTD loading, entity substitution, validation, external subsets, or downstream reparsing.
- Remove dangerous parse flags such as LIBXML_NOENT, LIBXML_DTDLOAD, LIBXML_DTDATTR, and LIBXML_DTDVALID from untrusted XML paths.
- Reject DOCTYPE and ENTITY declarations at the application boundary unless a documented trusted integration requires them.
- Use LIBXML_NONET, safe parser wrappers, and PHP-version-specific protections such as libxml_disable_entity_loader(true) on PHP < 8.0.
- Add size limits, schema validation where appropriate, and monitoring for unexpected file or network access during XML processing.
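The first and third steps can be supported by a simple text scan. The sketch below is a hypothetical helper, find_risky_libxml_flags(), that reports which lines of a PHP source string mention the dangerous flags. It is a plain pattern match, so it also reports mentions in comments, and it is no substitute for a real static analysis tool.

```php
<?php
// Hypothetical scanner: reports lines of PHP source that mention risky libxml flags.
function find_risky_libxml_flags(string $source): array
{
    $pattern = '/\bLIBXML_(?:NOENT|DTDLOAD|DTDATTR|DTDVALID)\b/';
    $hits = [];
    foreach (preg_split('/\R/', $source) as $index => $line) {
        if (preg_match_all($pattern, $line, $matches)) {
            $hits[] = [
                'line'  => $index + 1, // 1-based line number
                'flags' => array_values(array_unique($matches[0])),
            ];
        }
    }
    return $hits;
}

// Illustrative input: one unsafe call and one safe call.
$sample = <<<'PHP'
$dom->loadXML($xml, LIBXML_NOENT | LIBXML_DTDLOAD);
$dom->loadXML($xml, LIBXML_NONET);
PHP;

foreach (find_risky_libxml_flags($sample) as $hit) {
    echo 'line ', $hit['line'], ': ', implode(', ', $hit['flags']), PHP_EOL;
}
```

In practice the source string would come from file_get_contents() over each file in the code base.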
Testing
- Test normal XML, SOAP, and RSS/Atom payloads expected by the application.
- Test file-disclosure XXE payloads that reference local files and confirm they are rejected or never expanded.
- Test SSRF-style external entities that reference internal HTTP endpoints or cloud metadata addresses.
- Test Billion Laughs or nested entity expansion payloads with strict input size and timeout limits.
- Test mixed-case DOCTYPE and ENTITY declarations across request bodies, uploads, queues, and third-party responses.
- Retest static analysis findings for dangerous libxml flags and review runtime logs for blocked XML parsing attempts.
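The file-disclosure test can be automated with a canary file. The following is a sketch under stated assumptions: xxe_canary_leaks() is a hypothetical helper, the parser under test is passed in as a callable that returns whatever text the application would expose, and the file:// URL is built for a Unix-style temporary path.

```php
<?php
// Hypothetical regression check: returns true if the parser's output
// contains the contents of a local file referenced by an external entity.
function xxe_canary_leaks(callable $parse): bool
{
    $canary = 'XXE-CANARY-' . bin2hex(random_bytes(8));
    $path = tempnam(sys_get_temp_dir(), 'xxe');
    file_put_contents($path, $canary);

    $payload = '<?xml version="1.0"?>'
        . '<!DOCTYPE root [<!ENTITY xxe SYSTEM "file://' . $path . '">]>'
        . '<root><name>&xxe;</name></root>';
    try {
        $output = (string) $parse($payload);
    } catch (Throwable $e) {
        // A parser that rejects the payload outright has not leaked anything.
        $output = '';
    } finally {
        unlink($path);
    }
    return strpos($output, $canary) !== false;
}

// Parser under test: DOMDocument with LIBXML_NONET only, returning the
// serialized document plus its text content.
$hardened = function (string $xml): string {
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return $ok ? $dom->saveXML() . $dom->documentElement->textContent : '';
};

var_dump(xxe_canary_leaks($hardened));
```

A result of true for any production parsing path would indicate that entity expansion reached the file system.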
Common Pitfalls
- Assuming modern PHP defaults stay safe after adding LIBXML_NOENT, DTD loading, or validation flags.
- Calling libxml_disable_entity_loader(true) on older PHP but still passing unsafe parse flags.
- Blocking only DOCTYPE while allowing ENTITY declarations or downstream reparsing.
- Sanitizing or validating XML after entity expansion has already occurred.
- Treating LIBXML_NONET as protection against local file disclosure; it only blocks network access.
- Protecting request-body XML but forgetting SOAP clients, feed readers, uploaded SVG/XML files, and queued payloads.
Dependencies and Installation
- PHP DOM, SimpleXML, XMLReader, SOAP, and libxml behavior varies by PHP and libxml version; verify production versions.
- libxml_disable_entity_loader() is relevant for PHP < 8.0 and deprecated in PHP 8.0 because external entity loading is disabled by default.
- LIBXML_NO_XXE is available only on newer PHP/libxml combinations and should be used as additional protection when DTD/entity features are unavoidable.
- Keep PHP and libxml current, and avoid parser wrappers that hide unsafe libxml flags.
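To verify what a given deployment actually runs, a small diagnostic along these lines may help. The function name xml_runtime_report() is hypothetical; the constants it reads are standard PHP constants, and LIBXML_DOTTED_VERSION reports the libxml version PHP was built against.

```php
<?php
// Hypothetical diagnostic: summarises the XML-relevant facts of this runtime.
function xml_runtime_report(): array
{
    return [
        'php_version'              => PHP_VERSION,
        'libxml_version'           => LIBXML_DOTTED_VERSION,
        // libxml_disable_entity_loader(true) is only needed before PHP 8.0.
        'needs_entity_loader_call' => PHP_VERSION_ID < 80000,
        // Defined only on PHP 8.4+ built against libxml 2.13+.
        'has_libxml_no_xxe'        => defined('LIBXML_NO_XXE'),
    ];
}

foreach (xml_runtime_report() as $key => $value) {
    echo $key, ': ', var_export($value, true), PHP_EOL;
}
```

Running it in each environment (CLI, FPM, containers) is worthwhile because they can be built against different libxml versions.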