
CWE-91: XML Injection - C#

Overview

XML Injection in C#/.NET applications occurs when untrusted user input is used to construct XML documents without proper validation or escaping. Attackers can manipulate XML structure by injecting special characters like <, >, &, ', and ", leading to data corruption, authentication bypass, or information disclosure.

Primary Defence: Build XML with LINQ to XML (XElement, XAttribute) or XmlWriter rather than string concatenation, so that values are escaped by the API. Where XML must be assembled as text, validate the input and escape it with SecurityElement.Escape() first. Validate documents against an XML Schema to enforce structure, disable DTD processing (XmlReaderSettings.DtdProcessing = DtdProcessing.Prohibit) to prevent XXE attacks, and keep user input out of XPath expression text (bind it as an XPath variable or compare node values in code) to prevent XPath injection.
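
The DTD setting mentioned above applies when parsing XML that comes from outside the application. A minimal sketch, assuming the untrusted document arrives as a string; the class and method names (SafeXmlLoader, Parse) are illustrative and not part of any library:

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class SafeXmlLoader
{
    // Parses untrusted XML text with DTD processing prohibited, so any
    // DOCTYPE declaration (and the entities it defines) raises XmlException.
    public static XDocument Parse(string untrustedXml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit, // reject <!DOCTYPE ...>
            XmlResolver = null                      // never resolve external resources
        };

        using (var text = new StringReader(untrustedXml))
        using (var reader = XmlReader.Create(text, settings))
        {
            return XDocument.Load(reader);
        }
    }
}
```

A document without a DOCTYPE parses normally; one that declares entities is rejected before any entity is expanded.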

Common C# XML Vulnerability Scenarios:

  • Building XML with string concatenation or interpolation
  • Using user input directly in XML elements or attributes
  • SOAP/REST XML payloads with unsanitized data
  • XML configuration files with user data
  • XPath query injection

C#/.NET XML APIs:

  • System.Xml.Linq (LINQ to XML): Modern, recommended API
  • System.Xml: Legacy XML APIs (XmlDocument, XmlWriter)
  • System.Xml.Serialization: XML serialization
  • System.Xml.XPath: XPath queries
  • System.ServiceModel: WCF/SOAP services
  • System.Security.SecurityElement.Escape: XML escaping utility

XML Special Characters Requiring Escaping:

  • < → &lt;
  • > → &gt;
  • & → &amp;
  • ' → &apos;
  • " → &quot;
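
The mapping above can be checked against SecurityElement.Escape, which replaces exactly these five characters. A small sketch; the wrapper class EscapeDemo is a hypothetical name used only for illustration:

```csharp
using System.Security;

public static class EscapeDemo
{
    // Thin wrapper so the call can be exercised in isolation.
    public static string Escape(string raw) => SecurityElement.Escape(raw);
}

// EscapeDemo.Escape("<a href=\"x\">Tom & 'Jerry'</a>")
// returns: &lt;a href=&quot;x&quot;&gt;Tom &amp; &apos;Jerry&apos;&lt;/a&gt;
```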

Common Vulnerable Patterns

String Interpolation

// VULNERABLE - Direct string interpolation
using System;

public class VulnerableXmlBuilder
{
    public string CreateUserXml(string username, string email)
    {
        // VULNERABLE - User input directly in XML string
        string xml = $@"<?xml version=""1.0""?>
<user>
    <username>{username}</username>
    <email>{email}</email>
</user>";

        return xml;
    }
}

// Attack: username = "</username><admin>true</admin><username>"
// Result: <username></username><admin>true</admin><username></username>
// Creates unintended <admin> element

Why this is vulnerable:

  • No escaping of XML special characters
  • String interpolation allows injection
  • Can modify XML structure
  • Bypasses validation
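
To see the effect from the consumer's side, the sketch below builds a document with the same interpolation and then parses it. InjectionDemo and its methods are hypothetical helpers written for this illustration, and the XML is reduced to a single field for brevity:

```csharp
using System.Linq;
using System.Xml.Linq;

public static class InjectionDemo
{
    // Same technique as the vulnerable CreateUserXml: raw interpolation.
    public static string BuildUnsafe(string username) =>
        $"<user><username>{username}</username></user>";

    // Returns the local names of the root's child elements after parsing.
    public static string[] ChildNames(string xml) =>
        XDocument.Parse(xml).Root.Elements()
                 .Select(e => e.Name.LocalName)
                 .ToArray();
}

// ChildNames(BuildUnsafe("alice"))
//   → ["username"]
// ChildNames(BuildUnsafe("</username><admin>true</admin><username>"))
//   → ["username", "admin", "username"]
```

The payload yields a well-formed document, so a parser raises no error and downstream code sees an <admin> element the application never intended to write.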

ASP.NET Core REST API

// VULNERABLE - ASP.NET Core endpoint returning XML
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/[controller]")]
public class VulnerableUserController : ControllerBase
{
    [HttpGet]
    [Produces("application/xml")]
    public IActionResult GetUser([FromQuery] string username, [FromQuery] string email)
    {
        // VULNERABLE - Query parameters in XML
        string xmlResponse = $@"<?xml version=""1.0"" encoding=""UTF-8""?>
<response>
    <user>
        <name>{username}</name>
        <email>{email}</email>
    </user>
</response>";

        return Content(xmlResponse, "application/xml");
    }
}

// Attack: username = "<admin>true</admin>"
// Response contains: <name><admin>true</admin></name>

Why this is vulnerable:

  • ASP.NET Core doesn't auto-escape manual XML strings
  • Request parameters directly in XML
  • No validation or sanitization
  • Information disclosure possible

SOAP Request Construction

// VULNERABLE - Building SOAP XML manually
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

public class VulnerableSoapClient
{
    public async Task<string> CallSoapService(string userId, string action)
    {
        // VULNERABLE - User input in SOAP envelope
        string soapEnvelope = $@"<?xml version=""1.0""?>
<soap:Envelope xmlns:soap=""http://schemas.xmlsoap.org/soap/envelope/"">
    <soap:Body>
        <GetUserData>
            <UserId>{userId}</UserId>
            <Action>{action}</Action>
        </GetUserData>
    </soap:Body>
</soap:Envelope>";

        using (HttpClient client = new HttpClient())
        {
            var content = new StringContent(soapEnvelope, Encoding.UTF8, "text/xml");
            var response = await client.PostAsync("https://api.example.com/soap", content);
            return await response.Content.ReadAsStringAsync();
        }
    }
}

// Attack: userId = "</UserId><Role>admin</Role><UserId>"
// Injects admin role into SOAP request

Why this is vulnerable:

  • SOAP envelope built with string interpolation
  • Allows element injection
  • Can escalate privileges
  • Modify request structure
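
One way to build the same envelope without interpolation is LINQ to XML with an XNamespace. This is a sketch under the assumption that the envelope shape matches the vulnerable example; SoapEnvelopeBuilder is an illustrative name, and sending the request is left out:

```csharp
using System.Xml.Linq;

public static class SoapEnvelopeBuilder
{
    private static readonly XNamespace Soap =
        "http://schemas.xmlsoap.org/soap/envelope/";

    // Builds the envelope as a tree; userId and action become text nodes,
    // so markup characters in them are escaped on serialization.
    public static string Build(string userId, string action)
    {
        var envelope = new XElement(Soap + "Envelope",
            new XAttribute(XNamespace.Xmlns + "soap", Soap.NamespaceName),
            new XElement(Soap + "Body",
                new XElement("GetUserData",
                    new XElement("UserId", userId),
                    new XElement("Action", action))));

        return envelope.ToString(SaveOptions.DisableFormatting);
    }
}
```

With userId set to the attack string above, the <GetUserData> element still has exactly two children and no <Role> element appears, because the payload is carried as escaped text.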

XML Configuration Files

// VULNERABLE - Writing XML config with user data
using System.IO;

public class VulnerableConfigWriter
{
    public void SaveUserSettings(string username, string theme, string language)
    {
        // VULNERABLE - User input in XML config
        string configXml = $@"<?xml version=""1.0""?>
<config>
    <user>{username}</user>
    <preferences>
        <theme>{theme}</theme>
        <language>{language}</language>
    </preferences>
</config>";

        File.WriteAllText("config.xml", configXml);
    }
}

// Attack: theme = "</theme><admin_access>true</admin_access><theme>"
// Modifies configuration structure

Why this is vulnerable:

  • Configuration files parsed by XML parser
  • Persistent injection
  • Can modify application behavior
  • Privilege escalation

XmlDocument with String Input

// VULNERABLE - XmlDocument parsing of concatenated string
using System.Xml;

public class VulnerableXmlDocBuilder
{
    public XmlDocument CreateXmlResponse(string data)
    {
        // VULNERABLE - Building XML string manually
        string xmlStr = $@"<response>
    <status>success</status>
    <data>{data}</data>
</response>";

        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xmlStr);

        return doc;
    }
}

// Attack: data = "</data><malicious>payload</malicious><data>"
// Injects malicious elements

Why this is vulnerable:

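  • XmlDocument.LoadXml parses whatever string it is given; it cannot escape values that were concatenated beforehand
  • The injected payload is well-formed XML, so the parser accepts it without error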
  • Injection before parsing
  • No validation
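
If XmlDocument has to be used, the same response can be assembled through its DOM methods instead of LoadXml on an interpolated string. A minimal sketch; SafeXmlDocBuilder is a hypothetical name:

```csharp
using System.Xml;

public static class SafeXmlDocBuilder
{
    public static XmlDocument CreateXmlResponse(string data)
    {
        var doc = new XmlDocument();

        XmlElement root = doc.CreateElement("response");
        doc.AppendChild(root);

        XmlElement status = doc.CreateElement("status");
        status.InnerText = "success";
        root.AppendChild(status);

        XmlElement dataNode = doc.CreateElement("data");
        dataNode.InnerText = data; // stored as a text node, never parsed as markup
        root.AppendChild(dataNode);

        return doc;
    }
}
```

Assigning to InnerText stores the value as text, so the attack string above ends up as the literal content of <data> rather than as new elements. Assigning the same value to InnerXml would parse it as markup and reintroduce the problem.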

XPath Query Injection

// VULNERABLE - User input in XPath query
using System.Xml;
using System.Xml.XPath;

public class VulnerableXPathQuery
{
    public XmlNodeList FindUserByName(XmlDocument doc, string username)
    {
        // VULNERABLE - XPath injection
        string xpathExpr = $"//user[name='{username}']";

        return doc.SelectNodes(xpathExpr);
    }
}

// Attack: username = "' or '1'='1"
// XPath: //user[name='' or '1'='1']
// Returns all users

Why this is vulnerable:

  • XPath query built with string interpolation
  • Boolean-based injection
  • Bypasses authentication checks
  • Information disclosure
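
The boolean-based attack can be reproduced against a two-user document. The sketch below uses the same interpolated XPath as the vulnerable method; XPathInjectionDemo and the sample data are made up for the illustration:

```csharp
using System.Xml;

public static class XPathInjectionDemo
{
    // Counts the <user> nodes selected by the interpolated (unsafe) query.
    public static int CountMatches(string username)
    {
        var doc = new XmlDocument();
        doc.LoadXml(
            "<users>" +
            "<user><name>alice</name></user>" +
            "<user><name>bob</name></user>" +
            "</users>");

        string xpath = $"//user[name='{username}']"; // unsafe on purpose
        return doc.SelectNodes(xpath).Count;
    }
}

// CountMatches("alice")        → 1
// CountMatches("' or '1'='1")  → 2 (the predicate is always true)
```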

XML Attribute Injection

// VULNERABLE - User input in XML attributes
public class VulnerableAttributeBuilder
{
    public string CreateElement(string name, string value, string attrValue)
    {
        // VULNERABLE - Attribute injection
        string xml = $@"<element name=""{name}"" value=""{value}"" custom=""{attrValue}""/>";
        return xml;
    }
}

// Attack: attrValue = "test\" malicious=\"true"
// Result: <element name="..." value="..." custom="test" malicious="true"/>
// Injects additional attributes

Why this is vulnerable:

  • An unescaped double quote in the input closes the attribute value early
  • Allows additional attribute injection
  • Modifies element properties
  • Can bypass security checks
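
A possible fix for this case is to create attributes with XAttribute, which escapes the quote character when the element is serialized. A sketch using the same signature as the vulnerable method; SafeAttributeBuilder is an illustrative name:

```csharp
using System.Xml.Linq;

public static class SafeAttributeBuilder
{
    // Each value becomes an attribute node; quotes inside it are written
    // as &quot; so they cannot terminate the attribute.
    public static string CreateElement(string name, string value, string attrValue)
    {
        var element = new XElement("element",
            new XAttribute("name", name),
            new XAttribute("value", value),
            new XAttribute("custom", attrValue));

        return element.ToString();
    }
}
```

With the attack string above, the output still carries three attributes and the whole payload is the value of custom. Note that the XAttribute constructor throws ArgumentNullException for a null value, so callers may want to check for null first.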

Blazor Component with XML

// VULNERABLE - Blazor component generating XML
using Microsoft.AspNetCore.Components;
using Microsoft.AspNetCore.Components.Web;

public partial class XmlExportComponent : ComponentBase
{
    [Parameter]
    public string Username { get; set; }

    [Parameter]
    public string Email { get; set; }

    private string GenerateXml()
    {
        // VULNERABLE - Component parameters in XML
        return $@"<?xml version=""1.0""?>
<user>
    <username>{Username}</username>
    <email>{Email}</email>
</user>";
    }
}

Why this is vulnerable:

  • Blazor component parameters directly in XML
  • No validation or escaping
  • Client-side injection point
  • Framework doesn't prevent injection

Secure Patterns

LINQ to XML (XDocument)

// SECURE - Using LINQ to XML (recommended)
using System;
using System.Linq;
using System.Xml.Linq;
using System.Text.RegularExpressions;

public class SecureXmlBuilder
{
    private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{1,100}$");
    private static readonly Regex EmailPattern = 
        new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");

    public string CreateUserXml(string username, string email)
    {
        // SECURE - Validate inputs
        if (!UsernamePattern.IsMatch(username))
        {
            throw new ArgumentException("Invalid username");
        }
        if (!EmailPattern.IsMatch(email))
        {
            throw new ArgumentException("Invalid email");
        }

        // SECURE - Use LINQ to XML API
        XDocument doc = new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("user",
                new XElement("username", username),  // Automatically escaped
                new XElement("email", email)         // Automatically escaped
            )
        );

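        // XDocument.ToString() omits the XML declaration, so prepend it explicitly
        return doc.Declaration + Environment.NewLine + doc.ToString();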
    }
}

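// Example usage:
// CreateUserXml("<script>alert('xss')</script>", "test@example.com")
// Result: ArgumentException("Invalid username") - the allowlist rejects the input
//
// If the same value reached XElement without validation, it would be serialized as:
// <username>&lt;script&gt;alert('xss')&lt;/script&gt;</username>
// Markup characters are escaped, so no new elements are created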

Why this works: LINQ to XML (XDocument, XElement) automatically escapes markup-significant characters (<, >, and & in element text, plus double quotes in attribute values) when the tree is serialized, preventing attackers from injecting closing tags like </username><admin>true</admin><username>. The XElement constructor treats the second parameter as text content, not markup - so even if username contains <admin>true</admin>, it becomes &lt;admin&gt;true&lt;/admin&gt; in the output XML.

Regex validation provides defense-in-depth: the username pattern (^[a-zA-Z0-9._-]{1,100}$) blocks XML metacharacters before they reach the API, and email validation prevents addresses like admin@example.com</email><role>admin</role><email>.

Declarative XML construction via nested XElement calls ensures well-formed documents - the API produces the nesting and closing tags itself, so element content cannot alter the structure. Note that XDocument.ToString() serializes the element tree but omits the XML declaration; call Save() with a TextWriter or stream, or prepend doc.Declaration, when the declaration must appear in the output. This pattern resists injection because the API maintains a tree in memory and serializes it with escaping, never concatenating raw strings.
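
One detail worth checking when serializing: XDocument.ToString() returns the tree without the XML declaration. The sketch below shows one common way to get a string that includes a utf-8 declaration, by saving to a StringWriter whose Encoding property is overridden. XDocumentSerializer and Utf8StringWriter are hypothetical helper names, not framework types:

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class XDocumentSerializer
{
    // StringWriter reports UTF-16 by default; overriding Encoding makes
    // the declaration written by Save() say utf-8 instead.
    private sealed class Utf8StringWriter : StringWriter
    {
        public override Encoding Encoding => Encoding.UTF8;
    }

    public static string ToXmlWithDeclaration(XDocument doc)
    {
        using (var writer = new Utf8StringWriter())
        {
            doc.Save(writer); // writes the declaration followed by the tree
            return writer.ToString();
        }
    }
}
```

The returned value is still a .NET string; the declared encoding only becomes true once the caller encodes it as UTF-8 bytes, for example when writing the HTTP response.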

XmlWriter for Streaming

// SECURE - Using XmlWriter
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class SecureXmlStreamWriter
{
    private static readonly Regex KeyPattern = new Regex(@"^[a-zA-Z_][a-zA-Z0-9_-]*$");

    public string CreateXmlWithWriter(Dictionary<string, string> data)
    {
        // SECURE - Validate input
        foreach (var kvp in data)
        {
            if (!KeyPattern.IsMatch(kvp.Key))
            {
                throw new ArgumentException($"Invalid XML element name: {kvp.Key}");
            }
            if (kvp.Value != null && kvp.Value.Length > 1000)
            {
                throw new ArgumentException("Value too long");
            }
        }

        StringBuilder sb = new StringBuilder();
        XmlWriterSettings settings = new XmlWriterSettings
        {
            Indent = true,
            IndentChars = "  ",
            NewLineChars = "\n",
            NewLineHandling = NewLineHandling.Replace,
            Encoding = Encoding.UTF8
        };

        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("response");

            writer.WriteElementString("status", "success");

            writer.WriteStartElement("data");

            foreach (var kvp in data)
            {
                writer.WriteElementString(kvp.Key, kvp.Value);  // Auto-escaped
            }

            writer.WriteEndElement(); // data
            writer.WriteEndElement(); // response
            writer.WriteEndDocument();
        }

        return sb.ToString();
    }
}

Why this works: XmlWriter provides low-level control with automatic escaping - methods like WriteElementString() and WriteAttributeString() escape XML entities (< → &lt;, & → &amp;) without requiring manual SecurityElement.Escape() calls. The streaming API writes XML incrementally, so when the target is a file or stream it is memory-efficient for large documents (gigabytes) compared to LINQ to XML's in-memory tree; writing to a StringBuilder, as in this example, still holds the whole output in memory.

Element name validation (^[a-zA-Z_][a-zA-Z0-9_-]*$) prevents injection via malformed element names like user><admin>true</admin><user - XML parsers reject invalid element names, but validating upfront provides fail-fast behavior. Length limits (1000 chars) prevent DoS via extremely long values that could exhaust memory or cause parser hangs.

XmlWriterSettings gives consistent formatting (Indent, NewLineHandling). The Encoding setting applies only when the writer targets a stream or file; when writing to a StringBuilder, as here, the result is a UTF-16 .NET string and the declaration reads encoding="utf-16". XmlWriter also tracks open elements: calling WriteEndElement with nothing open throws InvalidOperationException, and WriteEndDocument closes any elements that are still open, so the output stays well-formed. This pattern is ideal for generating large XML files (exports, feeds) where LINQ to XML's memory overhead is prohibitive.
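
When the output needs to be UTF-8 bytes with a matching declaration, the writer can target a stream instead of a StringBuilder. A minimal sketch with a single element; Utf8XmlWriterDemo and WriteStatus are illustrative names:

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class Utf8XmlWriterDemo
{
    public static byte[] WriteStatus(string status)
    {
        var settings = new XmlWriterSettings
        {
            // UTF-8 without a byte order mark; honored because the target is a stream
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false),
            Indent = true
        };

        using (var stream = new MemoryStream())
        {
            using (var writer = XmlWriter.Create(stream, settings))
            {
                writer.WriteStartDocument();
                writer.WriteStartElement("response");
                writer.WriteElementString("status", status); // text is escaped
                writer.WriteEndElement();
                writer.WriteEndDocument();
            } // disposing the writer flushes it into the stream

            return stream.ToArray();
        }
    }
}
```

Decoding the returned bytes as UTF-8 gives a document that begins with <?xml version="1.0" encoding="utf-8"?>. For large exports a FileStream or the response body stream would take the place of the MemoryStream.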

SecurityElement.Escape

// SECURE - Using SecurityElement.Escape for manual escaping
using System;
using System.Security;
using System.Text.RegularExpressions;

public class SecureXmlEscaper
{
    private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{1,100}$");

    public string CreateUserXmlWithEscaping(string username, string email)
    {
        // SECURE - Validate inputs
        if (!UsernamePattern.IsMatch(username))
        {
            throw new ArgumentException("Invalid username");
        }

        // SECURE - Use SecurityElement.Escape
        string safeUsername = SecurityElement.Escape(username);
        string safeEmail = SecurityElement.Escape(email);

        string xml = $@"<?xml version=""1.0""?>
<user>
    <username>{safeUsername}</username>
    <email>{safeEmail}</email>
</user>";

        return xml;
    }
}

Why this works: SecurityElement.Escape() is a built-in .NET method that escapes the five XML special characters (< → &lt;, > → &gt;, & → &amp;, ' → &apos;, " → &quot;), making it safe to embed user input in manually constructed XML strings. This is useful when LINQ to XML or XmlWriter are impractical (e.g., integrating with legacy code that expects XML strings, or templating scenarios).

Pre-validation (^[a-zA-Z0-9._-]{1,100}$) provides defense-in-depth - even if SecurityElement.Escape() has edge cases or encoding issues, the allowlist blocks malicious input.

However, this pattern is less preferred than LINQ to XML or XmlWriter because manual string concatenation is error-prone - developers might forget to escape a variable, or escape incorrectly (e.g., escaping only < and > but not &). SecurityElement.Escape() lives in the System.Security namespace and needs an explicit using directive. Note: the method replaces only the five characters listed above. It does not remove characters that are illegal in XML 1.0 (such as most control characters), it returns null for null input, and it offers no protection when the value is placed in an XPath expression or used as an element name. Use this pattern only when APIs like XDocument are unavailable and you understand the escaping rules.
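
Because SecurityElement.Escape() leaves characters such as U+0001 untouched, a stricter wrapper can reject them before escaping. The sketch below pairs it with XmlConvert.VerifyXmlChars; EscapeWithChecks and EscapeStrict are hypothetical names:

```csharp
using System;
using System.Security;
using System.Xml;

public static class EscapeWithChecks
{
    // Rejects null and characters that XML 1.0 does not allow,
    // then escapes the five special characters.
    public static string EscapeStrict(string raw)
    {
        if (raw == null) throw new ArgumentNullException(nameof(raw));

        XmlConvert.VerifyXmlChars(raw); // throws XmlException on e.g. U+0001

        return SecurityElement.Escape(raw);
    }
}

// EscapeWithChecks.EscapeStrict("a&b")    → "a&amp;b"
// EscapeWithChecks.EscapeStrict("\u0001") → throws XmlException
```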

ASP.NET Core with LINQ to XML

// SECURE - ASP.NET Core with validation and escaping
using Microsoft.AspNetCore.Mvc;
using System.Xml.Linq;
using System.Text.RegularExpressions;

[ApiController]
[Route("api/[controller]")]
public class SecureUserController : ControllerBase
{
    private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{3,64}$");
    private static readonly Regex EmailPattern = 
        new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");

    [HttpGet]
    [Produces("application/xml")]
    public IActionResult GetUser([FromQuery] string username, [FromQuery] string email)
    {
        // SECURE - Validate inputs
        if (!UsernamePattern.IsMatch(username ?? ""))
        {
            var errorDoc = new XDocument(
                new XElement("error", "Invalid username")
            );
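            // Return the XML body with a 400 status; wrapping a ContentResult in
            // BadRequest() would serialize the result object instead of the XML
            return new ContentResult
            {
                Content = errorDoc.ToString(),
                ContentType = "application/xml",
                StatusCode = 400
            };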
        }

        if (!EmailPattern.IsMatch(email ?? ""))
        {
            var errorDoc = new XDocument(
                new XElement("error", "Invalid email")
            );
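            // Return the XML body with a 400 status (see note on the username check)
            return new ContentResult
            {
                Content = errorDoc.ToString(),
                ContentType = "application/xml",
                StatusCode = 400
            };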
        }

        // SECURE - Build XML with LINQ to XML
        XDocument doc = new XDocument(
            new XDeclaration("1.0", "utf-8", null),
            new XElement("response",
                new XElement("user",
                    new XElement("name", username),
                    new XElement("email", email)
                )
            )
        );

        // ToString() omits the declaration, so prepend it
        return Content(doc.Declaration + doc.ToString(), "application/xml");
    }
}

Why this works: Regex validation (^[a-zA-Z0-9._-]{3,64}$ for usernames, email pattern for addresses) blocks XML metacharacters (<, >, &, quotes) before they reach the XML API, preventing injection attempts like username=</name><admin>true</admin><name>. LINQ to XML (XDocument, XElement) automatically escapes any remaining content, providing layered defense - even if validation is bypassed, escaping prevents structural changes.

Generic error messages ("Invalid username") in XDocument prevent information disclosure - attackers don't learn whether rejection was due to regex mismatch, length limits, or null input. Content(doc.ToString(), "application/xml") sets the correct Content-Type header, ensuring browsers and API clients parse the response as XML (not HTML).

Early validation (checking inputs before XML construction) provides fail-fast behavior - invalid requests return 400 Bad Request immediately without expensive XML processing. The declarative XDocument structure makes the code auditable - reviewers can see that username and email map directly to <name> and <email> elements, with no string concatenation.

This pattern is ideal for ASP.NET Core REST APIs returning XML responses (e.g., legacy SOAP clients, RSS feeds).

XML Serialization

// SECURE - Using XML Serialization
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;
using System.Text.RegularExpressions;

[XmlRoot("user")]
public class User
{
    [XmlElement("username")]
    public string Username { get; set; }

    [XmlElement("email")]
    public string Email { get; set; }

    [XmlElement("bio")]
    public string Bio { get; set; }
}

public class SecureXmlSerializer
{
    private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{1,100}$");

    public string SerializeUser(string username, string email, string bio)
    {
        // SECURE - Validate inputs
        if (!UsernamePattern.IsMatch(username))
        {
            throw new ArgumentException("Invalid username");
        }

        // SECURE - XmlSerializer handles escaping
        User user = new User
        {
            Username = username,
            Email = email,
            Bio = bio
        };

        XmlSerializer serializer = new XmlSerializer(typeof(User));

        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, user);
            return writer.ToString();
        }
    }
}

Why this works: XmlSerializer uses reflection and attributes ([XmlRoot], [XmlElement]) to map C# objects to XML, automatically escaping property values during serialization - even if Bio contains <script>alert('xss')</script>, it becomes &lt;script&gt;alert('xss')&lt;/script&gt; in the XML output. Type safety ensures only declared properties (Username, Email, Bio) appear in the XML - attackers can't inject arbitrary elements like <admin>true</admin> because the User class doesn't have an Admin property.

Pre-validation (^[a-zA-Z0-9._-]{1,100}$) provides defense-in-depth, though serialization escaping is sufficient. Declarative attributes ([XmlElement("username")]) control element naming, making the mapping explicit and auditable - reviewers can see that Username property maps to <username> element. StringWriter / Serialize() pattern returns the XML as a string, enabling logging, caching, or further processing.

This approach is ideal for REST APIs with complex data models (nested objects, collections, enums) where LINQ to XML's manual construction becomes verbose. Trade-off: XmlSerializer requires parameterless constructors and public properties, limiting use with immutable types. For simple XML, LINQ to XML is more flexible; for complex object graphs, XmlSerializer is more maintainable.

XPath with Safe Practices

// SECURE - XPath with validation and safe querying
using System;
using System.Xml;
using System.Text.RegularExpressions;

public class SecureXPathQuery
{
    private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{1,50}$");

    public XmlNode FindUserByNameSecure(XmlDocument doc, string username)
    {
        // SECURE - Validate input
        if (!UsernamePattern.IsMatch(username))
        {
            throw new ArgumentException("Invalid username format");
        }

        // SECURE - Iterate and compare (safest approach)
        XmlNodeList users = doc.SelectNodes("//user");

        foreach (XmlNode user in users)
        {
            XmlNode nameNode = user.SelectSingleNode("name");
            if (nameNode != null && nameNode.InnerText == username)
            {
                return user;
            }
        }

        return null;
    }
}

Why this works:

  • No string concatenation: Uses SelectNodes() + iteration + string comparison (nameNode.InnerText == username) instead of $"//user[name='{username}']" preventing injection like alice' or '1'='1
  • Regex validation: ^[a-zA-Z0-9._-]{1,50}$ blocks XPath metacharacters (', ", [, ], *, /) before query, preventing //user[name='' or '1'='1']
  • Exact comparison: == after fetching nodes ensures no wildcards/boolean logic - even bypassing validation fails without exact match
  • Static XPath: SelectSingleNode("name") uses no user input, eliminating injection attack surface
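  • Safest pattern: The "iterate and compare" approach keeps user input out of the XPath text entirely; .NET XPath APIs have no simple built-in equivalent of SQL's @username parameters (XPath variables require a custom XsltContext); the scan is O(n), which is acceptable for small XML documents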
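
The same iterate-and-compare idea can be written with LINQ to XML, which avoids XPath strings altogether. A sketch assuming the document is available as an XDocument with the <user>/<name> layout used above; UserLookup is an illustrative name:

```csharp
using System.Linq;
using System.Xml.Linq;

public static class UserLookup
{
    // Returns the first <user> whose <name> text equals username exactly,
    // or null when there is no such element.
    public static XElement FindUserByName(XDocument doc, string username)
    {
        return doc.Descendants("user")
                  .FirstOrDefault(u => (string)u.Element("name") == username);
    }
}
```

The username is only ever used as the right-hand side of a string comparison, so a value like ' or '1'='1 simply matches no user. The allowlist check from the example above can still be applied first as an additional layer.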

Verification

After implementing the recommended secure patterns, verify the fix through multiple approaches:

  • Manual testing: Submit malicious payloads relevant to this vulnerability and confirm they're handled safely without executing unintended operations
  • Code review: Confirm every place that builds XML uses a safe API (LINQ to XML, XmlWriter, XmlSerializer) or escapes values, with no unescaped string concatenation or interpolation
  • Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
  • Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly
  • Edge case validation: Test with special characters, boundary conditions, and unusual inputs to verify proper handling
  • Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
  • Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
  • Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced

Additional Resources