CWE-91: XML Injection - C#
Overview
XML Injection in C#/.NET applications occurs when untrusted user input is used to construct XML documents without proper validation or escaping. Attackers can manipulate XML structure by injecting special characters like <, >, &, ', and ", leading to data corruption, authentication bypass, or information disclosure.
Primary Defence: Use LINQ to XML (XElement, XAttribute) with proper API methods instead of string concatenation, validate and escape user input with SecurityElement.Escape() before including in XML, use XML Schema validation to ensure structure integrity, disable external entity processing (XmlReaderSettings.DtdProcessing = DtdProcessing.Prohibit) to prevent XXE attacks, and use parameterized XPath queries to prevent XML/XPath injection.
Common C# XML Vulnerability Scenarios:
- Building XML with string concatenation or interpolation
- Using user input directly in XML elements or attributes
- SOAP/REST XML payloads with unsanitized data
- XML configuration files with user data
- XPath query injection
C#/.NET XML APIs:
- System.Xml.Linq (LINQ to XML): Modern, recommended API
- System.Xml: Legacy XML APIs (XmlDocument, XmlWriter)
- System.Xml.Serialization: XML serialization
- System.Xml.XPath: XPath queries
- System.ServiceModel: WCF/SOAP services
- System.Security.SecurityElement.Escape: XML escaping utility
XML Special Characters Requiring Escaping:
- < → &lt;
- > → &gt;
- & → &amp;
- ' → &apos;
- " → &quot;
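The primary defence above mentions disabling DTD processing when parsing XML. A minimal sketch of what that could look like follows; SafeXmlLoader is a hypothetical helper name, not part of .NET, and the settings shown are one reasonable configuration rather than the only one.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class SafeXmlLoader // hypothetical helper name
{
    // Parses an XML string while refusing any document that carries a DOCTYPE.
    public static XDocument Load(string xml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit, // XmlException on any DTD
            XmlResolver = null                      // never resolve external resources
        };

        using (var stringReader = new StringReader(xml))
        using (var reader = XmlReader.Create(stringReader, settings))
        {
            return XDocument.Load(reader);
        }
    }
}
```

With these settings an ordinary document loads normally, while a document that declares a DOCTYPE (the usual vehicle for XXE) is rejected with an XmlException before any entity is expanded.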
Common Vulnerable Patterns
String Interpolation
// VULNERABLE - Direct string interpolation
using System;
public class VulnerableXmlBuilder
{
public string CreateUserXml(string username, string email)
{
// VULNERABLE - User input directly in XML string
string xml = $@"<?xml version=""1.0""?>
<user>
<username>{username}</username>
<email>{email}</email>
</user>";
return xml;
}
}
// Attack: username = "</username><admin>true</admin><username>"
// Result: <username></username><admin>true</admin><username></username>
// Creates unintended <admin> element
Why this is vulnerable:
- No escaping of XML special characters
- String interpolation allows injection
- Can modify XML structure
- Misleads any downstream code that trusts the document's structure (for example, a check for an <admin> element)
ASP.NET Core REST API
// VULNERABLE - ASP.NET Core endpoint returning XML
using Microsoft.AspNetCore.Mvc;
[ApiController]
[Route("api/[controller]")]
public class VulnerableUserController : ControllerBase
{
[HttpGet]
[Produces("application/xml")]
public IActionResult GetUser([FromQuery] string username, [FromQuery] string email)
{
// VULNERABLE - Query parameters in XML
string xmlResponse = $@"<?xml version=""1.0"" encoding=""UTF-8""?>
<response>
<user>
<name>{username}</name>
<email>{email}</email>
</user>
</response>";
return Content(xmlResponse, "application/xml");
}
}
// Attack: username = "<admin>true</admin>"
// Response contains: <name><admin>true</admin></name>
Why this is vulnerable:
- ASP.NET Core doesn't auto-escape manual XML strings
- Request parameters directly in XML
- No validation or sanitization
- Information disclosure possible
SOAP Request Construction
// VULNERABLE - Building SOAP XML manually
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
public class VulnerableSoapClient
{
public async Task<string> CallSoapService(string userId, string action)
{
// VULNERABLE - User input in SOAP envelope
string soapEnvelope = $@"<?xml version=""1.0""?>
<soap:Envelope xmlns:soap=""http://schemas.xmlsoap.org/soap/envelope/"">
<soap:Body>
<GetUserData>
<UserId>{userId}</UserId>
<Action>{action}</Action>
</GetUserData>
</soap:Body>
</soap:Envelope>";
using (HttpClient client = new HttpClient())
{
var content = new StringContent(soapEnvelope, Encoding.UTF8, "text/xml");
var response = await client.PostAsync("https://api.example.com/soap", content);
return await response.Content.ReadAsStringAsync();
}
}
}
// Attack: userId = "</UserId><Role>admin</Role><UserId>"
// Injects admin role into SOAP request
Why this is vulnerable:
- SOAP envelope built with string interpolation
- Allows element injection
- Can escalate privileges
- Modify request structure
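For contrast, the same envelope can be built as a node tree so that userId and action are only ever stored as text. This is a sketch under assumptions: SoapEnvelopeBuilder is a hypothetical name, and the element names simply mirror the vulnerable example above.

```csharp
using System.Xml.Linq;

public static class SoapEnvelopeBuilder // hypothetical helper name
{
    private static readonly XNamespace Soap =
        "http://schemas.xmlsoap.org/soap/envelope/";

    public static string Build(string userId, string action)
    {
        var envelope = new XElement(Soap + "Envelope",
            // Declare the "soap" prefix so the output matches the usual SOAP style.
            new XAttribute(XNamespace.Xmlns + "soap", Soap.NamespaceName),
            new XElement(Soap + "Body",
                new XElement("GetUserData",
                    new XElement("UserId", userId),    // text node, escaped on output
                    new XElement("Action", action)))); // text node, escaped on output

        return envelope.ToString(SaveOptions.DisableFormatting);
    }
}
```

With the attack value from above, the payload ends up as the text of the single UserId element; no Role element appears when the envelope is parsed.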
XML Configuration Files
// VULNERABLE - Writing XML config with user data
using System.IO;
public class VulnerableConfigWriter
{
public void SaveUserSettings(string username, string theme, string language)
{
// VULNERABLE - User input in XML config
string configXml = $@"<?xml version=""1.0""?>
<config>
<user>{username}</user>
<preferences>
<theme>{theme}</theme>
<language>{language}</language>
</preferences>
</config>";
File.WriteAllText("config.xml", configXml);
}
}
// Attack: theme = "</theme><admin_access>true</admin_access><theme>"
// Modifies configuration structure
Why this is vulnerable:
- Configuration files parsed by XML parser
- Persistent injection
- Can modify application behavior
- Privilege escalation
XmlDocument with String Input
// VULNERABLE - XmlDocument parsing of concatenated string
using System.Xml;
public class VulnerableXmlDocBuilder
{
public XmlDocument CreateXmlResponse(string data)
{
// VULNERABLE - Building XML string manually
string xmlStr = $@"<response>
<status>success</status>
<data>{data}</data>
</response>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xmlStr);
return doc;
}
}
// Attack: data = "</data><malicious>payload</malicious><data>"
// Injects malicious elements
Why this is vulnerable:
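- XmlDocument.LoadXml() parses whatever string it is given; it cannot tell injected markup from intended markup
- The injected payload is well-formed, so the parser accepts it without error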
- Injection before parsing
- No validation
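If XmlDocument has to be used, the document can be built through the DOM API instead of being parsed from a string. The sketch below assumes a hypothetical class name (SafeXmlDocBuilder) and reuses the element names from the vulnerable example.

```csharp
using System.Xml;

public static class SafeXmlDocBuilder // hypothetical helper name
{
    public static XmlDocument CreateXmlResponse(string data)
    {
        var doc = new XmlDocument();

        XmlElement response = doc.CreateElement("response");
        doc.AppendChild(response);

        XmlElement status = doc.CreateElement("status");
        status.InnerText = "success";
        response.AppendChild(status);

        XmlElement dataElement = doc.CreateElement("data");
        // InnerText stores the value as a text node; it is never parsed as markup.
        dataElement.InnerText = data;
        response.AppendChild(dataElement);

        return doc;
    }
}
```

The important distinction is InnerText versus InnerXml: assigning untrusted input to InnerXml would parse it as markup and reintroduce the injection.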
XPath Query Injection
// VULNERABLE - User input in XPath query
using System.Xml;
using System.Xml.XPath;
public class VulnerableXPathQuery
{
public XmlNodeList FindUserByName(XmlDocument doc, string username)
{
// VULNERABLE - XPath injection
string xpathExpr = $"//user[name='{username}']";
return doc.SelectNodes(xpathExpr);
}
}
// Attack: username = "' or '1'='1"
// XPath: //user[name='' or '1'='1']
// Returns all users
Why this is vulnerable:
- XPath query built with string interpolation
- Boolean-based injection
- Bypasses authentication checks
- Information disclosure
XML Attribute Injection
// VULNERABLE - User input in XML attributes
public class VulnerableAttributeBuilder
{
public string CreateElement(string name, string value, string attrValue)
{
// VULNERABLE - Attribute injection
string xml = $@"<element name=""{name}"" value=""{value}"" custom=""{attrValue}""/>";
return xml;
}
}
// Attack: attrValue = "test\" malicious=\"true"
// Result: <element name="..." value="..." custom="test" malicious="true"/>
// Injects additional attributes
Why this is vulnerable:
- A double quote in the input terminates the attribute value early
- Allows additional attribute injection
- Modifies element properties
- Can bypass security checks
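A safer counterpart can be sketched with XAttribute, which escapes the quote character when the element is serialized. SafeAttributeBuilder is a hypothetical name; the attribute names mirror the vulnerable example.

```csharp
using System.Xml.Linq;

public static class SafeAttributeBuilder // hypothetical helper name
{
    public static string CreateElement(string name, string value, string attrValue)
    {
        var element = new XElement("element",
            new XAttribute("name", name),
            new XAttribute("value", value),
            new XAttribute("custom", attrValue)); // quotes are escaped on output

        return element.ToString();
    }
}
```

Passing the attack value from above produces an element with exactly three attributes; the double quotes inside the custom value are written as &quot; rather than closing the attribute.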
Blazor Component with XML
// VULNERABLE - Blazor component generating XML
using Microsoft.AspNetCore.Components;
using Microsoft.AspNetCore.Components.Web;
public partial class XmlExportComponent : ComponentBase
{
[Parameter]
public string Username { get; set; }
[Parameter]
public string Email { get; set; }
private string GenerateXml()
{
// VULNERABLE - Component parameters in XML
return $@"<?xml version=""1.0""?>
<user>
<username>{Username}</username>
<email>{Email}</email>
</user>";
}
}
Why this is vulnerable:
- Blazor component parameters directly in XML
- No validation or escaping
- Client-side injection point
- Framework doesn't prevent injection
Secure Patterns
LINQ to XML (XDocument)
// SECURE - Using LINQ to XML (recommended)
using System;
using System.Linq;
using System.Xml.Linq;
using System.Text.RegularExpressions;
public class SecureXmlBuilder
{
private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{1,100}$");
private static readonly Regex EmailPattern =
new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");
public string CreateUserXml(string username, string email)
{
// SECURE - Validate inputs
if (!UsernamePattern.IsMatch(username))
{
throw new ArgumentException("Invalid username");
}
if (!EmailPattern.IsMatch(email))
{
throw new ArgumentException("Invalid email");
}
// SECURE - Use LINQ to XML API
XDocument doc = new XDocument(
new XDeclaration("1.0", "utf-8", null),
new XElement("user",
new XElement("username", username), // Automatically escaped
new XElement("email", email) // Automatically escaped
)
);
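// XDocument.ToString() omits the XML declaration, so prepend it explicitly
return doc.Declaration + Environment.NewLine + doc.ToString();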
}
}
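// Example usage:
// CreateUserXml("<script>alert('xss')</script>", "test@example.com")
// Result: ArgumentException("Invalid username"), because validation rejects the input.
// Without the validation step, XElement alone would emit:
// <username>&lt;script&gt;alert('xss')&lt;/script&gt;</username>
// Markup characters are escaped, so the payload stays text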
Why this works: LINQ to XML (XDocument, XElement) escapes markup characters when it serializes the tree: <, >, and & in element text, and additionally the double quote in attribute values. This prevents attackers from injecting closing tags like </username><admin>true</admin><username>. The XElement constructor treats a string argument as text content, not markup, so even if username contains <admin>true</admin>, the output XML contains &lt;admin&gt;true&lt;/admin&gt;.
Regex validation provides defense-in-depth: the username pattern (^[a-zA-Z0-9._-]{1,100}$) blocks XML metacharacters before they reach the API, and email validation prevents addresses like admin@example.com</email><role>admin</role><email>.
Declarative XML construction via nested XElement calls ensures well-formed documents: the API builds a node tree and generates the opening and closing tags itself, so a text value cannot change the document's structure. Note that XDocument.ToString() serializes the element tree but omits the XML declaration; to emit the declaration, prepend doc.Declaration or call XDocument.Save() with a TextWriter or Stream. This pattern resists injection because user input is only ever stored as text nodes and is escaped on serialization, never concatenated into markup.
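Because XDocument.ToString() leaves out the XML declaration, code that needs the declaration in a string usually goes through Save(). The sketch below shows one common idiom; Utf8StringWriter and XDocumentOutput are hypothetical helper names, and overriding Encoding is only there so the declaration reports utf-8 instead of the utf-16 a plain StringWriter would produce.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper: a StringWriter that reports UTF-8 as its encoding.
public sealed class Utf8StringWriter : StringWriter
{
    public override Encoding Encoding => Encoding.UTF8;
}

public static class XDocumentOutput // hypothetical helper name
{
    // Serializes the document, including its XML declaration, to a string.
    public static string WithDeclaration(XDocument doc)
    {
        using (var writer = new Utf8StringWriter())
        {
            doc.Save(writer); // Save() writes the declaration; ToString() does not
            return writer.ToString();
        }
    }
}
```

Escaping is unchanged by this: text values are still written with &lt; and &amp; in place of the raw characters.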
XmlWriter for Streaming
// SECURE - Using XmlWriter
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Collections.Generic;
using System.Text.RegularExpressions;
public class SecureXmlStreamWriter
{
private static readonly Regex KeyPattern = new Regex(@"^[a-zA-Z_][a-zA-Z0-9_-]*$");
public string CreateXmlWithWriter(Dictionary<string, string> data)
{
// SECURE - Validate input
foreach (var kvp in data)
{
if (!KeyPattern.IsMatch(kvp.Key))
{
throw new ArgumentException($"Invalid XML element name: {kvp.Key}");
}
if (kvp.Value != null && kvp.Value.Length > 1000)
{
throw new ArgumentException("Value too long");
}
}
StringBuilder sb = new StringBuilder();
XmlWriterSettings settings = new XmlWriterSettings
{
Indent = true,
IndentChars = " ",
NewLineChars = "\n",
NewLineHandling = NewLineHandling.Replace,
Encoding = Encoding.UTF8 // Ignored for StringBuilder output: the declaration will report utf-16
};
using (XmlWriter writer = XmlWriter.Create(sb, settings))
{
writer.WriteStartDocument();
writer.WriteStartElement("response");
writer.WriteElementString("status", "success");
writer.WriteStartElement("data");
foreach (var kvp in data)
{
writer.WriteElementString(kvp.Key, kvp.Value); // Auto-escaped
}
writer.WriteEndElement(); // data
writer.WriteEndElement(); // response
writer.WriteEndDocument();
}
return sb.ToString();
}
}
Why this works: XmlWriter provides low-level control with automatic escaping. Methods like WriteElementString() and WriteAttributeString() escape markup characters (< → &lt;, & → &amp;) without requiring manual SecurityElement.Escape() calls. The streaming API writes XML incrementally, which makes it memory-efficient for very large documents when the target is a file or Stream; writing to a StringBuilder, as in this example, still buffers the whole document in memory.
Element name validation (^[a-zA-Z_][a-zA-Z0-9_-]*$) prevents injection via malformed element names like user><admin>true</admin><user - XML parsers reject invalid element names, but validating upfront provides fail-fast behavior. Length limits (1000 chars) prevent DoS via extremely long values that could exhaust memory or cause parser hangs.
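XmlWriterSettings controls formatting (Indent, NewLineHandling) and, for Stream or file output, the encoding. When the target is a StringBuilder or other TextWriter, the Encoding setting is ignored and the declaration reports utf-16, because .NET strings are UTF-16. The explicit WriteStartElement / WriteEndElement pairing keeps nesting correct: calling WriteEndElement with no open element throws InvalidOperationException, while WriteEndDocument and Dispose close any elements still open, so the output stays well-formed either way. This pattern is ideal for generating large XML files (exports, feeds) where LINQ to XML's memory overhead is prohibitive.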
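To get real streaming and a UTF-8 declaration, the writer can target a Stream rather than a StringBuilder. The following is a minimal sketch; StreamingXmlExport and the export/row/key names are assumptions made for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class StreamingXmlExport // hypothetical helper name
{
    // Writes one <row key="..."> element per pair directly to the stream.
    public static void Write(Stream output, IEnumerable<KeyValuePair<string, string>> rows)
    {
        var settings = new XmlWriterSettings
        {
            // Honoured because the target is a Stream; no byte-order mark is emitted.
            Encoding = new UTF8Encoding(encoderShouldEmitUTF8Identifier: false),
            Indent = true
        };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement("export");
            foreach (var row in rows)
            {
                writer.WriteStartElement("row");
                writer.WriteAttributeString("key", row.Key); // attribute value escaped
                writer.WriteString(row.Value);               // element text escaped
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Here the user-supplied key is written as an attribute value rather than as an element name, which avoids the need for element-name validation altogether. The stream is left open after the writer is disposed, since CloseOutput defaults to false.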
SecurityElement.Escape
// SECURE - Using SecurityElement.Escape for manual escaping
using System;
using System.Security;
using System.Text.RegularExpressions;
public class SecureXmlEscaper
{
private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{1,100}$");
public string CreateUserXmlWithEscaping(string username, string email)
{
// SECURE - Validate inputs
if (!UsernamePattern.IsMatch(username))
{
throw new ArgumentException("Invalid username");
}
// SECURE - Use SecurityElement.Escape
string safeUsername = SecurityElement.Escape(username);
string safeEmail = SecurityElement.Escape(email);
string xml = $@"<?xml version=""1.0""?>
<user>
<username>{safeUsername}</username>
<email>{safeEmail}</email>
</user>";
return xml;
}
}
Why this works: SecurityElement.Escape() is a built-in .NET method that escapes the five XML special characters (< → &lt;, > → &gt;, & → &amp;, ' → &apos;, " → &quot;), making it safe to embed user input in manually constructed XML strings. This is useful when LINQ to XML or XmlWriter are impractical (e.g., integrating with legacy code that expects XML strings, or templating scenarios).
Pre-validation (^[a-zA-Z0-9._-]{1,100}$) provides defense-in-depth - even if SecurityElement.Escape() has edge cases or encoding issues, the allowlist blocks malicious input.
However, this pattern is less preferred than LINQ to XML or XmlWriter because manual string concatenation is error-prone: developers might forget to escape a variable, or escape incorrectly (e.g., escaping only < and > but not &). SecurityElement.Escape() lives in the System.Security namespace, so it needs its own using directive. Note its limits: it only replaces the five characters above. It does not remove characters that are illegal in XML 1.0 (such as most control characters), it returns null for null input, and it does nothing to make a value safe inside an XPath expression. Use this pattern only when APIs like XDocument are unavailable and you understand the escaping rules.
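One way to reduce the "forgot to escape a variable" risk is to route every value through a single helper. The sketch below is illustrative only: XmlText is a hypothetical name, and it assumes the element name is a constant chosen by the developer.

```csharp
using System.Security;

public static class XmlText // hypothetical helper name
{
    // elementName must be a trusted constant, never user input.
    public static string Element(string elementName, string value)
    {
        // Escape(null) returns null, so fall back to an empty string.
        string escaped = SecurityElement.Escape(value) ?? string.Empty;
        return "<" + elementName + ">" + escaped + "</" + elementName + ">";
    }
}
```

For example, XmlText.Element("email", "a@b.c</email><role>admin</role>") returns <email>a@b.c&lt;/email&gt;&lt;role&gt;admin&lt;/role&gt;</email>, so the injected tags arrive as text.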
ASP.NET Core with LINQ to XML
// SECURE - ASP.NET Core with validation and escaping
using Microsoft.AspNetCore.Mvc;
using System.Xml.Linq;
using System.Text.RegularExpressions;
[ApiController]
[Route("api/[controller]")]
public class SecureUserController : ControllerBase
{
private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{3,64}$");
private static readonly Regex EmailPattern =
new Regex(@"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$");
[HttpGet]
[Produces("application/xml")]
public IActionResult GetUser([FromQuery] string username, [FromQuery] string email)
{
// SECURE - Validate inputs
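if (!UsernamePattern.IsMatch(username ?? ""))
{
var errorDoc = new XDocument(
new XElement("error", "Invalid username")
);
// Return the XML body with status 400; BadRequest(Content(...)) would serialize the ContentResult object itself
return new ContentResult { Content = errorDoc.ToString(), ContentType = "application/xml", StatusCode = 400 };
}
if (!EmailPattern.IsMatch(email ?? ""))
{
var errorDoc = new XDocument(
new XElement("error", "Invalid email")
);
return new ContentResult { Content = errorDoc.ToString(), ContentType = "application/xml", StatusCode = 400 };
}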
// SECURE - Build XML with LINQ to XML
XDocument doc = new XDocument(
new XDeclaration("1.0", "utf-8", null),
new XElement("response",
new XElement("user",
new XElement("name", username),
new XElement("email", email)
)
)
);
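// XDocument.ToString() omits the XML declaration, so prepend it explicitly
return Content(doc.Declaration + "\n" + doc.ToString(), "application/xml");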
}
}
Why this works: Regex validation (^[a-zA-Z0-9._-]{3,64}$ for usernames, email pattern for addresses) blocks XML metacharacters (<, >, &, quotes) before they reach the XML API, preventing injection attempts like username=</name><admin>true</admin><name>. LINQ to XML (XDocument, XElement) automatically escapes any remaining content, providing layered defense - even if validation is bypassed, escaping prevents structural changes.
Generic error messages ("Invalid username") in XDocument prevent information disclosure - attackers don't learn whether rejection was due to regex mismatch, length limits, or null input. Content(doc.ToString(), "application/xml") sets the correct Content-Type header, ensuring browsers and API clients parse the response as XML (not HTML).
Early validation (checking inputs before XML construction) provides fail-fast behavior - invalid requests return 400 Bad Request immediately without expensive XML processing. The declarative XDocument structure makes the code auditable - reviewers can see that username and email map directly to <name> and <email> elements, with no string concatenation.
This pattern is ideal for ASP.NET Core REST APIs returning XML responses (e.g., legacy SOAP clients, RSS feeds).
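One detail of the allowlist patterns used in this guide is worth checking in your own code: in .NET, $ also matches just before a final newline, so a pattern ending in $ accepts "alice\n". The sketch below contrasts that with the \A ... \z anchors, which match only at the true start and end of the string. UsernameValidator is a hypothetical name and the 100 ms timeout is an arbitrary illustrative value.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UsernameValidator // hypothetical helper name
{
    // Same shape as the patterns above: $ tolerates one trailing "\n".
    private static readonly Regex Loose =
        new Regex(@"^[a-zA-Z0-9._-]{3,64}$");

    // \A and \z anchor to the absolute start and end of the input.
    private static readonly Regex Strict =
        new Regex(@"\A[a-zA-Z0-9._-]{3,64}\z",
                  RegexOptions.CultureInvariant,
                  TimeSpan.FromMilliseconds(100)); // illustrative match timeout

    public static bool IsValidLoose(string input) =>
        input != null && Loose.IsMatch(input);

    public static bool IsValid(string input) =>
        input != null && Strict.IsMatch(input);
}
```

IsValidLoose("alice\n") returns true, while IsValid("alice\n") returns false. A trailing newline does not enable XML injection by itself, since the XML APIs still escape content, but it does mean the allowlist admits a character it appears to forbid.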
XML Serialization
// SECURE - Using XML Serialization
using System;
using System.IO;
using System.Text;
using System.Xml.Serialization;
using System.Text.RegularExpressions;
[XmlRoot("user")]
public class User
{
[XmlElement("username")]
public string Username { get; set; }
[XmlElement("email")]
public string Email { get; set; }
[XmlElement("bio")]
public string Bio { get; set; }
}
public class SecureXmlSerializer
{
private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{1,100}$");
public string SerializeUser(string username, string email, string bio)
{
// SECURE - Validate inputs
if (!UsernamePattern.IsMatch(username))
{
throw new ArgumentException("Invalid username");
}
// SECURE - XmlSerializer handles escaping
User user = new User
{
Username = username,
Email = email,
Bio = bio
};
XmlSerializer serializer = new XmlSerializer(typeof(User));
using (StringWriter writer = new StringWriter())
{
serializer.Serialize(writer, user);
return writer.ToString();
}
}
}
Why this works: XmlSerializer uses reflection and attributes ([XmlRoot], [XmlElement]) to map C# objects to XML, automatically escaping property values during serialization. Even if Bio contains <script>alert('xss')</script>, the XML output contains &lt;script&gt;alert('xss')&lt;/script&gt;. The element set is fixed by the type: only declared properties (Username, Email, Bio) appear in the XML, so attackers can't inject arbitrary elements like <admin>true</admin>; the User class has no Admin property, and markup inside a string property is written as text.
Pre-validation (^[a-zA-Z0-9._-]{1,100}$) provides defense-in-depth, though serialization escaping is sufficient. Declarative attributes ([XmlElement("username")]) control element naming, making the mapping explicit and auditable - reviewers can see that Username property maps to <username> element. StringWriter / Serialize() pattern returns the XML as a string, enabling logging, caching, or further processing.
This approach is ideal for REST APIs with complex data models (nested objects, collections, enums) where LINQ to XML's manual construction becomes verbose. Trade-off: XmlSerializer requires parameterless constructors and public properties, limiting use with immutable types. For simple XML, LINQ to XML is more flexible; for complex object graphs, XmlSerializer is more maintainable.
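The escaping behaviour can be checked with a round trip: serialize an object whose property contains markup, then deserialize it and compare. The sketch below uses a hypothetical Profile type and ProfileXml helper; the reader settings on the deserializing side are an added precaution for input that may not be trusted.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Serialization;

public class Profile // hypothetical type for illustration
{
    public string Username { get; set; }
    public string Bio { get; set; }
}

public static class ProfileXml // hypothetical helper name
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(Profile));

    public static string Serialize(Profile profile)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, profile); // property values are escaped
            return writer.ToString();
        }
    }

    public static Profile Deserialize(string xml)
    {
        // Refuse DTDs when reading XML that may come from outside.
        var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Prohibit };
        using (var reader = XmlReader.Create(new StringReader(xml), settings))
        {
            return (Profile)Serializer.Deserialize(reader);
        }
    }
}
```

A Bio value of </Bio><Admin>true</Admin><Bio> comes back unchanged after the round trip, and the serialized string contains no Admin element. Note that output produced through a StringWriter carries a utf-16 declaration.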
XPath with Safe Practices
// SECURE - XPath with validation and safe querying
using System;
using System.Xml;
using System.Text.RegularExpressions;
public class SecureXPathQuery
{
private static readonly Regex UsernamePattern = new Regex(@"^[a-zA-Z0-9._-]{1,50}$");
public XmlNode FindUserByNameSecure(XmlDocument doc, string username)
{
// SECURE - Validate input
if (!UsernamePattern.IsMatch(username))
{
throw new ArgumentException("Invalid username format");
}
// SECURE - Iterate and compare (safest approach)
XmlNodeList users = doc.SelectNodes("//user");
foreach (XmlNode user in users)
{
XmlNode nameNode = user.SelectSingleNode("name");
if (nameNode != null && nameNode.InnerText == username)
{
return user;
}
}
return null;
}
}
Why this works:
- No string concatenation: Uses SelectNodes() + iteration + string comparison (nameNode.InnerText == username) instead of $"//user[name='{username}']", preventing injection like alice' or '1'='1
- Regex validation: ^[a-zA-Z0-9._-]{1,50}$ blocks XPath metacharacters (', ", [, ], *, /) before the query runs, preventing //user[name='' or '1'='1']
- Exact comparison: == after fetching nodes means no wildcards or boolean logic are evaluated; even input that slipped past validation matches nothing unless it equals a stored name exactly
- Static XPath: SelectSingleNode("name") uses no user input, eliminating the injection attack surface
- Trade-off: .NET's XPath APIs have no simple parameter binding comparable to SQL's @username (XPath variables require a custom XsltContext), so "iterate and compare" is the simplest safe option; it is O(n) in the number of user nodes, which is acceptable for small documents
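The same "compare values in code, not in the query string" idea can be sketched with LINQ to XML, where no XPath expression is involved at all. UserLookup is a hypothetical name, and the user/name element layout is assumed to match the example above.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class UserLookup // hypothetical helper name
{
    // Returns the first <user> whose <name> child equals username, or null.
    public static XElement FindUserByName(XDocument doc, string username)
    {
        return doc.Descendants("user")
                  .FirstOrDefault(u => (string)u.Element("name") == username);
    }
}
```

Because username is only ever used as the right-hand side of a string comparison, a value such as ' or '1'='1 is compared literally and matches no user.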
Verification
After implementing the recommended secure patterns, verify the fix through multiple approaches:
- Manual testing: Submit malicious payloads relevant to this vulnerability and confirm they're handled safely without executing unintended operations
- Code review: Confirm all instances use the secure pattern (parameterized queries, safe APIs, proper encoding) with no string concatenation or unsafe operations
- Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
- Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly
- Edge case validation: Test with special characters, boundary conditions, and unusual inputs to verify proper handling
- Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
- Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
- Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
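The manual-testing and edge-case steps above can be partly automated. The sketch below is one possible shape for such a check, not a complete test suite: InjectionRegressionCheck and its payload list are assumptions, and it expects the code under test to return a document of the form <user><username>...</username></user>.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class InjectionRegressionCheck // hypothetical helper name
{
    // Illustrative payloads; extend with cases relevant to your application.
    public static readonly string[] Payloads =
    {
        "</username><admin>true</admin><username>",
        "<admin>true</admin>",
        "a & b",
        "]]>",
        "\"quoted\" 'value'"
    };

    // True if the payload survives as plain text inside a single <username>.
    public static bool RoundTripsAsText(Func<string, string> buildXml, string payload)
    {
        XElement root;
        try
        {
            root = XElement.Parse(buildXml(payload));
        }
        catch (XmlException)
        {
            return false; // the payload broke well-formedness
        }

        var children = root.Elements().ToList();
        return children.Count == 1
            && children[0].Name.LocalName == "username"
            && !children[0].HasElements
            && children[0].Value == payload;
    }
}
```

A builder based on XElement should pass for every payload, while a string-interpolating builder fails on the first one because the parsed document gains extra elements.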