CWE-611: XML External Entity (XXE) Injection - Go
Overview
XXE vulnerabilities occur when an XML parser resolves external entity references in untrusted input. Go's encoding/xml package does not implement DTD processing or external entity resolution: a DOCTYPE declaration is surfaced as an opaque directive token, and a reference to an entity that is neither predefined nor listed in Decoder.Entity is a syntax error in the default strict mode. This is a property of the package's design rather than of a particular release, so it does not depend on the Go version in use. Developers can still introduce XXE by using third-party XML libraries or by wiring untrusted or sensitive data into entity expansion themselves.
Unlike ecosystems whose default parsers have historically resolved external entities (many Java parsers, older PHP/libxml2 configurations), Go's standard library is secure by default. The xml.Decoder never expands external entities, which rules out file disclosure and SSRF through XXE. The remaining risks are configuration choices that reintroduce expansion, such as populating Decoder.Entity from files or request data, and third-party libraries such as libxml2 bindings.
The Go ecosystem strongly favors JSON over XML for APIs and configuration, reducing XXE exposure. When XML processing is necessary (SOAP services, legacy integrations, XML configurations), understanding the security characteristics of encoding/xml and avoiding third-party parsers with unsafe defaults is critical. The simplicity of Go's standard library XML parser - with minimal configuration options - is a security advantage, as there are fewer knobs to misconfigure compared to feature-rich parsers in other languages.
Primary Defence: Use Go's standard encoding/xml package, which does not resolve external entities. Avoid third-party XML libraries unless necessary; if one is required, explicitly disable external entity processing and DTD loading.
Common Vulnerable Patterns
Unhardened Standard Library Handler
// RISKY - relies on parser defaults with no other hardening
package main
import (
    "encoding/xml"
    "fmt"
    "io"
    "net/http"
)
type Result struct {
    Data string `xml:"data"`
}
func parseXMLHandler(w http.ResponseWriter, r *http.Request) {
    // RISKY: unbounded read, and the read error is discarded
    body, _ := io.ReadAll(r.Body)
    var result Result
    // encoding/xml does not resolve external entities on any Go version
    err := xml.Unmarshal(body, &result)
    if err != nil {
        // RISKY: raw parser errors are echoed to the client
        http.Error(w, err.Error(), 400)
        return
    }
    fmt.Fprintf(w, "Data: %s", result.Data)
}
// Attack XML:
// <?xml version="1.0"?>
// <!DOCTYPE foo [
// <!ENTITY xxe SYSTEM "file:///etc/passwd">
// ]>
// <result><data>&xxe;</data></result>
//
// With encoding/xml: syntax error for the undefined entity &xxe;
// With a parser that expands entities: reads /etc/passwd
Why this is risky: encoding/xml rejects this payload with a syntax error because it never interprets the DTD's entity declarations, and that holds for every Go release rather than only recent ones. The handler is still fragile: it reads an unbounded body, discards the read error, and returns raw parser errors to the client. Its safety also rests entirely on the choice of parser, so replacing xml.Unmarshal() with a library that expands entities would turn the same payload into file disclosure or SSRF.
Third-Party XML Libraries
// VULNERABLE - Using CGo-based libxml2 binding
import (
"io"
"net/http"
"github.com/lestrrat-go/libxml2"
)
func parseWithLibxml2(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
// DANGEROUS: libxml2 may have unsafe defaults
doc, err := libxml2.Parse(body)
if err != nil {
http.Error(w, err.Error(), 400)
return
}
defer doc.Free()
// Process document...
}
// Attack: Same XXE payload can trigger file disclosure
// if libxml2 has entity resolution enabled
Why this is vulnerable: Third-party libraries, especially those wrapping native XML parsers through CGo (libxml2, expat), do not share the security properties of Go's standard library. Depending on the library version and the parser options passed (for libxml2, flags such as XML_PARSE_NOENT and XML_PARSE_DTDLOAD), they can substitute entities and load external DTDs, so they must be treated as vulnerable unless entity substitution, DTD loading, and network access are explicitly disabled. CGo dependencies also complicate deployment and security reviews.
Custom Entity Resolution
// VULNERABLE - Populating Decoder.Entity from the filesystem
import (
    "encoding/xml"
    "io"
    "os"
)
// encoding/xml has no resolver callback; the only hook for extra
// entities is the Decoder.Entity map (entity name -> replacement text).
func decodeWithConfigEntity(r io.Reader, v any) error {
    // DANGEROUS: file contents become the expansion of &config;
    data, err := os.ReadFile("/etc/app/config.xml")
    if err != nil {
        return err
    }
    decoder := xml.NewDecoder(r)
    decoder.Entity = map[string]string{"config": string(data)}
    return decoder.Decode(v)
}
// Attack: Any document containing &config; receives the file contents
Why this is vulnerable: encoding/xml has no entity resolver callback; the only way to define additional entities is the Decoder.Entity map. Filling that map from files, network responses, or request data recreates the effect of an external entity, because any document that references the entity receives the content. Even a fixed entity name is an attack surface if the path or the data behind it can be influenced. Entity maps should be avoided unless necessary and limited to static, non-sensitive replacement text.
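For contrast, a restricted use of entity expansion can be sketched with a fixed, in-code map. This is a minimal illustration under stated assumptions, not a recommended pattern: staticEntities, Note, and decodeNote are hypothetical names, and the two entities are arbitrary examples.

```go
package main

import (
	"encoding/xml"
	"strings"
)

// staticEntities is a fixed allow-list defined at compile time
// (illustrative entries); nothing here comes from files or requests.
var staticEntities = map[string]string{
	"nbsp": "\u00a0",
	"copy": "\u00a9",
}

// Note is a hypothetical document type for this sketch.
type Note struct {
	Body string `xml:"body"`
}

// decodeNote decodes doc with the static entity map installed.
// Entities outside the map (and outside the five predefined XML
// entities) still fail in the decoder's default strict mode.
func decodeNote(doc string) (Note, error) {
	var n Note
	d := xml.NewDecoder(strings.NewReader(doc))
	d.Entity = staticEntities
	err := d.Decode(&n)
	return n, err
}
```

Calling decodeNote with `<note><body>a&nbsp;b</body></note>` expands the entity from the map, while a document that references `&config;` is rejected because that name is not defined anywhere.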
DTD Validation Enabled
// POTENTIALLY VULNERABLE - If DTD validation is somehow enabled
// (Note: encoding/xml doesn't provide built-in DTD validation,
// but this illustrates the concept)
import (
    "bytes"
    "encoding/xml"
    "io"
    "net/http"
)
func parseWithValidation(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "Read error", http.StatusBadRequest)
        return
    }
    // If using a library that validates DTDs,
    // external entity references in the DTD could be processed
    decoder := xml.NewDecoder(bytes.NewReader(body))
    // Decoding into an empty interface only checks well-formedness:
    // encoding/xml skips the element instead of building a value
    var result interface{}
    if err := decoder.Decode(&result); err != nil {
        http.Error(w, err.Error(), 400)
        return
    }
}
// Attack: DTD with external entity declarations
Why this is vulnerable: DTD validation can trigger external entity processing because DTDs can define entities that reference external resources. While Go's encoding/xml doesn't validate DTDs, third-party libraries that do can be vulnerable if they don't disable external entity resolution during validation.
Secure Patterns
Use encoding/xml - Default Safe
// SECURE - Standard library does not resolve external entities
package main
import (
    "encoding/json"
    "encoding/xml"
    "io"
    "log"
    "net/http"
)
type User struct {
    XMLName xml.Name `xml:"user"`
    Name    string   `xml:"name"`
    Email   string   `xml:"email"`
    Role    string   `xml:"role"`
}
func parseXMLHandler(w http.ResponseWriter, r *http.Request) {
    // Cap the request body at 1 MB to bound memory use
    r.Body = http.MaxBytesReader(w, r.Body, 1<<20)
    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "Request too large", http.StatusRequestEntityTooLarge)
        return
    }
    var user User
    // SECURE: the DOCTYPE is skipped and &xxe; fails as an undefined entity
    if err := xml.Unmarshal(body, &user); err != nil {
        log.Printf("XML parse error: %v", err)
        http.Error(w, "Invalid XML", http.StatusBadRequest)
        return
    }
    // Validate parsed data
    if user.Name == "" || user.Email == "" {
        http.Error(w, "Missing required fields", http.StatusBadRequest)
        return
    }
    // Build the response with encoding/json so user input is escaped
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]string{
        "status": "success",
        "name":   user.Name,
    })
}
Why this works: encoding/xml does not resolve external entities. The DOCTYPE block is skipped without being interpreted, so when the parser reaches &xxe;, which was declared with <!ENTITY xxe SYSTEM "file:///etc/passwd">, the entity is undefined and xml.Unmarshal() returns a syntax error; nothing is read from disk or fetched over the network. Because DTD entity declarations are never expanded, billion-laughs style payloads fail the same way, and http.MaxBytesReader bounds the memory an oversized body can consume. Validating the parsed fields ensures that well-formed but incomplete documents are still rejected by the business logic.
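The parser's handling of the attack document can be checked directly. The following is a minimal sketch, assuming the default strict decoder settings; parseResult and xxePayload are illustrative names introduced here, and Result is the struct from the earlier example.

```go
package main

import "encoding/xml"

// Result mirrors the struct used in the earlier handler example.
type Result struct {
	Data string `xml:"data"`
}

// xxePayload is the classic file-disclosure document.
const xxePayload = `<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<result><data>&xxe;</data></result>`

// parseResult (hypothetical helper) decodes doc with xml.Unmarshal and
// returns the parsed value together with any error from the parser.
func parseResult(doc string) (Result, error) {
	var res Result
	err := xml.Unmarshal([]byte(doc), &res)
	return res, err
}
```

Passing xxePayload to parseResult yields a non-nil error because &xxe; is not a defined entity from the decoder's point of view, while an ordinary document such as `<result><data>ok</data></result>` parses normally.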
XML Decoder with Streaming (Large Documents)
// SECURE - Streaming XML parsing for large documents
import (
"encoding/xml"
"io"
"net/http"
)
type Item struct {
ID string `xml:"id,attr"`
Name string `xml:"name"`
Value string `xml:"value"`
}
func streamXMLHandler(w http.ResponseWriter, r *http.Request) {
// Create decoder from request body
decoder := xml.NewDecoder(r.Body)
defer r.Body.Close()
// SECURE: Decoder also doesn't resolve external entities
var items []Item
for {
token, err := decoder.Token()
if err == io.EOF {
break
}
if err != nil {
http.Error(w, "Parse error", 400)
return
}
// Process start elements
if se, ok := token.(xml.StartElement); ok {
if se.Name.Local == "item" {
var item Item
if err := decoder.DecodeElement(&item, &se); err != nil {
http.Error(w, "Decode error", 400)
return
}
items = append(items, item)
}
}
}
// Process items...
w.WriteHeader(http.StatusOK)
}
Why this works: xml.NewDecoder() creates a streaming decoder that processes XML incrementally, reducing memory usage for large documents. The decoder inherits the same XXE protection as xml.Unmarshal() - external entities are not resolved. Streaming also provides better control over parsing, allowing rejection of documents with unexpected structure before full parsing. This pattern is ideal for SOAP services or XML feeds where documents may be large.
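Rejecting a document partway through the stream can be sketched as follows. This is an illustrative variant of the loop above, assuming a caller-chosen cap on the number of items; decodeItems, decodeItemsString, and errTooManyItems are hypothetical names.

```go
package main

import (
	"encoding/xml"
	"errors"
	"io"
	"strings"
)

// Item is a trimmed-down version of the struct in the handler example.
type Item struct {
	ID   string `xml:"id,attr"`
	Name string `xml:"name"`
}

var errTooManyItems = errors.New("too many items")

// decodeItems streams <item> elements from r and stops with an error
// as soon as more than maxItems have been seen, so the rest of an
// oversized document is never decoded.
func decodeItems(r io.Reader, maxItems int) ([]Item, error) {
	d := xml.NewDecoder(r)
	var items []Item
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return items, nil
		}
		if err != nil {
			return nil, err
		}
		se, ok := tok.(xml.StartElement)
		if !ok || se.Name.Local != "item" {
			continue
		}
		if len(items) >= maxItems {
			return nil, errTooManyItems
		}
		var it Item
		if err := d.DecodeElement(&it, &se); err != nil {
			return nil, err
		}
		items = append(items, it)
	}
}

// decodeItemsString is a convenience wrapper for string input.
func decodeItemsString(doc string, maxItems int) ([]Item, error) {
	return decodeItems(strings.NewReader(doc), maxItems)
}
```

In a handler, r.Body would be passed to decodeItems, ideally after wrapping it with http.MaxBytesReader so that both the byte count and the item count are bounded.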
Input Sanitization (Defense in Depth)
// SECURE - Reject XML with DOCTYPE declarations
import (
"bufio"
"bytes"
"encoding/xml"
"io"
"net/http"
"strings"
)
func secureXMLHandler(w http.ResponseWriter, r *http.Request) {
body, err := io.ReadAll(io.LimitReader(r.Body, 1048576))
if err != nil {
http.Error(w, "Read error", 500)
return
}
defer r.Body.Close()
// SECURE: Reject XML with DOCTYPE (defense in depth)
if containsDOCTYPE(body) {
http.Error(w, "DOCTYPE declarations not allowed", 400)
return
}
var data interface{}
err = xml.Unmarshal(body, &data)
if err != nil {
http.Error(w, "Invalid XML", 400)
return
}
w.WriteHeader(http.StatusOK)
}
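func containsDOCTYPE(xmlData []byte) bool {
scanner := bufio.NewScanner(bytes.NewReader(xmlData))
// Raise the default 64 KB token limit so that a single long line
// cannot stop the scan early and slip a DOCTYPE past the check
scanner.Buffer(make([]byte, 0, 64*1024), len(xmlData)+1)
for scanner.Scan() {
line := strings.TrimSpace(scanner.Text())
if strings.Contains(strings.ToUpper(line), "<!DOCTYPE") {
return true
}
}
return false
}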
Why this works: encoding/xml already refuses to resolve external entities, so rejecting documents that contain <!DOCTYPE is defense in depth: it keeps the handler safe if the parser is later replaced with one that honours DTDs. The scan is a cheap, conservative byte match that runs before parsing, providing an early rejection point. It also rejects documents that merely mention <!DOCTYPE inside a comment or CDATA section, which is usually an acceptable trade-off for untrusted input, and it assumes UTF-8 input, which is what encoding/xml accepts unless a CharsetReader is configured.
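A token-based alternative to the byte scan can be sketched with the decoder itself, which reports a DOCTYPE as an xml.Directive token. This is a minimal sketch under that assumption; rejectDoctype, checkDocument, and errDoctype are hypothetical names, and the approach costs an extra parsing pass over the document.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"errors"
	"io"
)

var errDoctype = errors.New("DOCTYPE declarations not allowed")

// rejectDoctype walks the token stream and fails on the first
// directive that starts with DOCTYPE. Text that only mentions
// "<!DOCTYPE" inside a comment or CDATA section is not a directive
// token, so it does not trigger the check.
func rejectDoctype(data []byte) error {
	d := xml.NewDecoder(bytes.NewReader(data))
	for {
		tok, err := d.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		dir, ok := tok.(xml.Directive)
		if ok && bytes.HasPrefix(bytes.TrimSpace(dir), []byte("DOCTYPE")) {
			return errDoctype
		}
	}
}

// checkDocument is a convenience wrapper for string input.
func checkDocument(doc string) error {
	return rejectDoctype([]byte(doc))
}
```

The trade-off is precision against cost: the byte scan is cheaper and more conservative, while the token walk avoids false positives at the price of tokenizing the input before the real decode.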
Prefer JSON Over XML
// SECURE - Use JSON instead of XML (best practice)
import (
"encoding/json"
"io"
"net/http"
)
type UserRequest struct {
Name string `json:"name"`
Email string `json:"email"`
Role string `json:"role"`
}
func jsonHandler(w http.ResponseWriter, r *http.Request) {
var user UserRequest
// SECURE: JSON has no XXE vulnerabilities
decoder := json.NewDecoder(io.LimitReader(r.Body, 1048576))
decoder.DisallowUnknownFields() // Strict parsing
err := decoder.Decode(&user)
if err != nil {
http.Error(w, "Invalid JSON", 400)
return
}
defer r.Body.Close()
// Validate
if user.Name == "" || user.Email == "" {
http.Error(w, "Missing fields", 400)
return
}
// Process...
w.Header().Set("Content-Type", "application/json")
json.NewEncoder(w).Encode(map[string]string{
"status": "success",
"name": user.Name,
})
}
Why this works: JSON parsers have no concept of external entities, DTDs, or entity references - the format simply doesn't support them. By using JSON instead of XML for APIs and configuration, XXE vulnerabilities are completely eliminated. JSON is also simpler to parse, has broader client support, and generates smaller payloads. This is the recommended approach for new systems unless XML is required for compatibility (SOAP, legacy systems).
Framework-Specific Guidance
Gin with XML Binding
// SECURE - Gin XML binding uses encoding/xml
package main
import (
"net/http"
"github.com/gin-gonic/gin"
)
type Config struct {
AppName string `xml:"appname"`
Version string `xml:"version"`
Debug bool `xml:"debug"`
}
func main() {
r := gin.Default()
r.POST("/config", configHandler)
r.Run(":8080")
}
func configHandler(c *gin.Context) {
// Cap the body at 1 MB; MaxMultipartMemory only affects multipart forms
c.Request.Body = http.MaxBytesReader(c.Writer, c.Request.Body, 1<<20)
var config Config
// SECURE: Gin uses encoding/xml internally
if err := c.ShouldBindXML(&config); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "Invalid XML"})
return
}
// Validate
if config.AppName == "" {
c.JSON(http.StatusBadRequest, gin.H{"error": "appname required"})
return
}
c.JSON(http.StatusOK, gin.H{
"status": "configured",
"app": config.AppName,
})
}
Why this works: Gin's ShouldBindXML() and BindXML() methods use Go's standard encoding/xml package internally, inheriting its XXE protection. Gin does not cap request bodies on its own (MaxMultipartMemory only controls how much of a multipart form is buffered in memory), so XML endpoints should wrap the body with http.MaxBytesReader. Binding maps the document onto the Go struct and runs any binding tag validation; elements with no matching field are ignored rather than rejected, so binding is not a substitute for schema validation and required fields still need explicit checks.
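Request size limiting with the standard library can be sketched as a small net/http wrapper. This is a framework-neutral illustration rather than Gin's own API; limitBody, readAll, and statusFor are hypothetical names, and statusFor exists only to exercise the wrapper with httptest.

```go
package main

import (
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// limitBody wraps next so that every request body is capped at maxBytes.
func limitBody(next http.Handler, maxBytes int64) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Body = http.MaxBytesReader(w, r.Body, maxBytes)
		next.ServeHTTP(w, r)
	})
}

// readAll is a stand-in handler that consumes the body and reports
// 413 when the cap set by limitBody is exceeded.
func readAll(w http.ResponseWriter, r *http.Request) {
	if _, err := io.ReadAll(r.Body); err != nil {
		http.Error(w, "Request too large", http.StatusRequestEntityTooLarge)
		return
	}
	w.WriteHeader(http.StatusOK)
}

// statusFor sends a body of n bytes through the limiter and returns
// the response status code.
func statusFor(n int, maxBytes int64) int {
	body := strings.NewReader(strings.Repeat("x", n))
	req := httptest.NewRequest(http.MethodPost, "/config", body)
	rec := httptest.NewRecorder()
	limitBody(http.HandlerFunc(readAll), maxBytes).ServeHTTP(rec, req)
	return rec.Code
}
```

Inside a Gin handler the same cap is applied by assigning http.MaxBytesReader(c.Writer, c.Request.Body, n) to c.Request.Body before calling ShouldBindXML().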
Echo XML Deserialization
// SECURE - Echo with XML binding
package main
import (
"net/http"
"github.com/labstack/echo/v4"
"github.com/labstack/echo/v4/middleware"
)
type Message struct {
To string `xml:"to"`
From string `xml:"from"`
Content string `xml:"content"`
}
func main() {
e := echo.New()
// Middleware
e.Use(middleware.Logger())
e.Use(middleware.Recover())
e.Use(middleware.BodyLimit("1M")) // Limit body size
e.POST("/message", messageHandler)
e.Start(":8080")
}
func messageHandler(c echo.Context) error {
var msg Message
// SECURE: Echo uses encoding/xml
if err := c.Bind(&msg); err != nil {
return c.JSON(http.StatusBadRequest, map[string]string{
"error": "Invalid XML",
})
}
// Validate
if msg.To == "" || msg.From == "" {
return c.JSON(http.StatusBadRequest, map[string]string{
"error": "Missing required fields",
})
}
return c.JSON(http.StatusOK, map[string]interface{}{
"status": "received",
"to": msg.To,
"from": msg.From,
"preview": msg.Content[:min(len(msg.Content), 50)],
})
}
func min(a, b int) int {
if a < b {
return a
}
return b
}
Why this works: Echo's Bind() method selects a deserializer from the request's Content-Type. For application/xml and text/xml it uses encoding/xml, which provides the XXE protection. The middleware.BodyLimit middleware enforces a request size limit across all routes, bounding the memory an oversized document can consume. The handler returns a generic error message, so parser details are not exposed to the client.