CWE-79: Cross-Site Scripting (XSS) - Java
Overview
Cross-Site Scripting (CWE-79) occurs when untrusted data is included in web pages without proper encoding. Attackers inject malicious scripts that execute in victims' browsers, leading to session theft, credential harvesting, defacement, or malware distribution. Java applications must encode all user-controlled output using context-appropriate methods (HTML, JavaScript, URL, CSS, JSON). Unlike CWE-80, which focuses on basic XSS, CWE-79 encompasses the full spectrum of XSS vulnerabilities, including reflected, stored, and DOM-based attacks.
Primary Defence: Use JSTL <c:out> tag for automatic HTML escaping in JSP, or OWASP Java Encoder's Encode.forHtml(), Encode.forJavaScript(), and other context-specific methods for output encoding in servlets and templates.
Common Vulnerable Patterns
Direct Output to JSP Without Encoding
<%-- VULNERABLE - scriptlet with no encoding --%>
<div>Welcome, <%= request.getParameter("username") %></div>
<%-- VULNERABLE - EL without c:out --%>
<p>Comment: ${param.comment}</p>
<%-- VULNERABLE - attribute without encoding --%>
<input type="text" value="<%= request.getParameter("search") %>">
Attack Examples:
username=<script>alert(document.cookie)</script>
comment=<img src=x onerror=alert('XSS')>
search="><script>alert(1)</script>
Why this is vulnerable: JSP scriptlets (<%= %>) and EL expressions (${}) output raw content without HTML encoding, allowing attackers to inject JavaScript or HTML that executes in the victim's browser.
Servlet PrintWriter Without Encoding
// VULNERABLE - Direct output
public void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
String name = request.getParameter("name");
response.setContentType("text/html");
PrintWriter out = response.getWriter();
out.println("<h1>Welcome " + name + "!</h1>"); // NO ENCODING
}
// VULNERABLE - StringBuilder concatenation
String comment = request.getParameter("comment");
StringBuilder html = new StringBuilder();
html.append("<div class='comment'>");
html.append(comment); // NO ENCODING
html.append("</div>");
response.getWriter().write(html.toString());
Why this is vulnerable: Writing user input directly to HTTP response via PrintWriter without HTML encoding allows injection of malicious scripts that execute when the page renders in the browser.
JavaScript Context Without Encoding
// VULNERABLE - User data in JavaScript
String userName = request.getParameter("user");
out.println("<script>");
out.println("var currentUser = '" + userName + "';"); // NO JS ENCODING
out.println("</script>");
// Attack: user='; alert(document.cookie); //
// Result: var currentUser = ''; alert(document.cookie); //';
Why this is vulnerable: Inserting user data into JavaScript without JavaScript-specific encoding allows attackers to break out of string literals using quotes and inject arbitrary JavaScript code.
URL Context Without Encoding
// VULNERABLE - User-controlled URL placed in href without validation
String redirect = request.getParameter("returnUrl");
out.println("<a href=\"" + redirect + "\">Continue</a>"); // NO SCHEME VALIDATION, NO ENCODING
// Attack: returnUrl=javascript:alert('XSS')
Why this is vulnerable: Placing unvalidated, unencoded user input in href attributes allows javascript: scheme injection, which executes when the user clicks the link, as well as attribute breakout via quotes. Encoding alone does not stop javascript: URLs; the scheme must also be validated.
Thymeleaf th:utext (Unescaped)
<!-- VULNERABLE - th:utext bypasses escaping -->
<div th:utext="${userInput}">Content</div>
<!-- VULNERABLE - Using raw HTML from user -->
<p th:utext="${request.getParameter('content')}"></p>
Why this is vulnerable: Thymeleaf's th:utext attribute explicitly bypasses HTML escaping, rendering raw HTML/JavaScript from user input that executes in the browser.
JSON Responses with Manual Construction
// VULNERABLE - Manual JSON construction
public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
String name = request.getParameter("name");
String json = "{\"userName\":\"" + name + "\"}";
response.setContentType("application/json");
response.getWriter().write(json); // NO ESCAPING
}
// Attack: name=test","admin":true,"x":"y
// Result: {"userName":"test","admin":true,"x":"y"}
Why this is vulnerable: Manually constructing JSON with string concatenation allows injection of quotes to manipulate the JSON structure, potentially leading to privilege escalation or XSS when the JSON is consumed by JavaScript.
Stored XSS from Database
// VULNERABLE - Not encoding database content
String userBio = database.getUserBio(userId);
out.println("<div class='bio'>" + userBio + "</div>"); // NO ENCODING
// Even trusted sources need encoding!
Why this is vulnerable: Data retrieved from databases can contain malicious scripts inserted by attackers through other input points; outputting this data without encoding causes stored XSS that affects all users viewing the content.
DOM-Based XSS
// VULNERABLE - Passing unsanitized data to client-side JavaScript
String searchTerm = request.getParameter("q");
out.println("<script>");
out.println("document.getElementById('result').innerHTML = '" + searchTerm + "';");
out.println("</script>");
Why this is vulnerable: Passing unencoded user data to client-side JavaScript that manipulates the DOM (innerHTML, eval, etc.) allows XSS attacks that bypass server-side protections.
Secure Patterns
JSTL c:out in JSP (HTML Text and Attribute Output)
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
<!-- SAFE - c:out auto-escapes HTML -->
<div>Welcome, <c:out value="${param.username}"/></div>
<!-- SAFE - Attribute context -->
<input type="text" value="<c:out value='${param.search}'/>"/>
<!-- SAFE - With default value -->
<p><c:out value="${user.bio}" default="No bio available"/></p>
<!-- SAFE - Disable escaping only for trusted content -->
<c:out value="${trustedHtml}" escapeXml="false"/> <!-- USE WITH CAUTION -->
Why this works: c:out performs XML/HTML-style escaping by default, converting <, >, &, and quotes to entity form and preventing attacker-controlled markup from being interpreted by the browser in ordinary HTML text and quoted attribute contexts. Because the escaping happens inside the JSP tag implementation, every value passed to c:out is encoded unless escapeXml="false" is explicitly set. This closes the classic reflected/stored XSS path where user input is printed directly into the DOM.
The tag renders safely in both body and quoted attribute contexts, so <c:out value='${param.search}'/> inside an attribute is still escaped for quotes and angle brackets, avoiding attribute-breaking payloads like " onmouseover=alert(1). It is not a JavaScript, CSS, or URL validator. For script strings, CSS values, and URL components, use context-specific encoders such as OWASP Java Encoder and validate dangerous URL schemes such as javascript:. Only explicitly trusted, pre-sanitized HTML should use escapeXml="false", and that flag serves as an intentional opt-out, making risky usage visible during reviews.
Operationally, this integrates with any servlet/JSP stack (Tomcat/Jetty/WildFly) and requires no extra libraries. Performance impact is negligible because escaping is linear in the size of the string. Use this pattern for most JSP output; reserve raw output only for server-generated, sanitized fragments (e.g., CMS-rendered content post-HTML sanitizer) and keep those locations narrowly scoped and code-reviewed.
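To make the substitution concrete, the sketch below mimics the five-character escaping described above. It is for illustration only: HtmlEscapeSketch and escapeXml are hypothetical names, not part of JSTL, and the numeric entity spellings are just one valid choice. Production code should keep using c:out or a maintained encoder rather than a hand-written routine.

```java
final class HtmlEscapeSketch {
    private HtmlEscapeSketch() {}

    /**
     * Replaces the five HTML-significant characters with entities.
     * A null input is rendered as an empty string (an assumption of this sketch).
     */
    static String escapeXml(String input) {
        if (input == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '<':  out.append("&lt;");  break;
                case '>':  out.append("&gt;");  break;
                case '&':  out.append("&amp;"); break;
                case '"':  out.append("&#34;"); break;
                case '\'': out.append("&#39;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

With this substitution, the attribute-breaking payload "><script>alert(1)</script> is emitted as inert text because neither the quote nor the angle brackets survive in literal form.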
OWASP Java Encoder (Recommended for Servlets)
import org.owasp.encoder.Encode;
public void doGet(HttpServletRequest request, HttpServletResponse response)
throws ServletException, IOException {
String name = request.getParameter("name");
String comment = request.getParameter("comment");
response.setContentType("text/html; charset=UTF-8");
PrintWriter out = response.getWriter();
// SAFE - HTML context
out.println("<h1>Welcome " + Encode.forHtml(name) + "!</h1>");
// SAFE - HTML attribute context
out.println("<div class=\"" + Encode.forHtmlAttribute(comment) + "\">");
// SAFE - JavaScript string context
out.println("<script>");
out.println("var user = '" + Encode.forJavaScript(name) + "';");
out.println("</script>");
// SAFE - URL query parameter plus HTML attribute context
String search = request.getParameter("q");
String url = "/search?q=" + Encode.forUriComponent(search);
out.println("<a href=\"" + Encode.forHtmlAttribute(url) + "\">Search</a>");
}
Why this works: OWASP Java Encoder applies context-specific escaping right where untrusted data is written: forHtml for element bodies, forHtmlAttribute for attributes, forJavaScript for JS strings, and forUriComponent for URI components such as query parameter values. Escaping happens before concatenation, so characters like <, >, quotes, &, and control characters are converted to harmless entities/escapes, preventing both reflected and stored XSS. Each function name signals its sink, making misuse easy to spot in review and reducing the chance of attribute- or script-breaking payloads such as " onmouseover=alert(1) or '</script><script>....
The library is small, dependency-free, and works on any servlet container. The encoder operates on Java strings and does not inspect the response, so declare the response charset explicitly (text/html; charset=UTF-8 above) to ensure the browser decodes the output as intended and to avoid multi-byte or mojibake bypasses. Use it when rendering mixed contexts in servlets/JSPs or when you cannot rely on a templating engine’s auto-escaping. For rich HTML you intentionally allow, pair with a sanitizer (AntiSamy/OWASP Java HTML Sanitizer) instead of turning off encoding.
Spring HtmlUtils
import org.springframework.web.util.HtmlUtils;
@GetMapping("/profile")
public String showProfile(@RequestParam String name, Model model) {
// SAFE - Escape before adding to model
String safeName = HtmlUtils.htmlEscape(name);
model.addAttribute("userName", safeName);
return "profile";
}
// Or in template:
<%@ page import="org.springframework.web.util.HtmlUtils" %>
<div><%= HtmlUtils.htmlEscape(request.getParameter("name")) %></div>
Why this works: HtmlUtils.htmlEscape() performs HTML entity encoding server-side, converting <, >, &, and quotes into entities so user data is emitted as text, not markup. Because encoding happens before values enter the JSP/servlet response, attacker-controlled characters cannot break out of the element or attribute, mitigating reflected and stored XSS. It ships with Spring Web (no extra dependency) and handles Unicode correctly, especially when paired with text/html; charset=UTF-8 on the response.
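This is ideal for legacy JSP/servlet code that lacks template auto-escaping. If the view already escapes its output (for example c:out or Thymeleaf th:text), escape in only one place; escaping in both the controller and the view double-encodes the value and displays literal entities such as &lt; to the user. For JavaScript strings or URLs, combine with context-appropriate encoders (e.g., OWASP Java Encoder) rather than HTML-escaping. If limited HTML must be allowed, run it through a sanitizer first and keep unsafe sections narrowly scoped.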
Thymeleaf (Spring Boot) - Auto-Escaping
<!-- SAFE - th:text auto-escapes -->
<div th:text="${userInput}">Default</div>
<!-- SAFE - Attribute binding -->
<input type="text" th:value="${searchTerm}"/>
<!-- SAFE - URL parameters -->
<a th:href="@{/search(q=${query})}">Search</a>
<!-- SAFE - Multiple attributes -->
<div th:attr="data-user=${userName}, data-id=${userId}"></div>
Why this works: Thymeleaf auto-escapes by default for both text (th:text) and attributes (th:value, th:attr), inserting user data as text nodes instead of executable markup. URL expressions (@{...}) also encode parameters, so injected values cannot break out of the link or introduce new attributes. Because escaping is built into the engine, developers don’t have to remember per-sink encoders - safe-by-default rendering closes common reflected/stored XSS paths in forms, labels, and links.
Avoid th:utext unless the content is pre-sanitized and intentionally allowed; the code above uses only escaping expressions. Pair with a CSP for defense in depth and keep templates server-controlled to prevent template injection.
JSF (JavaServer Faces)
<!-- SAFE - h:outputText escapes by default -->
<h:outputText value="#{userBean.name}"/>
<!-- SAFE - escape=true (explicit) -->
<h:outputText value="#{userBean.comment}" escape="true"/>
<!-- SAFE - Input components auto-escape -->
<h:inputText value="#{userBean.search}"/>
Why this works: JSF renderers escape HTML by default, so values bound through EL are emitted as text, not markup. The escape="true" default converts <, >, quotes, and &, preventing attribute- or element-breaking payloads from turning into script. Because escaping lives in the component pipeline (not ad hoc string concatenation), developers gain safe-by-default behavior across views and partial page updates.
Setting escape="false" is an explicit opt-out and should be reserved for trusted, pre-sanitized HTML. Use the default for almost all outputs, pair with a sanitizer for rich text, and keep charset=UTF-8 on responses to avoid encoding tricks.
Jackson for JSON (REST APIs)
import com.fasterxml.jackson.databind.ObjectMapper;
@RestController
public class UserController {
@GetMapping("/api/user")
public ResponseEntity<Map<String, String>> getUser(@RequestParam String name) {
// SAFE - Jackson auto-escapes when serializing
Map<String, String> response = new HashMap<>();
response.put("name", name); // No manual escaping needed
response.put("timestamp", Instant.now().toString());
return ResponseEntity.ok(response); // SAFE JSON
}
}
// SAFE - Using DTO
@GetMapping("/api/profile")
public User getUserProfile(@RequestParam String id) {
User user = userService.findById(id);
return user; // Jackson handles escaping automatically
}
Why this works: Returning JSON instead of HTML removes the browser’s HTML parser from the equation when the response is served with Content-Type: application/json. Jackson serializes data as JSON strings and escapes JSON syntax characters such as ", \, and control characters according to the JSON spec. It should not be treated as HTML encoding: characters such as < may remain literal JSON string data depending on configuration. The security benefit is that the server sends data, not markup, and the client must render that data with safe DOM APIs or an auto-escaping framework.
The use of DTOs/maps with Jackson avoids template injection - fields are serialized by name, not interpolated into HTML. When the frontend consumes the JSON, frameworks (React/Vue/Angular) by default insert values via DOM APIs that set textContent/value rather than innerHTML, maintaining escaping on the client as well. If a legacy client turns JSON into HTML, it must still encode for the correct context: the server-side guarantee is only that the response is well-formed JSON data, and string values may still contain characters such as <.
Operationally, this approach is the safest default for REST endpoints. Set Content-Type: application/json; charset=UTF-8 and X-Content-Type-Options: nosniff to reduce content-sniffing risks. Keep business data in JSON and let frontend templating handle display with built-in escaping; combine with CSP on the client to further reduce script injection risk.
Framework-Specific Guidance
Thymeleaf (Spring Boot)
<!-- SAFE - th:text automatically escapes -->
<div th:text="${userInput}">Default</div>
<!-- SAFE - th:value for attributes -->
<input type="text" th:value="${searchTerm}">
<!-- VULNERABLE - th:utext outputs unescaped -->
<div th:utext="${userInput}">NEVER USE THIS</div>
<!-- Safe URL parameter -->
<a th:href="@{/search(q=${searchTerm})}">Search</a>
Why this works: Thymeleaf auto-escapes by default for both text (th:text) and attribute bindings (th:attr, th:value), converting special characters so user input is inserted as text, not markup or script. The template engine handles context-aware escaping based on where the expression is placed, preventing classic reflected/stored XSS when rendering forms, links, and labels. The explicit th:utext is the opt-out that renders unescaped HTML; keeping it out of templates or restricting it to pre-sanitized, trusted content ensures safe-by-default behavior.
URL expressions (@{...}) encode parameters, so query values cannot break out of the URL or inject new parameters. Because escaping is embedded in the engine, developers don’t have to remember per-sink encoders - security is applied automatically at render time, reducing missed spots in large views. This aligns with Spring Boot MVC conventions and works with internationalization and fragment reuse without additional code.
For production, keep th:utext behind strong validation/sanitization (e.g., AntiSamy) if you must allow limited HTML. Pair Thymeleaf auto-escaping with a Content Security Policy to limit damage from any remaining inline script paths, and ensure templates are not user-supplied - only data flows into expressions.
JSF (JavaServer Faces)
<!-- SAFE - h:outputText escapes by default -->
<h:outputText value="#{bean.userInput}"/>
<!-- SAFE - escape=true (default) -->
<h:outputText value="#{bean.comment}" escape="true"/>
<!-- VULNERABLE - escape=false -->
<h:outputText value="#{bean.comment}" escape="false"/>
Why this works: JSF components like h:outputText escape HTML by default, turning user input into text nodes instead of markup. The escape="true" setting (default) converts <, >, &, and quotes, preventing attacker-controlled data from altering the DOM or injecting scripts. Because escaping is handled by the component renderer, developers don’t need to manually encode each value, reducing omissions across views. Setting escape="false" is an explicit opt-out and should be reserved only for pre-sanitized, trusted fragments, making risky usage visible during reviews.
All standard JSF input/display components benefit from this rendering pipeline, which is consistent across server-side rendering and partial page updates (AJAX). The component model also isolates concerns - data binding populates beans, and renderers apply escaping - so business logic and presentation stay separate. When EL expressions are evaluated, JSF inserts values into the component tree, not raw strings in templates, further reducing injection risk.
In production, keep escape at its default and avoid mixing user HTML with escape="false". If rich text is required, pass it through a sanitizer before rendering. Combine with a CSP to limit inline script execution and configure the view handler to disable legacy inline JavaScript features where possible.
Apache Velocity
#* SAFE - $esc.html() for HTML context *#
<div>$esc.html($userInput)</div>
#* SAFE - $esc.url() for URLs *#
<a href="/search?q=$esc.url($searchTerm)">Search</a>
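#* Configure automatic escaping in velocity.properties (Velocity 1.x property names): *#
eventhandler.referenceinsertion.class = org.apache.velocity.app.event.implement.EscapeHtmlReference
#* Optional: escape only references whose names match this pattern; omit to escape every reference *#
eventhandler.escape.html.match = /.*/
Why this works: Velocity’s $esc.html() and $esc.url() functions perform context-appropriate escaping before user data is inserted into templates, preventing it from being interpreted as HTML or script. HTML escaping converts <, >, ", ', and & to entities, stopping element or attribute injection; URL escaping makes query parameters safe, preventing delimiter injection. Registering the EscapeHtmlReference event handler in velocity.properties applies HTML escaping automatically to references as they are inserted into a template; when eventhandler.escape.html.match is set, only references whose names match the pattern are escaped (the pattern is matched against reference names, not template file names). This provides a safe-by-default baseline for legacy templates that might otherwise forget to encode output.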
Because escaping is done server-side in the rendering pipeline, reflected and stored XSS are mitigated regardless of browser quirks. Explicit helper calls ($esc.html, $esc.url) remain available for clarity in critical spots and other contexts. Automatic escaping reduces the chance of missed encoders in large template sets, while still allowing controlled opt-outs when rendering trusted, sanitized HTML (after a sanitizer like AntiSamy or OWASP Java HTML Sanitizer).
For production, keep automatic escaping enabled globally and review any template that bypasses it. Pair Velocity encoding with a CSP and disable legacy inline script allowances where possible. Treat user-provided templates as untrusted - do not render arbitrary template content; limit to developer-controlled .vm files stored on the server.
Context-Specific Encoding
HTML Body Context
import org.owasp.encoder.Encode;
String userInput = getUserInput();
out.println("<p>" + Encode.forHtml(userInput) + "</p>");
HTML Attribute Context
String userInput = getUserInput();
out.println("<div class=\"" + Encode.forHtmlAttribute(userInput) + "\">");
JavaScript Context
String userName = getUserName();
out.println("<script>");
out.println("var currentUser = '" + Encode.forJavaScript(userName) + "';");
out.println("</script>");
URL Parameter Context
String searchTerm = getSearchTerm();
String url = "/search?q=" + Encode.forUriComponent(searchTerm);
out.println("<a href=\"" + Encode.forHtmlAttribute(url) + "\">Search</a>");
CSS Context (Avoid if Possible)
// WARNING: CSS context is complex - avoid user input in CSS
String color = getUserColor();
if (!color.matches("^[a-zA-Z0-9#]+$")) {
throw new SecurityException("Invalid color");
}
out.println("<div style=\"color: " + color + "\">");
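A tighter variant of the check above could accept only well-formed hex colors or names from a short allowlist, rather than any run of letters, digits, and #. The following is a minimal sketch under that assumption; CssColorValidator, requireSafeColor, and the particular set of named colors are hypothetical choices for illustration.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

final class CssColorValidator {
    // #RGB or #RRGGBB only; Matcher.matches() anchors the pattern to the whole input
    private static final Pattern HEX_COLOR =
            Pattern.compile("#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})");

    // Example allowlist; a real application would list the names it actually supports
    private static final Set<String> NAMED_COLORS =
            Set.of("black", "white", "red", "green", "blue", "gray");

    private CssColorValidator() {}

    /** Returns a color value safe to place in a style attribute, or throws. */
    static String requireSafeColor(String color) {
        if (color != null) {
            if (HEX_COLOR.matcher(color).matches()) {
                return color;
            }
            String lower = color.toLowerCase(Locale.ROOT);
            if (NAMED_COLORS.contains(lower)) {
                return lower;
            }
        }
        throw new IllegalArgumentException("Invalid color");
    }
}
```

Because only a fixed shape or a known name is accepted, values such as red;background:url(x) are rejected before they reach the style attribute.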
Content Security Policy (CSP)
Servlet Filter
@WebFilter("/*")
public class CSPFilter implements Filter {
public void doFilter(ServletRequest request, ServletResponse response,
FilterChain chain) throws IOException, ServletException {
HttpServletResponse httpResponse = (HttpServletResponse) response;
httpResponse.setHeader("Content-Security-Policy",
"default-src 'self'; " +
"script-src 'self'; " +
"style-src 'self'; " +
"img-src 'self' https://trusted-cdn.com; " +
"connect-src 'self'; " +
"frame-ancestors 'none';"
);
chain.doFilter(request, response);
}
}
Spring Security Configuration
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.security.config.Customizer;
import org.springframework.security.config.annotation.web.builders.HttpSecurity;
import org.springframework.security.config.annotation.web.configuration.EnableWebSecurity;
import org.springframework.security.web.SecurityFilterChain;
@Configuration
@EnableWebSecurity
public class SecurityConfig {
@Bean
SecurityFilterChain securityFilterChain(HttpSecurity http) throws Exception {
http.headers(headers -> headers
.contentSecurityPolicy(csp -> csp.policyDirectives(
"default-src 'self'; " +
"script-src 'self'; " +
"object-src 'none'; " +
"base-uri 'self'; " +
"frame-ancestors 'none';"
))
.contentTypeOptions(Customizer.withDefaults()));
return http.build();
}
}
Why this works: A strict Content Security Policy (CSP) blocks execution of injected scripts even if an encoding mistake slips through. The sample policy limits scripts to the same origin (script-src 'self'), prevents plugin execution with object-src 'none', restricts base URL changes with base-uri 'self', and blocks framing with frame-ancestors 'none'. By setting the header at the servlet filter or Spring Security layer, every response inherits the policy without per-controller duplication. X-Content-Type-Options adds defense-in-depth against MIME sniffing; do not rely on legacy browser XSS filters as the primary control.
CSP reduces XSS impact by requiring scripts to come from trusted sources; injected HTML without a permitted script source will not execute. It also mitigates some DOM XSS when inline event handlers are blocked. Combining CSP with proper output encoding covers both prevention (encoding) and mitigation (CSP) layers. For apps using frameworks that inject inline code, consider nonces or hashes instead of broad 'unsafe-inline' allowances.
In production, deploy a report-only CSP first to measure breakage, then enforce. Keep the policy tight (default-src 'self') and add only necessary domains (e.g., CDNs). Pair with output encoding, input validation, and disabling server-side template features that allow raw HTML or scripts from untrusted sources.
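For applications that need some inline scripts, the nonce approach mentioned above can be sketched with JDK classes alone. This is an illustrative outline, not a drop-in component: CspNonce, newNonce, and policyFor are hypothetical names, and the 16-byte nonce length is an assumption.

```java
import java.security.SecureRandom;
import java.util.Base64;

final class CspNonce {
    private static final SecureRandom RANDOM = new SecureRandom();

    private CspNonce() {}

    /** Generates a fresh, unpredictable nonce; call once per response. */
    static String newNonce() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return Base64.getEncoder().encodeToString(bytes);
    }

    /** Builds a policy that allows same-origin scripts plus scripts carrying this nonce. */
    static String policyFor(String nonce) {
        return "default-src 'self'; "
                + "script-src 'self' 'nonce-" + nonce + "'; "
                + "object-src 'none'; "
                + "base-uri 'self'; "
                + "frame-ancestors 'none';";
    }
}
```

A filter would typically generate the nonce per request, set the Content-Security-Policy header to policyFor(nonce), and expose the same value to the view so that trusted inline blocks can be written as <script nonce="...">. A nonce must never be reused across responses.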
Input Validation (Defense in Depth)
import org.owasp.validator.html.*;
public class InputValidator {
public String validateUserInput(String input) {
// Length check
if (input.length() > 1000) {
throw new IllegalArgumentException("Input too long");
}
// Allowlist pattern for specific use cases
if (!input.matches("^[a-zA-Z0-9 .,!?'-]+$")) {
throw new IllegalArgumentException("Invalid characters");
}
return input;
}
public String sanitizeHTML(String input) throws ScanException, PolicyException {
// Use OWASP AntiSamy for rich HTML input
Policy policy = Policy.getInstance(
getClass().getResourceAsStream("/antisamy-policy.xml")
);
AntiSamy antiSamy = new AntiSamy();
CleanResults results = antiSamy.scan(input, policy);
return results.getCleanHTML();
}
}
Why this works: Input validation provides early rejection of obviously unsafe payloads and constrains input size, reducing attack surface before output encoding or sanitization. Length checks prevent oversized payloads used for DoS or multi-encoding tricks. Simple allowlist regexes (^[a-zA-Z0-9 .,!?'-]+$) confine inputs to expected characters for plain text fields, blocking <, >, double quotes, and most other metacharacters that enable XSS. Note that this particular pattern still permits the apostrophe, which can terminate single-quoted attributes or JavaScript strings, so it does not make a value safe for those contexts. While validation alone doesn’t neutralize XSS (encoding is still required), it narrows the character set up front and simplifies downstream processing.
For rich HTML inputs, AntiSamy applies a policy-driven sanitizer that parses HTML and strips or rewrites disallowed tags/attributes (e.g., removes <script>, unsafe event handlers, javascript: URLs). Policies are versioned and reviewable, making the allowed HTML surface explicit. Sanitization transforms untrusted HTML into a safe subset, after which it can be rendered without relying solely on encoding. This is critical when you intentionally allow markup (comments, CMS content) where pure encoding would display raw tags instead of formatted text.
Use validation + encoding as the default for text inputs; add sanitization only when you must support limited HTML. Keep policies strict and under source control, and pair with output encoding for non-HTML contexts (attributes, JS, URLs). Always set the response charset to UTF-8 to avoid encoding-based bypasses.
JSON/API Responses
Jackson (Automatic Escaping)
import com.fasterxml.jackson.databind.ObjectMapper;
@RestController
public class ApiController {
@GetMapping("/api/user")
public User getUser(@RequestParam String name) {
User user = new User();
user.setName(name); // Jackson serializes this as JSON string data
return user; // SAFE - JSON is properly escaped
}
}
Manual JSON Construction (Avoid)
// VULNERABLE - manual JSON construction
String unsafeJson = "{\"name\":\"" + userName + "\"}";
// SAFE - use JSON library
ObjectMapper mapper = new ObjectMapper();
Map<String, String> data = new HashMap<>();
data.put("name", userName);
String safeJson = mapper.writeValueAsString(data);
Remediation Steps
- Trace each finding from the source of untrusted data to the response sink: JSP output, servlet PrintWriter, template expression, JSON serialization, JavaScript block, URL, attribute, or CSS value.
- Identify the exact browser context where the value is rendered and choose the matching encoder or framework-native escaped binding.
- Replace raw JSP scriptlets, manual response concatenation, and unescaped template output with c:out, Thymeleaf th:text/attribute bindings, JSF escaped components, or OWASP Java Encoder calls.
- For intentionally allowed rich HTML, sanitize with a strict server-side policy and keep the opt-out location narrow and reviewed.
- Serve API data as application/json using Jackson or another JSON serializer, and make sure clients insert values with safe text/value APIs rather than innerHTML.
- Add a CSP and X-Content-Type-Options: nosniff as defense in depth after the output encoding fix is in place.
Testing
- Test normal values containing common punctuation, Unicode, quotes, and ampersands to confirm the page still renders correctly.
- Test HTML payloads such as <script>alert(1)</script> and <img src=x onerror=alert(1)> in reflected and stored fields.
- Test context-breaking payloads for attributes (" autofocus onfocus=alert(1)), JavaScript strings (';alert(1);//), and URLs (javascript:alert(1)).
- Verify rich-text fields preserve only the tags and attributes allowed by the sanitizer policy and remove event handlers or active URL schemes.
- Verify JSON endpoints return Content-Type: application/json and that client code renders returned strings as text, not with innerHTML.
- Retest with CSP report-only or browser developer tools to confirm injected inline scripts and event handlers are blocked as a secondary control.
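The HTML and attribute payload checks in the list above can be automated with a small regression helper. The sketch below is one possible shape, using only the JDK: XssPayloadCheck and findUnsafeOutputs are hypothetical names, and the encoder under test is passed in as a function so the helper stays independent of any particular library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class XssPayloadCheck {
    // Payloads taken from the testing checklist for HTML text and attribute contexts
    static final List<String> HTML_PAYLOADS = List.of(
            "<script>alert(1)</script>",
            "<img src=x onerror=alert(1)>",
            "\" autofocus onfocus=alert(1)");

    private XssPayloadCheck() {}

    /** Returns the payloads whose encoded form still contains a raw <, >, " or '. */
    static List<String> findUnsafeOutputs(UnaryOperator<String> encoder) {
        List<String> failures = new ArrayList<>();
        for (String payload : HTML_PAYLOADS) {
            String encoded = encoder.apply(payload);
            boolean hasRawMetachar = encoded.indexOf('<') >= 0
                    || encoded.indexOf('>') >= 0
                    || encoded.indexOf('"') >= 0
                    || encoded.indexOf('\'') >= 0;
            if (hasRawMetachar) {
                failures.add(payload);
            }
        }
        return failures;
    }
}
```

In a project that already depends on OWASP Java Encoder, a method reference such as Encode::forHtml should be usable as the encoder argument; an empty result then indicates that none of the listed payloads survived in literal form. This covers only HTML text and attribute output, so JavaScript and URL sinks need their own payloads and checks.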
Common Pitfalls
- Using HTML escaping for JavaScript, CSS, or URL contexts. Each sink needs an encoder for that specific context.
- Treating input validation as the XSS fix. Validation can reject unexpected input, but output encoding is still required at the sink.
- Disabling template escaping with escapeXml="false", th:utext, escape="false", or raw Velocity output for content that has not been sanitized.
- Assuming JSON serialization is HTML encoding. JSON responses are safer when served as application/json, but clients must still render values through safe DOM APIs.
- Adding CSP and leaving raw output unchanged. CSP reduces impact; it does not replace context-specific output encoding.
- Allowing javascript:, data:, or other active schemes in links after URL encoding. Validate schemes before rendering links.
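The scheme validation named in the last pitfall might look like the following sketch, which uses java.net.URI from the JDK. LinkSchemeValidator and isSafeHref are hypothetical names, and the policy itself (absolute http/https URLs with a host, or site-relative paths starting with a single slash) is an assumption to adapt to the application.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

final class LinkSchemeValidator {
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https");

    private LinkSchemeValidator() {}

    /** Returns true only for http(s) URLs with a host, or site-relative paths. */
    static boolean isSafeHref(String candidate) {
        if (candidate == null || candidate.isBlank()) {
            return false;
        }
        // Reject whitespace and control characters, which some browsers strip
        // before interpreting the scheme (e.g. "java\tscript:")
        for (int i = 0; i < candidate.length(); i++) {
            char c = candidate.charAt(i);
            if (c <= 0x20 || c == 0x7f) {
                return false;
            }
        }
        URI uri;
        try {
            uri = new URI(candidate);
        } catch (URISyntaxException e) {
            return false;
        }
        String scheme = uri.getScheme();
        if (scheme == null) {
            // Relative reference: allow "/path" but not protocol-relative "//host"
            return candidate.startsWith("/") && !candidate.startsWith("//");
        }
        return ALLOWED_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT))
                && uri.getHost() != null;
    }
}
```

Validation decides whether the link may be rendered at all; the accepted value still needs attribute encoding (for example Encode.forHtmlAttribute) when it is written into the href.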
Dependencies and Installation
Use maintained libraries for the context you need rather than writing custom escaping code:
- org.owasp.encoder:encoder for servlet/JSP output encoding in HTML, attribute, JavaScript, CSS, and URI contexts.
- Spring Web's org.springframework.web.util.HtmlUtils for simple HTML escaping in Spring applications.
- Jackson com.fasterxml.jackson.core:jackson-databind for JSON serialization instead of manual string construction.
- com.googlecode.owasp-java-html-sanitizer:owasp-java-html-sanitizer or OWASP AntiSamy for intentionally allowed rich HTML; keep sanitizer policy files reviewed and version controlled.
Keep these dependencies current through the project's normal dependency management and security update process. Encoding and sanitization libraries are part of the security boundary, so stale versions should be treated like other vulnerable application dependencies.