CWE-79: Cross-Site Scripting (XSS) - Python
Overview
XSS occurs when untrusted data is included in web output without proper encoding. Python web frameworks like Django and Flask provide built-in protection, but you must use them correctly.
Primary Defence: Use framework auto-escaping (Django templates with {{ }}, Flask/Jinja2 templates for .html files) for automatic HTML encoding, or html.escape() for manual output encoding. For rich HTML content, use bleach.clean() with allowlist-based sanitization.
Common Vulnerable Patterns
Django mark_safe() Misuse
# VULNERABLE - Marking user input as safe
from django.shortcuts import render
from django.utils.safestring import mark_safe

def profile_view(request):
    user_bio = request.GET.get('bio', '')
    safe_bio = mark_safe(user_bio)  # DANGEROUS!
    return render(request, 'profile.html', {'bio': safe_bio})
Why this is vulnerable: mark_safe() tells Django to skip HTML escaping, so malicious input like <script>alert(document.cookie)</script> renders as executable JavaScript instead of safe text, bypassing Django's automatic XSS protection.
Flask Without Auto-Escaping
# VULNERABLE - Disabling auto-escape
from flask import Flask, request
from markupsafe import Markup  # Flask 3.0 no longer re-exports Markup

app = Flask(__name__)

@app.route('/comment')
def show_comment():
    comment = request.args.get('text', '')
    return f'<div>{Markup(comment)}</div>'  # DANGEROUS!
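Why this is vulnerable: Markup() marks a string as already-safe HTML, so neither Jinja2 nor MarkupSafe will escape it. Here the response is also built with an f-string rather than a template, so no auto-escaping runs at all, and user input like <img src=x onerror=alert(1)> executes as JavaScript instead of being encoded as text.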
Manual HTML Construction
# VULNERABLE - String concatenation
from flask import Flask, request

app = Flask(__name__)

@app.route('/greeting')
def greet():
    name = request.args.get('name', 'Guest')
    html = '<h1>Hello, ' + name + '</h1>'
    return html  # No escaping!
Why this is vulnerable: Returning raw HTML strings bypasses Jinja2's auto-escaping entirely, so malicious input like <script>alert(1)</script> or <img src=x onerror=alert(1)> executes directly when rendered by the browser.
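A minimal sketch of the fix for the pattern above, using only the standard library's html.escape() (the manual encoding option named in the overview). The function name build_greeting is illustrative and stands in for the body of the Flask view.

```python
import html

def build_greeting(name: str) -> str:
    """Return a greeting fragment with the untrusted name HTML-encoded."""
    # html.escape() encodes &, < and >, plus both quote characters by default
    return '<h1>Hello, ' + html.escape(name) + '</h1>'

payload = '<script>alert(1)</script>'
print(build_greeting(payload))
# -> <h1>Hello, &lt;script&gt;alert(1)&lt;/script&gt;</h1>
```

The browser now displays the payload as text. Rendering through a template is still preferable, because a missed html.escape() call fails silently.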
JavaScript Context Without Escaping
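# VULNERABLE - Injecting into JavaScript
from django.shortcuts import render

def search_view(request):
    query = request.GET.get('q', '')
    return render(request, 'search.html', {'query': query})

# Template:
# <script>
#   var searchTerm = '{{ query }}';  // HTML escaping is the wrong encoding here
# </script>
Why this is vulnerable: HTML escaping is the wrong encoding for a JavaScript string literal. Inside a <script> block the browser does not decode HTML entities, so the value is corrupted (a quote shows up as &#x27;), and characters that matter to JavaScript, such as backslashes and line breaks, pass through untouched. A trailing backslash, for example, escapes the closing quote and lets the input spill into the surrounding script. In templates that do not escape quotes at all, input like '; alert(1); // closes the string and injects code directly.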
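The following sketch shows one way HTML escaping can fail inside a script block. It assumes a template that places two user-controlled values into adjacent JavaScript string literals. html.escape() stands in for the template engine's HTML escaping, and render_script is a hypothetical helper, not part of Django.

```python
import html

def render_script(first: str, second: str) -> str:
    """Mimic a template that HTML-escapes two values into JS string literals."""
    return "var a = '{}'; var b = '{}';".format(html.escape(first), html.escape(second))

# A backslash is not an HTML special character, so it survives escaping.
# At the end of the first value it escapes the closing quote of `a`, and the
# second value is then parsed as code instead of string content.
print(render_script("x\\", "; alert(1); //"))
# -> var a = 'x\'; var b = '; alert(1); //';
```

Neither value contains a character that HTML escaping touches, yet the second one ends up outside any string literal.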
Secure Patterns
Django Auto-Escaping (Default)
# SECURE - Django templates auto-escape by default
from django.shortcuts import render

def profile_view(request):
    user_bio = request.GET.get('bio', '')
    # Django automatically HTML-escapes user_bio in template
    return render(request, 'profile.html', {'bio': user_bio})

# Template (profile.html):
# <div class="bio">
#   {{ bio }}  <!-- Automatically escaped -->
# </div>
Why this works: Django's template engine HTML-encodes all variable output rendered with the {{ }} syntax by default. When you render {{ bio }} in a template, Django converts the HTML special characters (<, >, &, ", ') into their entity equivalents (&lt;, &gt;, &amp;, &quot;, &#x27;) before the response is sent to the browser. This happens during template rendering, after your view passes data to the template but before the HTTP response is generated. Auto-escaping is enabled by default (the autoescape option of the DjangoTemplates backend defaults to True) and applies to all templates unless explicitly disabled. This secure-by-default design makes XSS much harder to introduce accidentally: developers must explicitly use the |safe filter or mark_safe() function to bypass escaping, making dangerous usages easy to spot in code reviews. The auto-escaping is not context-aware: it applies HTML entity encoding only, which is appropriate for element content and quoted attribute values. You still need the escapejs filter for JavaScript contexts and urlencode for URL contexts.
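As a rough illustration of where HTML entity encoding stops helping, the sketch below uses the standard library's html.escape() as a stand-in for template auto-escaping (an assumption of this example; Django's own implementation is not shown). It places the same escaped value into a quoted and an unquoted attribute.

```python
import html

value = 'x onmouseover=alert(1)'
escaped = html.escape(value)

# The payload contains no HTML special characters, so escaping changes nothing
print(escaped == value)

# Quoted attribute: the payload stays inside the attribute value
print('<input value="{}">'.format(escaped))

# Unquoted attribute: the space ends the value and the rest becomes a new attribute
print('<input value={}>'.format(escaped))
```

Always quoting attribute values in templates keeps the escaped output confined to the value.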
Flask/Jinja2 Auto-Escaping
# SECURE - Flask enables auto-escaping for .html templates
from flask import Flask, render_template, request

app = Flask(__name__)

@app.route('/comment')
def show_comment():
    comment = request.args.get('text', '')
    return render_template('comment.html', comment=comment)

# Template (comment.html):
# <div class="comment">
#   {{ comment }}  <!-- Jinja2 auto-escapes -->
# </div>
Why this works: Flask uses Jinja2 as its template engine, which automatically enables HTML auto-escaping for files with .html, .htm, .xml, and .xhtml extensions. When you render {{ comment }} in a template, Jinja2 applies HTML entity encoding (converting <, >, &, ", ' to their entity equivalents) before rendering. This encoding happens during template compilation and rendering, after render_template() passes data to the template but before sending the HTTP response. The auto-escaping is controlled by the file extension - .html files have auto-escaping enabled by default, while .txt files don't. To explicitly disable escaping for trusted HTML content, use the |safe filter or wrap content with Markup(), but this should only be done with sanitized content. Flask's Jinja2 configuration can be customized via app.jinja_env.autoescape, but the secure default should rarely be changed. Like Django, this protects HTML contexts but requires additional encoding for JavaScript, CSS, or URL contexts.
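The extension rule described above can be pictured with a small stand-alone function. This is only an illustration of the rule as stated in this section, not Flask's or Jinja2's actual code. The names AUTOESCAPED_EXTENSIONS and should_autoescape are hypothetical.

```python
# Extensions for which this section says auto-escaping is enabled
AUTOESCAPED_EXTENSIONS = ('.html', '.htm', '.xml', '.xhtml')

def should_autoescape(template_name: str) -> bool:
    """Return True when the template name falls under the extension rule."""
    return template_name.endswith(AUTOESCAPED_EXTENSIONS)

for name in ('comment.html', 'feed.xml', 'email.txt'):
    print(name, should_autoescape(name))
```

The practical consequence is that user data rendered through a .txt template, or any other extension not covered by the rule, is emitted unescaped.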
Explicit Escaping with MarkupSafe
# SECURE - Explicit HTML escaping
from markupsafe import escape

def build_html(user_input):
    escaped = escape(user_input)
    return f'<div>{escaped}</div>'

# Example:
# user_input = '<script>alert("xss")</script>'
# result = '<div>&lt;script&gt;alert(&#34;xss&#34;)&lt;/script&gt;</div>'
Why this works: MarkupSafe is the library that powers Jinja2's auto-escaping, providing the escape() function for manual HTML encoding outside of templates. When you call escape(user_input), it converts HTML-significant characters (<, >, &, ", ') into their entity equivalents and returns a Markup object that Jinja2 recognizes as already-safe. This prevents double-escaping when the result is later used in a template. The escape() function is more robust than manual string replacement because it handles edge cases and character encodings correctly. MarkupSafe's Markup class tracks which strings have been escaped, preventing accidental double-escaping or re-escaping of safe content. Use escape() when building HTML in Python code (outside templates), especially when combining user input with HTML fragments. The function is efficient and well-tested, making it preferable to writing your own HTML encoding logic.
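The idea of tracking already-escaped strings can be sketched with the standard library alone. The class SafeHtml and the function escape_once below are hypothetical stand-ins for MarkupSafe's Markup and escape(). They show the concept, not the library's implementation.

```python
import html

class SafeHtml(str):
    """Marker type for strings that are already HTML-encoded."""

def escape_once(value) -> SafeHtml:
    """Encode plain strings; return already-safe strings unchanged."""
    if isinstance(value, SafeHtml):
        return value
    return SafeHtml(html.escape(str(value)))

first = escape_once('<b>Tom & Jerry</b>')
second = escape_once(first)  # recognized as safe, so no second round of encoding

print(first)
print(second == first)
print(html.escape(first))  # naive re-escaping would turn &lt; into &amp;lt;
```

Because the marker travels with the string, code further down the pipeline can tell encoded output from raw input without guessing.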
Context-Specific Encoding
HTML Context
# SECURE - HTML body content
from django.http import HttpResponse
from django.utils.html import escape

def display_message(request):
    msg = request.GET.get('msg', '')
    safe_msg = escape(msg)
    return HttpResponse(f'<p>{safe_msg}</p>')
Why this works: Django's escape() function (from django.utils.html) provides HTML entity encoding for use in Python code when building HTTP responses manually. It converts the standard HTML special characters into entities, preventing XSS when you're not using template rendering. This is the Python-code equivalent of template auto-escaping. Use escape() when constructing HttpResponse objects directly, building error messages, or any scenario where you're bypassing the template layer. The function handles Unicode correctly and returns a string that's safe to embed in HTML. However, template-based rendering with auto-escaping is still preferred over manual HTML construction because it's harder to accidentally forget to encode a value.
JavaScript Context
# SECURE - JavaScript string context
from django.shortcuts import render

def search_view(request):
    query = request.GET.get('q', '')
    # Pass the raw value and escape exactly once, in the template;
    # escaping in the view as well would double-escape the data
    return render(request, 'search.html', {'query': query})

# Template:
# <script>
#   var searchTerm = '{{ query|escapejs }}';
#   console.log(searchTerm);
# </script>
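An alternative to escaping into a quoted string is to serialize the value as JSON and neutralize the characters that are significant to the HTML parser. Django's json_script template filter takes a similar approach. The stand-alone sketch below uses only the standard library, and the helper name to_script_literal is hypothetical.

```python
import json

# <, > and & are replaced with \uXXXX escapes, which JavaScript decodes back to
# the original characters but which cannot close the <script> element.
_SCRIPT_ESCAPES = {ord('<'): '\\u003C', ord('>'): '\\u003E', ord('&'): '\\u0026'}

def to_script_literal(value) -> str:
    """Serialize a Python value as a JavaScript literal for use inside <script>."""
    return json.dumps(value).translate(_SCRIPT_ESCAPES)

query = "'; alert(1); //</script>"
print('<script>var searchTerm = {};</script>'.format(to_script_literal(query)))
```

json.dumps() supplies the surrounding double quotes and escapes quotes, backslashes and control characters, so the template must not add its own quotes around the literal.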
URL Context
# SECURE - URL encoding
from urllib.parse import quote

def build_search_url(query):
    # safe='' also encodes "/", which quote() leaves untouched by default
    encoded_query = quote(query, safe='')
    return f'/search?q={encoded_query}'

# Django template filter:
# <a href="/search?q={{ query|urlencode }}">Search</a>
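A short standard-library sketch of the difference between encoding a single value with quote() and building a whole query string with urlencode(). The sample query text is made up for illustration.

```python
from urllib.parse import quote, urlencode

query = 'cats & dogs/50% off'

# quote() keeps "/" by default; pass safe='' when encoding one parameter value
print(quote(query))
print(quote(query, safe=''))

# urlencode() builds the full query string and encodes every key and value
print('/search?' + urlencode({'q': query, 'page': 2}))
```

Encoding the "&" matters most here: left raw, it would end the q parameter and start a new, attacker-chosen one.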
JSON Responses
# SECURE - JSON is automatically escaped
from django.http import JsonResponse
from flask import jsonify

# Django:
def api_user(request, user_id):
    user = User.objects.get(id=user_id)
    return JsonResponse({
        'name': user.name,  # Automatically JSON-escaped
        'bio': user.bio
    })

# Flask:
@app.route('/api/user/<int:user_id>')
def api_user(user_id):
    user = get_user(user_id)
    return jsonify({
        'name': user.name,
        'bio': user.bio
    })
Why this works: Both JsonResponse() in Django and jsonify() in Flask automatically serialize Python objects to JSON with proper escaping and set the Content-Type: application/json header. JSON encoding handles special characters according to JSON specification, escaping quotes, backslashes, and control characters. The critical security feature is the content type header - it tells browsers to treat the response as JSON data, not HTML, preventing the browser from parsing any HTML or JavaScript in the response. This makes JSON responses inherently safe from XSS because browsers won't execute scripts in JSON content. However, if you later insert this JSON into HTML (like with innerHTML), you'll need HTML encoding at that point. JSON encoding is perfect for API responses but doesn't replace HTML encoding for web pages.
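To make the distinction between JSON encoding and HTML encoding concrete, this sketch serializes a hostile value with the standard library's json module (used here as a stand-in for what JsonResponse and jsonify produce) and inspects the result.

```python
import json

payload = {'bio': '</script><script>alert("xss")</script>'}
body = json.dumps(payload)

# Quotes inside the value are escaped, so the JSON structure stays intact
print(body)

# Angle brackets are not escaped: the text is valid JSON but not safe as HTML
print('<script>' in body)
```

The response is harmless when served as application/json, but the same string pasted into an HTML page or assigned to innerHTML would be parsed as markup.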
Framework-Specific Guidance
Django
# SECURE - Default behavior is safe
from django.shortcuts import render
from django.utils.html import format_html

def comment_view(request):
    author = request.GET.get('author', '')
    text = request.GET.get('text', '')
    # Template auto-escapes these
    return render(request, 'comment.html', {
        'author': author,
        'text': text
    })

# Template (comment.html):
# <div class="comment">
#   <strong>{{ author }}</strong>: {{ text }}
# </div>

# For building HTML in Python code:
def build_message(username, msg):
    return format_html(
        '<div class="msg"><b>{}</b>: {}</div>',
        username,
        msg
    )  # format_html auto-escapes arguments
Django Settings:
# settings.py - Ensure templates auto-escape
TEMPLATES = [{
    'BACKEND': 'django.template.backends.django.DjangoTemplates',
    'OPTIONS': {
        'autoescape': True,  # Default, don't disable!
    },
}]
Flask / Jinja2
# SECURE - Jinja2 auto-escapes .html templates
from flask import Flask, render_template, request
from markupsafe import escape

app = Flask(__name__)

@app.route('/profile/<username>')
def profile(username):
    bio = request.args.get('bio', '')
    # Auto-escaped in template
    return render_template('profile.html',
                           username=username,
                           bio=bio)

# profile.html:
# <h1>{{ username }}'s Profile</h1>
# <p>{{ bio }}</p>

# For manual HTML building:
@app.route('/message')
def message():
    text = request.args.get('text', '')
    escaped_text = escape(text)
    return f'<div>{escaped_text}</div>'
FastAPI
# SECURE - Jinja2 templates with FastAPI
from fastapi import FastAPI, Request
from fastapi.templating import Jinja2Templates

app = FastAPI()
templates = Jinja2Templates(directory="templates")

@app.get("/profile/{user_id}")
async def profile(request: Request, user_id: int, bio: str = ""):
    return templates.TemplateResponse("profile.html", {
        "request": request,
        "user_id": user_id,
        "bio": bio  # Auto-escaped
    })

# JSON responses are automatically safe:
@app.get("/api/user/{user_id}")
async def get_user(user_id: int):
    return {"name": "John", "bio": "<script>alert('xss')</script>"}
    # FastAPI serializes this to JSON and sends it as application/json,
    # so the browser does not parse the value as HTML
Rich HTML Sanitization
When you need to allow safe HTML (e.g., WYSIWYG editor):
# Use bleach library for HTML sanitization
import bleach

ALLOWED_TAGS = ['p', 'br', 'strong', 'em', 'ul', 'ol', 'li', 'a']
ALLOWED_ATTRIBUTES = {'a': ['href', 'title']}

def sanitize_html(dirty_html):
    clean = bleach.clean(
        dirty_html,
        tags=ALLOWED_TAGS,
        attributes=ALLOWED_ATTRIBUTES,
        strip=True
    )
    return clean

# Django view:
from django.shortcuts import redirect

def save_article(request):
    content = request.POST.get('content', '')
    sanitized = sanitize_html(content)
    article = Article.objects.create(
        title=request.POST.get('title'),
        content=sanitized
    )
    return redirect('article_detail', pk=article.pk)

# Template (only use |safe AFTER sanitization):
# <div class="article-content">
#   {{ article.content|safe }}
# </div>
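To illustrate what allowlist-based sanitization does, here is a deliberately small sketch built on the standard library's html.parser. It is not a replacement for bleach or any maintained sanitizer, and it does not reflect how bleach is implemented. AllowlistSanitizer and sanitize are hypothetical names, and as a simplifying assumption only absolute http(s) links are kept.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {'p', 'br', 'strong', 'em', 'ul', 'ol', 'li', 'a'}
ALLOWED_ATTRIBUTES = {'a': {'href', 'title'}}
VOID_TAGS = {'br'}  # elements that have no closing tag

class AllowlistSanitizer(HTMLParser):
    """Rebuild markup from parser events, keeping only allowlisted tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag; its text content is still emitted, encoded
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRIBUTES.get(tag, set()):
                continue
            if name == 'href' and not value.strip().lower().startswith(('http://', 'https://')):
                continue  # rejects javascript:, data: and every other scheme
            kept.append(' {}="{}"'.format(name, html.escape(value)))
        self.parts.append('<{}{}>'.format(tag, ''.join(kept)))

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.parts.append('</{}>'.format(tag))

    def handle_data(self, data):
        self.parts.append(html.escape(data, quote=False))

def sanitize(dirty_html: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(dirty_html)
    parser.close()
    return ''.join(parser.parts)

print(sanitize('<p onclick="x()">Hi <a href="javascript:alert(1)" title="t">link</a></p>'))
```

The key design choice is that output is rebuilt from scratch: anything not explicitly allowed never reaches the result, so unknown tags and attributes fail closed.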
Installation: pip install bleach
Input Validation (Defense in Depth)
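# Django forms with validation
from django import forms
from django.core.exceptions import ValidationError
from django.core.validators import RegexValidator
from django.shortcuts import redirect, render

def reject_script_tags(value):
    # Validators signal failure by raising ValidationError;
    # a callable that merely returns False is ignored
    if '<script>' in value.lower():
        raise ValidationError('Script tags are not allowed')

class CommentForm(forms.Form):
    author = forms.CharField(
        max_length=100,
        required=True,
        validators=[
            RegexValidator(
                regex=r'^[a-zA-Z0-9\s]+$',
                message='Only alphanumeric characters allowed'
            )
        ]
    )
    text = forms.CharField(
        max_length=1000,
        widget=forms.Textarea,
        validators=[reject_script_tags]
    )

# View:
def post_comment(request):
    form = CommentForm(request.POST)
    if form.is_valid():
        # Even with validation, template still auto-escapes
        comment = form.cleaned_data['text']
        Comment.objects.create(text=comment)
        return redirect('comments')
    # Invalid input: re-display the form with its errors (template name is illustrative)
    return render(request, 'comment_form.html', {'form': form})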
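Outside a forms framework, the same allowlist idea can be sketched with the standard library's re module. The pattern mirrors the author field above. The function name validate_author and the use of ValueError are assumptions of this example.

```python
import re

# Same character allowlist as the author field above
AUTHOR_PATTERN = re.compile(r'[a-zA-Z0-9\s]+')

def validate_author(value: str) -> str:
    """Return the value if it passes the allowlist check, else raise ValueError."""
    # fullmatch() requires the entire string to consist of allowed characters
    if len(value) > 100 or not AUTHOR_PATTERN.fullmatch(value):
        raise ValueError('Only alphanumeric characters allowed')
    return value

print(validate_author('Alice 42'))
try:
    validate_author('<script>alert(1)</script>')
except ValueError as exc:
    print('rejected:', exc)
```

Validation narrows what can reach the application, but it remains a second layer: fields that must accept free text still depend on output encoding.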
Content Security Policy
# Django middleware for CSP
from django.utils.deprecation import MiddlewareMixin

class SecurityHeadersMiddleware(MiddlewareMixin):
    def process_response(self, request, response):
        response['Content-Security-Policy'] = (
            "default-src 'self'; "
            "script-src 'self' https://trusted-cdn.com; "
            "style-src 'self' 'unsafe-inline'; "
            "img-src 'self' data: https:; "
            "frame-ancestors 'none';"
        )
        response['X-Content-Type-Options'] = 'nosniff'
        response['X-Frame-Options'] = 'DENY'
        # The legacy browser XSS filter is deprecated and could itself be abused;
        # '0' turns it off, leaving CSP as the real control
        response['X-XSS-Protection'] = '0'
        return response

# settings.py
MIDDLEWARE = [
    'myapp.middleware.SecurityHeadersMiddleware',
    # ... other middleware
]
# Flask:
from flask import Flask

app = Flask(__name__)

@app.after_request
def set_security_headers(response):
    response.headers['Content-Security-Policy'] = (
        "default-src 'self'; "
        "script-src 'self'"
    )
    response.headers['X-Content-Type-Options'] = 'nosniff'
    return response
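Hand-written policy strings are easy to get wrong (a missing semicolon silently merges two directives). One possible approach, sketched here with a hypothetical helper named build_csp, is to assemble the header value from a mapping.

```python
def build_csp(directives: dict) -> str:
    """Join a mapping of directive name -> list of sources into a CSP header value."""
    return '; '.join(
        '{} {}'.format(name, ' '.join(sources))
        for name, sources in directives.items()
    )

policy = build_csp({
    'default-src': ["'self'"],
    'script-src': ["'self'", 'https://trusted-cdn.com'],
    'frame-ancestors': ["'none'"],
})
print(policy)
# -> default-src 'self'; script-src 'self' https://trusted-cdn.com; frame-ancestors 'none'
```

The resulting string can be assigned to the Content-Security-Policy header in either framework's response hook shown above.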
Verification
To verify XSS protection is working:
- Test with XSS payloads: Submit common XSS patterns (<script>alert('xss')</script>, <img src=x onerror=alert('xss')>, etc.) and verify they appear encoded in the response
- Check HTML source: View the page source to confirm user input is HTML-encoded (< appears as &lt;, > as &gt;)
- Test JavaScript context: If data appears in <script> tags, verify quotes and special characters are properly JavaScript-encoded using the escapejs filter
- Review templates: Search for the |safe filter, mark_safe(), Markup(), or {% autoescape off %} and verify they're only used with sanitized content
- Check framework settings: Confirm Django's autoescape is True in template settings, or that Flask templates use the .html extension
- Test URL context: Verify user data in URLs is URL-encoded using the urlencode filter or quote()
- Check Content-Security-Policy: Verify CSP headers are present and properly configured to restrict script sources
- Use browser DevTools: Inspect rendered HTML to ensure no unencoded user input appears
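The payload test in the first checklist item can be automated. The sketch below assumes a render function that returns the HTML produced for a given input. Here render_comment is a stand-in built on html.escape(), whereas a real check would fetch the page from the running application. All names are illustrative.

```python
import html

XSS_PAYLOADS = [
    "<script>alert('xss')</script>",
    "<img src=x onerror=alert('xss')>",
    '"><svg onload=alert(1)>',
]

def render_comment(text: str) -> str:
    """Stand-in for the page under test."""
    return '<div class="comment">{}</div>'.format(html.escape(text))

def find_unencoded(render, payloads):
    """Return the payloads that appear verbatim (unencoded) in the rendered output."""
    return [p for p in payloads if p in render(p)]

print(find_unencoded(render_comment, XSS_PAYLOADS))
# -> []
```

An empty list means every payload was encoded. Any payload returned by find_unencoded points at an output path that needs review.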