CWE-502: Insecure Deserialization - Python

Overview

Python's pickle module can execute arbitrary code during deserialization, making it extremely dangerous when used with untrusted data. Attackers can craft malicious pickle payloads that execute commands when unpickled.

Primary Defence: Use JSON (json.loads()), MessagePack, or Protocol Buffers for data serialization instead of pickle. Never unpickle untrusted data.

Common Vulnerable Patterns

pickle.loads() with Untrusted Data

# VULNERABLE - Never unpickle untrusted data!
import pickle

def load_user(data):
    # DANGEROUS: Can execute arbitrary code
    user = pickle.loads(data)
    return user

# Attacker can craft malicious payload:
# import os; os.system('rm -rf /')

Why this is vulnerable:

  • Executes attacker-controlled opcodes during unpickling.
  • Invokes __reduce__/__setstate__ hooks (see the sketch after this list).
  • Can import modules and run system commands.
  • Enables RCE before validation occurs.
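
A minimal proof-of-concept of the __reduce__ hook referenced above (the Exploit class name and the harmless id command are illustrative, not a real-world payload):

import os
import pickle

class Exploit:
    def __reduce__(self):
        # Instructs pickle to call os.system('id') when the payload is loaded
        return (os.system, ('id',))

payload = pickle.dumps(Exploit())
pickle.loads(payload)  # Runs `id` - any other command works the same way

The dangerous call happens inside pickle.loads() itself, before the caller ever sees the returned object.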

pickle.load() from Untrusted File

# VULNERABLE - File could contain malicious pickle

import pickle

def load_from_file(filename):
    with open(filename, 'rb') as f:
        data = pickle.load(f)  # RCE if file is malicious!
    return data

Why this is vulnerable:

  • File contents can be attacker-controlled.
  • pickle.load() executes during read.
  • Payloads run as the app user.
  • Easy to trigger via uploads or traversal.

Django Session Deserialization

# settings.py
# VULNERABLE - Using pickle for sessions
SESSION_SERIALIZER = 'django.contrib.sessions.serializers.PickleSerializer'
# Attacker can manipulate session cookie to execute code!

Why this is vulnerable:

  • If an attacker can forge or tamper with session data, pickle deserialization runs.
  • Pickle deserialization runs on every request.
  • Magic hooks can execute code server-side.
  • Enables persistent RCE via session tampering.

PyYAML unsafe_load()

# VULNERABLE - yaml.unsafe_load can execute Python code

import yaml

def load_config(yaml_data):
    config = yaml.unsafe_load(yaml_data)  # DANGEROUS!
    return config

# Can execute: !!python/object/apply:os.system ["rm -rf /"]

Why this is vulnerable:

  • Supports arbitrary object construction tags.
  • Can call functions via !!python/object/apply.
  • Executes code during parsing.
  • yaml.unsafe_load() is explicitly unsafe for untrusted YAML.

jsonpickle with Untrusted Data

# VULNERABLE - jsonpickle can instantiate arbitrary classes

import jsonpickle

def deserialize(data):
    obj = jsonpickle.decode(data)  # Can deserialize any class!
    return obj

Why this is vulnerable:

  • Embeds class metadata in JSON payloads (illustrated after this list).
  • Instantiates attacker-chosen classes.
  • __init__ and property hooks can execute code.
  • Bypasses JSON's data-only safety.
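
To make the class-metadata bullet concrete, here is roughly what an encoded jsonpickle document looks like (the Pet class is illustrative):

import jsonpickle

class Pet:
    def __init__(self, name):
        self.name = name

encoded = jsonpickle.encode(Pet('Rex'))
print(encoded)
# {"py/object": "__main__.Pet", "name": "Rex"}
# An attacker who controls this JSON can point "py/object" at a different,
# more dangerous class that is importable on the server.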

pandas.read_pickle() with Untrusted Data

# VULNERABLE - pandas uses pickle internally

import pandas as pd

def load_dataframe(filename):
    # DANGEROUS: Executes arbitrary code if file is malicious!
    df = pd.read_pickle(filename)
    return df

# Or from untrusted bytes:
def load_from_bytes(data):
    import io
    # DANGEROUS: Can execute code during deserialization
    df = pd.read_pickle(io.BytesIO(data))
    return df

Why this is vulnerable:

  • read_pickle() uses pickle.load() internally.
  • Malicious pickle payload executes on deserialization.
  • File uploads or user-provided paths can trigger RCE.
  • No validation occurs before code execution.
  • Works with files, BytesIO, or any file-like object.

pandas.DataFrame.to_pickle() Data Tampering

# VULNERABLE - Storing pickled data accessible to users

import pandas as pd

def save_user_data(user_id, dataframe):
    # Saves pickled DataFrame to predictable path
    filename = f'/tmp/user_data_{user_id}.pkl'
    dataframe.to_pickle(filename)

def load_user_data(user_id):
    filename = f'/tmp/user_data_{user_id}.pkl'
    # DANGEROUS: Attacker can replace file with malicious pickle
    return pd.read_pickle(filename)

Why this is vulnerable:

  • Pickle files can be modified by attackers.
  • No integrity checking or signatures (a mitigation sketch follows this list).
  • Predictable paths enable tampering.
  • Loading modified pickle executes attacker code.
  • Trust boundary violated when files leave app control.
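
If pickled files must remain for an internal cache, verifying an HMAC over the file before loading at least detects tampering. A minimal sketch, assuming the key comes from a secret store and the signature is written to a sidecar file (both illustrative choices):

import hashlib
import hmac
import io

import pandas as pd

SECRET_KEY = b'load-this-from-a-secret-store'  # placeholder, not a real key

def save_user_data(user_id, dataframe, directory='/var/app/data'):
    path = f'{directory}/user_data_{user_id}.pkl'
    dataframe.to_pickle(path)
    with open(path, 'rb') as f:
        signature = hmac.new(SECRET_KEY, f.read(), hashlib.sha256).hexdigest()
    with open(path + '.sig', 'w') as f:
        f.write(signature)

def load_user_data(user_id, directory='/var/app/data'):
    path = f'{directory}/user_data_{user_id}.pkl'
    with open(path, 'rb') as f:
        raw = f.read()
    with open(path + '.sig') as f:
        expected = f.read().strip()
    actual = hmac.new(SECRET_KEY, raw, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(actual, expected):
        raise ValueError('Pickle file failed integrity check - refusing to load')
    return pd.read_pickle(io.BytesIO(raw))

Moving to a data-only format (next section) is still the stronger fix: the HMAC only detects tampering, it does not make pickle itself safe.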

Secure Patterns

Use JSON Instead of Pickle

# SECURE - JSON cannot execute code

import json

def save_user(user):
    user_dict = {
        'name': user.name,
        'email': user.email,
        'age': user.age
    }
    return json.dumps(user_dict)

def load_user(json_data):
    # JSON only creates basic Python types (dict, list, str, int, etc.)
    user_dict = json.loads(json_data)

    # Manually reconstruct object
    user = User(
        name=user_dict['name'],
        email=user_dict['email'],
        age=user_dict['age']
    )
    return user

# Example

user = User(name="John", email="john@example.com", age=30)
json_str = save_user(user)
restored_user = load_user(json_str)

Why this works:

  • Builds only primitive types and collections.
  • No object instantiation or code execution.
  • Ignores class/module metadata entirely.
  • Forces explicit reconstruction with validation.
  • Prevents gadget chains by design.

PyYAML with safe_load()

# SECURE - safe_load only creates safe Python types

import yaml

def load_config(yaml_data):
    # safe_load prevents arbitrary code execution
    config = yaml.safe_load(yaml_data)
    return config

# Example YAML

yaml_data = """
database:
  host: localhost
  port: 5432
  name: mydb
"""

config = yaml.safe_load(yaml_data)
print(config['database']['host'])  # localhost

Why this works:

  • Restricts tags to safe scalar/collection types.
  • Blocks object construction tags by default.
  • No constructor or function invocation.
  • Rejects unsafe YAML with clear errors (demonstrated after this list).
  • Safe for configs and data exchange.
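
A quick demonstration of that rejection behaviour (the payload string is illustrative):

import yaml

malicious = "!!python/object/apply:os.system ['echo pwned']"

try:
    yaml.safe_load(malicious)
except yaml.constructor.ConstructorError as e:
    print(f"Rejected unsafe YAML: {e}")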

msgpack for Binary Serialization

# SECURE - MessagePack is safe binary format

import msgpack

def serialize(data):
    # Only serializes basic types
    return msgpack.packb(data)

def deserialize(packed_data):
    # Cannot execute code
    return msgpack.unpackb(packed_data, raw=False)

# Example
user_data = {'name': 'John', 'email': 'john@example.com'}
packed = serialize(user_data)
restored = deserialize(packed)

Why this works:

  • Encodes only primitives and collections.
  • unpackb() returns built-in types only.
  • No object metadata or callables.
  • Language-agnostic, data-only format.
  • Safer binary alternative to pickle.

Installation:

pip install msgpack

pandas with Safe Formats (CSV, Parquet, Feather)

# SECURE - Use CSV, Parquet, or Feather instead of pickle

import pandas as pd

# Option 1: CSV (human-readable, widely compatible)
def save_dataframe_csv(df, filename):
    df.to_csv(filename, index=False)

def load_dataframe_csv(filename):
    # CSV is safe - only contains data, no code
    return pd.read_csv(filename)

# Option 2: Parquet (efficient, compressed, type-safe)
def save_dataframe_parquet(df, filename):
    df.to_parquet(filename, engine='pyarrow', compression='snappy')

def load_dataframe_parquet(filename):
    # Parquet is safe - binary format, no code execution
    return pd.read_parquet(filename, engine='pyarrow')

# Option 3: Feather (fast, preserves types)
def save_dataframe_feather(df, filename):
    df.to_feather(filename)

def load_dataframe_feather(filename):
    # Feather is safe - columnar format, data only
    return pd.read_feather(filename)

# Option 4: HDF5 (for large datasets, trusted sources only)
def save_dataframe_hdf(df, filename):
    df.to_hdf(filename, key='data', mode='w')

def load_dataframe_hdf(filename):
    # HDF5 should only be used with trusted files
    return pd.read_hdf(filename, key='data')

Why this works:

  • CSV/Parquet/Feather are data-only formats.
  • No object deserialization or code execution.
  • Preserves DataFrame structure and types.
  • Better performance than pickle in many cases.
  • Cross-language compatibility (especially Parquet).
  • Industry-standard formats for data science.

Format comparison:

Format  | Safety          | Speed     | Size   | Type Preservation | Use Case
--------|-----------------|-----------|--------|-------------------|------------------------------
CSV     | ✅ Safe         | Slow      | Large  | Limited           | Human-readable, universal
Parquet | ✅ Safe         | Fast      | Small  | Excellent         | Production, big data
Feather | ✅ Safe         | Very fast | Medium | Excellent         | Inter-process, caching
HDF5    | ⚠️ Trusted-only | Fast      | Small  | Good              | Scientific, time-series
Pickle  | ❌ Unsafe       | Medium    | Medium | Perfect           | Never use with untrusted data

Installation for optional formats:

# For Parquet

pip install pyarrow
# or
pip install fastparquet

# For HDF5 (trusted sources only)

pip install tables

pandas with JSON for Simple DataFrames

# SECURE - JSON for DataFrames with basic types

import pandas as pd
import json

def save_dataframe_json(df, filename):
    # Convert to JSON with proper orientation
    df.to_json(filename, orient='records', lines=True)

def load_dataframe_json(filename):
    # JSON is safe - no code execution
    return pd.read_json(filename, orient='records', lines=True)

# Alternative: For API responses or small data

def dataframe_to_dict(df):
    # Convert to list of dicts
    return df.to_dict(orient='records')

def dict_to_dataframe(data):
    # Safe reconstruction from dicts
    return pd.DataFrame(data)

# Usage
df = pd.DataFrame({
    'name': ['Alice', 'Bob', 'Charlie'],
    'age': [25, 30, 35],
    'city': ['NYC', 'LA', 'Chicago']
})

# Save and load

save_dataframe_json(df, 'data.json')
restored_df = load_dataframe_json('data.json')

# API serialization

data_dict = dataframe_to_dict(df)
json_str = json.dumps(data_dict)
# ... send over network ...
parsed_data = json.loads(json_str)
new_df = dict_to_dataframe(parsed_data)

Why this works:

  • JSON only contains primitive data types.
  • read_json() doesn't execute code.
  • Works well for DataFrames with simple types.
  • Human-readable and debuggable.
  • Perfect for APIs and web applications.
  • No code execution vectors.

Note: JSON is less efficient than Parquet/Feather for large datasets and doesn't preserve all pandas dtypes perfectly (e.g., timezone-aware datetimes need special handling).
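
A sketch of that datetime handling (file name and column names are illustrative):

import pandas as pd

# Build a frame with a timezone-aware timestamp column
df = pd.DataFrame({
    'event': ['login', 'logout'],
    'ts': pd.to_datetime(['2024-01-01T09:00:00+00:00', '2024-01-01T17:30:00+00:00'], utc=True),
})

# Write ISO-8601 strings so timestamp precision isn't silently truncated
df.to_json('events.json', orient='records', lines=True, date_format='iso')

# On load, re-parse the timestamp column explicitly rather than relying on inference
restored = pd.read_json('events.json', orient='records', lines=True)
restored['ts'] = pd.to_datetime(restored['ts'], utc=True)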

Python Library Safety Matrix

When reviewing code, use this matrix to identify unsafe deserialization libraries:

Safe Alternatives

json (standard library)

  • Use instead of pickle for all serialization needs
  • Cannot execute code or instantiate arbitrary classes
  • Only creates basic Python types (dict, list, str, int, float, bool, None)
import json
data = json.loads(untrusted_input)  # Safe

PyYAML with safe_load()

  • NOT yaml.load() - that's unsafe!
  • Only creates safe Python objects
import yaml
data = yaml.safe_load(input)  # SAFE

# yaml.load(input) is UNSAFE - allows arbitrary objects

msgpack

  • Safe binary serialization format
  • Fast and compact
  • Cannot instantiate classes
import msgpack
data = msgpack.unpackb(packed_data, raw=False)  # Safe

pandas safe formats

  • Use CSV, Parquet, or Feather instead of pickle
  • HDF5 should only be used with trusted sources
import pandas as pd

# Safe alternatives

df = pd.read_csv('data.csv')  # Safe
df = pd.read_parquet('data.parquet')  # Safe
df = pd.read_feather('data.feather')  # Safe
df = pd.read_hdf('data.h5', key='data')  # Safe
df = pd.read_json('data.json')  # Safe

NEVER Use with Untrusted Data

pickle / cPickle / _pickle

  • Always allows arbitrary code execution
  • Can invoke __reduce__ method to execute commands
  • Replace with json or msgpack
import pickle
pickle.loads(untrusted_data)  # NEVER DO THIS - RCE vulnerability!

marshal

  • Similar to pickle, allows code execution
  • Designed for internal Python use only
  • Never use with external data
import marshal
marshal.loads(data)  # Unsafe with untrusted data

shelve

  • Uses pickle internally
  • Inherits all pickle vulnerabilities
  • Replace with JSON-based storage
import shelve
db = shelve.open('data.db')  # Uses pickle - unsafe!

PyYAML yaml.load()

  • Allows arbitrary object instantiation
  • Deprecated in favor of safe_load()
  • Always use yaml.safe_load() instead
import yaml
yaml.load(data)  # UNSAFE - deprecated
yaml.unsafe_load(data)  # UNSAFE - explicitly dangerous
yaml.safe_load(data)  # SAFE

jsonpickle.decode()

  • Deserializes to Python objects
  • Can instantiate arbitrary classes
  • Use json.loads() instead
import jsonpickle
jsonpickle.decode(data)  # Can instantiate any class - unsafe!

pandas.read_pickle()

  • Uses pickle internally, inherits all pickle vulnerabilities
  • Can execute arbitrary code during DataFrame loading
  • Use pd.read_csv(), pd.read_parquet(), or pd.read_feather() instead
import pandas as pd
df = pd.read_pickle('data.pkl')  # NEVER with untrusted data - RCE vulnerability!
df = pd.read_pickle(BytesIO(untrusted_bytes))  # DANGEROUS!

Migration Recommendations

If you find these patterns in security scan results:

  1. pickle.loads() → Switch to json.loads()
  2. pickle.load() → Switch to json.load()
  3. marshal.loads() → Switch to json.loads()
  4. shelve.open() → Use JSON files or a database (see the sketch after the migration examples)
  5. yaml.load() → Switch to yaml.safe_load()
  6. jsonpickle.decode() → Switch to json.loads()
  7. pd.read_pickle() → Switch to pd.read_parquet(), pd.read_csv(), or pd.read_feather()

Example migration:

# BEFORE (Unsafe)
import pickle
user = pickle.loads(request.data)

# AFTER (Safe)
import json
user_data = json.loads(request.data)
user = User(**user_data)  # Manually construct object

# BEFORE (Unsafe)
import pandas as pd
df = pd.read_pickle('user_data.pkl')

# AFTER (Safe) - Option 1: Parquet (recommended for performance)
import pandas as pd
df = pd.read_parquet('user_data.parquet')

# AFTER (Safe) - Option 2: CSV (for human-readable data)
import pandas as pd
df = pd.read_csv('user_data.csv')

# AFTER (Safe) - Option 3: Feather (for fast I/O)
import pandas as pd
df = pd.read_feather('user_data.feather')
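
For item 4 above (shelve), a small JSON-backed store covers simple key-value persistence; a minimal sketch (the JsonStore class and file names are illustrative):

import json
import os

class JsonStore:
    """Tiny key-value store backed by a JSON file instead of shelve/pickle."""

    def __init__(self, path):
        self.path = path
        self._data = {}
        if os.path.exists(path):
            with open(path, 'r') as f:
                self._data = json.load(f)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value
        with open(self.path, 'w') as f:
            json.dump(self._data, f)

# BEFORE (Unsafe)
# import shelve
# db = shelve.open('prefs.db')

# AFTER (Safe)
db = JsonStore('prefs.json')
db['theme'] = 'dark'
print(db['theme'])  # dark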

Django with JSON Serializer

# SECURE - Use JSON for Django sessions

# settings.py

SESSION_SERIALIZER = 'django.contrib.sessions.serializers.JSONSerializer'

# Or for custom serialization:

from django.core.serializers import serialize, deserialize

# Serialize Django models

json_data = serialize('json', User.objects.all())

# Deserialize

users = list(deserialize('json', json_data))

Restricted Unpickler (If Pickle Required)

# SECURE - Only allowlisted classes can be unpickled

import pickle
import io

class RestrictedUnpickler(pickle.Unpickler):
    """Only allow allowlisted classes to be unpickled"""

    ALLOWED_CLASSES = {
        ('__main__', 'User'),
        ('__main__', 'Address'),
        ('builtins', 'dict'),
        ('builtins', 'list'),
        ('builtins', 'str'),
        ('builtins', 'int'),
    }

    def find_class(self, module, name):
        if (module, name) not in self.ALLOWED_CLASSES:
            raise pickle.UnpicklingError(
                f"Class {module}.{name} is not allowed"
            )
        return super().find_class(module, name)

def safe_unpickle(data):
    return RestrictedUnpickler(io.BytesIO(data)).load()

# Usage

try:
    obj = safe_unpickle(untrusted_data)
except pickle.UnpicklingError as e:
    print(f"Unsafe pickle rejected: {e}")

Framework-Specific Guidance

Django

# SECURE - Django REST Framework with JSON

from rest_framework import serializers, viewsets
from rest_framework.response import Response

class UserSerializer(serializers.ModelSerializer):
    class Meta:
        model = User
        fields = ['id', 'name', 'email', 'age']

class UserViewSet(viewsets.ModelViewSet):
    queryset = User.objects.all()
    serializer_class = UserSerializer

    def create(self, request):
        # DRF automatically deserializes JSON to User model
        serializer = self.get_serializer(data=request.data)
        serializer.is_valid(raise_exception=True)
        user = serializer.save()
        return Response(serializer.data)

# settings.py - Use JSON session serializer
SESSION_SERIALIZER = 'django.contrib.sessions.serializers.JSONSerializer'

# Never use:
# SESSION_SERIALIZER = 'django.contrib.sessions.serializers.PickleSerializer'  # INSECURE

Why this works:

  • DRF parses JSON into primitive types, not arbitrary Python objects.
  • Serializers validate and coerce fields before saving models.
  • JSON session serializer avoids pickle gadgets on request load.

Flask

# SECURE - Flask with JSON

from flask import Flask, request, jsonify
from dataclasses import dataclass, asdict
import json

app = Flask(__name__)

@dataclass
class User:
    name: str
    email: str
    age: int

@app.route('/users', methods=['POST'])
def create_user():
    # Flask automatically parses JSON
    data = request.get_json()

    # Manually construct object from dict
    user = User(
        name=data['name'],
        email=data['email'],
        age=data['age']
    )

    # Save user...
    return jsonify(asdict(user)), 201

@app.route('/users/<int:user_id>')
def get_user(user_id):
    user = get_user_from_db(user_id)
    # JSON serialization is safe
    return jsonify(asdict(user))

# For sessions, Flask uses signed cookies (integrity, not confidentiality)
app.config['SECRET_KEY'] = 'generate-strong-random-key'

Why this works:

  • request.get_json() yields basic types only (dict/list/str/int).
  • The dataclass is constructed explicitly from validated fields.
  • Signed cookies prevent tampering without exposing server-side objects.

FastAPI

# SECURE - FastAPI with Pydantic (JSON-based)

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel, EmailStr, Field

app = FastAPI()

class User(BaseModel):
    name: str = Field(..., min_length=1, max_length=100)
    email: EmailStr
    age: int = Field(..., ge=0, le=150)

@app.post("/users", response_model=User)
async def create_user(user: User):
    # FastAPI automatically validates and deserializes JSON
    # Pydantic ensures type safety
    save_user(user)
    return user

@app.get("/users/{user_id}", response_model=User)
async def get_user(user_id: int):
    user = get_user_from_db(user_id)
    if not user:
        raise HTTPException(status_code=404)
    return user

Why this works:

  • Pydantic validates and parses JSON into a safe, typed model.
  • No arbitrary class instantiation from attacker-supplied metadata.
  • Responses are serialized to JSON without executing code.

Data Classes with JSON

# SECURE - Python 3.7+ dataclasses with JSON

from dataclasses import dataclass, asdict
import json

@dataclass
class User:
    name: str
    email: str
    age: int

def serialize_user(user: User) -> str:
    return json.dumps(asdict(user))

def deserialize_user(json_str: str) -> User:
    data = json.loads(json_str)
    return User(**data)

# Usage

user = User(name="John", email="john@example.com", age=30)
json_str = serialize_user(user)
restored = deserialize_user(json_str)

Input Validation

# Validate after deserialization

from pydantic import BaseModel, validator, EmailStr

class User(BaseModel):
    name: str
    email: EmailStr
    age: int

    @validator('name')
    def name_must_not_be_empty(cls, v):
        if not v or not v.strip():
            raise ValueError('Name cannot be empty')
        return v

    @validator('age')
    def age_must_be_reasonable(cls, v):
        if v < 0 or v > 150:
            raise ValueError('Age must be between 0 and 150')
        return v

# Usage

try:
    user_data = json.loads(untrusted_json)
    user = User(**user_data)  # Validates during construction
except ValueError as e:
    print(f"Validation error: {e}")

Important: Validation is only safe after deserializing data-only formats like JSON or MessagePack. It is not sufficient for unsafe deserialization formats (pickle, yaml.unsafe_load, marshal) because code can execute during parsing.

Signature Verification

# SECURE - Verify HMAC before deserializing

import hmac
import hashlib
import json

class SignedSerializer:
    def __init__(self, secret_key: bytes):
        self.secret_key = secret_key

    def serialize(self, obj: dict) -> bytes:
        json_data = json.dumps(obj).encode('utf-8')

        # Create HMAC
        signature = hmac.new(
            self.secret_key,
            json_data,
            hashlib.sha256
        ).digest()

        # Return signature + data
        return signature + json_data

    def deserialize(self, signed_data: bytes) -> dict:
        # Extract signature and data
        signature = signed_data[:32]  # SHA-256 is 32 bytes
        json_data = signed_data[32:]

        # Verify signature
        expected_signature = hmac.new(
            self.secret_key,
            json_data,
            hashlib.sha256
        ).digest()

        if not hmac.compare_digest(signature, expected_signature):
            raise ValueError("Invalid signature")

        # Only deserialize if signature is valid
        return json.loads(json_data)

# Usage

serializer = SignedSerializer(b'your-secret-key-here')
signed = serializer.serialize({'user': 'john', 'role': 'admin'})
data = serializer.deserialize(signed)

Verification

After implementing the recommended secure patterns, verify the fix through multiple approaches:

  • Manual testing: Submit malicious payloads relevant to this vulnerability and confirm they're handled safely without executing unintended operations (see the test sketch after this list)
  • Code review: Confirm all instances use safe deserialization APIs and reject unsafe formats
  • Static analysis: Use security scanners to verify no unsafe deserialization patterns remain
  • Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly
  • Edge case validation: Test with special characters, boundary conditions, and unusual inputs to verify proper handling
  • Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
  • Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
  • Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
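
For the testing bullets above, a couple of pytest-style checks can live in the regression suite; a minimal sketch (payloads and assertions are illustrative):

import json

import pytest
import yaml

MALICIOUS_YAML = "!!python/object/apply:os.system ['echo pwned']"

def test_unsafe_yaml_rejected():
    # safe_load must refuse object-construction tags instead of executing them
    with pytest.raises(yaml.constructor.ConstructorError):
        yaml.safe_load(MALICIOUS_YAML)

def test_legitimate_json_roundtrip():
    # Regression check: normal data still round-trips after the migration
    original = {'name': 'Alice', 'age': 30}
    assert json.loads(json.dumps(original)) == original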

Python Deserialization Library Safety Matrix

Use this reference when reviewing Python deserialization code:

json (standard library):

import json
data = json.loads(user_input)  # Safe - data only, no code execution
  • Cannot execute code
  • Only deserializes basic types (dict, list, str, int, float, bool, None)
  • Use for all untrusted data

yaml.safe_load() (PyYAML):

import yaml
data = yaml.safe_load(user_input)  # Safe - restricted to safe types
  • Only constructs standard Python objects
  • Cannot instantiate arbitrary classes
  • No !!python/object or !!python/object/apply tags

msgpack (MessagePack):

import msgpack
data = msgpack.unpackb(user_input, raw=False)  # Safe - data only
  • Binary JSON alternative
  • No code execution capability
  • Efficient for large datasets

WARNING: Requires Careful Configuration

yaml.load() with SafeLoader:

import yaml
data = yaml.load(user_input, Loader=yaml.SafeLoader)  # Same as safe_load()
  • Explicitly specify Loader=yaml.SafeLoader
  • Never use Loader=yaml.Loader or Loader=yaml.UnsafeLoader

yaml.load() without explicit loader (DEPRECATED):

# DEPRECATED - PyYAML 5.1+ warns when no Loader is given

data = yaml.load(user_input)  # Used the unsafe Loader by default before PyYAML 5.1
  • PyYAML < 5.1 defaulted to the unsafe Loader; 5.1+ warns and falls back to FullLoader
  • Update code to use yaml.safe_load() instead

Unsafe (Never Use with Untrusted Data)

pickle / cPickle / _pickle:

import pickle
data = pickle.loads(untrusted_data)  # DANGEROUS - arbitrary code execution
  • Can execute arbitrary Python code during deserialization
  • Exploits via __reduce__, __setstate__, __getstate__ methods
  • No safe configuration exists
  • Only use with data you generated yourself in a controlled environment

marshal:

import marshal
data = marshal.loads(untrusted_data)  # DANGEROUS
  • Low-level serialization for Python bytecode
  • Can execute code
  • Intended for .pyc files, not data exchange

shelve (uses pickle internally):

import shelve
db = shelve.open('data.db')
data = db[key]  # DANGEROUS - values are unpickled on read; a tampered shelf file runs code
  • Built on pickle
  • Inherits all pickle vulnerabilities
  • Only use for locally-generated data

yaml.unsafe_load() / yaml.full_load() with custom tags:

import yaml
data = yaml.unsafe_load(user_input)  # EXTREMELY DANGEROUS

# OR
data = yaml.full_load(user_input)  # FullLoader has had code-execution bypasses - treat as unsafe
  • Allows !!python/object/apply tag for arbitrary code execution:
  !!python/object/apply:os.system ['rm -rf /']
  • No legitimate use case for untrusted data

jsonpickle:

import jsonpickle
obj = jsonpickle.decode(user_input)  # DANGEROUS
  • Serializes Python objects to JSON, preserving type information
  • Can instantiate arbitrary classes
  • Vulnerable to gadget chains like pickle

dill (extended pickle):

import dill
obj = dill.loads(untrusted_data)  # DANGEROUS
  • More powerful than pickle
  • Same security issues as pickle, with an even larger attack surface

Migration Examples

From pickle to JSON:

# OLD (unsafe)
import pickle
with open('data.pkl', 'rb') as f:
    data = pickle.load(f)

# NEW (safe)
import json
with open('data.json', 'r') as f:
    data = json.load(f)

From yaml.load() to yaml.safe_load():

# OLD (unsafe in PyYAML < 5.1, deprecated in 5.1+)
import yaml
with open('config.yaml') as f:
    config = yaml.load(f)

# NEW (safe)
import yaml
with open('config.yaml') as f:
    config = yaml.safe_load(f)

Handling Custom Objects (Safe Pattern):

# Instead of pickling custom objects, serialize to dict:

from dataclasses import dataclass, asdict
import json

@dataclass
class User:
    name: str
    email: str

# Serialize
user = User("John", "john@example.com")
user_json = json.dumps(asdict(user))

# Deserialize
user_dict = json.loads(user_json)
restored_user = User(**user_dict)

Key Takeaway: Python's pickle/marshal/shelve modules can execute arbitrary code and should never be used with untrusted data. Always use JSON or MessagePack for data from external sources.

Additional Resources