API References
pypickle saves Python variables to pickle files and loads them back.
Name : pypickle.py
Author : E.Taskesen
Contact : erdogant@gmail.com
Github : https://github.com/erdogant/pypickle
Licence : See licences
- class pypickle.pypickle.ValidateUnpickler(file, validate_modules=None, risky_modules=None)
Unpickler that blocks risky modules but allows user-specified modules even if they would normally be blocked.
- find_class(module, name)
Return an object from a specified module.
If necessary, the module will be imported. Subclasses may override this method (e.g. to restrict unpickling of arbitrary classes and functions).
This method is called whenever a class or a function object is needed. Both arguments passed are str objects.
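The override pattern described above can be sketched with the standard library alone. This is a minimal illustration, not the pypickle implementation: the class name `SketchUnpickler`, the helper `restricted_loads`, and the contents of `RISKY` are assumptions made for the example.

```python
import io
import pickle

# Hypothetical block-list for this sketch only; pypickle's own list
# is returned by get_risk_modules().
RISKY = {"os", "posix", "nt", "subprocess"}


class SketchUnpickler(pickle.Unpickler):
    """Refuse globals from risky modules unless the caller allows them."""

    def __init__(self, file, allowed=None):
        super().__init__(file)
        self.allowed = set(allowed or [])

    def find_class(self, module, name):
        # Called for every class or function the pickle stream references.
        top = module.split(".")[0]
        if top in RISKY and top not in self.allowed:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def restricted_loads(data, allowed=None):
    """Unpickle bytes through the restricted unpickler."""
    return SketchUnpickler(io.BytesIO(data), allowed=allowed).load()


# Plain containers reference no globals, so they load normally.
print(restricted_loads(pickle.dumps([1, 2, 3])))
```

A stream that references `os.system` raises `pickle.UnpicklingError` in this sketch unless `"os"` is passed in `allowed`.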
- pypickle.pypickle.check_logger(verbose: [<class 'str'>, <class 'int'>] = 'info')
Check the logger.
- pypickle.pypickle.clean(filename: str) str
Clean the filename to make sure the file can be saved on disk.
Description
The following characters in the filename are replaced with the character ‘_’: ‘&’, ‘,’, ‘?’, ‘$’, ‘!’, ‘/’, ‘’
- Parameters:
filename (str) – Filename.
- Returns:
The cleaned filename.
- Return type:
str
Example
>>> import pypickle
>>> filename = 't/st.pkl'
>>> data = [1,2,3,4,5]
>>> filename = pypickle.clean(filename)
>>> # Save
>>> status = pypickle.save(filename, data)
>>> # Load file
>>> data = pypickle.load(filename)
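The character replacement described for clean can be sketched as follows. The function name `clean_filename` and the constant `UNSAFE_CHARS` are hypothetical, and only the characters that are legible in the description above are included.

```python
# Characters taken from the description above; '_' is the replacement.
UNSAFE_CHARS = "&,?$!/"


def clean_filename(filename: str, replacement: str = "_") -> str:
    """Replace every unsafe character with the replacement character."""
    table = str.maketrans({ch: replacement for ch in UNSAFE_CHARS})
    return filename.translate(table)


print(clean_filename("t/st.pkl"))  # t_st.pkl
```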
- pypickle.pypickle.convert_verbose_to_new(verbose)
Convert old verbosity to the new.
- pypickle.pypickle.get_allowed_paths(custom_path=None)
- pypickle.pypickle.get_critical_paths(custom_path=None)
Get critical paths.
Cross-platform safety: including all known critical paths is a defensive design choice. This approach makes the security policy more fail-safe and portable. It prevents users from accidentally saving into a sensitive path, even when the code runs in a cross-platform environment such as Docker, networked or shared drives, remote SSH sessions, or CI pipelines.
- Parameters:
custom_path ([str, list], optional) – User defined custom critical path is appended to the existing critical paths.
- Returns:
CRITICAL_PATHS
- Return type:
list
- pypickle.pypickle.get_logger()
Return logger status.
- pypickle.pypickle.get_risk_modules()
Risk modules. These modules can be used for code execution, filesystem manipulation, networking, or loading arbitrary code.
- Returns:
risky_modules
- Return type:
dict
- pypickle.pypickle.is_critical_path(filepath: str) bool
Check if a given filepath points to or is nested inside a critical system path, unless it falls under explicitly allowed subpaths.
- Parameters:
filepath (str) – The target path to check.
critical_paths (list or None) – List of system-critical base paths to protect.
allowed_subpaths (list or None) – List of subpaths that are considered safe even if under a critical path.
- Returns:
True if the filepath is within a critical path (and not explicitly allowed).
- Return type:
bool
Notes
Handles cross-platform path resolution.
Allows exceptions for safe subdirectories like temp folders.
Examples
>>> import pypickle
>>> pypickle.is_critical_path(r"C:\Users\User\AppData\Local\Temp\myfile.pkl")
False
>>> pypickle.is_critical_path(r"C:\Windows\System32\config.sys")
True
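The idea of "critical unless explicitly allowed" can be sketched with `pathlib`. The lists `CRITICAL` and `ALLOWED` and the function `is_critical` are hypothetical and chosen only to mirror the two examples above; the actual lists used by pypickle are not shown in this reference. Pure paths do not resolve `..` or symlinks, so a real check would need to resolve the path first.

```python
from pathlib import PureWindowsPath

# Hypothetical lists for illustration only.
CRITICAL = [PureWindowsPath(r"C:\Windows"), PureWindowsPath(r"C:\Users")]
ALLOWED = [PureWindowsPath(r"C:\Users\User\AppData\Local\Temp")]


def _is_under(path, base):
    """True if path equals base or is nested inside it."""
    return path == base or base in path.parents


def is_critical(filepath: str) -> bool:
    path = PureWindowsPath(filepath)
    # Allowed subpaths take precedence over critical base paths.
    if any(_is_under(path, allowed) for allowed in ALLOWED):
        return False
    return any(_is_under(path, critical) for critical in CRITICAL)
```

`PureWindowsPath` compares case-insensitively, which matches how Windows treats paths and keeps the sketch deterministic on any operating system.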
- pypickle.pypickle.is_known_extension(filepath: str, allowed_extensions=None) bool
Check if the file has an allowed extension.
- Parameters:
filepath (str) – The file path to validate.
allowed_extensions (list or None) – List of allowed file extensions (with dot), e.g., [‘.pkl’, ‘.pickle’, ‘.joblib’]. If None, defaults to common pickle-related formats.
- Returns:
True if the extension is in the allowed list, False otherwise.
- Return type:
bool
Examples
>>> is_known_extension("data.pkl")
True
>>> is_known_extension("config.yaml")
False
>>> is_known_extension("model.joblib", ['.joblib'])
True
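An extension check of this kind can be sketched with `os.path.splitext`. The name `has_known_extension` is hypothetical, and `DEFAULT_EXTENSIONS` simply reuses the example list from the parameter description; the library's actual default list is an assumption here, as is the case-insensitive comparison.

```python
import os

# Assumed default, copied from the example list in the parameter description.
DEFAULT_EXTENSIONS = (".pkl", ".pickle", ".joblib")


def has_known_extension(filepath, allowed_extensions=None):
    """Return True if the file extension is in the allowed list."""
    allowed = DEFAULT_EXTENSIONS if allowed_extensions is None else allowed_extensions
    extension = os.path.splitext(filepath)[1].lower()
    return extension in {item.lower() for item in allowed}
```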
- pypickle.pypickle.is_safe_path(path: str) bool
Safely check if path is within allowed base directories, even across drives.
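One way to make a containment check robust across drives is to treat the `ValueError` that `commonpath` raises for different drives as "not inside". The sketch below uses `ntpath` so that it behaves the same on any operating system; the function name `is_within` and the base directories in the usage are hypothetical, and the function expects absolute Windows-style paths.

```python
import ntpath


def is_within(path, bases):
    """Return True if path lies inside one of the base directories."""
    target = ntpath.normcase(ntpath.normpath(path))
    for base in bases:
        base_norm = ntpath.normcase(ntpath.normpath(base))
        try:
            if ntpath.commonpath([target, base_norm]) == base_norm:
                return True
        except ValueError:
            # Raised when the two paths are on different drives.
            continue
    return False
```

Because `normpath` collapses `..` segments before the comparison, a path such as `C:\Users\me\..\..\Windows\x.pkl` is compared as `C:\Windows\x.pkl` and is not considered inside `C:\Users\me`.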
- pypickle.pypickle.load(filepath: str, fix_imports: bool = True, encoding: str = 'ASCII', errors: str = 'strict', validate: bool | list = True, verbose: str = 'info')
Load a pickle file from disk, with optional security restrictions. A pickle file is loaded directly if all of its modules are in the allowlist. If the file contains unknown modules, they need to be validated using the validate parameter. Pickle files that contain risky modules, i.e., modules that can automatically make changes on the system or start (unwanted) applications, are not allowed unless explicitly permitted using the validate parameter.
| Module Type      | Allowed? | How to Change Behavior                       |
| ---------------- | -------- | -------------------------------------------- |
| Unknown          | V        | Allowed unless in risky list                 |
| Risky (os, etc.) | X        | Must be explicitly added via validate=[‘nt’] |
| Custom safe      | V        | If included in validate param                |
- Parameters:
filepath (str) – Path to the pickle file.
fix_imports (bool) – Compatibility for loading Python 2 pickles in Python 3.
encoding (str) – Encoding for legacy Python 2 pickles.
errors (str) – Error handling for decoding.
validate (bool or list, default=True) –
True: Validate with default safe module list.
False: Disable all validation (use at own risk).
list: [‘nt’, ‘sklearn’, ‘pandas’] : modules that are allowed based on name prefixes.
verbose (str) – Verbosity level (not used here, placeholder).
- Returns:
The loaded Python object or None if loading fails.
- Return type:
object or None
Examples
>>> # Example 1
>>> import pypickle
>>> filepath = 'model.pkl'
>>> data = [1, 2, 3]
>>> status = pypickle.save(filepath, data, overwrite=True)
>>> # Load with validation (default)
>>> data = pypickle.load(filepath)
>>> data = pypickle.load(filepath, validate=False)
>>> #
>>> # Example 2
>>> # Load without validation (not recommended)
>>> data = pypickle.load(filepath, validate=False)
>>> #
>>> # Example 3
>>> # Load a malicious pickle: an exploit that starts the calculator
>>> data = pypickle.load(r'malicious.pkl')
>>> data = pypickle.load(r'malicious.pkl', validate=False)
>>> mods = pypickle.validate_modules(r'malicious.pkl')
>>> data = pypickle.load(r'malicious.pkl', validate=mods)
>>> #
>>> # Example 4
>>> # Sklearn example
>>> from sklearn.linear_model import LogisticRegression
>>> model = LogisticRegression()
>>> status = pypickle.save('model.pkl', model, overwrite=True)
>>> pypickle.load('model.pkl', validate=False)
>>> pypickle.load('model.pkl', validate=True)
>>> #
>>> # Example 5
>>> mods = pypickle.validate_modules('model.pkl')
>>> pypickle.load('model.pkl', validate=mods)
>>> pypickle.load('model.pkl', validate='sklearn')
- pypickle.pypickle.load_and_validate(filepath, fix_imports=True, encoding='ASCII', errors='strict', validate_modules=None)
Securely validate pickle file contents and load it only if safe.
- Parameters:
filepath (str) – Path to the pickle file.
validate_modules (list or None) – List of allowed module prefixes (e.g. [‘sklearn’, ‘numpy’]).
fix_imports (bool)
encoding (str)
errors (str)
- Returns:
The unpickled object or None if loading failed or unsafe.
- Return type:
object or None
- pypickle.pypickle.load_pickle(filepath, fix_imports=True, encoding='ASCII', errors='strict')
Load a pickle file without validation.
- pypickle.pypickle.save(filepath: str, var, overwrite: bool = False, fix_imports: bool = True, allow_external: bool = False, verbose: str = 'info')
Save the input variable to a pickle file. Several security checks run before saving:
* The filepath should be inside the safe paths (user and temp directories). This can be overridden using the allow_external parameter.
* Saving pkl files into (critical) system paths is not allowed.
* The extension must be one of [‘.pkl’, ‘.pickle’, ‘.pklz’, ‘.pbz2’] to prevent overwriting other file types.
* Filepaths are checked for path traversal.
Security Mechanisms and Purpose
| Mechanism                     | Purpose                                                        |
| ----------------------------- | -------------------------------------------------------------- |
| allow_external=True           | Explicit user opt-in to save outside allowed safe directories  |
| System path check             | Prevents saving in critical system paths                       |
| Extension check               | Only save with extension: ‘.pkl’, ‘.pickle’, ‘.pklz’, ‘.pbz2’  |
| Path traversal check          | Prevents directory traversal exploits like ../../etc/passwd    |
| Audit logs for external saves | Enables monitoring and traceability of risky saves             |
- Parameters:
filepath (str) – Pathname to store pickle file.
var (object) – Any Python object (list, dict, DataFrame, etc.) to be stored.
overwrite (bool, optional (default=False)) – Whether to overwrite the file if it already exists.
fix_imports (bool, optional (default=True)) – Fixes imports for compatibility with Python 2 pickle streams.
allow_external (bool, optional (default=False)) – Allow saving outside predefined safe directories (explicit opt-in). The safe paths are: user and temp directories.
verbose (str or int, optional (default='info')) – Logging verbosity level: ‘debug’, ‘info’, ‘warn’, ‘error’, ‘silent’.
- Returns:
True if the file was saved successfully, False otherwise.
- Return type:
bool
Example
>>> import pypickle
>>> import os
>>> import tempfile
>>>
>>> filepath = r'c:/temp/test.pkl'
>>> data = [1,2,3,4,5]
>>> status = pypickle.save(filepath, data)
>>> status = pypickle.save(filepath, data, allow_external=True)
>>> status = pypickle.save(filepath, data, allow_external=True, overwrite=True)
>>> #
>>> filepath = r'c:/temp/test.bat'
>>> data = [1,2,3,4,5]
>>> status = pypickle.save(filepath, data)
>>> #
>>> filepath = os.path.join(tempfile.gettempdir(), "test.pkl")
>>> status = pypickle.save(filepath, data)
>>> status = pypickle.save(filepath, data, overwrite=True)
>>> #
>>> filepath = r'd:/repos/test.pkl'
>>> status = pypickle.save(filepath, data)
>>> status = pypickle.save(filepath, data, overwrite=True)
>>> status = pypickle.save(filepath, data, overwrite=True, allow_external=True)
>>> #
>>> filepath = r'C://test.pkl'
>>> data = [1,2,3,4,5]
>>> status = pypickle.save(filepath, data)
>>> status = pypickle.save(filepath, data, allow_external=True, overwrite=True)
>>> #
>>> filepath = r"C:\Users\User\AppData\Local\Temp\myfile.pkl"
>>> data = [1,2,3,4,5]
>>> status = pypickle.save(filepath, data)
>>> status = pypickle.save(filepath, data, allow_external=True)
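The boolean status and the overwrite guard can be illustrated with a reduced, standard-library sketch. The function `save_sketch` is hypothetical and covers only the traversal check, the extension check, and the overwrite guard; the safe-directory and system-path checks, allow_external, and audit logging are deliberately left out.

```python
import os
import pickle
import tempfile

# Extension list taken from the description of save above.
ALLOWED_EXTENSIONS = {".pkl", ".pickle", ".pklz", ".pbz2"}


def save_sketch(filepath, var, overwrite=False):
    """Return True if var was written to filepath, False otherwise."""
    parts = filepath.replace("\\", "/").split("/")
    if ".." in parts:
        return False  # path traversal such as ../../etc/passwd
    if os.path.splitext(filepath)[1].lower() not in ALLOWED_EXTENSIONS:
        return False  # extension not in the allowed list
    if os.path.exists(filepath) and not overwrite:
        return False  # existing file and no explicit overwrite
    with open(filepath, "wb") as handle:
        pickle.dump(var, handle)
    return True


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "test.pkl")
    print(save_sketch(target, [1, 2, 3]))                  # True
    print(save_sketch(target, [1, 2, 3]))                  # False
    print(save_sketch(target, [1, 2, 3], overwrite=True))  # True
```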
- pypickle.pypickle.set_logger(verbose: [<class 'str'>, <class 'int'>] = 'info')
Set the logger for verbosity messages.
- Parameters:
verbose ([str, int], default is 'info' or 20) – Set the verbose messages using string or integer values.
* [0, 60, None, ‘silent’, ‘off’, ‘no’]: No message.
* [10, ‘debug’]: Messages from debug level and higher.
* [20, ‘info’]: Messages from info level and higher.
* [30, ‘warning’]: Messages from warning level and higher.
* [50, ‘critical’]: Messages from critical level and higher.
- Returns:
None.
>>> # Set the logger to warning
>>> set_logger(verbose='warning')
>>> # Test with different messages
>>> logger.debug("Hello debug")
>>> logger.info("Hello info")
>>> logger.warning("Hello warning")
>>> logger.critical("Hello critical")
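The mapping from verbose values to logging levels listed above can be sketched with the standard `logging` module. The names `LEVELS`, `to_level`, and the logger name `"sketch"` are hypothetical; how pypickle configures its own logger internally is not shown in this reference.

```python
import logging

# String aliases from the parameter description, mapped to numeric levels.
LEVELS = {
    "silent": 60, "off": 60, "no": 60,
    "debug": 10, "info": 20, "warning": 30, "critical": 50,
}


def to_level(verbose):
    """Convert a verbose value (str, int, or None) to a numeric logging level."""
    if verbose is None or verbose == 0:
        return 60  # above CRITICAL, so no message is emitted
    if isinstance(verbose, str):
        return LEVELS[verbose.lower()]
    return int(verbose)


logger = logging.getLogger("sketch")
logger.setLevel(to_level("warning"))
logger.info("Hello info")        # suppressed at warning level
logger.warning("Hello warning")  # emitted
```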
- pypickle.pypickle.validate_modules(filepath: str) list
Extract unique module names from a pickle file.
- Parameters:
filepath (str) – Path to the pickle file.
warn (bool) – Print warnings for risky modules (like os, subprocess).
- Returns:
List of required module name prefixes (e.g. [‘sklearn’, ‘numpy’]).
- Return type:
list
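Module names can be read from a pickle stream without executing it by walking the opcodes with `pickletools.genops`. The sketch below is an illustration of that technique, not the pypickle implementation: `scan_modules` is a hypothetical name, and the handling of `STACK_GLOBAL` is a heuristic that assumes the module and attribute names are the two most recently pushed strings, which may miss names reused through the pickle memo.

```python
import pickle
import pickletools
from fractions import Fraction

# Opcodes that push a text string onto the pickle stack.
_STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def scan_modules(data: bytes) -> list:
    """Return the sorted top-level module names referenced by a pickle."""
    modules = set()
    strings = []  # strings pushed so far, needed for STACK_GLOBAL
    for opcode, arg, _position in pickletools.genops(data):
        if opcode.name in _STRING_OPS:
            strings.append(arg)
        elif opcode.name == "GLOBAL":
            # arg has the form "module name", separated by one space.
            modules.add(arg.split(" ")[0].split(".")[0])
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Heuristic: module name is the second-to-last pushed string.
            modules.add(strings[-2].split(".")[0])
    return sorted(modules)


print(scan_modules(pickle.dumps(Fraction(1, 2))))  # ['fractions']
```

Because the stream is only parsed and never loaded, this kind of scan is safe to run on an untrusted file before deciding which prefixes to pass on for validation.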