cufile-patcher


cuFile-first patching toolkit for large model loading.

Project structure

.
├── .github/workflows/
│   ├── ci.yml
│   ├── coverage-pages.yml
│   └── publish.yml
├── .vscode/
│   ├── extensions.json
│   └── settings.json
├── src/cufile_patcher/
│   ├── __init__.py
│   ├── auto_patch.py
│   ├── bindings.py
│   ├── cufile.py
│   ├── cufile_types.py
│   ├── core.py
│   ├── registry.py
│   ├── safetensor_patcher.py
│   ├── service.py
│   ├── tensorflow_patcher.py
│   ├── torch_patcher.py
│   └── plugins/
│       ├── __init__.py
│       ├── base.py
│       └── system.py
├── tests/
│   ├── test_auto_patch.py
│   ├── test_backend_core.py
│   ├── test_core.py
│   ├── test_safetensor_patcher.py
│   ├── test_tensorflow_patcher.py
│   └── test_torch_patcher.py
├── AGENTS.md
├── README.md
└── pyproject.toml

Quick start

Install dependencies:

uv sync --all-groups

Install the package, optionally with framework extras:

pip install cufile-patcher
pip install "cufile-patcher[all]"
pip install "cufile-patcher[tf]"
pip install "cufile-patcher[tensorflow]"
pip install "cufile-patcher[torch]"
pip install "cufile-patcher[pytorch]"

Run lint:

uv run ruff check .

Run tests:

uv run pytest

Package function

from cufile_patcher import hello_world

print(hello_world())

Expected output:

Hello, world!

cuFile API (ported)

The package includes a modernized port of the cuFile wrapper features.

Plugin architecture

The backend is plugin-based, with clear object-oriented boundaries between the reader interface and its implementations.

You can register a custom backend for mocks, testing, or alternate transports.
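As a minimal, self-contained sketch of that pattern (the class and function names here are illustrative, not the package's actual API), a backend registry can look like this:

```python
from abc import ABC, abstractmethod

# Hypothetical names for illustration; cufile-patcher's real API may differ.
class Backend(ABC):
    """Common interface every reader backend must implement."""

    @abstractmethod
    def read(self, path: str, offset: int, size: int) -> bytes: ...

_REGISTRY: dict[str, type[Backend]] = {}

def register_backend(name: str, backend_cls: type[Backend]) -> None:
    """Register a backend class under a short name."""
    _REGISTRY[name] = backend_cls

def get_backend(name: str) -> Backend:
    """Instantiate a registered backend by name."""
    return _REGISTRY[name]()

class MockBackend(Backend):
    """In-memory backend useful for tests: returns zero bytes."""

    def read(self, path: str, offset: int, size: int) -> bytes:
        return b"\x00" * size

register_backend("mock", MockBackend)
```

Because callers only depend on the abstract interface, a mock or alternate-transport backend can be swapped in without touching the patching code.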

Framework patchers

The project provides dedicated patchers for streaming large model files.

The patchers share the same set of streaming options.

Context manager auto patching

Use a single context manager to install and remove framework patchers automatically:

from cufile_patcher import auto_patch

with auto_patch():
    # existing framework load calls can remain unchanged
    ...
from cufile_patcher import auto_patch

# Recommended default for most projects.
with auto_patch(min_file_size_mb=100, chunk_size_mb=64):
    ...

This keeps migrations small because your existing torch.load, tf.keras.models.load_model, and safetensors.torch.load_file calls can stay as-is.
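Under the hood, a patcher of this kind typically swaps the framework's loader for a wrapper on entry and restores the original on exit. A minimal, self-contained sketch of that lifecycle (illustrative only, not the package's internals):

```python
from contextlib import contextmanager

# Stand-in for a framework namespace; in practice this would be torch, tf, etc.
class FakeFramework:
    @staticmethod
    def load(path):
        return f"direct:{path}"

@contextmanager
def patch_load(module, streaming_loader):
    """Temporarily replace module.load, restoring the original on exit."""
    original = module.load
    module.load = streaming_loader
    try:
        yield
    finally:
        module.load = original  # always restore, even if loading raised

def streaming_load(path):
    return f"streamed:{path}"

with patch_load(FakeFramework, streaming_load):
    result = FakeFramework.load("model.pt")  # routed through the wrapper
```

The try/finally guarantees the original loader is restored even when a load call raises, which is why the context-manager form is safer than manual install/uninstall calls.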

Selection and strict mode

from cufile_patcher import auto_patch

with auto_patch(torch=True, tensorflow=False, safetensors=True):
    ...

with auto_patch(strict=True):
    ...

Parameters

| Parameter | Default | Meaning |
| --- | --- | --- |
| torch | None | None auto-detects, True requires torch, False disables torch patching |
| tensorflow | None | None auto-detects, True requires tensorflow, False disables tensorflow patching |
| safetensors | None | None auto-detects, True requires safetensors, False disables safetensors patching |
| strict | False | Raise if a required framework is missing |
| min_file_size_mb | 64 | Minimum file size at which loads switch from the direct path to the streaming path |
| chunk_size_mb | 16 | Streaming chunk size in MB |
| use_cufile | False | Use the cuFile reader instead of the pure-Python reader |
| fallback_to_original | True | If streaming fails, fall back to the original framework loader |
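The two size parameters interact as follows: files below min_file_size_mb are handed to the original loader in one read, while larger files are consumed in chunk_size_mb pieces. A self-contained sketch of that decision (illustrative, not the package's code):

```python
import io

MB = 1024 * 1024

def read_model_bytes(f: io.BufferedIOBase, file_size: int,
                     min_file_size_mb: int = 64,
                     chunk_size_mb: int = 16) -> bytes:
    """Read small files in one call; stream large files in fixed-size chunks."""
    if file_size < min_file_size_mb * MB:
        return f.read()  # small file: direct single read
    chunks = []
    while True:
        chunk = f.read(chunk_size_mb * MB)
        if not chunk:  # EOF
            break
        chunks.append(chunk)
    return b"".join(chunks)
```

Chunked reads bound peak transfer-buffer memory at roughly chunk_size_mb, which is the main reason streaming helps with multi-gigabyte checkpoints.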

Migration guidance

If your current code manually installs and uninstalls patchers, move the lifecycle to one block:

from cufile_patcher import auto_patch

def load_models():
    with auto_patch():
        # old loading calls remain unchanged
        ...

Notes and caveats

PyTorch example

import torch
from cufile_patcher import patch_torch_load

patcher = patch_torch_load(torch, min_file_size_mb=100, chunk_size_mb=64, use_cufile=True)
try:
    model_state = torch.load("/path/to/model.pt", map_location="cpu")
finally:
    patcher.uninstall()

TensorFlow example

import tensorflow as tf
from cufile_patcher import patch_tensorflow_load_model

patcher = patch_tensorflow_load_model(
    tf,
    min_file_size_mb=100,
    chunk_size_mb=64,
    use_cufile=True,
)
try:
    model = tf.keras.models.load_model("/path/to/model.keras")
finally:
    patcher.uninstall()

safetensors example

import safetensors.torch as st
from cufile_patcher import patch_safetensor_load_file

patcher = patch_safetensor_load_file(
    st,
    min_file_size_mb=100,
    chunk_size_mb=64,
    use_cufile=True,
)
try:
    tensors = st.load_file("/path/to/model.safetensors")
finally:
    patcher.uninstall()

Publishing

The publish workflow at .github/workflows/publish.yml is configured to use PyPI trusted publishing.

To use it:

  1. Configure this GitHub repository as a trusted publisher in your PyPI project.
  2. Create and push a tag like v0.2.0.
  3. GitHub Actions will build and publish the package.

Resources

| Resource | URL |
| --- | --- |
| Documentation | https://maifeeulasad.github.io/cufile-patcher/ |
| Coverage | https://maifeeulasad.github.io/cufile-patcher/htmlcov/ |