cuFile-first patching toolkit for large model loading.

Project structure:
```
.
├── .github/workflows/
│   ├── ci.yml
│   ├── coverage-pages.yml
│   └── publish.yml
├── .vscode/
│   ├── extensions.json
│   └── settings.json
├── src/cufile_patcher/
│   ├── __init__.py
│   ├── auto_patch.py
│   ├── bindings.py
│   ├── cufile.py
│   ├── cufile_types.py
│   ├── core.py
│   ├── registry.py
│   ├── safetensor_patcher.py
│   ├── service.py
│   ├── tensorflow_patcher.py
│   ├── torch_patcher.py
│   └── plugins/
│       ├── __init__.py
│       ├── base.py
│       └── system.py
├── tests/test_auto_patch.py
├── tests/test_backend_core.py
├── tests/test_core.py
├── tests/test_safetensor_patcher.py
├── tests/test_tensorflow_patcher.py
├── tests/test_torch_patcher.py
├── AGENTS.md
├── README.md
└── pyproject.toml
```
Install dependencies:
```bash
uv sync --all-groups
```

Install package variants:

```bash
pip install cufile-patcher
pip install "cufile-patcher[all]"
pip install "cufile-patcher[tf]"
pip install "cufile-patcher[tensorflow]"
pip install "cufile-patcher[torch]"
pip install "cufile-patcher[pytorch]"
```

Run lint:

```bash
uv run ruff check .
```

Run tests:

```bash
uv run pytest
```

Quick smoke test:

```python
from cufile_patcher import hello_world

print(hello_world())
```

Expected output:

```
Hello, world!
```
The package includes a modernized port of cuFile wrapper features (a usage sketch follows the list):

- `CuFileDriver` singleton driver lifecycle
- `CuFile` class with mode mapping, open/close, context manager, read/write
- `cuFileDriverOpen`, `cuFileDriverClose`
- `cuFileHandleRegister`, `cuFileHandleDeregister`
- `cuFileRead`, `cuFileWrite`
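The import path, the `CuFile(path, mode)` constructor, and the `read(buffer, size, file_offset)` call below are assumptions about the API in `cufile.py`, not confirmed signatures; treat this as a rough sketch of the wrapper surface:

```python
# Sketch only: class names come from the list above, but the exact
# constructor/read signatures are assumptions and may differ in cufile.py.
from cufile_patcher import CuFileDriver, CuFile  # assumed import path

driver = CuFileDriver()  # singleton wrapping cuFileDriverOpen/cuFileDriverClose

# Context-manager open/close with a mapped mode string.
with CuFile("/path/to/model.bin", "r") as f:
    buf = bytearray(16 * 1024 * 1024)
    # Assumed read signature: fill `buf` starting at file offset 0.
    nread = f.read(buf, len(buf), file_offset=0)
```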
The backend is plugin-based using OOP boundaries:

- `CuFileBackend` interface in `plugins/base.py`
- `SystemCuFileBackend` implementation in `plugins/system.py`
- `BackendRegistry` and default backend switching in `registry.py`

You can register a custom backend for mocks, testing, or alternate transports.
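For instance, a test suite could plug in an in-memory backend; the `read` method and the `register`/`set_default` calls below are guesses at the shapes defined in `plugins/base.py` and `registry.py`, so treat this as a sketch rather than the documented API:

```python
# Sketch of a mock backend for tests; interface methods and registry
# calls are assumed names, not verified against plugins/base.py.
from cufile_patcher.plugins.base import CuFileBackend
from cufile_patcher.registry import BackendRegistry

class InMemoryBackend(CuFileBackend):
    """Serves reads from an in-memory dict instead of the system cuFile path."""

    def __init__(self, files: dict[str, bytes]):
        self._files = files

    def read(self, path: str, offset: int, size: int) -> bytes:
        return self._files[path][offset:offset + size]

registry = BackendRegistry()
registry.register("in-memory", InMemoryBackend({"/fake/model.bin": b"\x00" * 1024}))
registry.set_default("in-memory")
```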
The project provides dedicated patchers for streaming large model files:

- `patch_torch_load(...)` for `torch.load`
- `patch_tensorflow_load_model(...)` for `tf.keras.models.load_model`
- `patch_safetensor_load_file(...)` for `safetensors.torch.load_file`

All patchers support:

- a minimum file size threshold for the streaming path (`min_file_size_mb`)
- a streaming chunk size (`chunk_size_mb`)
- optional cuFile-backed reads (`use_cufile=True`)

Use a single context manager to install and remove framework patchers automatically:
```python
from cufile_patcher import auto_patch

with auto_patch():
    # existing framework load calls can remain unchanged
    ...
```

```python
from cufile_patcher import auto_patch

# Recommended default for most projects.
with auto_patch(min_file_size_mb=100, chunk_size_mb=64):
    ...
```

This keeps migrations small because your existing `torch.load`, `tf.keras.models.load_model`, and `safetensors.torch.load_file` calls can stay as-is.

```python
from cufile_patcher import auto_patch

with auto_patch(torch=True, tensorflow=False, safetensors=True):
    ...
```
```python
with auto_patch(strict=True):
    ...
```

| Parameter | Default | Meaning |
|---|---|---|
| `torch` | `None` | `None` auto-detects, `True` requires torch, `False` disables torch patching |
| `tensorflow` | `None` | `None` auto-detects, `True` requires tensorflow, `False` disables tensorflow patching |
| `safetensors` | `None` | `None` auto-detects, `True` requires safetensors, `False` disables safetensors patching |
| `strict` | `False` | Raise if a required framework is missing |
| `min_file_size_mb` | `64` | Minimum file size to switch from direct load to streaming path |
| `chunk_size_mb` | `16` | Streaming chunk size |
| `use_cufile` | `False` | Use cuFile reader instead of pure Python reader |
| `fallback_to_original` | `True` | If streaming fails, fall back to the original framework loader |
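Putting the size, chunking, and fallback parameters together, a patched loader can be expected to behave roughly like the sketch below; `_original_load` and `_stream_load` are hypothetical stand-ins for internals, not functions exported by the package:

```python
import os

def _original_load(path):
    ...  # placeholder for the unpatched framework loader

def _stream_load(path, chunk_size_mb):
    ...  # placeholder for the chunked cuFile/pure-Python streaming path

def patched_load(path, min_file_size_mb=64, chunk_size_mb=16, fallback_to_original=True):
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb < min_file_size_mb:
        return _original_load(path)               # small files: direct load
    try:
        return _stream_load(path, chunk_size_mb)  # large files: streaming path
    except Exception:
        if fallback_to_original:
            return _original_load(path)           # documented fallback behavior
        raise
```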
If your current code manually installs and uninstalls patchers, move the lifecycle to one block:
```python
from cufile_patcher import auto_patch

def load_models():
    with auto_patch():
        # old loading calls remain unchanged
        ...
```

Notes:

- `with cufile-patcher:` is not valid Python syntax. Use `with auto_patch(...):` instead.
- `None` means auto-detect and patch available frameworks.
- `strict=True` enforces availability checks for the selected/auto-detected set.
- `True` for a framework raises if that framework is not installed (see the sketch below).
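As a concrete illustration of the last two notes, requiring a framework that is not installed should fail fast under `strict=True`; the exact exception type is not documented here, so the sketch catches a broad `Exception`:

```python
from cufile_patcher import auto_patch

# Assumes TensorFlow is NOT installed in this environment.
try:
    with auto_patch(tensorflow=True, strict=True):
        ...
except Exception as exc:  # the concrete exception type depends on the implementation
    print(f"strict mode reported the missing framework: {exc}")
```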
The per-framework patchers can also be installed and removed manually:

```python
import torch

from cufile_patcher import patch_torch_load

patcher = patch_torch_load(torch, min_file_size_mb=100, chunk_size_mb=64, use_cufile=True)
try:
    model_state = torch.load("/path/to/model.pt", map_location="cpu")
finally:
    patcher.uninstall()
```
```python
import tensorflow as tf

from cufile_patcher import patch_tensorflow_load_model

patcher = patch_tensorflow_load_model(
    tf,
    min_file_size_mb=100,
    chunk_size_mb=64,
    use_cufile=True,
)
try:
    model = tf.keras.models.load_model("/path/to/model.keras")
finally:
    patcher.uninstall()
```
```python
import safetensors.torch as st

from cufile_patcher import patch_safetensor_load_file

patcher = patch_safetensor_load_file(
    st,
    min_file_size_mb=100,
    chunk_size_mb=64,
    use_cufile=True,
)
try:
    tensors = st.load_file("/path/to/model.safetensors")
finally:
    patcher.uninstall()
```

The publish workflow at `.github/workflows/publish.yml` is
configured to use PyPI trusted publishing.
To use it, push a version tag such as `v0.2.0`.

| Resource | URL |
|---|---|
| Documentation | https://maifeeulasad.github.io/cufile-patcher/ |
| Coverage | https://maifeeulasad.github.io/cufile-patcher/htmlcov/ |