This repository lists file formats used in ML/AI systems. It is intended as a resource for tool development and vulnerability research. We aim to keep this list as up-to-date and accurate as possible. If you find missing file formats or inaccuracies, or have more details to contribute, please raise an issue or submit a pull request.

Name | ML-specific | Framework/Organization (if applicable) | Identification Tooling | Extensions | Additional Notes |
---|---|---|---|---|---|
PyTorch v1.3 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file containing data.pkl (1 pickle file) |
PyTorch v0.1.1 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: Tar file with sys_info, pickle, storages, and tensors |
PyTorch v0.1.10 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: Stacked pickle files |
TorchScript v1.4 | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with data.pkl, constants.pkl, and version (2 pickle files and a folder) |
TorchScript v1.3 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with data.pkl and constants.pkl (2 pickle files) |
TorchScript v1.1 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with model.json and attributes.pkl (a JSON file and a pickle file) |
TorchScript v1.0 (deprecated) | Yes | PyTorch | Fickling | .pt, .pth, .bin | Description: ZIP file with model.json |
PyTorch model archive format [ZIP] | Yes | PyTorch | Fickling | .mar | Description: ZIP file that includes Python code files and pickle files |
PyTorch model archive format [TAR] | Yes | PyTorch | - | .mar | Description: TAR file that includes Python code files and pickle files |
PyTorch Package | Yes | PyTorch | - | .pt, .pth, .bin | Description: ZIP file that includes a pickled model, user files represented as a Python package, and framework files including serialized tensor data |
ExecuTorch | Yes | PyTorch | - | .pte | Description: Modified binary flatbuffer file with optional data segments appended |
Torch.export | Yes | PyTorch | - | .pt2 | Description: ZIP file with JSON files and Python code file |
PyTorch Mobile | Yes | PyTorch | - | .ptl | Description: Modified binary flatbuffer file |
Safetensors | Yes | - | PolyFile | .safetensors | Refer to our audit |
ONNX | Yes | - | - | .onnx | Refer to LobotoMI |
Keras native file format | Yes | Keras | - | .keras | Description: ZIP archive with 2 JSON files and 1 h5 file |
TensorFlow Saved Models | Yes | TensorFlow | - | .pb | Description: Custom Protobuf format. Can result in arbitrary code execution. |
TensorFlow Checkpoint | Yes | TensorFlow | - | .ckpt | Description: Custom Protobuf format. Can result in arbitrary code execution. |
TFLite | Yes | TensorFlow | - | .tflite | Description: Modified binary flatbuffer file |
TFJS | Yes | TensorFlow | - | - | Description: JSON file and binary file with weights. Technically not a singular file format. |
TF1 Hub format (deprecated) | Yes | TensorFlow | - | - | Description: Custom Protobuf format. |
Tensorizer | Yes | CoreWeave | - | - | Not uncommon, especially in private production systems |
TFRecords | Yes | TensorFlow | - | .tfrecords | Description: Wrapper around a Protocol Buffer |
NPY | Yes | NumPy | - | .npy | Can contain pickled objects; older NumPy versions loaded them by default. |
NPZ | Yes | NumPy | - | .npz | Description: ZIP file of NPY files |
GGUF | Yes | llama.cpp/GGML | - | .gguf | - |
GGML | Yes | llama.cpp/GGML | - | .ggml | - |
GGMF (deprecated) | Yes | llama.cpp/GGML | - | .ggmf | - |
GGJT (deprecated) | Yes | llama.cpp/GGML | - | .ggjt | - |
NetCDF | Yes | - | - | .nc | - |
PMML | Yes | - | - | - | - |
MLeap | Yes | Spark | - | .mleap | - |
CoreML | Yes | Apple | - | .mlmodel, .mlpackage | - |
MLflow Format | Yes | MLflow | - | - | - |
MLflow TensorSpec input format | Yes | MLflow | - | - | - |
SurrealML | Yes | SurrealDB | - | .surml | - |
Llamafile | Yes | - | - | .llamafile | - |
.prompt | Yes | HumanLoop | - | .prompt | - |
Pickle | No | Python | PolyFile | .pkl | Refer to Fickling |
Joblib | No | - | PolyFile | - | - |
NeMo | Yes | NVIDIA | - | - | - |
Riva | Yes | NVIDIA | - | - | - |
Avro | No | - | - | - | - |
Parquet | No | - | - | - | - |
ORC | No | - | - | - | - |
JSON | No | - | PolyFile | - | - |
CSV | No | - | - | - | - |
Protocol Buffers | No | - | - | - | Usually an underlying file format |
HDF5 | No | - | - | .h5 | - |
Caffe | Yes | Caffe | - | .caffemodel, .prototxt | Description: Protobuf-based file format |
ArmNN Flatbuffers | Yes | ArmNN | - | - | - |
Cambricon | Yes | - | - | - | - |
Circle | Yes | - | - | - | - |
ZIP | No | - | PolyFile | - | Usually an underlying file format |
CNTK v1 (deprecated) | Yes | Microsoft Cognitive Toolkit | - | - | - |
CNTK v2 | Yes | Microsoft Cognitive Toolkit | - | - | Description: Protobuf-based file format |
Darknet | Yes | Hank.ai Darknet | - | - | - |
DL4J | Yes | DL4J | - | - | Description: ZIP-based file format |
Deep Learning Container (DLC) | Yes | Qualcomm Neural Processing SDK | - | .dlc | - |
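
Because many entries above share extensions (`.pt`, `.pth`, `.bin`) or wrap a generic container such as ZIP, identification tooling usually has to look at file contents instead of names. The following is a minimal sketch of that idea using only the Python standard library. It is not how Fickling or PolyFile work; the function name `sniff_format`, the `MAGIC_PREFIXES` table, and the returned labels are hypothetical and chosen for illustration. It covers only a handful of formats from the table and gives a best-effort guess, not a security verdict.

```python
import io
import json
import struct
import zipfile

# Leading magic bytes for a few formats from the table (illustrative subset).
MAGIC_PREFIXES = [
    (b"GGUF", "GGUF"),
    (b"\x93NUMPY", "NPY"),
    (b"\x89HDF\r\n\x1a\n", "HDF5"),
]


def sniff_format(data: bytes) -> str:
    """Return a best-effort guess of the format of `data` (hypothetical helper)."""
    for prefix, name in MAGIC_PREFIXES:
        if data.startswith(prefix):
            return name

    # ZIP-based formats: inspect member names to narrow the guess.
    if zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            names = zf.namelist()
        if any(n.endswith("data.pkl") for n in names):
            return "ZIP with data.pkl (PyTorch-style)"
        if names and all(n.endswith(".npy") for n in names):
            return "NPZ"
        return "ZIP (unrecognized contents)"

    # Safetensors: 8-byte little-endian header length, then a JSON header.
    if len(data) >= 9 and data[8:9] == b"{":
        (header_len,) = struct.unpack("<Q", data[:8])
        if 0 < header_len <= len(data) - 8:
            try:
                json.loads(data[8 : 8 + header_len])
                return "Safetensors (likely)"
            except ValueError:
                pass  # Not valid JSON; fall through to the other checks.

    # Pickle protocol 2 and later starts with the PROTO opcode (0x80)
    # followed by the protocol number.
    if len(data) >= 2 and data[0] == 0x80 and 2 <= data[1] <= 5:
        return "Pickle (protocol 2+)"

    return "unknown"
```

For a file on disk, something like `sniff_format(open(path, "rb").read())` would apply the same checks. A ZIP result only narrows the candidates: telling the PyTorch and TorchScript versions apart still requires inspecting the archive members described in the table.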