All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
- Add training data chunk environment variable
- Enable importance matrix for all quant formats
- Always compute an imatrix file
- Update documentation about quantization
- Fix renaming of the convert Python script from llama.cpp
- Use existing importance matrix files for all quant formats
- Move importance matrix files into dedicated directory
- Simplify conversion from HF to GGUF
- Changed binary names to the new llama.cpp format (llama-*)
- Update list of supported quantization types
- Remove logging of repository directories
- Fix check when an importance matrix is required
- Update supported quantization types
- Add support for using unquantized models in the GGUF format from the source
- Add fallback to `convert-hf-to-gguf.py` to support novel model architectures
- Add support for models with Byte Pair Encoding (BPE) vocabulary type
- Update documentation
- Change filenames to match the de facto standard
- Add support for IQ2_XXS, IQ2_XS and Q2_K_S quantization types
- Update list of supported quantization types
- Fix resolving of paths
- Add `.env` configuration
- Add documentation
- Add download script
- Add quantization script