Neural Magic
Neural Magic helps developers accelerate machine learning performance using automated model sparsification techniques and inference technologies.
Repositories
- nm-vllm-certs Public
General information, model certifications, and benchmarks for nm-vllm enterprise distributions
- flash-attention Public (forked from vllm-project/flash-attention)
Fast and memory-efficient exact attention
- compressed-tensors Public
A safetensors extension to efficiently store sparse quantized tensors on disk
- upstream-transformers Public (forked from huggingface/transformers)
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
- evalplus Public (forked from evalplus/evalplus)
Neural Magic fork of EvalPlus (Rigorous evaluation of LLM-synthesized code - NeurIPS 2023)
- temp-llm-compressor Public (forked from vllm-project/llm-compressor)
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM