This is the official GitHub repository for the paper Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models by Taiqiang Wu, Chaofan Tao, Jiahao Wang, Runming Yang, Zhe Zhao, and Ngai Wong.
TL;DR: We provide a deeper insight into the forward KL (FKL) and reverse KL (RKL) in knowledge distillation (KD) for LLMs, and then propose a novel AKL based on this analysis.
Conclusion:
In KD for LLMs, the mean-seeking and mode-seeking behaviors do not hold for forward KL (FKL) and reverse KL (RKL), respectively. Instead, the two share the same optimization objective. Meanwhile, in the early epochs FKL focuses on the head part of the teacher distribution and RKL focuses on the tail part.
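For reference, the FKL and RKL objectives discussed above can be written as token-level losses over teacher and student logits. Below is a minimal PyTorch sketch using the standard definitions KL(p||q) and KL(q||p); the function names, shapes, and reduction are illustrative assumptions and not code from this repository.

```python
# Minimal sketch (not from this repo) of token-level FKL and RKL losses
# between teacher and student logits, using standard PyTorch ops.
import torch
import torch.nn.functional as F

def forward_kl(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> torch.Tensor:
    """FKL = KL(p_teacher || q_student), averaged over the first dimension."""
    teacher_log_probs = F.log_softmax(teacher_logits, dim=-1)
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    # kl_div(input, target, log_target=True) computes sum of exp(target) * (target - input),
    # so input is the student log-probs and target is the teacher log-probs.
    return F.kl_div(student_log_probs, teacher_log_probs,
                    log_target=True, reduction="batchmean")

def reverse_kl(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> torch.Tensor:
    """RKL = KL(q_student || p_teacher), averaged over the first dimension."""
    teacher_log_probs = F.log_softmax(teacher_logits, dim=-1)
    student_log_probs = F.log_softmax(student_logits, dim=-1)
    # Roles swapped: the student distribution is now the target of kl_div.
    return F.kl_div(teacher_log_probs, student_log_probs,
                    log_target=True, reduction="batchmean")
```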
To reproduce the toy examples, please refer to `toy_examples/FR_KL.ipynb` and `toy_examples/FR_compare.ipynb`.
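As a self-contained illustration of the conclusion above (not a reproduction of the notebooks), the sketch below fits a categorical student to a fixed categorical teacher by gradient descent on FKL and on RKL. Since the student covers the same support as the teacher, both objectives drive it to the same solution; the toy teacher and optimizer settings here are assumptions for illustration only.

```python
# Toy comparison: gradient descent on FKL vs. RKL for a categorical student.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
teacher_probs = torch.tensor([0.6, 0.25, 0.1, 0.04, 0.01])  # head-heavy toy teacher

def fit(objective: str, steps: int = 2000, lr: float = 0.1) -> torch.Tensor:
    logits = torch.zeros(5, requires_grad=True)       # student starts uniform
    opt = torch.optim.Adam([logits], lr=lr)
    for _ in range(steps):
        q = F.softmax(logits, dim=-1)
        if objective == "fkl":                        # KL(p || q)
            loss = (teacher_probs * (teacher_probs.log() - q.log())).sum()
        else:                                         # KL(q || p)
            loss = (q * (q.log() - teacher_probs.log())).sum()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return F.softmax(logits, dim=-1).detach()

print("teacher:", teacher_probs.tolist())
print("FKL fit:", [round(v, 3) for v in fit("fkl").tolist()])
print("RKL fit:", [round(v, 3) for v in fit("rkl").tolist()])
```

Both fits end up close to the teacher distribution, consistent with the claim that FKL and RKL share the same optimization objective in this setting.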
Please follow minillm to set up the environment and prepare the dataset.
Introduce the AKL into the KD setting (mainly on this line); a hedged sketch of such an adaptive loss is given after these steps.
Then run the experiments and evaluate the student.
For results on Winogrande, OpenBookQA, BoolQ, and ARC, please use this tool.
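The exact AKL definition and where it plugs into training are given in the paper and the code line linked above. The sketch below is only one plausible reading of the analysis: it combines FKL and RKL with weights based on how much of the teacher-student gap falls in the head versus the tail of the teacher distribution. The function `adaptive_kl`, the 0.5 head-mass cutoff, and the weighting rule are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of an adaptive FKL/RKL combination in the spirit of AKL:
# more weight on FKL when the teacher-student gap lies in the head,
# more weight on RKL when it lies in the tail.
import torch
import torch.nn.functional as F

def adaptive_kl(student_logits: torch.Tensor,
                teacher_logits: torch.Tensor,
                head_mass: float = 0.5) -> torch.Tensor:
    p = F.softmax(teacher_logits, dim=-1)            # teacher distribution
    log_p = F.log_softmax(teacher_logits, dim=-1)
    q = F.softmax(student_logits, dim=-1)            # student distribution
    log_q = F.log_softmax(student_logits, dim=-1)

    # Mark the "head" of the teacher distribution: the highest-probability
    # tokens whose cumulative mass stays within `head_mass` (illustrative cutoff).
    sorted_p, order = torch.sort(p, dim=-1, descending=True)
    head_sorted = (torch.cumsum(sorted_p, dim=-1) <= head_mass).float()
    head = torch.zeros_like(p).scatter(-1, order, head_sorted)

    # Teacher-student gap accumulated on head vs. tail tokens.
    gap = (p - q).abs()
    g_head = (gap * head).sum(dim=-1)
    g_tail = (gap * (1.0 - head)).sum(dim=-1)
    w_head = g_head / (g_head + g_tail + 1e-8)       # larger head gap -> more FKL

    fkl = (p * (log_p - log_q)).sum(dim=-1)          # KL(p || q), per position
    rkl = (q * (log_q - log_p)).sum(dim=-1)          # KL(q || p), per position
    return (w_head * fkl + (1.0 - w_head) * rkl).mean()
```

In a hypothetical KD step, `adaptive_kl(student_logits, teacher_logits)` would take the place of a plain FKL or RKL distillation loss.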
Taiqiang Wu: [email protected]
If you find this paper useful, please cite it using the following BibTeX entry.
@article{wu2024rethinking,
title={Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models},
author={Wu, Taiqiang and Tao, Chaofan and Wang, Jiahao and Yang, Runming and Zhao, Zhe and Wong, Ngai},
journal={arXiv preprint arXiv:2404.02657},
year={2024}
}