What's Changed
- Support for Whisper fine-tuning after a slice assignment bug was fixed.
- Whisper inference can now take advantage of group quantization, where model parameters are stored in INT4 and decoded into FP16 on the fly as needed. The memory saving is estimated at 3.5x with minimal degradation in WER, and it can be enabled via the `use_group_quantized_linears` parallelize kwarg.
- KV caching and on-device generation are now also available for T5.
- Fixed interleaved training and validation for `IPUSeq2SeqTrainer`.
- Added notebooks for Whisper fine-tuning, Whisper group-quantized inference, embeddings models, and BART-L summarization.
- UX improvement that ensures a dataset of sufficient size is provided to the `IPUTrainer`.
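To illustrate the idea behind group quantization, here is a minimal NumPy sketch of storing weights as INT4 codes with one FP16 scale per group and decoding them back on the fly. The group size, symmetric quantization scheme, and layout are illustrative assumptions — the actual optimum-graphcore kernels may differ:

```python
import numpy as np

def group_quantize(w, group_size=64):
    """Quantize a 1-D weight vector to symmetric INT4 codes in [-8, 7],
    with one FP16 scale per group of `group_size` elements."""
    groups = w.reshape(-1, group_size)
    # Scale each group so its largest magnitude maps to the INT4 limit 7.
    scale = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scale), -8, 7).astype(np.int8)
    return q, scale.astype(np.float16)

def group_dequantize(q, scale):
    """Decode INT4 codes back to FP16 on the fly."""
    return (q.astype(np.float16) * scale).reshape(-1)

np.random.seed(0)
w = np.random.randn(256).astype(np.float32)
q, s = group_quantize(w)
w_hat = group_dequantize(q, s)
# Per-element error is bounded by half a quantization step (scale / 2).
```

Storing 4-bit codes plus one FP16 scale per 64 weights costs roughly 4.25 bits per parameter versus 16 for FP16, which is consistent with the ~3.5x memory saving quoted above.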
Commits
- Support C600 card by @katalinic-gc in #446
- Remove deprecated pod_type argument by @jimypbr in #447
- Fix inference replication factor pod type removal by @katalinic-gc in #448
- T5 enable self-attention kv caching by @kundaMwiza in #449
- Workflows: use explicit venv names and use --clear in creation by @jimypbr in #452
- Workflow: add venv with clear for code quality and doc-builder workflows by @jimypbr in #453
- Support overriding *ExampleTester class attribute values in test_examples.py by @kundaMwiza in #439
- Adding missing license headers and copyrights by @jimypbr in #454
- Fix shift tokens right usage which contains slice assignment by @katalinic-gc in #451
- Base models and notebooks for general IPU embeddings model by @arsalanu in #436
- Fix mt5 translation training ipu config by @kundaMwiza in #456
- Add back source optimum graphcore install in embeddings notebook by @arsalanu in #457
- Add parallelize kwargs as an IPU config entry by @katalinic-gc in #427
- Change tests to point to MPNet ipu config by @arsalanu in #458
- T5 enable generation optimisation by @kundaMwiza in #459
- Fix ipus per replica check in whisper cond encoder by @katalinic-gc in #461
- Check that the dataset has enough examples to fill a batch when creat… by @katalinic-gc in #462
- Add notebook for whisper finetuning by @katalinic-gc in #460
- Use index select in BART positional embedding for better tile placement by @katalinic-gc in #463
- Add group quantization for whisper by @jimypbr in #429
- Change max length adaption messages to debug by @katalinic-gc in #465
- Fix finetuning whisper notebook text by @katalinic-gc in #466
- Fix finetuning whisper notebook text v2 by @katalinic-gc in #467
- Add BART-L text summarization notebook by @jayniep-gc in #464
- Fix evaluate then train by @katalinic-gc in #469
- Use token=False in whisper nb by @katalinic-gc in #470
- Add Whisper inference with quantization notebook by @jimypbr in #468
Full Changelog: v0.7.0...v0.7.1