
About alpha/rank in lora #1304

Open
Vital1162 opened this issue Nov 18, 2024 · 3 comments

Comments

@Vital1162

How does $\alpha$ in LoRA affect training performance?
I usually see everyone set it to $2r$. But why?
As for the rank, I always set it to 128-256 if the dataset is large enough.
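For context on where these two hyperparameters enter, here is a minimal pure-Python sketch of a LoRA linear layer, assuming the standard formulation from the original LoRA paper, $y = Wx + \frac{\alpha}{r} BAx$ (note this differs from the $\alpha/\sqrt{r}$ scaling discussed below). Shapes and values are illustrative only.

```python
# Minimal LoRA forward pass sketch (no framework), assuming the standard
# LoRA scaling alpha / r. W is the frozen base weight; A (r x in) and
# B (out x r) form the trainable low-rank update.

def matvec(M, x):
    # Multiply matrix M (list of rows) by vector x.
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, r, alpha):
    base = matvec(W, x)                # frozen base projection
    update = matvec(B, matvec(A, x))   # B @ (A @ x), rank-r bottleneck
    scale = alpha / r                  # standard LoRA scaling factor
    return [b + scale * u for b, u in zip(base, update)]

# With the common alpha = 2r convention, the standard scale alpha / r
# is a constant 2.0 no matter which rank you pick:
for r in (8, 128, 256):
    print(r, (2 * r) / r)
```

One way to read the $\alpha = 2r$ convention: under standard $\alpha/r$ scaling, it keeps the adapter's contribution at a fixed multiplier (2) as you sweep the rank, so rank and effective step size can be tuned independently.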

@Erland366
Contributor

I think it's because in LoRA, alpha effectively scales the learning rate of the adapter, via

$$ LR_{LoRA} = \frac{\alpha}{\sqrt{r}} \times LR $$

But in finetuning, you might want to update the adapter aggressively, since your dataset is usually much smaller than the pretraining data.

My intuition is that as long as $\frac{\alpha}{\sqrt{r}}$ is greater than one, you're good to go.
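A quick numeric check of that rule of thumb: the snippet below tabulates $\alpha/\sqrt{r}$ for a few common ranks under the $\alpha = 2r$ convention. (The $\alpha/\sqrt{r}$ form matches rsLoRA-style scaling; the original LoRA paper uses $\alpha/r$ instead, so treat the threshold-of-one heuristic as an intuition, not an exact rule.)

```python
# Tabulate the effective LR multiplier alpha / sqrt(r) described above,
# for the common alpha = 2r convention. With alpha = 2r this multiplier
# equals 2 * sqrt(r), so it grows with rank and is always above one.
import math

def lr_multiplier(alpha, r):
    return alpha / math.sqrt(r)

for r in (16, 64, 128, 256):
    alpha = 2 * r
    print(f"r={r:3d}, alpha={alpha:3d} -> alpha/sqrt(r) = {lr_multiplier(alpha, r):.2f}")
```

So under $\alpha = 2r$, the $\alpha/\sqrt{r}$ condition above is comfortably satisfied at any rank; it only becomes a real constraint if you set alpha well below the rank.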

@Vital1162
Author

Vital1162 commented Nov 19, 2024

Thank you for your response @Erland366. But does dataset size affect these parameters?

@Erland366
Contributor

I've heard in the Discord that if you have a smaller dataset, you should use a smaller rank and alpha. But I haven't tested this much myself.
