Hi all! TL;DR, the question is basically the title - are there any flags for targeting as many projectors as possible and setting a super high rank?

I'm a recent onboardee to MLX - having previously been doing a bunch of fine-tuning on Google Colab using this really great software, Unsloth. I'm currently researching whether knowledge injection is possible without in-context learning on consumer-grade hardware (so far I've achieved some really promising results!!).

Disclaimer: I'm out of my depth in that I don't know what a lot of these things mean. But I was told by someone in the community that for knowledge injection you've got to train all layers, with all projectors, and with a really high rank if you're using QLoRA (I was doing r = 256). I asked the resident MLX Guru if there were any flags, but alas, the Guru did not know (it didn't even think that MLX supported QLoRA... here's the link to my chat if the curator of that GPT wants to debug it).

In any case, I hope that I'm not posting this to the wrong place - I'm not really a GitHub native, and I only picked up fine-tuning last weekend, so really sorry if any of these are stupid questions! (Like I'm thinking that this whole approach might be really dumb - if I'm trying to crank the rank up really high to 256... should I even be using low-rank adaptation in the first place?)

P.S. Super appreciate what the community is doing with this; being able to fine-tune on my Mac means I can do this research without worrying about spending money on compute that I could've spent on taking my partner out for a nice lunch 😂
-
Lol yea I guess the Guru needs an update (CC @sck-at-ucy).
Sort of but not really. Right now you can make all the layers LoRA layers with
--lora-layers 32
(if your model has 32 blocks). The other flags you have to change manually, though it's on our TODO list to make them configurable with a config file. If you prefer not to wait, you can change the code manually:
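For reference, here is a rough sketch of the kind of manual edit being described: widening the layer-conversion loop in `lora.py` so it wraps more projections and uses a higher rank. It assumes a Llama-style model with `q_proj`/`k_proj`/`v_proj`/`o_proj` attention projections and `gate_proj`/`up_proj`/`down_proj` MLP projections, and that `LoRALinear.from_linear` accepts a rank argument (the keyword may be `rank` or `r` depending on your version) - check the actual code in your checkout before copying anything.

```python
# Sketch only -- adapt the loop that already exists in your copy of
# mlx-examples/lora/lora.py. The module names (q_proj, k_proj, v_proj, o_proj,
# gate_proj, up_proj, down_proj) assume a Llama-style model, and the `rank`
# keyword is an assumption -- check LoRALinear.from_linear's signature.
RANK = 256

model.freeze()
for layer in model.model.layers[-args.lora_layers:]:
    attn, mlp = layer.self_attn, layer.mlp
    # The stock loop typically only wraps q_proj and v_proj at the default rank;
    # wrapping the remaining projections targets "as many projectors as possible".
    attn.q_proj = LoRALinear.from_linear(attn.q_proj, rank=RANK)
    attn.k_proj = LoRALinear.from_linear(attn.k_proj, rank=RANK)
    attn.v_proj = LoRALinear.from_linear(attn.v_proj, rank=RANK)
    attn.o_proj = LoRALinear.from_linear(attn.o_proj, rank=RANK)
    mlp.gate_proj = LoRALinear.from_linear(mlp.gate_proj, rank=RANK)
    mlp.up_proj = LoRALinear.from_linear(mlp.up_proj, rank=RANK)
    mlp.down_proj = LoRALinear.from_linear(mlp.down_proj, rank=RANK)
```

Keep in mind that wrapping every projection in every block at rank 256 increases the number of trainable parameters and memory use quite a lot, so you may need to lower the batch size to keep it within your Mac's memory.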