You can use the 70B parameter model now as well. Here is how I accomplished it:
I downloaded the 70B parameter model I wanted from https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/tree/main. In my case, I chose 'llama-2-70b-chat.ggmlv3.q5_K_M.bin'. None of my runs so far has used much more than 6-8 GB of RAM. You need to modify 'config/config.yml' to point to your newly downloaded model.
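For reference, the change to config/config.yml might look something like this. The key names below are assumptions for illustration, not taken from the repo's actual config; keep your existing keys and only update the model path:

```yaml
# Hypothetical sketch of config/config.yml — key names are illustrative.
# Only the model path needs to change to point at the 70B file.
ctransformers:
  model_path: ./models/llama-2-70b-chat.ggmlv3.q5_K_M.bin
  model_type: llama
```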
I updated the CTransformers package to the latest version, which adds support for 70B (ctransformers 0.2.15 or higher):
poetry run pip install ctransformers --upgrade
I also updated langchain (I had done this first, but I'm not sure it's required):
poetry run pip install langchain --upgrade
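If you want to sanity-check the installed ctransformers version against the 0.2.15 minimum before running, a small helper like this works (the function names are mine, not from the repo; compare against `ctransformers.__version__` in your environment):

```python
# Minimal sketch of the version requirement described above.
# Helper names here are illustrative, not part of any library.
def parse_version(v: str) -> tuple:
    """Turn a dotted version string like '0.2.15' into (0, 2, 15)."""
    return tuple(int(part) for part in v.split("."))

def supports_70b(installed: str, minimum: str = "0.2.15") -> bool:
    """True if the installed ctransformers version meets the 70B minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

Tuple comparison handles multi-digit components correctly (e.g. 0.2.15 > 0.2.9), which a plain string comparison would get wrong.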
Now it runs! It is much slower, though: a run that took under 1 minute now takes almost 10 minutes.