You can use the 70B parameter model now as well. Here is how I accomplished it:
I downloaded the 70B parameter model I wanted from https://huggingface.co/TheBloke/Llama-2-70B-Chat-GGML/tree/main. In my case, I chose 'llama-2-70b-chat.ggmlv3.q5_K_M.bin'. None of my runs so far has used much more than 6-8 GB of RAM. You need to modify 'config/config.yml' to point to your newly downloaded model.
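For reference, the change to config/config.yml might look something like this. The key names below are assumptions for illustration, not taken from the repo's actual config; keep your existing keys and only update the model path:

```yaml
# Hypothetical sketch of config/config.yml — key names are illustrative.
# Only the model path needs to change to point at the 70B file.
ctransformers:
  model_path: ./models/llama-2-70b-chat.ggmlv3.q5_K_M.bin
  model_type: llama
```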
I updated the CTransformers package to the latest version, which adds support for 70B (ctransformers 0.2.15 or higher):
poetry run pip install ctransformers --upgrade
I also updated langchain (I had done this first, but I'm not sure it's required):
poetry run pip install langchain --upgrade
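If you want to sanity-check the installed ctransformers version against the 0.2.15 minimum before running, a small helper like this works (the function names are mine, not from the repo; compare against `ctransformers.__version__` in your environment):

```python
# Minimal sketch of the version requirement described above.
# Helper names here are illustrative, not part of any library.
def parse_version(v: str) -> tuple:
    """Turn a dotted version string like '0.2.15' into (0, 2, 15)."""
    return tuple(int(part) for part in v.split("."))

def supports_70b(installed: str, minimum: str = "0.2.15") -> bool:
    """True if the installed ctransformers version meets the 70B minimum."""
    return parse_version(installed) >= parse_version(minimum)
```

Tuple comparison handles multi-digit components correctly (e.g. 0.2.15 > 0.2.9), which a plain string comparison would get wrong.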
Now it runs! It is much slower, though: a run that took under 1 minute now takes almost 10 minutes.