Musing over AI
- large_language_hashmap.ipynb -> Trying to understand whether an LLM approximates a high-dimensional hash-map lookup (a toy sketch of the analogy follows below)
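A toy illustration of the hash-map analogy, not the notebook's actual code: treat the prompt's embedding as a "key" and retrieve the nearest stored "value" by cosine similarity, the way an LLM might approximate a lookup. The store and the `continuation_*` values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 64

# Hypothetical "memorized" key -> value store in embedding space.
keys = rng.normal(size=(1000, dim))
values = [f"continuation_{i}" for i in range(1000)]

def lookup(query: np.ndarray) -> str:
    # Cosine similarity acts as a soft, high-dimensional analogue of a hash lookup.
    sims = keys @ query / (np.linalg.norm(keys, axis=1) * np.linalg.norm(query))
    return values[int(np.argmax(sims))]

query = keys[42] + 0.1 * rng.normal(size=dim)  # slightly perturbed key
print(lookup(query))  # most likely still retrieves "continuation_42" despite the noise
```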
- is_llm_thinking_in_english.ipynb -> Inspecting the tokens predicted by the hidden layers of an LLM during Tamil-to-Hindi translation. Judging by the hidden-layer token predictions, the model appears to first translate the content to English and then to the target language.
- whats_llm_predicting_internally.ipynb -> Continuation of is_llm_thinking_in_english.ipynb that checks the top-k tokens predicted by the hidden layers (a logit-lens sketch follows below)
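A minimal logit-lens sketch in the spirit of the references below, assuming the converted Llama-2-7B-hf checkpoint produced by the conversion steps further down (any causal LM checkpoint works the same way, and the prompt is only illustrative): project each layer's hidden state through the final norm and the unembedding matrix to see which tokens the model is "predicting" mid-computation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "Llama-2-7B-hf"  # path produced by the conversion step below (assumption)
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)
model.eval()

prompt = "தமிழ்: நன்றி => हिन्दी:"  # illustrative Tamil -> Hindi prompt
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple of (num_layers + 1) tensors, one per layer.
for layer_idx, hidden in enumerate(out.hidden_states):
    last_token = hidden[0, -1]                            # state at the final position
    logits = model.lm_head(model.model.norm(last_token))  # the logit lens
    top = torch.topk(logits, k=5)
    print(layer_idx, tokenizer.convert_ids_to_tokens(top.indices.tolist()))
```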
- is_maths_also_language.ipynb -> Checking whether translation affects the reasoning abilities of the model. Trying simple arithmetic with Devanagari numerals and inspecting what the internal state shows (a numeral-conversion sketch follows below)
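A small helper one might use to build such arithmetic prompts (illustrative, not the notebook's actual code): Devanagari digits ०-९ map one-to-one onto 0-9, so arithmetic problems can be rendered in either script and the hidden-layer predictions compared.

```python
# Map ASCII digits to their Devanagari counterparts.
DEVANAGARI_DIGITS = str.maketrans("0123456789", "०१२३४५६७८९")

def to_devanagari(n: int) -> str:
    return str(n).translate(DEVANAGARI_DIGITS)

print(to_devanagari(23), "+", to_devanagari(45), "=", to_devanagari(23 + 45))
# -> २३ + ४५ = ६८
```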
- journey_of_information.ipynb -> An attempt to understand what information even is, how language captures it, how well embeddings capture it, and whether there are any issues (a crude embedding probe is sketched below)
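One crude way to probe how much information an embedding keeps, sketched here as an illustration rather than the notebook's actual code (reusing the `tokenizer` and `model` from the logit-lens sketch above; the example sentences are made up): mean-pool the last hidden states as a sentence embedding and check that a paraphrase scores higher than an unrelated sentence.

```python
import torch
import torch.nn.functional as F

def embed(text: str) -> torch.Tensor:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
    return hidden[0].mean(dim=0)  # mean-pool over token positions

a = embed("The cat sat on the mat.")
b = embed("A cat was sitting on a mat.")   # paraphrase: should score higher
c = embed("Quarterly revenue fell by 8%.") # unrelated: should score lower
print(F.cosine_similarity(a, b, dim=0), F.cosine_similarity(a, c, dim=0))
```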
- The code here takes inspiration from *Do Llamas Work in English? On the Latent Language of Multilingual Transformers* (Code) and *interpreting GPT: the logit lens* (Code)
- The code can also be run on an M2 MacBook Pro with MPS (a device-selection sketch follows below)
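A common device pick-up pattern (not repo-specific code) so the same notebooks run on an M2 MacBook Pro via MPS, a CUDA GPU, or plain CPU:

```python
import torch

if torch.backends.mps.is_available():
    device = torch.device("mps")
elif torch.cuda.is_available():
    device = torch.device("cuda")
else:
    device = torch.device("cpu")

model = model.to(device)  # `model` as loaded in the sketches above
```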
- The Hindi-to-Tamil translation data was created manually
- Weights for Llama-2 were downloaded from: https://llama.meta.com/llama-downloads/
- Used the following steps to convert the weights to a Hugging Face-compatible format:

```bash
git clone [email protected]:huggingface/transformers.git
python src/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir ~/src/open/llama --output_dir ~/src/open/llm-latent-language --model_size 7B --llama_version 2
```
- Setup code to run this on a remote GPU:

```bash
# Generate an SSH key, add it to the agent, and register the public key with GitHub
ssh-keygen -t rsa -b 4096 -C <github_email>
eval "$(ssh-agent -s)"
ssh-add /root/.ssh/id_rsa
cat /root/.ssh/id_rsa.pub

# Clone the Llama weights repo and the transformers repo (for the conversion script)
git clone [email protected]:meta-llama/llama.git
git clone [email protected]:huggingface/transformers.git
cp /home/root/transformers/src/transformers/models/llama/convert_llama_weights_to_hf.py /home/root
rm -rf /home/root/transformers  # keep only the conversion script, to save space on the GPU box

git clone [email protected]:epfl-dlab/llm-latent-language.git
mkdir /home/root/llm-latent-language/Llama-2-7B-hf

# Download the raw Llama-2-7B weights, then convert them to the Hugging Face format
cd llama
./download.sh
mv /home/root/llama/llama-2-7b/* /home/root/llama
python /home/root/convert_llama_weights_to_hf.py --input_dir /home/root/llama --output_dir /home/root/llm-latent-language/Llama-2-7B-hf --model_size 7B --llama_version 2 && rm -rf /home/root/llama

pip install -U bitsandbytes
```
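The `pip install -U bitsandbytes` step above suggests loading the model quantized on the remote GPU. A minimal sketch of such a load, assuming the converted checkpoint path from the steps above (the 8-bit setting is an assumption, not something the repo specifies):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_path = "/home/root/llm-latent-language/Llama-2-7B-hf"
quant_config = BitsAndBytesConfig(load_in_8bit=True)  # 8-bit weights to fit a smaller GPU

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=quant_config,
    device_map="auto",  # place layers on the available GPU automatically
)
```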