Sentence Embedding Benchmark

Installation

Because the benchmark downloads many models and datasets from Hugging Face, you need to create a `.env` file and add your Hugging Face token; otherwise, downloading the required models and datasets might fail.
```
HF_TOKEN="<your_huggingface_token>"
```
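How the notebooks read this file is not shown here, so the following is only a minimal sketch of one way to do it, using nothing but the standard library. The helper name `load_env_file` is hypothetical and not part of this repository; the notebooks may well use a package such as `python-dotenv` instead. The sketch parses simple `KEY="value"` lines and exports them as environment variables, which is where `huggingface_hub` looks for `HF_TOKEN`.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY="value" lines and export them to os.environ.

    Hypothetical helper for illustration; not taken from this repository.
    Returns a dict of the key/value pairs found in the file.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        # No .env file: return an empty mapping instead of raising.
        return loaded

    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        # Drop surrounding whitespace and optional quotes around the value.
        value = value.strip().strip('"').strip("'")
        loaded[key] = value
        # Variables already set in the environment take precedence.
        os.environ.setdefault(key, value)
    return loaded


if __name__ == "__main__":
    load_env_file()
    print("HF_TOKEN set:", "HF_TOKEN" in os.environ)
```

Calling `load_env_file()` at the top of a notebook, before any model or dataset is downloaded, would make the token available to the rest of the session.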
To run the benchmark, open the notebook for the task you want to run, e.g. `clustering_benchmarks.ipynb`, and run all cells.

To display the results and create all associated plots, run the corresponding plot notebook, e.g. `clustering_plots.ipynb`. The available tasks and plots are:

| Task | Plot |
| --- | --- |
| `clustering_benchmarks.ipynb` | `clustering_plots.ipynb` |
| `clustering_benchmarks_cutoff.ipynb` | `clustering_plots_cutoff.ipynb` |
| `retrieval_benchmark_cqa.ipynb` | `retrieval_plots.ipynb` |
| `retrieval_benchmark_nqa_chunking.ipynb` | `retrieval_plots.ipynb` |
| `retrieval_benchmark_nqa_seq.ipynb` | `retrieval_plots.ipynb` |
| `sts_benchmarks.ipynb` | `sts_plots.ipynb` |
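If you prefer to run a task and its plot notebook without opening them interactively, the pairing above can be expressed as a small lookup. This is only a sketch under assumptions: the names `TASK_TO_PLOT`, `nbconvert_command`, and `run_task` are hypothetical, and it assumes Jupyter with `nbconvert` is installed and that the notebooks run unattended, which this README does not confirm.

```python
import subprocess

# Task notebook -> plot notebook, copied from the list of tasks and plots.
TASK_TO_PLOT = {
    "clustering_benchmarks.ipynb": "clustering_plots.ipynb",
    "clustering_benchmarks_cutoff.ipynb": "clustering_plots_cutoff.ipynb",
    "retrieval_benchmark_cqa.ipynb": "retrieval_plots.ipynb",
    "retrieval_benchmark_nqa_chunking.ipynb": "retrieval_plots.ipynb",
    "retrieval_benchmark_nqa_seq.ipynb": "retrieval_plots.ipynb",
    "sts_benchmarks.ipynb": "sts_plots.ipynb",
}


def nbconvert_command(notebook):
    """Build the command that executes a notebook in place with nbconvert."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",
        "--inplace",
        notebook,
    ]


def run_task(task_notebook, dry_run=True):
    """Return the commands for a task and its plot notebook, in run order.

    With dry_run=False the commands are also executed (assumes Jupyter is
    installed and the notebooks need no manual input).
    """
    if task_notebook not in TASK_TO_PLOT:
        raise KeyError(f"Unknown task notebook: {task_notebook}")
    commands = [
        nbconvert_command(task_notebook),
        nbconvert_command(TASK_TO_PLOT[task_notebook]),
    ]
    if not dry_run:
        for command in commands:
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    for command in run_task("sts_benchmarks.ipynb"):
        print(" ".join(command))
```

The default `dry_run=True` only prints the commands, so you can check them before starting a long benchmark run.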
