# LLM Benchmark Visualisations

!! DISCLAIMER: This is a work-in-progress experiment in early alpha; there is a lot of work to be done before it becomes a useful tool. The repository was archived on 19 October 2024 and is now read-only. !!


This project visualises and tracks the performance of various Large Language Models (LLMs) across different benchmarks. The visualisations aim to help in understanding trends, comparing models, and predicting future performance.

*(screenshot)*

## Features

- **Data Entry:** Easily add new benchmark data for models.
- **Visualisation:** Interactive charts showing model performance over time.
- **Predictive Analysis:** Predict future performance based on historical data.

## Getting Started

### Prerequisites

- Node.js (v22+)

### Installation

1. Clone the repository:

   ```shell
   git clone https://github.com/sammcj/closing-the-gap.git
   cd closing-the-gap
   ```

2. Install dependencies:

   ```shell
   npm install
   ```

3. Start the development server:

   ```shell
   npm start
   ```

4. Access the application in your browser at http://localhost:3000.

## Project Structure

The project is structured as follows:

- `public/`: Static `index.html`.
- `src/`: Source code for the application.
  - `components/`: Reusable UI components.
    - `DataEntryForm.js`: Form to add new benchmark data.
    - `LLMBenchmarkVisualisation.js`: Component to visualise benchmark data using Chart.js.
    - `LLMBenchmarkDashboard.js`: Dashboard to display benchmark data and predictions.
    - `LeftPanel.js`: Side panel to display model information.
  - `config.js`: Configuration settings for the application, including chart colors and titles.
  - `App.js`: Main application component that integrates all other components.
- `server.js`: Express server to serve static files and API endpoints (a minimal sketch follows this list).
- `ingest/`: Scripts to aid with data ingestion (not used by the app itself).
- `package.json`: Project metadata and scripts.
- `llm_bechmarks.db`: SQLite database that stores benchmark data.
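
To make the role of `server.js` concrete, here is a minimal sketch of an Express server that serves the static `public/` directory and exposes a JSON API. The `/api/benchmarks` route and the sample payload are assumptions for illustration; the actual `server.js` may differ.

```javascript
// Minimal sketch of what server.js might look like (assumed shape, not the actual code).
const express = require('express');
const path = require('path');

const app = express();
app.use(express.json());

// Serve the static front end from public/.
app.use(express.static(path.join(__dirname, 'public')));

// Hypothetical API endpoint returning benchmark data as JSON;
// the real route names and storage access may differ.
app.get('/api/benchmarks', (req, res) => {
  res.json([
    { date: '2024-06-01', model: 'example-model', benchmark: 'MMLU', score: 88.7, open: true },
  ]);
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => console.log(`Listening on http://localhost:${PORT}`));
```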

## Usage

1. **GUI Data Entry:** Use the DataEntryForm component to add new benchmark data for models. This includes entering dates, selecting models and benchmarks, recording scores, and indicating whether the model is open or closed.

2. **CLI Data Entry:** Add correctly formatted JSON benchmark results to `ingest/import.json` and run `node ingest/ingest.js` (an example entry is sketched after this list).

3. **Visualisation:** The LLMBenchmarkVisualisation component provides interactive charts that show the performance of different models over time, with predictions based on historical data trends.

4. **Predictive Analysis:** Historical data is used to predict future performance, helping in understanding model growth and potential improvements (a trend-line sketch follows this list).
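
The exact schema expected by `ingest/ingest.js` is not documented here. Based on the fields captured by the GUI form (date, model, benchmark, score, open/closed), an `ingest/import.json` entry might look like the following; the field names are assumptions, not the confirmed format.

```json
[
  {
    "date": "2024-06-01",
    "model": "example-model",
    "benchmark": "MMLU",
    "score": 88.7,
    "open": true
  }
]
```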
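
The prediction method is not specified in this README. One simple approach consistent with "predictions based on historical data trends" is a least-squares linear trend over (date, score) pairs, sketched below in plain JavaScript as an illustrative assumption rather than the project's actual algorithm.

```javascript
// Illustrative only: least-squares linear trend over (timestamp, score) pairs.
// The project's actual prediction logic may differ.
function linearTrend(points) {
  const n = points.length;
  const meanX = points.reduce((s, p) => s + p.x, 0) / n;
  const meanY = points.reduce((s, p) => s + p.y, 0) / n;
  let num = 0;
  let den = 0;
  for (const p of points) {
    num += (p.x - meanX) * (p.y - meanY);
    den += (p.x - meanX) ** 2;
  }
  const slope = num / den;
  const intercept = meanY - slope * meanX;
  return x => slope * x + intercept; // predict a score at a future timestamp
}

// Example: project a score six months beyond the last data point.
const history = [
  { x: Date.UTC(2023, 0, 1), y: 60 },
  { x: Date.UTC(2023, 6, 1), y: 68 },
  { x: Date.UTC(2024, 0, 1), y: 75 },
];
const predict = linearTrend(history);
console.log(predict(Date.UTC(2024, 6, 1)).toFixed(1));
```

A straight-line fit is the crudest possible extrapolation; real benchmark scores saturate near their ceiling, so a logistic or capped fit would usually be more defensible.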

## Contributing

Contributions are welcome! Please follow these steps:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Make your changes and test them thoroughly.
  4. Submit a pull request with a clear description of your changes.

## License

Copyright 2024 Sam McLeod

This project is licensed under the MIT License - see the LICENSE file for details.