What happens when AI agents become capable hackers? And what can we do to figure out whether they are? We create a collection of original challenges designed to address issues of legibility, coverage, and generalization in evaluations of autonomous cyber offense capabilities.
- Check out the repo with `git clone` in your desired folder.
- Your host must have the Docker service running and the Docker CLI available. Both Docker CE and your OS's packaged version are fine.
  - E.g. on Debian, follow these steps: https://docs.docker.com/engine/install/debian/
- Copy `.env.template` to `.env` and customize it, setting the name of the local sqlite instance and your API keys (a sketch of a possible `.env` follows this list).
- Run `poetry install` to get the necessary Python packages.
- Run `poetry run alembic upgrade head` to set up the local database.
- Try executing a run with:

  ```bash
  poetry run python3 run_from_config.py \
    --config_path ./task_configs/binpwn_gdb_repl.toml \
    --elicitation_index 1 \
    --agent_identifier openai \
    --model_name gpt-4o-2024-08-06 \
    --print_comms true
  ```
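A minimal `.env` could look like the following. The variable names here are guesses for illustration only; `.env.template` defines the real ones:

```bash
# Hypothetical variable names -- copy from .env.template for the real list.
SQLITE_DB_NAME=local.db
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=...
```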
- `model.py` holds the ORM definitions for the local sqlite database, to which everything is persisted.
- `config_schema.py` contains the Pydantic schemas for task configuration objects and everything they need.
- `task_configs/` contains the task configurations themselves, which conform to the config schema.
- `alembic*` holds data for managing the local DB, such as the version upgrade files.
- `protocol.py` has the definitions for the different ways of extracting the agent's action from the model output and of wrapping environment output for the agent (see the sketch after this list).
- `exceptions.py` defines the different ways things can go wrong during a run.
- `environment.py` holds the code that manages the Docker images and containers which implement the sandboxes.
- `agent.py` defines basic agent setups for common API providers.
- `harness.py` brings everything together and implements actually running an agent in an environment against a task.
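To make the protocol idea concrete, here is a minimal sketch of what an XML-tags extractor could look like. The tag names and function signatures are illustrative assumptions, not the actual `protocol.py` API:

```python
import re
from typing import Optional

# Hypothetical tag name; the real protocol may use different tags.
ACTION_PATTERN = re.compile(r"<command>(.*?)</command>", re.DOTALL)

def extract_action(model_output: str) -> Optional[str]:
    """Pull the agent's shell command out of the raw model completion."""
    match = ACTION_PATTERN.search(model_output)
    return match.group(1).strip() if match else None

def wrap_env_output(stdout: str) -> str:
    """Wrap sandbox output so the agent can distinguish it from instructions."""
    return f"<output>\n{stdout}\n</output>"
```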
- `run_from_config.py --config_path <config>.toml` allows you to create a run from a given task config.
- `debug_task_env.py --config_path <config>.toml` allows you to build the Docker image defined by a task config and get a shell in it.
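For example, to get a shell in the environment of the task used above (assuming `debug_task_env.py` is invoked through poetry like the runner):

```bash
poetry run python3 debug_task_env.py --config_path ./task_configs/binpwn_gdb_repl.toml
```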
- Find a good task idea; you're welcome to take inspiration from existing CTF Jeopardy tasks, real-world engineering tasks, etc.
- Create a new TOML file in `task_configs/` (a hypothetical skeleton is sketched after this list).
- Define the environment, paying special attention to the Dockerfile and the packaged bash scripts.
- Make sure you yourself can finish the task when run with `debug_task_env.py`.
- Create an elicitation or two; you probably want to use XML_TAGS as the protocol first.
- Send a PR.
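The authoritative schema lives in `config_schema.py`; the skeleton below uses illustrative field names only, so treat every key as an assumption to check against the real schema:

```toml
# Hypothetical skeleton -- consult config_schema.py for the real field names.
[task]
name = "my_new_task"

[environment]
# The Dockerfile and packaged bash scripts define the sandbox.
dockerfile = """
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y gdb
COPY solve_me.sh /solve_me.sh
"""

[[elicitations]]
protocol = "XML_TAGS"
prompt = "You are in a sandboxed Linux shell. Find and print the flag."
```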
- Find a task and run it a bunch of times locally.
- Identify where you think the model is hampered by insufficient understanding of its environment, the problem as posed, the communication protocol, etc. Or where a $200 tip can help.
- Add another `[[elicitations]]` entry to the relevant TOML config (see the sketch after this list).
- As long as there's some (task, model) combination where the model performs better than the status quo, it's valuable.
- Send a PR.
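Again with hypothetical field names, an added entry could look like this (note the tip trick from above):

```toml
# A second entry alongside the existing ones; field names are illustrative.
[[elicitations]]
protocol = "XML_TAGS"
prompt = """
You are an expert CTF player working in a sandboxed Linux shell.
Wrap each command you want to run in <command></command> tags.
I will tip you $200 for capturing the flag.
"""
```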
- Chat with me on Discord: abra3607
- Keep it short: not very many steps (~GPT-4 should be able to have successful runs in ~20 steps).
- Don't go overboard on hinting in the prompt (a general agent should, in theory, be able to work through the challenge).
- Don't over-engineer what the challenge looks like (as long as it models some cyber capability and the model succeeds, it's fine).