
eda-bench Registry

Showing 7 datasets

terminal-bench-core

head

The latest pre-release version of terminal-bench-core. Handcrafted by undergraduate, graduate, and industry researchers.

117 tasks

terminal-bench-core

0.1.0

Core task set for terminal-bench. Handcrafted by undergraduate, graduate, and industry researchers.

80 tasks

terminal-bench-core

0.1.1 (leaderboard)

Core task set for terminal-bench with recent patches. Handcrafted by undergraduate, graduate, and industry researchers.

80 tasks

deveval

head

Adapter for DevEval (https://github.com/open-compass/DevEval). Dataset PR: https://github.com/laude-institute/terminal-bench-datasets/pull/1

63 tasks

evoeval-difficult

head

Adapter for EvoEval (https://github.com/open-compass/EvoEval). Dataset PR: https://github.com/laude-institute/terminal-bench-datasets/pull/2

100 tasks

appworld-dev

head (encrypted)

Adapter for AppWorld (https://github.com/open-compass/AppWorld). Dataset PR: https://github.com/laude-institute/terminal-bench-datasets/pull/3

57 tasks

swebench-verified

head

Adapter for SWEBench (https://github.com/open-compass/SWEBench). Dataset PR: https://github.com/laude-institute/terminal-bench-datasets/pull/4

500 tasks
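
Because terminal-bench-core appears three times in the listing above, a dataset is identified by its name together with its version, not by name alone. The following is a minimal Python sketch of that idea, not the registry's actual data format or client API: the RegistryEntry class and the find_entry helper are hypothetical names introduced only for illustration, and the entries are transcribed by hand from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegistryEntry:
    """Hypothetical record for one row of the listing (not an official schema)."""
    name: str
    version: str
    tasks: int


# Entries transcribed from the listing above.
REGISTRY = [
    RegistryEntry("terminal-bench-core", "head", 117),
    RegistryEntry("terminal-bench-core", "0.1.0", 80),
    RegistryEntry("terminal-bench-core", "0.1.1", 80),
    RegistryEntry("deveval", "head", 63),
    RegistryEntry("evoeval-difficult", "head", 100),
    RegistryEntry("appworld-dev", "head", 57),
    RegistryEntry("swebench-verified", "head", 500),
]


def find_entry(name: str, version: str) -> RegistryEntry:
    """Return the entry matching both name and version, or raise KeyError."""
    for entry in REGISTRY:
        if entry.name == name and entry.version == version:
            return entry
    raise KeyError(f"{name} at version {version} is not in the registry")


if __name__ == "__main__":
    entry = find_entry("terminal-bench-core", "0.1.1")
    print(entry.name, entry.version, entry.tasks)
```

Looking up by name alone would be ambiguous for terminal-bench-core, which is why the sketch requires both fields.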