Tue Mar 03 2026

News

Terminal-Bench-Science: Now in Development

Extending Terminal-Bench to complex scientific workflow tasks in the natural sciences.

terminal-bench

science

A Benchmark for Evaluating AI Agents on Complex Real-World Scientific Workflows in the Terminal

We're excited to announce that Terminal-Bench-Science is now in development. It extends Terminal-Bench to the complex, real-world computational workflows that natural scientists run in their research labs.

Our goal is to catalyze a "Claude Code / Codex for Science" moment by giving natural scientists a direct voice in shaping AI progress: scientists contribute real workflows from their research, frontier labs optimize against them, and the resulting advances flow back to scientists as more capable AI systems for scientific discovery.

We're looking for practicing scientists who want to contribute tasks from their research workflows. If you're interested, check out our contributing page or join the #tb-science channel on our Discord.

Written by

Steven Dillmann