Researcher, Evaluations
Quick Summary
Epoch AI is looking for a researcher to evaluate frontier AI models on hard-to-grade tasks drawn from real-world scenarios.
About the Role
We’re seeking a Researcher to lead a new effort evaluating how well frontier models perform on the kinds of open-ended tasks that make up real office work. You will curate a suite of realistic tasks to serve as a benchmark, design the grading rubrics for AI performance, and run newly-released models through the suite, assessing their performance both quantitatively and qualitatively.
The focus is on how models handle messy, real-world work rather than on scientific knowledge or programming ability. The role makes heavy use of AI tools, but strong software engineering experience is not required. Comfort setting up AI-assisted automated workflows is a plus.
If this role sounds interesting, we are also looking for researchers on multiple other teams.
Applications are rolling.
While we welcome applicants from all time zones, we prefer candidates whose working hours overlap with the range between UTC–8 (Pacific Time) and UTC (Greenwich Mean Time), where most of our staff work. We also prefer candidates who can travel: we hold three retreats per year, during which we record podcast episodes and work on other communication efforts.
Please submit all of your application materials in English, and note that we require professional-level English proficiency.
Epoch is committed to building an inclusive, equitable, and supportive community for you to thrive and do your best work. We’re committed to finding the best people for our team, so please don’t hesitate to apply for a role regardless of your age, gender identity/expression, political identity, personal preferences, physical abilities, veteran status, neurodiversity or any other background. Please email careers@epoch.ai if you have any questions about this role, accessibility requests, or if you want to request an extension to the application deadline. However, we will not review applications submitted to this email address; please submit your application through the link on this page.
Epoch AI is a research institute that investigates trends in machine learning and the economic consequences of AI. Our mission is to develop a comprehensive, publicly accessible knowledge base on AI that informs policymakers, industry leaders, and society at large.
We strive for both rigor and accessibility in our work, as exemplified by some of our most successful projects, including our database of AI models and our AI trends dashboard. Our body of research includes our work on compute trends (IJCNN 2022), data scarcity (ICML 2024), and algorithmic progress (NeurIPS 2024). You can read more about our work and mission on our website and in this Time profile.
Listing Details
- Posted: June 30, 2026
- First seen: July 3, 2026
- Last seen: July 3, 2026
