Staff Software Engineer, Product
Quick Summary
About Arena Intelligence
Arena Intelligence is the open platform for evaluating how AI models perform in the real world. Created by researchers from UC Berkeley’s SkyLab, our mission is to measure and advance the frontier of AI for real-world use.
Millions of people use Arena Intelligence each month to explore how frontier systems perform — and we use our community’s feedback to build transparent, rigorous, and human-centered model evaluations. Leading enterprises and AI labs rely on our evaluations to understand real-world reliability, alignment, and impact. Our leaderboards are the gold standard for AI performance — trusted by leaders across the AI community and shaping the global conversation on model reliability and progress.
We’re a team of researchers, engineers, academics, and builders from places like UC Berkeley, Google, Stanford, DeepMind, and Discord. We seek truth, move fast, and value craftsmanship, curiosity, and impact over hierarchy. We’re building a company where thoughtful, curious people from all backgrounds can do their best work. Everyone on our team is a deep expert in their field — our office radiates excellence, energy, and focus.
About the Role
We're looking for a Staff Software Engineer to own entire product areas at Arena: identifying what to build, designing and shipping the solution, driving it to measurable outcomes, and raising the bar for the team along the way. This is a hands-on IC role. You are the architect, not the delegator. You write code, design systems, and ship product, and you have a track record of doing this repeatedly with clear, attributable impact.
Own product areas end-to-end — from identifying opportunities that aren't on anyone's roadmap, to building conviction, shipping, and driving to measurable outcomes
Design complete systems: data model, API design, frontend architecture (full-stack) or backend services and infrastructure (backend), and deployment strategy
Make high-stakes technical decisions under uncertainty — build-vs-buy, monolith-vs-service, when to prototype and when to harden — and be accountable for the outcomes
Work with ML researchers to turn research services into reliable, product-consumable systems
Ship and iterate — you don't hand off after v1. You measure, course-correct, and drive to sustained impact
Raise the technical and product quality bar on the team — introduce practices others adopt, unblock teammates, and create clarity out of ambiguity
Navigate cross-functional collaboration with product, design, research, and leadership to align on what matters
8+ years of experience in software engineering, with a focus on product development
Deep experience building web applications spanning data model, API, frontend architecture, and deployment. You can design complete systems end-to-end and explain why you rejected alternatives at each decision point. You can explain how your architecture decisions shaped the product's capabilities and measurably improved product quality or team velocity.
A track record of repeated, attributable impact: multiple projects across different roles or companies where you can quantify outcomes (revenue, engagement, efficiency) and where the impact lasted after you moved on
Personal technical depth, not delegation — you are the architect. You've dealt with real performance issues: slow queries, N+1 problems, caching, transaction isolation. You can go three levels deep on any decision and get more specific, not vaguer.
Product judgment and autonomous ownership — you've identified opportunities, validated them with evidence, built buy-in, shipped, and the bet paid off. You know when to prototype, harden, or kill work.
Clear, persuasive communication — you build buy-in across engineering, product, and leadership. You create clarity for others, not just yourself.
Genuine conviction about AI evaluation and Arena's mission — you can articulate why this domain matters and where the product should go
Nice to Have
Production experience with AI/LLM systems: inference pipelines, evaluation workflows, model integration, or AI-powered product features
Familiarity with our stack: NextJS, React, TypeScript, Tailwind, ShadCN, HonoJS, Postgres, Vitest
Experience with Supabase or Vercel's AI SDK
You've raised the bar on a team with measurable before/after — introduced practices, tools, or standards that others adopted
What We Offer
Location & Eligibility
Listing Details
- Posted: December 18, 2025
- First seen: May 6, 2026
- Last seen: May 23, 2026
Posting Health
- Days active: 17
- Repost count: 0
- Trust Level: 18%
- Scored at: May 23, 2026