Sayari
USD 195000-205000/yr

Staff Applied Scientist - AI Evaluation & Trust

United States, Remote, Lead
Data Scientist, Applied Scientist, Data, Data & AI

About Sayari

Sayari is the leader in Agentic Systems of Work for economic security and risk. Powered by the Sayari Commercial World Model, a digital twin of global commerce resolving 10.6B+ primary-source records from 250+ jurisdictions, Sayari transforms risk and investigative teams from manual data gatherers into decisive mission leaders. By unifying corporate ownership, trade data, and risk intelligence into a single graph, Sayari uncovers connections and typologies that legacy watchlist, adverse media, and point solutions miss, enabling prescriptive execution at scale. Trusted by the world’s most demanding regulators, including U.S. Customs and Border Protection, the U.S. Treasury, and Fortune 500 enterprises, Sayari delivers the evidence-based transparency needed to prove decisions, satisfy regulators, and protect global commerce. Headquartered in Washington, D.C., Sayari is used by thousands of professionals across 35+ countries to secure supply chains and dismantle illicit networks.

Our company culture is defined by a dedication to our mission of using open data to prevent illicit commercial and financial activity, a passion for finding novel approaches to complex problems, and an understanding that diverse perspectives create optimal outcomes. We embrace cross-team collaboration, encourage training and learning opportunities, and reward initiative and innovation. If you like working with supportive, high-performing, and curious teams, Sayari is the place for you.

Sayari builds AI systems for high-consequence analytical work where being "wrong" carries real-world weight. We are looking for a Staff or Principal Applied Scientist to join our AI Innovation Group as the trusted expert on AI Evaluation and Trust. You will own the "Judgment Layer" of our system: building the specialized judge models, statistical benchmarks, and multi-turn frameworks that ensure our agents act with the high bar of trustworthiness required by our national security and enterprise customers.

Responsibilities

  • Lead the development of specialized "judge models," moving from general-purpose frontier models to architectures purpose-built for evaluation and failure mode detection.
  • Design and execute rigorous scoring pipelines and empirical threshold calibrations for agentic systems, including multi-turn conversation and Graph RAG reasoning.
  • Establish domain-specific evaluation frameworks that measure whether a system can perform the work of human experts rather than just passing general capability benchmarks.
  • Own the full lifecycle of evaluation data, from designing annotation infrastructure and protocols to deploying evaluation services into production.
  • Research and implement advanced techniques in Mixture-of-Experts (MoE) routing, expert specialization evaluation, and ensemble calibration.
  • Collaborate cross-functionally with Product, Data Engineering, and the SVP of AI to translate complex statistical uncertainty into clear, actionable product signals.
  • Act as a technical leader and "Scientific Conscience" within the AI pod, ensuring every AI-driven risk signal is backed by an empirical derivation story.

Requirements

  • 10+ years of machine learning experience with a focus on deep neural networks and on evaluating model performance and trust.
  • 1-2+ years of experience focused on post-training activities.
  • 1+ years of experience creating benchmarks to evaluate LLMs.
  • Technical Mastery: Deep expertise in LLM-as-judge architectures, multi-turn evaluation, and Reinforcement Learning (RL/RLHF/RLAIF).
  • Statistical Rigor: Mastery of statistics and experimental design, including significance testing, distribution analysis, and inter-rater reliability.
  • Architectural Depth: Experience with Mixture-of-Experts (MoE) systems, routing behavior, and expert specialization.
  • Builder Mindset: Proven ability to own the path from data collection to production deployment; we are a small team and every role is "hands-on."
  • Domain Fluency: Understanding of Graph RAG and the unique challenges of evaluating non-deterministic, agentic workflows.

Nice to Have

  • Judgment Task Models: Experience building, fine-tuning (LoRA, etc.), or pre-training models specifically for judgment, preference modeling, or classification tasks.
  • Domain Context: Background in cognitive science, intelligence community tradecraft, or research literature on expert judgment under uncertainty.
  • Infrastructure at Scale: Experience building or managing large-scale annotation infrastructure and quality assurance protocols.
  • Academic/Research Track Record: A record of published research or recognized work in preference modeling or AI alignment.

 

The target base salary for this position is $195,000-$205,000 plus company bonus and equity. Final offer amounts are determined by multiple factors, including location, local market variances, candidate experience and expertise, and internal peer equity, and may vary from the amounts listed above.

 

What We Offer

  • 100% fully paid medical, vision, and dental for employees and their dependents
  • Generous time off; we observe all US federal holidays and close our office for a winter break (12/24-12/31), in addition to granting 18 PTO days and 10 sick days
  • Outstanding compensation package; competitive commissions for revenue roles and bonuses for non-revenue positions
  • A strong commitment to diversity, equity, and inclusion
  • Eligibility to participate in additional benefits such as 401k match up to 5%, 100% paid life insurance (up to $100,000 coverage), and parental leave
  • A collaborative and positive culture - your team will be as smart and driven as you
  • Limitless growth and learning opportunities

Location & Eligibility

Where is the job: United States (remote within one country)
Who can apply: US
Listed under: United States

Listing Details

Posted: April 23, 2026
First seen: April 23, 2026
Last seen: May 5, 2026

Posting Health

Days active: 11
Repost count: 0
Trust Level: 56%
Scored at: May 5, 2026

Sayari
greenhouse
Employees: 5
Founded: 2004