Most of what makes American healthcare expensive isn’t medical care. It’s the machinery wrapped around it: middlemen taking a cut, fraud nobody stops, and billing systems designed to fight over payment instead of deliver care. The result is higher premiums, denied claims, surprise bills, and a system patients increasingly experience as adversarial.
Arlo is rebuilding health insurance for small businesses from first principles: making sure as much of every premium dollar as possible goes to care instead of getting absorbed by the system around it. We do that by identifying fraud earlier, steering members toward higher-quality and lower-cost care, automating operational overhead, and eliminating vendors whose business exists mostly to take a cut.
AI is the foundation that makes this work. We use it across underwriting, operations, clinical programs, and member experience to build an insurer that becomes more efficient as the technology improves.
We’re already operating at meaningful scale: profitable, hundreds of millions in premiums, tens of thousands of members covered, and growing quickly through brokers, employers, and partners. We’re backed by Upfront Ventures, 8VC, and General Catalyst, and our team includes alumni of Palantir and YC companies alongside longtime healthcare operators.
Arlo quotes small businesses using AI-powered underwriting, and the quality of that underwriting is only as good as the data beneath it. We're hiring a Data Engineer to build and maintain the pipelines, models, and monitoring systems that keep our data infrastructure clean, timely, and trustworthy.
This is a hands-on individual contributor role. You'll sit at the boundary between data engineering and data science, working directly with underwriting, pricing, and analytics teams to ensure the right data reaches the right systems at the right time.
Pipeline development and maintenance
Build and maintain ingestion pipelines for complex, heterogeneous data sources — TPA feeds, carrier data, census files, claims, eligibility, and enrollment records
Design and implement dbt models and transformation logic that produce clean, reliable "source of truth" tables used across underwriting, pricing, and reporting
Own pipeline orchestration using tools like Dagster or Airflow, ensuring reliable scheduling, retries, and alerting
Data quality and observability
Build monitoring and alerting for data inconsistencies: duplicate records, mismatched member IDs, enrollment timing gaps, and carrier reporting lags
Profile ingest delay characteristics across live policy data and flag where structural latency introduces systematic bias
Maintain clear documentation of known data quality limitations so downstream teams know what the data can and cannot reliably support
Collaboration with data science
Partner closely with the data science team to build and maintain feature pipelines that feed underwriting and pricing models
Support feedback loop infrastructure that carries post-quoting learnings back into upstream models
Work with engineering to prioritize data quality fixes and accelerate resolution of upstream issues
Required
3–5 years in a data engineering or backend engineering role with significant data pipeline ownership
Proficiency in Python and SQL; comfortable writing production-quality code in both
Hands-on experience with pipeline orchestration tools (Dagster, Airflow, Prefect, or similar)
Experience with dbt or equivalent transformation frameworks
Familiarity with cloud data environments (AWS, GCP, or Azure) and columnar/analytical databases
Track record working with messy, real-world datasets and building systems that handle inconsistency gracefully
Strong instincts around data quality — you catch problems before they reach downstream consumers
Nice to have
Background in health insurance, claims data, or actuarial/TPA data environments
Experience supporting ML feature pipelines or working alongside data science teams
Familiarity with MLflow or similar MLOps tooling
Exposure to healthcare data standards or sensitive regulated data environments
You'll own your projects end-to-end — from initial scoping through to production deployment and ongoing monitoring. There's no separate ML engineering handoff; you'll work directly with the people who depend on your pipelines daily. The role requires equal comfort in Python-based engineering and SQL-driven analysis, and a genuine interest in understanding the business context behind the data.
Intro call with our recruiter
Resume interview with an Arlo co-founder
Technical take-home challenge (data engineering problem)
Onsite (or virtual): technical review + behavioral/cultural interviews
What We Offer
$180,000 - $220,000 + equity
Location & Eligibility
Listing Details
Posted: May 28, 2026