basis-research

Data Engineer, Platform

New York Office | full-time | mid
Data Engineer | Data

Quick Summary

Technical Tools
Airflow, BigQuery, dbt, Kafka, Python, Snowflake, SQL, database design, ETL, performance optimization

Basis is a nonprofit applied AI research organization with two mutually reinforcing goals.

The first is to understand and build intelligence. This means to establish the mathematical principles of what it means to reason, to learn, to make decisions, to understand, and to explain; and to construct software that implements these principles.

The second is to advance society’s ability to solve intractable problems. This means expanding the scale, complexity, and breadth of problems that we can solve today, and even more importantly, accelerating our ability to solve problems in the future.

To achieve these goals, we’re building both a new technological foundation that draws inspiration from how humans reason, and a new kind of collaborative organization that puts human values first.

About the Role

Data Engineers on the Platform team at Basis build trustworthy data pipelines with comprehensive provenance and quality gates, curate documented datasets for training and evaluation, and ensure data infrastructure scales reliably. You will work on both platform-specific data needs and cross-project data coordination, preventing duplicate work and facilitating shared datasets.

We are looking for people who are technically excellent and treat data quality as a first-class concern. The ideal Data Engineer has experience with ML data pipelines, understands the full lifecycle from raw data through model training and evaluation, and brings rigor to data provenance, lineage tracking, and quality assurance. You combine software engineering discipline with a deep understanding of data systems and ML requirements.

This role is embedded across Platform and Research teams, working on infrastructure that supports both commercial offerings and internal research. You will help Basis scale data operations to support medium-scale models, ensure data governance as we serve external customers, and build systems that researchers can trust for reproducible experiments.

We seek individuals who aspire to do rigorous, high-quality, robust data engineering, but are not afraid to iterate, learn from real usage, and explore different approaches to achieve excellence.

Basis is a collaborative effort, both internally and with our external partners; we are looking for people who enjoy building data foundations for problems larger than ones they can tackle alone.

Requirements

  • Experience with feature stores (Tecton, Feast) or building feature platforms.

  • Background in ML research or research engineering, providing an understanding of data needs across the experiment lifecycle.

  • Experience with data lineage tools (Apache Atlas, DataHub, Monte Carlo) and metadata management.

  • Knowledge of vector databases and embedding pipelines for modern AI applications.

  • Contributions to data engineering open-source projects (Airflow, dbt, Great Expectations).

  • Understanding of responsible AI and data governance practices.

Responsibilities

  • Design and build data pipelines for training and evaluation across Basis research projects and platform offerings, ensuring reliability, performance, and scalability.

  • Implement data quality frameworks including validation rules, quality gates, anomaly detection, and monitoring that catch data issues before they impact research or production systems.

  • Develop and maintain feature stores or equivalent systems that enable consistent feature access across training and serving environments, preventing train-serve skew.

  • Ensure data provenance and lineage tracking so researchers and engineers can understand data origins, transformations applied, and dependencies, enabling reproducible experiments and debugging.

  • Curate documented datasets for model training and evaluation, including dataset versioning, comprehensive documentation, quality metrics, and metadata that enables appropriate usage.

  • Coordinate cross-project data initiatives to prevent duplicate data work, facilitate shared datasets, and ensure consistent data practices across Basis as the organization scales.

  • Optimize data infrastructure for scale as compute grows, including cost optimization, performance tuning, caching strategies, and efficient data access patterns.

  • Collaborate with research and engineering teams to understand data needs, translate requirements into technical solutions, and provide consultation on data architecture and best practices.

  • Implement data governance policies ensuring compliance with privacy regulations, security requirements, and responsible AI practices as Basis serves external customers.

  • Contribute to the culture and direction of Basis by modeling data quality rigor, documentation excellence, and focus on trustworthy data infrastructure.

Exceptional candidates who may not meet all of the above criteria are still encouraged to apply.

  • FT/PT: Full-time.

  • In-person Policy: We are in the office four days a week. Be prepared to attend multi-day Basis-wide in-person events.

  • Location: New York City.

  • Salary range: Competitive salary.

Non-Discrimination Notice
Basis Research Institute provides equal employment opportunities without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, or genetics and prohibits discrimination based on all protected characteristics.

Privacy Notice

By submitting your application, you grant Basis permission to use your materials for both hiring evaluation and recruitment-related research and development purposes. Your information may be processed in different countries, including the US. You retain copyright while providing Basis a license to use these materials for the stated purposes.

Read our full Global Data Privacy Notice here.

Location & Eligibility

Where is the job: New York Office (on-site at the office)
Who can apply: Same as job location

Listing Details

Posted: November 23, 2025
First seen: May 5, 2026
Last seen: May 8, 2026

Posting Health

Days active: 0
Repost count: 0
Trust Level: 14%
Scored at: May 6, 2026
