Hyphen Connect Limited · 2mo ago

Synthetic Data Engineer (AI Data/Training)

United States · Mid-level

Data Engineer · Data

Technical Tools

Airflow, ETL

We are seeking a talented and innovative Synthetic Data Engineer. In this role, you will design and implement domain-specific synthetic data generation pipelines, ensuring high-quality data management for training loops. Your expertise will drive the success of data processing and model training within the organization.

Responsibilities

→Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting.
→Implement automated quality scoring and de-duplication systems.
→Manage data pipelines that feed directly into SFT and DPO training loops.

Requirements

Proven experience building large-scale data pipelines (Airflow, Spark, Ray).
Deep knowledge of prompt engineering for data generation.
Familiarity with dataset distillation and bias mitigation.

Location & Eligibility

Where is the job: United States (on-site within the country)

Who can apply: US

Listed under: United States

Listing Details

Posted: April 24, 2026
First seen: April 24, 2026
Last seen: July 8, 2026

Posting Health

Days active: 75
Repost count: 0
Trust Level: 21%
Scored at: July 8, 2026

Apply for this position

Hyphen Connect Limited

greenhouse

Web3 and AI talent recruitment agency based in Hong Kong with 700+ placements globally

Domain

hyphen-connect.com

