Nuance Labs

Machine Learning Infra Engineer

United States · Seattle · mid
Engineering · Data Science


About the Role


Nuance Labs is building the next generation of emotionally expressive, real-time AI.

In this critical role, you will build the infrastructure that powers our AI platform. You will own the systems that serve models at scale, orchestrate complex data workflows, and ensure our real-time video AI runs reliably and with low latency for users worldwide.

Responsibilities

  • Own Inference Infrastructure: Build and maintain the serving stack for multimodal AI workloads. Optimize for latency, throughput, and cost using batching strategies, autoscaling, and intelligent resource allocation.

  • Real-Time Video Streaming: Architect systems to handle long-lived WebRTC connections with unpredictable client behavior, ensuring smooth video and audio delivery at scale.

  • Orchestrate Data Workflows: Build robust pipelines for offline processing, evaluation, and training using orchestration frameworks like Dagster or Ray. Manage petabyte-scale video storage and network requirements.

  • GPU Cluster Management: Configure and maintain GPU clusters using Kubernetes and Terraform. Implement monitoring, autoscaling based on custom metrics, and cost optimization strategies.

  • Developer Tooling: Build CI/CD, evaluation, and versioning systems that enable safe, zero-downtime model deployments and rapid iteration cycles.

Requirements

  • Infrastructure Expertise: Strong practical experience with Kubernetes, Terraform, and cloud platforms. You can design secure, scalable systems and debug complex distributed issues.

  • Systems Programming: Proficiency in Python and experience with systems languages (Rust or Go). Comfortable profiling workloads and resolving compute, memory, or network bottlenecks.

  • Orchestration & Pipelines: Experience managing large-scale offline workflows using tools like Dagster, Ray, Airflow, or similar frameworks.

  • Production Operations: Deep understanding of production reliability, monitoring, incident response, and capacity planning for high-traffic services.

Nice to Have

  • Experience with WebRTC or real-time media pipelines in production

  • Experience running GPU-backed inference services at scale (vLLM, Triton Inference Server, TensorRT)

  • Knowledge of performance optimization and low-level systems debugging

  • Familiarity with video/audio processing and storage systems

About Nuance Labs

  • $10M seed round backed by Accel, South Park Commons, Lightspeed, and top angels including Synthesia’s former CPO.

  • A world-class team of PhDs from MIT, UW, and Oxford with decades of industry experience at Apple and Meta, advancing real-time avatars from cutting-edge research to products used by millions.

  • In-person collaboration, 5 days a week at Seattle HQ

Listing Details

Posted: February 27, 2026
First seen: March 26, 2026
Last seen: April 18, 2026

Posting Health

Days active: 22
Repost count: 0
Trust level: 39%
Scored at: April 18, 2026
