embedding-vc

Member of Technical Staff - Efficient ML

San Francisco, full-time, lead
Other, Member of Technical Staff

Quick Summary

Overview

Introducing Moonlake, AI for creating world simulations. Scope of Work. Training efficiency: dataloaders, fusion, activation remat, gradient checkpointing; FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning. GPU + kernel performance: Nsight profiling, Triton/CUDA kernels, fused ops.

Technical Tools
Kubernetes

Introducing Moonlake, AI for creating world simulations.

  • Dataloaders, fusion, activation remat, gradient checkpointing.

  • FSDP/ZeRO/tensor+pipeline parallel; NCCL tuning.

  • Nsight profiling, Triton/CUDA kernels, fused ops.

  • Flash-attention–style speedups, sequence packing, KV-cache tricks.

  • Low-latency serving, continuous batching, speculative decoding.

  • Quantization (GPTQ/AWQ), distillation, pruning.

  • SLURM/K8s multi-node jobs, checkpoint hygiene.

  • Determinism, env pinning, GPU failure handling.

We are committed to being an on-site, in-person team, currently based in San Mateo.

Location & Eligibility

Where is the job
San Francisco Bay Area
On-site at the office
Who can apply
Same as job location

Listing Details

Posted
January 15, 2026
First seen
May 6, 2026
Last seen
May 8, 2026

Posting Health

Days active
0
Repost count
0
Trust Level
12%
Scored at
May 6, 2026

Signal breakdown

freshness, source trust, content trust, employer trust
