Senior AI Inference Engineer - Model Optimization & Deployment
Quick Summary
The Perception team is pioneering the development of a multi-modality foundation model to drive the next generation of autonomous system intelligence.
As a Model Optimization & Deployment Engineer, you will focus on bringing highly efficient, production-ready large-scale models to our on-vehicle stack. We are looking for experts with hands-on experience in compressing, accelerating, and deploying complex models (LLMs, VLMs, or FMs) on power- and thermal-constrained vehicle SoCs. You will optimize ML models, write custom CUDA kernels, and build highly concurrent inference code to ensure real-time, deterministic execution on edge devices.
Architect and implement model conversion and compilation pipelines using TensorRT and TensorRT-LLM for edge deployment.
Perform rigorous parity checking, accuracy recovery, and latency benchmarking between PyTorch frameworks and compiled edge binaries.
Listing Details
- Posted: April 11, 2026
- First seen: April 11, 2026
- Last seen: April 29, 2026
Posting Health
- Days active: 18
- Repost count: 0
- Trust level: 49%
- Scored at: April 29, 2026

Zoox, a subsidiary of Amazon, designs fully autonomous vehicles focusing on making urban transportation safer and more efficient.