Staff AI Inference and Acceleration Engineer
Figure is an AI robotics company developing autonomous general-purpose humanoid robots. The company's goal is to ship humanoid robots with human-level intelligence. Its robots are engineered to perform a variety of tasks in the home and commercial markets. Figure is headquartered in San Jose, CA.
We are looking for a Staff AI Inference & Acceleration Engineer to join the Platform Software team and own the on-board inference architecture for Figure’s humanoid robots. You will be the technical authority on how AI workloads are mapped, optimized, and executed across the robot’s compute hardware — driving down power consumption and cost while meeting the strict latency and reliability demands of a real-time autonomous system.
Responsibilities
- Own the on-board inference architecture: mapping models to available accelerators (NPU, GPU, DSP, CPU) based on latency, power, and memory budgets.
- Partition inference workloads across heterogeneous compute resources, balancing real-time performance with power and thermal constraints.
- Define and maintain a system-level compute budget across all inference tasks running on the robot.
- Evaluate next-generation acceleration hardware and contribute to the definition of future compute platform requirements.
- Optimize inference toolchains end-to-end, from model export through runtime execution, for target hardware.
- Apply quantization (INT8, INT4, mixed-precision), pruning, operator fusion, and other compression techniques to reduce compute, memory, and power footprint.
- Profile inference pipelines to identify and eliminate bottlenecks in latency, memory bandwidth, and power consumption.
- Optimize kernel scheduling, memory layout, and data movement across the compute hierarchy.
- Partner closely with the AI/ML team to define model architecture constraints that are hardware-friendly from the outset.
- Work with the Platform Software team on runtime integration, scheduling, and power management.
- Engage with silicon vendors and research teams to track the accelerator landscape and influence hardware roadmaps.
Requirements
- M.S. or Ph.D. in Computer Engineering, Electrical Engineering, Computer Science, or a related field, or equivalent industry experience.
- At least 8 years of industry experience in hardware acceleration, ML systems, or compute architecture.
- Deep understanding of AI/ML inference — model formats (ONNX, TFLite, etc.), inference runtimes, and deployment pipelines.
- Hands-on experience optimizing models for edge or embedded hardware using quantization, pruning, and operator-level tuning.
- Strong understanding of computer architecture — memory hierarchies, data movement, and heterogeneous compute.
- Experience profiling and benchmarking inference workloads across CPU, GPU, NPU, and DSP.
- Familiarity with low-level toolchains and compilation frameworks (e.g. TVM, MLIR, TensorRT, Torch, SNPE/QNN, JAX, CUDA, ROCm).
- Solid software engineering skills in C++ and Python.
- Strong cross-functional communication skills — able to work effectively across hardware, software, and AI/ML teams.
- Knowledge of real-time operating constraints and their impact on inference scheduling.
- Track record of co-designing model architectures with ML teams to meet hardware constraints.
The US base salary range for this full-time position is $180,000 to $275,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components/benefits depending on the specific role. This information will be shared if an employment offer is extended.
Listing Details
- Posted: June 26, 2026