Amazon RIVR is a robotics company pioneering Physical AI through real-world doorstep delivery. Founded in 2024 as an ETH Zurich spin-off, RIVR developed wheeled-legged robots designed to operate in complex, unstructured environments such as stairs, gates, doors, and uneven urban terrain. We believe that achieving general physical intelligence requires solving real customer problems in the real world, where robots can learn from rich operational data at scale.
Following our acquisition by Amazon in March 2026, we are continuing this mission with greater reach and speed. By combining custom robot hardware, onboard autonomy, and cloud-based coordination, Amazon RIVR is building the next generation of safe, reliable autonomous robots for last-mile delivery.
Job Description
Our fleet of delivery robots operates globally today, generating vast amounts of real-world robotic data. By applying state-of-the-art Vision-Language-Action (VLA) models, large-scale generalist models (such as Transformers), generative AI, and similar methods, we can use this data to significantly enhance our robots' autonomy, navigation, and manipulation skills. In this role, you will develop multi-modal models that enable robots to autonomously generate actions from demonstrations, real-time sensor data, and natural language commands. We are seeking an expert in VLA models, imitation learning, and generative AI techniques with deep knowledge of supervised and self-supervised learning algorithms. If you are passionate about pushing the boundaries of AI, we invite you to join us in shaping the future of intelligent robotics.
Key Responsibilities
Develop and implement cutting-edge Vision-Language-Action (VLA) models, generalist robot transformers, and imitation learning algorithms (e.g., diffusion policies) to enable robots to autonomously execute complex tasks.
Design, test, and refine your algorithms to meet the demands of complex real-world autonomy and navigation tasks, with a focus on spatial reasoning and generalization.
Streamline the data collection and training workflow to efficiently expand model capabilities with new tasks and data sources.
Collaborate with the reinforcement learning team to innovate methods that leverage both simulated and real-world data.
Optimize and distill networks for real-time deployment on edge hardware (e.g., NVIDIA Jetson Thor).
Build, lead, and mentor an exceptional team of software engineers.
Provide expert guidance to product managers and executives for strategic decision-making.
Create and maintain documentation, guidelines, and best practices to streamline knowledge sharing.
Basic Qualifications
Master’s degree or higher in a relevant field such as Engineering, Robotics, or Machine Learning.
A minimum of three years of industry or research experience; time spent on PhD research counts toward this requirement.
Strong deep learning fundamentals including supervised learning, self-supervised learning, Transformer-based architectures, policy optimization algorithms, imitation learning, and generative AI techniques (including Diffusion Models).
Proven experience in developing Vision-Language-Action (VLA) models or large-scale generalist robot models (e.g., RT-2, Octo).
Strong background in robotics, including autonomy and navigation.
Experience with deploying artificial neural networks on hardware platforms.
Ability to prototype algorithms and train deep neural networks in Python (PyTorch).
Preferred Qualifications
PhD in Robotics, Engineering, Computer Science, Machine Learning, or a similar discipline, or an equivalent amount of research experience.
Publications at top-tier conferences.
Experience in managing a software team.
Ability to write production-level code in modern C++.