ML Platform Engineer – Robotics Infrastructure & MLOps
Role Overview
As a machine learning platform engineer, you will build the specialized infrastructure required to manage the large volumes of heterogeneous data generated by our physical robots. You are the architect of our “feedback loop”: you ensure that data from the real world flows back into our training pipelines with minimal delay. You will also manage “Hardware-in-the-Loop” (HITL) infrastructure, where trained models are automatically tested on physical robots before the corresponding changes are merged.
Key Responsibilities
- Design data pipelines specifically for high-frequency robotic sensor logs (rosbags, telemetry, video).
- Maintain scalable training clusters capable of distributing and processing large volumes of data collected from physical robots.
- Build CI/CD pipelines that incorporate automated physical hardware testing.
- Develop experiment tracking tools that correlate simulation training metrics with real-world deployment success.
Requirements
- Expertise in distributed systems, Kubernetes, and Docker.
- Experience with MLOps tools (e.g., MLflow, Weights & Biases) and workflow automation.
- Strong background in managing large-scale sensor data and ROS/ROS2 infrastructure.
- Familiarity with Hardware-in-the-Loop (HITL) testing and fleet management systems.
Outcome
A scalable, hardware-integrated MLOps pipeline that reduces the friction between training and deployment, enabling continuous improvement through high-velocity data feedback loops.