Senior Machine Learning Engineer – Edge Inference & Optimization

Location: Berlin

Full time (40h/week)


Vision being the dominant human sense, eye tracking constitutes a powerful approach for understanding the human mind! At Pupil Labs, our mission is to provide cutting-edge eye-tracking solutions that are more robust, accurate, accessible, and user-friendly than ever before. Already today, our products empower thousands of users in academia and industry, clinical surgeons, elite athletes, astronauts on the International Space Station, and many more. Unlocking the full potential of eye-tracking technology relies on solving hard research problems, ranging from core gaze-estimation algorithms to developing cloud-based and edge-based tools for the real-time analysis of terabytes of egocentric video and physiological data.

The interdisciplinary R&D team at Pupil Labs, comprising members with backgrounds in Computer Science, Computational Neuroscience, Mathematics, and Physics, is tackling these challenges head-on! In close collaboration with other engineering teams, we identify promising R&D avenues and take pride in seeing our results swiftly integrated into the latest products shipped to our customers.

To support our efforts, we are looking to grow our team in Berlin with a full-time Senior Machine Learning Engineer with a strong background in edge inference, performance optimization, and ML systems design. This is an on-site position (with up to two home-office days per week).

Pupil Labs offers a competitive salary, flexible work arrangements, a great team of coworkers, a young and dynamic company structure, and a culture of participation and feedback.

Are you excited about joining an ambitious, international, diverse, interdisciplinary, young, enthusiastic, and talented team of researchers and engineers? Do you have a growth mindset, thrive in fast-paced work environments, and enjoy working on hard problems? Then we look forward to hearing from you!

What you would do

Design, implement, and optimize machine learning pipelines for low-latency, energy-efficient inference on edge devices.
Collaborate with research teams to bring state-of-the-art models into production, adapting them for resource-constrained environments.
Apply and evaluate quantization, pruning, compression, and distillation techniques to balance performance, power, and accuracy.
Work across the ML lifecycle: from model architecture decisions to deployment and benchmarking on real hardware.
Interface with hardware engineers (as needed) to profile and optimize inference performance.
Contribute to our internal ML infrastructure, with an eye toward scalability, reproducibility, and maintainability.

Who you are

You have 5+ years of experience in machine learning engineering, with a focus on deploying models to edge or embedded systems.
You are proficient in Python, including ML frameworks like PyTorch.
You understand the trade-offs between accuracy, latency, compute, and power consumption in real-world deployments.
You have hands-on experience with model optimization techniques (quantization, pruning, knowledge distillation, etc.).
Bonus: You have worked with edge hardware platforms, such as Qualcomm QNN/SNPE, the NVIDIA Jetson series, FPGAs, or other embedded systems.
You are comfortable with benchmarking and profiling tools and can use them to guide engineering decisions.
You are proactive, collaborative, and excited to work in a fast-moving, interdisciplinary environment.
You are comfortable in written and spoken English.

Perks

A beautiful office in the heart of Berlin.
Up to two home-office days per week.
15 mobile-office days per year.
Continued learning and professional development, including attending relevant conferences or technical workshops.
Flexible working hours.
Patent and publish your inventions.
6 weeks of holidays per year.

Apply

Please submit your application here.