Machine learning infra and ops engineer

Job description

Your technical expertise helps us bring our advanced deep learning algorithms to every dental practice in the world so we can increase the quality of healthcare globally. Using microservices on Kubernetes and AWS, you make sure our algorithms perform well and our data is transported securely across our platform.

Promaton is changing dental healthcare by automating diagnostics and treatment workflows using AI, making healthcare more affordable and accessible for everyone. Did you know dentists miss up to 30% of pathologies on an X-ray? We are on a mission to eliminate errors in dentistry by improving diagnostic accuracy and automating mundane work, like creating 3D models by hand from an X-ray. See our company page to learn more about what we do.

You'll be:

Deploying tools to our infrastructure and operating that infrastructure so that our AI product teams can work effectively. This means:

  • Create tooling that automates the setup of environments where we can train and experiment
  • Work on our on-prem Kubernetes cluster of big GPUs, and see which workloads we can schedule there
  • Create the best CI/CD in the world for medical AI

The challenges you'll face:

  • Quality metrics in medical AI are hard. In contrast to a simple metric like "revenue" or an F1 score, the accuracy of a 3D model is much harder to convey. On top of that, we are engineers, not clinicians, so we need a human in the loop. You will have to find ways to automate as much as we can, but also create tools and interfaces that allow a human in the loop to ensure quality.
  • We work with large files, which prevents us from using some of the ML frameworks that are really geared towards text and audio.
  • We work with many types of input and output data, and with pipelines whose steps have very different workload characteristics: GPU-heavy, CPU-heavy, or disk-heavy. To get the best performance out of our platform, we need to be creative in how we set this up.

The perks:

  • 💰 Excellent employment terms
  • 🏡 Freedom to work from anywhere you like (and any time you like). We only have a few touch points. This is not just because of Covid: we are a remote company by design, with people working from all over Europe.
  • 👩‍🔬 Dedicated time for hackathons and research, to explore new ideas of your own
  • 🎓 Real training budget for books and conferences or anything else you need to grow.
  • 🚀 Work with the latest technology, at the forefront of the rapidly changing field of medical AI
  • 💪 Loads of responsibility and autonomy, zero bureaucracy and a chance to make a real impact
  • 🏖 Awesome yearly company retreat, and quarterly team events.
  • ⛺️ 25 days of annual leave
  • 💻 Top-notch gear, and even bigger servers to play with
  • 🏄‍♂️ Promaton is funded for many years to come, meaning you can have the impact you only get at a startup, but with the job security of an established company.
  • 🛬 For all international hackers: Promaton is recognized as a visa sponsor by the Dutch government

Our tech stack:

  • Kubernetes on AWS (with EKS)
  • Currently all IaC is written in CloudFormation, but you will help us migrate to either Terraform or CDK
  • All pipelines are written in Python. We also use Rust here and there to get better performance
  • We still have to decide which ML framework we want to deploy: Argo, Kubeflow, etc.

Job requirements

  • Degree in computer science, or equivalent
  • Programmer at heart, preferably in any of these languages: Java, C#, Python, Go, JavaScript
  • 5 years of experience building back-end services and/or infrastructure
  • 2+ years of experience with machine learning infrastructure
  • Your mindset: open-minded, innovative, detail-oriented 
  • Based in a time zone between UTC-1 and UTC+3 (-2 to +2 hours Amsterdam time)
  • Ability to fly in for company events 4 times a year (1 week per quarter)

Bonus points: 

  • Passion for machine learning
  • Previous experience in a regulated environment

Sounds like you? Let's talk!