Senior Site Reliability Engineer

Remote • Amsterdam, Noord-Holland, Netherlands
Science and Engineering

Job description


Promaton is changing the dental healthcare landscape by automating treatment planning workflows using AI, making healthcare more affordable and accessible for everyone. We are on a mission to eliminate errors in dentistry by improving diagnostic accuracy and automating treatment planning workflows. See our company page to learn more about what we do.

Our team’s mission is to ensure that (1) our AI can be accessed efficiently and effectively by thousands of customers worldwide and (2) our internal Product teams have the best experience when developing new features.

We are looking for a highly motivated and experienced Site Reliability Engineer (SRE) to join our team. The ideal candidate will have a strong background in software engineering and IT operations and will be responsible for ensuring the reliability, scalability, and performance of our production systems.

Do you want to join us in our journey to improve the lives of patients?

You will:

  • Be in charge of our products’ reliability
    • Design, implement, and maintain the monitoring and alerting systems that ensure system availability and performance.

    • Manage the system’s capacity, load balancing, and performance to anticipate and mitigate problems before they occur.

    • Define and measure our Service Level Agreements (SLAs), Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets.

    • Collaborate with the development teams to improve the reliability and scalability of applications.

  • Promote a DevOps philosophy inside the engineering teams
    • Conduct post-mortem analyses of incidents and develop action plans to prevent future issues.

    • Bring an Incident Commander mentality to handling incidents and on-call rotas.

    • Participate in the planning and execution of software deployments.

  • Improve our stack
    • Automate repetitive operational tasks using tools and scripts.

    • Keep the tech stack up to date and help other teams do the same.

Our tech stack:

Docker | Kubernetes | AWS | Grafana | Prometheus | GitHub & GitHub Actions | TypeScript | Node.js | Express | PostgreSQL | Metabase | OpenAPI | Python | PyTorch | TensorFlow | ArgoCD & Argo Workflows | ClearML | Packer

Our whole stack runs on AWS using EKS, and we deploy our infrastructure changes in a GitOps pipeline using CDK. Our applications are deployed in a GitOps fashion using ArgoCD.

Our backend is mostly written in TypeScript and Python, and all our machine-learning applications are in Python. We have an efficient and effective design and development process around RFCs, PR reviews, and pair programming.

The perks of working at Promaton:

🎈 Inclusive environment: we value and celebrate diversity.

🏡 Excellent work/life balance. Freedom to work from home or anywhere you like (and any time you like). We only have a few touchpoints.

💪 Loads of responsibility and autonomy (we stay away from micromanagement) and a chance to make a real impact.

👩‍🔬 Dedicated time for hackathons and growth to explore new ideas of your own. Every quarter, we have a hackathon week where you can work on anything you like to expand your skill set!

🎓 Real training budget for books, conferences, or anything else you need to grow.

💰 Attractive salary package and excellent employment terms.

🚀 Work with the latest technology at the forefront of a rapidly developing field in medical imaging AI.

🏖 Awesome yearly company retreat and quarterly team events.

💻 Top-notch gear and even bigger servers to play with.

🏄‍♂️ Promaton is funded for many years to come, meaning you can have the impact you only get at a startup but with the job security of an established company.

🛬 For international engineers who have already relocated to the Netherlands, we are able to offer visa sponsorship.

Job requirements

  • Bachelor’s degree in Computer Science, Software Engineering, or a related field.

  • Over 5 years of proven experience in software engineering, with at least 3 years in a similar role.

  • Experience with configuration management tools like Ansible, Puppet, or Chef.

  • Experience with IaC (Infrastructure as Code) tools such as AWS CDK, Pulumi, or Terraform.

  • Familiarity with monitoring systems like Prometheus, Grafana, Nagios, or ELK Stack.

  • Knowledge of operating system administration (Linux/Unix) and networking.

  • Experience with containerization and orchestration (Docker, Kubernetes).

  • Knowledge of SQL and NoSQL databases (MySQL, PostgreSQL, MongoDB, Cassandra).

  • Experience with cloud service providers (AWS, Google Cloud, Azure).

  • Advanced knowledge of programming languages such as Python, Go, or Java.

  • Excellent problem-solving and critical thinking skills.

  • Strong communication skills and ability to work in a team.

  • Ability to manage multiple projects and priorities effectively.

  • Based in Europe, in a time zone between UTC-1 and UTC+3 (within 2 hours of Amsterdam time), and willing to join our company events four times a year.

Bonus points:

  • Experience in medical AI or similar regulated fields.

Sounds like you? Let's talk!


Good to know:

  • It goes without saying that we love the power of AI, but we believe the human touch is irreplaceable in recruitment. We are looking forward to your personalized answers to our screening questions, not ChatGPT's insights!

  • Wondering if you should apply when your experience doesn't fit all of the job requirements? In general, we are aiming for an 80% match, so please go ahead if you got excited by the role and by the idea of joining our team! Not the right role? You can still send us an open application!

  • Read our blog post "How to be successful in our selection process" for more tips and tricks!
