As a Senior Site Reliability Engineer at Neptune, you’ll be making sure that production services are always up and running. You’ll be responsible for the reliability and usability of the developer platform (CI/CD). Among your tasks will be driving initiatives to improve the API error rate and latency.
You’ll have a lot of independence and space to test your creative ideas. We are looking for a self-driven, hands-on, proactive, and high-energy person who isn’t afraid to take responsibility for the outcome.
In this role, you will:
- Own and operate platform and storage services like Kubernetes, Kafka, Elasticsearch, MySQL;
- Monitor the infrastructure’s utilization and plan capacity;
- Own the service level health indicators: Service Level Metrics & Service Level Objectives;
- Own the developer platform: CI/CD;
- Own the installation/upgrade process for on-prem deployments;
- Build tools and design processes that help improve observability and system resiliency;
- Establish design patterns for monitoring, benchmarking and deploying new features for the backend services;
- Automate and operationalize engineering tasks – data migrations, capacity changes, etc.
You need to have:
- 5+ years of experience in a Site Reliability Engineer, Backend Engineer or similar role;
- Knowledge of Linux: administration, networking, containerization;
- Experience with and deep understanding of Kubernetes;
- Experience with MySQL and Elasticsearch;
- Good, communicative spoken and written English.
It would be nice if you also had:
- Coding experience in one or more of the following languages: Python, Java, Scala, Bash;
- CI/CD tools expertise;
- Experience with Helm;
- Experience with Kafka;
- Experience with Terraform.
We offer you:
- Flat structure and startup atmosphere;
- The thrill of building a world-class product for some of the smartest people on earth;
- Friendly working environment and a lot of autonomy;
- Opportunity to learn, experiment with ideas, and grow;
- Competitive base salary and opportunity to participate in the Employee Stock Option Plan;
- Flexible working hours and fully remote work if you want;
- Multisport card, medical care and free lunch at the office.
We are a VC-funded and quite extraordinary startup on the European scene. Why? Because we have an ambitious goal: to become a key component of the Machine Learning Operations stack, similar to what GitHub is for software engineers. Currently, we are a team of 26 (and growing). You can take a look at our friendly faces and investors here: https://neptune.ai/about-us