Use this Site Reliability Engineer job description to advertise your vacancies and find qualified candidates. Feel free to modify responsibilities and requirements based on your needs.
Site Reliability Engineer responsibilities include:
- Working on-call shifts to resolve incidents and prevent them from recurring
- Running our infrastructure with Chef, Ansible, Terraform, GitLab CI/CD, and Kubernetes
- Building monitoring that alerts on symptoms rather than on outages
We are looking for a Site Reliability Engineer to join our team and develop software systems and automated solutions that support our organization's operations.
Site Reliability Engineer responsibilities include monitoring computer systems and building alerts for the operational issues those systems can experience.
Ultimately, you will work with our IT team to ensure our organization can continue to deliver products and services in our computer system environment.
Responsibilities
- Administer production jobs
- Understand debugging info
- “Drain” traffic away from a cluster
- Roll back a bad software push
- Block or rate-limit unwanted traffic
- Bring up additional serving capacity
- Use the monitoring systems (for alerting and dashboards)
Requirements and skills
- Proven work experience as a Site Reliability Engineer or similar role
- Ability to collaborate and communicate asynchronously
- A habit of documenting everything so you don’t need to learn the same thing twice
- An enthusiastic, go-for-it attitude
- Relevant training and/or certifications as a Site Reliability Engineer
Frequently asked questions
What does a Site Reliability Engineer do?
A Site Reliability Engineer (SRE) creates a bridge between development and IT operations by taking on the tasks typically done by operations.
What are the duties and responsibilities of a Site Reliability Engineer?
A Site Reliability Engineer has many responsibilities, including improving computer systems in an organization to help the IT department with emergency response and capacity planning.
What makes a good Site Reliability Engineer?
A good Site Reliability Engineer must have excellent leadership and communication skills, as they need to work with various IT professionals in an organization to ensure its computer systems run as efficiently as possible.
Who does a Site Reliability Engineer work with?
A Site Reliability Engineer will work with many professionals, such as IT Managers, to ensure an organization’s computer systems work as efficiently as possible.