Today’s organizations deal with a higher volume of change in a more complex technology environment, which raises the risk of outages and incidents. IT teams must improve service reliability and system resiliency. With automation and observability becoming key factors for faster, more efficient deployments, SRE has become one of the fastest-growing enterprise roles and sets of operational practices for managing services at scale.
To maintain the highest-quality learning for our community, DevOps Institute Certifications expire two years from the date of completion. Members can maintain their certification by participating in the Continuing Education Program and earning Continuing Education Units through learning opportunities.
Course Objectives
At the end of the course, the following learning objectives are expected to be
achieved:
- A practical view of how to successfully implement a flourishing SRE culture in your organization.
- The underlying principles of SRE, an understanding of what it is not in terms of anti-patterns, and how to recognize and avoid those anti-patterns.
- The organizational impact of introducing SRE.
- Acing the art of SLIs and SLOs in a distributed ecosystem and extending the usage of Error Budgets beyond the normal to innovate and avoid risks.
- Building security and resilience by design in a distributed, zero-trust environment.
- How do you implement full-stack observability and distributed tracing, and bring about an observability-driven development culture?
- Curating data using AI to move from reactive to proactive and predictive incident management. Also, how do you use DataOps to build clean data lineage?
- Why is Platform Engineering so important in building consistency and predictability of SRE culture?
- Implementing practical Chaos Engineering.
- Major incident response responsibilities for an SRE based on an incident command framework, and examples of the anatomy of unmanaged incidents.
- The perspective of why SRE can be considered the purest implementation of DevOps.
- SRE Execution model
- Understanding the SRE role and understanding why reliability is everyone’s problem.
- SRE success story learnings
Learner Materials
- Twenty-four (24) hours of instructor-led training and exercise facilitation
- Learner Manual (excellent post-class reference)
- Participation in unique exercises designed to apply concepts
- Sample documents, templates, tools, and techniques
- Access to additional value-added resources and communities
Prerequisites
It is highly recommended that learners attend the SRE Foundation course with an
accredited DevOps Institute Education Partner and earn the SRE Foundation
certification prior to attending the SRE Practitioner course and exam. Familiarity with
common SRE terminology, concepts, and principles, along with related work
experience, is recommended.
Certification Exam
Passing the 90-minute examination (40 multiple-choice questions, 65% passing score)
leads to the SRE Practitioner certificate. The certification is governed and
maintained by DevOps Institute.