Site Reliability Engineering Practitioner

Course ID: TGS-2024050785
Course Duration: 3 days, 9am-6pm

The SRE (Site Reliability Engineering) Practitioner course introduces ways to scale services economically and reliably across an organisation. It explores strategies to improve agility, cross-functional collaboration and transparency of service health, with the aim of building resiliency through design, automation and closed-loop remediation.

The course uses real-life scenarios and case stories to equip participants with the practices, methods and tools to engage everyone across the organisation who is involved in reliability. Upon completion, participants will have tangible takeaways to apply back in the office, such as implementing SRE models that fit their organisational context, building advanced observability in distributed systems, building resiliency by design and responding to incidents effectively using SRE practices.

The course was developed by drawing on key SRE sources, engaging with thought leaders in the SRE space and working with organisations embracing SRE to extract real-life best practices. It is designed to teach the key principles and practices necessary for starting SRE adoption.

  • SRE Anti-patterns
    ○ SRE in a distributed ecosystem
    ○ Avoiding SRE anti-patterns
  • SLO is a proxy for customer happiness
    ○ What has changed with SLOs?
    ○ SLIs and system boundaries
    ○ Error Budgets, velocity and risk
  • Building secure and reliable systems
    ○ Non-Abstract Large Scale Design
    ○ Fault-tolerant designs
    ○ Designing for security, resiliency, scalability and changing landscapes
  • Full-stack observability
    ○ Pillars of Observability
    ○ Observability MELT
    ○ Using OpenTelemetry
  • Platform Engineering and AIOps
    ○ Platform-centric approaches
    ○ Using DataOps and AIOps to improve resiliency
    ○ AIOps Simple Recipe
  • SRE & Incident Response Management
    ○ Incident Command Framework
    ○ OODA Loop
    ○ SRE and closed-loop remediation
    ○ AI/ML and Swarming for better incident management
  • Chaos Engineering
    ○ Chaos Engineering Defined
    ○ Myths of Chaos
    ○ Chaos Engineering Experiments and Resources
    ○ Game Day Basics and Exercises
  • SRE is the purest form of DevOps
    ○ Key Principles of SRE
    ○ Metrics for Success
    ○ SRE Execution Models
    ○ Culture and behavioural skills
    ○ Transformations and SRE

By completing this course, participants will achieve the following Learning Outcomes (LO):

  • LO1: Curate information for user guides and training materials of infrastructure administrative activities to meet Service Level Objectives.
  • LO2: Manage infrastructure configuration and support activities for secure and reliable systems.
  • LO3: Diagnose underlying technical problems causing disruptions guided by Observability.
  • LO4: Create plans for infrastructure upgrades and propose improvements based on user needs.
  • LO5: Manage technical issues within an agreed timeframe utilising Site Reliability Engineering, Incident Response Management and Problem Management.
  • LO6: Implement tests of infrastructure systems to evaluate the impact of potential upgrades and updates using Chaos Engineering.

The target audience for this course is professionals including Site Reliability Engineers, IT Operations, Business Managers and Stakeholders, Change Agents, Consultants, DevOps Practitioners, IT Directors/Managers/Team Leaders, Product Owners, Scrum Masters, Software Engineers, System Integrators and Tool Providers.
 
Recommended Learner Profile:
  • Language and literacy proficiency level: Minimum of 3 GCE ‘O’ Level passes, including English, or WPL Level 5.
  • Required years of experience in relevant domain: Minimum of 1 year of working experience.

Candidates who attend the course will be better positioned to successfully complete the SRE Practitioner certification examination. The SRE Foundation certification is a prerequisite to attempt the SRE Practitioner examination.

About the examination:

  • Exam duration: 60 minutes.
  • Exam format: 40 questions, multiple choice, proctored, open-book.
  • Passing score: 26 out of 40 marks (65%).

All exams will be conducted online. The exams are web-proctored by PeopleCert and are available 24/7. Successful participants earn the SRE Practitioner certification.

Certificate of Attendance from Sapience Consulting:
Upon meeting at least 75% attendance and passing the assessment(s), participants will receive a Certificate of Attendance from Sapience Consulting.

The following information is relevant for candidates who are seeking SSG funding support for the course:

  • Assessments
    Candidates must pass all prescribed tests/assessments and attain 100% competency to be eligible for funding support.
    Mode of Assessment: Written Assessment, Case Study Assessment.
  • Statement of Attainment (SOA) from SkillsFuture Singapore:
    After passing the assessment(s), you’ll receive a SkillsFuture Singapore Statement of Attainment (SOA) certifying that you have achieved the following Competency Standard(s):
    ICT-OUS-3007-1.1 – Infrastructure Support-3.

We offer flexible learning options (online, instructor-led, hybrid) to fit your learning style.

Funding Available

SSG Funding

Terms and conditions apply. Please visit our SkillsFuture Singapore (SSG) Funding page for full details.

PSEA Funding

Terms and conditions apply. Please visit our PSEA page for full details.

SkillsFuture Credit

Terms and conditions apply. Please visit our SkillsFuture Credit page for full details.

Supported by UTAP

NTUC members can use the Union Training Assistance Programme (UTAP) to partially cover the cost of their training. Visit our UTAP page for more info.

Why Us?

Complimentary refresher

Participants can attend a complimentary refresher if they wish (valid for 1 year and subject to approval).

Post Course Advisory Support

Should you have questions about the course material after the course, you may contact the trainer for assistance.

E-learning Portal Access

1-year access to our E-learning portal, including:
- E-books available for download
- Official sample exam
- Randomised quizzes formulated by Sapience trainers based on past examinations

ITIL® and the Swirl logo are registered trademarks of the PeopleCert group. Used under licence from PeopleCert. All rights reserved.