What are SLO, SLA, and SLI?
If you’ve ever dived into SRE (Site Reliability Engineering) or service monitoring, you’ve likely seen three mysterious abbreviations: SLO, SLA, and SLI. They might look similar, but they play different roles in how we measure and guarantee reliability.
SLI — Service Level Indicator
An SLI is a quantitative measurement of one aspect of your service's behavior, ideally as your users experience it.
Examples:
- Latency (e.g., “response time of API requests”)
- Availability (e.g., “percentage of successful requests”)
- Error rate (e.g., “percentage of requests that return a 5xx response”)
👉 Think of SLI as a thermometer — it measures the state of your system.
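To make the three example SLIs concrete, here is a minimal sketch in Python that computes them from a list of request records. The Request class, the function names, and the sample data are hypothetical and exist only for illustration. The sketch also assumes that "successful" means "did not return a 5xx status", which is one common convention rather than the only one.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record, used only for this illustration."""
    status: int        # HTTP status code
    latency_ms: float  # response time in milliseconds


def availability(requests):
    """Percentage of requests that did not fail with a 5xx status.

    Assumes a non-empty list of requests.
    """
    ok = sum(1 for r in requests if r.status < 500)
    return 100.0 * ok / len(requests)


def error_rate(requests):
    """Percentage of requests that returned a 5xx status."""
    bad = sum(1 for r in requests if r.status >= 500)
    return 100.0 * bad / len(requests)


def latency_percentile(requests, pct):
    """Latency at the given percentile, using the nearest-rank method."""
    ordered = sorted(r.latency_ms for r in requests)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank, 1) - 1]


# Made-up sample data: three successful requests and one server error.
sample = [
    Request(200, 120),
    Request(200, 250),
    Request(500, 900),
    Request(200, 180),
]

print(availability(sample))            # 75.0
print(error_rate(sample))              # 25.0
print(latency_percentile(sample, 50))  # 180
```

In a real system these numbers usually come from a monitoring stack rather than hand-written loops, but the arithmetic is the same.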
SLO — Service Level Objective
An SLO is a target you set for an SLI, measured over a defined time window.
Examples:
- “95% of requests should respond in < 300ms”
- “99.9% uptime per month”
👉 SLO is your goal. It tells your team what “good enough” means.
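The two example objectives can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the 30-day month is an assumption. It shows how much downtime a 99.9% monthly uptime target leaves you, and how to test a latency objective against measured response times.

```python
def allowed_downtime_minutes(target_percent, days=30):
    """Downtime a given uptime target permits, assuming a month of `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - target_percent) / 100


def percent_within(latencies_ms, threshold_ms):
    """Percentage of requests that responded faster than threshold_ms.

    Assumes a non-empty list of latencies.
    """
    fast = sum(1 for latency in latencies_ms if latency < threshold_ms)
    return 100.0 * fast / len(latencies_ms)


def meets_slo(sli_percent, target_percent):
    """True when the measured SLI reaches the target."""
    return sli_percent >= target_percent


# 99.9% uptime over 30 days leaves roughly 43.2 minutes of downtime.
print(round(allowed_downtime_minutes(99.9), 1))  # 43.2

# Made-up latencies: 3 of 4 requests finish in under 300 ms.
measured = percent_within([100, 200, 250, 400], 300)
print(measured)                 # 75.0
print(meets_slo(measured, 95))  # False
```

That allowed downtime is often called the error budget: once it is spent, the objective for that month has been missed.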
SLA — Service Level Agreement
This is a formal contract (usually with customers) that states the service level you commit to and what happens if you fall short of it.
Examples:
- “If uptime is less than 99.9%, we refund 10% of the monthly fee.”
- “If latency exceeds 500ms for more than 1% of requests, the customer gets service credits.”
👉 SLA is the promise — and sometimes the penalty if you fail.
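The first example clause can be written as a small function. This is a sketch under stated assumptions: the function name is hypothetical, the 99.9% threshold and 10% refund come from the example above, and the fee is made up. Real SLAs often use tiered credits and spell out exactly how uptime is measured.

```python
def sla_refund(measured_uptime_percent, monthly_fee,
               threshold_percent=99.9, refund_percent=10):
    """Refund owed for the month under the example clause.

    Returns refund_percent of the fee if uptime fell below the
    threshold, otherwise 0.0.
    """
    if measured_uptime_percent < threshold_percent:
        return monthly_fee * refund_percent / 100
    return 0.0


# Made-up monthly fee of 200.0 in any currency.
print(sla_refund(99.8, 200.0))   # 20.0
print(sla_refund(99.95, 200.0))  # 0.0
```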
Putting It All Together
- SLI — the measurement (“uptime = 99.8%”).
- SLO — the objective (“uptime should be ≥ 99.9%”).
- SLA — the agreement (“refund if uptime < 99.9%”).
Real-Life Analogy
Imagine you order pizza delivery:
- SLI: The actual delivery time (e.g., 35 minutes).
- SLO: The restaurant’s internal goal (e.g., “deliver within 30 minutes”).
- SLA: The customer contract (e.g., “if we’re late, you get the pizza for free”).
Conclusion
SLIs, SLOs, and SLAs help organizations align reliability with user expectations.
- SLIs measure reality,
- SLOs set targets,
- SLAs define accountability.
Together, they are the backbone of reliability engineering.