2026.03.20 I·02 Incident Management: Writing Postmortems and Managing Incidents
Incidents will happen. What matters is how fast you recover and what you learn. This post covers severity levels, incident roles, blameless postmortems, and action items that actually get done.
Incident Management, Postmortem, SRE
2025.09.18 I·16 What Is SRE: Google's Philosophy for Turning Operations into Engineering
Running a service means failures will happen. Reading Google's SRE book made me realize that operations is a high-level engineering problem, not just toil. I walk through how the concepts of SLI, SLO, and Error Budget shift your mindset from firefighter to architect.
SRE, DevOps, Reliability
2025.08.29 F·167 Idempotency: Safely Handling Duplicate Requests
Understanding idempotency concepts and their implementation, drawn from practical experience.
idempotency, api, distributed-systems