Concept · Agile & Software Development
Site Reliability Engineering (SRE)
Origin: Google / Ben Treynor
Biological Parallel
Immune systems maintain organismal reliability through continuous monitoring, rapid threat response, and memory formation—distinguishing self from non-self at millisecond timescales. SRE is organizational immune function: monitoring detects anomalies, incident response neutralizes threats, postmortems encode immunological memory. The human immune system doesn't prevent all infections; it maintains reliability within acceptable error budgets, exactly as SRE balances innovation velocity against system stability through quantified risk tolerance.