Start here
A short path to build intuition, then get practical.
- TTFB and origin latency: What “time to first byte” really measures and how origin distance and work show up.
- Queueing basics and latency budgets: Why p95/p99 explode near saturation and how to design budgets that survive traffic spikes.
- Performance regressions checklist: A step-by-step workflow to localise and confirm a regression without guessing.
All notes in this topic
- TTFB and origin latency: Breaks down request phases and shows how to separate network, TLS, and origin processing time.
- Cache hierarchy: edge to origin. How multi-layer caches work (browser → CDN → service caches) and how to reason about hit ratios and staleness.
- Queueing basics and latency budgets: A practical view of utilisation, waiting time, tail latency, and what to do when load is spiky.
- Performance regressions checklist: A repeatable investigation path: confirm, isolate, bisect, measure, and mitigate.
Common pitfalls
- Optimising the wrong stage: Shaving milliseconds from code paths that are not on the critical request path.
- Using averages as a decision tool: p95/p99 behaviour is where regressions hide; mean latency can look fine while users suffer.
- Ignoring queueing: Near saturation, small demand increases cause large tail spikes; this is not a “mystery” effect.
- Trusting a single measurement source: Correlate client timings, edge logs, and origin metrics; one view is rarely complete.
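Two of the pitfalls above (averages and queueing) are easy to check with a few lines of arithmetic. The following is a minimal sketch, not taken from the notes themselves. It assumes the textbook M/M/1 queue (a single server, Poisson arrivals, exponential service times, first-come-first-served), which real systems only approximate. The function names and the sample latencies are hypothetical, chosen for illustration.

```python
import math


def mm1_response_time(service_time_ms: float, utilisation: float) -> float:
    """Mean time in system for an M/M/1 queue: S / (1 - rho).

    Assumption: M/M/1 is a simplification; treat results as intuition only.
    """
    if not 0.0 <= utilisation < 1.0:
        raise ValueError("utilisation must be in [0, 1)")
    return service_time_ms / (1.0 - utilisation)


def mm1_percentile(service_time_ms: float, utilisation: float, p: float) -> float:
    """p-th percentile of response time for M/M/1 (FCFS).

    Response time is exponentially distributed in this model, so the
    percentile is the mean multiplied by ln(1 / (1 - p)).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be in (0, 1)")
    mean = mm1_response_time(service_time_ms, utilisation)
    return mean * math.log(1.0 / (1.0 - p))


def nearest_rank_percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile of a non-empty list of latency samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Queueing: the same 10 ms of service time at rising utilisation.
    for rho in (0.5, 0.8, 0.9, 0.95, 0.99):
        mean = mm1_response_time(10.0, rho)
        p99 = mm1_percentile(10.0, rho, 0.99)
        print(f"rho={rho:.2f}  mean={mean:8.1f} ms  p99={p99:8.1f} ms")

    # Averages: hypothetical sample where 5% of requests are very slow.
    samples = [20.0] * 95 + [2000.0] * 5
    print("mean  :", sum(samples) / len(samples))
    print("median:", nearest_rank_percentile(samples, 0.50))
    print("p99   :", nearest_rank_percentile(samples, 0.99))
```

Under these assumptions, halving the remaining headroom (for example, moving from 90% to 95% utilisation) roughly doubles both the mean and the p99, which is why small demand increases near saturation look so dramatic. The second part shows how a mean can sit far from both the typical request and the tail, so it describes neither.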
Related topics
- Formal methods notes: Useful when you want crisp specifications of behaviour and invariants for performance-critical components.
- Distributed systems notes: Failure semantics, retries, and observability change performance outcomes and measurement meaning.