Lab 01 · The Origin Server
Run it:
make lab-01
Source: labs/lab-01-origin-server/main.go
The Problem
Before you can understand what a CDN does, you need to understand what it protects. The origin server is the authoritative source of content — the thing that actually knows what the response should be. It might be:
- A Go/Python/Rails application querying a PostgreSQL database
- An S3 bucket serving static files
- A legacy monolith that someone is afraid to touch
- A media encoder writing MPEG-TS segments in real time
The origin’s fundamental problem is latency × concurrency. Every request pays the full cost of whatever work the origin must do: database queries, template rendering, business logic, external API calls.
The math
At 80 ms average latency per request, handling 1,000 requests/second requires 80 simultaneously active goroutines just to keep up. That’s 80 database connections, 80 in-flight external calls, 80 units of CPU work happening at once. At 500 rps it becomes 40. These numbers sound manageable until traffic spikes 10×.
Now imagine the home page of a news site during a breaking story, with 50,000 users hitting Refresh every second. At 80 ms latency, that is 50,000 × 0.08 = 4,000 requests in flight at the origin at once. No origin handles that gracefully, but a CDN can serve all 50,000 from a single cached response stored at the edge.
What This Lab Shows
Lab 01 is intentionally simple: just the origin, no proxy, no cache.
User → Origin (:9001)
Every request pays the full --latency cost (default: 80 ms). You can
see this directly in the output — 12 sequential requests each taking ~80 ms,
for ~960 ms total.
The key observable: X-Origin-Hit increments for every single request.
When you add a cache in Lab 03, you’ll see this counter stop growing after
the first few requests.
The Origin Server Contract
A well-behaved origin sets these headers:
| Header | Purpose |
|---|---|
| Cache-Control: public, max-age=N | Tells CDN: cache for N seconds |
| Cache-Control: private | Tells CDN: don’t cache (user-specific) |
| Cache-Control: no-store | Never cache anywhere |
| ETag: "abc123" | Content fingerprint for conditional requests |
| Vary: Accept-Encoding | Different response per encoding |
| X-Served-By: origin | Debug header: which tier served this |
The lab origin sets Cache-Control: public, max-age=30 — correct for
publicly cacheable content. Labs 04–05 build on this contract in depth.
Production Detail: Origin Capacity Planning
CDN engineers think about origin capacity as the residual load after the CDN absorbs its share. If your CDN achieves a 90% hit ratio and you expect 10,000 req/s peak traffic:
Origin load = 10,000 × (1 - 0.90) = 1,000 req/s
Capacity-plan your origin for this number, not the full 10,000. But factor in cold-start scenarios: after a deploy, a CDN cache flush, or a network partition that invalidates a large fraction of the cache at once. With a 90% hit ratio, a completely cold cache sends the full 10,000 req/s to the origin, so it must survive a sudden 10× spike above its steady-state CDN-assisted load.
This is why Cloudflare, Fastly, and AWS CloudFront all have “origin overload protection” features (origin shield, request collapsing, retries with circuit breakers), which labs 06 and 13 cover.
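The residual-load formula and the cold-cache spike can be checked with a few lines of Go. The names `originLoad` and `coldStartMultiplier` are invented for this sketch. It also assumes the worst case of a fully empty cache, where every request is a miss.

```go
package main

import "fmt"

// originLoad returns the request rate that reaches the origin when the CDN
// serves hitRatio (between 0 and 1) of peak traffic from cache.
func originLoad(peakRPS, hitRatio float64) float64 {
	return peakRPS * (1 - hitRatio)
}

// coldStartMultiplier is how much origin load grows if the cache empties
// completely and every request becomes a miss: 1 / (1 - hitRatio).
// It is undefined (infinite) for hitRatio == 1.
func coldStartMultiplier(hitRatio float64) float64 {
	return 1 / (1 - hitRatio)
}

func main() {
	const peak = 10000.0
	for _, h := range []float64{0.90, 0.95, 0.99} {
		fmt.Printf("hit ratio %.2f: steady-state %.0f req/s, cold cache %.0fx\n",
			h, originLoad(peak, h), coldStartMultiplier(h))
	}
}
```

The multiplier grows quickly as the hit ratio improves: the better the CDN performs in steady state, the more a cold cache hurts an origin sized only for residual load.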
Failure Modes
| Failure | Symptom | Fix |
|---|---|---|
| Origin latency spike | All edge responses slow | Stale-while-revalidate (lab 07) |
| Origin error rate spike | 502/503 from CDN | Stale-if-error (lab 07) |
| Origin cold start | High latency on deploy | Warm cache before cutover |
| DDoS bypass | Attacker hits origin IP directly | IP allowlist: CDN IPs only |
Security note: Always configure your origin to accept connections only from CDN IP ranges. If attackers discover your origin IP, they can bypass the CDN entirely and DDoS the origin directly. All major CDNs publish their IP ranges (Cloudflare: https://cloudflare.com/ips).
What to Measure
# Origin request rate (should stay low and stable)
rate(origin_requests_total[1m])
# Origin p99 latency (your SLA baseline)
histogram_quantile(0.99, rate(origin_response_duration_seconds_bucket[5m]))
# Origin error rate (alert at >0.1%)
rate(origin_errors_total[5m]) / rate(origin_requests_total[5m])
Try It
make lab-01
# In another terminal:
curl http://localhost:9001/article/1 -v
# With higher latency:
go run ./labs/lab-01-origin-server/ --latency 200ms --requests 5
# With errors:
go run ./labs/lab-01-origin-server/ --error-rate 0.3
Watch X-Origin-Hit increment with every single request. When you reach
Lab 03 and add a cache, you’ll see it stop.