Lab 03 · The First Cache
Run it:
make lab-03
Source: labs/lab-03-first-cache/main.go
The Problem
The reverse proxy in Lab 02 blindly forwards every request to the origin. A cache short-circuits that path: if we’ve seen this URL recently and have a stored response, serve it directly from memory without touching the origin.
The fundamental trade-off: freshness vs. cost. A cached response might be stale, but serving it is:
- Orders of magnitude faster (memory read vs. network round trip)
- Origin-free (no database query, no CPU work)
- Independent of the origin (a hit succeeds even when the origin is slow or down)
The Cache Lifecycle: MISS → HIT → EXPIRED
Request arrives
      │
      ▼
┌─────────────────────────────────────┐
│ Lookup key = normalize(URL)         │
└─────────────────────────────────────┘
      │
      ├─► Entry not found → MISS
      │        │
      │        ▼
      │   Fetch from origin
      │   Store in cache with deadline = now + TTL
      │   Return response to client
      │
      ├─► Entry found, not expired → HIT
      │        │
      │        ▼
      │   Return cached response immediately
      │
      └─► Entry found, expired → EXPIRED (= MISS)
               │
               ▼
          Revalidate or re-fetch
          Replace cache entry
The X-Cache response header tells the client (and debugging engineers)
which branch was taken:
X-Cache: MISS # first request for this URL
X-Cache: HIT # served from cache
Implementation: sync.Map + TTL
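type cacheEntry struct {
	response []byte
	headers  http.Header
	status   int
	expiry   time.Time
}

var cache sync.Map // map[string]*cacheEntry

// get returns the entry for key, or false if it is absent or expired.
func get(key string) (*cacheEntry, bool) {
	v, ok := cache.Load(key)
	if !ok {
		return nil, false
	}
	entry := v.(*cacheEntry)
	if time.Now().After(entry.expiry) {
		// Lazy expiry. CompareAndDelete (Go 1.20+) removes the key only if it
		// still maps to this stale entry, so a fresh entry stored concurrently
		// by another goroutine is not discarded.
		cache.CompareAndDelete(key, v)
		return nil, false
	}
	return entry, true
}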
Why sync.Map? The standard map plus sync.RWMutex would work,
but sync.Map is optimized for a specific workload: many reads, few
writes, stable key set. CDN caches have a hot set of URLs that are read
millions of times per second and written (populated) far less often.
sync.Map achieves this via an atomic “read map” that requires no
locking on reads for existing keys.
However, sync.Map has a known weakness: its internal dirty map can
accumulate entries and requires a periodic promotion step. For very
write-heavy caches (cold start, high churn), a sharded map +
sync.RWMutex pattern can be more efficient.
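For comparison, the sharded map + sync.RWMutex alternative might look like the sketch below. The shard count, the FNV-1a hash, and the `shardedCache` type are illustrative assumptions rather than anything the lab prescribes; the point is only that each key maps to one of several independently locked maps.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

const shardCount = 16 // illustrative; tune to the expected write concurrency

type shard struct {
	mu sync.RWMutex
	m  map[string][]byte
}

// shardedCache spreads keys over independent locks so that writers to
// different shards do not contend with each other.
type shardedCache struct {
	shards [shardCount]*shard
}

func newShardedCache() *shardedCache {
	c := &shardedCache{}
	for i := range c.shards {
		c.shards[i] = &shard{m: make(map[string][]byte)}
	}
	return c
}

// shardFor picks a shard by hashing the key.
func (c *shardedCache) shardFor(key string) *shard {
	h := fnv.New32a()
	h.Write([]byte(key)) // the FNV hash's Write never returns an error
	return c.shards[h.Sum32()%shardCount]
}

func (c *shardedCache) Get(key string) ([]byte, bool) {
	s := c.shardFor(key)
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.m[key]
	return v, ok
}

func (c *shardedCache) Set(key string, val []byte) {
	s := c.shardFor(key)
	s.mu.Lock()
	defer s.mu.Unlock()
	s.m[key] = val
}

func main() {
	c := newShardedCache()
	c.Set("/article/1", []byte("hello"))
	v, ok := c.Get("/article/1")
	fmt.Println(string(v), ok) // hello true
}
```

Readers on the same shard still share its read lock, so this trades sync.Map's lock-free read path for cheaper, more predictable writes.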
TTL: Where Does It Come From?
In Lab 03 the TTL is hardcoded. Lab 04 shows how to parse it properly
from Cache-Control headers:
Cache-Control: public, max-age=300
→ TTL = 300 seconds
Cache-Control: no-store
→ Do not cache at all
Cache-Control: private
→ Do not store in shared (CDN) cache
Cache-Control: no-cache
→ Store but always revalidate before serving
Ignoring Cache-Control is a common cause of CDN misconfiguration.
If you cache a private response, you may serve one user’s data to
another. If you cache no-store, you violate the application’s contract.
Background Sweep: Avoiding Memory Leaks
A cache without eviction grows unboundedly. Lab 03 runs a background goroutine that sweeps expired entries:
go func() {
	ticker := time.NewTicker(30 * time.Second)
	for range ticker.C {
		var expired []string
		cache.Range(func(k, v any) bool {
			if time.Now().After(v.(*cacheEntry).expiry) {
				expired = append(expired, k.(string))
			}
			return true
		})
		for _, k := range expired {
			cache.Delete(k)
		}
	}
}()
Note the two-phase delete: first collect the expired keys, then delete
them. This keeps the Range callback short, but it is not required for
correctness: Range holds no lock, and its callback may call any method on
the map, including Delete.
Production caches use more sophisticated eviction:
| Policy | Description | Use case |
|---|---|---|
| TTL expiry | Remove at expiry | All caches |
| LRU | Evict least-recently-used | Bounded memory (Lab 08) |
| LFU | Evict least-frequently-used | Popularity-skewed workloads |
| ARC | Adaptive Replacement Cache | Self-tuning between LRU and LFU |
| S3-FIFO | Simple, Scalable caching with three Static FIFO queues | Modern alternative to LRU (lower overhead) |
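To make the LRU row concrete, here is a minimal sketch built on `container/list`. It is not the Lab 08 implementation: the `lru` type is hypothetical, capacity is counted in entries rather than bytes, and it is not safe for concurrent use without an external mutex.

```go
package main

import (
	"container/list"
	"fmt"
)

type lruItem struct {
	key   string
	value []byte
}

// lru is a minimal, non-thread-safe LRU bounded by entry count.
type lru struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

func newLRU(capacity int) *lru {
	return &lru{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the value and marks the key as most recently used.
func (c *lru) Get(key string) ([]byte, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*lruItem).value, true
}

// Put inserts or updates a key and evicts the least recently used entry
// when the cache grows past its capacity.
func (c *lru) Put(key string, value []byte) {
	if el, ok := c.items[key]; ok {
		el.Value.(*lruItem).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&lruItem{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*lruItem).key)
	}
}

func main() {
	c := newLRU(2)
	c.Put("/a", []byte("A"))
	c.Put("/b", []byte("B"))
	c.Get("/a")              // touch /a so /b becomes the eviction candidate
	c.Put("/c", []byte("C")) // evicts /b
	_, okA := c.Get("/a")
	_, okB := c.Get("/b")
	fmt.Println(okA, okB) // true false
}
```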
The Deliberate Limitations of Lab 03
The lab explicitly documents what it doesn’t do yet:
- No Cache-Control parsing: TTL is hardcoded. Fixed in Lab 04.
- No singleflight: concurrent misses all hammer origin. Fixed in Lab 06.
- Unbounded memory: LRU eviction arrives in Lab 08.
- No content negotiation: the same key serves Accept-Encoding: gzip and Accept-Encoding: br. Fixed in Lab 05 via Vary.
- No conditional requests: always fetches full response, no 304. Fixed in Lab 04.
This incremental approach is pedagogically important: each lab adds exactly one concept so the interaction is clear.
Production Detail: Cache Serialization Format
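Real CDN disk caches store responses in compact binary formats. Varnish manages its own storage backends, and Nginx cache files begin with a binary header followed by the response. A simplified layout of this kind (illustrative, not Nginx's exact on-disk header):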
[8 bytes: key hash]
[8 bytes: expiry timestamp]
[4 bytes: headers length]
[4 bytes: body length]
[headers (HTTP/1.1 text)]
[body bytes]
Lab 08 uses file-system storage with xxhash-named files, which is functionally equivalent but less efficient (filesystem metadata overhead).
For in-memory caches, Google’s Groupcache and Fastly’s own cache daemon use Protocol Buffers for serialization, enabling:
- Zero-copy responses via io.WriterTo
- Shared memory between processes
- Binary compatibility across versions
What to Measure
# Hit ratio (requests)
sum(rate(cache_hits_total[5m])) /
sum(rate(cache_requests_total[5m]))
# Miss rate (triggers origin fetches)
rate(cache_misses_total[5m])
# Cache entries currently stored
cache_entries_current
# Evictions (if bounded cache)
rate(cache_evictions_total[5m])
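The counters behind these queries can be kept with plain atomics. The sketch below is an assumption about how they might be maintained, not the lab's instrumentation: it uses `atomic.Int64` (Go 1.19+) instead of a metrics client library, the `metrics` type is hypothetical, and `hitRatio` is a lifetime ratio rather than the 5-minute windowed ratio the first query computes.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"sync/atomic"
)

// metrics holds the two counters from which the request total is derived.
type metrics struct {
	hits   atomic.Int64
	misses atomic.Int64
}

// record counts one request as either a hit or a miss.
func (m *metrics) record(hit bool) {
	if hit {
		m.hits.Add(1)
	} else {
		m.misses.Add(1)
	}
}

// hitRatio returns hits / (hits + misses) since process start, or 0 when
// no requests have been recorded.
func (m *metrics) hitRatio() float64 {
	h, mi := m.hits.Load(), m.misses.Load()
	if h+mi == 0 {
		return 0
	}
	return float64(h) / float64(h+mi)
}

// writeText emits the counters as plain "name value" lines.
func (m *metrics) writeText(w io.Writer) {
	h, mi := m.hits.Load(), m.misses.Load()
	fmt.Fprintf(w, "cache_hits_total %d\n", h)
	fmt.Fprintf(w, "cache_misses_total %d\n", mi)
	fmt.Fprintf(w, "cache_requests_total %d\n", h+mi)
}

func main() {
	var m metrics
	m.record(false) // first request: MISS
	m.record(true)
	m.record(true)
	m.record(true)
	m.writeText(os.Stdout)
	fmt.Println(m.hitRatio()) // 0.75
}
```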
Try It
make lab-03
# First request — should be MISS
curl -si http://localhost:8080/article/1 | grep -i x-cache
# Second request — should be HIT (< 1ms)
curl -si http://localhost:8080/article/1 | grep -i x-cache
# X-Origin-Hit should only increment on first request
curl http://localhost:8080/article/1 -H "X-Debug: origin-count"