Lab 13 · Origin Shield

Run it: make lab-13
Source: labs/lab-13-origin-shield/main.go


The Problem

Consider a CDN with 200 PoPs worldwide, each with an independent cache. In steady state your origin copes comfortably, absorbing about 100 req/s of cache misses.

Then a video goes viral. Every PoP misses on the same URL at the same moment, and each one fetches it independently: 200 PoPs × 1 miss each = 200 simultaneous origin requests for the same object. The origin collapses.

Even with singleflight within a single PoP (Lab 06), there’s no deduplication across PoPs. Each PoP independently decides to fetch from the origin.


Origin Shield: A Designated Parent PoP

The solution: designate one PoP as the shield (or parent PoP). All 200 edge PoPs forward their misses to the shield instead of to the origin. The shield may have the cached copy; if not, it fetches from origin once and serves all 200 edge misses from that single fetch.

200 Edge PoPs (all miss simultaneously)
    │
    ├── NYC:  forward to shield
    ├── LHR:  forward to shield
    ├── NRT:  forward to shield
    │   ...
    └── SYD:  forward to shield
             │
             ▼
        Shield PoP (e.g. IAD)
             │
             ├── Shield HIT → serve all 200 edges
             │
             └── Shield MISS → 1 origin request
                      │
                      ▼
                   Origin

Result: 200 origin requests → 1 origin request.

The shield applies singleflight itself: even if 200 edge requests arrive within a millisecond, the shield collapses them into a single upstream fetch. Combined with shield-level caching, a single shield PoP means the origin sees at most 1 request per object per TTL period, regardless of how many edge PoPs the CDN has.
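The collapse described above can be sketched in one process. This is a minimal simulation, not the lab's code: the `tier` type is a simplified, standard-library stand-in for a cache plus a singleflight group (such as `golang.org/x/sync/singleflight`), and `simulate`, the 10 ms origin delay, and the URL are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// call tracks one in-flight upstream fetch that other callers can wait on.
type call struct {
	done chan struct{}
	val  string
}

// tier is a cache with request collapsing in front of an upstream fetch.
type tier struct {
	mu       sync.Mutex
	cache    map[string]string
	inflight map[string]*call
	upstream func(key string) string
}

func newTier(upstream func(string) string) *tier {
	return &tier{
		cache:    make(map[string]string),
		inflight: make(map[string]*call),
		upstream: upstream,
	}
}

// Get returns the cached value, joins an in-flight fetch, or starts a new one.
func (t *tier) Get(key string) string {
	t.mu.Lock()
	if v, ok := t.cache[key]; ok {
		t.mu.Unlock()
		return v
	}
	if c, ok := t.inflight[key]; ok {
		t.mu.Unlock()
		<-c.done // wait for the leader's fetch to finish
		return c.val
	}
	c := &call{done: make(chan struct{})}
	t.inflight[key] = c
	t.mu.Unlock()

	c.val = t.upstream(key) // only the leader goes upstream

	t.mu.Lock()
	t.cache[key] = c.val
	delete(t.inflight, key)
	t.mu.Unlock()
	close(c.done)
	return c.val
}

// simulate sends one concurrent miss from each edge PoP and returns the
// number of requests that reached the origin.
func simulate(edges int, useShield bool) int64 {
	var originHits int64
	origin := func(key string) string {
		atomic.AddInt64(&originHits, 1)
		time.Sleep(10 * time.Millisecond) // pretend the origin is slow
		return "body of " + key
	}
	upstream := origin
	if useShield {
		upstream = newTier(origin).Get // one shared shield in front of the origin
	}
	var wg sync.WaitGroup
	for i := 0; i < edges; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			newTier(upstream).Get("/video/viral.mp4") // each PoP has its own cache
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&originHits)
}

func main() {
	fmt.Println("without shield:", simulate(200, false))
	fmt.Println("with shield:", simulate(200, true))
}
```

Without the shield, each of the 200 edge caches fetches on its own, so the origin counter reaches 200. With the shield, the first arrival becomes the leader and every other edge either waits on that fetch or reads the cached copy, so the counter stays at 1.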


Vendor Implementations

Vendor      Shield name                      Designation
Fastly      Shielding / POP-to-POP           Any PoP can be shield
CloudFront  Origin Shield                    Single regional shield
Cloudflare  Tiered Cache                     Smart Tiering (auto)
Akamai      SureRoute / Tiered Distribution  Hierarchical

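Fastly Shielding lets you designate any PoP as the shield, normally the one with the lowest latency to your origin. The shield is chosen per origin within a service, and the resulting routing logic looks roughly like the simplified VCL below (the `shield:IAD` backend is a placeholder for the shield backend, not literal syntax):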

sub vcl_recv {
    if (req.backend == F_origin && !req.http.Fastly-FF) {
        set req.backend = shield:IAD;   # route through IAD shield
    }
}

CloudFront Origin Shield is a dedicated regional caching tier between the edge PoPs and your origin. You enable it per origin in the distribution config:

{
  "OriginShield": {
    "Enabled": true,
    "OriginShieldRegion": "us-east-1"
  }
}

CloudFront charges a per-request fee for Origin Shield traffic, in the range of $0.0050–0.0087 per 10,000 requests depending on region (check current pricing). That is still far cheaper than paying for origin infrastructure sized to handle unshielded traffic.


Implementation

Three-Tier Architecture

Client → Edge (:8080, :8081) → Shield (:8082) → Origin (:9001)

Each tier is a separate Go process. The edge nodes use consistent hashing (Lab 12) to select which shield node handles each URL, and singleflight to collapse concurrent same-key requests within the edge:

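// At the edge, for a cache miss:
result, err, shared := sfGroup.Do(cacheKey, func() (interface{}, error) {
    return fetchFromShield(cacheKey)
})
if err != nil {
    return nil, err // assumes the enclosing function returns an error; never cache a failed fetch
}
_ = shared // true when this result was also handed to other waiting callers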

The shield does the same before forwarding to origin:

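// At the shield, for a cache miss:
result, err, _ := sfGroup.Do(cacheKey, func() (interface{}, error) {
    return fetchFromOrigin(cacheKey)
})
if err != nil {
    return nil, err // assumes the enclosing function returns an error; origin failures are not cached
}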
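To see how the tiers compose, here is a small in-process sketch that chains `http.Handler` values instead of running separate processes. It is an illustration under assumptions, not the lab's `main.go`: the `cacheTier` type, the `newLab` helper, and the `X-Tier-Trace` header name are hypothetical, and singleflight is left out to keep the focus on the tier wiring.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// traceHeader is a hypothetical header used to report the path a request
// took through the tiers.
const traceHeader = "X-Tier-Trace"

// cacheTier caches response bodies by URL path and forwards misses upstream.
type cacheTier struct {
	name     string
	upstream http.Handler
	mu       sync.Mutex
	cache    map[string]string
}

func newCacheTier(name string, upstream http.Handler) *cacheTier {
	return &cacheTier{name: name, upstream: upstream, cache: map[string]string{}}
}

func (c *cacheTier) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	c.mu.Lock()
	body, hit := c.cache[r.URL.Path]
	c.mu.Unlock()
	if hit {
		w.Header().Set(traceHeader, c.name+" HIT")
		fmt.Fprint(w, body)
		return
	}
	// Miss: call the next tier in-process and capture its response.
	rec := httptest.NewRecorder()
	c.upstream.ServeHTTP(rec, r)
	if rec.Code != http.StatusOK {
		http.Error(w, "upstream error", http.StatusBadGateway) // never cache failures
		return
	}
	body = rec.Body.String()
	c.mu.Lock()
	c.cache[r.URL.Path] = body
	c.mu.Unlock()
	w.Header().Set(traceHeader, c.name+" MISS -> "+rec.Header().Get(traceHeader))
	fmt.Fprint(w, body)
}

// originHits counts requests that reach the origin handler.
var originHits int

func origin(w http.ResponseWriter, r *http.Request) {
	originHits++
	w.Header().Set(traceHeader, "Origin")
	fmt.Fprint(w, "content for "+r.URL.Path)
}

// newLab wires two edges to one shared shield in front of the origin.
func newLab() (edge1, edge2 http.Handler) {
	shield := newCacheTier("Shield", http.HandlerFunc(origin))
	return newCacheTier("Edge1", shield), newCacheTier("Edge2", shield)
}

// get sends one request through h and returns the tier trace.
func get(h http.Handler, path string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Header().Get(traceHeader)
}

func main() {
	edge1, edge2 := newLab()
	fmt.Println(get(edge1, "/article/1")) // Edge1 MISS -> Shield MISS -> Origin
	fmt.Println(get(edge2, "/article/1")) // Edge2 MISS -> Shield HIT
	fmt.Println(get(edge1, "/article/1")) // Edge1 HIT
	fmt.Println("origin requests:", originHits)
}
```

The second edge misses locally but is served from the shield's copy, so the origin is contacted only once for both edges.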

The Shield Selection

For a shield tier with multiple shield nodes, use consistent hashing to select which shield handles each URL:

Edge → consistent_hash(url) → ShieldNode-X → Origin

All edges route requests for URL X to the same shield node, maximizing shield cache hit ratio. If a shield node fails, consistent hashing automatically routes to the next node (only 1/N of URLs are remapped).
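The routing rule above can be sketched with a small hash ring. This is a minimal illustration rather than the Lab 12 implementation: the node names, the FNV-1a hash, the 100 virtual points per node, and the `remapStats` helper are all assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ring is a consistent-hash ring with virtual nodes.
type ring struct {
	points []uint32          // sorted hash positions
	owner  map[uint32]string // position -> shield node
}

func hashOf(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// newRing places `replicas` virtual points per node on the ring.
func newRing(nodes []string, replicas int) *ring {
	r := &ring{owner: make(map[uint32]string)}
	for _, n := range nodes {
		for i := 0; i < replicas; i++ {
			p := hashOf(fmt.Sprintf("%s#%d", n, i))
			if _, taken := r.owner[p]; taken {
				continue // skip the rare hash collision
			}
			r.owner[p] = n
			r.points = append(r.points, p)
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// pick returns the shield node that owns url: the first point clockwise from
// the URL's hash, wrapping around at the end of the ring.
func (r *ring) pick(url string) string {
	h := hashOf(url)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

// remapStats removes one node and reports how many of n sample URLs changed
// shield, and how many of those had NOT been on the removed node.
func remapStats(nodes []string, removed string, n int) (moved, unexpected int) {
	before := newRing(nodes, 100)
	var rest []string
	for _, node := range nodes {
		if node != removed {
			rest = append(rest, node)
		}
	}
	after := newRing(rest, 100)
	for i := 0; i < n; i++ {
		url := fmt.Sprintf("/article/%d", i)
		was, now := before.pick(url), after.pick(url)
		if was != now {
			moved++
			if was != removed {
				unexpected++
			}
		}
	}
	return moved, unexpected
}

func main() {
	nodes := []string{"shield-a", "shield-b", "shield-c"}
	fmt.Println("/article/1 ->", newRing(nodes, 100).pick("/article/1"))
	moved, unexpected := remapStats(nodes, "shield-b", 10000)
	fmt.Printf("moved %d of 10000 URLs, %d of them unexpectedly\n", moved, unexpected)
}
```

Because every edge builds the same ring from the same node list, they all agree on the owner of a URL without coordinating. When a node is removed, only the URLs it owned move to a neighbor, so the second number printed should be 0.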


The Math: Origin Request Reduction

Without origin shield:

E = number of edge PoPs (200)
T = TTL (300 seconds)
R = request rate per URL (1000/s across all PoPs)

Origin requests per URL = E = 200 (on each TTL expiry)

With origin shield:

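S = number of shield nodes (2–5 typically)
Origin requests per URL = S = 2–5 (at most one per shield node per TTL)

Reduction factor with S = 3: 200 ÷ 3 ≈ 67× fewer origin requests.

That bound of S assumes every shield node ends up caching every URL. Singleflight only deduplicates within one process, so it cannot collapse requests across shield nodes. Consistent hashing does that job: each URL is routed to exactly one shield node, and singleflight on that node collapses concurrent misses. The origin then sees about 1 request per URL per TTL, regardless of how many edge PoPs there are.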
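The arithmetic above can be checked with a few lines of Go. This sketch also uses R and T to express the result as an origin offload ratio, a derived figure that the text does not state. The `scenario` type and its method names are made up for this example, and S = 3 is taken from the reduction factor above.

```go
package main

import "fmt"

// scenario holds the symbols used in the text.
type scenario struct {
	E int     // edge PoPs
	S int     // shield nodes
	T float64 // TTL in seconds
	R float64 // client requests per second for one URL, across all PoPs
}

// offload is the fraction of client requests for one URL that never reach
// the origin, assuming each cache facing the origin refills once per TTL.
func (s scenario) offload(cachesFacingOrigin int) float64 {
	clientRequestsPerTTL := s.R * s.T
	return 1 - float64(cachesFacingOrigin)/clientRequestsPerTTL
}

// reduction is the factor by which the shield tier cuts origin fetches.
func (s scenario) reduction() float64 {
	return float64(s.E) / float64(s.S)
}

func main() {
	s := scenario{E: 200, S: 3, T: 300, R: 1000}
	fmt.Printf("no shield:   %d fetches per TTL, offload %.4f%%\n", s.E, 100*s.offload(s.E))
	fmt.Printf("with shield: %d fetches per TTL, offload %.4f%%\n", s.S, 100*s.offload(s.S))
	fmt.Printf("reduction:   %.1fx\n", s.reduction())
}
```

With 1000 req/s over a 300 s TTL there are 300,000 client requests per window, so even the unshielded case offloads more than 99.9% of them. The shield matters because the remaining origin fetches arrive at the same instant.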


Shield Latency Tradeoff

Origin shielding adds one network hop. Edge → Shield adds latency:

Without shield: Edge → Origin = 150 ms
With shield:    Edge → Shield → Origin = 5 ms + 150 ms = 155 ms

The 5 ms overhead is the edge-to-shield hop (assuming the same region and a dedicated link). The tradeoff is worth it because:

  1. The vast majority of requests (97% in the breakdown below) are cache hits at either edge or shield
  2. The 5 ms penalty over a direct origin fetch only applies to the remaining ~3% that miss at both tiers

For a well-shielded CDN serving popular content:

Hit ratio at edge:    85%  → 0 ms upstream latency
Hit ratio at shield:  12%  → 5 ms (edge → shield)
Cache miss:            3%  → 155 ms (5 + 150)

Average upstream latency: 0.85×0 + 0.12×5 + 0.03×155 = 5.25 ms

Of that 5.25 ms, the part actually added by the shield hop is 5 ms on the 15% of requests that miss at the edge, or 0.75 ms on average. Origin protection vastly outweighs that cost.
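The weighted average above is easy to reproduce, and the same helper gives a no-shield baseline for comparison. The baseline is an assumption added for this sketch: it keeps the 85% edge hit ratio and sends every edge miss straight to the origin at 150 ms. The `outcome` type and function names are illustrative.

```go
package main

import "fmt"

// outcome is one way a request can be served, with its share of traffic and
// the upstream latency it pays in milliseconds.
type outcome struct {
	name      string
	share     float64
	latencyMs float64
}

// expected returns the traffic-weighted mean latency in milliseconds.
func expected(outcomes []outcome) float64 {
	var sum float64
	for _, o := range outcomes {
		sum += o.share * o.latencyMs
	}
	return sum
}

// withShield uses the hit ratios and latencies from the text.
func withShield() []outcome {
	return []outcome{
		{"edge hit", 0.85, 0},
		{"shield hit", 0.12, 5},
		{"full miss", 0.03, 5 + 150},
	}
}

// withoutShield assumes the same 85% edge hit ratio and sends every edge
// miss directly to the origin.
func withoutShield() []outcome {
	return []outcome{
		{"edge hit", 0.85, 0},
		{"edge miss", 0.15, 150},
	}
}

func main() {
	fmt.Printf("with shield:    %.2f ms\n", expected(withShield()))
	fmt.Printf("without shield: %.2f ms\n", expected(withoutShield()))
}
```

Under these assumptions the shielded average is 5.25 ms against 22.50 ms without a shield, because most edge misses are answered by the shield in 5 ms instead of by the origin in 150 ms.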


Failure Modes

Failure                    Behavior without shield             Behavior with shield
Origin spike               200 PoPs × misses = 200 requests    1–3 shield requests
Origin down                200 PoPs serve stale or error       1–3 shield requests (stale-if-error)
Shield node down           Edge falls back to origin directly  Consistent hash routes to next node
Shield cache invalidation  Must purge all edges too            Purge shield = automatic edge invalidation
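The "Origin down" row relies on stale-if-error behavior at the shield. A minimal sketch of that policy follows, assuming a 300 s TTL and a one-hour stale window. The `shieldCache` type, its status strings, and the injected clock are illustrative choices, not the lab's API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

type entry struct {
	body      string
	fetchedAt time.Time
}

// shieldCache serves fresh entries, refetches expired ones, and falls back
// to a stale copy when the origin fails (stale-if-error).
type shieldCache struct {
	ttl          time.Duration
	staleIfError time.Duration // how long past expiry a stale copy may be served
	entries      map[string]entry
	fetch        func(key string) (string, error)
	now          func() time.Time // injectable clock for the demo
}

// Get returns the body and a status of "HIT", "MISS", or "STALE".
func (c *shieldCache) Get(key string) (string, string, error) {
	e, ok := c.entries[key]
	var age time.Duration
	if ok {
		age = c.now().Sub(e.fetchedAt)
		if age < c.ttl {
			return e.body, "HIT", nil
		}
	}
	body, err := c.fetch(key)
	if err == nil {
		c.entries[key] = entry{body, c.now()}
		return body, "MISS", nil
	}
	if ok && age < c.ttl+c.staleIfError {
		return e.body, "STALE", nil // origin is down; serve the old copy
	}
	return "", "", err
}

// demo walks one key through a fetch, a hit, an origin outage, and expiry
// of the stale window.
func demo() []string {
	clock := time.Unix(0, 0)
	originUp := true
	c := &shieldCache{
		ttl:          300 * time.Second,
		staleIfError: time.Hour,
		entries:      map[string]entry{},
		fetch: func(key string) (string, error) {
			if !originUp {
				return "", errors.New("origin unreachable")
			}
			return "body of " + key, nil
		},
		now: func() time.Time { return clock },
	}
	var statuses []string
	step := func() {
		_, status, err := c.Get("/article/1")
		if err != nil {
			status = "ERROR"
		}
		statuses = append(statuses, status)
	}
	step() // first fetch
	clock = clock.Add(10 * time.Second)
	step() // still within the TTL
	originUp = false
	clock = clock.Add(400 * time.Second)
	step() // expired while the origin is down
	clock = clock.Add(2 * time.Hour)
	step() // beyond the stale window
	return statuses
}

func main() {
	fmt.Println(demo()) // [MISS HIT STALE ERROR]
}
```

With this policy at the shield, an origin outage is absorbed by one tier instead of by 200 independent edge caches, each retrying on its own.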

Try It

# Starts all three tiers: edge1 (:8080), edge2 (:8081), shield (:8082), origin (:9001)
make lab-13

# Request through edge 1
curl http://localhost:8080/article/1 -H "X-Debug: tiers"
# Response should show: Edge MISS → Shield MISS → Origin HIT

# Same request through edge 2 (different PoP)
curl http://localhost:8081/article/1 -H "X-Debug: tiers"
# Should show: Edge MISS → Shield HIT (shield already has it)

# Repeat both — edges should be HIT now
curl http://localhost:8080/article/1
curl http://localhost:8081/article/1