Lab 02 · The Naive Reverse Proxy
Run it:
make lab-02
Source: labs/lab-02-naive-proxy/main.go
The Problem
Adding a proxy between users and the origin is the first step in CDN architecture — before caching, before edge compute, before any optimization.
But why does a proxy even help if it doesn’t cache? Several reasons:
- TLS offloading: The CDN terminates TLS on fast, dedicated hardware so the origin doesn’t pay the cryptographic overhead for every user.
- Connection pooling: The proxy maintains persistent HTTP/1.1 keep-alive or HTTP/2 multiplexed connections to the origin, amortizing TCP handshake cost across many requests.
- Protocol upgrade: Users connect via HTTP/2 or HTTP/3; the CDN speaks HTTP/1.1 to a legacy origin.
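- DDoS surface reduction: Attackers see only the proxy's address. The origin stays hidden as long as it accepts traffic only from the proxy and its IP is not leaked elsewhere.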
- Header normalization: Strip tracking headers, add forwarding metadata.
- Rate limiting, WAF: Applied at the proxy before the origin even sees the request.
How httputil.ReverseProxy Works
Go’s standard library httputil.ReverseProxy is the canonical building
block for a reverse proxy:
proxy := &httputil.ReverseProxy{
	Director: func(req *http.Request) {
		// Retarget the outgoing request at the origin.
		req.URL.Scheme = "http"
		req.URL.Host = "origin:9001"
		// Hop-by-hop headers: ReverseProxy strips these itself after
		// Director returns, including any header named in Connection.
		// Deleting Connection here would hide that list from it.
		// X-Forwarded-For: ReverseProxy also appends the client IP taken
		// from req.RemoteAddr; adding it here as well would list it twice.
	},
	ModifyResponse: func(resp *http.Response) error {
		// Tag every response that passed through the proxy.
		resp.Header.Set("X-Served-By", "proxy")
		return nil
	},
	// Called when the origin cannot be reached or ModifyResponse fails.
	ErrorHandler: func(w http.ResponseWriter, r *http.Request, err error) {
		http.Error(w, "Bad Gateway", http.StatusBadGateway)
	},
}
Director mutates the outgoing request before it is forwarded. It runs
synchronously in the handler's goroutine on every request, so keep it fast
and non-blocking, and limit its side effects to the request itself.
ModifyResponse mutates the response before sending back to the
client. Use this to add headers like X-Cache, normalize Content-Type,
or strip internal headers.
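The two hooks can be exercised end to end with throwaway servers. The following is a minimal sketch, not the lab's main.go: it assumes an in-process origin and proxy built with net/http/httptest, and the helper name roundTripThroughProxy is hypothetical. Director only retargets the request here, which makes it visible that ReverseProxy adds X-Forwarded-For on its own.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
)

// roundTripThroughProxy (hypothetical helper) starts a throwaway origin and
// proxy, sends one request through the proxy, and reports what each side saw.
func roundTripThroughProxy() (xffAtOrigin, servedBy string, err error) {
	// The origin echoes the X-Forwarded-For value it received.
	origin := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.Header.Get("X-Forwarded-For"))
	}))
	defer origin.Close()

	target, err := url.Parse(origin.URL)
	if err != nil {
		return "", "", err
	}
	proxy := &httputil.ReverseProxy{
		Director: func(req *http.Request) {
			// Only retarget the request; no headers are touched here.
			req.URL.Scheme = target.Scheme
			req.URL.Host = target.Host
		},
		ModifyResponse: func(resp *http.Response) error {
			resp.Header.Set("X-Served-By", "proxy")
			return nil
		},
	}
	front := httptest.NewServer(proxy)
	defer front.Close()

	resp, err := http.Get(front.URL + "/article/1")
	if err != nil {
		return "", "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", "", err
	}
	return string(body), resp.Header.Get("X-Served-By"), nil
}

func main() {
	xff, servedBy, err := roundTripThroughProxy()
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("origin saw X-Forwarded-For:", xff)
	fmt.Println("client saw X-Served-By:", servedBy)
}
```

The origin should report a loopback address in X-Forwarded-For even though Director never set it, and the client should see the header added by ModifyResponse.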
Transport is the http.RoundTripper that sends requests to the origin. When
unset, ReverseProxy uses http.DefaultTransport, which maintains a connection
pool. For production, tune:
Transport: &http.Transport{
	MaxIdleConnsPerHost:   200,              // idle connections kept per origin
	MaxConnsPerHost:       500,              // max concurrent connections per origin
	IdleConnTimeout:       90 * time.Second, // close idle connections after this long
	ResponseHeaderTimeout: 30 * time.Second, // max wait for the origin's response headers
	DisableKeepAlives:     false,            // ALWAYS keep-alives on
	ForceAttemptHTTP2:     true,             // H2 to origin if supported
	TLSHandshakeTimeout:   5 * time.Second,
	// DialContext: custom dialer for DNS override, binding, etc.
},
Hop-by-Hop Headers
HTTP defines two classes of headers:
End-to-end headers: forwarded unchanged through all proxies to the
final recipient. Examples: Content-Type, ETag, Cache-Control,
Authorization.
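Hop-by-hop headers: meaningful only for the immediate connection, so a proxy must strip them before forwarding. RFC 7230 §6.1 defines the mechanism through the Connection header; the fixed list below comes from the older RFC 2616 §13.5.1 (which spelled Trailer as "Trailers"):
Connection, Keep-Alive, Proxy-Authenticate, Proxy-Authorization,
TE, Trailer, Transfer-Encoding, Upgrade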
Additionally, any header listed in the Connection header value is
hop-by-hop for that hop:
Connection: X-Custom-Header, Keep-Alive
→ Strip X-Custom-Header too
Failing to strip hop-by-hop headers causes subtle bugs: the origin may
try to negotiate an Upgrade on a connection it doesn’t have, or the
downstream client may receive a Transfer-Encoding: chunked header
that doesn’t match the actual response framing.
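The stripping rule above can be sketched as a small helper for a hand-rolled proxy. This is an illustration under assumptions: the names hopByHopHeaders, stripHopByHop, and exampleHeaders are hypothetical, and httputil.ReverseProxy already performs equivalent stripping internally.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// hopByHopHeaders is the fixed list of connection-scoped headers.
var hopByHopHeaders = []string{
	"Connection", "Keep-Alive", "Proxy-Authenticate", "Proxy-Authorization",
	"TE", "Trailer", "Transfer-Encoding", "Upgrade",
}

// stripHopByHop removes hop-by-hop headers from h in place.
func stripHopByHop(h http.Header) {
	// First drop every header the sender nominated in Connection.
	for _, value := range h.Values("Connection") {
		for _, name := range strings.Split(value, ",") {
			if name = strings.TrimSpace(name); name != "" {
				h.Del(name)
			}
		}
	}
	// Then drop the fixed list, which includes Connection itself.
	for _, name := range hopByHopHeaders {
		h.Del(name)
	}
}

// exampleHeaders builds the header set from the example in the text.
func exampleHeaders() http.Header {
	h := http.Header{}
	h.Set("Connection", "X-Custom-Header, Keep-Alive")
	h.Set("X-Custom-Header", "internal")
	h.Set("Keep-Alive", "timeout=5")
	h.Set("Content-Type", "text/html")
	return h
}

func main() {
	h := exampleHeaders()
	stripHopByHop(h)
	fmt.Println("remaining headers:", h)
}
```

The order matters: the Connection value has to be read before Connection itself is deleted, otherwise the headers it names would slip through.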
X-Forwarded-For and the IP Chain
When a proxy adds X-Forwarded-For: 1.2.3.4, and then another proxy
adds another layer, you get:
X-Forwarded-For: 1.2.3.4, 10.0.0.1
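The leftmost IP is the original client as reported by the first proxy. Each later proxy appends the address of the peer it received the request from, so the rightmost entry is the hop seen by the last proxy; that proxy's own address is the connection's remote address at the origin. Because a client can send its own X-Forwarded-For, anything to the left of the entries written by trusted proxies can be forged. Origin applications should therefore walk the list from the right, skip the addresses of proxies they trust, and take the first untrusted address as the client. This works only if they know which proxies sit in front of them.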
In production, CDNs like Cloudflare expose the real client IP via:
CF-Connecting-IP: 1.2.3.4 (always the real client IP)
True-Client-IP: 1.2.3.4 (Cloudflare Enterprise)
This avoids the ambiguity of X-Forwarded-For in multi-proxy setups.
Security trap: Never trust X-Forwarded-For for access control if any user can send it directly. Validate the header only when you can confirm the request came through a trusted proxy.
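One common way to resolve the client IP safely is to walk X-Forwarded-For from the right and skip addresses that belong to proxies you operate. The sketch below assumes the trusted proxies are described by CIDR ranges; the function name clientIPFromXFF and the 10.0.0.0/8 range are illustrative assumptions, not part of the lab.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// clientIPFromXFF returns the best-known client address. peer is the remote
// address of the TCP connection (host only); trustedCIDRs lists proxy ranges.
func clientIPFromXFF(xff, peer string, trustedCIDRs []string) (string, error) {
	trusted := make([]netip.Prefix, 0, len(trustedCIDRs))
	for _, cidr := range trustedCIDRs {
		p, err := netip.ParsePrefix(cidr)
		if err != nil {
			return "", fmt.Errorf("bad trusted range %q: %w", cidr, err)
		}
		trusted = append(trusted, p)
	}
	isTrusted := func(a netip.Addr) bool {
		for _, p := range trusted {
			if p.Contains(a) {
				return true
			}
		}
		return false
	}

	candidate, err := netip.ParseAddr(peer)
	if err != nil {
		return "", fmt.Errorf("bad peer address %q: %w", peer, err)
	}
	// A request that did not arrive via a trusted proxy may carry a forged
	// header, so the header is ignored and the peer itself is the client.
	if !isTrusted(candidate) {
		return candidate.String(), nil
	}
	// Walk right to left: each entry was written by the hop to its right.
	entries := strings.Split(xff, ",")
	for i := len(entries) - 1; i >= 0; i-- {
		addr, err := netip.ParseAddr(strings.TrimSpace(entries[i]))
		if err != nil {
			break // malformed entry: keep the last address that parsed
		}
		candidate = addr
		if !isTrusted(addr) {
			break // first untrusted hop is treated as the client
		}
	}
	return candidate.String(), nil
}

func main() {
	trusted := []string{"10.0.0.0/8"}
	ip, err := clientIPFromXFF("6.6.6.6, 1.2.3.4, 10.0.0.1", "10.0.0.2", trusted)
	fmt.Println(ip, err)
}
```

In the usage above, the forged leading entry 6.6.6.6 is never reached because the walk stops at 1.2.3.4, the first address outside the trusted range.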
The Proxy Overhead Measurement
The lab measures raw proxy overhead by timing the same request through:
- Direct origin call
- Through the proxy
Typical result: < 0.5 ms proxy overhead. This is negligible vs. actual origin latency (80+ ms). The overhead comes from:
- Goroutine scheduling (< 1 µs)
- Memory copy of request/response buffers
- Two additional TCP reads/writes
This is why the caching layer in Lab 03 — which adds zero network hops on a hit — provides dramatic speedups: it collapses 80 ms to < 0.1 ms.
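The measurement described above can be approximated in-process. This is a rough sketch rather than the lab's harness: it assumes a trivial origin with no simulated latency, uses httptest servers instead of the lab's ports, and the names timeGET and measureOverhead are hypothetical. Absolute numbers will vary by machine.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"time"
)

// timeGET issues n sequential GETs and returns the mean latency per request.
func timeGET(client *http.Client, target string, n int) (time.Duration, error) {
	start := time.Now()
	for i := 0; i < n; i++ {
		resp, err := client.Get(target)
		if err != nil {
			return 0, err
		}
		// Drain and close so the connection is reused on the next iteration.
		_, copyErr := io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		if copyErr != nil {
			return 0, copyErr
		}
		if resp.StatusCode != http.StatusOK {
			return 0, fmt.Errorf("unexpected status %d", resp.StatusCode)
		}
	}
	return time.Since(start) / time.Duration(n), nil
}

// measureOverhead times the same request directly and through a proxy.
func measureOverhead(n int) (direct, proxied time.Duration, err error) {
	origin := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "article body")
	}))
	defer origin.Close()

	target, err := url.Parse(origin.URL)
	if err != nil {
		return 0, 0, err
	}
	proxy := httptest.NewServer(httputil.NewSingleHostReverseProxy(target))
	defer proxy.Close()

	client := &http.Client{Timeout: 5 * time.Second}
	if direct, err = timeGET(client, origin.URL, n); err != nil {
		return 0, 0, err
	}
	if proxied, err = timeGET(client, proxy.URL, n); err != nil {
		return 0, 0, err
	}
	return direct, proxied, nil
}

func main() {
	direct, proxied, err := measureOverhead(200)
	if err != nil {
		fmt.Println("measurement failed:", err)
		return
	}
	fmt.Printf("direct=%v proxied=%v overhead=%v\n", direct, proxied, proxied-direct)
}
```

Averaging over many sequential requests keeps the first-request connection setup from dominating the result.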
Production Detail: Connection Pools
http.DefaultTransport keeps a connection pool per host:port. Once a
response body has been read to EOF and closed, Go's HTTP client returns the
underlying TCP connection to the pool for reuse on the next request to the
same origin.
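Connection reuse can be observed with net/http/httptrace, whose GotConn hook reports whether a request ran on a pooled connection. The sketch below assumes a throwaway httptest origin; the helper names fetch and secondRequestReused are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
)

// fetch performs one GET, drains the body, and reports whether the transport
// served the request on a connection taken from its idle pool.
func fetch(client *http.Client, target string) (reused bool, err error) {
	trace := &httptrace.ClientTrace{
		GotConn: func(info httptrace.GotConnInfo) { reused = info.Reused },
	}
	ctx := httptrace.WithClientTrace(context.Background(), trace)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, target, nil)
	if err != nil {
		return false, err
	}
	resp, err := client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	// Reading to EOF is what lets the transport pool the connection.
	_, err = io.Copy(io.Discard, resp.Body)
	return reused, err
}

// secondRequestReused sends two requests and reports on the second one.
func secondRequestReused() (bool, error) {
	origin := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "article body")
	}))
	defer origin.Close()

	transport := &http.Transport{MaxIdleConnsPerHost: 10}
	defer transport.CloseIdleConnections()
	client := &http.Client{Transport: transport}

	if _, err := fetch(client, origin.URL); err != nil {
		return false, err
	}
	return fetch(client, origin.URL)
}

func main() {
	reused, err := secondRequestReused()
	fmt.Println("second request reused a pooled connection:", reused, err)
}
```

Removing the io.Copy drain from fetch is a quick way to see the effect of an unread body on pooling.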
At scale, pool sizing matters:
| Scenario | MaxIdleConnsPerHost |
|---|---|
| Single origin, low traffic | 2 (Go default) to 10 |
| Single origin, high traffic | 100–500 |
| Origin cluster behind load balancer | 200+ (connections spread across backends) |
| Origin with a connection limit (e.g. a backend capped by max_connections) | At or below the origin's limit; cap the total with MaxConnsPerHost |
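Setting this too low forces new TCP handshakes under load, adding about one round trip (~5 ms in this example) to every request that cannot reuse a connection. At 10,000 such requests per second, that is 10,000 × 5 ms = 50 seconds of cumulative handshake wait for every second of traffic.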
Failure Modes
| Failure | Symptom | Fix |
|---|---|---|
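| Origin timeout | 502 Bad Gateway by default; 504 Gateway Timeout only if ErrorHandler maps timeouts to it | Set ResponseHeaderTimeout; circuit break |
| Origin unreachable (refused, reset) | 502 Bad Gateway | ErrorHandler; retry idempotent requests |
| Origin 5xx | Passed through unchanged; ErrorHandler is not invoked | Inspect resp.StatusCode in ModifyResponse; retry idempotent requests |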
| Connection pool exhaustion | Latency spike | Increase MaxIdleConnsPerHost; queue requests |
| Memory leak | Unbounded growth | Always close resp.Body, and read it to EOF first (even if discarding) so the connection can be reused |
| Hop-by-hop not stripped | Protocol negotiation failure | Rely on ReverseProxy's built-in stripping; remove them explicitly in a hand-rolled proxy |
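To distinguish a slow origin from an unreachable one, an ErrorHandler can inspect the error it receives. The following is a minimal sketch under assumptions: statusForProxyError and errorHandler are hypothetical names, and fakeTimeout and errRefused exist only to exercise the mapping.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
)

// statusForProxyError maps a transport error to the status sent to the client:
// 504 for timeouts, 502 for everything else.
func statusForProxyError(err error) int {
	var netErr net.Error
	if errors.Is(err, context.DeadlineExceeded) ||
		(errors.As(err, &netErr) && netErr.Timeout()) {
		return http.StatusGatewayTimeout
	}
	return http.StatusBadGateway
}

// errorHandler has the signature of httputil.ReverseProxy.ErrorHandler.
func errorHandler(w http.ResponseWriter, r *http.Request, err error) {
	code := statusForProxyError(err)
	http.Error(w, http.StatusText(code), code)
}

// fakeTimeout is a stand-in for a timeout error (illustration only).
type fakeTimeout struct{}

func (fakeTimeout) Error() string   { return "origin timed out" }
func (fakeTimeout) Timeout() bool   { return true }
func (fakeTimeout) Temporary() bool { return false }

// errRefused is a stand-in for a non-timeout failure (illustration only).
var errRefused = errors.New("connection refused")

func main() {
	fmt.Println(statusForProxyError(fakeTimeout{}))
	fmt.Println(statusForProxyError(errRefused))
}
```

Assigning errorHandler to the proxy's ErrorHandler field would make timeouts such as an expired ResponseHeaderTimeout surface as 504 while other transport failures stay 502.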
Try It
make lab-02
# Direct origin (no proxy)
curl http://localhost:9001/article/1
# Through proxy
curl http://localhost:8080/article/1 -v
# Compare response headers — should see X-Served-By: proxy
# and X-Forwarded-For header in origin logs