
CDN Strategy for Real Workloads: Beyond Edge Cache 101

Six CDN strategies that matter once you are past the blog-post use case and running real traffic through real edge infrastructure.

John Lane 2023-06-23 6 min read

Most CDN articles are written for the case where you have a static site and want to put Cloudflare in front of it. That case is trivial. The interesting CDN decisions start when you have authenticated traffic, dynamic APIs, video, and a long tail of cache misses that actually touch your origin. Here are six strategies that we use with customers running real workloads across Cloudflare, Fastly, Akamai, AWS CloudFront, and a mix of smaller specialist providers.

1. Separate Static, Dynamic, and API Traffic at the Edge

The default CDN setup dumps everything through one configuration with one cache behavior. That almost never matches real traffic patterns. A marketing homepage, a logged-in dashboard, and a mobile API have completely different caching profiles, security requirements, and cost characteristics.

The strategy that works: carve your traffic into distinct behaviors at the edge, each with its own cache rules, security policies, and origin. Static assets get aggressive caching and long TTLs. Authenticated HTML gets short TTLs or no caching at all, with careful attention to the Vary header. APIs get custom rules based on method and path — GETs may be cacheable for short windows, mutations are passthrough.
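
To make the split concrete, here is a minimal sketch in the shape of a Cloudflare Worker. The route prefixes and TTLs are illustrative assumptions, not a prescription, and the same split maps onto CloudFront cache behaviors or Fastly VCL:

```typescript
// Illustrative per-path cache behaviors in a Workers-style handler.
// The prefixes and TTLs are assumptions; derive yours from real traffic.
export default {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);

    // Static assets: cache aggressively with long TTLs.
    if (url.pathname.startsWith("/static/")) {
      return fetch(request, { cf: { cacheEverything: true, cacheTtl: 86400 } });
    }

    // API: short-window caching for idempotent GETs only.
    if (url.pathname.startsWith("/api/")) {
      if (request.method === "GET") {
        return fetch(request, { cf: { cacheTtl: 30 } });
      }
      return fetch(request); // mutations pass straight through to origin
    }

    // Authenticated HTML: no forced edge caching; respect origin headers.
    return fetch(request);
  },
};
```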

The Vary header trap

This is the single most common cause of cached-content-leaks-to-wrong-user incidents we have seen in incident response. If your authenticated HTML depends on a cookie or Authorization header and you do not vary the cache key on it, the CDN will happily serve one user's content to another. Test this explicitly before you go to production, and never trust a default configuration.
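
One way to test it: request the same authenticated URL as two different users and fail the check if the second response is a cache hit carrying the first user's content. A rough sketch; the URL, session values, and the cf-cache-status header are assumptions to swap for your own stack:

```typescript
// Pre-production check for cached-content leaks across users. Assumes
// two valid sessions and a CDN that reports cache status in a header
// such as cf-cache-status (Cloudflare) or x-cache (CloudFront/Fastly).
const URL_UNDER_TEST = "https://app.example.com/dashboard"; // hypothetical

async function checkForCacheLeak(): Promise<void> {
  const asAlice = await fetch(URL_UNDER_TEST, {
    headers: { cookie: "session=ALICE_SESSION" },
  });
  const aliceBody = await asAlice.text();

  const asBob = await fetch(URL_UNDER_TEST, {
    headers: { cookie: "session=BOB_SESSION" },
  });
  const bobBody = await asBob.text();

  const cacheStatus = asBob.headers.get("cf-cache-status");
  if (cacheStatus === "HIT" && bobBody === aliceBody) {
    throw new Error("cache leak: second user served first user's cached page");
  }
  console.log("no cross-user cache leak detected for", URL_UNDER_TEST);
}

await checkForCacheLeak();
```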

2. Cache Key Engineering

The default cache key is usually the URL. That is almost never what you want. Query parameters, cookies, headers, device type, and geolocation can all affect the correct response, and a naive cache key either ignores them (serving wrong content) or includes all of them (destroying your hit rate).

The strategy: design the cache key deliberately for each behavior. Strip tracking parameters you do not want to include (utm_*, gclid, fbclid). Include the parameters that actually affect the response. Normalize case and parameter ordering. Most providers let you do this with custom rules or edge workers. The payoff is a meaningful jump in cache hit rate on traffic that looks uncacheable at first glance.
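
Here is the normalization step written as a plain function you could run in an edge worker. The tracking-parameter list mirrors the examples above; a stricter variant allow-lists only the parameters known to affect the response:

```typescript
// Build a deliberate cache key: strip tracking parameters, then
// normalize host case and parameter ordering so equivalent URLs
// collapse to a single cache entry.
const TRACKING_PARAMS = /^(utm_.+|gclid|fbclid)$/;

function cacheKeyFor(rawUrl: string): string {
  const url = new URL(rawUrl);
  const kept = [...url.searchParams.entries()]
    .filter(([name]) => !TRACKING_PARAMS.test(name))
    .sort(([a], [b]) => a.localeCompare(b)); // stable parameter order
  const query = new URLSearchParams(kept).toString();
  return `${url.hostname.toLowerCase()}${url.pathname}${query ? "?" + query : ""}`;
}

// Both of these collapse to example.com/list?page=2&sort=asc:
console.log(cacheKeyFor("https://Example.com/list?utm_source=x&sort=asc&page=2"));
console.log(cacheKeyFor("https://example.com/list?page=2&sort=asc&gclid=abc"));
```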

Hit rate is the metric that matters

A 95 percent hit rate and a 60 percent hit rate look the same on the CDN dashboard until you look at origin load. The 95 percent hit rate sends one in 20 requests to origin; the 60 percent hit rate sends eight times as many. Your origin capacity planning is entirely downstream of cache hit rate, and a few percentage points are the difference between a happy origin and a smoking one.

3. Origin Shielding

CDNs are globally distributed by design, which means every edge location can independently miss a cache and fetch from origin. For a popular piece of content, this can mean hundreds of simultaneous origin fetches for the same file during a cache cold-start or invalidation event. This is the pattern that took down a number of origins during viral moments in the last few years.

Origin shielding (Cloudflare's tiered caching, Fastly's shielding, CloudFront's origin shield) adds a middle layer between edge POPs and origin. When an edge misses, it asks the shield first; the shield deduplicates requests so your origin sees exactly one fetch per object per TTL. This is free or inexpensive on most providers and dramatically reduces origin load, yet many teams never enable it.
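
The deduplication a shield performs is essentially single-flight request coalescing. A self-contained sketch of the idea, not any provider's actual implementation:

```typescript
// Single-flight coalescing: concurrent misses for the same key share
// one origin fetch instead of each independently hitting the origin.
const inFlight = new Map<string, Promise<ArrayBuffer>>();

async function shieldedFetch(key: string, originUrl: string): Promise<ArrayBuffer> {
  const pending = inFlight.get(key);
  if (pending) return pending; // join the fetch already in flight

  const p = fetch(originUrl)
    .then((res) => res.arrayBuffer())
    .finally(() => inFlight.delete(key));
  inFlight.set(key, p);
  return p;
}

// A burst of 100 simultaneous misses for one object -> one origin fetch.
await Promise.all(
  Array.from({ length: 100 }, () =>
    shieldedFetch("video/seg0001.ts", "https://origin.example.com/video/seg0001.ts"),
  ),
);
```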

When not to use it

Shielding adds a small amount of latency on cache misses and can complicate some stateful scenarios. For most workloads this is a non-issue; for ultra-low-latency APIs where every millisecond counts, measure it before you enable it globally.

4. Edge Compute for Dynamic Workloads

Every major CDN now offers edge compute — Cloudflare Workers, Fastly Compute@Edge, AWS Lambda@Edge, Akamai EdgeWorkers. The honest reality is that these platforms have become capable enough to replace a surprising amount of traditional backend code, especially for read-heavy, latency-sensitive paths.

The strategies we deploy at the edge most often:

  • A/B testing and feature flags. Rewrite URLs or inject headers based on user segment without touching origin.
  • Authentication and authorization. Validate JWTs at the edge and reject unauthorized requests before they reach origin, cutting origin load and attack surface; a sketch follows this list.
  • Request shaping. Normalize requests, strip PII, rewrite headers, transform responses.
  • Personalization. Serve a cached page with small edge-computed personalization (user name, cart count) patched in via ESI or HTML rewriting.
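
A sketch of the JWT pattern referenced in the list, assuming an HS256-signed token and a Workers-style runtime with Web Crypto; secret management and iss/aud checks are deliberately simplified:

```typescript
// Validate an HS256 JWT at the edge and reject bad requests before
// they ever reach origin. Simplified sketch: production code also
// checks iss/aud claims and handles key rotation.
function base64UrlDecode(s: string): Uint8Array {
  const b64 = s.replace(/-/g, "+").replace(/_/g, "/");
  const padded = b64.padEnd(b64.length + ((4 - (b64.length % 4)) % 4), "=");
  return Uint8Array.from(atob(padded), (c) => c.charCodeAt(0));
}

async function verifyJwt(token: string, secret: string): Promise<boolean> {
  const [header, payload, signature] = token.split(".");
  if (!header || !payload || !signature) return false;

  const key = await crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["verify"],
  );
  const valid = await crypto.subtle.verify(
    "HMAC",
    key,
    base64UrlDecode(signature),
    new TextEncoder().encode(`${header}.${payload}`),
  );
  if (!valid) return false;

  // Reject expired tokens; exp is in seconds since the epoch.
  const claims = JSON.parse(new TextDecoder().decode(base64UrlDecode(payload)));
  return typeof claims.exp === "number" && claims.exp * 1000 > Date.now();
}

// In the edge handler, assuming a Bearer token:
//   const token = request.headers.get("authorization")?.replace(/^Bearer /, "");
//   if (!token || !(await verifyJwt(token, SECRET))) {
//     return new Response("unauthorized", { status: 401 });
//   }
```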

The constraint

Edge runtimes have real limits. CPU time per request is measured in milliseconds. Memory is tight. Network calls from edge are possible but not free. State lives somewhere else. If you stay inside those constraints, edge compute is excellent. If you do not, you are better off on a regional function platform.

5. Video and Large File Delivery

CDN strategy for video is a different discipline. HLS and DASH segments are small files, but there are many of them; ABR ladders multiply the asset count; DRM adds token-signed URLs that are hard to cache naively. The strategies that matter here are packaging at the edge (just-in-time packaging from a single mezzanine), token authentication that still allows shared caches to work, and prefetching strategies that warm segments before users request them.
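
One shape of cache-friendly token auth: sign the stream's path prefix plus an expiry rather than each segment URL, validate at the edge, and keep the token out of the cache key. The parameter names and URL layout here are illustrative:

```typescript
// The HMAC covers the stream's path prefix and an expiry, so one token
// authorizes every segment under that prefix while the segments still
// share cache entries (exp/tok must be excluded from the cache key).
async function signPrefix(prefix: string, expiresAt: number, secret: string): Promise<string> {
  const key = await crypto.subtle.importKey(
    "raw",
    new TextEncoder().encode(secret),
    { name: "HMAC", hash: "SHA-256" },
    false,
    ["sign"],
  );
  const mac = await crypto.subtle.sign(
    "HMAC",
    key,
    new TextEncoder().encode(`${prefix}:${expiresAt}`),
  );
  return [...new Uint8Array(mac)].map((b) => b.toString(16).padStart(2, "0")).join("");
}

// Player URL: /video/abc123/seg0001.ts?exp=1719000000&tok=<hex mac>
// At the edge: recompute signPrefix("/video/abc123/", exp, secret),
// compare against tok, check the expiry, then strip exp and tok
// before forming the cache key.
```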

For large file delivery (software updates, game downloads, OS images) the strategy is different again: prefer providers with generous cache storage at the edge, use range requests, and consider multi-CDN for the biggest events to avoid single-provider capacity limits during a launch.
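
For the range-request piece, a minimal sketch of pulling a large artifact in chunks; the chunk size and status handling are illustrative:

```typescript
// Download a large file in ranged chunks. Each chunk is a separately
// cacheable object at the edge, and a failed download can resume from
// the last complete chunk instead of restarting.
const CHUNK_SIZE = 8 * 1024 * 1024; // 8 MiB, illustrative

async function fetchChunk(url: string, start: number): Promise<ArrayBuffer> {
  const res = await fetch(url, {
    headers: { range: `bytes=${start}-${start + CHUNK_SIZE - 1}` },
  });
  if (res.status !== 206) {
    throw new Error(`expected 206 Partial Content, got ${res.status}`);
  }
  return res.arrayBuffer();
}
```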

6. Multi-CDN, Done Carefully

Multi-CDN is the strategy everyone wants to talk about and few actually execute well. The premise is sound: no single CDN provider has uniformly the best performance and availability in every market, and having a second provider you can cut over to is genuine insurance. The reality is that operating two CDNs in parallel adds real complexity — two configurations to keep in sync, two WAF rule sets, two invalidation workflows, two billing relationships.

When it is worth it

Multi-CDN is worth the effort when you are delivering high-value content where a CDN outage would have significant business impact (ticketing, live events, e-commerce during peak season, streaming video). For most other workloads, a well-configured primary with a tested failover plan to a secondary is a better cost/benefit than full active-active multi-CDN.

The tooling to manage multi-CDN has improved — Cedexis (now Citrix ITM), NS1, and others offer real-time steering based on synthetic and RUM data — but the operational discipline to keep two configurations equivalent is the hard part. Drift between primary and secondary configurations is the number one multi-CDN failure we see.
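
A cheap first line of defense against drift is probing the same paths through both providers and diffing the cache-relevant headers. A sketch, with hypothetical hostnames and a minimal header list:

```typescript
// Probe the same paths via primary and secondary CDN hostnames and
// report differences in cache-relevant behavior.
const PATHS = ["/static/app.js", "/api/health", "/"];
const HEADERS_TO_COMPARE = ["cache-control", "vary", "content-encoding"];

async function diffCdns(primaryHost: string, secondaryHost: string): Promise<void> {
  for (const path of PATHS) {
    const [a, b] = await Promise.all([
      fetch(`https://${primaryHost}${path}`),
      fetch(`https://${secondaryHost}${path}`),
    ]);
    for (const h of HEADERS_TO_COMPARE) {
      if (a.headers.get(h) !== b.headers.get(h)) {
        console.warn(`drift on ${path}: ${h} "${a.headers.get(h)}" vs "${b.headers.get(h)}"`);
      }
    }
  }
}

await diffCdns("www-primary.example.com", "www-secondary.example.com");
```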

The Hidden Costs of CDNs

A few cost realities that do not show up in the initial pitch:

  • Bandwidth at scale. Listed CDN pricing is for small customers. At high volume every provider negotiates; if you are not negotiating, you are paying retail and overpaying by 50 percent or more.
  • Feature upsell. WAF, bot management, image optimization, and analytics are all extra. The base CDN price is the tip of the iceberg.
  • Egress from origin. CDNs reduce egress from your origin, but the egress that does happen still hits your cloud bill. Cache hit rate affects origin egress cost as much as it affects origin compute cost.

What We Actually Configure

A working CDN setup for a mid-size SaaS customer typically looks like: separate behaviors for static assets, HTML, and API routes; cache keys engineered per behavior; origin shielding enabled; edge compute handling auth and simple transformations; WAF tuned for the actual application; RUM-based performance monitoring feeding back into cache rule tuning. The difference between this and a default configuration is usually a 2 to 5x reduction in origin load and a meaningful drop in end-user latency.

Three Takeaways

  1. Cache key design drives everything. Hit rate is the metric that matters and it is a direct consequence of how you define the key.
  2. Origin shielding is free or nearly free and most teams never enable it. Turn it on.
  3. Multi-CDN is a discipline, not a checkbox. Only commit to it if the business case justifies the operational overhead, and if it does, invest in the configuration drift tooling up front.
