Picking a Cloud Region: Latency, Sovereignty, and the Cost of Being Wrong

Which cloud region you deploy to is a decision that looks cheap until you try to move. Here is the framework we actually use to pick one.

John Lane 2023-02-11 6 min read

Cloud region selection gets treated as a throwaway decision on the first day of an architecture review, and then haunts the project for years. The default of "put it in the region closest to the headquarters" is usually wrong, and the default of "put it wherever the sales rep recommends" is usually worse. Here is the framework we actually use when we are helping a customer pick a region, and the six things we look at before we commit.

One: Where Are the Users, Actually

The first question is not "where is the business headquartered" but "where does the request originate." For a B2B SaaS product serving North American customers, a single region in us-east-1 or eastus2 is usually fine — round-trip latency to the farthest coast is around 70 ms, which is imperceptible for most workloads. For a consumer-facing product or a latency-sensitive API, you need to think about p95 latency for the farthest user, not the average.

Pull your logs or your analytics and plot request origin on a map. If 90 percent of your traffic comes from a 500-mile radius around one city, put the primary region there. If it is distributed across two coasts or three continents, you need a multi-region story from day one — and that is a fundamentally different architecture conversation.
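
The "plot it and measure the radius" step can be sketched in a few lines. This is a minimal illustration with made-up coordinates and request counts standing in for your aggregated log data; the 500-mile threshold and the candidate-region coordinates are assumptions for the example, not recommendations.

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def traffic_share_within_radius(origins, center, radius_miles=500):
    """Fraction of requests originating within radius_miles of center.
    origins: list of (lat, lon, request_count) tuples from your logs."""
    total = sum(n for _, _, n in origins)
    near = sum(n for lat, lon, n in origins
               if haversine_miles(lat, lon, *center) <= radius_miles)
    return near / total if total else 0.0

# Hypothetical aggregated log data: (lat, lon, monthly requests).
origins = [
    (40.7, -74.0, 60_000),   # New York
    (41.9, -87.6, 25_000),   # Chicago
    (34.1, -118.2, 10_000),  # Los Angeles
    (51.5, -0.1, 5_000),     # London
]
# Candidate region location roughly matching northern Virginia.
share = traffic_share_within_radius(origins, center=(39.0, -77.5))
print(f"{share:.0%} of traffic within 500 miles of candidate region")
```

If the share comes back at 90 percent or better, a single primary region near that center is defensible; if it splits across continents, you are in the multi-region conversation.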

The mistake we see is deploying to the region closest to the engineering team for convenience, then discovering that the actual users are on another continent. Regional latency is physics; there is no engineering fix that removes it.

Two: Where the Data Has to Live

Data sovereignty is the constraint that trumps everything else. If you are processing EU personal data, it generally has to stay in an EU or EEA region, or move only to a country covered by an adequacy decision or under an approved transfer mechanism such as standard contractual clauses. If you are processing Canadian personal information, PIPEDA and some provincial laws (Quebec's Law 25, for example) push you toward Canadian regions. US public sector means GovCloud or equivalent. Healthcare means HIPAA-capable regions with a signed BAA.
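
One cheap way to enforce this is a deploy-time guard that maps data classifications to permitted regions. The sketch below is illustrative, not legal advice: the region names and the classification-to-region mapping are assumptions you would replace with your own counsel-approved matrix.

```python
# Illustrative mapping only; the real matrix comes from your legal and
# compliance review, not from a blog post.
ALLOWED_REGIONS = {
    "eu_personal_data": {"eu-west-1", "eu-central-1"},
    "canadian_pii": {"ca-central-1"},
    "us_public_sector": {"us-gov-west-1", "us-gov-east-1"},
}

def check_region(data_class: str, region: str) -> None:
    """Fail loudly at deploy time rather than at audit time."""
    allowed = ALLOWED_REGIONS.get(data_class)
    if allowed is None:
        raise ValueError(f"unknown data classification: {data_class}")
    if region not in allowed:
        raise ValueError(
            f"{region} is outside the sovereign perimeter for {data_class}")

check_region("eu_personal_data", "eu-central-1")  # passes silently
```

Wiring a check like this into the pipeline is what produces the clean paperwork trail: the deployment that would have crossed the border never happens.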

Do not fight this. The paperwork trail your auditors will demand is much easier to produce when the data never left the sovereign region, and the political risk of "we thought it was fine" is not worth the latency savings from a cross-border deployment. If you have users in multiple jurisdictions with conflicting requirements, you have a multi-region problem whether you wanted one or not.

Three: Service Availability by Region

Not every cloud service is available in every region. This is true on all three hyperscalers and it is especially true for the newest services. If your architecture depends on a specific managed service — say, Azure OpenAI, or a particular AWS Bedrock model, or a new GCP data warehouse tier — check the regional availability matrix before you commit to a region. We have watched projects get to the detailed design phase and then discover the service they architected around is not in the sovereign region they are required to use.
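
The availability check is worth automating before detailed design. Here is a minimal sketch against a hypothetical availability matrix; in practice you would populate it from the provider's published region and service tables, and the service names below are placeholders, not real product identifiers.

```python
# Hypothetical availability matrix: service -> regions where it exists.
# Populate from the provider's regional availability documentation.
AVAILABILITY = {
    "managed-postgres": {"eu-central-1", "us-east-1", "ca-central-1"},
    "gpu-inference":    {"us-east-1", "us-west-2"},
    "object-storage":   {"eu-central-1", "us-east-1", "us-west-2",
                         "ca-central-1"},
}

def missing_services(region: str, required: list[str]) -> list[str]:
    """Return the required services not offered in the candidate region."""
    return [svc for svc in required
            if region not in AVAILABILITY.get(svc, set())]

gaps = missing_services("eu-central-1",
                        ["managed-postgres", "gpu-inference",
                         "object-storage"])
print(gaps)  # the architecture-breaking gap surfaces before design review
```

Run this for every candidate region against the full list of services in the architecture, and do it again whenever the design adds a dependency.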

The practical workaround is to either use a service that is available everywhere (core compute, storage, networking, basic databases) or to explicitly separate the latency-sensitive workload from the specialized service. You can run your application in a sovereign region and call out to a specialized service in a different region, if the data flow is compliant and the latency is tolerable. Just do it on purpose, not by accident.

Four: Region Pairs and Disaster Recovery

Every hyperscaler offers guidance on pairing regions for disaster recovery. Azure's region pairs are explicit and underpin certain platform services, geo-redundant storage among them. AWS's pairings are less formal but exist in practice. GCP's multi-region storage classes imply pairs. Use them.

The cost of being wrong here is real. We have seen customers deploy primary in one region and DR in a region across a political border, then discover at audit time that they had unintentionally created a cross-border data transfer. We have also seen customers deploy primary and DR in two regions close enough that a single natural disaster could plausibly hit both. Neither is clever. Pick a pair that gives you real geographic separation but stays within your regulatory perimeter.
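
Both failure modes above are checkable before anyone deploys. The sketch below validates a primary/DR pairing against a small table of region metadata; the region names, jurisdictions, and pairings are assumptions for illustration, not a published pairing list.

```python
# Hypothetical region metadata for the sketch; replace with your
# provider's actual pairing guidance and your own regulatory perimeter.
REGION_INFO = {
    "eu-west-1":    {"jurisdiction": "EU", "paired_with": "eu-central-1"},
    "eu-central-1": {"jurisdiction": "EU", "paired_with": "eu-west-1"},
    "us-east-1":    {"jurisdiction": "US", "paired_with": "us-west-2"},
    "us-west-2":    {"jurisdiction": "US", "paired_with": "us-east-1"},
}

def validate_dr_pair(primary: str, dr: str) -> list[str]:
    """Return a list of problems with the proposed primary/DR pairing."""
    problems = []
    p, d = REGION_INFO[primary], REGION_INFO[dr]
    if p["jurisdiction"] != d["jurisdiction"]:
        problems.append("DR region crosses a regulatory border")
    if p["paired_with"] != dr:
        problems.append("DR region is not the recommended pair for the primary")
    return problems

print(validate_dr_pair("eu-west-1", "us-east-1"))  # flags both problems
```

A geographic-separation distance check belongs in the same validator once you add coordinates to the table; the point is that the audit finding becomes a failing check instead of a surprise.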

And test the failover. A DR region you have never failed over to is not a DR region, it is a cost center that will not save you.

Five: Egress and Inter-Region Cost

Hyperscaler egress pricing is asymmetric, and inter-region traffic is priced differently from internet egress. If your architecture involves a lot of inter-region data movement — for replication, analytics, backup, or serving traffic — model the cost before you commit. The three-region deployment that looked balanced in the architecture diagram can produce a five-figure monthly egress bill you did not budget for.
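
Modeling this is arithmetic, not engineering, which is exactly why it gets skipped. A back-of-the-envelope sketch, with illustrative per-GB prices (actual inter-region and internet egress rates vary by provider, region pair, and volume tier):

```python
# Illustrative prices only; pull real rates from your provider's
# data transfer pricing page before trusting the number.
PRICE_PER_GB = {
    "inter_region": 0.02,
    "internet_egress": 0.09,
}

def monthly_egress_cost(flows: dict[str, float]) -> float:
    """flows: GB per month by traffic class (replication, serving, ...)."""
    return sum(gb * PRICE_PER_GB[kind] for kind, gb in flows.items())

# The replication and analytics traffic that looked "free" on the diagram.
cost = monthly_egress_cost({
    "inter_region": 400_000,     # 400 TB of replication + analytics pulls
    "internet_egress": 50_000,   # 50 TB served to users
})
print(f"${cost:,.0f}/month")  # prints "$12,500/month"
```

Five figures a month, from two lines of a diagram. Run the model on every arrow that crosses a region boundary before the architecture is signed off.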

The practical pattern is to keep data gravity on one side. Pick a primary region where the bulk of the data lives, and accept that other regions are either caches (eventually consistent, lower cost) or specialized satellites (limited data flow, controlled cost). Do not build a genuinely symmetric multi-region topology unless you have a business reason that pays for it.

Six: The Region's Future, Not Just Its Present

A region that meets your needs today might not be the one you want to be in five years from now. Some regions are growing rapidly (new service availability, more availability zones, better peering); others are effectively in maintenance mode. Before you commit, check the release cadence — is the region getting new services on the same schedule as the flagship regions, or is it lagging by six to twelve months? Lagging regions accumulate technical debt; you end up architecting around missing services and then migrating when they eventually show up.

Also check the physical realities. Availability zone count matters: a region with three AZs gives you real fault tolerance, while a region with two gives you the illusion of it. Power capacity and network peering in the region's geography matter for sustained performance. None of this is in the hyperscaler's sales deck, but it is in the documentation and in community experience reports if you look for it.

What We Would Actually Do

If a customer comes to us with a greenfield cloud deployment, here is the shape we recommend most of the time.

  • Start with one primary region, chosen for user latency and sovereignty. Not for HQ convenience, not for the sales rep's recommendation.
  • Pick a DR region that is part of the recommended pair, within the same regulatory perimeter. Test failover at least once a quarter. A quarterly test that takes an hour is worth more than an annual test that nobody remembers to run.
  • Keep data gravity in the primary region. Replicate only what DR needs, and treat other regions as caches or read-only satellites.
  • Audit region selection annually. New regions come online. Sovereignty rules change. What was the right region three years ago may not be today.
  • Do not try to be clever with multi-region active-active unless you genuinely need it. It is an order of magnitude harder than anything else in this list, and most workloads do not need it.
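
The checklist above can be collapsed into a crude scoring sketch for comparing candidates side by side. The weights, thresholds, and input fields here are all assumptions for illustration; the one deliberate design choice is that sovereignty is a veto, not a trade-off.

```python
# Illustrative scoring only; weights and thresholds are assumptions,
# not a substitute for the actual analysis in the sections above.
def score_region(candidate: dict) -> float:
    """Score a candidate region in [0, 1]; sovereignty failure is a veto."""
    if not candidate["sovereign_ok"]:
        return 0.0  # no latency number buys back a compliance failure
    checks = [
        (candidate["p95_latency_ms"] <= 100, 3),        # user latency
        (candidate["required_services_available"], 3),  # availability matrix
        (candidate["in_recommended_pair"], 2),          # DR story
        (candidate["az_count"] >= 3, 2),                # real fault tolerance
    ]
    total = sum(w for _, w in checks)
    return sum(w for ok, w in checks if ok) / total

print(score_region({
    "p95_latency_ms": 80, "sovereign_ok": True,
    "required_services_available": True,
    "in_recommended_pair": True, "az_count": 3,
}))  # prints 1.0
```

A score like this is only as good as its inputs, but it forces the inputs to be written down, which is most of the value.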

The cost of being wrong on region selection is not obvious on day one, but it shows up on the migration bill two years later when you realize the data has to move. Pick deliberately. The framework above is boring on purpose — boring region decisions stay out of the incident retrospectives.