5 DR Techniques That Actually Work (And What They Cost)
Five disaster recovery patterns with realistic RTO/RPO targets, budget tiers, and the testing cadence that separates a real DR plan from a paperwork exercise.

Most DR plans are paperwork. They exist to satisfy an auditor, they have never been tested end-to-end, and they will fail in production when somebody actually needs them. Good DR plans share a few characteristics: they match the recovery objective to the business value of the workload, they budget realistically, and they get tested on a schedule.
Here are the five DR techniques we recommend, the RTO and RPO ranges they deliver, the budget tier they fit, and how often we test them.
1. Backup and Restore (RTO: days, RPO: hours)
The cheapest DR strategy. Take backups, store them somewhere that survives the primary failure, restore when needed.
Budget tier: $ — This is the baseline. Object storage is cheap; backup software is the main cost.
What it looks like in practice:
- Veeam, Commvault, or Rubrik handles the backup job
- Destination is immutable cloud object storage (Azure Blob with an immutability policy, S3 Object Lock in Compliance mode, Wasabi); a minimal upload sketch follows this list
- Retention follows 3-2-1-1-0: three copies, two media types, one offsite, one immutable, zero errors in the last restore test
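To make the immutable leg concrete, here is a minimal Python sketch of shipping a backup file to S3 with Object Lock in Compliance mode via boto3. It assumes a bucket created with Object Lock enabled; the bucket, key, and file names are placeholders, and in practice your backup software handles this step for you.

```python
from datetime import datetime, timedelta, timezone

import boto3

s3 = boto3.client("s3")

def ship_backup(local_path: str, bucket: str, key: str, retain_days: int = 35) -> None:
    """Upload one backup file with a Compliance-mode retention lock."""
    # Compliance mode means nobody, including the account root, can shorten
    # or remove the retention period before it expires.
    retain_until = datetime.now(timezone.utc) + timedelta(days=retain_days)
    with open(local_path, "rb") as f:
        s3.put_object(
            Bucket=bucket,
            Key=key,
            Body=f,
            ObjectLockMode="COMPLIANCE",
            ObjectLockRetainUntilDate=retain_until,
        )

# Hypothetical names for illustration.
ship_backup("/backups/sql-full.bak", "dr-backups-immutable", "sql/full/sql-full.bak")
```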
Realistic recovery time: 4 to 48 hours depending on how much data you need to rehydrate and how the restore target is provisioned. Large databases are the bottleneck — a 5 TB SQL Server restore takes hours no matter how fast your network is.
Test cadence: Monthly partial restores, quarterly full restore of at least one Tier 1 workload.
When to use it: Tier 3 and 4 workloads. Dev, QA, internal tools, reporting databases. Anything where "we'll be down for a day" is acceptable.
2. Pilot Light (RTO: hours, RPO: minutes)
A minimal replica of production runs continuously in the DR region — enough to fail over the database and scale up compute on demand, but not enough to serve traffic.
Budget tier: $$ — Database replication cost plus minimal compute footprint.
What it looks like:
- Database replicated continuously (Always On Availability Groups, PostgreSQL streaming replication, RDS cross-region read replicas, Cosmos DB multi-region)
- Compute is either stopped VMs, minimal auto-scaling groups, or Infrastructure-as-Code ready to apply
- DNS is preconfigured with a short TTL or uses Traffic Manager / Route 53 health checks (see the failover sketch after this list)
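Here is a minimal failover sketch for the RDS flavor of this pattern: promote the cross-region read replica, wait for it to become writable, then point DNS at the DR endpoint via Route 53. Every identifier (instance name, hosted zone ID, hostnames) is a placeholder.

```python
import boto3

rds = boto3.client("rds", region_name="us-west-2")  # DR region
route53 = boto3.client("route53")

# 1. Promote the cross-region read replica to a standalone writable primary.
rds.promote_read_replica(DBInstanceIdentifier="app-db-dr")
rds.get_waiter("db_instance_available").wait(DBInstanceIdentifier="app-db-dr")

# 2. Repoint the application's database hostname at the DR endpoint.
#    The short TTL, set long before the disaster, keeps propagation
#    inside the RTO budget.
route53.change_resource_record_sets(
    HostedZoneId="Z0EXAMPLE",
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "db.example.com",
                "Type": "CNAME",
                "TTL": 60,
                "ResourceRecords": [
                    {"Value": "app-db-dr.abc123.us-west-2.rds.amazonaws.com"}
                ],
            },
        }]
    },
)
```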
Realistic recovery time: 1 to 4 hours including database promotion, compute scale-up, DNS propagation, and smoke tests.
Test cadence: Quarterly failover drills, including at least one real failover to the DR region per year.
When to use it: Tier 2 workloads. Important business systems where a few hours of downtime is survivable but a day is not. This is the sweet spot for most mid-market ERP, CRM, and file-sharing workloads.
3. Warm Standby (RTO: minutes, RPO: seconds)
A scaled-down but fully functional replica of production runs continuously. On failover, you scale up and redirect traffic.
Budget tier: $$$ — 20 to 50 percent of production compute cost running continuously in DR.
What it looks like:
- Both regions are provisioned and receive continuous data replication
- DR region runs at minimum capacity (one web server per tier, smallest viable database SKU)
- Automated failover scripts scale up and promote on trigger (sketched after this list)
- Global load balancer (Azure Front Door, CloudFront, Cloudflare) handles traffic redirection
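The scale-up trigger itself can be a few lines. This sketch assumes the DR web tier sits in an Auto Scaling group idling at one instance; the group name and capacities are placeholders, and in practice the same automation would also kick off the database promotion.

```python
import boto3

asg = boto3.client("autoscaling", region_name="us-west-2")  # DR region

# The standby group idles at one instance per tier; on failover, raise it
# to production capacity and let the existing launch template do the rest.
asg.update_auto_scaling_group(
    AutoScalingGroupName="web-tier-dr",  # hypothetical group name
    MinSize=4,
    MaxSize=12,
    DesiredCapacity=8,
)
```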
Realistic recovery time: 5 to 30 minutes.
Test cadence: Monthly partial failovers, quarterly full failover.
When to use it: Tier 1 customer-facing workloads where minutes matter but seconds don't.
4. Active-Active Multi-Region (RTO: seconds, RPO: zero-ish)
Both regions serve live traffic. Failure of one region briefly disrupts a fraction of users while the load balancer reroutes them.
Budget tier: $$$$ — Roughly 2x the cost of single-region, plus additional cross-region networking.
What it looks like:
- Database is either globally distributed (Cosmos DB, DynamoDB global tables, Spanner) or multi-master with conflict resolution; a global-table sketch follows this list
- Application is stateless or session-state is replicated
- Global load balancer with health-based routing
- Careful attention to eventual consistency, because strong consistency across regions is expensive
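For the DynamoDB route, a global table is an ordinary table with streams enabled plus one replica per region. The sketch below uses placeholder table and region names; note that DynamoDB resolves concurrent cross-region writes last-writer-wins, which is exactly the eventual-consistency trade the last bullet warns about.

```python
import boto3

ddb = boto3.client("dynamodb", region_name="us-east-1")

# Streams with new-and-old images are a prerequisite for replication.
ddb.create_table(
    TableName="sessions",  # hypothetical table
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
    StreamSpecification={"StreamEnabled": True, "StreamViewType": "NEW_AND_OLD_IMAGES"},
)
ddb.get_waiter("table_exists").wait(TableName="sessions")

# Adding a replica region turns the table into a global table; each region
# then accepts reads and writes locally.
ddb.update_table(
    TableName="sessions",
    ReplicaUpdates=[{"Create": {"RegionName": "eu-west-1"}}],
)
```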
Realistic recovery time: Seconds. But the architectural complexity is the highest of any option on this list.
Test cadence: Continuous. Chaos engineering (kill a region during business hours) is the only real test.
When to use it: Tier 0 workloads where downtime is measured in revenue per minute. Most organizations don't need this and can't afford the architectural cost. If you do need it, you probably already know.
5. Cloud-to-Cloud or Cloud-to-On-Prem Replication
The pattern people forget. Your primary is cloud, but your DR is somewhere else — either a second cloud or your own data center.
Budget tier: $$ to $$$ — Depends on egress and duplicate infrastructure.
Why it exists: To protect against the one scenario the other four don't — a full cloud vendor outage, a region-wide compromise, or a billing/account-lock event. These are rare but not zero. OVH's Strasbourg fire in 2021 was real. AWS us-east-1 has gone sideways more than once.
What it looks like:
- Database backups shipped to a second cloud via rclone, Azure Data Box Gateway, or native cross-cloud services (one approach is sketched after this list)
- Infrastructure-as-Code defined in a cloud-neutral format (Terraform with separate providers, or two parallel IaC stacks)
- DNS controlled outside either cloud (NS1, Cloudflare, or a self-managed authoritative server)
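One way to ship the insurance copy, sketched below: stream each backup object from S3 straight into Azure Blob Storage so nothing lands on local disk. The account URL, container, bucket, key, and credential environment variable are all placeholders, and every run pays S3 egress.

```python
import os

import boto3
from azure.storage.blob import BlobServiceClient

s3 = boto3.client("s3")
blob_service = BlobServiceClient(
    account_url="https://drbackups.blob.core.windows.net",  # hypothetical account
    credential=os.environ["AZURE_STORAGE_KEY"],
)

def ship_cross_cloud(bucket: str, key: str) -> None:
    """Copy one backup object from S3 to Azure without staging it locally."""
    obj = s3.get_object(Bucket=bucket, Key=key)
    blob = blob_service.get_blob_client(container="dr-copies", blob=key)
    blob.upload_blob(obj["Body"], length=obj["ContentLength"], overwrite=True)

ship_cross_cloud("dr-backups-immutable", "sql/full/sql-full.bak")
```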
Test cadence: Quarterly at minimum. This is the test most people skip and most regret later.
What We'd Actually Do
For a typical mid-market customer with 50 to 200 applications:
- 80 percent: backup and restore. Immutable object storage, Veeam or equivalent, tested monthly.
- 15 percent: pilot light. The important-but-not-critical systems. ERP, CRM, primary file shares.
- 5 percent: warm standby. The customer-facing revenue-generating workloads.
- Cloud-to-cloud shipping of critical backups as insurance against vendor-level events.
Don't over-build. A pilot light DR for a dev database is money lit on fire.
Three Takeaways
- Match the DR tier to the business value. Not every workload needs the same RTO.
- Immutable backups are the single highest-leverage DR investment in 2025. Ransomware made them non-negotiable.
- An untested DR plan is not a DR plan. Quarterly failover drills are the difference between a plan that works and a plan that makes auditors happy.
Talk with us about your infrastructure
Schedule a consultation with a solutions architect.