Route 53 Deep Dive: Multi-Region Latency Routing with Health-Based Failover
2024-07-21
Introduction
In this deep dive, we'll explore how to configure Amazon Route 53 to route user traffic to the region with the lowest latency and automatically fail over to another region if the preferred one becomes unhealthy. This setup improves availability and performance for global applications.
Why Should You Care?
Implementing latency-based routing with health checks in Route 53 allows your application to:
- Serve users from the healthy region with the lowest network latency, which is usually but not always the geographically nearest one.
- Automatically reroute traffic during regional outages, enhancing resilience.
- Maintain high availability without manual intervention.
Prerequisites
- Basic understanding of AWS Route 53 and DNS concepts.
- An AWS account with Route 53 and CloudWatch access.
- A domain name with a public hosted zone in Route 53. If the domain is registered or hosted elsewhere (Cloudflare, for example), either delegate the domain to Route 53 or delegate a subdomain to a Route 53 hosted zone.
- Terraform installed on your local machine.
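For the hosted zone prerequisite, a minimal sketch of how the zone might be created and its name servers exposed for delegation. The resource name `primary` matches the reference used in the DNS records later in this guide; the `zone_name` variable and the output name are assumptions introduced here for illustration.

```hcl
# Route 53 is a global service, but the provider still needs a region.
provider "aws" {
  region = "us-east-1"
}

# Hypothetical input: the domain or delegated subdomain that Route 53 will serve.
variable "zone_name" {
  type        = string
  description = "Domain or subdomain hosted in Route 53"
}

resource "aws_route53_zone" "primary" {
  name = var.zone_name
}

# Create NS records for these servers at your registrar or parent DNS provider
# to delegate the domain (or subdomain) to Route 53.
output "primary_name_servers" {
  value = aws_route53_zone.primary.name_servers
}
```

If the zone already exists, a `data "aws_route53_zone"` lookup would serve the same purpose, with references changed accordingly.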
Architecture Overview
We'll deploy identical applications in two AWS regions: us-east-1 and eu-west-1. Each region will have:
- An Application Load Balancer (ALB) fronting the application.
- A Route 53 health check monitoring the ALB's /health endpoint.
- A latency-based DNS record directing traffic to the region with the lowest latency.
If a health check fails, Route 53 will exclude that region from DNS responses, effectively failing over to the healthy region.
Step-by-Step Guide
1. Set Up ALBs in Both Regions
Deploy your application behind an ALB in both us-east-1 (N. Virginia) and eu-west-1 (Ireland). Ensure each ALB has a listener on port 80 and a target group with healthy targets.
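The guide does not show the ALB configuration, so here is a rough sketch of what the us-east-1 side could look like. All names, the `use1` provider alias, and the three input variables are assumptions; the VPC, subnets, security groups, and target registration (an Auto Scaling group or ECS service, for example) are expected to exist elsewhere.

```hcl
# Aliased provider so that a second copy can target eu-west-1.
provider "aws" {
  alias  = "use1"
  region = "us-east-1"
}

# Hypothetical inputs describing existing network resources.
variable "use1_vpc_id" {
  type = string
}

variable "use1_public_subnet_ids" {
  type = list(string)
}

variable "use1_alb_security_group_ids" {
  type = list(string)
}

resource "aws_lb" "us_east" {
  provider           = aws.use1
  name               = "r53-demo-us-east-1"
  internal           = false
  load_balancer_type = "application"
  subnets            = var.use1_public_subnet_ids
  security_groups    = var.use1_alb_security_group_ids
}

resource "aws_lb_target_group" "us_east" {
  provider = aws.use1
  name     = "r53-demo-us-east-1"
  port     = 80
  protocol = "HTTP"
  vpc_id   = var.use1_vpc_id

  # The ALB's own target health check uses the same path Route 53 will probe.
  health_check {
    path    = "/health"
    matcher = "200"
  }
}

resource "aws_lb_listener" "us_east_http" {
  provider          = aws.use1
  load_balancer_arn = aws_lb.us_east.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.us_east.arn
  }
}
```

The eu-west-1 side would mirror this with a second aliased provider and its own variables.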
2. Create Route 53 Health Checks
Define health checks for each ALB's /health endpoint.
    # Health check for the us-east-1 ALB (replace fqdn with your ALB's DNS name)
    resource "aws_route53_health_check" "us_east" {
      fqdn              = "alb-us-east-1.example.com"
      port              = 80
      type              = "HTTP"
      resource_path     = "/health"
      failure_threshold = 3
      request_interval  = 30
    }

    # Health check for the eu-west-1 ALB (replace fqdn with your ALB's DNS name)
    resource "aws_route53_health_check" "eu_west" {
      fqdn              = "alb-eu-west-1.example.com"
      port              = 80
      type              = "HTTP"
      resource_path     = "/health"
      failure_threshold = 3
      request_interval  = 30
    }
3. Configure Latency-Based DNS Records
Create latency-based DNS records pointing to each ALB, associating them with the respective health checks.
    # Latency record for us-east-1
    resource "aws_route53_record" "us_east" {
      zone_id        = aws_route53_zone.primary.zone_id
      name           = "r53-demo.moabukar.co.uk"
      type           = "A"
      set_identifier = "us-east-1"

      alias {
        name                   = "alb-us-east-1.example.com" # Replace with your ALB's DNS name
        zone_id                = "Z35SXDOTRQ7X7K"            # Replace with your ALB's zone ID
        evaluate_target_health = true
      }

      # The region belongs inside latency_routing_policy; the record has no
      # top-level region argument for this purpose.
      latency_routing_policy {
        region = "us-east-1"
      }

      health_check_id = aws_route53_health_check.us_east.id
    }

    # Latency record for eu-west-1
    resource "aws_route53_record" "eu_west" {
      zone_id        = aws_route53_zone.primary.zone_id
      name           = "r53-demo.moabukar.co.uk"
      type           = "A"
      set_identifier = "eu-west-1"

      alias {
        name                   = "alb-eu-west-1.example.com" # Replace with your ALB's DNS name
        zone_id                = "Z32O12XQLNTSW2"            # Replace with your ALB's zone ID
        evaluate_target_health = true
      }

      latency_routing_policy {
        region = "eu-west-1"
      }

      health_check_id = aws_route53_health_check.eu_west.id
    }
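Note: The zone_id inside each alias block is the hosted zone ID of the load balancer itself, not of your domain's hosted zone. Replace these values with the correct zone IDs for your ALBs; AWS lists the ID for each region in the Elastic Load Balancing endpoints documentation. Replace each alias name with the corresponding ALB's DNS name as well.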
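As an alternative to hardcoding these values, the ALB's DNS name and zone ID can be read from the load balancer. The sketch below is a variant of the us_east record, not an addition to it, and it assumes an ALB named `r53-demo-us-east-1` already exists; that name and the `use1` provider alias are hypothetical.

```hcl
# Skip this block if an aliased us-east-1 provider is already declared.
provider "aws" {
  alias  = "use1"
  region = "us-east-1"
}

# Look up the existing ALB by name (hypothetical name).
data "aws_lb" "us_east" {
  provider = aws.use1
  name     = "r53-demo-us-east-1"
}

# Variant of the us_east record; declare only one of the two versions.
resource "aws_route53_record" "us_east" {
  zone_id        = aws_route53_zone.primary.zone_id
  name           = "r53-demo.moabukar.co.uk"
  type           = "A"
  set_identifier = "us-east-1"

  alias {
    name                   = data.aws_lb.us_east.dns_name
    zone_id                = data.aws_lb.us_east.zone_id
    evaluate_target_health = true
  }

  latency_routing_policy {
    region = "us-east-1"
  }

  health_check_id = aws_route53_health_check.us_east.id
}
```

If the ALB is managed in the same Terraform configuration, the `dns_name` and `zone_id` attributes of the `aws_lb` resource can be referenced directly instead of using a data source.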
Apply with Terraform
    terraform init
    terraform apply
Verify that the DNS records are created and health checks are active.
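To make verification a little easier, outputs along these lines could surface the record name and health check IDs after `terraform apply`. The output names are assumptions; the referenced resources are the ones defined in the steps above.

```hcl
# Fully qualified name served by the latency records.
output "latency_record_fqdn" {
  value = aws_route53_record.us_east.fqdn
}

# Health check IDs, handy for finding the checks in the Route 53 console.
output "health_check_ids" {
  value = {
    us_east = aws_route53_health_check.us_east.id
    eu_west = aws_route53_health_check.eu_west.id
  }
}
```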
Test the Setup
- Normal Operation: When both regions are healthy, Route 53 answers each query with the region that offers the lowest latency for the user's DNS resolver, which usually matches the user's location.
- Simulate Failure: Stop the application in us-east-1 to trigger a health check failure.
- Failover: Route 53 marks the health check unhealthy and stops returning us-east-1 in DNS responses, so traffic shifts to eu-west-1. With failure_threshold = 3 and request_interval = 30, expect detection to take roughly 90 seconds, plus however long resolvers cache the previous answer.
- Recovery: Restart the application in us-east-1. Once the health check passes, Route 53 includes it again in DNS responses.
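If stopping the application is inconvenient, one alternative (not the approach described in the steps above) is to temporarily invert the health check so that a healthy endpoint is reported as unhealthy. A sketch, reusing the us_east health check with one extra argument:

```hcl
# Same health check as before, with inversion enabled for the failover test.
resource "aws_route53_health_check" "us_east" {
  fqdn              = "alb-us-east-1.example.com"
  port              = 80
  type              = "HTTP"
  resource_path     = "/health"
  failure_threshold = 3
  request_interval  = 30

  # Temporary: a passing check is treated as failing while this is true.
  # Remove the line (or set it to false) and re-apply to simulate recovery.
  invert_healthcheck = true
}
```

This exercises the Route 53 failover path only; it does not test how the application or ALB behaves during a real outage.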
Monitoring and Observability
- Route 53 Console: Monitor health check statuses and DNS records.
- CloudWatch: Set up alarms for health check failures to receive notifications.
- Logs: Enable query logging in Route 53 to analyze DNS queries.
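As an illustration of the CloudWatch item, here is a sketch of an alarm on the us_east health check. Route 53 publishes health check metrics in us-east-1, so the alarm and its SNS topic are created there. The topic, alarm names, and the `use1` provider alias are assumptions, and subscribing an email address or other endpoint to the topic is left out.

```hcl
# Skip this block if an aliased us-east-1 provider is already declared.
provider "aws" {
  alias  = "use1"
  region = "us-east-1"
}

# Hypothetical SNS topic that receives alarm notifications.
resource "aws_sns_topic" "health_alerts" {
  provider = aws.use1
  name     = "r53-demo-health-alerts"
}

# HealthCheckStatus is 1 when healthy and 0 when unhealthy, so a one-minute
# minimum below 1 means the check failed during that minute.
resource "aws_cloudwatch_metric_alarm" "us_east_unhealthy" {
  provider            = aws.use1
  alarm_name          = "r53-demo-us-east-1-unhealthy"
  namespace           = "AWS/Route53"
  metric_name         = "HealthCheckStatus"
  statistic           = "Minimum"
  period              = 60
  evaluation_periods  = 1
  comparison_operator = "LessThanThreshold"
  threshold           = 1

  dimensions = {
    HealthCheckId = aws_route53_health_check.us_east.id
  }

  alarm_actions = [aws_sns_topic.health_alerts.arn]
}
```

A matching alarm for the eu_west health check would differ only in its name and the `HealthCheckId` dimension.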
Conclusion
By configuring latency-based routing with health checks in Route 53, you ensure your application serves users from the lowest-latency healthy region and shifts traffic away from a failed region without manual intervention. This matters most for global applications where performance and uptime are critical.