Implementing High Availability on AWS with Terraform
Learn how to design and implement highly available architectures on AWS using Terraform, including multi-AZ deployments, load balancing, and fault tolerance strategies
High Availability (HA) is crucial for maintaining system uptime and reliability. This guide demonstrates how to implement HA architectures on AWS using Terraform, covering multi-AZ deployments, load balancing, and fault tolerance strategies.
Prerequisites
- AWS CLI configured with appropriate permissions
- Terraform installed (version 1.0.0 or later)
- Basic understanding of AWS services and HA concepts
- Understanding of Terraform configuration syntax
Project Structure
ha-terraform/
├── main.tf
├── variables.tf
├── outputs.tf
├── alb.tf
├── asg.tf
├── rds.tf
└── terraform.tfvars
Setting Up the Infrastructure
Create main.tf:
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

# VPC Configuration
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "${var.project_name}-vpc"
  }
}

# Availability Zones
data "aws_availability_zones" "available" {
  state = "available"
}

# Public Subnets
resource "aws_subnet" "public" {
  count                   = var.az_count
  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.project_name}-public-${count.index + 1}"
  }
}

# Private Subnets
resource "aws_subnet" "private" {
  count             = var.az_count
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index + var.az_count)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "${var.project_name}-private-${count.index + 1}"
  }
}

# Internet Gateway
resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id

  tags = {
    Name = "${var.project_name}-igw"
  }
}

# Elastic IPs for NAT Gateways
resource "aws_eip" "nat" {
  count = var.az_count
  vpc   = true

  tags = {
    Name = "${var.project_name}-nat-eip-${count.index + 1}"
  }
}

# NAT Gateways (one per AZ, so an AZ failure does not cut off outbound traffic elsewhere)
resource "aws_nat_gateway" "main" {
  count         = var.az_count
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id

  tags = {
    Name = "${var.project_name}-nat-${count.index + 1}"
  }

  depends_on = [aws_internet_gateway.main]
}

# Route Tables
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }

  tags = {
    Name = "${var.project_name}-public-rt"
  }
}

resource "aws_route_table" "private" {
  count  = var.az_count
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index].id
  }

  tags = {
    Name = "${var.project_name}-private-rt-${count.index + 1}"
  }
}

# Route Table Associations
resource "aws_route_table_association" "public" {
  count          = var.az_count
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

resource "aws_route_table_association" "private" {
  count          = var.az_count
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
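A quick way to sanity-check the subnet math: cidrsubnet(prefix, newbits, netnum) adds newbits to the prefix length and selects the netnum-th resulting block. With the default 10.0.0.0/16 VPC and az_count = 2, the expressions above resolve as shown in this illustrative locals block (not required by the configuration):

```hcl
locals {
  # cidrsubnet("10.0.0.0/16", 8, n) yields the n-th /24 of the VPC
  example_public  = [cidrsubnet("10.0.0.0/16", 8, 0), cidrsubnet("10.0.0.0/16", 8, 1)] # 10.0.0.0/24, 10.0.1.0/24
  example_private = [cidrsubnet("10.0.0.0/16", 8, 2), cidrsubnet("10.0.0.0/16", 8, 3)] # 10.0.2.0/24, 10.0.3.0/24
}
```

Because the private subnets offset netnum by var.az_count, public and private ranges never overlap regardless of how many AZs you use.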
Create alb.tf:
# Application Load Balancer
resource "aws_lb" "main" {
  name               = "${var.project_name}-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb.id]
  subnets            = aws_subnet.public[*].id

  enable_deletion_protection = true

  tags = {
    Name = "${var.project_name}-alb"
  }
}

# ALB Target Group
resource "aws_lb_target_group" "main" {
  name        = "${var.project_name}-tg"
  port        = 80
  protocol    = "HTTP"
  vpc_id      = aws_vpc.main.id
  target_type = "instance"

  health_check {
    healthy_threshold   = 2
    interval            = 30
    protocol            = "HTTP"
    matcher             = "200"
    timeout             = 5
    path                = "/health"
    unhealthy_threshold = 2
  }
}

# ALB Listener
resource "aws_lb_listener" "http" {
  load_balancer_arn = aws_lb.main.arn
  port              = 80
  protocol          = "HTTP"

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.main.arn
  }
}

# ALB Security Group
resource "aws_security_group" "alb" {
  name        = "${var.project_name}-alb-sg"
  description = "Security group for ALB"
  vpc_id      = aws_vpc.main.id

  ingress {
    description = "HTTP from anywhere"
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-alb-sg"
  }
}
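In production you would normally terminate TLS at the ALB as well. A sketch of an HTTPS listener, assuming you add a hypothetical var.acm_certificate_arn variable holding the ARN of an existing ACM certificate:

```hcl
# Hypothetical HTTPS listener; var.acm_certificate_arn is an assumed new variable,
# not part of the configuration above.
resource "aws_lb_listener" "https" {
  load_balancer_arn = aws_lb.main.arn
  port              = 443
  protocol          = "HTTPS"
  ssl_policy        = "ELBSecurityPolicy-TLS13-1-2-2021-06"
  certificate_arn   = var.acm_certificate_arn

  default_action {
    type             = "forward"
    target_group_arn = aws_lb_target_group.main.arn
  }
}
```

With this in place, the port-80 listener is typically changed to a redirect action so all traffic arrives over TLS.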
Create asg.tf:
# Launch Template
resource "aws_launch_template" "main" {
  name_prefix   = "${var.project_name}-lt"
  image_id      = var.ami_id
  instance_type = var.instance_type

  network_interfaces {
    associate_public_ip_address = false
    security_groups             = [aws_security_group.app.id]
  }

  user_data = base64encode(<<-EOF
    #!/bin/bash
    yum update -y
    yum install -y httpd
    systemctl start httpd
    systemctl enable httpd
    echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html
    echo "OK" > /var/www/html/health
    EOF
  )

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "${var.project_name}-app"
    }
  }
}

# Auto Scaling Group
resource "aws_autoscaling_group" "main" {
  name                = "${var.project_name}-asg"
  desired_capacity    = var.desired_capacity
  max_size            = var.max_size
  min_size            = var.min_size
  target_group_arns   = [aws_lb_target_group.main.arn]
  vpc_zone_identifier = aws_subnet.private[*].id

  # Replace instances that fail the ALB health check, not only EC2 status checks
  health_check_type         = "ELB"
  health_check_grace_period = 300

  launch_template {
    id      = aws_launch_template.main.id
    version = "$Latest"
  }

  tag {
    key                 = "Name"
    value               = "${var.project_name}-app"
    propagate_at_launch = true
  }
}

# Application Security Group
resource "aws_security_group" "app" {
  name        = "${var.project_name}-app-sg"
  description = "Security group for application instances"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "HTTP from ALB"
    from_port       = 80
    to_port         = 80
    protocol        = "tcp"
    security_groups = [aws_security_group.alb.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "${var.project_name}-app-sg"
  }
}

# Auto Scaling Policies
resource "aws_autoscaling_policy" "scale_up" {
  name                   = "${var.project_name}-scale-up"
  scaling_adjustment     = 1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.main.name
}

resource "aws_autoscaling_policy" "scale_down" {
  name                   = "${var.project_name}-scale-down"
  scaling_adjustment     = -1
  adjustment_type        = "ChangeInCapacity"
  cooldown               = 300
  autoscaling_group_name = aws_autoscaling_group.main.name
}
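The scale-up and scale-down policies do nothing until something invokes them. A minimal sketch of CloudWatch alarms that trigger them on average CPU utilization (the 70%/30% thresholds are illustrative choices, not values from the original configuration):

```hcl
# CPU-high alarm drives scale-up
resource "aws_cloudwatch_metric_alarm" "cpu_high" {
  alarm_name          = "${var.project_name}-cpu-high"
  comparison_operator = "GreaterThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 70

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.main.name
  }

  alarm_actions = [aws_autoscaling_policy.scale_up.arn]
}

# CPU-low alarm drives scale-down
resource "aws_cloudwatch_metric_alarm" "cpu_low" {
  alarm_name          = "${var.project_name}-cpu-low"
  comparison_operator = "LessThanThreshold"
  evaluation_periods  = 2
  metric_name         = "CPUUtilization"
  namespace           = "AWS/EC2"
  period              = 120
  statistic           = "Average"
  threshold           = 30

  dimensions = {
    AutoScalingGroupName = aws_autoscaling_group.main.name
  }

  alarm_actions = [aws_autoscaling_policy.scale_down.arn]
}
```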
Create rds.tf:
# DB Subnet Group
resource "aws_db_subnet_group" "main" {
  name       = "${var.project_name}-db-subnet"
  subnet_ids = aws_subnet.private[*].id

  tags = {
    Name = "${var.project_name}-db-subnet"
  }
}

# RDS Instance
resource "aws_db_instance" "main" {
  identifier        = "${var.project_name}-db"
  engine            = "mysql"
  engine_version    = "8.0"
  instance_class    = "db.t3.medium"
  allocated_storage = 20
  storage_encrypted = true # encrypt data at rest

  db_name  = var.database_name
  username = var.database_username
  password = var.database_password

  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.db.id]

  multi_az            = true # synchronous standby in a second AZ with automatic failover
  publicly_accessible = false
  skip_final_snapshot = false
  deletion_protection = true

  backup_retention_period = 7
  backup_window           = "03:00-04:00"
  maintenance_window      = "Mon:04:00-Mon:05:00"

  tags = {
    Name = "${var.project_name}-db"
  }
}

# DB Security Group
resource "aws_security_group" "db" {
  name        = "${var.project_name}-db-sg"
  description = "Security group for RDS instance"
  vpc_id      = aws_vpc.main.id

  ingress {
    description     = "MySQL from application"
    from_port       = 3306
    to_port         = 3306
    protocol        = "tcp"
    security_groups = [aws_security_group.app.id]
  }

  tags = {
    Name = "${var.project_name}-db-sg"
  }
}
Variables Configuration
Create variables.tf:
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-west-2"
}

variable "project_name" {
  description = "Name of the project"
  type        = string
}

variable "vpc_cidr" {
  description = "CIDR block for VPC"
  type        = string
  default     = "10.0.0.0/16"
}

variable "az_count" {
  description = "Number of AZs to use"
  type        = number
  default     = 2
}

variable "ami_id" {
  description = "AMI ID for EC2 instances"
  type        = string
}

variable "instance_type" {
  description = "Instance type for EC2 instances"
  type        = string
  default     = "t3.micro"
}

variable "desired_capacity" {
  description = "Desired number of instances in ASG"
  type        = number
  default     = 2
}

variable "min_size" {
  description = "Minimum number of instances in ASG"
  type        = number
  default     = 2
}

variable "max_size" {
  description = "Maximum number of instances in ASG"
  type        = number
  default     = 4
}

variable "database_name" {
  description = "Name of the database"
  type        = string
}

variable "database_username" {
  description = "Database master username"
  type        = string
}

variable "database_password" {
  description = "Database master password"
  type        = string
  sensitive   = true
}
Output Configuration
Create outputs.tf:
output "alb_dns_name" {
  description = "DNS name of the load balancer"
  value       = aws_lb.main.dns_name
}

output "rds_endpoint" {
  description = "Endpoint of the RDS instance"
  value       = aws_db_instance.main.endpoint
}

output "private_subnets" {
  description = "IDs of private subnets"
  value       = aws_subnet.private[*].id
}

output "public_subnets" {
  description = "IDs of public subnets"
  value       = aws_subnet.public[*].id
}
High Availability Features
- Multi-AZ Infrastructure
  - VPC spans multiple Availability Zones
  - Public and private subnets in each AZ
  - NAT Gateway in each AZ for redundancy
- Load Balancing
  - Application Load Balancer for HTTP traffic
  - Health checks for instance monitoring
  - Cross-zone load balancing (enabled by default on ALBs)
- Auto Scaling
  - Auto Scaling Group spanning multiple AZs
  - Scale-up and scale-down policies
  - Launch Template for consistent instance configuration
- Database High Availability
  - Multi-AZ RDS deployment with automatic failover
  - Automated backups enabled
  - Maintenance windows configured
Deployment Steps
- Initialize Terraform:
  terraform init
- Create terraform.tfvars:
  aws_region        = "us-west-2"
  project_name      = "ha-demo"
  ami_id            = "ami-0735c191cf914754d"
  database_name     = "hadb"
  database_username = "admin"
  database_password = "your-secure-password"
- Review the plan:
  terraform plan
- Apply the configuration:
  terraform apply
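Avoid committing the database password in terraform.tfvars. Terraform reads any environment variable named TF_VAR_<name> as the value of variable "<name>", so you can supply the secret at runtime instead:

```shell
# Terraform picks this up as the value of variable "database_password";
# remove the database_password line from terraform.tfvars when using this.
export TF_VAR_database_password='your-secure-password'
```

Because the variable is marked sensitive, Terraform redacts it in plan and apply output either way.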
Best Practices
- Redundancy
  - Deploy across multiple AZs
  - Use Auto Scaling Groups
  - Implement Multi-AZ RDS
- Monitoring
  - Configure health checks
  - Set up CloudWatch alarms
  - Monitor instance metrics
- Security
  - Use private subnets for application instances
  - Restrict traffic with security groups
  - Enable encryption for sensitive data
Monitoring and Maintenance
- Health Checks
  - ALB target health checks
  - RDS monitoring
  - EC2 instance status checks
- Scaling
  - CPU utilization-based scaling
  - Schedule-based scaling
  - Target tracking policies
- Backup and Recovery
  - Automated RDS backups
  - AMI backups
  - Disaster recovery planning
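As an alternative to the simple step policies defined in asg.tf, a target tracking policy lets Auto Scaling manage the alarms for you and hold a metric near a target. A sketch (the 50% target is an illustrative choice):

```hcl
# Target tracking: Auto Scaling creates and manages the CloudWatch alarms itself,
# adding or removing instances to keep average CPU near the target value.
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "${var.project_name}-cpu-target"
  policy_type            = "TargetTrackingScaling"
  autoscaling_group_name = aws_autoscaling_group.main.name

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value = 50.0
  }
}
```

This is generally less tuning-intensive than step policies, since you declare the desired steady state rather than individual thresholds and cooldowns.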
Conclusion
You’ve learned how to implement a highly available architecture on AWS using Terraform. This setup provides:
- Multi-AZ redundancy
- Automated scaling
- Load balancing
- Database high availability
Remember to:
- Monitor your infrastructure
- Test failover scenarios
- Keep configurations up to date
- Regularly review and optimize costs