Setting up AWS EKS (Elastic Kubernetes Service) with Terraform

Learn how to provision and manage AWS EKS clusters using Terraform, including node groups, add-ons, and best practices for Kubernetes deployments

Overview

AWS EKS is a managed Kubernetes service: AWS runs and scales the Kubernetes control plane while you manage worker capacity and workloads. This guide demonstrates how to set up and manage EKS clusters using Infrastructure as Code with Terraform.

Prerequisites

Required Tools

  • AWS CLI configured with appropriate permissions
  • Terraform (version 1.0.0 or later)
  • kubectl installed
  • AWS IAM permissions for EKS management

Knowledge Requirements

  • Basic understanding of Kubernetes concepts
  • Familiarity with AWS services
  • Understanding of Infrastructure as Code
  • Basic Terraform knowledge

Infrastructure Design

Project Structure

terraform-eks/
├── main.tf           # Main Terraform configuration
├── variables.tf      # Input variables
├── outputs.tf        # Output values
├── modules/
│   └── eks/         # EKS module
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
└── kubernetes/      # Kubernetes manifests
    └── manifests/
        ├── namespace.yaml
        └── deployment.yaml

Core Components

  1. EKS Cluster
  2. Managed Node Groups
  3. Fargate Profiles
  4. VPC Configuration
  5. Security Groups
  6. IAM Roles and Policies
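
The snippets in this guide reference a set of input variables. A minimal `variables.tf` sketch covering them (names match the snippets below; the defaults are illustrative assumptions, not recommendations):

```hcl
variable "project_name" {
  description = "Prefix applied to all resource names"
  type        = string
}

variable "kubernetes_version" {
  description = "EKS control plane version"
  type        = string
  default     = "1.29"
}

variable "subnet_ids" {
  description = "Subnets for the EKS control plane ENIs"
  type        = list(string)
}

variable "private_subnet_ids" {
  description = "Private subnets for node groups and Fargate"
  type        = list(string)
}

variable "allowed_cidr_blocks" {
  description = "CIDRs allowed to reach the public API endpoint"
  type        = list(string)
  default     = ["0.0.0.0/0"] # tighten this in real deployments
}

variable "service_ipv4_cidr" {
  description = "CIDR block for Kubernetes service IPs"
  type        = string
  default     = "172.20.0.0/16"
}

variable "desired_size" {
  type    = number
  default = 2
}

variable "min_size" {
  type    = number
  default = 1
}

variable "max_size" {
  type    = number
  default = 4
}

variable "tags" {
  type    = map(string)
  default = {}
}
```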

Implementation Guide

1. EKS Cluster Configuration

Create the main EKS cluster with encryption and logging:

# EKS Cluster
resource "aws_eks_cluster" "main" {
  name     = "${var.project_name}-cluster"
  role_arn = aws_iam_role.cluster.arn
  version  = var.kubernetes_version

  vpc_config {
    subnet_ids              = var.subnet_ids
    endpoint_private_access = true
    endpoint_public_access  = true
    public_access_cidrs     = var.allowed_cidr_blocks
    security_group_ids      = [aws_security_group.cluster.id]
  }

  encryption_config {
    provider {
      key_arn = aws_kms_key.eks.arn
    }
    resources = ["secrets"]
  }

  enabled_cluster_log_types = [
    "api",
    "audit",
    "authenticator",
    "controllerManager",
    "scheduler"
  ]

  kubernetes_network_config {
    service_ipv4_cidr = var.service_ipv4_cidr
    ip_family         = "ipv4"
  }

  tags = merge(
    var.tags,
    {
      Name = "${var.project_name}-cluster"
    }
  )
}
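
The cluster block above references an IAM role, a KMS key, and a security group that must exist first. A minimal sketch of those supporting resources (resource names match the references above; `var.vpc_id` is an assumed input):

```hcl
# KMS key used for envelope encryption of Kubernetes secrets
resource "aws_kms_key" "eks" {
  description             = "${var.project_name} EKS secret encryption"
  deletion_window_in_days = 7
  enable_key_rotation     = true
}

# IAM role assumed by the EKS control plane
resource "aws_iam_role" "cluster" {
  name = "${var.project_name}-cluster-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "eks.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "cluster" {
  role       = aws_iam_role.cluster.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSClusterPolicy"
}

# Security group attached to the control plane ENIs
resource "aws_security_group" "cluster" {
  name_prefix = "${var.project_name}-cluster-"
  vpc_id      = var.vpc_id
}
```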

2. Node Group Management

Configure managed node groups for workload execution:

resource "aws_eks_node_group" "main" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "${var.project_name}-node-group"
  node_role_arn   = aws_iam_role.node_group.arn
  subnet_ids      = var.private_subnet_ids

  scaling_config {
    desired_size = var.desired_size
    max_size     = var.max_size
    min_size     = var.min_size
  }

  update_config {
    max_unavailable = 1
  }

  ami_type       = "AL2_x86_64"
  capacity_type  = "ON_DEMAND"
  disk_size      = 50
  instance_types = ["t3.medium"]

  labels = {
    role = "general"
  }
}
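
The node group references `aws_iam_role.node_group`, which needs the standard worker-node policies attached before nodes can join the cluster. A sketch using the AWS managed policies:

```hcl
# IAM role assumed by the worker node EC2 instances
resource "aws_iam_role" "node_group" {
  name = "${var.project_name}-node-group-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

# Managed policies required by EKS worker nodes
resource "aws_iam_role_policy_attachment" "node_group" {
  for_each = toset([
    "arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy",
    "arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy",
    "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly",
  ])

  role       = aws_iam_role.node_group.name
  policy_arn = each.value
}
```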

3. Fargate Integration

Set up Fargate profiles for serverless workloads:

resource "aws_eks_fargate_profile" "main" {
  cluster_name           = aws_eks_cluster.main.name
  fargate_profile_name   = "${var.project_name}-fargate"
  pod_execution_role_arn = aws_iam_role.fargate.arn
  subnet_ids             = var.private_subnet_ids

  selector {
    namespace = "default"
    labels = {
      Environment = "production"
      Type        = "fargate"
    }
  }
}
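
The Fargate profile similarly depends on a pod execution role (`aws_iam_role.fargate`), which Fargate uses to pull images and write logs. A minimal sketch:

```hcl
# Role assumed by the Fargate infrastructure to run pods
resource "aws_iam_role" "fargate" {
  name = "${var.project_name}-fargate-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "eks-fargate-pods.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy_attachment" "fargate" {
  role       = aws_iam_role.fargate.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEKSFargatePodExecutionRolePolicy"
}
```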

Best Practices

Security

  1. Network Security

    • Use private endpoints where possible
    • Implement strict security groups
    • Enable KMS encryption for secrets
  2. Access Control

    • Implement least privilege IAM roles
    • Use RBAC for Kubernetes access
    • Enable AWS IAM authentication
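
One way to grant IAM principals Kubernetes access without hand-editing the aws-auth ConfigMap is EKS access entries. A sketch, assuming a recent AWS provider, a cluster `authentication_mode` that permits API access entries, and a hypothetical admin role ARN:

```hcl
resource "aws_eks_access_entry" "admin" {
  cluster_name  = aws_eks_cluster.main.name
  principal_arn = "arn:aws:iam::123456789012:role/platform-admin" # hypothetical role
}

resource "aws_eks_access_policy_association" "admin" {
  cluster_name  = aws_eks_cluster.main.name
  principal_arn = aws_eks_access_entry.admin.principal_arn
  policy_arn    = "arn:aws:eks::aws:cluster-access-policy/AmazonEKSClusterAdminPolicy"

  access_scope {
    type = "cluster"
  }
}
```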

Cost Optimization

  1. Resource Management

    • Use appropriate instance types
    • Implement auto-scaling
    • Consider Spot instances for non-critical workloads
  2. Operational Efficiency

    • Use managed node groups
    • Implement proper tagging
    • Configure cluster autoscaler
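
Spot capacity can be added as a second managed node group alongside the on-demand one. A sketch that diversifies instance types (recommended for Spot availability) and taints the nodes so only interruption-tolerant workloads schedule onto them:

```hcl
# Spot-backed node group for interruption-tolerant workloads
resource "aws_eks_node_group" "spot" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "${var.project_name}-spot"
  node_role_arn   = aws_iam_role.node_group.arn
  subnet_ids      = var.private_subnet_ids

  capacity_type  = "SPOT"
  instance_types = ["t3.medium", "t3a.medium", "t3.large"] # diversify for capacity

  scaling_config {
    desired_size = 0
    min_size     = 0
    max_size     = 10
  }

  labels = {
    capacity = "spot"
  }

  # Keep workloads off Spot nodes unless they explicitly tolerate interruption
  taint {
    key    = "capacity"
    value  = "spot"
    effect = "NO_SCHEDULE"
  }
}
```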

Monitoring and Maintenance

Cluster Monitoring

resource "aws_cloudwatch_log_group" "eks" {
  name              = "/aws/eks/${var.project_name}/cluster"
  retention_in_days = 30
}

resource "aws_eks_addon" "cloudwatch" {
  cluster_name = aws_eks_cluster.main.name
  addon_name   = "amazon-cloudwatch-observability"
}

Health Checks

resource "kubernetes_horizontal_pod_autoscaler_v2" "app" {
  metadata {
    name = "app-hpa"
  }
  spec {
    scale_target_ref {
      api_version = "apps/v1"
      kind        = "Deployment"
      name        = "app"
    }
    min_replicas = 2
    max_replicas = 10
    metric {
      type = "Resource"
      resource {
        name = "cpu"
        target {
          type                = "Utilization"
          average_utilization = 70
        }
      }
    }
  }
}
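
The HPA above only acts if the metrics-server add-on is running and the target Deployment declares CPU requests (utilization is computed against requests). A minimal container spec sketch; the image is a placeholder:

```hcl
resource "kubernetes_deployment_v1" "app" {
  metadata {
    name = "app"
  }
  spec {
    replicas = 2
    selector {
      match_labels = { app = "app" }
    }
    template {
      metadata {
        labels = { app = "app" }
      }
      spec {
        container {
          name  = "app"
          image = "nginx:1.27" # placeholder image

          # CPU requests are required for Utilization-type HPA metrics
          resources {
            requests = {
              cpu    = "250m"
              memory = "256Mi"
            }
          }
        }
      }
    }
  }
}
```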

Troubleshooting Guide

Common Issues

  1. Cluster Creation Failures

    • Check IAM permissions
    • Verify VPC configuration
    • Review security group rules
  2. Node Group Issues

    • Check instance type availability
    • Verify subnet configuration
    • Review IAM roles
  3. Network Problems

    • Check VPC CNI configuration
    • Verify security group rules
    • Review network policies

Maintenance Tasks

Regular Updates

  • Keep EKS version current
  • Update node AMIs
  • Review security patches
  • Monitor CloudWatch logs
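
Add-on updates can themselves be codified: pinning `addon_version` turns each upgrade into an explicit, reviewable change. A sketch for the VPC CNI add-on (the version string is illustrative; check the versions compatible with your cluster):

```hcl
resource "aws_eks_addon" "vpc_cni" {
  cluster_name                = aws_eks_cluster.main.name
  addon_name                  = "vpc-cni"
  addon_version               = "v1.18.1-eksbuild.1" # illustrative; verify compatibility
  resolve_conflicts_on_update = "OVERWRITE"
}
```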

Backup and Recovery

  • Back up Kubernetes resources and persistent volumes (e.g. with Velero); etcd itself is managed by AWS
  • Configure disaster recovery
  • Test restoration procedures

Additional Resources

Documentation