Managing Google Kubernetes Engine (GKE) with Terraform
Managing Google Kubernetes Engine (GKE) with Terraform is a great choice for infrastructure as code (IaC) enthusiasts. Terraform allows you to define and manage your GKE clusters, node pools, and other resources in a declarative way. Here’s a basic guide to get you started.
Prerequisites
- Google Cloud SDK installed and configured
- Terraform installed (version 1.0.0 or later)
- Basic understanding of Kubernetes concepts
- A GCP project with billing enabled
Project Structure
gke-terraform
|-- README.md
|-- environments
|   |-- dev
|   |   |-- README.md
|   |   |-- main.tf
|   |   |-- providers.tf
|   |   |-- variables.tf
|   |   `-- versions.tf
|   `-- stage
`-- modules
    `-- gke
        |-- README.md
        |-- main.tf
        |-- outputs.tf
        `-- variables.tf
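The layout suggests that each environment directory calls the shared module under modules/gke. A minimal sketch of what environments/dev/main.tf could look like follows. The input names are assumptions borrowed from the variables used later in this guide, since the module's actual interface is not shown here.

```hcl
# environments/dev/main.tf (sketch)
module "gke" {
  # Relative path from environments/dev to the shared module.
  source = "../../modules/gke"

  # Assumed input names; adjust them to match modules/gke/variables.tf.
  project_id   = var.project_id
  region       = var.region
  cluster_name = "dev-gke-cluster"
  node_count   = 1
}
```

Keeping the resource definitions in the module and only the values in each environment means a stage environment can reuse the same code with different inputs.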
Step 1: Initialize Terraform and Configure GCP GKE Provider
Provider Configuration
terraform {
  required_version = ">= 1.0"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 4.0"
    }
    google-beta = {
      source  = "hashicorp/google-beta"
      version = ">= 4.0"
    }
  }
}

provider "google" {
  project = var.project_id # The GCP project ID
  region  = var.region     # The region for resource deployment
}
Variables
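# No default: an empty project ID would only fail later at plan time,
# so Terraform is left to require a value up front.
variable "project_id" {
  description = "The ID of the GCP project"
  type        = string
}
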
variable "region" {
  description = "The region where resources will be deployed"
  type        = string
  default     = "us-central1"
}

variable "cluster_name" {
  description = "The name of the GKE cluster"
  type        = string
  default     = "dev-gke-cluster"
}

variable "network" {
  description = "The name of the VPC network"
  type        = string
  default     = "dev-network"
}

variable "subnetwork" {
  description = "The name of the subnetwork"
  type        = string
  default     = "dev-subnetwork"
}

variable "cluster_secondary_range_name" {
  description = "The name of the secondary range for pods"
  type        = string
  default     = "pods-range"
}

variable "services_secondary_range_name" {
  description = "The name of the secondary range for services"
  type        = string
  default     = "services-range"
}

variable "master_ipv4_cidr_block" {
  description = "The CIDR block for the master"
  type        = string
  default     = "10.0.0.0/28"
}

variable "node_count" {
  description = "The number of nodes in the node pool"
  type        = number
  default     = 3
}

variable "machine_type" {
  description = "The machine type for the nodes"
  type        = string
  default     = "e2-standard-2"
}

variable "disk_size_gb" {
  description = "The disk size for the nodes"
  type        = number
  default     = 100
}

variable "disk_type" {
  description = "The disk type for the nodes"
  type        = string
  default     = "pd-standard"
}

variable "node_labels" {
  description = "The labels for the nodes"
  type        = map(string)
  default = {
    "env"  = "dev"
    "team" = "devops"
  }
}

variable "node_tags" {
  description = "The network tags for the nodes"
  type        = list(string)
  default     = ["gke-node", "dev"]
}

variable "maintenance_start_time" {
  description = "RFC 3339 start of the first maintenance window"
  type        = string
  default     = "2025-01-04T00:00:00Z"
}

variable "maintenance_end_time" {
  description = "RFC 3339 end of the first maintenance window; together with the start time it sets the window length"
  type        = string
  default     = "2025-01-04T08:00:00Z"
}

variable "maintenance_recurrence" {
  description = "RFC 5545 RRULE describing how the maintenance window recurs"
  type        = string
  default     = "FREQ=WEEKLY;BYDAY=SA,SU"
}

variable "node_metadata" {
  description = "The metadata for the nodes"
  type        = map(string)
  default = {
    "disable-legacy-endpoints" = "true"
  }
}

variable "master_authorized_networks" {
  description = "The authorized networks for the master"
  type        = list(map(string))
  default = [
    {
      cidr_block   = "0.0.0.0/0"
      display_name = "all"
    }
  ]
}
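Most of these variables carry defaults, so an environment only needs to set what differs. A minimal sketch of a terraform.tfvars file for the dev environment might look like this. The project ID and the CIDR range are placeholders, not values from this guide.

```hcl
# environments/dev/terraform.tfvars (Terraform loads this file name automatically)

# Placeholder: replace with your own GCP project ID.
project_id = "my-gcp-project"

region       = "us-central1"
cluster_name = "dev-gke-cluster"

# Restrict control-plane access to a known range instead of 0.0.0.0/0.
# 203.0.113.0/24 is a documentation-only range (RFC 5737) used as a placeholder.
master_authorized_networks = [
  {
    cidr_block   = "203.0.113.0/24"
    display_name = "office"
  }
]
```

Narrowing master_authorized_networks is worth doing even in dev, because the default shown above allows any address to reach the control plane endpoint.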
GKE Cluster
resource "google_container_cluster" "primary" {
  name     = var.cluster_name
  location = var.region

  # We can't create a cluster with no node pool defined, but we want to only use
  # separately managed node pools. So we create the smallest possible default
  # node pool and immediately delete it.
  remove_default_node_pool = true
  initial_node_count       = 1

  # Names of an existing VPC and subnet. The subnet must already define
  # both secondary ranges referenced below.
  network    = var.network
  subnetwork = var.subnetwork

  ip_allocation_policy {
    cluster_secondary_range_name  = var.cluster_secondary_range_name
    services_secondary_range_name = var.services_secondary_range_name
  }

  master_authorized_networks_config {
    # One cidr_blocks entry per element of var.master_authorized_networks.
    dynamic "cidr_blocks" {
      for_each = var.master_authorized_networks
      content {
        cidr_block   = cidr_blocks.value.cidr_block
        display_name = cidr_blocks.value.display_name
      }
    }
  }
}

resource "google_container_node_pool" "primary_nodes" {
  name     = "${google_container_cluster.primary.name}-node-pool"
  location = var.region
  cluster  = google_container_cluster.primary.name

  # In a regional cluster this is the node count per zone, not the total.
  node_count = var.node_count

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
      "https://www.googleapis.com/auth/devstorage.read_only",
      "https://www.googleapis.com/auth/service.management.readonly",
      "https://www.googleapis.com/auth/servicecontrol",
      "https://www.googleapis.com/auth/trace.append",
    ]

    machine_type = var.machine_type
    disk_size_gb = var.disk_size_gb
    disk_type    = var.disk_type

    # Labels, network tags, and metadata come from the variables defined earlier.
    labels   = var.node_labels
    tags     = var.node_tags
    metadata = var.node_metadata
  }
}
Outputs
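output "kubernetes_cluster_name" {
  value       = google_container_cluster.primary.name
  description = "GKE Cluster Name"
}

output "kubernetes_cluster_host" {
  value       = google_container_cluster.primary.endpoint
  description = "GKE Cluster Host"
}

# Needed by the get-credentials command under Common Operations.
output "region" {
  value       = var.region
  description = "GKE Cluster Region"
}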
Best Practices
- Security:
  - Enable Workload Identity
  - Use Binary Authorization
  - Implement Network Policies
- Networking:
  - Use VPC-native clusters
  - Configure private clusters
  - Implement proper firewall rules
- Cost Optimization:
  - Use preemptible nodes when possible
  - Implement autoscaling
  - Right-size node pools
- Maintenance:
  - Enable auto-upgrades
  - Configure maintenance windows
  - Use node auto-repair
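Several of the cluster-level items above map directly to arguments on google_container_cluster. The sketch below shows Workload Identity, private nodes, and a recurring maintenance window, reusing the variable names from this guide. The resource name `hardened` is hypothetical, and the blocks are meant to be merged into your existing cluster resource rather than applied next to it.

```hcl
# Sketch only: merge these blocks into the existing cluster definition.
resource "google_container_cluster" "hardened" {
  name     = var.cluster_name
  location = var.region

  remove_default_node_pool = true
  initial_node_count       = 1

  network    = var.network
  subnetwork = var.subnetwork

  # VPC-native cluster: pods and services use the subnet's secondary ranges.
  ip_allocation_policy {
    cluster_secondary_range_name  = var.cluster_secondary_range_name
    services_secondary_range_name = var.services_secondary_range_name
  }

  # Workload Identity: the pool name is always "<project id>.svc.id.goog".
  workload_identity_config {
    workload_pool = "${var.project_id}.svc.id.goog"
  }

  # Private cluster: nodes get internal IPs only. Without Cloud NAT they
  # cannot pull images from registries outside Google Cloud.
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    master_ipv4_cidr_block  = var.master_ipv4_cidr_block
  }

  # Maintenance window: the gap between start and end sets the window length.
  maintenance_policy {
    recurring_window {
      start_time = var.maintenance_start_time
      end_time   = var.maintenance_end_time
      recurrence = var.maintenance_recurrence
    }
  }
}
```

Leaving enable_private_endpoint set to false keeps the control plane reachable from the authorized networks, which is usually the more practical starting point for a dev cluster.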
Common Operations
Creating the Cluster
terraform init
terraform plan
terraform apply
Getting Cluster Credentials
gcloud container clusters get-credentials $(terraform output -raw kubernetes_cluster_name) --region $(terraform output -raw region)
Destroying the Cluster
terraform destroy
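One caveat: the version constraint in this guide (>= 4.0) also admits provider releases 5.0 and later, where google_container_cluster has a deletion_protection argument that defaults to true. With that default, terraform destroy fails until the argument is set to false and applied. A sketch of the change for a throwaway dev cluster, assuming you are on such a provider version:

```hcl
resource "google_container_cluster" "primary" {
  name     = var.cluster_name
  location = var.region

  # ... remaining arguments unchanged from the cluster definition ...

  # Assumption: Google provider 5.0 or later. Allows `terraform destroy`
  # to delete the cluster; leave at the default (true) for anything you
  # would not want removed by accident.
  deletion_protection = false
}
```

Run terraform apply after making this change so the new setting is recorded in state before you run terraform destroy.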
Best Practices and Tips
- Cluster Management:
  - Use multiple node pools
  - Implement proper monitoring
  - Regular security audits
- Security:
  - Use Workload Identity
  - Enable network policies
  - Regular security updates
- Performance:
  - Configure autoscaling
  - Monitor resource usage
  - Use appropriate machine types
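As an illustration of the node pool and autoscaling points, here is a sketch of a second, autoscaled pool attached to the cluster defined earlier. The resource name `burst_nodes`, the pool name suffix, the label, and the min/max counts are all hypothetical values chosen for the example.

```hcl
resource "google_container_node_pool" "burst_nodes" {
  name     = "${google_container_cluster.primary.name}-burst-pool"
  location = var.region
  cluster  = google_container_cluster.primary.name

  # With autoscaling enabled, set only the starting size and let the
  # autoscaler manage the count. Limits apply per zone in a regional cluster.
  initial_node_count = 1

  autoscaling {
    min_node_count = 0
    max_node_count = 3
  }

  # Node auto-repair and auto-upgrade.
  management {
    auto_repair  = true
    auto_upgrade = true
  }

  node_config {
    machine_type = var.machine_type

    # Preemptible nodes cost less but can be reclaimed at any time,
    # so schedule only interruption-tolerant workloads here.
    preemptible = true

    labels = {
      pool = "burst"
    }
  }
}
```

Keeping interruptible capacity in its own pool lets the primary pool stay on regular nodes for workloads that cannot tolerate preemption.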
Conclusion
You’ve learned how to set up and manage Google Kubernetes Engine using Terraform. This setup provides:
- Automated cluster deployment
- Secure and scalable infrastructure
- Best practices implementation
- Easy cluster management and maintenance