
Terraform Basics: Managing Infrastructure as Code
Why clicking through the AWS console is a recipe for pain, and how declaring infrastructure as code with Terraform changes everything. Practical VPC + EC2 + RDS example included.


One week into a new job. A production server was acting up and I needed to recreate the same environment. I opened the AWS console. The EC2 security groups? No idea who attached them or why. The RDS parameter group? Someone tweaked it six months ago — no notes. The VPC CIDR block? No clue why that particular range was chosen.
That's the reality of console-driven infrastructure management. Someone built it by hand, and the process was never recorded. You can't reproduce it, you can't audit it, and when people leave, the knowledge disappears with them.
When I first encountered Terraform, "building servers from code" felt abstract. But once I started using it, I realized this isn't just automation — it's applying version control to infrastructure.
| Problem | Description |
|---|---|
| Not reproducible | Can't recreate a production-identical environment |
| Drift | Silent manual changes accumulate over time |
| No collaboration | "I set that up... but I forget what I did" |
| No rollback | Hard to revert to a previous state |
| No audit trail | Who changed what, when? Unknown |
Code = infrastructure documentation + automation + version history
Terraform isn't the only IaC tool out there.
| | Terraform | Pulumi | CloudFormation |
|---|---|---|---|
| Language | HCL (custom DSL) | Python/TypeScript/Go/etc | YAML/JSON |
| Multi-cloud | Full support | Full support | AWS only |
| Learning curve | Medium | Low (use existing languages) | High |
| Community | Very large | Growing | AWS ecosystem |
| State management | Self-managed state file | Self-managed | AWS-managed |
| Cost | OSS free / Cloud paid | OSS free / Cloud paid | Free |
This post focuses on Terraform.
Terraform uses HCL (HashiCorp Configuration Language). It's more readable than JSON and more expressive than YAML.
```hcl
# main.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
  required_version = ">= 1.6"
}

provider "aws" {
  region = "ap-northeast-2" # Seoul region
}

resource "aws_instance" "web" {
  ami           = "ami-0c9c942bd7bf113a2"
  instance_type = "t3.micro"

  tags = {
    Name        = "web-server"
    Environment = "production"
  }
}
```
```hcl
# variables.tf
variable "environment" {
  description = "Deployment environment"
  type        = string
  default     = "staging"

  validation {
    condition     = contains(["staging", "production"], var.environment)
    error_message = "Environment must be 'staging' or 'production'."
  }
}

variable "db_password" {
  description = "RDS master password"
  type        = string
  sensitive   = true # Hidden from logs and plan output
}
```
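Values for these variables are typically supplied through a `terraform.tfvars` file. A hypothetical example (the values are placeholders, and a real `db_password` should never be committed; safer options appear later in this post):

```hcl
# terraform.tfvars (illustrative values only; keep this file out of version control)
environment = "production"
db_password = "example-only-password" # placeholder; inject real secrets another way
```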
```hcl
# outputs.tf
output "web_public_ip" {
  value = aws_instance.web.public_ip
}
```
```hcl
# locals: reusable computed values
locals {
  common_tags = {
    Project     = "codemapo"
    Environment = var.environment
    ManagedBy   = "terraform"
  }
  name_prefix = "${var.environment}-codemapo"
}
```
```hcl
provider "aws" {
  region = "ap-northeast-2"

  # Option 1: Environment variables (recommended)
  # AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY

  # Option 2: Assume role (for CI/CD)
  assume_role {
    role_arn = "arn:aws:iam::123456789:role/TerraformRole"
  }
}
```
Resources can reference each other. Terraform automatically determines the dependency order.
```hcl
resource "aws_security_group" "web" {
  name   = "web-sg"
  vpc_id = aws_vpc.main.id # VPC must be created first

  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "web" {
  ami                    = "ami-0c9c942bd7bf113a2"
  instance_type          = var.instance_type
  vpc_security_group_ids = [aws_security_group.web.id] # references the security group
  subnet_id              = aws_subnet.public.id
}
```
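Most ordering is inferred from references like these, but when a dependency is invisible to Terraform, it can be declared explicitly with `depends_on`. A sketch, assuming a hypothetical `aws_iam_role_policy.s3_access` resource defined elsewhere:

```hcl
resource "aws_instance" "worker" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"

  # Hypothetical scenario: the instance's startup script reads from S3,
  # so the policy must exist first even though nothing here references it.
  depends_on = [aws_iam_role_policy.s3_access]
}
```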
Reference resources managed outside of Terraform:
```hcl
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"]
  }
}

resource "aws_instance" "web" {
  ami           = data.aws_ami.amazon_linux.id # always the latest matching AMI
  instance_type = "t3.micro"
}
```
Terraform records actual infrastructure state in a state file (terraform.tfstate). Without it, Terraform can't know what exists and what needs to change.
Local state files cause conflicts in team environments. Use a remote backend.
```hcl
# backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "production/terraform.tfstate"
    region         = "ap-northeast-2"
    encrypt        = true
    dynamodb_table = "terraform-state-lock" # prevents concurrent runs
  }
}
```
When someone manually changes resources in the console, state and reality diverge.
```bash
# Detect drift without making changes
terraform plan -refresh-only

# Sync state to match actual infrastructure
terraform apply -refresh-only
```
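For resources that were created by hand and should come under Terraform management, Terraform 1.5+ supports declarative `import` blocks (the instance ID below is a placeholder):

```hcl
# import.tf: adopt an existing, manually created instance into state
import {
  to = aws_instance.web
  id = "i-0123456789abcdef0" # placeholder; use the real instance ID
}
```

Running `terraform plan -generate-config-out=generated.tf` can even draft the matching resource block for you.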
```bash
terraform init     # download providers, connect backend
terraform fmt      # format code
terraform validate # check for errors
terraform plan     # preview changes (no modifications)
terraform apply    # apply changes
```
```text
# aws_instance.web will be created
+ resource "aws_instance" "web" { ... }

# aws_security_group.web will be updated in-place
~ resource "aws_security_group" "web" { ... }

# aws_db_instance.legacy will be destroyed
- resource "aws_db_instance" "legacy" { ... }

Plan: 1 to add, 1 to change, 1 to destroy.
```
`+` = create, `~` = modify, `-` = destroy. Watch out for `-/+` (destroy then recreate), which can cause downtime.
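When a change forces replacement, the default order is destroy-then-create. For resources that must stay available, the `lifecycle` block can flip that order. A minimal sketch:

```hcl
resource "aws_launch_template" "web" {
  name_prefix   = "web-" # a name prefix (not a fixed name) avoids naming collisions
  image_id      = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"

  lifecycle {
    create_before_destroy = true # build the replacement before tearing down the old one
  }
}
```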
Modules are Terraform's equivalent of functions — they encapsulate repeated infrastructure patterns.
```hcl
# Using a local module
module "web_server" {
  source = "./modules/ec2-web"

  name          = "codemapo-web"
  vpc_id        = aws_vpc.main.id
  subnet_id     = aws_subnet.public.id
  ami_id        = data.aws_ami.amazon_linux.id
  instance_type = "t3.small"
}

# Using a public Registry module
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.5.0"

  name = "main-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["ap-northeast-2a", "ap-northeast-2c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
}
```
A complete basic stack for a web application.
```hcl
# network.tf
resource "aws_vpc" "main" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true

  tags = merge(local.common_tags, { Name = "${local.name_prefix}-vpc" })
}

resource "aws_subnet" "public" {
  count                   = 2
  vpc_id                  = aws_vpc.main.id
  cidr_block              = "10.0.${count.index + 1}.0/24"
  availability_zone       = data.aws_availability_zones.available.names[count.index]
  map_public_ip_on_launch = true
}

resource "aws_subnet" "private" {
  count             = 2
  vpc_id            = aws_vpc.main.id
  cidr_block        = "10.0.${count.index + 11}.0/24"
  availability_zone = data.aws_availability_zones.available.names[count.index]
}
```
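The subnets reference `data.aws_availability_zones.available`, which needs to be declared once:

```hcl
# Look up the AZs available in the configured region
data "aws_availability_zones" "available" {
  state = "available"
}
```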
```hcl
# database.tf
resource "aws_db_subnet_group" "main" {
  name       = "${local.name_prefix}-db-subnet"
  subnet_ids = aws_subnet.private[*].id
}

resource "aws_db_instance" "main" {
  identifier            = "${local.name_prefix}-db"
  engine                = "postgres"
  engine_version        = "16.1"
  instance_class        = var.db_instance_class
  allocated_storage     = 20
  max_allocated_storage = 100
  storage_encrypted     = true

  db_name  = "app_db"
  username = "app_user"
  password = var.db_password

  db_subnet_group_name   = aws_db_subnet_group.main.name
  vpc_security_group_ids = [aws_security_group.rds.id]

  backup_retention_period = 7
  deletion_protection     = var.environment == "production"
  skip_final_snapshot     = var.environment != "production"

  tags = local.common_tags
}
```
```bash
cd infrastructure/
terraform init
cp terraform.tfvars.example terraform.tfvars
# edit terraform.tfvars

terraform validate
terraform plan -out=tfplan
terraform apply tfplan
terraform output # see results
```
Never commit state files or secrets. Add these to `.gitignore`:

```gitignore
*.tfstate
*.tfstate.backup
.terraform/
terraform.tfvars
*.tfplan
```
```hcl
# Pull from AWS Secrets Manager
data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "production/db/password"
}

resource "aws_db_instance" "main" {
  # ...
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
}

# Or inject via environment variable:
# TF_VAR_db_password=... terraform apply
```
Once you start using Terraform, the AWS console feels awkward. You keep thinking "I should just declare this in code."
The core insight is simple: infrastructure is code. Review it, test it, version-control it. "Does anyone know how this server was built?" gets answered with `git log`.