Why I replaced AWS NAT Gateway with a NAT Instance - and saved $20 per month

2022-04-03

Introduction

Why Consider a NAT Instance Over a NAT Gateway?

AWS offers NAT Gateways as the default, fully managed solution for letting private subnet resources reach the internet. However, NAT Gateways can be pricey:

  • Hourly cost: ~₹3.75/hour (varies by region)
  • Data transfer cost: Additional ₹3.75/GB on top of standard data transfer

For small dev/test environments or personal labs, these costs can add up quickly. In contrast, a NAT Instance is just a normal EC2 instance configured to perform IP forwarding and NAT. It’s typically much cheaper to run a small instance (t3.micro) than a NAT Gateway, especially if your traffic volume is modest.
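To make the pricing model concrete, here is a small sketch that estimates a monthly NAT Gateway bill from an hourly rate and a per-GB rate. The 730 hours per month figure and the 50 GB traffic volume in the example are assumptions for illustration, not AWS billing constants.

```shell
# Rough monthly NAT Gateway cost: hourly rate * hours + per-GB rate * GB processed.
# HOURS_PER_MONTH=730 is an assumed average month length.
HOURS_PER_MONTH=730

nat_gateway_monthly_cost() {
  # $1 = hourly rate, $2 = per-GB processing rate, $3 = GB processed per month
  awk -v h="$1" -v g="$2" -v gb="$3" -v hrs="$HOURS_PER_MONTH" \
    'BEGIN { printf "%.2f\n", h * hrs + g * gb }'
}

# Example with the rates quoted above and a hypothetical 50 GB/month:
# nat_gateway_monthly_cost 3.75 3.75 50
```

Plug in the rates for your own region and currency. The hourly component is charged whether or not any traffic flows, which is why idle dev/test environments feel it most.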

How NAT Works (Big Picture)

NAT (Network Address Translation) is a way for devices on a private network to communicate with the outside world using a single (or a small number of) public IP(s). In AWS:

  1. Private Subnet resources route outbound traffic to a NAT device (gateway or instance).
  2. NAT Device replaces the private source IP with its own public IP.
  3. Return traffic from the internet is routed back to the NAT device, which translates it back to the private source IP.

The AWS NAT Gateway handles this as a managed service. A NAT Instance is a do-it-yourself approach:

  • You pick an EC2 AMI and instance type.
  • You enable IP forwarding and set up iptables rules for NAT.
  • You configure your private subnet route table to point 0.0.0.0/0 traffic at the NAT Instance.

High-Level Comparison

| Feature | NAT Gateway | NAT Instance |
|------------------------|-----------------------------------|---------------------------------------|
| Managed Service | Yes (HA if deployed in multiple AZs) | No (you manage patching, health, etc.) |
| Cost | Hourly + data transfer surcharges | EC2 hourly cost + standard data transfer |
| Complexity | Very low | Medium (iptables, IP forwarding) |
| Ideal Use Case | Production/high-volume | Dev/test/labs or cost-sensitive setups |


Method 1: Setting Up a NAT Instance Manually (Step by Step)

Below is a straightforward approach using Amazon Linux 2 or Ubuntu. You can adapt it as needed.

1. Launch an EC2 in a Public Subnet

  • AMI: Amazon Linux 2 or Ubuntu (lightweight, widely supported).
  • Instance Type: t3.micro for small workloads; scale up if needed.
  • Network: Must be in a public subnet that has a route to an Internet Gateway.
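  • Security Group:
    • Inbound: All traffic (or at least HTTP/HTTPS) from the private subnet or VPC CIDR, because forwarded packets arrive as inbound traffic on the NAT Instance. Add SSH (port 22) from your IP and, optionally, ICMP for debugging.
    • Outbound: Usually all traffic is allowed.
  • Elastic IP: Allocate an Elastic IP and associate it to this instance to maintain a stable public IP.
  • Source/Destination Check: Disable it on the instance (Actions → Networking → Change source/destination check). By default EC2 drops packets that are neither sent from nor addressed to the instance itself, which is exactly the traffic a NAT Instance forwards.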
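A NAT Instance forwards packets that are not addressed to itself, so EC2's source/destination check has to be turned off for it. The sketch below wraps the AWS CLI calls for that and for attaching an Elastic IP. The function names are my own, and the instance ID you pass in is a placeholder for your own resource.

```shell
# Helper functions around the AWS CLI; instance IDs are placeholders you supply.
disable_source_dest_check() {
  # $1 = NAT instance ID
  aws ec2 modify-instance-attribute --instance-id "$1" --no-source-dest-check
}

attach_elastic_ip() {
  # $1 = NAT instance ID; allocates a new Elastic IP and associates it
  alloc_id=$(aws ec2 allocate-address --domain vpc \
    --query AllocationId --output text) || return 1
  aws ec2 associate-address --instance-id "$1" --allocation-id "$alloc_id"
}

# Usage (with your own instance ID):
# disable_source_dest_check i-0123456789abcdef0
# attach_elastic_ip i-0123456789abcdef0
```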

2. Enable IP Forwarding

SSH into the instance and enable forwarding:

sudo su -
echo 1 > /proc/sys/net/ipv4/ip_forward

# Make forwarding persistent across reboots
echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
sysctl -p
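To confirm the setting actually took effect (for example after a reboot), a quick check like this sketch can help. The function name is hypothetical, and the optional path argument exists only so the check can be pointed at a different file when testing.

```shell
ip_forward_enabled() {
  # $1 = path to the kernel flag (defaults to the real /proc entry)
  flag_file="${1:-/proc/sys/net/ipv4/ip_forward}"
  [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}

# Usage on the NAT instance:
# if ip_forward_enabled; then echo "forwarding on"; else echo "forwarding off"; fi
```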

3. Configure iptables for NAT

On Amazon Linux 2 or Ubuntu, set up iptables to NAT outgoing traffic:

# Flush existing rules first (careful in production!)
iptables -F
iptables -t nat -F

# MASQUERADE traffic going out the public interface (often named eth0)
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

# Forward packets (replace eth0 with your interface if it differs)
iptables -A FORWARD -i eth0 -o eth0 -j ACCEPT
iptables -A FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT

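# Persist iptables rules (varies by distro)
# Amazon Linux 2: the iptables service comes from the iptables-services package
yum install -y iptables-services
systemctl enable iptables
service iptables save
# Ubuntu: apt-get install -y iptables-persistent && netfilter-persistent save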
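On some AMIs the primary interface is not called eth0 (ens5 is a common alternative), so hard-coding the name can silently break the MASQUERADE rule. One way to avoid that, sketched below, is to read the interface from the default route. The function name is my own.

```shell
default_iface_from_routes() {
  # Reads `ip route` output on stdin and prints the device of the first default route.
  awk '$1 == "default" {
         for (i = 2; i < NF; i++) if ($i == "dev") { print $(i + 1); exit }
       }'
}

# Usage on the NAT instance:
# IFACE=$(ip -4 route show default | default_iface_from_routes)
# iptables -t nat -A POSTROUTING -o "$IFACE" -j MASQUERADE
```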

4. Update Private Subnet Route

In your private subnet’s Route Table:

  • Destination: 0.0.0.0/0

  • Target: The Instance ID of the NAT Instance
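If you prefer the CLI over the console, the same route can be created with `aws ec2 create-route`. This is a minimal sketch; the function name is hypothetical and both IDs are placeholders for your own route table and NAT Instance.

```shell
add_default_route_via_nat() {
  # $1 = private subnet route table ID, $2 = NAT instance ID
  aws ec2 create-route \
    --route-table-id "$1" \
    --destination-cidr-block 0.0.0.0/0 \
    --instance-id "$2"
}

# Usage:
# add_default_route_via_nat rtb-0123456789abcdef0 i-0123456789abcdef0
```

If the route table already has a 0.0.0.0/0 entry (for example, pointing at the old NAT Gateway), `create-route` will fail; `aws ec2 replace-route` takes the same arguments and swaps the target instead.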

5. Verify

Create a test EC2 in the private subnet, SSH in (bastion or Session Manager), and run:

curl https://google.com

If it responds, your NAT Instance is up!
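A response only proves there is some path to the internet. To check that traffic really leaves through the NAT Instance, you can compare the public IP the outside world sees with the instance's Elastic IP. The sketch below uses AWS's checkip.amazonaws.com endpoint; the function name and the example IP are placeholders.

```shell
egress_ip_matches() {
  # $1 = expected public IP (the NAT instance's Elastic IP)
  actual=$(curl -s --max-time 10 https://checkip.amazonaws.com) || return 1
  [ "$actual" = "$1" ]
}

# Usage from the private test instance (replace with your Elastic IP):
# egress_ip_matches 203.0.113.10 && echo "traffic leaves via the NAT instance"
```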

Method 2: Automated Setup with Packer & Terraform (Recommended)

For more repeatable deployments (e.g., multiple regions/accounts), it’s better to bake an AMI with the NAT settings pre-applied and spin it up via Terraform. That way, you avoid repeating manual steps.

Step A: Create a Packer Template

Below is a Packer HCL example (packer-nat.pkr.hcl; Packer only parses a template as HCL2 when its name ends in .pkr.hcl). It builds an AMI from Amazon Linux 2, enables IP forwarding, and configures the iptables NAT rules.

packer {
  required_plugins {
    amazon = {
      version = ">= 0.0.1"
      source  = "github.com/hashicorp/amazon"
    }
  }
}

variable "aws_region" {
  type    = string
  default = "us-east-1"
}

source "amazon-ebs" "nat_instance" {
  region        = var.aws_region
  instance_type = "t3.micro"
  ami_name      = "custom-nat-{{timestamp}}"
  source_ami_filter {
    filters = {
      name                = "amzn2-ami-hvm-*-x86_64-gp2"
      root-device-type    = "ebs"
      virtualization-type = "hvm"
    }
    owners      = ["137112412989"] # Amazon Linux 2 official owner
    most_recent = true
  }
  ssh_username = "ec2-user"
}

build {
  name    = "nat-instance-ami"
  sources = ["source.amazon-ebs.nat_instance"]

  provisioner "shell" {
    # The provisioner connects as ec2-user, so privileged commands need sudo
    inline = [
      "sudo sysctl -w net.ipv4.ip_forward=1",
      "echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.conf",
      # iptables-services restores the saved rules on every boot
      "sudo yum install -y iptables-services",
      "sudo systemctl enable iptables",
      "sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE",
      "sudo iptables -A FORWARD -i eth0 -o eth0 -j ACCEPT",
      "sudo iptables -A FORWARD -m state --state RELATED,ESTABLISHED -j ACCEPT",
      "sudo service iptables save"
    ]
  }
}

Step B: Build the AMI

# Download the amazon plugin declared in required_plugins, then build
packer init packer-nat.pkr.hcl
packer build -var aws_region=us-east-1 packer-nat.pkr.hcl

This will create a new AMI (named custom-nat-<timestamp>) in your account.
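Terraform needs the ID of the AMI that was just built. One way to fetch it, sketched here, is to ask EC2 for the newest image in your account whose name starts with custom-nat-. The function name is my own; the name pattern matches the ami_name used in the template.

```shell
latest_nat_ami_id() {
  # $1 = region the AMI was built in (e.g. us-east-1)
  aws ec2 describe-images \
    --region "$1" \
    --owners self \
    --filters "Name=name,Values=custom-nat-*" \
    --query 'sort_by(Images, &CreationDate)[-1].ImageId' \
    --output text
}

# Usage:
# NAT_AMI_ID=$(latest_nat_ami_id us-east-1)
```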

Step C: Use the AMI in Terraform

# main.tf

provider "aws" {
  region = var.aws_region
}

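# Look up the VPC so its CIDR can be allowed through the NAT instance
data "aws_vpc" "selected" {
  id = var.vpc_id
}

resource "aws_security_group" "nat_instance_sg" {
  name        = "nat-instance-sg"
  description = "Security Group for NAT instance"
  vpc_id      = var.vpc_id

  ingress {
    description = "SSH from my IP"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.my_ip_cidr]
  }

  # Forwarded traffic from private subnets arrives as inbound traffic here
  ingress {
    description = "All traffic from inside the VPC"
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = [data.aws_vpc.selected.cidr_block]
  }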

  # Egress can be wide open for NAT traffic
  egress {
    from_port        = 0
    to_port          = 0
    protocol         = "-1"
    cidr_blocks      = ["0.0.0.0/0"]
  }
}

resource "aws_instance" "nat_instance" {
  ami                         = var.nat_ami_id
  instance_type               = "t3.micro"
  subnet_id                   = var.public_subnet_id
  associate_public_ip_address = true
  vpc_security_group_ids      = [aws_security_group.nat_instance_sg.id]

  # Required for NAT: let the instance forward packets not addressed to itself
  source_dest_check = false

  tags = {
    Name = "my-nat-instance"
  }
}

resource "aws_eip" "nat_instance_eip" {
  instance = aws_instance.nat_instance.id
  depends_on = [aws_instance.nat_instance]
}

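resource "aws_route" "private_subnet_to_nat" {
  route_table_id         = var.private_subnet_rtb_id
  destination_cidr_block = "0.0.0.0/0"
  # instance_id is no longer accepted by aws_route in AWS provider v5+;
  # routing to the instance's primary network interface is equivalent
  network_interface_id   = aws_instance.nat_instance.primary_network_interface_id
}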

// Usage:

// Update variables (vpc_id, public_subnet_id, private_subnet_rtb_id, etc.) with your environment details.

// terraform init && terraform apply.

// Terraform will:

// - Launch an EC2 using the custom NAT AMI
// - Assign it a public IP via an EIP
// - Create a route in the private subnet’s route table, pointing default traffic to this NAT instance.
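The configuration reads all of its inputs from variables, so each environment needs its own values. A small sketch for generating a terraform.tfvars file from environment variables follows. It assumes matching `variable` blocks are declared in your configuration; the function name and the environment variable names are my own.

```shell
write_tfvars() {
  # $1 = output file; values come from environment variables set beforehand
  cat > "$1" <<EOF
aws_region            = "${AWS_REGION:-us-east-1}"
vpc_id                = "${VPC_ID}"
public_subnet_id      = "${PUBLIC_SUBNET_ID}"
private_subnet_rtb_id = "${PRIVATE_RTB_ID}"
nat_ami_id            = "${NAT_AMI_ID}"
my_ip_cidr            = "${MY_IP_CIDR}"
EOF
}

# Usage:
# write_tfvars terraform.tfvars && terraform init && terraform apply
```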

Operational Tips

  • High Availability: One NAT instance is a single point of failure. If you need robust HA, you’ll have to deploy NAT instances in multiple AZs and handle the failover logic yourself. A NAT Gateway is redundant within its Availability Zone; for multi-AZ resilience you deploy one gateway per AZ.

  • Scaling: NAT Gateways auto-scale with traffic. For NAT Instances, you’ll need to monitor traffic and bump up the instance size or add more NAT instances if traffic grows.

  • Security:

    • Keep OS packages up to date.

    • Restrict inbound SSH to your IP only.

    • The instance is publicly exposed, so ensure best practices are followed for patching and firewall rules.

  • Cost: A small NAT Instance can run ~£4/month, whereas NAT Gateways can be ~£24/month + premium data transfer costs. For dev/test, that is a meaningful saving.
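As a rough illustration of the failover logic mentioned under High Availability, the sketch below checks the primary NAT Instance's EC2 status and, if it is not healthy, repoints the private route table at a standby instance. It assumes a second, already configured NAT Instance exists; the function names and all IDs are placeholders, and a real setup would run this on a schedule from somewhere outside the private subnet.

```shell
instance_status() {
  # $1 = instance ID; prints e.g. "ok" or "impaired" ("None" if not running)
  aws ec2 describe-instance-status --instance-ids "$1" \
    --query 'InstanceStatuses[0].InstanceStatus.Status' --output text
}

failover_if_unhealthy() {
  # $1 = route table ID, $2 = primary NAT instance ID, $3 = standby NAT instance ID
  if [ "$(instance_status "$2")" != "ok" ]; then
    aws ec2 replace-route --route-table-id "$1" \
      --destination-cidr-block 0.0.0.0/0 --instance-id "$3"
  fi
}

# Usage:
# failover_if_unhealthy rtb-0123456789abcdef0 i-0aaaaaaaaaaaaaaaa i-0bbbbbbbbbbbbbbbb
```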

Conclusion

  • Method 1 (Manual) is a quick way to see how NAT Instances work in a single environment.

  • Method 2 (Automated via Packer & Terraform) is ideal for repeated or multi-account deployments. You save both money (vs. NAT Gateway) and time (vs. manual config each time).

If your traffic is light and you’re comfortable managing an instance, NAT Instances are a great cost-saving option—especially for labs, side projects, and dev/test. Meanwhile, production setups might still justify the ease and auto-scaling reliability of NAT Gateways.
