EKS without VPC CNI: Deploying Calico with IPIP and BGP

2024-05-06

Introduction

AWS EKS defaults to the VPC CNI plugin, assigning VPC IPs to pods via ENIs. While straightforward, this setup limits pod density per node and consumes VPC IPs rapidly. To overcome these constraints, deploying Calico with IPIP or BGP offers a scalable alternative.

Why Replace AWS VPC CNI?

  • Pod Density Limitations: ENI and IP limits per instance type restrict the number of pods per node.
  • VPC IP Consumption: Each pod consumes a VPC IP, leading to potential exhaustion.
  • Complex Networking: Managing ENIs and secondary IPs adds complexity.

Calico addresses these issues by providing flexible IP address management and networking modes. Calico Docs
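To make the pod density limit concrete, here is a small sketch of the arithmetic the VPC CNI imposes when prefix delegation is not in use: each ENI's primary address is reserved for the node, and two host-networked pods are added on top. The function name max_pods is made up for this example, and the ENI and address counts in the call are the published limits for t3.medium.

```shell
#!/usr/bin/env bash
# Max pods per node under the VPC CNI (no prefix delegation):
#   ENIs * (IPv4 addresses per ENI - 1) + 2
max_pods() {
  local enis=$1 ips_per_eni=$2
  echo $(( enis * (ips_per_eni - 1) + 2 ))
}

# t3.medium: 3 ENIs with 6 IPv4 addresses each
max_pods 3 6   # prints 17
```

With an overlay pool managed by Calico, pod addresses no longer come out of this per-instance budget.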

Calico Networking Modes

  • IPIP (IP-in-IP): Encapsulates pod traffic, allowing for scalable networking without consuming VPC IPs.
  • BGP (Border Gateway Protocol): Distributes routing information between nodes, enabling efficient traffic routing.

Calico Docs
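One practical consequence of IPIP encapsulation is a smaller pod MTU, because every packet gains a 20-byte outer IPv4 header. The sketch below only does that subtraction; the helper name ipip_mtu is an assumption for illustration, and 9001 is the jumbo-frame MTU commonly found on EC2 instances.

```shell
#!/usr/bin/env bash
# IP-in-IP wraps each packet in an extra IPv4 header (20 bytes),
# so the pod-facing MTU must be that much smaller than the NIC MTU.
ipip_mtu() {
  local nic_mtu=$1
  local ipip_overhead=20
  echo $(( nic_mtu - ipip_overhead ))
}

ipip_mtu 9001   # prints 8981
ipip_mtu 1500   # prints 1480
```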

Setup

Manually: eksctl create cluster --name calico-eks-cluster --region us-west-2 --without-nodegroup

Using Terraform:

We're going to use community modules for the VPC and EKS cluster to avoid reinventing the wheel and keep things simple.

Providers:

provider "aws" {
  region = var.region
}

variable "cluster_name" {
  default = "calico-eks-cluster"
}

variable "region" {
  default = "us-west-2"
}

Create VPC & subnets:

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "3.14.2"

  name = "calico-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-west-2a", "us-west-2b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
  single_nat_gateway = true

  tags = {
    Name = "calico-vpc"
  }
}

Create EKS Cluster without nodegroup:

module "eks" {
  source          = "terraform-aws-modules/eks/aws"
  version         = "20.4.0"
  cluster_name    = var.cluster_name
  cluster_version = "1.27"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  enable_irsa = true

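  # v20 of the module no longer manages the aws-auth ConfigMap;
  # grant the cluster creator admin access through an access entry instead.
  enable_cluster_creator_admin_permissions = true
  create_node_security_group               = true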

  eks_managed_node_groups = {}

  node_security_group_additional_rules = {
    ingress_self_all = {
      description = "Node to node communication"
      protocol    = "-1"
      from_port   = 0
      to_port     = 0
      type        = "ingress"
      self        = true
    }
  }

  tags = {
    Environment = "dev"
    Terraform   = "true"
  }
}

terraform init
terraform apply

Configure Calico as CNI

aws eks --region us-west-2 update-kubeconfig --name calico-eks-cluster # Configure kubectl to use the new cluster

kubectl delete daemonset aws-node -n kube-system # Delete the AWS VPC CNI plugin

Deploy Calico

# Install the Calico operator (create rather than apply: the CRD bundle is too large for client-side apply)
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.29.3/manifests/tigera-operator.yaml

Calico installation

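For IPIP mode, set the IP pool's encapsulation to IPIP. BGP stays enabled here because Calico uses it to distribute the pod routes that IPIP traffic follows between nodes; an IPIP pool is not supported with bgp set to Disabled.

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  kubernetesProvider: EKS
  cni:
    type: Calico
  calicoNetwork:
    bgp: Enabled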
    ipPools:
      - cidr: 192.168.0.0/16
        encapsulation: IPIP
        natOutgoing: Enabled
        nodeSelector: all()

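For BGP mode without encapsulation, set bgp to Enabled and the pool's encapsulation to None, so pod traffic is routed natively using BGP-learned routes. On AWS this also requires disabling the source/destination check on the node instances, and unencapsulated pod traffic is only delivered between nodes in the same subnet.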

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  kubernetesProvider: EKS
  cni:
    type: Calico
  calicoNetwork:
    bgp: Enabled
    ipPools:
      - cidr: 192.168.0.0/16
        encapsulation: None
        natOutgoing: Enabled
        nodeSelector: all()

Depending on the mode you chose, run kubectl apply -f <file>.yaml to create the Installation resource and deploy Calico.
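For a rough sense of what the separate pod pool buys, the sketch below compares the addresses available in the two /24 private subnets from the Terraform above with the 192.168.0.0/16 pool from the manifests. It assumes Calico's default IPAM block size of /26 and uses the fact that AWS reserves 5 addresses in every subnet; the helper names are made up for this example.

```shell
#!/usr/bin/env bash
# Usable addresses in a VPC subnet: AWS reserves 5 per subnet.
usable_subnet_ips() {
  local prefix=$1
  echo $(( (1 << (32 - prefix)) - 5 ))
}

# Number of IPAM blocks a Calico pool splits into (default block size is /26).
pool_blocks() {
  local pool_prefix=$1 block_prefix=$2
  echo $(( 1 << (block_prefix - pool_prefix) ))
}

echo "VPC IPs in two /24 private subnets: $(( 2 * $(usable_subnet_ips 24) ))"  # 502
echo "Calico /26 blocks in a /16 pool:    $(pool_blocks 16 26)"               # 1024
echo "Pod addresses in a /16 pool:        $(( 1 << 16 ))"                     # 65536
```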

Add node groups

We add node groups after the Calico installation to ensure that the nodes are created with the correct configuration.

Update your Terraform configuration to add node groups:

module "eks" {
  # ... existing configuration ...

  eks_managed_node_groups = {
    calico_nodes = {
      desired_size = 2
      max_size     = 3
      min_size     = 1

      instance_types = ["t3.medium"]
      subnet_ids     = module.vpc.private_subnets
    }
  }
}

terraform apply

Why Node Groups Are Added After Calico Installation

If you create node groups before removing the AWS VPC CNI (aws-node), the following happens:

  • Nodes boot up with the AWS VPC CNI already running.
  • Each node tries to attach ENIs and configure VPC IPs.
  • This conflicts with Calico's CNI once you install it.
  • Even worse: nodes may go NotReady or your pods may fail to get Calico-managed IPs.
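A quick pre-flight check before adding node groups could look like the sketch below. It only confirms that the aws-node DaemonSet is gone from kube-system; the function name vpc_cni_removed is hypothetical, and the script assumes kubectl already points at the new cluster.

```shell
#!/usr/bin/env bash
# Succeeds (exit 0) only when the aws-node DaemonSet is absent.
# Exit 1: aws-node still exists. Exit 2: kubectl itself failed.
vpc_cni_removed() {
  local found
  found=$(kubectl get daemonset aws-node -n kube-system --ignore-not-found -o name) || return 2
  if [ -n "$found" ]; then
    echo "aws-node is still present; delete it before adding node groups" >&2
    return 1
  fi
  echo "aws-node not found; safe to add node groups"
}
```

Running vpc_cni_removed && terraform apply would then add the node groups only once the check passes.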

Verify Calico

kubectl get pods -n calico-system
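If you would rather block until Calico is up than poll by hand, a minimal sketch is to wait for the pods in calico-system and then print the operator's status summary. The function name wait_for_calico and the 300s default timeout are assumptions for this example; kubectl wait and the tigerastatus resource are provided by kubectl and the Calico operator respectively.

```shell
#!/usr/bin/env bash
# Wait for every pod in calico-system to become Ready,
# then show the operator's component status.
wait_for_calico() {
  local timeout=${1:-300s}
  kubectl wait --for=condition=Ready pods --all -n calico-system --timeout="$timeout" &&
    kubectl get tigerastatus
}
```

Call it as wait_for_calico, or pass a different timeout such as wait_for_calico 600s.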

Configure BGP Peering (optional)

BGP peering lets the nodes exchange pod routes with a BGP speaker outside the cluster, such as a router or route reflector. If you are using BGP mode and need such peering:

Disable Node-to-Node Mesh:

calicoctl patch bgpconfiguration default -p '{"spec": {"nodeToNodeMeshEnabled": false}}'

Configure Global BGP Peer:

apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: global-peer
spec:
  peerIP: <peer-ip>
  asNumber: <asn>

calicoctl apply -f bgp-peer.yaml
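As a convenience, the manifest above can be generated from shell variables instead of edited by hand. This is only a sketch: render_bgp_peer is a made-up helper, and 10.0.1.10 and 64512 are placeholder values chosen for illustration (an address inside the example VPC and a private-range ASN), not values from a real peer.

```shell
#!/usr/bin/env bash
# Print a BGPPeer manifest for the given peer IP and AS number.
render_bgp_peer() {
  local peer_ip=$1 asn=$2
  cat <<EOF
apiVersion: projectcalico.org/v3
kind: BGPPeer
metadata:
  name: global-peer
spec:
  peerIP: ${peer_ip}
  asNumber: ${asn}
EOF
}

# Placeholder values; substitute your peer's address and ASN.
render_bgp_peer 10.0.1.10 64512
```

Redirect the output to bgp-peer.yaml and apply it as shown above.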
