Deploy an application in ECS with Terraform

By Mateo Spak Jenbergsen
Published on 2023-04-28 (Last modified: 2023-06-28)

...

In this article, I take you through the process of deploying a Docker container in AWS ECS. I'll be covering every step required to get a Docker application up and running in AWS ECS using Fargate. This article assumes you have some knowledge of how to write and deploy Terraform code with AWS. If you don't have Terraform with AWS already up and running, please have a look at Terraform with AWS. I will also be using Docker to create my container application in this article. If you don't have any experience with Docker, please have a look at the Docker documentation.

 

Introduction

AWS ECS Fargate is a managed, serverless way to run containerized applications in the cloud. ECS consists of clusters, services, task definitions and tasks. Clusters are groups of services and are used to isolate services from each other. A service is the part that runs your tasks and makes sure the desired number of tasks is running. A task definition is a text file that describes how your tasks will run; parameters such as the Docker image, CPU, memory and logging are defined within it. Lastly, a task is an instantiation of a task definition that can be run in a service. With ECS Fargate, you only pay for the CPU and memory used by your applications.

I won't go in depth on every component shown in this article, but you should come away with a good idea of how it all fits together. I'll go through how to set up the following:

  • Virtual Private Cloud (VPC)
  • Application Load Balancer (ALB)
  • ECS Cluster, Service and Task definition
  • Identity Access Management (IAM)
  • Security Groups
  • CloudWatch log group
  • ECR repository
  • A Docker container running a hello-world Python Flask application

 

Variables

First off, I'll define my variables. I'll create a file, variables.tf, and insert the following.

variable "region" {
  type = string 
  default = "eu-north-1"
}

variable "vpc_cidr" {
  type = string
  description = "VPC cidr"
  default = "10.0.0.0/16"
}

variable "vpc_azs" {
  type = list(string)
  description = "Availability zones for VPC"
  default = ["eu-north-1a", "eu-north-1b", "eu-north-1c"]
}

variable "private_subnets" {
  type = list(string)
  description = "Private subnets inside the VPC"
  default = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
}

variable "public_subnets" {
  type = list(string)
  description = "Public subnets inside the VPC"
  default = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]
}

Here I define the region, VPC CIDR and availability zones I want to use. I also define some CIDR blocks for the public and private subnets that I'll create in the next step.
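
Later on, my ECS service and task definition reference var.app_name and var.container_port, so I'll also add those two variables here. This is a minimal sketch; the default values are simply the ones I assume for the Flask application later in this article.

variable "app_name" {
  type        = string
  description = "Name of the application container"
  default     = "hello-world"
}

variable "container_port" {
  type        = number
  description = "Port the container listens on"
  default     = 8080
}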

 

VPC

The next step is to create a Virtual Private Cloud (VPC). I'll be using a Terraform module for this to simplify the configuration. I'll create a file, vpc.tf, and insert the following.

module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "3.0.0"

  name = "my-vpc"
  cidr = var.vpc_cidr

  azs  = var.vpc_azs
  private_subnets = var.private_subnets
  public_subnets = var.public_subnets

  create_igw             = true
  enable_nat_gateway     = true
  create_egress_only_igw = true
  single_nat_gateway     = true
}

 

ALB

Next, I need an Application Load Balancer (ALB). The ALB will balance the load going towards my applications in ECS. Since I'm running my ECS service in a private subnet, the ALB is also what exposes my applications to the internet: all internet traffic towards my services has to go through the ALB. I also create a target group. The ALB automatically distributes traffic between the targets within a target group, and I'll later reference this target group from my ECS service. Lastly, I create an ALB listener. This is how the ALB knows where to route traffic. I can specify a hostname, e.g. my-website.com, in a rule and direct all traffic for that host towards the target group that corresponds with the my-website.com ECS service, then add another rule telling the ALB to route all traffic for my-other-website.com to its own target group. ALB listeners are also used to redirect HTTP traffic to HTTPS. I won't be covering SSL/TLS termination with the ALB in this article.

I'll create a file, alb.tf, and add the following.

resource "aws_lb" "main_lb" {
  name               = "my-alb"
  internal           = false
  load_balancer_type = "application"
  security_groups    = [aws_security_group.alb_sg.id]
  subnets            = module.vpc.public_subnets
 }

resource "aws_alb_target_group" "my_ecs_target_group" {
  name        = "my-app-tg"
  port        = 80
  protocol    = "HTTP"
  vpc_id      = module.vpc.vpc_id
  target_type = "ip"
 
  health_check {
   healthy_threshold   = "3"
   interval            = "30"
   protocol            = "HTTP"
   matcher             = "200"
   timeout             = "3"
   path                = "/"
   unhealthy_threshold = "2"
  }
} 

resource "aws_alb_listener" "http" {
  load_balancer_arn = aws_lb.main_lb.id
  port              = 80
  protocol          = "HTTP"
 
  default_action {
    target_group_arn = aws_alb_target_group.my_ecs_target_group.id
    type             = "forward"
  }
}

 

### AWS Load Balancer ###

Here I create my application load balancer and set the following arguments,

  • name, is the name of my ALB.
  • internal, specifies whether the load balancer is internal or internet-facing. I've set this to false since the ALB needs to receive traffic from the internet and route it to my ECS services inside the VPC.
  • load_balancer_type, is the type of load balancer I want to use. Here I've chosen to use an application load balancer.
  • security_groups, is a list of security groups I want to attach to my ALB.
  • subnets, are the subnets I want my ALB to exist in. I use 3 public subnets to achieve high resilience for my ALB.
  • enable_deletion_protection, could additionally be set to true to prevent the ALB from being deleted accidentally. I'm leaving it at its default (false) here.

 

### AWS Target Group ###

Next, I need to create my target group. Here I set the following arguments,

  • name - the name of my target group.
  • port - the port on which targets receive traffic. With ECS this only acts as a default, since the ECS service registers each target with its container port.
  • protocol - what protocol to use when routing traffic to targets.
  • vpc_id - the VPC ID of the VPC I created earlier.
  • target_type - the type of target I want to route traffic to. I want to route traffic to the IP addresses of my ECS tasks.

I also set up a health check within this target group and set the following arguments,

  • healthy_threshold - the number of consecutive health check successes required before considering a target healthy.
  • interval - how many seconds between each health check.
  • protocol - the protocol the ALB uses when sending health checks to targets.
  • matcher - the HTTP response code used to verify a successful health check. This is usually 200.
  • timeout - how many seconds to wait for a response before marking the health check as failed.
  • path - the destination of the health check in my application.
  • unhealthy_threshold - the number of consecutive health check failures required before considering a target unhealthy.

 

### AWS ALB Listener  ###

Lastly, I need to configure my ALB listener and set the following arguments,

  • load_balancer_arn - the ARN of the load balancer I created above (for an aws_lb resource, the id attribute is the ARN).
  • port - the port that the load balancer is listening on.
  • protocol - the protocol clients use to connect to the load balancer.

The interesting part of the listener is the default_action. Here I specify what I want to do with traffic coming into my ALB. I set type to forward, which forwards all traffic to the target group specified by target_group_arn.
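
I don't need any extra listener rules for this single-service setup, but for the host-based routing mentioned earlier, a rule could be added to the listener. The sketch below is only an illustration; the hostname my-website.com is a placeholder.

resource "aws_lb_listener_rule" "host_based_routing" {
  listener_arn = aws_alb_listener.http.arn
  priority     = 100

  action {
    type             = "forward"
    target_group_arn = aws_alb_target_group.my_ecs_target_group.arn
  }

  condition {
    host_header {
      values = ["my-website.com"]
    }
  }
}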

 

ECS

Now, I need to define my ECS configuration. I'll create a file, ecs.tf, and insert the following. 

resource "aws_ecs_cluster" "my_ecs_cluster" {
  name = "my-ecs-cluster"
}

resource "aws_ecs_service" "my_ecs_service" {
  name                               = "my-ecs-service"
  cluster                            = aws_ecs_cluster.my_ecs_cluster.id
  task_definition                    = aws_ecs_task_definition.main_task_definition.arn
  desired_count                      = 1
  deployment_minimum_healthy_percent = 50
  deployment_maximum_percent         = 200
  launch_type                        = "FARGATE"
  scheduling_strategy                = "REPLICA"

  network_configuration {
    security_groups  = [aws_security_group.ecs_services_sg.id]
    subnets          = module.vpc.private_subnets
  }

  load_balancer {
    target_group_arn = aws_alb_target_group.my_ecs_target_group.arn
    container_name   = var.app_name
    container_port   = var.container_port
  }
}

resource "aws_ecs_task_definition" "main_task_definition" {
  family                   = "my-task-definition"
  requires_compatibilities = ["FARGATE"]
  execution_role_arn       = aws_iam_role.ecs_task_execution_role.arn
  task_role_arn            = aws_iam_role.ecs_task_role.arn
  network_mode             = "awsvpc"
  cpu                      = 256
  memory                   = 512

  container_definitions = templatefile("containers/task.tpl.json",
    { CONTAINER_PORT              = var.container_port,
      REGION                      = var.region,
      LOG_GROUP                   = aws_cloudwatch_log_group.my_ecs_service_log_group.name,
      APP_NAME                    = var.app_name,
  })

}

 

Here, I first create my ECS cluster. I only need to define the name of my cluster.

### ECS Service ###

Here, I set the following arguments,

  • desired_count - the number of tasks I want running at once. I only need one, so I'll set this to 1.
  • deployment_minimum_healthy_percent - the lower limit, as a percentage of desired_count, of tasks that must remain running and healthy during a deployment. In this case, 50 percent means there will always be at least 1 running task.
  • deployment_maximum_percent - the upper limit, as a percentage of desired_count, of tasks that can run during a deployment. In this case, the service can start a new task without terminating the existing one; once the new task is running and healthy, the previous task is terminated.
  • launch_type - set to FARGATE since I want my ECS service to run on Fargate.
  • scheduling_strategy - how tasks are placed and maintained. REPLICA keeps the desired number of tasks running across the cluster, and is the only strategy supported on Fargate.

For the network configuration I choose the security group I want to associate with my service. I also place my service in a private subnet, so that it cannot be reached directly from the internet; all inbound traffic has to come through the ALB.

 

### Task Definition ###

Here, I create my task definition and set the following arguments,

  • family - the name of my task definition family.
  • requires_compatibilities - which ECS launch type I want.
  • execution_role_arn - which task execution role to use. I've set this to the task execution role I create in the IAM section below.
  • task_role_arn - which task role to use. I've set this to the task role I create in the IAM section below.
  • network_mode - the Docker networking mode to use for tasks. Fargate requires awsvpc.
  • cpu - the number of CPU units the task will have.
  • memory - the amount of memory, in MiB, the task will have.

I also define container_definitions, which is a list of container definitions provided as a single JSON document. The container definitions are passed on to the Docker daemon.

Next, I need to create the JSON template file referenced in my container definitions. I'll create a folder in my project called containers. In this folder I'll create a file, task.tpl.json, and insert the following.

[  
  {
    "name": "${APP_NAME}",
    "image": "aw_account_id.dkr.ecr.aws_region.amazonaws.com/my-ecr:latest",
    "essential":true,
    "portMappings": [
      {
        "containerPort": ${CONTAINER_PORT},
        "hostPort": ${CONTAINER_PORT}
      }
    ],
    "logConfiguration": {
      "logDriver": "awslogs",
      "options": {
        "awslogs-region": "${REGION}",
        "awslogs-stream-prefix": "app-logstream",
        "awslogs-group": "${LOG_GROUP}"
      }
    }
  }
]

Note! Remember to replace aws_account_id and aws_region in the image URL with your AWS account ID and region.
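
Alternatively, instead of hard-coding the account ID and region in the template, the image URL could be passed in from Terraform. This is just a sketch of that approach; it assumes an extra IMAGE template variable and the ECR repository defined later in this article.

# In ecs.tf, pass the repository URL into the template:
container_definitions = templatefile("containers/task.tpl.json",
  { IMAGE          = "${aws_ecr_repository.my_ecr_repository.repository_url}:latest",
    CONTAINER_PORT = var.container_port,
    REGION         = var.region,
    LOG_GROUP      = aws_cloudwatch_log_group.my_ecs_service_log_group.name,
    APP_NAME       = var.app_name,
})

# In containers/task.tpl.json, reference it as:
#   "image": "${IMAGE}",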

 

IAM

I need to create two roles for my ECS service. The first is an ECS task execution role, which allows ECS to e.g. pull images from ECR and write to my CloudWatch log group on behalf of the task. The second is the ECS task role, which I'll use to grant the application running within the task permission to make AWS API calls on my behalf, e.g. to retrieve data from S3 or write data to AWS RDS.

resource "aws_iam_role" "ecs_task_execution_role" {
  name                = "ecsTaskExecutionRole"
  assume_role_policy  = data.aws_iam_policy_document.execution_assume_role_policy.json
}

resource "aws_iam_role" "ecs_task_role" {
  name = "ecsTaskRole"
 
  assume_role_policy = <<EOF
{
 "Version": "2012-10-17",
 "Statement": [
   {
     "Action": "sts:AssumeRole",
     "Principal": {
       "Service": "ecs-tasks.amazonaws.com"
     },
     "Effect": "Allow",
     "Sid": ""
   }
 ]
}
EOF
}

data "aws_iam_policy_document" "execution_assume_role_policy" {
  statement {
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ecs-tasks.amazonaws.com"]
    }
  }
}

data "aws_iam_policy" "ecsTaskExecutionRole_policy" {
  arn = "arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy"
}

resource "aws_iam_role_policy_attachment" "ecs_task_execution_role_policy" {
  role       = aws_iam_role.ecs_task_execution_role.name
  policy_arn = data.aws_iam_policy.ecsTaskExecutionRole_policy.arn
}

I won't explain these in depth. All you need to know is that I create two roles that ECS tasks are allowed to assume, and attach the AWS-managed AmazonECSTaskExecutionRolePolicy to the execution role.
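
The task role itself has no permissions attached yet. Purely as an illustration, if the application needed to read objects from an S3 bucket, an inline policy like the one below could be attached to it. The bucket name is hypothetical.

data "aws_iam_policy_document" "task_s3_read" {
  statement {
    actions = ["s3:GetObject", "s3:ListBucket"]
    resources = [
      "arn:aws:s3:::my-example-bucket",
      "arn:aws:s3:::my-example-bucket/*",
    ]
  }
}

resource "aws_iam_role_policy" "task_s3_read" {
  name   = "task-s3-read"
  role   = aws_iam_role.ecs_task_role.id
  policy = data.aws_iam_policy_document.task_s3_read.json
}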

 

Security groups

Next, I need some security groups to make my application a bit more secure. I want to restrict inbound traffic to my ALB to ports 80 and 443, and restrict inbound traffic to my ECS service so that only traffic on ports 80 and 8080 coming from my ALB is allowed.

I'll create a file, security_groups.tf, and add the following.

resource "aws_security_group" "alb" {
  name   = "alb-sg"
  vpc_id = module.vpc.vpc_id
  description = "Allow inbound traffic from port 80 and 443, to the ALB"
 
  ingress {
   protocol         = "tcp"
   from_port        = 80
   to_port          = 80
   cidr_blocks      = ["0.0.0.0/0"]
   ipv6_cidr_blocks = ["::/0"]
  }
 
  ingress {
   protocol         = "tcp"
   from_port        = 443
   to_port          = 443
   cidr_blocks      = ["0.0.0.0/0"]
   ipv6_cidr_blocks = ["::/0"]
  }
 
  egress {
   protocol         = "-1"
   from_port        = 0
   to_port          = 0
   cidr_blocks      = ["0.0.0.0/0"]
   ipv6_cidr_blocks = ["::/0"]
  }
}

resource "aws_security_group" "ecs_services" {
  name   = "ecs-service-sg"
  vpc_id = module.vpc.vpc_id
  description = "Allow inbound access from the ALB only"
 
  ingress {
   protocol         = "tcp"
   from_port        = 80
   to_port          = 80
   security_groups = [aws_security_group.alb_sg.id]
  }

  ingress {
   protocol         = "tcp"
   from_port        = 8080
   to_port          = 8080
   security_groups = [aws_security_group.alb_sg.id]
  }
 
  egress {
   protocol         = "-1"
   from_port        = 0
   to_port          = 0
   cidr_blocks      = ["0.0.0.0/0"]
  }
}

Here, I allow all traffic on ports 80 and 443 into my ALB. I then configure my ECS security group to only allow inbound traffic on ports 80 and 8080 from the ALB's security group. 8080 is the port my application will be listening on.

 

CloudWatch log group

Here, I create a CloudWatch log group which my ECS tasks will write their logs to. This is very helpful when troubleshooting errors related to the ECS tasks.

resource "aws_cloudwatch_log_group" "my_ecs_service_log_group" {
  name = "my-ecs-service-loggroup"
}
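
By default, a CloudWatch log group keeps logs indefinitely. If you want to limit log storage, the same resource can be given a retention period; the 30 days below is just an example value.

resource "aws_cloudwatch_log_group" "my_ecs_service_log_group" {
  name              = "my-ecs-service-loggroup"
  retention_in_days = 30 # example value, adjust to your needs
}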

 

ECR

Lastly, I need to create my ECR Repository. This is where I will be storing my Docker images. I've configured my ECS task definition to retrieve the application image from here.

resource "aws_ecr_repository" "my_ecr_repository" {
  name                 = "my-ecr"
  image_tag_mutability = "MUTABLE"
}
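
Optionally, the repository can also scan images for known vulnerabilities when they are pushed. This isn't required for this article, but the same resource with scanning enabled would look something like this.

resource "aws_ecr_repository" "my_ecr_repository" {
  name                 = "my-ecr"
  image_tag_mutability = "MUTABLE"

  # Scan each image for known vulnerabilities on push
  image_scanning_configuration {
    scan_on_push = true
  }
}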

 

Now, I can run terraform plan and terraform apply to provision my infrastructure to AWS.

 

Application

Now, all my infrastructure should be up and running. But my ECS task has no image in ECR to pull, because I haven't yet built and containerized my application with Docker. Since this article primarily focuses on AWS ECS, I will only be writing a simple application using Python Flask; you could use whatever Docker image you'd like. I'll start off by creating a file, run.py, and insert the following.

run.py

from flask import Flask
import os

app = Flask(__name__)

@app.route("/")
def index():
  return "hello, world!"

if __name__ == "__main__":
  app.run(host="0.0.0.0", port=int(os.environ.get('APP_PORT', 8080)))

All this does is return a "hello, world!" message in the browser. I won't go into further detail on this code, since it is not the focus of this article. Next, I need to create my requirements.txt file, which the Dockerfile will use when building the image.

requirements.txt

Flask==2.2.3

 

Now, I want to containerize this application using Docker. I'll create a file, Dockerfile, and insert the following.

FROM --platform=linux/x86-64 python:3.8

WORKDIR /usr/src/app

COPY . .

RUN pip install --no-cache-dir -r requirements.txt 

EXPOSE 8080

CMD ["python", "./run.py"]

Here I first define which base image I want for my Docker image. Since I'm using Python with Flask, I'll use the python:3.8 image. I also specify the platform to be linux/x86-64 to avoid any issues when running this application in AWS ECS, which I have experienced when building my Docker image on macOS (Apple Silicon Macs build ARM images by default, while Fargate runs x86_64 by default).

 

I'll build a Docker image, called hello-world, from my Dockerfile by running,

$ docker build -t hello-world .

Note! Make sure all three files, run.py, requirements.txt and Dockerfile, are in the same directory before running this command.

I can view my image by running docker images.

 

Next, I want to upload my Docker image to the ECR repository I created earlier. I'll be using the AWS CLI to achieve this. First, I need to retrieve an authentication token from ECR and use it to log Docker in to my registry.

$ aws ecr get-login-password --region aws_region | docker login --username AWS --password-stdin aws_account_id.dkr.ecr.aws_region.amazonaws.com

Here, you need to replace aws_region and aws_account_id with your specific region and account ID.

 

Then, I run,

$ docker tag hello-world:latest aws_account_id.dkr.ecr.aws_region.amazonaws.com/my-ecr:latest

Remember to also replace aws_account_id and aws_region to match your specific configuration. 

 

Lastly, I push my Docker image to my ECR repository by running the following command,

$ docker push aws_account_id.dkr.ecr.aws_region.amazonaws.com/my-ecr:latest

 

Now, I should be able to view my Docker image in my ECR repository in the AWS console.

 

After some time, the ECS service should manage to start a task using the image now available in ECR (if it doesn't, you can force a new deployment with aws ecs update-service --cluster my-ecs-cluster --service my-ecs-service --force-new-deployment). I should then be able to see my ECS service up and running by visiting my ALB's DNS name.
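
To avoid looking up the DNS name in the AWS console, a small Terraform output can print it after terraform apply. This is my own addition and isn't required for the setup.

output "alb_dns_name" {
  description = "Public DNS name of the ALB"
  value       = aws_lb.main_lb.dns_name
}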

 

If I go to the DNS name of my ALB in a browser, the application responds with the "hello, world!" message.

 

Hopefully you found this article helpful!

About the author



Mateo Spak Jenbergsen

Mateo is a DevOps engineer at Spak Consultants, with a strong focus on AWS, Terraform and container technologies. He has a strong passion for pushing the limits when it comes to building software on cloud platforms.
