Set up ECS logging and alerts via SNS

By Mateo Spak Jenbergsen
Published on 2022-07-04 (Last modified: 2023-09-14)

...

In this article I will go through how we can use SNS email notifications to send an alert whenever our ECS tasks or container instances change state. Such a change could be caused by an error inside a container that stops our tasks. We'll create a Lambda function that takes the event from EventBridge and rewrites it into a custom alert message.

This article does not cover how to use Terraform. If you don't know Terraform, the Terraform introduction is a good place to start before moving on with the rest of the article.

 

Motivation

  • Prevent downtime for our applications running in AWS ECS using alerts via SNS email notifications.
  • Retrieve real-time information about our ECS services and tasks that could be useful for further improvements.
  • Logging and alerting are important parts of cloud development and operations. Without some kind of alert system, your ECS service might stop without you knowing.

 

The different services

AWS SNS is a fully managed messaging service for both application-to-application and application-to-person communication. For our use case, we want to send an email that both alerts us about a task change in ECS and gives brief information about what's happening.

AWS Lambda is a serverless, event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers. We use Lambda as the link between EventBridge and SNS: it extracts the information we want from the event and publishes it to our SNS topic.

AWS EventBridge is an AWS service that enables you to respond to state changes in an AWS resource. An EventBridge rule matches incoming events and sends them to targets for processing. 

 

SNS

Let's start off by creating our SNS topic. We'll create a file called sns.tf and insert the following.

resource "aws_sns_topic" "sns_topic" {
  name = "ecs-task-change"
}

resource "aws_sns_topic_subscription" "sns_topic_target" {
  topic_arn = aws_sns_topic.sns_topic.arn
  protocol  = "email"
  endpoint  = "<your-email-address>"
}

 

Here we first create our SNS topic which we call ecs-task-change. Then we create our topic subscription. Here we set the following arguments.

  • topic_arn: The ARN of the topic we created above.
  • protocol: Set to email since we want email notifications.
  • endpoint: The email address you want to be notified on.

N.B. When we set up our topic subscription, a confirmation email is sent to our endpoint. Confirm the subscription before you run terraform destroy. If you destroy an unconfirmed subscription, Terraform removes the subscription from its state, but the subscription still exists in AWS. However, if you delete the SNS topic, SNS deletes all the subscriptions associated with the topic.
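
To see whether a subscription is still waiting for confirmation, one option is to list the topic's subscriptions with boto3 and look for the placeholder value "PendingConfirmation", which SNS reports in place of a real subscription ARN until the email link has been clicked. This is a minimal sketch: the helper names pending_subscriptions and check_topic are made up for illustration, and only the first page of results is inspected.

```python
def pending_subscriptions(response):
    """Return the endpoints whose subscription has not been confirmed yet.

    `response` is a dict shaped like the result of the SNS
    ListSubscriptionsByTopic call.
    """
    return [
        sub["Endpoint"]
        for sub in response.get("Subscriptions", [])
        if sub.get("SubscriptionArn") == "PendingConfirmation"
    ]


def check_topic(topic_arn):
    """Hypothetical helper: query SNS and report unconfirmed endpoints."""
    # Imported here so pending_subscriptions() works without boto3 installed.
    import boto3

    client = boto3.client("sns")
    # Only the first page of subscriptions is read in this sketch.
    response = client.list_subscriptions_by_topic(TopicArn=topic_arn)
    return pending_subscriptions(response)
```

Calling check_topic with the topic ARN should return an empty list once every email subscription has been confirmed.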

 

Lambda

The next step is to create our Lambda function. We'll start by writing the Python function that our Lambda will use. Create a file called lambda_function.py and add the following.

import boto3

# Replace with the ARN of the SNS topic created in sns.tf.
sns_arn = "<sns-topic-arn>"
subject = "Changes in ECS tasks."

SUPPORTED_DETAIL_TYPES = (
    "ECS Task State Change",
    "ECS Container Instance State Change",
)

def lambda_handler(event, context):
    result = {}
    client = boto3.client("sns")

    if event["source"] != "aws.ecs":
        raise ValueError("Function only supports input from events with a source type of: aws.ecs")

    if event["detail-type"] in SUPPORTED_DETAIL_TYPES:
        result["event"] = event["detail-type"]
    else:
        raise ValueError("detail-type for event is not a supported type. Exiting without saving event.")

    detail = event["detail"]

    # Container instance events have no "containers" list, so start from the
    # top-level status and let the last container's status override it.
    result["lastStatus"] = detail.get("lastStatus") or detail.get("status")
    for container in detail.get("containers", []):
        result["lastStatus"] = container["lastStatus"]

    result["time"] = event["time"]
    result["resources"] = event["resources"]
    # stopCode is only present once a task has stopped.
    result["stopCode"] = detail.get("stopCode")

    message = "Event: {}\nTime: {}\nStatus: {}\nResources: {}\nStop Code: {}".format(result["event"], result["time"], result["lastStatus"], result["resources"], result["stopCode"])

    resp = client.publish(TargetArn=sns_arn, Message=message, Subject=subject)

 

This code takes the event from EventBridge and sends a portion of it to our SNS topic. Note that this is only a simple example of how this can be done. You should look at the entire event from EventBridge and write code that sends the information you need. Remember to replace sns_arn = "<sns-topic-arn>" with your SNS topic ARN.
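
As one illustration of pulling more out of the event, the sketch below builds a message that also includes the task ARN, the stopped reason, and each container's exit code. It is not the function used in this article: build_message is a hypothetical helper, and sample_event is an abbreviated, made-up event (real ECS Task State Change events carry many more fields).

```python
def build_message(event):
    """Build a plain-text alert from an ECS Task State Change event."""
    detail = event.get("detail", {})
    lines = [
        "Event: {}".format(event.get("detail-type")),
        "Time: {}".format(event.get("time")),
        "Task: {}".format(detail.get("taskArn")),
        "Status: {} (desired: {})".format(
            detail.get("lastStatus"), detail.get("desiredStatus")
        ),
    ]
    # stoppedReason is only included when the event carries one.
    if detail.get("stoppedReason"):
        lines.append("Stopped reason: {}".format(detail["stoppedReason"]))
    # One line per container, so a failing container is easy to spot.
    for container in detail.get("containers", []):
        lines.append(
            "Container {}: {}, exit code {}".format(
                container.get("name"),
                container.get("lastStatus"),
                container.get("exitCode"),
            )
        )
    return "\n".join(lines)


# Abbreviated, made-up event used only for illustration.
sample_event = {
    "source": "aws.ecs",
    "detail-type": "ECS Task State Change",
    "time": "2022-07-04T12:00:00Z",
    "detail": {
        "taskArn": "arn:aws:ecs:eu-west-1:123456789012:task/example-cluster/0123456789abcdef",
        "lastStatus": "STOPPED",
        "desiredStatus": "STOPPED",
        "stoppedReason": "Essential container in task exited",
        "containers": [{"name": "web", "lastStatus": "STOPPED", "exitCode": 1}],
    },
}

print(build_message(sample_event))
```

Keeping the formatting in a function that takes only the event makes it possible to try out message layouts locally, without calling SNS.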

N.B. Remember to zip this file and name the archive lambda_function.zip.
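
If you would rather script this step than zip the file by hand, a minimal sketch with Python's standard zipfile module could look like the following. The helper name zip_lambda is made up for illustration. The file is stored at the root of the archive so that the handler path lambda_function.lambda_handler can find it.

```python
import os
import zipfile


def zip_lambda(source="lambda_function.py", target="lambda_function.zip"):
    """Write `source` into `target`, placed at the root of the archive."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname drops any directory part so the file sits at the archive root.
        archive.write(source, arcname=os.path.basename(source))
    return target
```

Calling zip_lambda() from the directory that contains lambda_function.py produces lambda_function.zip next to it.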

Let's create the file, lambda.tf, and add the following content.

resource "aws_lambda_function" "lambda_function" {
  filename      = "lambda_function.zip"
  function_name = "lambda_function"
  role          = aws_iam_role.iam_lambda_role.arn
  handler       = "lambda_function.lambda_handler"

  source_code_hash = filebase64sha256("lambda_function.zip")

  runtime = "python3.8"
}

 

This will create a Lambda function that uses the Python script we created above. We set the following arguments.

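  • filename: The zipped file that we want to use as our function.
  • function_name: The name of our Lambda function.
  • role: The ARN of the role we want to attach to our Lambda function. The role aws_iam_role.iam_lambda_role is not defined in this article; it needs to be assumable by Lambda and allowed to publish to the SNS topic (sns:Publish).
  • handler: The method that processes the event, written as <file name>.<function name>. Here that is lambda_function.lambda_handler.
  • source_code_hash: A hash of the zip file. When it changes, Terraform updates the function code.
  • runtime: Identifier of the function's runtime. We want to use Python for this function, so it's set to python3.8. Check which Python runtimes Lambda currently supports and use a newer identifier if python3.8 is no longer available.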

 

EventBridge

Next we create our EventBridge rule. Create a file, eventbridge.tf and add the following.

module "eventbridge" {
  source = "terraform-aws-modules/eventbridge/aws"

  create_bus = false

  rules = {
    orders = {
      description   = "Capture data when tasks or container instances change."
      event_pattern = jsonencode({"source": ["aws.ecs"],"detail-type": ["ECS Task State Change", "ECS Container Instance State Change"],"detail": {"clusterArn": ["<ecs-cluster-arn>"]}})
      enabled       = true
    }
  }

  targets = {
    orders = [
      {
        name = "send-logs-to-lambda"
        arn = aws_lambda_function.lambda_function.arn
      }
    ]
  }
}

 

For the EventBridge I've chosen to use a Terraform AWS module. We set the following arguments.

  • source: The source of the module we want to use.
  • create_bus: Defines whether or not we want to create a bus. To keep things simple for now I've chosen to set this to false, which means the rule uses the default event bus provided by EventBridge.

Rules is a map of rule definitions. Here we define when we want our rule to run. We do this by defining our event_pattern so that it triggers whenever there is an ECS task or container instance state change. We also specify which cluster the rule should match events from. Remember to replace

{"clusterArn": ["<ecs-cluster-arn>"]}

with your cluster arn.
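
To build some intuition for how the pattern above selects events, here is a deliberately simplified matcher. It is not how EventBridge is implemented and it ignores features such as prefix or anything-but matching; it only shows the basic rule that every key in the pattern must be present in the event and that a list in the pattern matches if the event value equals any of its items. The function name matches and the cluster ARN are made up for illustration.

```python
def matches(pattern, event):
    """Simplified event pattern matching (exact values and nesting only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the matching part of the event.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # A list in the pattern matches if it contains the event value.
            return False
    return True


# Hypothetical cluster ARN standing in for "<ecs-cluster-arn>".
CLUSTER_ARN = "arn:aws:ecs:eu-west-1:123456789012:cluster/example-cluster"

pattern = {
    "source": ["aws.ecs"],
    "detail-type": ["ECS Task State Change", "ECS Container Instance State Change"],
    "detail": {"clusterArn": [CLUSTER_ARN]},
}

task_event = {
    "source": "aws.ecs",
    "detail-type": "ECS Task State Change",
    "detail": {"clusterArn": CLUSTER_ARN, "lastStatus": "STOPPED"},
}

other_cluster_event = {
    "source": "aws.ecs",
    "detail-type": "ECS Task State Change",
    "detail": {"clusterArn": "arn:aws:ecs:eu-west-1:123456789012:cluster/other"},
}

print(matches(pattern, task_event))           # event from our cluster
print(matches(pattern, other_cluster_event))  # event from another cluster
```

Fields that appear in the event but not in the pattern, such as lastStatus here, do not affect the result, which is why a short pattern is enough to capture every state change in the cluster.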

Targets is a map of objects with EventBridge target definitions. Since we want to send the events to our Lambda function, we use the ARN of our Lambda function as the target.

Now we can run terraform plan and terraform apply from the command line. Make sure you accept the Subscription Confirmation email sent by AWS when the SNS subscription is created.

 

Conclusion

We've now created an SNS topic and a subscription that sends emails to our email address. Then we set up our Lambda function, which takes the EventBridge event as input, extracts the important information from it and, using boto3, calls the SNS API to send our message. Lastly, we created our EventBridge rule, which triggers whenever a task or container instance in our ECS cluster changes state and sends the event to our Lambda function. We should now have a functioning logging and alert system that sends an email whenever an ECS task or container instance changes state.




About the author



Mateo Spak Jenbergsen

Mateo is a Devops at Spak Consultants, with strong focus on AWS, Terraform and container technologies. He has a strong passion for pushing the limits when it comes to building software on cloud platforms.
