In this article we explain how to set up an EC2 metric monitoring and alarm system that notifies us via email when an EC2 instance is running low on disk space. We'll use the AWS unified CloudWatch agent to publish custom metrics for our EC2 instance. Note: this approach is not restricted to EC2 instances; it also works with on-premises servers running macOS, Windows and Linux. The AWS documentation has a list of the supported operating systems.
Motivation
Recently, we had a client that was running a MongoDB cluster on a couple of EC2 instances. One day, the databases stopped working because the disks on the EC2 instances were full. Our client had no way of detecting this before it was too late, and their production services were down for a while. Metrics and alarms would have caught this early. In this article we'll go through how to avoid ending up in this position using the CloudWatch agent. We'll also set up an alarm that sends an email whenever our EC2 instance is running low on disk space.
What are CloudWatch Agents?
Let's look at the AWS unified CloudWatch agent. It allows us to collect internal system-level metrics from our EC2 server. By default, CloudWatch does not collect information such as disk space and memory utilization, because those values are only visible from inside the operating system, so we need to write a custom configuration to publish them. You install the agent on the instance you'd like to monitor and use a configuration file to tell the agent what data to collect. In this article I will store this configuration file in AWS SSM Parameter Store so it can be shared by multiple agents.
Install CloudWatch agent
If your EC2 instance is running Amazon Linux 2, you can install the CloudWatch agent package using yum.
sudo yum install amazon-cloudwatch-agent
If you are not using Amazon Linux 2, you can download the agent package from Amazon S3. You can find the specific URL for your OS in the AWS documentation.
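As a rough sketch of the manual route, the helper below builds the package URL for a given distribution and CPU architecture. The URL layout is an assumption based on the download links AWS publishes at the time of writing, and the function name agent_package_url is made up for this example, so double-check the result against the AWS documentation for your OS.

```shell
# Hypothetical helper: print the agent package URL for a distro and architecture.
# The URL layout is an assumption based on the AWS download links; verify it in the docs.
agent_package_url() {
  distro="$1"   # e.g. ubuntu, debian, redhat, amazon_linux
  arch="$2"     # amd64 or arm64
  case "$distro" in
    ubuntu|debian) ext="deb" ;;   # Debian-based distributions use .deb packages
    *)             ext="rpm" ;;   # everything else in this sketch is treated as RPM-based
  esac
  echo "https://amazoncloudwatch-agent.s3.amazonaws.com/${distro}/${arch}/latest/amazon-cloudwatch-agent.${ext}"
}

# Example for Ubuntu on x86-64 (uncomment to run on the instance):
# wget "$(agent_package_url ubuntu amd64)"
# sudo dpkg -i -E ./amazon-cloudwatch-agent.deb
```

On RPM-based systems the install step would be sudo rpm -U ./amazon-cloudwatch-agent.rpm instead of dpkg.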
AWS SSM Parameter Store
When setting up a CloudWatch agent we need a configuration file to tell the agent what data we want to collect. We can store this file locally on our server or in AWS SSM Parameter Store, and I will show both options. Open AWS Systems Manager (SSM) and click on Parameter Store in the left side panel. Click Create parameter and enter a name. You can call it whatever you'd like, but I'll call it /cloudwatch-agent/config. We want a Standard tier parameter of type String. In the value box, enter the following.
{
  "agent": {
    "metrics_collection_interval": 60,
    "run_as_user": "root"
  },
  "metrics": {
    "append_dimensions": {
      "AutoScalingGroupName": "${aws:AutoScalingGroupName}",
      "ImageId": "${aws:ImageId}",
      "InstanceId": "${aws:InstanceId}",
      "InstanceType": "${aws:InstanceType}"
    },
    "metrics_collected": {
      "disk": {
        "measurement": [
          "used_percent"
        ],
        "metrics_collection_interval": 60,
        "resources": [
          "*"
        ]
      },
      "mem": {
        "measurement": [
          "mem_used_percent"
        ],
        "metrics_collection_interval": 60
      }
    }
  }
}
Now press Create parameter and we are done with our configuration file. If you don't want to use SSM, just create a file called config.json locally on your EC2 instance and enter the code above.
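If you prefer the command line over the console, the same parameter can be created with the AWS CLI. The sketch below assumes the AWS CLI is installed and configured with permission to write SSM parameters, and that python3 is available for a quick syntax check; the function name upload_agent_config is hypothetical.

```shell
# Hypothetical helper: check that the file parses as JSON, then upload it to Parameter Store.
upload_agent_config() {
  config_file="$1"    # local path, e.g. config.json
  param_name="$2"     # e.g. /cloudwatch-agent/config

  # Catch syntax errors locally before the agent ever sees the file.
  if ! python3 -m json.tool "$config_file" > /dev/null 2>&1; then
    echo "Not valid JSON: $config_file" >&2
    return 1
  fi

  # file:// makes the CLI read the value from disk; --overwrite allows re-running after edits.
  aws ssm put-parameter \
    --name "$param_name" \
    --type String \
    --value "file://$config_file" \
    --overwrite
}

# Example (uncomment to run):
# upload_agent_config config.json /cloudwatch-agent/config
```

Validating first is a small design choice: a malformed file is much easier to spot here than later, when the agent refuses to load it.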
IAM
For our EC2 instance to write to CloudWatch, we need to give it permission to do so. Open IAM Roles and hit Create role. Select AWS service as the trusted entity type and choose EC2 as the use case. Now we need to add two AWS managed policies to our role: CloudWatchAgentServerPolicy, which lets the agent publish metrics, and AmazonSSMManagedInstanceCore, which lets the instance talk to Systems Manager. Add both of these policies and press Next. Give your role a name. I've chosen to name mine CloudWatchAgentServerRole. Lastly, hit Create role. Now we need to attach this role to our EC2 instance. Go to your EC2 instance, open Actions -> Security and press Modify IAM role.
Select the role we created earlier and hit Update IAM role.
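The console steps above can also be scripted. Here is a minimal sketch with the AWS CLI, assuming your CLI credentials are allowed to manage IAM roles and EC2 instance profiles. The wrapper names trust_policy and create_and_attach_role are made up for this example; the role name and the two managed policies are the ones used above.

```shell
ROLE_NAME="CloudWatchAgentServerRole"

# Trust policy that lets the EC2 service assume the role
# (this is what choosing "EC2" in the console sets up for us).
trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "ec2.amazonaws.com" },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}

# Hypothetical wrapper: create the role, attach both policies, and bind it to one instance.
create_and_attach_role() {
  instance_id="$1"   # ID of the instance to monitor

  aws iam create-role --role-name "$ROLE_NAME" \
    --assume-role-policy-document "$(trust_policy)" || return 1

  for policy in CloudWatchAgentServerPolicy AmazonSSMManagedInstanceCore; do
    aws iam attach-role-policy --role-name "$ROLE_NAME" \
      --policy-arn "arn:aws:iam::aws:policy/${policy}" || return 1
  done

  # EC2 uses roles through an instance profile; the console creates one behind the scenes.
  aws iam create-instance-profile --instance-profile-name "$ROLE_NAME" || return 1
  aws iam add-role-to-instance-profile --instance-profile-name "$ROLE_NAME" \
    --role-name "$ROLE_NAME" || return 1

  aws ec2 associate-iam-instance-profile --instance-id "$instance_id" \
    --iam-instance-profile "Name=$ROLE_NAME"
}

# Example (uncomment and use your own instance ID):
# create_and_attach_role i-0123456789abcdef0
```

A freshly created instance profile can take a few seconds to become usable, so if the last command fails right away it may be worth retrying it.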
Setting up our agent
Now we can fetch our CloudWatch agent configuration. SSH into the EC2 instance and run the following command.
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c ssm:/cloudwatch-agent/config
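This command retrieves the configuration that we saved in AWS SSM Parameter Store earlier in the article, applies it, and starts the agent (that is what the -s flag does). If you want to use a local configuration file, run this command instead, replacing /config.json with the absolute path to your file.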
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -s -c file:/config.json
Now, within about 5 minutes, we should see our custom metrics in CloudWatch Metrics under the CWAgent namespace.
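If the metrics don't show up, it helps to check the agent from the instance itself. The sketch below wraps two checks in small functions: asking the agent for its status, and listing the disk metrics published for one instance. The function names are hypothetical, and the second check assumes the AWS CLI is configured with read access to CloudWatch.

```shell
CTL=/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl

# Ask the agent whether it is running (prints a small JSON status document).
agent_status() {
  sudo "$CTL" -m ec2 -a status
}

# List the disk metrics the agent has published for one instance.
list_disk_metrics() {
  instance_id="$1"
  aws cloudwatch list-metrics \
    --namespace CWAgent \
    --metric-name disk_used_percent \
    --dimensions "Name=InstanceId,Value=${instance_id}"
}

# Example (uncomment to run):
# agent_status
# list_disk_metrics i-0123456789abcdef0
```

The output of list_disk_metrics also shows the full set of dimensions attached to each metric (path, device, fstype and so on), which is useful to have at hand when creating the alarm.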
CloudWatch alarm
The final step is to set up our CloudWatch alarm. Go to CloudWatch -> Alarms and hit Create alarm. Here we select the metric we want to monitor. Select the CWAgent namespace that the agent created for us and choose the metric you want to monitor. Since I want to monitor disk space used, I'll choose the metric named disk_used_percent with path "/".
Under Conditions I set the threshold value to 80 percent and hit Next. Since I don't have an SNS topic yet, I select Create new topic, call it my_ec2_disk_space_topic and enter my email address as the endpoint. Select Create topic and hit Next. AWS sends a confirmation email to that address, and the subscription must be confirmed from that email before any notifications are delivered. Give the alarm a suitable name and description. Review your alarm to check that everything looks right and lastly hit Create alarm. We also collect memory used as a percentage. If you'd like to monitor this as well, create a separate alarm for that metric.
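For reference, here is roughly how the same alarm could be created with the AWS CLI. Treat it as a sketch under a few assumptions: the function name create_disk_alarm and the alarm name are invented for this example, the 5 minute period and single evaluation period are my own choices, and the dimensions you pass in must match the full dimension set of the metric exactly as CloudWatch lists it (with the configuration above that includes more than just InstanceId and path).

```shell
DISK_THRESHOLD=80   # percent, same threshold as in the console walkthrough

# Hypothetical helper: create the topic, subscribe an email address, and create the alarm.
# Usage: create_disk_alarm <email> <dimension>...
create_disk_alarm() {
  email="$1"
  shift   # the remaining arguments are the metric's dimensions (Name=...,Value=...)

  # create-topic returns the ARN of the topic, also when the topic already exists.
  topic_arn=$(aws sns create-topic --name my_ec2_disk_space_topic \
    --query TopicArn --output text) || return 1

  # The recipient still has to confirm the subscription from the email AWS sends.
  aws sns subscribe --topic-arn "$topic_arn" \
    --protocol email --notification-endpoint "$email" || return 1

  aws cloudwatch put-metric-alarm \
    --alarm-name "ec2-disk-used-above-${DISK_THRESHOLD}" \
    --namespace CWAgent \
    --metric-name disk_used_percent \
    --dimensions "$@" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold "$DISK_THRESHOLD" \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "$topic_arn"
}

# Example (uncomment; copy the dimension values from your own metric):
# create_disk_alarm me@example.com \
#   "Name=InstanceId,Value=i-0123456789abcdef0" "Name=path,Value=/"
```

If the alarm stays in the Insufficient data state, a mismatch between these dimensions and the published metric is the first thing I would check.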
Conclusion
In this article we have learned how to monitor disk space and memory utilization on an EC2 instance. We then configured an alarm that notifies us via email when our EC2 instance is running out of disk space. This is one of the simplest and fastest ways to implement internal system-level monitoring with the CloudWatch agent. There is a lot more you can do with the agent that I didn't cover in this article. To read more about CloudWatch agents, check out the AWS documentation.