Amazon CloudWatch - User Guide

Copyright © 2017 Amazon Web Services, Inc. and/or its affiliates. All rights reserved.

Amazon's trademarks and trade dress may not be used in connection with any product or service that is not Amazon's, in any manner that is likely to cause confusion among customers, or in any manner that disparages or discredits Amazon. All other trademarks not owned by Amazon are the property of their respective owners, who may or may not be affiliated with, connected to, or sponsored by Amazon.


Table of Contents

What is Amazon CloudWatch?
    Accessing CloudWatch
    Related AWS Services
    How CloudWatch Works
    Concepts
        Namespaces
        Metrics
        Dimensions
        Statistics
        Percentiles
        Alarms
    Limits
    Resources
Getting Set Up
    Sign Up for Amazon Web Services (AWS)
    Sign in to the Amazon CloudWatch Console
    Set Up the AWS CLI
Getting Started
    Scenario: Monitor Estimated Charges
        Step 1: Enable Billing Alerts
        Step 2: Create a Billing Alarm
        Step 3: Check the Alarm Status
        Step 4: Edit a Billing Alarm
        Step 5: Delete a Billing Alarm
    Scenario: Publish Metrics
        Step 1: Define the Data Configuration
        Step 2: Add Metrics to CloudWatch
        Step 3: Get Statistics from CloudWatch
        Step 4: View Graphs with the Console
Using Dashboards
    Create a Dashboard
    Add or Remove a Graph
    Move or Resize a Graph
    Edit a Graph
    Rename a Graph
    Add or Remove a Text Widget
    Add or Remove an Alarm
    Monitor Resources in Multiple Regions
    Link and Unlink Graphs
    Change the Refresh Interval
    Change the Time Range or Format
Using Metrics
    View Available Metrics
    Search for Available Metrics
    Get Statistics for a Metric
        Get Statistics for a Specific Resource
        Aggregate Statistics Across Resources
        Aggregate Statistics by Auto Scaling Group
        Aggregate Statistics by AMI
    Graph Metrics
        Graph a Metric
        Modify the Time Range for a Graph
        Modify the Y Axis for a Graph
        Create an Alarm from a Metric on a Graph
    Publish Custom Metrics
        High-Resolution Metrics
        Using Dimensions
        Publish Single Data Points
        Publish Statistic Sets
        Publish the Value Zero
Metrics and Dimensions Reference
    AWS Namespaces
    API Gateway
        API Gateway Metrics
        Dimensions for Metrics
    AppStream 2.0
        Amazon AppStream 2.0 Metrics
        Dimensions for Amazon AppStream 2.0 Metrics
    Auto Scaling
        Auto Scaling Group Metrics
        Dimensions for Auto Scaling Group Metrics
    AWS Billing and Cost Management
        AWS Billing and Cost Management Metrics
        Dimensions for AWS Billing and Cost Management Metrics
    Amazon CloudFront
        Amazon CloudFront Metrics
        Dimensions for CloudFront Metrics
    Amazon CloudSearch
        Amazon CloudSearch Metrics
        Dimensions for Amazon CloudSearch Metrics
    Amazon CloudWatch Events
        CloudWatch Events Metrics
        Dimensions for CloudWatch Events Metrics
    Amazon CloudWatch Logs
        CloudWatch Logs Metrics
        Dimensions for CloudWatch Logs Metrics
    AWS DMS
    AWS Direct Connect
        AWS Direct Connect Metrics
        Dimensions for AWS Direct Connect Metrics
    Amazon DynamoDB
        DynamoDB Metrics
        Dimensions for DynamoDB Metrics
    Amazon EC2
        Amazon EC2 Metrics
        Dimensions for Amazon EC2 Metrics
    Amazon EC2 Spot Fleet
        Amazon EC2 Spot Fleet Metrics
        Dimensions for Amazon EC2 Spot Fleet Metrics
    Amazon ECS
        Amazon ECS Metrics
        Dimensions for Amazon ECS Metrics
    Elastic Beanstalk
        Elastic Beanstalk Metrics
        Dimensions for Elastic Beanstalk Metrics
    Amazon ElastiCache
        Dimensions for ElastiCache Metrics
        Host-Level Metrics
        Metrics for Memcached
        Metrics for Redis
    Amazon EBS
        Amazon EBS Metrics
        Dimensions for Amazon EBS Metrics
    Amazon EFS
        Amazon CloudWatch Metrics for Amazon EFS
        Dimensions for Amazon EFS Metrics
    Elastic Load Balancing
        Application Load Balancer Metrics
        Metric Dimensions for Application Load Balancers
        Network Load Balancer Metrics
        Metric Dimensions for Network Load Balancers
        Classic Load Balancer Metrics
        Metric Dimensions for Classic Load Balancers
    Amazon EMR
        Amazon EMR Metrics
        Amazon EMR Dimensions
    Amazon ES
        Amazon Elasticsearch Service Metrics
        Dimensions for Amazon Elasticsearch Service Metrics
    Elastic Transcoder
        Elastic Transcoder Metrics
        Dimensions for Elastic Transcoder Metrics
    Amazon GameLift
        Amazon GameLift Metrics for Fleets
        Amazon GameLift Metrics for Queues
        Dimensions for Amazon GameLift Metrics
    Amazon Inspector
    AWS IoT
        AWS IoT Metrics
        Dimensions for Metrics
    Amazon Kinesis Analytics
        Metrics
        Dimensions for Metrics
    Amazon Kinesis Firehose
        Service-level CloudWatch Metrics
        API-Level CloudWatch Metrics
    Amazon Kinesis Streams
        Basic Stream-level Metrics
        Enhanced Shard-level Metrics
        Dimensions for Amazon Kinesis Metrics
    AWS KMS
        AWS KMS Metrics
        Dimensions for AWS KMS Metrics
    AWS Lambda
        AWS Lambda CloudWatch Metrics
        Dimensions for AWS Lambda Metrics
    Amazon Machine Learning
        Amazon ML Metrics
        Dimensions for Amazon Machine Learning Metrics
    AWS OpsWorks
        AWS OpsWorks Stacks Metrics
        Dimensions for AWS OpsWorks Metrics
    Amazon Polly
        Amazon Polly Metrics
        Dimensions for Amazon Polly Metrics
    Amazon Redshift
        Amazon Redshift Metrics
        Dimensions for Amazon Redshift Metrics
    Amazon RDS
        Amazon RDS Metrics
        Dimensions for RDS Metrics
    Amazon Route 53
        Amazon Route 53 Metrics
        Dimensions for Amazon Route 53 Metrics
    Amazon SES
        Amazon SES Metrics
        Dimensions for Amazon SES Metrics
    Amazon SNS
        Amazon Simple Notification Service Metrics
        Dimensions for Amazon Simple Notification Service Metrics
    Amazon SQS
        Amazon SQS Metrics
        Dimensions for Amazon SQS Metrics
    Amazon S3
        Amazon S3 CloudWatch Metrics
        Amazon S3 CloudWatch Dimensions
    AWS Shield Advanced
    AWS Step Functions
        Execution Metrics
        Activity Metrics
        Lambda Function Metrics
    Amazon SWF
        Workflow Metrics
        Activity Metrics
    AWS Storage Gateway
        AWS Storage Gateway Metrics
        Dimensions for AWS Storage Gateway Metrics
    Amazon VPC NAT Gateway
        NAT Gateway Metrics
        Dimensions for NAT Gateway Metrics
    Amazon VPC VPN
        VPN Metrics
        Dimensions for VPN Metrics
    AWS WAF
        AWS WAF Metrics
        AWS WAF Dimensions
    Amazon WorkSpaces
        Amazon WorkSpaces Metrics
        Dimensions for Amazon WorkSpaces Metrics
Creating Alarms
    Configuring How Alarms Handle Missing Data
    High-Resolution Alarms
    Percentile-Based Alarms and Low Data Samples
    Common Features of CloudWatch Alarms
    Set Up an SNS Topic
        Set Up an SNS Topic Using the AWS Management Console
        Set Up an SNS Topic Using the AWS CLI
    Create or Edit an Alarm
    Create a CPU Usage Alarm
        Set Up a CPU Usage Alarm Using the AWS Management Console
        Set Up a CPU Usage Alarm Using the AWS CLI
    Create a Load Balancer Latency Alarm
        Set Up a Latency Alarm Using the AWS Management Console
        Set Up a Latency Alarm Using the AWS CLI
    Create a Storage Throughput Alarm
        Set Up a Storage Throughput Alarm Using the AWS Management Console
        Set Up a Storage Throughput Alarm Using the AWS CLI
    Create Alarms to Stop, Terminate, Reboot, or Recover an Instance
        Adding Stop Actions to Amazon CloudWatch Alarms
        Adding Terminate Actions to Amazon CloudWatch Alarms
        Adding Reboot Actions to Amazon CloudWatch Alarms
        Adding Recover Actions to Amazon CloudWatch Alarms
        Viewing the History of Triggered Alarms and Actions
    Create a Billing Alarm
        Enable Billing Alerts
        Create a Billing Alarm
        Check the Alarm Status
        Delete a Billing Alarm
Authentication and Access Control
    Authentication
    Access Control
    Overview of Managing Access
        Resources and Operations
        Understanding Resource Ownership
        Managing Access to Resources
        Specifying Policy Elements: Actions, Effects, and Principals
        Specifying Conditions in a Policy
    Using Identity-Based Policies (IAM Policies)
        Permissions Required to Use the CloudWatch Console
        AWS Managed (Predefined) Policies for CloudWatch
        Customer Managed Policy Examples
    Amazon CloudWatch Permissions Reference
Logging API Calls
    CloudWatch Information in CloudTrail
    Understanding Log File Entries
Document History


What is Amazon CloudWatch?

Amazon CloudWatch monitors your Amazon Web Services (AWS) resources and the applications you run on AWS in real time. You can use CloudWatch to collect and track metrics, which are variables you can measure for your resources and applications. CloudWatch alarms send notifications or automatically make changes to the resources you are monitoring based on rules that you define. For example, you can monitor the CPU usage and disk reads and writes of your Amazon EC2 instances and then use this data to determine whether you should launch additional instances to handle increased load. You can also use this data to stop under-used instances to save money. In addition to monitoring the built-in metrics that come with AWS, you can monitor your own custom metrics. With CloudWatch, you gain system-wide visibility into resource utilization, application performance, and operational health.

Accessing CloudWatch

You can access CloudWatch using any of the following methods:

• Amazon CloudWatch console: https://console.aws.amazon.com/cloudwatch/

• AWS CLI: For more information, see Getting Set Up with the AWS Command Line Interface in the AWS Command Line Interface User Guide.

• CloudWatch API: For more information, see the Amazon CloudWatch API Reference.

• AWS SDKs: For more information, see Tools for Amazon Web Services.

Related AWS Services

The following services are used along with Amazon CloudWatch:

• Amazon Simple Notification Service (Amazon SNS) coordinates and manages the delivery or sending of messages to subscribing endpoints or clients. You use Amazon SNS with CloudWatch to send messages when an alarm threshold has been reached. For more information, see Set Up Amazon SNS Notifications (p. 177).

• Auto Scaling enables you to automatically launch or terminate Amazon EC2 instances based on user-defined policies, health status checks, and schedules. You can use a CloudWatch alarm with Auto Scaling to scale your EC2 instances based on demand. For more information, see Dynamic Scaling in the Auto Scaling User Guide.

• AWS CloudTrail enables you to monitor the calls made to the Amazon CloudWatch API for your account, including calls made by the AWS Management Console, AWS CLI, and other services. When CloudTrail logging is turned on, CloudWatch writes log files to the Amazon S3 bucket that you specified when you configured CloudTrail. For more information, see Logging Amazon CloudWatch API Calls in AWS CloudTrail (p. 211).

• AWS Identity and Access Management (IAM) is a web service that helps you securely control access to AWS resources for your users. Use IAM to control who can use your AWS resources (authentication) and what resources they can use in which ways (authorization). For more information, see Authentication and Access Control for Amazon CloudWatch (p. 194).

How Amazon CloudWatch Works

Amazon CloudWatch is basically a metrics repository. An AWS service—such as Amazon EC2—puts metrics into the repository, and you retrieve statistics based on those metrics. If you put your own custom metrics into the repository, you can retrieve statistics on these metrics as well.

You can use metrics to calculate statistics and then present the data graphically in the CloudWatch console. For more information about the other AWS resources that generate and send metrics to CloudWatch, see Amazon CloudWatch Metrics and Dimensions Reference (p. 45).

You can configure alarm actions to stop, terminate, reboot, or recover an Amazon EC2 instance when certain criteria are met. In addition, you can create alarms that initiate Auto Scaling and Amazon Simple Notification Service (Amazon SNS) actions on your behalf. For more information about creating CloudWatch alarms, see Alarms (p. 7).

AWS Cloud computing resources are housed in highly available data center facilities. To provide additional scalability and reliability, each data center facility is located in a specific geographical area, known as a region. Each region is designed to be completely isolated from the other regions, to achieve the greatest possible failure isolation and stability. Amazon CloudWatch does not aggregate data across regions. Therefore, metrics are completely separate between regions. For more information, see Regions and Endpoints in the Amazon Web Services General Reference.


Amazon CloudWatch Concepts

The following terminology and concepts are central to your understanding and use of Amazon CloudWatch:

• Namespaces (p. 3)

• Metrics (p. 3)

• Dimensions (p. 4)

• Statistics (p. 5)

• Percentiles (p. 7)

• Alarms (p. 7)

Namespaces

A namespace is a container for CloudWatch metrics. Metrics in different namespaces are isolated from each other, so that metrics from different applications are not mistakenly aggregated into the same statistics.

There is no default namespace. You must specify a namespace for each data point you publish to CloudWatch. You can specify a namespace name when you create a metric. These names must contain valid XML characters, and be fewer than 256 characters in length. Possible characters are: alphanumeric characters (0-9A-Za-z), period (.), hyphen (-), underscore (_), forward slash (/), hash (#), and colon (:).
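
As an illustration only, the naming rules above could be checked locally before publishing data. The following Python sketch encodes just the length and character rules stated in this section; the helper name is_valid_namespace is hypothetical and is not part of any AWS SDK.

```python
import re

# Characters listed above: 0-9 A-Z a-z . - _ / # :
_NAMESPACE_RE = re.compile(r"[0-9A-Za-z.\-_/#:]+")

# "Fewer than 256 characters" means at most 255.
MAX_NAMESPACE_LENGTH = 255


def is_valid_namespace(name: str) -> bool:
    """Return True if name meets the length and character rules above."""
    if not 0 < len(name) <= MAX_NAMESPACE_LENGTH:
        return False
    return _NAMESPACE_RE.fullmatch(name) is not None
```

For example, is_valid_namespace("AWS/EC2") is true, while a name containing a space is rejected.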

The AWS namespaces use the following naming convention: AWS/service. For example, Amazon EC2 uses the AWS/EC2 namespace. For the list of AWS namespaces, see AWS Namespaces (p. 46).

Metrics

Metrics are the fundamental concept in CloudWatch. A metric represents a time-ordered set of data points that are published to CloudWatch. Think of a metric as a variable to monitor, and the data points represent the values of that variable over time. For example, the CPU usage of a particular EC2 instance is one metric provided by Amazon EC2. The data points themselves can come from any application or business activity from which you collect data.

AWS services send metrics to CloudWatch, and you can send your own custom metrics to CloudWatch. You can add the data points in any order, and at any rate you choose. You can retrieve statistics about those data points as an ordered set of time-series data.

Metrics exist only in the region in which they are created. Metrics cannot be deleted, but they automatically expire after 15 months if no new data is published to them. Data points older than 15 months expire on a rolling basis; as new data points come in, data older than 15 months is dropped.

Metrics are uniquely defined by a name, a namespace, and zero or more dimensions. Each data point has a time stamp, and (optionally) a unit of measure. When you request statistics, the returned data stream is identified by namespace, metric name, dimensions, and (optionally) the unit.

For more information, see View Available Metrics (p. 26) and Publish Custom Metrics (p. 42).

Time Stamps

Each metric data point must be marked with a time stamp. The time stamp can be up to two weeks in the past and up to two hours into the future. If you do not provide a time stamp, CloudWatch creates a time stamp for you based on the time the data point was received.


Time stamps are dateTime objects, with the complete date plus hours, minutes, and seconds (for example, 2016-10-31T23:59:59Z). For more information, see dateTime. Although it is not required, we recommend that you use Coordinated Universal Time (UTC). When you retrieve statistics from CloudWatch, all times are in UTC.

CloudWatch alarms check metrics based on the current time in UTC. Custom metrics sent to CloudWatch with time stamps other than the current UTC time can cause alarms to display the Insufficient Data state or result in delayed alarms.
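
The time stamp rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how CloudWatch validates input: the helper names are hypothetical, and both arguments are assumed to be timezone-aware datetime objects.

```python
from datetime import datetime, timedelta, timezone

MAX_PAST = timedelta(weeks=2)    # up to two weeks in the past
MAX_FUTURE = timedelta(hours=2)  # up to two hours into the future


def timestamp_in_window(ts: datetime, now: datetime) -> bool:
    """Return True if ts falls inside the accepted window around now."""
    return now - MAX_PAST <= ts <= now + MAX_FUTURE


def to_utc_string(ts: datetime) -> str:
    """Format a timezone-aware datetime in UTC, as recommended above."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

Converting to UTC before publishing avoids the delayed or Insufficient Data alarm states described above.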

Metrics Retention

CloudWatch retains metric data as follows:

• Data points with a period of less than 60 seconds are available for 3 hours. These data points are high-resolution custom metrics.

• Data points with a period of 60 seconds (1 minute) are available for 15 days.

• Data points with a period of 300 seconds (5 minutes) are available for 63 days.

• Data points with a period of 3600 seconds (1 hour) are available for 455 days (15 months).

Data points that are initially published with a shorter period are aggregated together for long-term storage. For example, if you collect data using a period of 1 minute, the data remains available for 15 days with 1-minute resolution. After 15 days this data is still available, but is aggregated and is retrievable only with a resolution of 5 minutes. After 63 days, the data is further aggregated and is available with a resolution of 1 hour.

CloudWatch started retaining 5-minute and 1-hour metric data as of 9 July 2016.
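
The retention schedule above can be expressed as a small lookup. The following Python sketch is only an illustration of the tiers listed in this section; the function name is hypothetical, and the 1-second entry is an assumption that stands in for all sub-minute periods.

```python
from datetime import timedelta

# (period in seconds, how long data at that period stays available)
RETENTION_TIERS = [
    (1, timedelta(hours=3)),      # sub-minute, high-resolution custom metrics
    (60, timedelta(days=15)),
    (300, timedelta(days=63)),
    (3600, timedelta(days=455)),  # 15 months
]


def finest_period_available(age: timedelta, published_period: int = 60):
    """Return the finest period (seconds) retrievable for data of this age.

    Returns None if the data is older than the longest retention tier.
    """
    for period, keep_for in RETENTION_TIERS:
        # Data cannot be retrieved at a finer period than it was published.
        if period >= published_period and age <= keep_for:
            return period
    return None
```

For data published at a 1-minute period, a data point that is 20 days old is retrievable only at 5-minute resolution, which matches the example in the text.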

Dimensions

A dimension is a name/value pair that uniquely identifies a metric. You can assign up to 10 dimensions to a metric.

Every metric has specific characteristics that describe it, and you can think of dimensions as categories for those characteristics. Dimensions help you design a structure for your statistics plan. Because dimensions are part of the unique identifier for a metric, whenever you add a unique name/value pair to one of your metrics, you are creating a new variation of that metric.

AWS services that send data to CloudWatch attach dimensions to each metric. You can use dimensions to filter the results that CloudWatch returns. For example, you can get statistics for a specific EC2 instance by specifying the InstanceId dimension when you search for metrics.

For metrics produced by certain AWS services, such as Amazon EC2, CloudWatch can aggregate data across dimensions. For example, if you search for metrics in the AWS/EC2 namespace but do not specify any dimensions, CloudWatch aggregates all data for the specified metric to create the statistic that you requested. CloudWatch does not aggregate across dimensions for your custom metrics.

Dimension Combinations

CloudWatch treats each unique combination of dimensions as a separate metric, even if the metrics have the same metric name. You can only retrieve statistics using combinations of dimensions that you specifically published. When you retrieve statistics, specify the same values for the namespace, metric name, and dimension parameters that were used when the metrics were created. You can also specify the start and end times for CloudWatch to use for aggregation.

For example, suppose that you publish four distinct metrics named ServerStats in the DataCenterMetric namespace with the following properties:


Dimensions: Server=Prod, Domain=Frankfurt, Unit: Count, Timestamp: 2016-10-31T12:30:00Z, Value: 105

Dimensions: Server=Beta, Domain=Frankfurt, Unit: Count, Timestamp: 2016-10-31T12:31:00Z, Value: 115

Dimensions: Server=Prod, Domain=Rio, Unit: Count, Timestamp: 2016-10-31T12:32:00Z, Value: 95

Dimensions: Server=Beta, Domain=Rio, Unit: Count, Timestamp: 2016-10-31T12:33:00Z, Value: 97

If you publish only those four metrics, you can retrieve statistics for these combinations of dimensions:

• Server=Prod,Domain=Frankfurt

• Server=Prod,Domain=Rio

• Server=Beta,Domain=Frankfurt

• Server=Beta,Domain=Rio

You can't retrieve statistics for the following dimensions or if you specify no dimensions:

• Server=Prod

• Server=Beta

• Domain=Frankfurt

• Domain=Rio
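
The exact-match behavior described in this example can be modeled with a dictionary keyed by the full dimension set. This Python sketch is a simplified in-memory model for illustration, not the CloudWatch API; metric_key and get_values are hypothetical helper names.

```python
def metric_key(namespace, name, dimensions):
    """Build an identity for a metric; dimension order does not matter."""
    return (namespace, name, frozenset(dimensions.items()))


# The four ServerStats data points from the example above.
published = {}
for server, domain, value in [
    ("Prod", "Frankfurt", 105),
    ("Beta", "Frankfurt", 115),
    ("Prod", "Rio", 95),
    ("Beta", "Rio", 97),
]:
    key = metric_key("DataCenterMetric", "ServerStats",
                     {"Server": server, "Domain": domain})
    published.setdefault(key, []).append(value)


def get_values(namespace, name, dimensions):
    """Return values only for an exact dimension combination."""
    return published.get(metric_key(namespace, name, dimensions), [])
```

Looking up Server=Prod alone, or no dimensions at all, returns an empty list because no metric was published with exactly that combination.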

Statistics

Statistics are metric data aggregations over specified periods of time. CloudWatch provides statistics based on the metric data points provided by your custom data or provided by other AWS services to CloudWatch. Aggregations are made using the namespace, metric name, dimensions, and the data point unit of measure, within the time period you specify. The following table describes the available statistics.

Statistic: Description

Minimum: The lowest value observed during the specified period. You can use this value to determine low volumes of activity for your application.

Maximum: The highest value observed during the specified period. You can use this value to determine high volumes of activity for your application.

Sum: All values submitted for the matching metric added together. This statistic can be useful for determining the total volume of a metric.

Average: The value of Sum / SampleCount during the specified period. By comparing this statistic with the Minimum and Maximum, you can determine the full scope of a metric and how close the average use is to the Minimum and Maximum. This comparison helps you to know when to increase or decrease your resources as needed.

SampleCount: The count (number) of data points used for the statistical calculation.

pNN.NN: The value of the specified percentile. You can specify any percentile, using up to two decimal places (for example, p95.45). For more information, see Percentiles (p. 7).

You can add pre-calculated statistics. Instead of data point values, you specify values for SampleCount, Minimum, Maximum, and Sum (CloudWatch calculates the average for you). The values you add in this way are aggregated with any other values associated with the matching metric.
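
To make the relationship between these statistics concrete, the following Python sketch computes them from raw values and merges two pre-calculated sets. It is a local illustration only; the StatisticSet class here is a hypothetical stand-in and is not an SDK type.

```python
from dataclasses import dataclass


@dataclass
class StatisticSet:
    sample_count: float
    minimum: float
    maximum: float
    total: float  # the Sum statistic

    @property
    def average(self) -> float:
        # Average is derived as Sum / SampleCount, as in the table above.
        return self.total / self.sample_count

    @classmethod
    def from_values(cls, values):
        """Summarize a non-empty list of raw data point values."""
        return cls(len(values), min(values), max(values), sum(values))

    def merge(self, other: "StatisticSet") -> "StatisticSet":
        """Aggregate two sets that belong to the same metric and period."""
        return StatisticSet(
            self.sample_count + other.sample_count,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
            self.total + other.total,
        )
```

Note that merging keeps the four stored values exact, while percentiles cannot be recovered from them.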


Units

Each statistic has a unit of measure. Example units include Bytes, Seconds, Count, and Percent. For the complete list of the units that CloudWatch supports, see the MetricDatum data type in the Amazon CloudWatch API Reference.

You can specify a unit when you create a custom metric. If you do not specify a unit, CloudWatch uses None as the unit. Units help provide conceptual meaning to your data. Though CloudWatch attaches no significance to a unit internally, other applications can derive semantic information based on the unit.

Metric data points that specify a unit of measure are aggregated separately. When you get statistics without specifying a unit, CloudWatch aggregates all data points of the same unit together. If you have two otherwise identical metrics with different units, two separate data streams are returned, one for each unit.

Periods

A period is the length of time associated with a specific Amazon CloudWatch statistic. Each statistic represents an aggregation of the metrics data collected for a specified period of time. Periods are defined in numbers of seconds, and valid values for period are 1, 5, 10, 30, or any multiple of 60. For example, to specify a period of six minutes, use 360 as the period value. You can adjust how the data is aggregated by varying the length of the period. A period can be as short as one second or as long as one day (86,400 seconds). The default value is 60 seconds.

Only custom metrics that you define with a storage resolution of 1 second support sub-minute periods. Even though the option to set a period below 60 is always available in the console, you should select a period that aligns to how the metric is stored. For more information about metrics that support sub-minute periods, see High-Resolution Metrics (p. 42).

When you retrieve statistics, you can specify a period, start time, and end time. These parameters determine the overall length of time associated with the statistics. The default values for the start time and end time get you the last hour's worth of statistics. The values that you specify for the start time and end time determine how many periods CloudWatch returns. For example, retrieving statistics using the default values for the period, start time, and end time returns an aggregated set of statistics for each minute of the previous hour. If you prefer statistics aggregated in ten-minute blocks, specify a period of 600. For statistics aggregated over the entire hour, specify a period of 3600.
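
The period rules and the one-hour examples above can be checked with a short calculation. This Python sketch is an illustration under assumptions: the helper names are hypothetical, times are plain epoch seconds, and a partial trailing period is assumed to count as one period.

```python
import math

SUB_MINUTE_PERIODS = {1, 5, 10, 30}
MAX_PERIOD = 86400  # one day, in seconds


def is_valid_period(period: int) -> bool:
    """Valid periods are 1, 5, 10, 30, or any multiple of 60 up to one day."""
    if period in SUB_MINUTE_PERIODS:
        return True
    return 60 <= period <= MAX_PERIOD and period % 60 == 0


def periods_in_range(start: int, end: int, period: int = 60) -> int:
    """Count the periods needed to cover the range from start to end."""
    if not is_valid_period(period):
        raise ValueError(f"invalid period: {period}")
    if end <= start:
        raise ValueError("end must be after start")
    return math.ceil((end - start) / period)
```

With a one-hour range, the default period of 60 gives 60 periods, a period of 600 gives 6, and a period of 3600 gives 1.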

Periods are also important for CloudWatch alarms. When you create an alarm to monitor a specific metric, you are asking CloudWatch to compare that metric to the threshold value that you specified. You have extensive control over how CloudWatch makes that comparison. Not only can you specify the period over which the comparison is made, but you can also specify how many evaluation periods are used to arrive at a conclusion. For example, if you specify three evaluation periods, CloudWatch compares a window of three data points. CloudWatch only notifies you if the oldest data point is breaching and the others are breaching or missing. For metrics that are continuously emitted, CloudWatch doesn't notify you until three failures are found.
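
The evaluation rule in the preceding paragraph can be read literally as the following Python sketch. It is a simplified model, not the actual alarm engine: the function name is hypothetical, a missing data point is represented by None, and "breaching" is assumed to mean a value greater than the threshold.

```python
def should_notify(window, threshold):
    """Apply the rule above to a window of data points, oldest first.

    window: list of numbers, with None marking a missing data point.
    """
    if not window:
        return False
    oldest = window[0]
    # The oldest data point must be present and breaching.
    if oldest is None or oldest <= threshold:
        return False
    # Every later data point must be breaching or missing.
    return all(v is None or v > threshold for v in window[1:])
```

For a continuously emitted metric, no value in the window is None, so the function returns True only when all three data points breach the threshold.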

Aggregation

Amazon CloudWatch aggregates statistics according to the period length that you specify when retrieving statistics. You can publish as many data points as you want with the same or similar time stamps. CloudWatch aggregates them by period length. Aggregated statistics are only available when using detailed monitoring. In addition, Amazon CloudWatch does not aggregate data across regions.

You can publish data points for a metric that share not only the same time stamp, but also the same namespace and dimensions. CloudWatch returns aggregated statistics for those data points. You can also publish multiple data points for the same or different metrics, with any time stamp.

For large datasets, you can insert a pre-aggregated dataset called a statistic set. With statistic sets, you give CloudWatch the Min, Max, Sum, and SampleCount for a number of data points. This is commonly used when you need to collect data many times in a minute. For example, suppose you have a metric for the request latency of a webpage. It doesn't make sense to publish data with every webpage hit. We suggest that you collect the latency of all hits to that webpage, aggregate them once a minute, and send that statistic set to CloudWatch.

Amazon CloudWatch doesn't differentiate the source of a metric. If you publish a metric with the same namespace and dimensions from different sources, CloudWatch treats this as a single metric. This can be useful for service metrics in a distributed, scaled system. For example, all the hosts in a web server application could publish identical metrics representing the latency of requests they are processing. CloudWatch treats these as a single metric, allowing you to get the statistics for minimum, maximum, average, and sum of all requests across your application.
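
The once-a-minute aggregation suggested above for webpage latency might look like the following Python sketch. It only prepares the statistic sets locally and does not call CloudWatch; the function name is hypothetical, and time stamps are assumed to be integer epoch seconds.

```python
from collections import defaultdict


def per_minute_statistic_sets(samples):
    """Group (epoch_seconds, latency) samples into one set per minute.

    Returns a dict that maps each minute's start time to the four values
    a statistic set carries: SampleCount, Minimum, Maximum, and Sum.
    """
    buckets = defaultdict(list)
    for ts, latency in samples:
        buckets[ts - ts % 60].append(latency)  # start of the minute
    return {
        start: {
            "SampleCount": len(values),
            "Minimum": min(values),
            "Maximum": max(values),
            "Sum": sum(values),
        }
        for start, values in sorted(buckets.items())
    }
```

Each resulting entry could then be published as a single data point instead of one data point per webpage hit.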

Percentiles

A percentile indicates the relative standing of a value in a dataset. For example, the 95th percentile means that 95 percent of the data is lower than this value and 5 percent of the data is higher than this value. Percentiles help you get a better understanding of the distribution of your metric data. You can use percentiles with the following services:

• Amazon EC2

• Amazon RDS

• Kinesis

• Application Load Balancer

• Elastic Load Balancing

• API Gateway

Percentiles are often used to isolate anomalies. In a normal distribution, 95 percent of the data is within two standard deviations from the mean and 99.7 percent of the data is within three standard deviations from the mean. Any data that falls outside three standard deviations is often considered to be an anomaly because it differs so greatly from the average value. For example, suppose that you are monitoring the CPU utilization of your EC2 instances to ensure that your customers have a good experience. If you monitor the average, this can hide anomalies. If you monitor the maximum, a single anomaly can skew the results. Using percentiles, you can monitor the 95th percentile of CPU utilization to check for instances with an unusually heavy load.

You can monitor your system and applications using percentiles as you would use the other CloudWatch statistics (Average, Minimum, Maximum, and Sum). For example, when you create an alarm, you can use percentiles as the statistical function. You can specify the percentile with up to two decimal places (for example, p95.45).

CloudWatch needs raw data points to calculate percentiles. If you publish data using a statistic set instead, you can only retrieve percentile statistics for this data when one of the following conditions is true:

• The SampleCount of the statistic set is 1.

• The Min and the Max of the statistic set are equal.
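
As a rough illustration of these ideas, the following Python sketch computes a percentile from raw data points and encodes the two statistic-set conditions listed above. The nearest-rank method used here is an assumption chosen for simplicity; this guide does not state which interpolation method CloudWatch uses, and both function names are hypothetical.

```python
import math


def percentile(values, p):
    """Nearest-rank percentile of raw data points, for 0 < p <= 100."""
    if not values:
        raise ValueError("at least one data point is required")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


def statistic_set_supports_percentiles(sample_count, minimum, maximum):
    """A statistic set yields percentiles only in the two cases above."""
    return sample_count == 1 or minimum == maximum
```

In both supported cases every data point in the set has the same known value, which is why a percentile can still be determined.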

Alarms

You can use an alarm to automatically initiate actions on your behalf. An alarm watches a single metric over a specified time period, and performs one or more specified actions, based on the value of the metric relative to a threshold over time. The action is a notification sent to an Amazon SNS topic or an Auto Scaling policy. You can also add alarms to dashboards.


Alarms invoke actions for sustained state changes only. CloudWatch alarms do not invoke actions simply because they are in a particular state. The state must have changed and been maintained for a specified number of periods.

When creating an alarm, select a period that is greater than or equal to the frequency of the metric to be monitored. For example, basic monitoring for Amazon EC2 provides metrics for your instances every 5 minutes. When setting an alarm on a basic monitoring metric, select a period of at least 300 seconds (5 minutes). Detailed monitoring for Amazon EC2 provides metrics for your instances every 1 minute. When setting an alarm on a detailed monitoring metric, select a period of at least 60 seconds (1 minute).

If you set an alarm on a high-resolution metric, you can specify a high-resolution alarm with a period of 10 seconds or 30 seconds, or you can set a regular alarm with a period of any multiple of 60 seconds. There is a higher charge for high-resolution alarms. For more information about high-resolution metrics, see Publish Custom Metrics (p. 42).


For more information, see Creating Amazon CloudWatch Alarms (p. 174) and Create an Alarm from a Metric on a Graph (p. 41).
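
The period guidance for alarms in this section can be summarized as a simple check. This Python sketch is illustrative only: the function name and parameters are hypothetical, and metric_interval is assumed to be the number of seconds between data points of the monitored metric.

```python
def is_suitable_alarm_period(period, metric_interval,
                             high_resolution_metric=False):
    """Check an alarm period against the guidance in this section."""
    if period in (10, 30):
        # High-resolution alarms apply only to high-resolution metrics.
        return high_resolution_metric
    # Regular alarms use a multiple of 60 seconds that is no shorter
    # than the interval at which the metric is reported.
    return period >= 60 and period % 60 == 0 and period >= metric_interval
```

For example, a 60-second period is too short for a basic monitoring metric reported every 300 seconds, but it suits a detailed monitoring metric reported every 60 seconds.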

CloudWatch Limits

CloudWatch has the following limits:

Resource: Default Limit

Actions: 5/alarm. This limit cannot be changed.

Alarms: 10/month/customer for free. 5000 per region per account.

API requests: 1,000,000/month/customer for free.

Custom metrics: No limit.

DescribeAlarms: 9 transactions per second (TPS). The maximum number of operation requests you can make per second without being throttled. You can request a limit increase.

Dimensions: 10/metric. This limit cannot be changed.

GetMetricStatistics:

ListMetrics:

Metric data: 15 months. This limit cannot be changed.

MetricDatum items: 20/PutMetricData request. A MetricDatum object can contain a single value or a StatisticSet object representing many values. This limit cannot be changed.

Metrics: 10/month/customer for free.

Period: Maximum value is one day (86,400 seconds). This limit cannot be changed.

PutMetricAlarm request:

PutMetricData request: 40 KB for HTTP POST requests. PutMetricData can handle 150 transactions per second (TPS), which is the maximum number of operation requests you can make per second without being throttled.

Amazon SNS email notifications: 1,000/month/customer for free.
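
One practical consequence of the MetricDatum limit above is that a large list of data points must be split across several PutMetricData requests. The following Python sketch shows one way to batch the list locally. The function name and the RequestLatency metric name are hypothetical, and the sketch does not check the separate 40 KB request size limit.

```python
MAX_DATUMS_PER_REQUEST = 20  # MetricDatum items per PutMetricData request


def batch_metric_data(datums, batch_size=MAX_DATUMS_PER_REQUEST):
    """Yield consecutive slices of datums, each within the item limit."""
    for i in range(0, len(datums), batch_size):
        yield datums[i:i + batch_size]


# Example: 45 hypothetical data points split into batches of 20, 20, and 5.
example = [{"MetricName": "RequestLatency", "Value": i} for i in range(45)]
batch_sizes = [len(batch) for batch in batch_metric_data(example)]
```

Each batch would then be sent in its own request, staying within the transactions-per-second limit listed for PutMetricData.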

Amazon CloudWatch Resources

The following related resources can help you as you work with this service.

Resource: Description

Amazon CloudWatch FAQs: The FAQ covers the top questions developers have asked about this product.

Release notes: The release notes give a high-level overview of the current release. They specifically note any new features, corrections, and known issues.

AWS Developer Resource Center: A central starting point to find documentation, code examples, release notes, and other information to help you build innovative applications with AWS.

AWS Management Console: The console allows you to perform most of the functions of Amazon CloudWatch and various other AWS offerings without programming.

Amazon CloudWatch Discussion Forums: Community-based forum for developers to discuss technical questions related to Amazon CloudWatch.

AWS Support: The hub for creating and managing your AWS Support cases. Also includes links to other helpful resources, such as forums, technical FAQs, service health status, and AWS Trusted Advisor.

Amazon CloudWatch product information: The primary webpage for information about Amazon CloudWatch.

Contact Us: A central contact point for inquiries concerning AWS billing, account, events, abuse, etc.


Getting Set Up

To use Amazon CloudWatch, you need an AWS account. Your AWS account allows you to use services (for example, Amazon EC2) to generate metrics that you can view in the CloudWatch console, a point-and-click web-based interface. In addition, you can install and configure the AWS command line interface (CLI).

Sign Up for Amazon Web Services (AWS)

When you create an AWS account, we automatically sign up your account for all AWS services. You pay only for the services that you use.

If you have an AWS account already, skip to the next step. If you don't have an AWS account, use the following procedure to create one.

To sign up for an AWS account

1. Open https://aws.amazon.com/, and then choose Create an AWS Account.

2. Follow the online instructions.

Part of the sign-up procedure involves receiving a phone call and entering a PIN using the phone keypad.

Sign in to the Amazon CloudWatch Console

To sign in to the Amazon CloudWatch console

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. If necessary, use the navigation bar to change the region to the region where you have your AWS resources.

3. Even if this is the first time you are using the CloudWatch console, Your Metrics could already report metrics, because you have used an AWS product that automatically pushes metrics to Amazon CloudWatch for free. Other AWS products require that you enable metrics.

If you do not have any alarms, the Your Alarms section will have a Create Alarm button.


Set Up the AWS CLI

You can use the AWS CLI or the Amazon CloudWatch CLI to perform CloudWatch commands. Note that the AWS CLI replaces the CloudWatch CLI; we include new CloudWatch features only in the AWS CLI.

For information about how to install and configure the AWS CLI, see Getting Set Up with the AWS Command Line Interface in the AWS Command Line Interface User Guide.

For information about how to install and configure the Amazon CloudWatch CLI, see Set Up the Command Line Interface in the Amazon CloudWatch CLI Reference.


Getting Started with Amazon CloudWatch

The following scenarios show you how to use Amazon CloudWatch. In the first scenario, you use the CloudWatch console to create a billing alarm that tracks your AWS usage and lets you know when you have exceeded a certain spending threshold. In the second, more advanced scenario, you use the AWS command line interface (CLI) to publish a single metric for a hypothetical application named GetStarted.

Scenarios

• Monitor Your Estimated Charges (p. 12)

• Publish Metrics (p. 15)

Scenario: Monitor Your Estimated Charges Using CloudWatch

In this scenario, you create an Amazon CloudWatch alarm to monitor your estimated charges. When you enable the monitoring of estimated charges for your AWS account, the estimated charges are calculated and sent several times daily to CloudWatch as metric data.

Billing metric data is stored in the US East (N. Virginia) Region and reflects worldwide charges. This data includes the estimated charges for every service in AWS that you use, as well as the estimated overall total of your AWS charges.

You can choose to receive alerts by email when charges have exceeded a certain threshold. These alerts are triggered by CloudWatch and messages are sent using Amazon Simple Notification Service (Amazon SNS).

Tasks

• Step 1: Enable Billing Alerts (p. 13)

• Step 2: Create a Billing Alarm (p. 13)

• Step 3: Check the Alarm Status (p. 14)

• Step 4: Edit a Billing Alarm (p. 15)

• Step 5: Delete a Billing Alarm (p. 15)

Step 1: Enable Billing Alerts

Before you can create an alarm for your estimated charges, you must enable billing alerts, so that you can monitor your estimated AWS charges and create an alarm using billing metric data. After you enable billing alerts, you cannot disable data collection, but you can delete any billing alarms you created.

After you enable billing alerts for the first time, it takes about 15 minutes before you can view billing data and set billing alarms.

Requirements

• You must be signed in using root account credentials; IAM users cannot enable billing alerts for your AWS account.

• For consolidated billing accounts, billing data for each linked account can be found by logging in as the paying account. You can view billing data for total estimated charges and estimated charges by service for each linked account as well as for the consolidated account.

To enable monitoring of your estimated charges

1. Open the Billing and Cost Management console at https://console.aws.amazon.com/billing/home?#.

2. In the navigation pane, choose Preferences.

3. Select Receive Billing Alerts.

4. Choose Save preferences.

Step 2: Create a Billing Alarm

After you've enabled billing alerts, you can create a billing alarm. In this scenario, you create an alarm that sends an email message when your estimated charges for AWS exceed a specified threshold.

Note

This procedure uses the simple options. To use the advanced options, see Create a Billing Alarm (p. 192) in Create a Billing Alarm to Monitor Your Estimated AWS Charges.


To create a billing alarm

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. If necessary, change the region to US East (N. Virginia). Billing metric data is stored in this region and reflects worldwide charges.

3. In the navigation pane, choose Alarms, Billing.

4. For Whenever my total AWS charges for the month exceed, specify the monetary amount (for example, 200) that must be exceeded to trigger the alarm and send an email notification.

Tip

Under Alarm Preview, there is an estimate of your charges that you can use to set an appropriate amount.

5. For send a notification to, choose an existing notification list or create a new one.

To create a list, choose New list and type a comma-separated list of email addresses to be notified when the alarm changes to the ALARM state. Each email address will be sent a subscription confirmation email. The recipient must confirm the subscription before notifications can be sent to the email address.

6. Choose Create Alarm.
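
The console steps above also have an API counterpart. As a minimal sketch, assuming Python and the boto3 SDK (neither is part of this walkthrough), the following builds the request parameters for a comparable alarm. The alarm name, the six-hour period, the Maximum statistic, and the SNS topic ARN are illustrative assumptions; adjust them to your account.

```python
def billing_alarm_params(threshold_usd, topic_arn, alarm_name="BillingAlarm"):
    """Build keyword arguments for the CloudWatch PutMetricAlarm operation."""
    return {
        "AlarmName": alarm_name,                     # hypothetical name
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",                      # assumption for this sketch
        "Period": 21600,                             # six hours, in seconds (assumption)
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],                 # SNS topic to notify
    }


def create_billing_alarm(params):
    """Send the request. Requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the builder above runs without boto3

    # Billing metric data is stored in US East (N. Virginia).
    client = boto3.client("cloudwatch", region_name="us-east-1")
    client.put_metric_alarm(**params)


# Example: the 200 USD threshold used in step 4, with a placeholder topic ARN.
params = billing_alarm_params(
    200, "arn:aws:sns:us-east-1:123456789012:billing-alerts"
)
```

Call create_billing_alarm(params) only after replacing the placeholder topic ARN with one of your own.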

Step 3: Check the Alarm Status

Now, check the status of the billing alarm that you just created.

To check the alarm status





4. Select the check box next to the alarm. Note that until the subscription is confirmed, it is shown as "Pending confirmation". After the subscription is confirmed, refresh the console to show the updated status.

Step 4: Edit a Billing Alarm

Let's say that you want to increase the amount of money you spend with AWS each month from $200 to $400. You can edit your existing billing alarm and increase the monetary amount that must be exceeded before the alarm is triggered.

To edit a billing alarm



3. In the navigation pane, choose Alarms, Billing.

4. Select the check box next to the alarm and then choose Actions, Modify.

5. For Whenever my total AWS charges for the month exceed, specify the new amount that must be exceeded to trigger the alarm and send an email notification.

6. Choose Save Changes.

Step 5: Delete a Billing Alarm

You can delete your billing alarm if you no longer need it.

To delete a billing alarm




4. Select the check box next to the alarm and then choose Actions, Delete.

5. When prompted for confirmation, choose Yes, Delete.

Scenario: Publish Metrics to CloudWatch

In this scenario, you'll use the AWS Command Line Interface (AWS CLI) to publish a single metric for a hypothetical application named GetStarted. If you haven't already installed and configured the AWS CLI, see Getting Set Up with the AWS Command Line Interface in the AWS Command Line Interface User Guide.

Tasks

• Step 1: Define the Data Configuration

• Step 2: Add Metrics to CloudWatch

• Step 3: Get Statistics from CloudWatch

• Step 4: View Graphs with the Console


Step 1: Define the Data Configuration

In this scenario, you'll publish data points that track the request latency for the application. Choose names for your metric and namespace that make sense to you. For this example, name the metric RequestLatency and place all of the data points into the GetStarted namespace.

You'll publish several data points that collectively represent three hours of latency data. The raw data comprises fifteen request latency readings distributed over three hours. Each reading is in milliseconds:

• Hour one: 87, 51, 125, 235

• Hour two: 121, 113, 189, 65, 89

• Hour three: 100, 47, 133, 98, 100, 328

You can publish data to CloudWatch as single data points or as an aggregated set of data points called a statistic set. You can aggregate metrics to a granularity as low as one minute. You can publish the aggregated data points to CloudWatch as a set of statistics with four predefined keys: Sum, Minimum, Maximum, and SampleCount.

You'll publish the data points from hour one as single data points. For the data from hours two and three, you'll aggregate the data points and publish a statistic set for each hour. The key values are shown in the following table.

Hour   Raw Data                      Sum   Minimum   Maximum   SampleCount
1      87
1      51
1      125
1      235
2      121, 113, 189, 65, 89         577   65        189       5
3      100, 47, 133, 98, 100, 328    806   47        328       6
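
The statistic sets in the table can be reproduced with a few lines of arithmetic. The following is a minimal sketch in Python; the function name statistic_set is a hypothetical helper, not part of any AWS tool.

```python
def statistic_set(values):
    """Aggregate raw readings into the four statistic-set keys."""
    if not values:
        raise ValueError("a statistic set needs at least one value")
    return {
        "Sum": sum(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "SampleCount": len(values),
    }


# Raw latency readings (milliseconds) for hours two and three.
hour_two = [121, 113, 189, 65, 89]
hour_three = [100, 47, 133, 98, 100, 328]

for label, readings in (("Hour two", hour_two), ("Hour three", hour_three)):
    print(label, statistic_set(readings))
```

The printed values should match the Sum, Minimum, Maximum, and SampleCount columns for hours two and three.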

Step 2: Add Metrics to CloudWatch

After you have defined your data configuration, you are ready to add data.

To publish data points to CloudWatch

1. At a command prompt, run the following put-metric-data commands to add data for the first hour. Replace the example time stamp with a time stamp that is two hours in the past, in Coordinated Universal Time (UTC).

aws cloudwatch put-metric-data --metric-name RequestLatency --namespace GetStarted \
--timestamp 2016-10-14T20:30:00Z --value 87 --unit Milliseconds
aws cloudwatch put-metric-data --metric-name RequestLatency --namespace GetStarted \
--timestamp 2016-10-14T20:30:00Z --value 51 --unit Milliseconds
aws cloudwatch put-metric-data --metric-name RequestLatency --namespace GetStarted \
--timestamp 2016-10-14T20:30:00Z --value 125 --unit Milliseconds
aws cloudwatch put-metric-data --metric-name RequestLatency --namespace GetStarted \
--timestamp 2016-10-14T20:30:00Z --value 235 --unit Milliseconds

2. Add data for the second hour, using a time stamp that is one hour later than the first hour.

aws cloudwatch put-metric-data --metric-name RequestLatency --namespace GetStarted \
--timestamp 2016-10-14T21:30:00Z --statistic-values Sum=577,Minimum=65,Maximum=189,SampleCount=5 --unit Milliseconds

3. Add data for the third hour, omitting the time stamp to default to the current time.

aws cloudwatch put-metric-data --metric-name RequestLatency --namespace GetStarted \

--statistic-values Sum=806,Minimum=47,Maximum=328,SampleCount=6 --unit Milliseconds
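
The time stamps in steps 1 and 2 are relative to the current time: two hours ago for hour one, one hour ago for hour two, and the default (now) for hour three. As a small sketch, assuming Python is available, the following computes the two --timestamp values in the UTC format used above. The function name scenario_timestamps is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def scenario_timestamps(now=None):
    """Return (hour_one, hour_two) time stamps as ISO 8601 UTC strings.

    `now` must be a timezone-aware datetime; it defaults to the current time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    now = now.astimezone(timezone.utc)
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    hour_one = (now - timedelta(hours=2)).strftime(fmt)
    hour_two = (now - timedelta(hours=1)).strftime(fmt)
    return hour_one, hour_two


hour_one, hour_two = scenario_timestamps()
print("--timestamp for hour one:", hour_one)
print("--timestamp for hour two:", hour_two)
```

Paste the printed values in place of the example time stamps in the put-metric-data commands.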

Step 3: Get Statistics from CloudWatch

Now that you have published metrics to CloudWatch, you can retrieve statistics based on those metrics using the get-metric-statistics command as follows. Be sure to specify --start-time and --end-time far enough in the past to cover the earliest time stamp that you published.

aws cloudwatch get-metric-statistics --namespace GetStarted --metric-name RequestLatency --statistics Average \
--start-time 2016-10-14T00:00:00Z --end-time 2016-10-15T00:00:00Z --period 60

The following is example output:

{

"Datapoints": [],

"Label": "Request:Latency"

}
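
When data points are returned, the Average statistic for a period is the sum of the published values divided by the sample count. As a rough check under that assumption, the following Python sketch computes the averages you would expect for the three hours of scenario data. The helper names are hypothetical.

```python
def average_from_points(points):
    """Average of individually published data points."""
    return sum(points) / len(points)


def average_from_statistic_set(stat_set):
    """Average derived from a statistic set: Sum / SampleCount."""
    return stat_set["Sum"] / stat_set["SampleCount"]


hour_one_points = [87, 51, 125, 235]
hour_two_set = {"Sum": 577, "Minimum": 65, "Maximum": 189, "SampleCount": 5}
hour_three_set = {"Sum": 806, "Minimum": 47, "Maximum": 328, "SampleCount": 6}

print(round(average_from_points(hour_one_points), 2))
print(round(average_from_statistic_set(hour_two_set), 2))
print(round(average_from_statistic_set(hour_three_set), 2))
```

Because the four hour-one values share one time stamp, they fall into the same one-minute period and are averaged together.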

Step 4: View Graphs with the Console

After you have published metrics to CloudWatch, you can use the CloudWatch console to view statistical graphs.

To view graphs of your statistics on the console

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.


2. In the navigation pane, choose Metrics.

3. In the All metrics tab, in the search box, type RequestLatency and press Enter.

4. Select the check box for the RequestLatency metric. A graph of the metric data is displayed in the upper pane.

For more information, see Graph Metrics (p. 37).

Using Amazon CloudWatch Dashboards

Amazon CloudWatch dashboards are customizable home pages in the CloudWatch console that you can use to monitor your resources in a single view, even those resources that are spread across different regions. You can use CloudWatch dashboards to create customized views of the metrics and alarms for your AWS resources.

With dashboards, you can create the following:

• A single view for selected metrics and alarms to help you assess the health of your resources and applications across one or more regions. You can select the color used for each metric on each graph, so that you can easily track the same metric across multiple graphs.

• An operational playbook that provides guidance for team members during operational events about how to respond to specific incidents.

• A common view of critical resource and application measurements that can be shared by team members for faster communication flow during operational events.

You can create dashboards by using the console, the AWS CLI, or the PutDashboard API.

Contents

• Create a CloudWatch Dashboard

• Add or Remove a Graph from a CloudWatch Dashboard

• Move or Resize a Graph on a CloudWatch Dashboard

• Edit a Graph on a CloudWatch Dashboard

• Rename a Graph on a CloudWatch Dashboard

• Add or Remove a Text Widget from a CloudWatch Dashboard

• Add or Remove an Alarm from a CloudWatch Dashboard

• Monitor Resources in Multiple Regions Using a CloudWatch Dashboard

• Link and Unlink Graphs on a CloudWatch Dashboard

• Change the Refresh Interval for the CloudWatch Dashboard

• Change the Time Range or Format of a CloudWatch Dashboard


Create a CloudWatch Dashboard

To get started with CloudWatch dashboards, you must first create a dashboard. You can create multiple dashboards. You can have up to 500 dashboards in your AWS account. All dashboards are global, not region-specific.

The steps in this section are for creating a dashboard using the console. You can also create a dashboard with the PutDashboard API, which uses a JSON string to define the dashboard contents. To create a dashboard using PutDashboard and base this dashboard on an existing dashboard, choose Actions, View/edit source to display and copy the JSON string of a current dashboard to use for your new dashboard.

For more information about creating a dashboard using the API, see PutDashboard in the Amazon CloudWatch API Reference.
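
To make the JSON string concrete, here is a minimal sketch in Python of a dashboard body with one metric widget. The widget layout values, the title, the dashboard name, and the instance ID are illustrative assumptions; check the field names against the dashboard body reference before relying on them. The boto3 call is shown only as one possible way to submit the string.

```python
import json


def metric_widget(title, metrics, region, x=0, y=0, width=12, height=6):
    """Describe one line-graph widget as a plain dictionary."""
    return {
        "type": "metric",
        "x": x, "y": y, "width": width, "height": height,
        "properties": {
            "title": title,
            "metrics": metrics,   # each entry: [namespace, metric, dim name, dim value]
            "region": region,
            "stat": "Average",
            "period": 300,
        },
    }


def dashboard_body(widgets):
    """Serialize the widgets into the JSON string that PutDashboard expects."""
    return json.dumps({"widgets": widgets})


def put_dashboard(name, body):
    """Submit the dashboard. Requires boto3 and configured AWS credentials."""
    import boto3  # imported lazily so the builders above run without boto3

    boto3.client("cloudwatch").put_dashboard(DashboardName=name, DashboardBody=body)


body = dashboard_body([
    metric_widget(
        "EC2 CPU (example)",
        [["AWS/EC2", "CPUUtilization", "InstanceId", "i-1234567890abcdef0"]],
        "us-west-2",
    )
])
print(body)
```

Copying the JSON from View/edit source, as described above, is usually the quickest way to get a known-good starting point.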

To create a dashboard using the console

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.


2. In the navigation pane, choose Dashboards, Create dashboard.

3. In the Create new dashboard dialog box, type a name for the dashboard and choose Create dashboard.

4. Do one of the following in the Add to this dashboard dialog box:

• To add a graph to your dashboard, choose Line or Stacked area and then choose Configure. In the Add metric graph dialog box, select the metrics to graph and choose Create widget.

• To add a number displaying a metric to the dashboard, choose Number, Configure. In the Add metric graph dialog box, select the metrics to graph and choose Create widget.

• To add a text block to your dashboard, choose Text, Configure. In the New text widget dialog box, for Markdown, add and format your text using Markdown, and then choose Create widget.

5. Optionally, choose Add widget and repeat step 4 to add another widget to the dashboard. You can repeat this step as many times as you want.

6. Choose Save dashboard.

Add or Remove a Graph from a CloudWatch Dashboard

You can add graphs containing one or more metrics to your dashboard for the resources you monitor. You can remove the graphs when they're no longer needed.

To add a graph to a dashboard

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.


2. In the navigation pane, choose Dashboards and select a dashboard.

3. Choose Add widget.

4. Choose either Line or Stacked area, and then choose Configure.

5. In the All metrics tab, select the metrics to graph.

6. (Optional) As you choose metrics to graph, you can change their color on the graph. To do so, choose Graphed metrics and select the color square next to the metric to display a color picker box. Choose another color square in the color picker, and then click outside the color picker to see your new color on the graph. Alternatively, in the color picker, you can type the six-digit standard HTML hex color code for the color you want and press ENTER.


7. Horizontal annotations can help dashboard users quickly see when a metric has spiked to a certain level, or whether the metric is within a predefined range. To add a horizontal annotation, choose Graph options, Add horizontal annotation:

a. For Label, type a label for the annotation.

b. For Value, type the metric value where the horizontal annotation appears.

c. For Fill, specify whether to use fill shading with this annotation. For example, choose Above or Below for the corresponding area to be filled. If you specify Between, another Value field appears, and the area of the graph between the two values is filled.

d. For Axis, specify whether the numbers in Value refer to the metric associated with the left Y-axis or the right Y-axis, if the graph includes multiple metrics.

You can change the fill color of an annotation by choosing the color square in the left column of the annotation.

Repeat these steps to add multiple horizontal annotations to the same graph.

To hide an annotation, clear the check box in the left column for that annotation.

To delete an annotation, choose x in the Actions column.

8. (Optional) To view more information about the metric being graphed, hover over the legend.

9. (Optional) To change the widget type, hover over the title area of the graph and choose Widget actions, Widget type.

10. Choose Create widget.

11. Choose Save dashboard.
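
The Label, Value, Fill, and Axis fields from step 7 can also be expressed in the dashboard's JSON source. The following Python sketch shows one plausible shape for a single horizontal annotation. The key names (annotations, horizontal, label, value, fill, yAxis) reflect my understanding of the dashboard body format and should be treated as assumptions to verify against View/edit source on a real dashboard; the Between case is not covered here.

```python
def horizontal_annotation(label, value, fill=None, axis="left"):
    """Build one horizontal annotation entry (assumed key names)."""
    if fill not in (None, "above", "below"):
        raise ValueError("fill must be None, 'above', or 'below' in this sketch")
    if axis not in ("left", "right"):
        raise ValueError("axis must be 'left' or 'right'")
    annotation = {"label": label, "value": value, "yAxis": axis}
    if fill is not None:
        annotation["fill"] = fill
    return annotation


def with_annotations(widget_properties, annotations):
    """Return a copy of a widget's properties with annotations attached."""
    props = dict(widget_properties)
    props["annotations"] = {"horizontal": list(annotations)}
    return props


# Example: shade everything above 80 percent CPU on the left Y-axis.
props = with_annotations(
    {"title": "CPU (example)", "region": "us-west-2"},
    [horizontal_annotation("High CPU", 80, fill="above")],
)
print(props)
```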

To remove a graph from a dashboard



3. Hover over the title of the graph and choose Widget actions, Delete.

4. Choose Save dashboard. If you attempt to navigate away from the dashboard before you save your changes, you are prompted to either save or discard your changes.

Move or Resize a Graph on a CloudWatch Dashboard

You can arrange and resize graphs on your CloudWatch dashboard.

To move a graph on a dashboard



3. Hover over the title of the graph until the selection icon appears, and then select and drag the graph to a new location on the dashboard.


To resize a graph




3. To increase or decrease the size, hover over the graph and drag the lower right corner of the graph.


To enlarge a graph temporarily



3. Select the graph. Alternatively, hover over the title of the graph and choose Widget actions, Enlarge.

Edit a Graph on a CloudWatch Dashboard

You can edit a graph to change the title, statistic, or period, or to add or remove metrics. If you have multiple metrics displayed on a graph, you can reduce clutter by temporarily hiding the metrics that don't interest you.

To edit a graph on a dashboard



3. Hover over the title of the graph and choose Widget actions, Edit.

4. To change the graph's title, select the title, type a new title, and press ENTER.

5. In the lower half of the screen, in the Graphed metrics tab, you can change the colors, statistic, or period:

a. To change the color of one of the lines, select the color square next to the metric to display a color picker box. Choose another color in the color picker, and then click outside the color picker to see your new color on the graph. Alternatively, in the color picker, you can type the six-digit HTML hex color code for the color you want and press ENTER.

b. To change the statistic, choose Statistic in the lower half of the window, and choose the new statistic you want.

c. To change the time period, which is next to Statistic in the lower half of the window, choose Period, and then select another value. This new setting is used on the dashboard only if the period setting of the dashboard itself is set to Auto. Otherwise, the period setting of the dashboard overrides the period setting for individual widgets.

6. To add or edit horizontal annotations, choose Graph options:

a. To add a horizontal annotation, choose Add horizontal annotation.

b. For Label, type a label for the annotation.

c. For Value, type the metric value where the horizontal annotation appears.

d. For Fill, specify how to use fill shading with this annotation. For example, choose Above or Below for the corresponding area to be filled. If you specify Between, another Value field appears, and the area of the graph between the two values is filled.

e. For Axis, specify whether the numbers in Value refer to the metric associated with the left Y-axis or the right Y-axis, if the graph includes multiple metrics.

To hide an annotation, clear the check box in the left column for that annotation.

To delete an annotation, click the x in the Actions column.

7. When you're finished with your changes, choose Update widget.

To temporarily hide metrics on a graph on a dashboard



3. In the graph's footer, hover over the colored square in the legend. When it changes to an X, click it.

4. To restore the metric, choose the grayed out square and metric name.

Rename a Graph on a CloudWatch Dashboard

You can change the default name that CloudWatch assigns to a graph on your dashboard.

To rename a graph on a dashboard



3. Hover over the title of the graph and choose Widget actions, Edit.

4. On the Edit graph screen, near the top, choose the title of the graph.

5. For Title, type a new name, choose Ok (check mark), and then in the lower-right corner of the Edit graph screen, choose Update widget.

Add or Remove a Text Widget from a CloudWatch Dashboard

A text widget contains a block of text in Markdown format. You can add, edit, or remove text widgets from your CloudWatch dashboard.

To add a text widget to a dashboard



3. Choose Add widget.

4. Choose Text, Configure.

5. For Markdown, add and format your text using Markdown, and then choose Create widget.
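
In a dashboard's JSON source, a text widget is a widget whose content is a Markdown string. A minimal Python sketch follows, assuming the "text" widget type and a "markdown" property; the playbook text and the layout values are purely illustrative.

```python
import json


def text_widget(markdown, x=0, y=0, width=6, height=3):
    """Describe a text widget holding a block of Markdown."""
    return {
        "type": "text",
        "x": x, "y": y, "width": width, "height": height,
        "properties": {"markdown": markdown},
    }


# Hypothetical playbook note for an operations dashboard.
note = "\n".join([
    "# On-call notes",
    "",
    "* Check the **RequestLatency** graph first",
    "* Escalate if latency stays high for 15 minutes",
])

widget = text_widget(note)
print(json.dumps({"widgets": [widget]}, indent=2))
```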


To edit a text widget on a dashboard



3. Hover over the upper-right corner of the text block, and then choose Widget actions, Edit.

4. Update the text as needed, and then choose Update widget.



To remove a text widget from a dashboard



3. Hover over the upper-right corner of the text block and choose Widget actions, Delete.

4. Choose Save dashboard.

Add or Remove an Alarm from a CloudWatch Dashboard

You can add alarms that you have created to your dashboard. When an alarm is on a dashboard, it turns red when it is in the ALARM state.

To add an alarm to a dashboard

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.


2. In the navigation pane, choose Alarms, select the alarm to add, and then choose Add to Dashboard.

3. Select a dashboard, choose a widget type (Line, Stacked area, or Number), and then choose Add to dashboard.

4. To see your alarm on the dashboard, choose Dashboards in the navigation pane and select the dashboard.

5. (Optional) To temporarily make an alarm graph larger, select the graph.

6. (Optional) To change the widget type, hover over the title of the graph and choose Widget actions, Widget type.

To remove an alarm from a dashboard



3. Hover over the title of the graph and choose Widget actions, Delete.

4. Choose Save dashboard. If you attempt to navigate away from the dashboard before you save your changes, you are prompted to either save or discard your changes.

Monitor Resources in Multiple Regions Using a CloudWatch Dashboard

You can monitor AWS resources in multiple regions using a single CloudWatch dashboard. For example, you can create a dashboard that shows CPU utilization for an EC2 instance located in the us-west-2 region with your billing metrics, which are located in the us-east-1 region.

To monitor resources in multiple regions in one dashboard

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.


2. In the navigation pane, choose Metrics.


3. In the navigation bar, select a region.

4. Select the metrics to add to your dashboard.

5. For Actions, choose Add to dashboard.

6. For Add to, type a name for the new dashboard and choose Add to dashboard.

Alternatively, to add to an existing dashboard, choose Existing dashboard, select a dashboard, and then choose Add to dashboard.

7. To add metrics from another region, select the next region and repeat these steps.
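
In the dashboard's JSON source, the multi-region behavior comes down to each widget carrying its own region. The following Python sketch pairs the EC2 and billing metrics from the example above; the widget titles, layout values, and instance ID are illustrative assumptions.

```python
import json


def regional_widget(title, metric, region, x):
    """One metric widget pinned to a specific region."""
    return {
        "type": "metric",
        "x": x, "y": 0, "width": 12, "height": 6,
        "properties": {"title": title, "metrics": [metric], "region": region},
    }


widgets = [
    regional_widget(
        "EC2 CPU (us-west-2)",
        ["AWS/EC2", "CPUUtilization", "InstanceId", "i-1234567890abcdef0"],
        "us-west-2",
        x=0,
    ),
    regional_widget(
        "Estimated charges (us-east-1)",
        ["AWS/Billing", "EstimatedCharges", "Currency", "USD"],
        "us-east-1",
        x=12,
    ),
]

# Summarize which regions the dashboard draws from.
regions = sorted({w["properties"]["region"] for w in widgets})
print(regions)
print(json.dumps({"widgets": widgets})[:60] + "...")
```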


Link and Unlink Graphs on a CloudWatch Dashboard

You can link the graphs on your dashboard together, so that when you zoom in or zoom out on one graph, the other graphs zoom in or zoom out at the same time. You can unlink graphs to limit zoom to one graph.

To link the graphs on a dashboard



3. Choose Actions, Link graphs.

To unlink the graphs on a dashboard



3. Clear Actions, Link graphs.

Change the Refresh Interval for the CloudWatch Dashboard

You can change how often the data on your CloudWatch dashboard is refreshed or set it to automatically refresh.

To change the dashboard refresh interval



3. On the Refresh options menu (upper-right corner), choose 10 Seconds, 1 Minute, 2 Minutes, 5 Minutes, or 15 Minutes.

To automatically refresh the dashboard



3. Choose Refresh options, Auto refresh.

Change the Time Range or Format of a CloudWatch Dashboard

You can change the time range to display dashboard data over minutes, hours, days, or weeks. You can also change the time format to display dashboard data in UTC or local time.

Note

If you create a dashboard with graphs that contain close to 100 or more high-resolution metrics, we recommend that you set the time range to no longer than one hour, to ensure good dashboard performance. For more information about high-resolution metrics, see High-Resolution Metrics (p. 42).

To change the dashboard time range



3. Do one of the following:

• Select one of the predefined ranges shown, which span from 1 hour to 1 week: 1h, 3h, 12h, 1d, 3d, or 1w.

• Choose custom, Relative. Select one of the predefined ranges, which span from 1 minute to 15 months.

• Choose custom, Absolute. Use the calendar picker or the text fields to specify the time range.

To change the dashboard time format



3. Choose custom.

4. From the upper corner, choose UTC or Local timezone.

Using Amazon CloudWatch Metrics

Metrics are data about the performance of your systems. By default, several services provide free metrics for resources (such as Amazon EC2 instances, Amazon EBS volumes, and Amazon RDS DB instances).

You can also enable detailed monitoring for some resources, such as your Amazon EC2 instances, or publish your own application metrics. Amazon CloudWatch can load all the metrics in your account (both AWS resource metrics and application metrics that you provide) for search, graphing, and alarms.

Metric data is kept for a period of 15 months, enabling you to view both up-to-the-minute data and historical data.

Contents

• View Available Metrics

• Search for Available Metrics

• Get Statistics for a Metric

• Graph Metrics

• Publish Custom Metrics

View Available Metrics


Metrics are grouped first by namespace, and then by the various dimension combinations within each namespace. For example, you can view all EC2 metrics, EC2 metrics grouped by instance, or EC2 metrics grouped by Auto Scaling group.

Only the AWS services that you're using send metrics to Amazon CloudWatch.

To view available metrics by namespace and dimension using the console



3. Select a metric namespace (for example, EC2).




4. Select a metric dimension (for example, Per-Instance Metrics).

5. The All metrics tab displays all metrics for that dimension in the namespace. You can do the following:

a. To sort the table, use the column heading.

b. To graph a metric, select the check box next to the metric. To select all metrics, select the check box in the heading row of the table.

c. To filter by resource, choose the resource ID and then choose Add to search.

d. To filter by metric, choose the metric name and then choose Add to search.




To view available metrics by namespace, dimension, or metric using the AWS CLI

Use the list-metrics command to list CloudWatch metrics. For a list of all service namespaces, see AWS Namespaces (p. 46). For lists of the metrics and dimensions for each service, see Amazon CloudWatch Metrics and Dimensions Reference (p. 45).

The following example specifies the AWS/EC2 namespace to view all the metrics for Amazon EC2:

aws cloudwatch list-metrics --namespace AWS/EC2

The following is example output:

{
  "Metrics": [
    ...
    {
      "Namespace": "AWS/EC2",
      "Dimensions": [
        {
          "Name": "InstanceId",
          "Value": "i-1234567890abcdef0"
        }
      ],
      "MetricName": "NetworkOut"
    },
    {
      "Namespace": "AWS/EC2",
      "Dimensions": [
        {
          ...
        }
      ],
      "MetricName": "CPUUtilization"
    },
    {
      "Namespace": "AWS/EC2",
      "Dimensions": [
        {
          ...
        }
      ],
      "MetricName": "NetworkIn"
    },
    ...
  ]
}

To list all the available metrics for a specified resource

The following example specifies the AWS/EC2 namespace and the InstanceId dimension to view the results for the specified instance only.

aws cloudwatch list-metrics --namespace AWS/EC2 --dimensions Name=InstanceId,Value=i-1234567890abcdef0

To list a metric for all resources

The following example specifies the AWS/EC2 namespace and a metric name to view the results for the specified metric only.

aws cloudwatch list-metrics --namespace AWS/EC2 --metric-name CPUUtilization
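
If you save list-metrics output to a file or variable, the same two filters (by dimension and by metric name) can be applied locally. The following is a minimal Python sketch that works on the output structure shown above; the function name filter_metrics and the sample entries are hypothetical.

```python
def filter_metrics(metrics, metric_name=None, dimension=None):
    """Filter list-metrics style entries.

    metric_name: keep only entries with this MetricName (optional).
    dimension:   (name, value) tuple that must appear in Dimensions (optional).
    """
    selected = []
    for entry in metrics:
        if metric_name is not None and entry["MetricName"] != metric_name:
            continue
        if dimension is not None:
            pairs = {(d["Name"], d["Value"]) for d in entry.get("Dimensions", [])}
            if dimension not in pairs:
                continue
        selected.append(entry)
    return selected


# Sample entries in the shape of the "Metrics" list (illustrative values).
sample = [
    {"Namespace": "AWS/EC2", "MetricName": name,
     "Dimensions": [{"Name": "InstanceId", "Value": instance}]}
    for name, instance in [
        ("NetworkOut", "i-1234567890abcdef0"),
        ("CPUUtilization", "i-1234567890abcdef0"),
        ("CPUUtilization", "i-0fedcba0987654321"),
    ]
]

by_metric = filter_metrics(sample, metric_name="CPUUtilization")
by_instance = filter_metrics(sample, dimension=("InstanceId", "i-1234567890abcdef0"))
print(len(by_metric), len(by_instance))
```

Filtering on the server side with --dimensions or --metric-name, as in the commands above, is preferable when the account has many metrics.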

Search for Available Metrics

You can search within all the metrics in your account using targeted search terms. Metrics are returned that have matching results within their namespace, metric name, or dimensions.

To search for available metrics in CloudWatch



3. In the search field on the All metrics tab, type a search term, such as a metric name, service name, or resource name, and press Enter. This shows you all the namespaces with metrics with this search term.

For example, if you search for volume, this shows the namespaces that contain metrics with this term in their name.

4. Select a namespace with results for your search to view the metrics. You can do the following:

a. To graph one or more metrics, select the check box next to each metric. To select all metrics, select the check box in the heading row of the table.

b. To view one of the resources in its console, choose the resource ID and then choose Jump to resource.

c. To view help for a metric, select the metric name and choose What is this?


Get Statistics for a Metric

The following examples show you how to get statistics for the CloudWatch metrics for your resources, such as your EC2 instances.

Examples

• Get Statistics for a Specific Resource

• Aggregate Statistics Across Resources

• Aggregate Statistics by Auto Scaling Group

• Aggregate Statistics by Amazon Machine Image (AMI)

Get Statistics for a Specific Resource

The following example shows you how to determine the maximum CPU utilization of a specific EC2 instance.

Requirements

• You must have the ID of the instance. You can get the instance ID using the Amazon EC2 console or the describe-instances command.

• By default, basic monitoring is enabled, but you can enable detailed monitoring. For more information, see Enable or Disable Detailed Monitoring for Your Instances in the Amazon EC2 User Guide for Linux Instances.

To display the average CPU utilization for a specific instance using the console



3. Select the EC2 metric namespace.




4. Select the Per-Instance Metrics dimension.

5. In the search field, type CPUUtilization and press Enter. Select the row for the specific instance, which displays a graph for the CPUUtilization metric for the instance. To change the name of the graph, choose the pencil icon. To change the time range, select one of the predefined values or choose custom.




6. To change the statistic, choose the Graphed metrics tab. Choose the column heading or an individual value, and then choose one of the statistics or predefined percentiles, or specify a custom percentile (for example, p95.45).

7. To change the period, choose the Graphed metrics tab. Choose the column heading or an individual value, and then choose a different value.

To get the CPU utilization per EC2 instance using the AWS CLI

Use the get-metric-statistics command as follows to get the CPUUtilization metric for the specified instance:

aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization \
--dimensions Name=InstanceId,Value=i-1234567890abcdef0 --statistics Maximum \
--start-time 2016-10-18T23:18:00 --end-time 2016-10-19T23:18:00 --period 360

The returned statistics are six-minute values for the requested 24-hour time interval. Each value represents the maximum CPU utilization percentage for the specified instance for a particular six-minute time period. The data points are not returned in chronological order. The following shows the beginning of the example output (the full output includes data points for every 6 minutes of the 24-hour period):

{
  "Datapoints": [
    {
      "Timestamp": "2016-10-19T00:18:00Z",
      "Maximum": 0.33000000000000002,
      "Unit": "Percent"
    },
    {
      "Timestamp": "2016-10-19T03:18:00Z",
      "Maximum": 99.670000000000002,
      "Unit": "Percent"
    },
    {
      "Timestamp": "2016-10-19T07:18:00Z",
      "Maximum": 0.34000000000000002,
      "Unit": "Percent"
    },
    ...
  ],
  "Label": "CPUUtilization"
}

Aggregate Statistics Across Resources

You can aggregate the metrics for AWS resources across multiple resources. Amazon CloudWatch cannot aggregate data across regions. Metrics are completely separate between regions.

For example, you can aggregate statistics for your EC2 instances that have detailed monitoring enabled. Instances that use basic monitoring are not included. Therefore, you must enable detailed monitoring (at an additional charge), which provides data in 1-minute periods. For more information, see Enable or Disable Detailed Monitoring for Your Instances in the Amazon EC2 User Guide for Linux Instances.

This example shows you how to get the average CPU usage for your EC2 instances. Because no dimension is specified, CloudWatch returns statistics for all dimensions in the AWS/EC2 namespace. To get statistics for other metrics, see Amazon CloudWatch Metrics and Dimensions Reference (p. 45).

Important

This technique for retrieving all dimensions across an AWS namespace does not work for custom namespaces that you publish to Amazon CloudWatch. With custom namespaces, you must specify the complete set of dimensions that are associated with any given data point to retrieve statistics that include the data point.

To display average CPU utilization for your EC2 instances



3. Choose the EC2 namespace and choose Across All Instances.

4. Select the row that contains CPUUtilization, which displays a graph for the metric for all your EC2 instances. To change the name of the graph, choose the pencil icon. To change the time range, select one of the predefined values or choose custom.






To get average CPU utilization across your EC2 instances using the AWS CLI

Use the get-metric-statistics command as follows:

aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization --statistics "Average" "SampleCount" \
--start-time 2016-10-11T23:18:00 --end-time 2016-10-12T23:18:00 --period 3600

The following is example output:


{

"Datapoints": [

{

"SampleCount": 238.0,

"Timestamp": "2016-10-12T07:18:00Z",

"Average": 0.038235294117647062,

"Unit": "Percent"

},

{


"Timestamp": "2016-10-12T09:18:00Z",

"Average": 0.16670833333333332,

"Unit": "Percent"

},

{


"Timestamp": "2016-10-11T23:18:00Z",

"Average": 0.041596638655462197,

"Unit": "Percent"

},

...

],

"Label": "CPUUtilization"

}

Aggregate Statistics by Auto Scaling Group

You can aggregate statistics for the EC2 instances in an Auto Scaling group. Amazon CloudWatch cannot aggregate data across regions. Metrics are completely separate between regions.

This example shows you how to get the total bytes written to disk for one Auto Scaling group. The total is computed for six-minute periods over a 48-hour interval across all EC2 instances in the specified Auto Scaling group.

To display DiskWriteBytes for the instances in an Auto Scaling group using the console



3. Choose the EC2 namespace and then choose By Auto Scaling Group.

4. Select the row for the DiskWriteBytes metric and the specific Auto Scaling group, which displays a graph for the metric for the instances in the Auto Scaling group. To change the name of the graph, choose the pencil icon. To change the time range, select one of the predefined values or choose custom.



To get DiskWriteBytes for the instances in an Auto Scaling group using the AWS CLI


Use the get-metric-statistics command as follows:

aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name DiskWriteBytes \
--dimensions Name=AutoScalingGroupName,Value=my-asg --statistics "Sum" "SampleCount" \
--start-time 2016-10-16T23:18:00 --end-time 2016-10-18T23:18:00 --period 360

The following is example output:


{

"Datapoints": [

{


"Timestamp": "2016-10-19T21:36:00Z",

"Sum": 0.0,

"Unit": "Bytes"

},

{

"SampleCount": 5.0,

"Timestamp": "2016-10-19T21:42:00Z",

"Sum": 0.0,

"Unit": "Bytes"

}

],

"Label": "DiskWriteBytes"

}

Aggregate Statistics by Amazon Machine Image (AMI)

You can aggregate statistics for the EC2 instances that have detailed monitoring enabled. Instances that use basic monitoring are not included. For more information, see Enable or Disable Detailed Monitoring for Your Instances in the Amazon EC2 User Guide for Linux Instances.

This example shows you how to determine average CPU utilization for all instances that use the specified AMI. The average is over one-hour time intervals for a one-day period.

To display the average CPU utilization by AMI using the console



3. Choose the EC2 namespace and then choose By Image (AMI) Id.

4. Select the row for the CPUUtilization metric and the specific AMI, which displays a graph for the metric for the specified AMI. To change the name of the graph, choose the pencil icon. To change the time range, select one of the predefined values or choose custom.




To get the average CPU utilization by AMI using the AWS CLI


Use the get-metric-statistics command as follows:

aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization \
--dimensions Name=ImageId,Value=ami-3c47a355 --statistics Average \
--start-time 2016-10-10T00:00:00 --end-time 2016-10-11T00:00:00 --period 3600

The operation returns statistics that are one-hour values for the one-day interval, because the period is 3600 seconds. Each value represents an average CPU utilization percentage for EC2 instances running the specified AMI. The following is example output:

{

"Datapoints": [

{

"Timestamp": "2016-10-10T07:00:00Z",

"Average": 0.041000000000000009,

"Unit": "Percent"

},

{

"Timestamp": "2016-10-10T14:00:00Z",

"Average": 0.079579831932773085,

"Unit": "Percent"

},

{

"Timestamp": "2016-10-10T06:00:00Z",

"Average": 0.036000000000000011,

"Unit": "Percent"

},

...

],


}
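The data points in the example output are not listed in chronological order (07:00 appears before 06:00). As a minimal sketch, assuming you have captured the JSON output as a string, the following Python snippet sorts the data points by time stamp. The helper name sort_datapoints and the sample payload (including its Label value) are illustrative and are not part of the AWS CLI.

```python
import json
from datetime import datetime


def sort_datapoints(output_json):
    """Return the Datapoints list from get-metric-statistics output, oldest first."""
    datapoints = json.loads(output_json)["Datapoints"]
    # Time stamps in the CLI output look like 2016-10-10T07:00:00Z.
    return sorted(
        datapoints,
        key=lambda dp: datetime.strptime(dp["Timestamp"], "%Y-%m-%dT%H:%M:%SZ"),
    )


# Hypothetical payload shaped like the example output above.
sample = json.dumps({
    "Datapoints": [
        {"Timestamp": "2016-10-10T07:00:00Z", "Average": 0.041, "Unit": "Percent"},
        {"Timestamp": "2016-10-10T14:00:00Z", "Average": 0.0796, "Unit": "Percent"},
        {"Timestamp": "2016-10-10T06:00:00Z", "Average": 0.036, "Unit": "Percent"},
    ],
    "Label": "CPUUtilization",
})

for dp in sort_datapoints(sample):
    print(dp["Timestamp"], dp["Average"])
```

Sorting on the parsed time stamp rather than on the raw string keeps the sketch correct even if the time stamp format were to vary in width.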

Graph Metrics

You can use the CloudWatch console to graph metric data generated by other AWS services to make it easier to see the metric activity on your services. You can use the following procedures to graph metrics in CloudWatch.

Contents

• Graph a Metric
• Modify the Time Range for a Graph
• Modify the Y Axis for a Graph
• Create an Alarm from a Metric on a Graph

Graph a Metric

You can select metrics and create graphs of the data using the CloudWatch console.


CloudWatch supports the following statistics on metrics: Average, Minimum, Maximum, Sum, and SampleCount. For more information, see Statistics.

You can view your data at different granularities. For example, you can choose a detailed view (for example, 1 minute), which can be useful when troubleshooting. You can choose a less detailed view (for example, 1 hour), which can be useful when viewing a broader time range (for example, 3 days) so that you can see trends over time. For more information, see Periods.
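To make the trade-off between granularity and time range concrete, the following sketch counts how many data points a graph would need for a given range and period. It is plain arithmetic, not a CloudWatch API; the function name datapoint_count is hypothetical.

```python
def datapoint_count(range_seconds, period_seconds):
    """Number of periods needed to cover a time range, rounded up."""
    if period_seconds <= 0:
        raise ValueError("period must be positive")
    return -(-range_seconds // period_seconds)  # ceiling division on integers


three_days = 3 * 24 * 60 * 60  # 259200 seconds

print(datapoint_count(three_days, 60))    # 1-minute view of 3 days
print(datapoint_count(three_days, 3600))  # 1-hour view of 3 days
```

A 3-day range needs 4320 points at 1 minute but only 72 at 1 hour, which is why the coarser view is easier to read for trends.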

Create a Graph

To graph a metric



3. On the All metrics tab, type a search term in the search field, such as a metric name or resource name, and press Enter.

For example, if you search for the CPUUtilization metric, you see the namespaces and dimensions with this metric.

4. Select one of the results for your search to view the metrics.

5. To graph one or more metrics, select the check box next to each metric. To select all metrics, select the check box in the heading row of the table.

6. To view more information about the metric being graphed, hover over the legend.

7. Horizontal annotations can help graph users quickly see when a metric has spiked to a certain level, or whether the metric is within a predefined range. To add a horizontal annotation, choose Graph options, Add horizontal annotation:

a. For Label, type a label for the annotation.

c. [...] Above or Below [...] Between, another Value [...] Value [...]

To delete an annotation, choose x in the Actions column.

8. To get a URL for your graph, choose Actions, Share. Copy the URL and save it or share it.

9. To add your graph to a dashboard, choose Actions, Add to dashboard.

Update a Graph

To update your graph

1. To change the name of the graph, choose the pencil icon.

2. To change the time range, select one of the predefined values or choose custom. For more information, see Modify the Time Range for a Graph.



4. To change the period, choose the Graphed metrics tab. Choose the column heading or an individual value, and then choose a different value.

5. To add a horizontal annotation, choose Graph options, Add horizontal annotation:

a. For Label, type a label for the annotation.

c. [...] Above or Below [...] Between, another Value [...] Value [...]

To delete an annotation, choose x in the Actions column.

6. To change the refresh interval, choose Refresh options, and then select Auto refresh or choose 1 Minute, 2 Minutes, 5 Minutes, or 15 Minutes.

Duplicate a Metric

To duplicate a metric

1. Choose the Graphed metrics tab.

2. For Actions, choose the Duplicate icon.

3. Update the duplicate metric as needed.

Modify the Time Range for a Graph

You can change the time range for a graph to view the data at different points in time.

Relative Time Ranges

You can set a relative time range for your graph.

To specify a relative time range for a graph




3. Select one of the predefined ranges shown at the top of the page, which span from 1 hour to 1 week ago.

4. For more predefined ranges, choose the custom menu and then choose Relative. Select one of the predefined ranges, which span from 5 minutes to 15 months ago.

Absolute Time Ranges

You can set an absolute time range for your graph.

To specify an absolute time range for a graph



3. Choose the custom menu and then choose Absolute. Use the calendar picker or the text fields to specify the time range.

Zoom in on a Graph

You can change the granularity of a graph and zoom in to see data over a shorter time period.

To zoom in on a graph



3. Choose and drag on the graph area, and then release the drag.

4. To reset a zoomed-in graph, choose the Reset zoom icon.

Modify the Y Axis for a Graph

You can set custom bounds for the Y axis on a graph to help you see the data better. For example, you can change the bounds on a CPUUtilization graph to 100 percent so that it's easy to see whether the CPU is low (the plotted line is near the bottom of the graph) or high (the plotted line is near the top of the graph).

You can switch between two different Y axes for your graph. This is useful if the graph contains metrics that have different units or that differ greatly in their range of values.

To modify the Y axis on a graph



Metrics.

3. Select a metric namespace (for example, EC2) and then a metric dimension (for example, Per-Instance Metrics).

4. The All metrics tab displays all metrics for that dimension in that namespace. To graph a metric, select the check box next to the metric.

5. On the Graph options tab, specify the Min and Max values for Left Y Axis. The value of Min cannot be greater than the value of Max.


6. To create a second Y axis, specify the Min and Max values for Right Y Axis.

7. To switch between the two Y axes, choose the Graphed metrics tab. For Y Axis, choose Left Y Axis or Right Y Axis.

Create an Alarm from a Metric on a Graph

You can graph a metric and then create an alarm from the metric on the graph, which has the benefit of populating many of the alarm fields for you.

To create an alarm from a metric on a graph



3. Select a metric namespace (for example, EC2) and then a metric dimension (for example, Per-Instance Metrics).

4. The All metrics tab displays all metrics for that dimension in that namespace. To graph a metric, select the check box next to the metric.

5. To create an alarm for the metric, choose the Graphed metrics tab. For Actions, choose the alarm icon.

6. Under Alarm Threshold, type a unique name for the alarm and a description of the alarm. For Whenever, specify a threshold and the number of periods.

7. Under Actions, select the type of action to have the alarm perform when the alarm is triggered.

8. (Optional) For Period, choose a different value. For Statistic, choose Standard to specify one of the statistics in the list or choose Custom to specify a percentile (for example, p95.45).



Publish Custom Metrics

You can publish your own metrics to CloudWatch using the AWS CLI or an API. You can view statistical graphs of your published metrics with the AWS Management Console.

CloudWatch stores data about a metric as a series of data points. Each data point has an associated time stamp. You can even publish an aggregated set of data points called a statistic set.

Topics

• High-Resolution Metrics
• Using Dimensions
• Publish Single Data Points
• Publish Statistic Sets
• Publish the Value Zero

High-Resolution Metrics

Each metric is one of the following:

• Standard resolution, with data having a one-minute granularity

• High resolution, with data at a granularity of one second

Metrics produced by AWS services are standard resolution by default. When you publish a custom metric, you can define it as either standard resolution or high resolution. When you publish a high-resolution metric, CloudWatch stores it with a resolution of 1 second, and you can read and retrieve it with a period of 1 second, 5 seconds, 10 seconds, 30 seconds, or any multiple of 60 seconds.
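As a rough sketch of the rule above, the following Python function checks whether a period is one of those listed for reading a high-resolution metric: 1, 5, 10, or 30 seconds, or any multiple of 60 seconds. The function name is hypothetical, and the check is only an illustration; it is not something CloudWatch exposes.

```python
def is_valid_high_resolution_period(period_seconds):
    """True for 1, 5, 10, 30, or any positive multiple of 60 seconds."""
    if period_seconds in (1, 5, 10, 30):
        return True
    return period_seconds > 0 and period_seconds % 60 == 0


for candidate in (1, 30, 45, 60, 300):
    print(candidate, is_valid_high_resolution_period(candidate))
```

A check like this can be useful before you build a request, so that an unsupported period such as 45 seconds is caught in your own code.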

High-resolution metrics can give you more immediate insight into your application's sub-minute activity.

Keep in mind that every PutMetricData call for a custom metric is charged, so calling PutMetricData more often on a high-resolution metric can lead to higher charges. For more information about CloudWatch pricing, see Amazon CloudWatch Pricing.



There is a higher charge for high-resolution alarms with a period of 10 or 30 seconds.

Using Dimensions

In custom metrics, the --dimensions parameter is common. A dimension further clarifies what the metric is, and what data it stores. You can have up to 10 dimensions in one metric, and each dimension is defined by a Name and Value pair.

Note that how you specify a dimension differs between commands. With put-metric-data, you specify each dimension as MyName=MyValue, while with get-metric-statistics or put-metric-alarm you use the format Name=MyName,Value=MyValue. For example, the following command publishes a "Buffers" metric with two dimensions named InstanceId and InstanceType.

aws cloudwatch put-metric-data --metric-name Buffers --namespace MyNameSpace --unit Bytes \
--value 231434333 --dimensions InstanceId=1-23456789,InstanceType=m1.small

This command retrieves statistics for that same metric. Separate the Name and Value parts of a single dimension with a comma, but use a space between one dimension and the next if you have multiple dimensions.

aws cloudwatch get-metric-statistics --metric-name Buffers --namespace MyNameSpace \
--dimensions Name=InstanceId,Value=1-23456789 Name=InstanceType,Value=m1.small \
--start-time 2016-10-15T04:00:00Z --end-time 2016-10-19T07:00:00Z --statistics Average --period 60
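The two dimension formats are easy to mix up. The following sketch builds both argument strings from one dictionary, so that you can compare them side by side. The helper names are hypothetical; only the output formats come from the commands shown above.

```python
def format_for_put_metric_data(dimensions):
    """MyName=MyValue pairs separated by commas, as used with put-metric-data."""
    return ",".join(f"{name}={value}" for name, value in dimensions.items())


def format_for_get_metric_statistics(dimensions):
    """Name=MyName,Value=MyValue entries separated by spaces."""
    return " ".join(
        f"Name={name},Value={value}" for name, value in dimensions.items()
    )


dims = {"InstanceId": "1-23456789", "InstanceType": "m1.small"}

print("--dimensions", format_for_put_metric_data(dims))
print("--dimensions", format_for_get_metric_statistics(dims))
```

Keeping the dimensions in one structure and formatting them per command avoids publishing with one set of dimensions and querying with another.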

If a single metric includes multiple dimensions, you must specify a value for every defined dimension when you use get-metric-statistics. For example, the Amazon S3 metric BucketSizeBytes includes the dimensions BucketName and StorageType, so you must specify both dimensions with get-metric-statistics.

aws cloudwatch get-metric-statistics --metric-name BucketSizeBytes \
--start-time 2017-01-23T14:23:00Z --end-time 2017-01-26T19:30:00Z --period 3600 \
--namespace AWS/S3 --statistics Maximum \
--dimensions Name=BucketName,Value=MyBucketName Name=StorageType,Value=StandardStorage \
--output table

You can see what dimensions are defined for a metric by using the list-metrics command.

Publish Single Data Points

To publish a single data point for a new or existing metric, use the put-metric-data command with one value and time stamp. For example, the following actions each publish one data point:

aws cloudwatch put-metric-data --metric-name PageViewCount --namespace MyService --value 2 --timestamp 2016-10-20T12:00:00.000Z

aws cloudwatch put-metric-data --metric-name PageViewCount --namespace MyService --value 4 --timestamp 2016-10-20T12:00:01.000Z

aws cloudwatch put-metric-data --metric-name PageViewCount --namespace MyService --value 5 --timestamp 2016-10-20T12:00:02.000Z

If you call this command with a new metric name, CloudWatch creates a metric for you. Otherwise, CloudWatch associates your data with the existing metric that you specified.

Note

When you create a metric, it can take up to two minutes before you can retrieve statistics for the new metric using the get-metric-statistics command. However, it can take up to fifteen minutes before the new metric appears in the list of metrics retrieved using the list-metrics command.

Although you can publish data points with time stamps as granular as one-thousandth of a second, CloudWatch aggregates the data to a minimum granularity of one minute. CloudWatch records the average (sum of all items divided by number of items) of the values received for every 1-minute period, as well as the number of samples, maximum value, and minimum value for the same time period. For example, the PageViewCount metric from the previous examples contains three data points with time stamps just seconds apart. CloudWatch aggregates the three data points because they all have time stamps within a one-minute period.

CloudWatch uses one-minute boundaries when aggregating data points. For example, CloudWatch aggregates the data points from the previous example because all three data points fall within the one-minute period that begins at 2016-10-20T12:00:00.000Z and ends at 2016-10-20T12:01:00.000Z.
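The aggregation described above can be sketched in a few lines of Python. This is only an illustration of grouping on one-minute boundaries, not CloudWatch's implementation. The function name aggregate_by_minute is hypothetical, and the sample points assume the values 2, 4, and 5 at successive seconds within the one-minute period named above.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_by_minute(points):
    """Group (timestamp, value) pairs by the minute they fall in and summarize."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        ts = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
        # Truncate to the start of the minute to find the bucket.
        buckets[ts.replace(second=0, microsecond=0)].append(value)
    return {
        minute.strftime("%Y-%m-%dT%H:%M:%SZ"): {
            "SampleCount": len(values),
            "Sum": sum(values),
            "Average": sum(values) / len(values),
            "Minimum": min(values),
            "Maximum": max(values),
        }
        for minute, values in sorted(buckets.items())
    }


points = [
    ("2016-10-20T12:00:00.000Z", 2),
    ("2016-10-20T12:00:01.000Z", 4),
    ("2016-10-20T12:00:02.000Z", 5),
]

for minute, stats in aggregate_by_minute(points).items():
    print(minute, stats)
```

All three points land in the bucket that starts at 12:00:00, giving a sample count of 3, a sum of 11, a minimum of 2, and a maximum of 5.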


You can use the get-metric-statistics command to retrieve statistics based on the data points that you published.

aws cloudwatch get-metric-statistics --namespace MyService --metric-name PageViewCount \
--statistics "Sum" "Maximum" "Minimum" "Average" "SampleCount" \
--start-time 2016-10-20T12:00:00.000Z --end-time 2016-10-20T12:05:00.000Z --period 60

The following is example output:


{

"Datapoints": [

{

"SampleCount": 3.0,

"Timestamp": "2016-10-20T12:00:00Z",

"Average": 3.6666666666666665,

"Maximum": 5.0,

"Minimum": 2.0,

"Sum": 11.0,

"Unit": "None"

}

],

"Label": "PageViewCount"

}

Publish Statistic Sets

You can aggregate your data before you publish to CloudWatch. When you have multiple data points per minute, aggregating data minimizes the number of calls to put-metric-data. For example, instead of calling put-metric-data multiple times for three data points that are within three seconds of each other, you can aggregate the data into a statistic set that you publish with one call, using the --statistic-values parameter:

aws cloudwatch put-metric-data --metric-name PageViewCount --namespace MyService \
--statistic-values Sum=11,Minimum=2,Maximum=5,SampleCount=3 --timestamp 2016-10-14T12:00:00.000Z
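If you collect the raw values in your application, a statistic set is straightforward to compute before publishing. The following sketch, with hypothetical helper names, derives the four fields and formats them in the Sum=...,Minimum=...,Maximum=...,SampleCount=... form used in the command above.

```python
def to_statistic_set(values):
    """Summarize raw values into the four fields of a statistic set."""
    if not values:
        raise ValueError("at least one value is required")
    return {
        "Sum": sum(values),
        "Minimum": min(values),
        "Maximum": max(values),
        "SampleCount": len(values),
    }


def format_statistic_values(stat_set):
    """Render a statistic set as a comma-separated Key=Value string."""
    keys = ("Sum", "Minimum", "Maximum", "SampleCount")
    return ",".join(f"{key}={stat_set[key]}" for key in keys)


page_views = [2, 4, 5]
print(format_statistic_values(to_statistic_set(page_views)))
```

For the three page-view values 2, 4, and 5, this reproduces the argument shown in the example command.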

CloudWatch needs raw data points to calculate percentiles. If you publish data using a statistic set instead, you cannot retrieve percentile statistics for this data unless one of the following conditions is true:

• The SampleCount of the statistic set is 1.

• The Min and the Max of the statistic set are equal.
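The two conditions above can be expressed as a small check. This is a sketch for reasoning about your own data before you publish it; the function name is hypothetical, and the dictionary keys follow the --statistic-values field names.

```python
def percentiles_available(stat_set):
    """True if a statistic set still allows percentile statistics per the two conditions."""
    single_sample = stat_set["SampleCount"] == 1
    constant_value = stat_set["Minimum"] == stat_set["Maximum"]
    return single_sample or constant_value


print(percentiles_available({"Sum": 11, "Minimum": 2, "Maximum": 5, "SampleCount": 3}))
print(percentiles_available({"Sum": 7, "Minimum": 7, "Maximum": 7, "SampleCount": 1}))
print(percentiles_available({"Sum": 12, "Minimum": 4, "Maximum": 4, "SampleCount": 3}))
```

In both permitted cases the raw values are fully determined by the statistic set, which is why percentiles remain computable.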

Publish the Value Zero

When your data is more sporadic and you have periods that have no associated data, you can choose to publish the value zero (0) for that period or no value at all. If you use periodic calls to PutMetricData to monitor the health of your application, you might want to publish zero instead of no value. For example, you can set a CloudWatch alarm to notify you if your application fails to publish metrics every five minutes. You want such an application to publish zeros for periods with no associated data.

You might also publish zeros if you want to track the total number of data points or if you want statistics such as minimum and average to include data points with the value 0.
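A minimal sketch of this pattern, assuming your application keeps a per-period count in a dictionary: every expected period is emitted, with 0 substituted where nothing was observed. The period labels and the fill_missing_periods helper are hypothetical.

```python
def fill_missing_periods(observed, expected_periods):
    """Return (period, value) pairs, using 0 for periods with no observed data."""
    return [(period, observed.get(period, 0)) for period in expected_periods]


observed = {"12:00": 7, "12:10": 3}           # nothing was recorded at 12:05
expected_periods = ["12:00", "12:05", "12:10"]

for period, value in fill_missing_periods(observed, expected_periods):
    print(period, value)
```

Publishing the explicit zero for 12:05 keeps the sample count at three, so statistics such as minimum and average reflect the quiet period.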



Amazon CloudWatch Metrics and Dimensions Reference

This reference includes all the namespaces, dimensions, and metrics that you can use with CloudWatch.

Namespaces are containers for metrics. Metrics, which are time-ordered sets of data points, are isolated from one another in different namespaces so that metrics from different applications are not mistakenly aggregated into the same statistics. In addition, each metric has a dimension, which is a name/value pair that you can use to filter metrics.

Metrics and Dimensions

• AWS Namespaces
• Amazon API Gateway Metrics and Dimensions
• AppStream 2.0 Metrics and Dimensions
• Auto Scaling Metrics and Dimensions
• AWS Billing and Cost Management Dimensions and Metrics
• Amazon CloudFront Metrics and Dimensions
• Amazon CloudSearch Metrics and Dimensions
• Amazon CloudWatch Events Metrics and Dimensions
• Amazon CloudWatch Logs Metrics and Dimensions
• AWS DMS Metrics
• AWS Direct Connect Metrics and Dimensions
• Amazon DynamoDB Metrics and Dimensions
• Amazon EC2 Metrics and Dimensions
• Amazon EC2 Spot Fleet Metrics and Dimensions
• Amazon ECS Metrics and Dimensions
• AWS Elastic Beanstalk Metrics and Dimensions
• Amazon ElastiCache Metrics and Dimensions
• Amazon EBS Metrics and Dimensions
• Amazon EFS Metrics and Dimensions
• Elastic Load Balancing Metrics and Dimensions
• Amazon EMR Metrics and Dimensions
• Amazon Elasticsearch Service Metrics and Dimensions
• Amazon Elastic Transcoder Metrics and Dimensions
• Amazon GameLift Metrics and Dimensions
• Amazon Inspector Metrics
• AWS IoT Metrics and Dimensions
• Amazon Kinesis Analytics Metrics
• Amazon Kinesis Firehose Metrics
• Amazon Kinesis Streams Metrics and Dimensions
• AWS Key Management Service Metrics and Dimensions
• AWS Lambda Metrics and Dimensions
• Amazon Machine Learning Metrics and Dimensions
• AWS OpsWorks Metrics and Dimensions
• Amazon Polly Metrics
• Amazon Redshift Metrics and Dimensions
• Amazon RDS Metrics and Dimensions
• Amazon Route 53 Metrics and Dimensions
• Amazon Simple Email Service Metrics and Dimensions
• Amazon Simple Notification Service Metrics and Dimensions
• Amazon SQS Metrics and Dimensions
• Amazon Simple Storage Service Metrics and Dimensions
• AWS Shield Advanced Metrics
• AWS Step Functions Metrics and Dimensions
• Amazon SWF Metrics and Dimensions
• AWS Storage Gateway Metrics and Dimensions
• Amazon VPC NAT Gateway Metrics and Dimensions
• Amazon VPC VPN Metrics and Dimensions
• AWS WAF Metrics and Dimensions
• Amazon WorkSpaces Metrics and Dimensions

AWS Namespaces

CloudWatch namespaces are containers for metrics. Metrics in different namespaces are isolated from each other, so that metrics from different applications are not mistakenly aggregated into the same statistics. All AWS services that provide Amazon CloudWatch data use a namespace string, beginning with "AWS/". When you create custom metrics, you must also specify a namespace as a container for custom metrics. The following services push metric data points to CloudWatch.

AWS Product                           Namespace
Amazon API Gateway                    AWS/ApiGateway
AppStream 2.0                         AWS/AppStream
Auto Scaling                          AWS/AutoScaling
AWS Billing                           AWS/Billing
Amazon CloudFront                     AWS/CloudFront
Amazon CloudSearch                    AWS/CloudSearch
Amazon CloudWatch Events              AWS/Events
Amazon CloudWatch Logs                AWS/Logs
AWS Database Migration Service        AWS/DMS
AWS Direct Connect                    AWS/DX
Amazon DynamoDB                       AWS/DynamoDB
Amazon EC2                            AWS/EC2
Amazon EC2                            AWS/EC2Spot (Spot Instances)
Amazon EC2 Container Service          AWS/ECS
AWS Elastic Beanstalk                 AWS/ElasticBeanstalk
Amazon Elastic Block Store            AWS/EBS
Amazon Elastic File System            AWS/EFS
Elastic Load Balancing                AWS/ELB (Classic Load Balancers)
Elastic Load Balancing                AWS/ApplicationELB (Application Load Balancers)
Amazon Elastic Transcoder             AWS/ElasticTranscoder
Amazon ElastiCache                    AWS/ElastiCache
Amazon Elasticsearch Service          AWS/ES
Amazon EMR                            AWS/ElasticMapReduce
Amazon GameLift                       AWS/GameLift
Amazon Inspector                      AWS/Inspector
AWS IoT                               AWS/IoT
AWS Key Management Service            AWS/KMS
Amazon Kinesis Analytics              AWS/KinesisAnalytics
Amazon Kinesis Firehose               AWS/Firehose
Amazon Kinesis Streams                AWS/Kinesis
AWS Lambda                            AWS/Lambda
Amazon Lex                            AWS/Lex
Amazon Machine Learning               AWS/ML
AWS OpsWorks                          AWS/OpsWorks
Amazon Polly                          AWS/Polly
Amazon Redshift                       AWS/Redshift
Amazon Relational Database Service    AWS/RDS
Amazon Route 53                       AWS/Route53
AWS Shield Advanced                   AWS/DDoSProtection
Amazon Simple Email Service           AWS/SES
Amazon Simple Notification Service    AWS/SNS
Amazon Simple Queue Service           AWS/SQS
Amazon Simple Storage Service         AWS/S3
Amazon Simple Workflow Service        AWS/SWF
AWS Step Functions                    AWS/States
AWS Storage Gateway                   AWS/StorageGateway
Amazon VPC                            AWS/NATGateway (NAT gateway)
Amazon VPC                            AWS/VPN (VPN)
AWS WAF                               WAF
Amazon WorkSpaces                     AWS/WorkSpaces

Amazon API Gateway Metrics and Dimensions

The metrics and dimensions that API Gateway sends to Amazon CloudWatch are listed below. For more information, see Monitor API Execution with Amazon CloudWatch in the Amazon API Gateway Developer Guide.

API Gateway Metrics

Amazon API Gateway sends metric data to CloudWatch every minute.

The AWS/ApiGateway namespace includes the following metrics.

Metric: 4XXError
Description: The number of client-side errors captured.
Unit: count

Metric: 5XXError
Description: The number of server-side errors captured.
Unit: count

Metric: CacheHitCount
Description: The number of requests served from the API cache.
Unit: count

Metric: CacheMissCount
Description: The number of requests served from the back end when API caching is enabled.
Unit: count

Metric: Count
Description: The number of calls to API methods.
Unit: count

Metric: IntegrationLatency
Description: The time between when API Gateway relays a request to the back end and when it receives a response from the back end.
Unit: millisecond

Metric: Latency
Description: The time between when API Gateway receives a request from a client and when it returns a response to the client. The latency includes the integration latency and other API Gateway overhead.
Unit: millisecond

Dimensions for Metrics

You can use the dimensions in the following table to filter API Gateway metrics.

Dimension: ApiName
Description: Filters API Gateway metrics for an API of the specified API name.

Dimension: ApiName, Method, Resource, Stage
Description: Filters API Gateway metrics for an API method of the specified API, stage, resource, and method. API Gateway will not send such metrics unless you have explicitly enabled detailed CloudWatch metrics. You can do this in the console by selecting Enable CloudWatch Metrics under a stage Settings tab. Alternatively, you can call the stage:update action of the API Gateway REST API to update the metricsEnabled property to true. Enabling such metrics will incur additional charges to your account. For pricing information, see Amazon CloudWatch Pricing.

Dimension: ApiName, Stage
Description: Filters API Gateway metrics for an API stage of the specified API and stage.

AppStream 2.0 Metrics and Dimensions

The metrics and dimensions that AppStream 2.0 sends to Amazon CloudWatch are listed below. For more information, see Monitor Amazon AppStream 2.0 With Amazon CloudWatch in the Amazon AppStream 2.0 Developer Guide.


Amazon AppStream 2.0 Metrics

AppStream 2.0 sends metrics to CloudWatch one time every minute. The AWS/AppStream namespace includes the following metrics.

Metric: RunningCapacity
Description: Total number of instances currently running. Represents the number of concurrent streaming sessions that can be supported by the fleet in its current state.
Units: Count
Valid statistics: Average, Minimum, Maximum

Metric: InUseCapacity
Description: Number of instances currently being used for streaming sessions. One InUseCapacity count represents one streaming session.
Units: Count

Metric: PendingCapacity
Description: Number of instances being provisioned by AppStream 2.0. Represents the additional number of streaming sessions the fleet can support after provisioning is complete. When provisioning starts, it usually takes 10-20 minutes for an instance to become available for streaming.
Units: Count

Metric: AvailableCapacity
Description: Number of idle instances currently available for user sessions.
AvailableCapacity = RunningCapacity - InUseCapacity
Units: Count

Metric: DesiredCapacity
Description: Total number of instances that are either running or pending. This represents the total number of concurrent streaming sessions your fleet can support in a steady state.
DesiredCapacity = RunningCapacity + PendingCapacity
Units: Count

Metric: CapacityUtilization
Description: The percentage of instances in a fleet that are being used, using the following formula.
CapacityUtilization = (InUseCapacity / RunningCapacity) * 100
Monitoring this metric helps with decisions about increasing or decreasing the value of a fleet's desired capacity.
Units: Percent

Metric: InsufficientCapacityError
Description: Number of session requests rejected due to lack of capacity. One InsufficientCapacityError count represents one session rejection due to insufficient capacity. You can set alarms to use this metric to be notified of users waiting for streaming sessions.
Units: Count
Valid statistics: Average, Minimum, Maximum, Sum
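The relationships among the capacity metrics can be checked with a few lines of arithmetic. The following sketch restates the three formulas from the metric descriptions as Python functions. The function names and the sample fleet numbers are hypothetical, and the guard against a zero RunningCapacity is an added assumption.

```python
def available_capacity(running, in_use):
    """AvailableCapacity = RunningCapacity - InUseCapacity"""
    return running - in_use


def desired_capacity(running, pending):
    """DesiredCapacity = RunningCapacity + PendingCapacity"""
    return running + pending


def capacity_utilization(in_use, running):
    """CapacityUtilization = (InUseCapacity / RunningCapacity) * 100"""
    if running == 0:
        raise ValueError("RunningCapacity must be greater than zero")
    return 100 * in_use / running


# Hypothetical fleet: 8 running instances, 2 in use, 2 being provisioned.
print(available_capacity(8, 2))
print(desired_capacity(8, 2))
print(capacity_utilization(2, 8))
```

With these sample numbers, 6 instances are idle, the fleet settles at 10 instances once provisioning completes, and utilization is 25 percent.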

Dimensions for Amazon AppStream 2.0 Metrics

Amazon AppStream 2.0 provides metrics for the following dimension.

Dimension: Fleet
Description: The name of the AppStream 2.0 fleet. All available statistics are filtered by Fleet.

Auto Scaling Metrics and Dimensions

Auto Scaling sends metrics for instances and groups to CloudWatch. For Auto Scaling instances, you can enable detailed (one-minute) monitoring or basic (five-minute) monitoring. For Auto Scaling groups, you can enable group metrics. For more information, see Monitoring Your Auto Scaling Instances and Groups in the Auto Scaling User Guide.

Auto Scaling Group Metrics

If you enable group metrics, Auto Scaling sends aggregated data to CloudWatch every minute.

The AWS/AutoScaling namespace includes the following metrics.

Metric: GroupMinSize
Description: The minimum size of the Auto Scaling group.

Metric: GroupMaxSize
Description: The maximum size of the Auto Scaling group.

Metric: GroupDesiredCapacity
Description: The number of instances that the Auto Scaling group attempts to maintain.

Metric: GroupInServiceInstances
Description: The number of instances that are running as part of the Auto Scaling group. This metric does not include instances that are pending or terminating.

Metric: GroupPendingInstances
Description: The number of instances that are pending. A pending instance is not yet in service. This metric does not include instances that are in service or terminating.

Metric: GroupStandbyInstances
Description: The number of instances that are in a Standby state. Instances in this state are still running but are not actively in service.

Metric: GroupTerminatingInstances
Description: The number of instances that are in the process of terminating. This metric does not include instances that are in service or pending.

Metric: GroupTotalInstances
Description: The total number of instances in the Auto Scaling group. This metric identifies the number of instances that are in service, pending, and terminating.

Dimensions for Auto Scaling Group Metrics

To filter the metrics for your Auto Scaling group by group name, use the AutoScalingGroupName dimension.

AWS Billing and Cost Management Dimensions and Metrics

The AWS Billing and Cost Management service sends metrics to CloudWatch. For more information, see Monitoring Charges with Alerts and Notifications in the AWS Billing and Cost Management User Guide.

AWS Billing and Cost Management Metrics

The AWS/Billing namespace includes the following metrics.

Metric: EstimatedCharges
Description: The estimated charges for your AWS usage. This can either be estimated charges for one service or a roll-up of estimated charges for all services.

Dimensions for AWS Billing and Cost Management Metrics

Billing and Cost Management supports filtering metrics by the following dimensions.

Dimension: ServiceName
Description: The name of the AWS service. This dimension is omitted for the total of estimated charges across all services.

Dimension: LinkedAccount
Description: The linked account number. This is used for consolidated billing only. This dimension is included only for accounts that are linked to a separate paying account in a consolidated billing relationship. It is not included for accounts that are not linked to a consolidated billing paying account.

Dimension: Currency
Description: The monetary currency to bill the account. This dimension is required.
Unit: USD

Amazon CloudFront Metrics and Dimensions

Amazon CloudFront sends metrics to Amazon CloudWatch for web distributions. Metrics and dimensions are not available for RTMP distributions. For more information, see Monitoring CloudFront Activity Using CloudWatch in the Amazon CloudFront Developer Guide.

Amazon CloudFront Metrics

The AWS/CloudFront namespace includes the following metrics.

Note
Only one statistic, Average or Sum, is applicable for each metric. However, all statistics are available through the console, API, and AWS Command Line Interface. In the following table, each metric specifies the statistic that is applicable to that metric.

Metric: Requests
Description: The number of requests for all HTTP methods and for both HTTP and HTTPS requests.
Valid Statistics: Sum
Units: None

Metric: BytesDownloaded
Description: The number of bytes downloaded by viewers for GET, HEAD, and OPTIONS requests.
Units: None

Metric: BytesUploaded
Description: The number of bytes uploaded to your origin with CloudFront using POST and PUT requests.
Units: None

Metric: TotalErrorRate
Description: The percentage of all requests for which the HTTP status code is 4xx or 5xx.
Valid Statistics: Average
Units: Percent

Metric: 4xxErrorRate
Description: The percentage of all requests for which the HTTP status code is 4xx.
Units: Percent

Metric: 5xxErrorRate
Description: The percentage of all requests for which the HTTP status code is 5xx.
Units: Percent

Dimensions for CloudFront Metrics

CloudFront metrics use the CloudFront namespace and provide metrics for two dimensions:

Dimension: DistributionId
Description: The CloudFront ID of the distribution for which you want to display metrics.

Dimension: Region
Description: The region for which you want to display metrics. This value must be Global. The Region dimension is different from the region in which CloudFront metrics are stored, which is US East (N. Virginia).

Amazon CloudSearch Metrics and Dimensions

Amazon CloudSearch sends metrics to Amazon CloudWatch. For more information, see Monitoring an Amazon CloudSearch Domain with Amazon CloudWatch in the Amazon CloudSearch Developer Guide.

Amazon CloudSearch Metrics

The AWS/CloudSearch namespace includes the following metrics.

Metric: SuccessfulRequests
Description: The number of search requests successfully processed by a search instance.
Units: Count
Valid statistics: Maximum, Sum

Metric: SearchableDocuments
Description: The number of searchable documents in the domain's search index.
Units: Count
Valid statistics: Maximum

Metric: IndexUtilization
Description: The percentage of the search instance's index capacity that has been used. The Maximum value indicates the percentage of the domain's index capacity that has been used.
Units: Percent
Valid statistics: Average, Maximum

Metric: Partitions
Description: The number of partitions the index is distributed across.
Units: Count
Valid statistics: Minimum, Maximum

Dimensions for Amazon CloudSearch Metrics

Amazon CloudSearch sends the ClientId and DomainName dimensions to CloudWatch.

Dimension: ClientId
Description: The AWS account ID.

Dimension: DomainName
Description: The name of the search domain.

Amazon CloudWatch Events Metrics and Dimensions

CloudWatch Events sends metrics to Amazon CloudWatch every minute.

CloudWatch Events Metrics

The AWS/Events namespace includes the following metrics.

All of these metrics use Count as the unit, so Sum and SampleCount are the most useful statistics.

Metric: Invocations
Description: Measures the number of times a target is invoked for a rule in response to an event. This includes successful and failed invocations, but does not include throttled or retried attempts until they fail permanently.
Note: CloudWatch Events only sends this metric to CloudWatch if it has a non-zero value.
Valid Dimensions: RuleName
Units: Count

Metric: FailedInvocations
Description: Measures the number of invocations that failed permanently. This does not include invocations that are retried or that succeeded after a retry attempt.
Units: Count

Metric: TriggeredRules
Description: Measures the number of triggered rules that matched with any event.
Units: Count

Metric: MatchedEvents
Description: Measures the number of events that matched with any rule.
Valid Dimensions: None
Units: Count

Metric: ThrottledRules
Description: Measures the number of triggered rules that are being throttled.
Units: Count

Dimensions for CloudWatch Events Metrics

CloudWatch Events metrics have one dimension, which is listed below.

Dimension: RuleName
Description: Filters the available metrics by rule name.

Amazon CloudWatch Logs Metrics and Dimensions

CloudWatch Logs sends metrics to CloudWatch every minute.

CloudWatch Logs Metrics

The AWS/Logs namespace includes the following metrics.


Metric: IncomingBytes
Description: The volume of log events in uncompressed bytes uploaded to CloudWatch Logs. When used with the LogGroupName dimension, this is the volume of log events in uncompressed bytes uploaded to the log group.
Valid Dimensions: LogGroupName
Valid Statistic: Sum
Units: Bytes

Metric: IncomingLogEvents
Description: The number of log events uploaded to CloudWatch Logs. When used with the LogGroupName dimension, this is the number of log events uploaded to the log group.
Valid Dimensions: LogGroupName
Valid Statistic: Sum
Units: None

Metric: ForwardedBytes
Description: The volume of log events in compressed bytes forwarded to the subscription destination.
Valid Dimensions: LogGroupName, DestinationType, FilterName
Valid Statistic: Sum
Units: Bytes

Metric: ForwardedLogEvents
Description: The number of log events forwarded to the subscription destination.
Valid Dimensions: LogGroupName, DestinationType, FilterName
Valid Statistic: Sum
Units: None

Metric: DeliveryErrors
Description: The number of log events for which CloudWatch Logs received an error when forwarding data to the subscription destination.
Valid Dimensions: LogGroupName, DestinationType, FilterName
Valid Statistic: Sum
Units: None

Metric: DeliveryThrottling
Description: The number of log events for which CloudWatch Logs was throttled when forwarding data to the subscription destination.
Valid Dimensions: LogGroupName, DestinationType, FilterName
Valid Statistic: Sum
Units: None

Dimensions for CloudWatch Logs Metrics

The dimensions that you can use with CloudWatch Logs metrics are listed below.

Dimension: LogGroupName
Description: The name of the CloudWatch Logs log group for which to display metrics.

Dimension: DestinationType
Description: The subscription destination for the CloudWatch Logs data, which can be AWS Lambda, Amazon Kinesis Streams, or Amazon Kinesis Firehose.

Dimension: FilterName
Description: The name of the subscription filter that is forwarding data from the log group to the destination. CloudWatch automatically converts the subscription filter name to ASCII and replaces any unsupported characters with a question mark (?).
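As a rough illustration of the FilterName conversion described above, the following Python sketch replaces every non-ASCII character with a question mark. The function name is hypothetical, and it assumes that "unsupported characters" means exactly the characters outside ASCII; the conversion that CloudWatch actually performs might differ.

```python
def approximate_filter_name_dimension(filter_name: str) -> str:
    """Approximate the FilterName dimension value (hypothetical helper).

    Assumption: every character outside ASCII counts as unsupported and is
    replaced with '?'. The real service-side conversion might differ.
    """
    # errors="replace" substitutes '?' for each character that cannot be encoded.
    return filter_name.encode("ascii", errors="replace").decode("ascii")


if __name__ == "__main__":
    # A filter name containing one accented character.
    print(approximate_filter_name_dimension("errors-\u00e9quipe"))
```

A sketch like this can help predict which FilterName value to search for in the CloudWatch console when a subscription filter name contains accented or other non-ASCII characters.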


AWS DMS Metrics

For information about the AWS DMS metrics that you can use with CloudWatch, see Monitoring AWS Database Migration Service Tasks in the AWS Database Migration Service User Guide.

AWS Direct Connect Metrics and Dimensions

By default, CloudWatch provides AWS Direct Connect metric data in 5-minute intervals. You can optionally view data in 1-minute intervals. For more information, see Monitoring with CloudWatch in the AWS Direct Connect User Guide.

AWS Direct Connect Metrics

The following metrics are available from AWS Direct Connect. Metrics are currently available for AWS Direct Connect physical connections only.

Metric: ConnectionState
Description: The state of the connection. 0 indicates DOWN and 1 indicates UP.
Units: Boolean

Metric: ConnectionBpsEgress
Description: The bit rate for outbound data from the AWS side of the connection. The number reported is the aggregate over the specified time period (5 minutes by default, 1 minute minimum).
Units: Bits per second

Metric: ConnectionBpsIngress
Description: The bit rate for inbound data to the AWS side of the connection. The number reported is the aggregate over the specified time period (5 minutes by default, 1 minute minimum).
Units: Bits per second

Metric: ConnectionPpsEgress
Description: The packet rate for outbound data from the AWS side of the connection. The number reported is the aggregate over the specified time period (5 minutes by default, 1 minute minimum).
Units: Packets per second

Metric: ConnectionPpsIngress
Description: The packet rate for inbound data to the AWS side of the connection. The number reported is the aggregate over the specified time period (5 minutes by default, 1 minute minimum).
Units: Packets per second

Metric: ConnectionCRCErrorCount
Description: The number of times cyclic redundancy check (CRC) errors are observed for the data received at the connection.
Units: Integer

Metric: ConnectionLightLevelTx
Description: Indicates the health of the fiber connection for egress (outbound) traffic from the AWS side of the connection. This metric is available for connections with 10 Gbps port speeds only.
Units: dBm

Metric: ConnectionLightLevelRx
Description: Indicates the health of the fiber connection for ingress (inbound) traffic to the AWS side of the connection. This metric is available for connections with 10 Gbps port speeds only.
Units: dBm

Dimensions for AWS Direct Connect Metrics

You can filter the AWS Direct Connect data using the following dimensions.

Dimension: ConnectionId
Description: This dimension filters the data by the AWS Direct Connect connection.

Amazon DynamoDB Metrics and Dimensions

Amazon DynamoDB sends metrics to CloudWatch. For more information, see Monitoring DynamoDB Tables with Amazon CloudWatch in the Amazon DynamoDB Developer Guide.

DynamoDB Metrics

The following metrics are available from Amazon DynamoDB. Note that DynamoDB only sends metrics to CloudWatch when they have a non-zero value. For example, the UserErrors metric is incremented whenever a request generates an HTTP 400 status code. If no HTTP 400 errors were encountered during a time period, CloudWatch will not provide metrics for UserErrors during that period.

Note

Amazon CloudWatch aggregates the following DynamoDB metrics at one-minute intervals:

• ConditionalCheckFailedRequests
• ConsumedReadCapacityUnits
• ConsumedWriteCapacityUnits
• ReadThrottleEvents
• ReturnedBytes
• ReturnedItemCount
• ReturnedRecordsCount
• SuccessfulRequestLatency
• SystemErrors
• TimeToLiveDeletedItemCount
• ThrottledRequests
• UserErrors
• WriteThrottleEvents

For all other DynamoDB metrics, the aggregation granularity is five minutes.

Not all statistics, such as Average or Sum, are applicable for every metric. However, all of these values are available through the Amazon DynamoDB console, or by using the CloudWatch console, AWS CLI, or AWS SDKs for all metrics. In the following table, each metric has a list of Valid Statistics that is applicable to that metric.

Metric: ConditionalCheckFailedRequests
Description: The number of failed attempts to perform conditional writes. The PutItem, UpdateItem, and DeleteItem operations let you provide a logical condition that must evaluate to true before the operation can proceed. If this condition evaluates to false, ConditionalCheckFailedRequests is incremented by one.
Note: A failed conditional write will result in an HTTP 400 error (Bad Request). These events are reflected in the ConditionalCheckFailedRequests metric, but not in the UserErrors metric.
Units: Count
Dimensions: TableName
Valid Statistics:
• Minimum
• Maximum
• Average
• SampleCount
• Sum

Metric: ConsumedReadCapacityUnits
Description: The number of read capacity units consumed over the specified time period, so you can track how much of your provisioned throughput is used. You can retrieve the total consumed read capacity for a table and all of its global secondary indexes, or for a particular global secondary index. For more information, see Provisioned Throughput in Amazon DynamoDB.
Note: Use the Sum statistic to calculate the consumed throughput. For example, get the Sum value over a span of one minute, and divide it by the number of seconds in a minute (60) to calculate the average ConsumedReadCapacityUnits per second (recognizing that this average will not highlight any large but brief spikes in read activity that occurred during that minute). You can compare the calculated value to the provisioned throughput value you provide DynamoDB.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• Minimum – Minimum number of read capacity units consumed by any individual request to the table or index.
• Maximum – Maximum number of read capacity units consumed by any individual request to the table or index.
• Average – Average per-request read capacity consumed.
• Sum – Total read capacity units consumed. This is the most useful statistic for the ConsumedReadCapacityUnits metric.
• SampleCount – Number of requests to DynamoDB that consumed read capacity.
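The per-second calculation described in the note for this metric can be sketched in Python as follows. The function names and the sample numbers are hypothetical; the sketch only performs the arithmetic on a Sum value that you have already retrieved from CloudWatch.

```python
def average_units_per_second(sum_over_period: float, period_seconds: int = 60) -> float:
    """Average consumed capacity units per second over one period.

    sum_over_period is the Sum statistic of ConsumedReadCapacityUnits
    (or ConsumedWriteCapacityUnits) for a period of period_seconds.
    """
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return sum_over_period / period_seconds


def fraction_of_provisioned(sum_over_period: float,
                            provisioned_units_per_second: float,
                            period_seconds: int = 60) -> float:
    """Share of the provisioned throughput that was consumed on average."""
    if provisioned_units_per_second <= 0:
        raise ValueError("provisioned_units_per_second must be positive")
    average = average_units_per_second(sum_over_period, period_seconds)
    return average / provisioned_units_per_second


if __name__ == "__main__":
    # Hypothetical sample: a Sum of 1200 units over one minute on a table
    # provisioned with 25 read capacity units per second.
    print(average_units_per_second(1200))
    print(fraction_of_provisioned(1200, 25))
```

Because the result is an average over the whole period, a short burst that exceeded the provisioned value can still be hidden behind a low fraction.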

Metric: ConsumedWriteCapacityUnits
Description: The number of write capacity units consumed over the specified time period, so you can track how much of your provisioned throughput is used. You can retrieve the total consumed write capacity for a table and all of its global secondary indexes, or for a particular global secondary index. For more information, see Provisioned Throughput in Amazon DynamoDB.
Note: Use the Sum statistic to calculate the consumed throughput. For example, get the Sum value over a span of one minute, and divide it by the number of seconds in a minute (60) to calculate the average ConsumedWriteCapacityUnits per second (recognizing that this average will not highlight any large but brief spikes in write activity that occurred during that minute). You can compare the calculated value to the provisioned throughput value you provide DynamoDB.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• Minimum – Minimum number of write capacity units consumed by any individual request to the table or index.
• Maximum – Maximum number of write capacity units consumed by any individual request to the table or index.
• Average – Average per-request write capacity consumed.
• Sum – Total write capacity units consumed. This is the most useful statistic for the ConsumedWriteCapacityUnits metric.
• SampleCount – Number of requests to DynamoDB that consumed write capacity.

Metric: OnlineIndexConsumedWriteCapacity
Description: The number of write capacity units consumed when adding a new global secondary index to a table. If the write capacity of the index is too low, incoming write activity during the backfill phase might be throttled; this can increase the time it takes to create the index. You should monitor this statistic while the index is being built to determine whether the write capacity of the index is underprovisioned.
You can adjust the write capacity of the index using the UpdateTable operation, even while the index is still being built.
Note that the ConsumedWriteCapacityUnits metric for the index does not include the write throughput consumed during index creation.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• Minimum
• Maximum
• Average
• SampleCount
• Sum

Metric: OnlineIndexPercentageProgress
Description: The percentage of completion when a new global secondary index is being added to a table. DynamoDB must first allocate resources for the new index, and then backfill attributes from the table into the index. For large tables, this process might take a long time. You should monitor this statistic to view the relative progress as DynamoDB builds the index.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• Minimum
• Maximum
• Average
• SampleCount
• Sum

Metric: OnlineIndexThrottleEvents
Description: The number of write throttle events that occur when adding a new global secondary index to a table. These events indicate that the index creation will take longer to complete, because incoming write activity is exceeding the provisioned write throughput of the index.
You can adjust the write capacity of the index using the UpdateTable operation, even while the index is still being built.
Note that the WriteThrottleEvents metric for the index does not include any throttle events that occur during index creation.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• Minimum
• Maximum
• Average
• SampleCount
• Sum

Metric: ProvisionedReadCapacityUnits
Description: The number of provisioned read capacity units for a table or a global secondary index.
The TableName dimension returns the ProvisionedReadCapacityUnits for the table, but not for any global secondary indexes. To view ProvisionedReadCapacityUnits for a global secondary index, you must specify both TableName and GlobalSecondaryIndexName.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• Minimum – Lowest setting for provisioned read capacity. If you use UpdateTable to increase read capacity, this metric shows the lowest value of provisioned ReadCapacityUnits during this time period.
• Maximum – Highest setting for provisioned read capacity. If you use UpdateTable to decrease read capacity, this metric shows the highest value of provisioned ReadCapacityUnits during this time period.
• Average – Average provisioned read capacity. The ProvisionedReadCapacityUnits metric is published at five-minute intervals. Therefore, if you rapidly adjust the provisioned read capacity units, this statistic might not reflect the true average.

Metric: ProvisionedWriteCapacityUnits
Description: The number of provisioned write capacity units for a table or a global secondary index.
The TableName dimension returns the ProvisionedWriteCapacityUnits for the table, but not for any global secondary indexes. To view ProvisionedWriteCapacityUnits for a global secondary index, you must specify both TableName and GlobalSecondaryIndexName.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• Minimum – Lowest setting for provisioned write capacity. If you use UpdateTable to increase write capacity, this metric shows the lowest value of provisioned WriteCapacityUnits during this time period.
• Maximum – Highest setting for provisioned write capacity. If you use UpdateTable to decrease write capacity, this metric shows the highest value of provisioned WriteCapacityUnits during this time period.
• Average – Average provisioned write capacity. The ProvisionedWriteCapacityUnits metric is published at five-minute intervals. Therefore, if you rapidly adjust the provisioned write capacity units, this statistic might not reflect the true average.

Metric: ReadThrottleEvents
Description: Requests to DynamoDB that exceed the provisioned read capacity units for a table or a global secondary index.
A single request can result in multiple events. For example, a BatchGetItem that reads 10 items is processed as ten GetItem events. For each event, ReadThrottleEvents is incremented by one if that event is throttled. The ThrottledRequests metric for the entire BatchGetItem is not incremented unless all ten of the GetItem events are throttled.
The TableName dimension returns the ReadThrottleEvents for the table, but not for any global secondary indexes. To view ReadThrottleEvents for a global secondary index, you must specify both TableName and GlobalSecondaryIndexName.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• SampleCount
• Sum

Metric: ReturnedBytes
Description: The number of bytes returned by GetRecords operations (Amazon DynamoDB Streams) during the specified time period.
Units: Bytes
Dimensions: Operation, StreamLabel, TableName
Valid Statistics:
• Minimum
• Maximum
• Average
• SampleCount
• Sum

Metric: ReturnedItemCount
Description: The number of items returned by Query or Scan operations during the specified time period.
Note that the number of items returned is not necessarily the same as the number of items that were evaluated. For example, suppose you requested a Scan on a table that had 100 items, but specified a FilterExpression that narrowed the results so that only 15 items were returned. In this case, the response from Scan would contain a ScannedCount of 100 and a Count of 15 returned items.
Units: Count
Dimensions: TableName, Operation
Valid Statistics:
• Minimum
• Maximum
• Average
• SampleCount
• Sum

Metric: ReturnedRecordsCount
Description: The number of stream records returned by GetRecords operations (Amazon DynamoDB Streams) during the specified time period.
Units: Count
Dimensions: Operation, StreamLabel, TableName
Valid Statistics:
• Minimum
• Maximum
• Average
• SampleCount
• Sum

Metric: SuccessfulRequestLatency
Description: Successful requests to DynamoDB or Amazon DynamoDB Streams during the specified time period. SuccessfulRequestLatency can provide two different kinds of information:
• The elapsed time for successful requests (Minimum, Maximum, Sum, or Average).
• The number of successful requests (SampleCount).
SuccessfulRequestLatency reflects activity only within DynamoDB or Amazon DynamoDB Streams, and does not take into account network latency or client-side activity.
Units: Milliseconds
Dimensions: TableName, Operation
Valid Statistics:
• Minimum
• Maximum
• Average
• SampleCount

Metric: SystemErrors
Description: Requests to DynamoDB or Amazon DynamoDB Streams that generate an HTTP 500 status code during the specified time period. An HTTP 500 usually indicates an internal service error.
Units: Count
Dimensions: All dimensions
Valid Statistics:
• Sum
• SampleCount

Metric: TimeToLiveDeletedItemCount
Description: The number of items deleted by Time To Live (TTL) during the specified time period. This metric helps you monitor the rate of TTL deletions on your table.
Units: Count
Dimensions: TableName
Valid Statistics:
• Sum

Metric: ThrottledRequests
Description: Requests to DynamoDB that exceed the provisioned throughput limits on a resource (such as a table or an index).
ThrottledRequests is incremented by one if any event within a request exceeds a provisioned throughput limit. For example, if you update an item in a table with global secondary indexes, there are multiple events: a write to the table, and a write to each index. If one or more of these events are throttled, then ThrottledRequests is incremented by one.
Note: In a batch request (BatchGetItem or BatchWriteItem), ThrottledRequests is only incremented if every request in the batch is throttled. If any individual request within the batch is throttled, one of the following metrics is incremented:
• ReadThrottleEvents – For a throttled GetItem event within BatchGetItem.
• WriteThrottleEvents – For a throttled PutItem or DeleteItem event within BatchWriteItem.
To gain insight into which event is throttling a request, compare ThrottledRequests with the ReadThrottleEvents and WriteThrottleEvents for the table and its indexes.
Note: A throttled request will result in an HTTP 400 status code. All such events are reflected in the ThrottledRequests metric, but not in the UserErrors metric.
Units: Count
Dimensions: TableName, Operation
Valid Statistics:
• Sum
• SampleCount
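The batch rule above can be illustrated with a small Python model. This is only a sketch of the counting rule as it is described in this section, not DynamoDB code; the function name and the list-of-booleans input are hypothetical.

```python
from typing import List, Tuple


def count_batch_get_throttling(event_throttled: List[bool]) -> Tuple[int, int]:
    """Model the metric increments for one BatchGetItem request.

    event_throttled holds one flag per GetItem event in the batch
    (True means that event was throttled).
    Returns (ReadThrottleEvents increment, ThrottledRequests increment).
    """
    # Every throttled GetItem event adds one to ReadThrottleEvents.
    read_throttle_events = sum(1 for throttled in event_throttled if throttled)
    # ThrottledRequests only increases when every event in the batch is throttled.
    all_throttled = bool(event_throttled) and all(event_throttled)
    throttled_requests = 1 if all_throttled else 0
    return read_throttle_events, throttled_requests


if __name__ == "__main__":
    # Three of ten GetItem events throttled: events are counted, the request is not.
    print(count_batch_get_throttling([True] * 3 + [False] * 7))
    # All ten events throttled: the batch request itself is counted as throttled.
    print(count_batch_get_throttling([True] * 10))
```

The same rule applies to BatchWriteItem, with WriteThrottleEvents in place of ReadThrottleEvents.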

Metric: UserErrors
Description: Requests to DynamoDB or Amazon DynamoDB Streams that generate an HTTP 400 status code during the specified time period. An HTTP 400 usually indicates a client-side error such as an invalid combination of parameters, attempting to update a nonexistent table, or an incorrect request signature.
All such events are reflected in the UserErrors metric, except for the following:
• ProvisionedThroughputExceededException – See the ThrottledRequests metric in this section.
• ConditionalCheckFailedException – See the ConditionalCheckFailedRequests metric in this section.
UserErrors represents the aggregate of HTTP 400 errors for DynamoDB or Amazon DynamoDB Streams requests for the current region and the current AWS account.
Units: Count
Valid Statistics:
• Sum
• SampleCount

Metric: WriteThrottleEvents
Description: Requests to DynamoDB that exceed the provisioned write capacity units for a table or a global secondary index.
A single request can result in multiple events. For example, a PutItem request on a table with three global secondary indexes would result in four events: the table write, and each of the three index writes. For each event, the WriteThrottleEvents metric is incremented by one if that event is throttled. For single PutItem requests, if any of the events are throttled, ThrottledRequests is also incremented by one. For BatchWriteItem, the ThrottledRequests metric for the entire BatchWriteItem is not incremented unless all of the individual PutItem or DeleteItem events are throttled.
The TableName dimension returns the WriteThrottleEvents for the table, but not for any global secondary indexes. To view WriteThrottleEvents for a global secondary index, you must specify both TableName and GlobalSecondaryIndexName.
Units: Count
Dimensions: TableName, GlobalSecondaryIndexName
Valid Statistics:
• Sum
• SampleCount
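The way HTTP errors are split between UserErrors, ThrottledRequests, ConditionalCheckFailedRequests, and SystemErrors in this section can be summarized in a short Python sketch. The function name is hypothetical and the mapping is a simplification of the descriptions above, not a definitive statement of how DynamoDB classifies every error.

```python
def metric_for_error(status_code: int, error_code: str = "") -> str:
    """Return the metric that reflects an error, per the descriptions above.

    status_code is the HTTP status (400 or 500); error_code is the
    exception name returned with an HTTP 400 response, if any.
    """
    if status_code == 500:
        return "SystemErrors"
    if status_code == 400:
        # These two HTTP 400 errors are excluded from UserErrors.
        if error_code == "ProvisionedThroughputExceededException":
            return "ThrottledRequests"
        if error_code == "ConditionalCheckFailedException":
            return "ConditionalCheckFailedRequests"
        return "UserErrors"
    raise ValueError("only HTTP 400 and 500 are covered by this sketch")


if __name__ == "__main__":
    print(metric_for_error(400, "ConditionalCheckFailedException"))
    print(metric_for_error(400, "ValidationException"))
```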


Dimensions for DynamoDB Metrics

The metrics for DynamoDB are qualified by the values for the account, table name, global secondary index name, or operation. You can use the CloudWatch console to retrieve DynamoDB data along any of the dimensions in the table below.

Dimension: GlobalSecondaryIndexName
Description: This dimension limits the data to a global secondary index on a table. If you specify GlobalSecondaryIndexName, you must also specify TableName.

Dimension: Operation
Description: This dimension limits the data to one of the following DynamoDB operations:
• PutItem
• DeleteItem
• UpdateItem
• GetItem
• BatchGetItem
• Scan
• Query
• BatchWriteItem
In addition, you can limit the data to the following Amazon DynamoDB Streams operation:
• GetRecords

Dimension: StreamLabel
Description: This dimension limits the data to a specific stream label. It is used with metrics originating from Amazon DynamoDB Streams GetRecords operations.

Dimension: TableName
Description: This dimension limits the data to a specific table. This value can be any table name in the current region and the current AWS account.
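The rule that GlobalSecondaryIndexName must be accompanied by TableName can be enforced when building a request. The following Python sketch assembles a dimension list in the Name/Value shape that the CloudWatch GetMetricStatistics API accepts. The helper name and the sample table and index names are hypothetical, and the sketch does not call AWS.

```python
from typing import Dict, List, Optional


def dynamodb_dimensions(table_name: Optional[str] = None,
                        index_name: Optional[str] = None,
                        operation: Optional[str] = None) -> List[Dict[str, str]]:
    """Build a CloudWatch dimension list for DynamoDB metrics (hypothetical helper)."""
    if index_name and not table_name:
        # GlobalSecondaryIndexName is only valid together with TableName.
        raise ValueError("GlobalSecondaryIndexName requires TableName")
    dimensions = []
    if table_name:
        dimensions.append({"Name": "TableName", "Value": table_name})
    if index_name:
        dimensions.append({"Name": "GlobalSecondaryIndexName", "Value": index_name})
    if operation:
        dimensions.append({"Name": "Operation", "Value": operation})
    return dimensions


if __name__ == "__main__":
    # Hypothetical names. The result could be passed as the Dimensions argument
    # of a GetMetricStatistics request, for example with boto3:
    #   cloudwatch.get_metric_statistics(Namespace="AWS/DynamoDB", ...,
    #                                    Dimensions=dims)
    dims = dynamodb_dimensions("Orders", index_name="ByCustomer")
    print(dims)
```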

Amazon EC2 Metrics and Dimensions

Amazon Elastic Compute Cloud (Amazon EC2) sends metrics to CloudWatch for your EC2 instances. Basic (five-minute) monitoring is enabled by default. You can enable detailed (one-minute) monitoring. For information about additional metrics for Amazon EC2 instances that are in an Auto Scaling group, see Auto Scaling Metrics and Dimensions.

For more information about how to monitor Amazon EC2, see Monitoring Your Instances with CloudWatch in the Amazon EC2 User Guide for Linux Instances.

Amazon EC2 Metrics

The following metrics are available from each EC2 instance.

The AWS/EC2 namespace includes the following CPU credit metrics for your T2 instances.

Metric: CPUCreditUsage
Description: [T2 instances] The number of CPU credits consumed by the instance. One CPU credit equals one vCPU running at 100% utilization for one minute or an equivalent combination of vCPUs, utilization, and time (for example, one vCPU running at 50% utilization for two minutes or two vCPUs running at 25% utilization for two minutes).
CPU credit metrics are available only at a 5 minute frequency. If you specify a period greater than five minutes, use the Sum statistic instead of the Average statistic.
Units: Count

Metric: CPUCreditBalance
Description: [T2 instances] The number of CPU credits available for the instance to burst beyond its base CPU utilization. Credits are stored in the credit balance after they are earned and removed from the credit balance after they expire. Credits expire 24 hours after they are earned.
CPU credit metrics are available only at a 5 minute frequency.
Units: Count
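The CPU credit definition above is a product of vCPU count, utilization, and time. The following Python sketch reproduces the examples from the CPUCreditUsage description; the function name is hypothetical and the sketch only models the definition, not how EC2 accounts for credits internally.

```python
def cpu_credits_consumed(vcpus: int, utilization: float, minutes: float) -> float:
    """CPU credits used by `vcpus` running at `utilization` (0.0 to 1.0) for `minutes`.

    One credit equals one vCPU at 100% utilization for one minute.
    """
    if vcpus < 0 or minutes < 0:
        raise ValueError("vcpus and minutes must not be negative")
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return vcpus * utilization * minutes


if __name__ == "__main__":
    # Each of these combinations corresponds to one CPU credit.
    print(cpu_credits_consumed(1, 1.0, 1))
    print(cpu_credits_consumed(1, 0.5, 2))
    print(cpu_credits_consumed(2, 0.25, 2))
```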

The AWS/EC2 namespace includes the following instance metrics.

Metric: CPUUtilization
Description: The percentage of allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application upon a selected instance.
To use the percentiles statistic, you must enable detailed monitoring.
Depending on the instance type, tools in your operating system can show a lower percentage than CloudWatch when the instance is not allocated a full processor core.
Units: Percent

Metric: DiskReadOps
Description: Completed read operations from all instance store volumes available to the instance in a specified period of time.
To calculate the average I/O operations per second (IOPS) for the period, divide the total operations in the period by the number of seconds in that period.
Units: Count

Metric: DiskWriteOps
Description: Completed write operations to all instance store volumes available to the instance in a specified period of time.
To calculate the average I/O operations per second (IOPS) for the period, divide the total operations in the period by the number of seconds in that period.
Units: Count

Metric: DiskReadBytes
Description: Bytes read from all instance store volumes available to the instance.
This metric is used to determine the volume of the data the application reads from the hard disk of the instance. This can be used to determine the speed of the application.
The number reported is the number of bytes received during the period. If you are using basic (five-minute) monitoring, you can divide this number by 300 to find Bytes/second. If you have detailed (one-minute) monitoring, divide it by 60.
Units: Bytes

Metric: DiskWriteBytes
Description: Bytes written to all instance store volumes available to the instance.
This metric is used to determine the volume of the data the application writes onto the hard disk of the instance. This can be used to determine the speed of the application.
The number reported is the number of bytes received during the period. If you are using basic (five-minute) monitoring, you can divide this number by 300 to find Bytes/second. If you have detailed (one-minute) monitoring, divide it by 60.
Units: Bytes

Metric: NetworkIn
Description: The number of bytes received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to a single instance.
The number reported is the number of bytes received during the period. If you are using basic (five-minute) monitoring, you can divide this number by 300 to find Bytes/second. If you have detailed (one-minute) monitoring, divide it by 60.
Units: Bytes

Metric: NetworkOut
Description: The number of bytes sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic from a single instance.
The number reported is the number of bytes received during the period. If you are using basic (five-minute) monitoring, you can divide this number by 300 to find Bytes/second. If you have detailed (one-minute) monitoring, divide it by 60.
Units: Bytes

Metric: NetworkPacketsIn
Description: The number of packets received on all network interfaces by the instance. This metric identifies the volume of incoming traffic in terms of the number of packets on a single instance. This metric is available for basic monitoring only.
Units: Count
Statistics: Minimum, Maximum, Average

Metric: NetworkPacketsOut
Description: The number of packets sent out on all network interfaces by the instance. This metric identifies the volume of outgoing traffic in terms of the number of packets on a single instance. This metric is available for basic monitoring only.
Units: Count
Statistics: Minimum, Maximum, Average
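The per-second conversions mentioned for the disk and network metrics (divide by 300 for basic monitoring, by 60 for detailed monitoring) can be sketched in Python as follows. The constant and function names are hypothetical, and the input is a period total that you have already retrieved from CloudWatch.

```python
BASIC_PERIOD_SECONDS = 300     # basic (five-minute) monitoring
DETAILED_PERIOD_SECONDS = 60   # detailed (one-minute) monitoring


def rate_per_second(total_in_period: float, detailed_monitoring: bool = False) -> float:
    """Convert a per-period total (bytes or operations) to an average per-second rate."""
    if detailed_monitoring:
        period = DETAILED_PERIOD_SECONDS
    else:
        period = BASIC_PERIOD_SECONDS
    return total_in_period / period


if __name__ == "__main__":
    # Hypothetical sample: 150,000 read operations reported for one period.
    print(rate_per_second(150_000))                            # average IOPS, basic
    print(rate_per_second(150_000, detailed_monitoring=True))  # average IOPS, detailed
```

The same helper applies to DiskReadBytes, DiskWriteBytes, NetworkIn, and NetworkOut, where the result is in bytes per second.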


The AWS/EC2 namespace includes the following status check metrics. Status check metrics are available at a 1 minute frequency. For a newly launched instance, status check metric data is only available after the instance has completed the initialization state (within a few minutes of the instance entering the running state). For more information about EC2 status checks, see Status Checks For Your Instances.

Metric: StatusCheckFailed
Description: Reports whether the instance has passed both the instance status check and the system status check in the last minute.
This metric can be either 0 (passed) or 1 (failed).
Units: Count

Metric: StatusCheckFailed_Instance
Description: Reports whether the instance has passed the instance status check in the last minute.
This metric can be either 0 (passed) or 1 (failed).
Units: Count

Metric: StatusCheckFailed_System
Description: Reports whether the instance has passed the system status check in the last minute.
This metric can be either 0 (passed) or 1 (failed).
Units: Count

Amazon CloudWatch data for a new EC2 instance typically becomes available within one minute of the end of the first period of time requested (the aggregation period) in the query. You can set the period (the length of time over which statistics are aggregated) with the Period parameter. For more information on periods, see Periods.

You can use the currently available dimensions for EC2 instances (for example, ImageId or InstanceType) to refine the metrics returned. For information about the dimensions you can use with EC2, see Dimensions for Amazon EC2 Metrics.

Dimensions for Amazon EC2 Metrics

If you're using Detailed Monitoring, you can filter the EC2 instance data using any of the dimensions in the following table.

Dimension: AutoScalingGroupName
Description: This dimension filters the data you request for all instances in a specified capacity group. An Auto Scaling group is a collection of instances you define if you're using Auto Scaling. This dimension is available only for Amazon EC2 metrics when the instances are in such an Auto Scaling group. Available for instances with Detailed or Basic Monitoring enabled.

Dimension: ImageId
Description: This dimension filters the data you request for all instances running this Amazon EC2 Amazon Machine Image (AMI). Available for instances with Detailed Monitoring enabled.

Dimension: InstanceId
Description: This dimension filters the data you request for the identified instance only. This helps you pinpoint an exact instance from which to monitor data.

Dimension: InstanceType
Description: This dimension filters the data you request for all instances running with this specified instance type. This helps you categorize your data by the type of instance running. For example, you might compare data from an m1.small instance and an m1.large instance to determine which has the better business value for your application. Available for instances with Detailed Monitoring enabled.

Amazon EC2 Spot Fleet Metrics and Dimensions

Amazon Elastic Compute Cloud (Amazon EC2) sends information about your Spot fleet to CloudWatch. For more information, see CloudWatch Metrics for Spot Fleet in the Amazon EC2 User Guide for Linux Instances.

Amazon EC2 Spot Fleet Metrics

The AWS/EC2Spot namespace includes the following metrics, plus the CloudWatch metrics for the Spot instances in your fleet.

Metric: AvailableInstancePoolsCount
Description: The Spot Instance pools specified in the Spot Fleet request.
Units: Count

Metric: BidsSubmittedForCapacity
Description: The capacity for which Amazon EC2 has submitted bids.
Units: Count

Metric: EligibleInstancePoolCount
Description: The Spot Instance pools specified in the Spot Fleet request where Amazon EC2 can fulfill bids. Amazon EC2 will not fulfill bids in pools where your bid price is less than the Spot price or the Spot price is greater than the price for On-Demand instances.
Units: Count

Metric: FulfilledCapacity
Description: The capacity that Amazon EC2 has fulfilled.
Units: Count

Metric: MaxPercentCapacityAllocation
Description: The maximum value of PercentCapacityAllocation across all Spot Instance pools specified in the Spot Fleet request.
Units: Percent

Metric: PendingCapacity
Description: The difference between TargetCapacity and FulfilledCapacity.
Units: Count

Metric: PercentCapacityAllocation
Description: The capacity allocated for the Spot Instance pool for the specified dimensions. To get the maximum value recorded across all Spot Instance pools, use MaxPercentCapacityAllocation.
Units: Percent

Metric: TargetCapacity
Description: The target capacity of the Spot Fleet request.
Units: Count

Metric: TerminatingCapacity
Description: The capacity that is being terminated due to Spot Instance interruptions.
Units: Count

If the unit of measure for a metric is Count, the most useful statistic is Average.
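Two of the Spot Fleet metrics are derived from others: PendingCapacity is the difference between TargetCapacity and FulfilledCapacity, and MaxPercentCapacityAllocation is the largest PercentCapacityAllocation across pools. The following Python sketch shows both relationships; the function names, the pool labels, and the sample values are hypothetical.

```python
from typing import Dict


def pending_capacity(target_capacity: int, fulfilled_capacity: int) -> int:
    """PendingCapacity as the difference between TargetCapacity and FulfilledCapacity."""
    return target_capacity - fulfilled_capacity


def max_percent_capacity_allocation(percent_by_pool: Dict[str, float]) -> float:
    """Largest PercentCapacityAllocation value across the given Spot Instance pools."""
    if not percent_by_pool:
        raise ValueError("at least one pool is required")
    return max(percent_by_pool.values())


if __name__ == "__main__":
    print(pending_capacity(target_capacity=10, fulfilled_capacity=7))
    # Pool labels are illustrative only.
    print(max_percent_capacity_allocation({"pool-a": 40.0, "pool-b": 60.0}))
```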

Dimensions for Amazon EC2 Spot Fleet Metrics

You can filter the data using the following dimensions.

Dimension: AvailabilityZone
Description: Filter the data by Availability Zone.

Dimension: FleetRequestId
Description: Filter the data by Spot Fleet request.

Dimension: InstanceType
Description: Filter the data by instance type.

Amazon ECS Metrics and Dimensions

Amazon EC2 Container Service (Amazon ECS) sends metrics to Amazon CloudWatch. For more information, see Amazon ECS CloudWatch Metrics in the Amazon EC2 Container Service Developer Guide.

Amazon ECS Metrics

Amazon ECS provides metrics for you to monitor the CPU and memory reservation and utilization across your cluster as a whole, and the CPU and memory utilization on the services in your clusters.


Amazon ECS sends the following metrics to CloudWatch every minute. When Amazon ECS collects metrics, it collects multiple data points per customer instance per minute. It then aggregates them to one data point before sending the data to CloudWatch. So in CloudWatch, one sample count is actually the aggregate of multiple data points per instance during one minute.

CPUReservation

The percentage of CPU units that are reserved by running tasks in the cluster.

Cluster CPU reservation (this metric can only be filtered by ClusterName) is measured as the total CPU units that are reserved by Amazon ECS tasks on the cluster, divided by the total CPU units that were registered for all of the container instances in the cluster.

Valid Dimensions: ClusterName

Valid Statistics: Average, Minimum, Maximum, Sum, Data Samples.

Unit: Percent

CPUUtilization

The percentage of CPU units that are used in the cluster or service.

Cluster CPU utilization (metrics that are filtered by ClusterName without ServiceName) is measured as the total CPU units in use by Amazon ECS tasks on the cluster, divided by the total CPU units that were registered for all of the container instances in the cluster.

Service CPU utilization (metrics that are filtered by ClusterName and ServiceName) is measured as the total CPU units in use by the tasks that belong to the service, divided by the total number of CPU units that are reserved for the tasks that belong to the service.

Valid Dimensions: ClusterName, ServiceName

Valid Statistics: Average, Minimum, Maximum, Sum, Data Samples.

Unit: Percent

MemoryReservation

The percentage of memory that is reserved by running tasks in the cluster.

Cluster memory reservation (this metric can only be filtered by ClusterName) is measured as the total memory that is reserved by Amazon ECS tasks on the cluster, divided by the total amount of memory that was registered for all of the container instances in the cluster.

Valid Dimensions: ClusterName

Valid Statistics: Average, Minimum, Maximum, Sum, Data Samples.

Unit: Percent

MemoryUtilization

The percentage of memory that is used in the cluster or service.

Cluster memory utilization (metrics that are filtered by ClusterName without ServiceName) is measured as the total memory in use by Amazon ECS tasks on the cluster, divided by the total amount of memory that was registered for all of the container instances in the cluster.

Service memory utilization (metrics that are filtered by ClusterName and ServiceName) is measured as the total memory in use by the tasks that belong to the service, divided by the total memory that is reserved for the tasks that belong to the service.

Valid Dimensions: ClusterName, ServiceName

Valid Statistics: Average, Minimum, Maximum, Sum, Data Samples.

Unit: Percent
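The cluster-level and service-level utilization definitions differ only in their denominator: registered capacity for a cluster, reserved capacity for a service. The following Python sketch illustrates that arithmetic. The function names and the sample numbers are hypothetical, and returning 0.0 for a zero denominator is an assumption made for the example, not documented Amazon ECS behavior.

```python
def percent_of(part: float, whole: float) -> float:
    """Return part / whole as a percentage; 0.0 if whole is zero (an assumption)."""
    if whole == 0:
        return 0.0
    return 100.0 * part / whole


def cluster_cpu_utilization(units_in_use: float, units_registered: float) -> float:
    """Cluster level: CPU units in use by tasks / CPU units registered by container instances."""
    return percent_of(units_in_use, units_registered)


def service_cpu_utilization(units_in_use: float, units_reserved: float) -> float:
    """Service level: CPU units in use by the service's tasks / CPU units reserved for them."""
    return percent_of(units_in_use, units_reserved)


# Hypothetical numbers: two container instances that each register 1024 CPU units.
registered = 2 * 1024
cluster_pct = cluster_cpu_utilization(units_in_use=512, units_registered=registered)
service_pct = service_cpu_utilization(units_in_use=512, units_reserved=1024)
print(cluster_pct, service_pct)
```

The same two ratios apply to memory utilization, with memory amounts in place of CPU units.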

Note

On Linux instances, the Amazon ECS container agent relies on Docker stats metrics to gather CPU and memory data for each container running on the instance. If you are using an Amazon ECS agent prior to version 1.14.0, ECS includes filesystem cache usage when reporting memory utilization to CloudWatch, so your CloudWatch graphs show a higher than actual memory utilization for tasks. To remediate this, starting with Amazon ECS agent version 1.14.0, the Amazon ECS container agent excludes the filesystem cache usage from the memory utilization metric. This change does not impact the out-of-memory behavior of containers.

Dimensions for Amazon ECS Metrics

Amazon ECS metrics use the AWS/ECS namespace and provide metrics for the following dimensions:

Dimension | Description
ClusterName | This dimension filters the data you request for all resources in a specified cluster. All Amazon ECS metrics are filtered by ClusterName.
ServiceName | This dimension filters the data you request for all resources in a specified service within a specified cluster.
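As a rough illustration of how these two dimensions select cluster-level versus service-level data, the sketch below builds dimension lists in the Name/Value shape that CloudWatch metric requests use. The helper name `ecs_dimensions` and the cluster and service names are hypothetical.

```python
from typing import Dict, List, Optional


def ecs_dimensions(cluster_name: str, service_name: Optional[str] = None) -> List[Dict[str, str]]:
    """Build a CloudWatch dimension list for the AWS/ECS namespace.

    ClusterName alone selects cluster-level data; adding ServiceName
    selects data for one service within that cluster.
    """
    dimensions = [{"Name": "ClusterName", "Value": cluster_name}]
    if service_name is not None:
        dimensions.append({"Name": "ServiceName", "Value": service_name})
    return dimensions


# Hypothetical resource names.
cluster_level = ecs_dimensions("my-cluster")
service_level = ecs_dimensions("my-cluster", "my-service")
print(cluster_level)
print(service_level)
```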


AWS Elastic Beanstalk Metrics and Dimensions

AWS Elastic Beanstalk sends metrics to Amazon CloudWatch. For more information, see Publishing Amazon CloudWatch Custom Metrics for an Environment in the AWS Elastic Beanstalk Developer Guide.

Elastic Beanstalk Metrics

The AWS/ElasticBeanstalk namespace includes the following metrics.


Metric | Description
EnvironmentHealth | [Environment] The health status of the environment. The possible values are 0 (OK), 1 (Info), 5 (Unknown), 10 (No data), 15 (Warning), 20 (Degraded), and 25 (Severe).
InstancesOk | [Environment] The number of instances with OK health status.
InstancesPending | [Environment] The number of instances with Pending health status.
InstancesInfo | [Environment] The number of instances with Info health status.
InstancesUnknown | [Environment] The number of instances with Unknown health status.
InstancesNoData | [Environment] The number of instances with no health status data.
InstancesWarning | [Environment] The number of instances with Warning health status.
InstancesDegraded | [Environment] The number of instances with Degraded health status.
InstancesSevere | [Environment] The number of instances with Severe health status.
ApplicationRequestsTotal | The number of requests completed by the instance or environment.
ApplicationRequests2xx | The number of requests that completed with a 2XX status code.
ApplicationLatencyP10 | The average time to complete the fastest 10 percent of requests.
ApplicationLatencyP99.9 | The average time to complete the fastest 99.9 percent of requests.
LoadAverage1min | [Instance] The average CPU load over the last minute.
InstanceHealth | [Instance] The health status of the instance.
RootFilesystemUtil | [Instance] The percentage of disk space in use.
CPUIrq, CPUUser, CPUIdle, CPUSystem, CPUSoftirq, CPUIowait, CPUNice | [Instance] The percentage of time the CPU was in this state in the last minute.
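Because EnvironmentHealth is reported as a number, it can help to translate a data point back into its status name. The sketch below uses only the value-to-status pairs listed for EnvironmentHealth; the function name and the "Unrecognized" fallback are hypothetical choices for this example.

```python
# Numeric EnvironmentHealth values and their status names, as listed in the metric description.
ENVIRONMENT_HEALTH = {
    0: "OK",
    1: "Info",
    5: "Unknown",
    10: "No data",
    15: "Warning",
    20: "Degraded",
    25: "Severe",
}


def describe_environment_health(value: float) -> str:
    """Map an EnvironmentHealth data point to its status name.

    Values that are not in the documented list return "Unrecognized"
    (a fallback chosen for this sketch).
    """
    return ENVIRONMENT_HEALTH.get(int(value), "Unrecognized") if value == int(value) else "Unrecognized"


print(describe_environment_health(15.0))
print(describe_environment_health(7.5))
```

Note that a statistic such as Average over several data points can fall between two listed values, in which case no single status name applies.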







Dimensions for Elastic Beanstalk Metrics

You can filter the data using the following dimensions.

Dimension | Description
EnvironmentName | Filter the data by environment.
InstanceId | Filter the data by instance.

Amazon ElastiCache Metrics and Dimensions

Amazon ElastiCache sends metrics to Amazon CloudWatch. For more information, see Viewing Cache Cluster and Cache Node Metrics in the Amazon ElastiCache User Guide.

Contents

• Dimensions for ElastiCache Metrics

• Host-Level Metrics

• Metrics for Memcached

• Metrics for Redis

Dimensions for ElastiCache Metrics

All ElastiCache metrics use the AWS/ElastiCache namespace and provide metrics for a single dimension, the CacheNodeId, which is the automatically generated identifier for each cache node in the cache cluster. You can find out what these values are for your cache nodes by using the DescribeCacheClusters API or describe-cache-clusters command line utility. For more information, see DescribeCacheClusters in the Amazon ElastiCache API Reference and describe-cache-clusters in the AWS Command Line Interface Reference.

Each metric is published under a single set of dimensions. When retrieving metrics, you must supply both the CacheClusterId and CacheNodeId dimensions.
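The requirement to supply both dimensions can be sketched as a small request builder. The parameter names in the returned dictionary follow the CloudWatch GetMetricStatistics request; the function name, the default time window, the choice of the Average statistic, and the sample IDs are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Dict


def elasticache_metric_request(cache_cluster_id: str, cache_node_id: str,
                               metric_name: str, minutes: int = 60,
                               period_seconds: int = 60) -> Dict[str, Any]:
    """Build GetMetricStatistics-style parameters for one cache node.

    Both CacheClusterId and CacheNodeId are required, so a missing
    value is rejected before any request is made.
    """
    if not cache_cluster_id or not cache_node_id:
        raise ValueError("Both CacheClusterId and CacheNodeId must be supplied")
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/ElastiCache",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "CacheClusterId", "Value": cache_cluster_id},
            {"Name": "CacheNodeId", "Value": cache_node_id},
        ],
        "StartTime": end - timedelta(minutes=minutes),
        "EndTime": end,
        "Period": period_seconds,
        "Statistics": ["Average"],
    }


# Hypothetical cluster and node IDs.
request = elasticache_metric_request("my-cache", "0001", "CPUUtilization")
print(request["Dimensions"])
```

A dictionary like this could be passed as keyword arguments to an SDK call that wraps GetMetricStatistics.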


Host-Level Metrics

The AWS/ElastiCache namespace includes the following host-level metrics for individual cache nodes.


Metric | Description | Unit
CPUUtilization | The percentage of CPU utilization. | Percent
FreeableMemory | The amount of free memory available on the host. | Bytes
NetworkBytesIn | The number of bytes the host has read from the network. | Bytes
NetworkBytesOut | The number of bytes the host has written to the network. | Bytes
SwapUsage | The amount of swap used on the host. | Bytes

Metrics for Memcached

The AWS/ElastiCache namespace includes the following metrics that are derived from the Memcached stats command. Each metric is calculated at the cache node level.

For complete documentation of the Memcached stats command, go to https://github.com/memcached/memcached/blob/master/doc/protocol.txt.


Metric | Description | Unit
BytesReadIntoMemcached | The number of bytes that have been read from the network by the cache node. | Bytes
BytesUsedForCacheItems | The number of bytes used to store cache items. | Bytes
BytesWrittenOutFromMemcached | The number of bytes that have been written to the network by the cache node. | Bytes
CasBadval | The number of CAS (check and set) requests the cache has received where the Cas value did not match the Cas value stored. | Count
CasHits | The number of Cas requests the cache has received where the requested key was found and the Cas value matched. | Count
CasMisses | The number of Cas requests the cache has received where the key requested was not found. | Count
CmdFlush | The number of flush commands the cache has received. | Count
CmdGet | The number of get commands the cache has received. | Count
CmdSet | The number of set commands the cache has received. | Count
CurrConnections | A count of the number of connections connected to the cache at an instant in time. ElastiCache uses two to three of the connections to monitor the cluster in each case. | Count
CurrItems | A count of the number of items currently stored in the cache. | Count
DecrHits | The number of decrement requests the cache has received where the requested key was found. | Count
DecrMisses | The number of decrement requests the cache has received where the requested key was not found. | Count
DeleteHits | The number of delete requests the cache has received where the requested key was found. | Count
DeleteMisses | The number of delete requests the cache has received where the requested key was not found. | Count
Evictions | The number of non-expired items the cache evicted to allow space for new writes. | Count
GetHits | The number of get requests the cache has received where the key requested was found. | Count
GetMisses | The number of get requests the cache has received where the key requested was not found. | Count
IncrHits | The number of increment requests the cache has received where the key requested was found. | Count
IncrMisses | The number of increment requests the cache has received where the key requested was not found. | Count
Reclaimed | The number of expired items the cache evicted to allow space for new writes. | Count

For Memcached 1.4.14, the following additional metrics are provided.

Metric | Description | Unit
BytesUsedForHash | The number of bytes currently used by hash tables. | Bytes
CmdConfigGet | The cumulative number of config get requests. | Count
CmdConfigSet | The cumulative number of config set requests. | Count
CmdTouch | The cumulative number of touch requests. | Count
CurrConfig | The current number of configurations stored. | Count
EvictedUnfetched | The number of valid items evicted from the least recently used cache (LRU) which were never touched after being set. | Count
ExpiredUnfetched | The number of expired items reclaimed from the LRU which were never touched after being set. | Count
SlabsMoved | The total number of slab pages that have been moved. | Count
TouchHits | The number of keys that have been touched and were given a new expiration time. | Count
TouchMisses | The number of items that have been touched, but were not found. | Count

The AWS/ElastiCache namespace includes the following calculated cache-level metrics.

Metric | Description | Unit
NewConnections | The number of new connections the cache has received. This is derived from the memcached total_connections statistic by recording the change in total_connections across a period of time. This will always be at least 1, due to a connection reserved for ElastiCache. | Count
NewItems | The number of new items the cache has stored. This is derived from the memcached total_items statistic by recording the change in total_items across a period of time. | Count
UnusedMemory | The amount of memory not used by data. This is derived from the Memcached statistics limit_maxbytes and bytes by subtracting bytes from limit_maxbytes. Because Memcached overhead uses memory in addition to that used by data, UnusedMemory should not be considered to be the amount of memory available for additional data. You may experience evictions even though you still have some unused memory. For more detailed information, see Memcached item memory usage. | Bytes
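The three calculated metrics are simple differences over Memcached statistics. The following sketch shows the arithmetic as described: two of the metrics are changes in a cumulative counter between two samples, and the third is a subtraction within one sample. The function name and the sample values are hypothetical; how ElastiCache itself samples the statistics is not specified here.

```python
from typing import Dict


def derive_memcached_metrics(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, int]:
    """Derive the calculated metrics from two snapshots of Memcached stats.

    `previous` and `current` are snapshots taken at the start and end of a period.
    """
    return {
        # Change in the cumulative total_connections counter across the period.
        "NewConnections": current["total_connections"] - previous["total_connections"],
        # Change in the cumulative total_items counter across the period.
        "NewItems": current["total_items"] - previous["total_items"],
        # limit_maxbytes minus bytes, taken from the latest snapshot.
        "UnusedMemory": current["limit_maxbytes"] - current["bytes"],
    }


# Hypothetical snapshots one period apart.
before = {"total_connections": 120, "total_items": 5000, "limit_maxbytes": 1073741824, "bytes": 400000000}
after = {"total_connections": 127, "total_items": 5350, "limit_maxbytes": 1073741824, "bytes": 450000000}
derived = derive_memcached_metrics(before, after)
print(derived)
```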

Metrics for Redis

The AWS/ElastiCache namespace includes the following Redis metrics.

With the exception of ReplicationLag, these metrics are derived from the Redis info command. Each metric is calculated at the cache node level.

For complete documentation of the Redis info command, go to http://redis.io/commands/info.


Metric | Description | Unit
BytesUsedForCache | The total number of bytes allocated by Redis. | Bytes
CacheHits | The number of successful key lookups. | Count
CacheMisses | The number of unsuccessful key lookups. | Count
CurrConnections | The number of client connections, excluding connections from read replicas. ElastiCache uses two to three of the connections to monitor the cluster in each case. | Count
Evictions | The number of keys that have been evicted due to the maxmemory limit. | Count
HyperLogLogBasedCmds | The total number of HyperLogLog based commands. This is derived from the Redis commandstats statistic by summing all of the pf type of commands (pfadd, pfcount, pfmerge). | Count
NewConnections | The total number of connections that have been accepted by the server during this period. | Count
Reclaimed | The total number of key expiration events. | Count
ReplicationBytes | For primaries with attached replicas, ReplicationBytes reports the number of bytes that the primary is sending to all of its replicas. This metric is representative of the write load on the replication group. For replicas and standalone primaries, ReplicationBytes is always 0. | Bytes
ReplicationLag | This metric is only applicable for a node running as a read replica. It represents how far behind, in seconds, the replica is in applying changes from the primary node. | Seconds
SaveInProgress | This binary metric returns 1 whenever a background save (forked or forkless) is in progress, and 0 otherwise. A background save process is typically used during snapshots and syncs. These operations can cause degraded performance. Using the SaveInProgress metric, you can diagnose whether or not degraded performance was caused by a background save process. | Count

These are aggregations of certain kinds of commands, derived from info commandstats:

Metric | Description | Unit
CurrItems | The number of items in the cache. This is derived from the Redis keyspace statistic, summing all of the keys in the entire keyspace. | Count
GetTypeCmds | The total number of get types of commands. This is derived from the Redis commandstats statistic by summing all of the get types of commands (get, mget, hget, etc.). | Count
HashBasedCmds | The total number of commands that are hash-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more hashes. | Count
KeyBasedCmds | The total number of commands that are key-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more keys. | Count
ListBasedCmds | The total number of commands that are list-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more lists. | Count
SetBasedCmds | The total number of commands that are set-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more sets. | Count
SetTypeCmds | The total number of set types of commands. This is derived from the Redis commandstats statistic by summing all of the set types of commands (set, hset, etc.). | Count
SortedSetBasedCmds | The total number of commands that are sorted set-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more sorted sets. | Count
StringBasedCmds | The total number of commands that are string-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more strings. | Count
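To make the "summing commandstats" idea concrete, the sketch below parses the text that the Redis info commandstats section returns and adds up the call counts of the three pf commands named for HyperLogLogBasedCmds. The helper names and the sample text are hypothetical. The call counts in commandstats are cumulative, and how ElastiCache turns them into per-period values is not described here, so this shows only the summation step.

```python
from typing import Dict

HYPERLOGLOG_COMMANDS = ("pfadd", "pfcount", "pfmerge")


def parse_commandstats(info_text: str) -> Dict[str, int]:
    """Extract {command: calls} from lines such as
    'cmdstat_get:calls=100,usec=500,usec_per_call=5.00'."""
    calls: Dict[str, int] = {}
    for line in info_text.splitlines():
        line = line.strip()
        if not line.startswith("cmdstat_"):
            continue  # skip section headers and blank lines
        name, _, fields = line.partition(":")
        for field in fields.split(","):
            key, _, value = field.partition("=")
            if key == "calls":
                calls[name[len("cmdstat_"):]] = int(value)
    return calls


def hyperloglog_based_cmds(calls: Dict[str, int]) -> int:
    """Sum the call counts of the pf commands; commands never called count as 0."""
    return sum(calls.get(command, 0) for command in HYPERLOGLOG_COMMANDS)


# Hypothetical output of the commandstats section.
sample = """# Commandstats
cmdstat_get:calls=100,usec=500,usec_per_call=5.00
cmdstat_pfadd:calls=10,usec=80,usec_per_call=8.00
cmdstat_pfcount:calls=4,usec=40,usec_per_call=10.00
"""
total = hyperloglog_based_cmds(parse_commandstats(sample))
print(total)
```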

Amazon EBS Metrics and Dimensions

Amazon EBS Metrics

Amazon Elastic Block Store (Amazon EBS) sends data points to CloudWatch for several metrics. Amazon EBS General Purpose SSD (gp2), Throughput Optimized HDD (st1), Cold HDD (sc1), and Magnetic (standard) volumes automatically send five-minute metrics to CloudWatch. Provisioned IOPS SSD (io1) volumes automatically send one-minute metrics to CloudWatch. For more information about how to monitor Amazon EBS, see Monitoring the Status of Your Volumes in the Amazon EC2 User Guide for Linux Instances.

The AWS/EBS namespace includes the following metrics.


VolumeReadBytes, VolumeWriteBytes

Provides information on the I/O operations in a specified period of time. The Sum statistic reports the total number of bytes transferred during the period. The Average statistic reports the average size of each I/O operation during the period. The SampleCount statistic reports the total number of I/O operations during the period. The Minimum and Maximum statistics are not relevant for this metric. Data is only reported to Amazon CloudWatch when the volume is active. If the volume is idle, no data is reported to Amazon CloudWatch.

Units: Bytes

VolumeReadOps, VolumeWriteOps

The total number of I/O operations in a specified period of time.

Units: Count

VolumeTotalReadTime, VolumeTotalWriteTime

The total number of seconds spent by all operations that completed in a specified period of time. If multiple requests are submitted at the same time, this total could be greater than the length of the period. For example, for a period of 5 minutes (300 seconds): if 700 operations completed during that period, and each operation took 1 second, the value would be 700 seconds.

Units: Seconds

VolumeIdleTime

The total number of seconds in a specified period of time when no read or write operations were submitted.

Units: Seconds

VolumeQueueLength

The number of read and write operation requests waiting to be completed in a specified period of time.

Units: Count

VolumeThroughputPercentage

Used with Provisioned IOPS SSD volumes only. The percentage of I/O operations per second (IOPS) delivered of the total IOPS provisioned for an Amazon EBS volume. Provisioned IOPS SSD volumes deliver within 10 percent of the provisioned IOPS performance 99.9 percent of the time over a given year.

During a write, if there are no other pending I/O requests in a minute, the metric value will be 100 percent. Also, a volume's I/O performance may become degraded temporarily due to an action you have taken (for example, creating a snapshot of a volume during peak usage, running the volume on a non-EBS-optimized instance, accessing data on the volume for the first time).

Units: Percent

VolumeConsumedReadWriteOps

Used with Provisioned IOPS SSD volumes only. The total amount of read and write operations (normalized to 256K capacity units) consumed in a specified period of time.

I/O operations that are smaller than 256K each count as 1 consumed IOPS. I/O operations that are larger than 256K are counted in 256K capacity units. For example, a 1024K I/O would count as 4 consumed IOPS.

Units: Count

BurstBalance

Used with General Purpose SSD (gp2), Throughput Optimized HDD (st1), and Cold HDD (sc1) volumes only. Provides information about the percentage of I/O credits (for gp2) or throughput credits (for st1 and sc1) remaining in the burst bucket. Data is reported to CloudWatch only when the volume is active. If the volume is not attached, no data is reported.

Units: Percent
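Two of the calculations described in this table lend themselves to a short worked example: the 256K normalization behind VolumeConsumedReadWriteOps, and the relationship between the Sum, SampleCount, and Average statistics of VolumeReadBytes and VolumeWriteBytes. The sketch below is illustrative only. The function names are hypothetical, and rounding a partial capacity unit up (so that a 300K operation counts as 2) is an assumption; the text only states the cases below 256K and the 1024K example.

```python
import math

CAPACITY_UNIT_BYTES = 256 * 1024  # one 256K capacity unit


def consumed_ops(io_size_bytes: int) -> int:
    """Count one I/O operation in 256K capacity units.

    Operations smaller than 256K count as 1. Larger operations are counted
    in 256K units; rounding partial units up is an assumption of this sketch.
    """
    return max(1, math.ceil(io_size_bytes / CAPACITY_UNIT_BYTES))


def average_io_size(sum_bytes: float, sample_count: int) -> float:
    """Average bytes per operation: the Sum statistic divided by SampleCount."""
    if sample_count == 0:
        return 0.0  # an idle volume reports no data; 0.0 is a placeholder here
    return sum_bytes / sample_count


print(consumed_ops(4 * 1024))      # a 4K operation
print(consumed_ops(1024 * 1024))   # the 1024K example from the text
print(average_io_size(1048576, 4))
```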

Dimensions for Amazon EBS Metrics

The only dimension that Amazon EBS sends to CloudWatch is VolumeId. All available statistics are filtered by VolumeId.

Amazon EFS Metrics and Dimensions

Amazon EFS sends metrics to CloudWatch for every Amazon EFS file system every minute. For more information, see Monitor Metrics with CloudWatch in the Amazon Elastic File System User Guide.



Amazon CloudWatch Metrics for Amazon EFS


The AWS/EFS namespace includes the following metrics.


BurstCreditBalance

The number of burst credits that a file system has.

Burst credits allow a file system to burst to throughput levels above a file system's baseline level for periods of time. For more information, see Throughput scaling in Amazon EFS.

The Minimum statistic is the smallest burst credit balance for any minute during the period. The Maximum statistic is the largest burst credit balance for any minute during the period. The Average statistic is the average burst credit balance during the period.

Units: Bytes

Valid statistics: Minimum, Maximum, Average

ClientConnections

The number of client connections to a file system. When using a standard client, there is one connection per mounted Amazon EC2 instance.

Note

To calculate the average ClientConnections for periods greater than one minute, divide the Sum statistic by the number of minutes in the period.

Units: Count of client connections

Valid statistics: Sum

DataReadIOBytes

The number of bytes for each file system read operation.

The Sum statistic is the total number of bytes associated with read operations. The Minimum statistic is the size of the smallest read operation during the period. The Maximum statistic is the size of the largest read operation during the period. The Average statistic is the average size of read operations during the period. The SampleCount statistic provides a count of read operations.

Units:

• Bytes for Minimum, Maximum, Average, and Sum.

• Count for SampleCount.

Valid statistics: Minimum, Maximum, Average, Sum, SampleCount

DataWriteIOBytes

The number of bytes for each file write operation.

The Sum statistic is the total number of bytes associated with write operations. The Minimum statistic is the size of the smallest write operation during the period. The Maximum statistic is the size of the largest write operation during the period. The Average statistic is the average size of write operations during the period. The SampleCount statistic provides a count of write operations.

Units:

• Bytes for Minimum, Maximum, Average, and Sum.

• Count for SampleCount.

Valid statistics: Minimum, Maximum, Average, Sum, SampleCount

MetadataIOBytes

The number of bytes for each metadata operation.

The Sum statistic is the total number of bytes associated with metadata operations. The Minimum statistic is the size of the smallest metadata operation during the period. The Maximum statistic is the size of the largest metadata operation during the period. The Average statistic is the size of the average metadata operation during the period. The SampleCount statistic provides a count of metadata operations.

Units:

• Bytes for Minimum, Maximum, Average, and Sum.

• Count for SampleCount.

Valid statistics: Minimum, Maximum, Average, Sum, SampleCount

PercentIOLimit

Shows how close a file system is to reaching the I/O limit of the General Purpose performance mode. If this metric is at 100% more often than not, consider moving your application to a file system using the Max I/O performance mode.

Note

This metric is only submitted for file systems using the General Purpose performance mode.

Units: Percent

PermittedThroughput

The maximum amount of throughput a file system is allowed, given the file system size and BurstCreditBalance. For more information, see Amazon EFS Performance.

The Minimum statistic is the smallest throughput permitted for any minute during the period. The Maximum statistic is the highest throughput permitted for any minute during the period. The Average statistic is the average throughput permitted during the period.

Units: Bytes per second

Valid statistics: Minimum, Maximum, Average

TotalIOBytes

The number of bytes for each file system operation, including data read, data write, and metadata operations.

The Sum statistic is the total number of bytes associated with all file system operations. The Minimum statistic is the size of the smallest operation during the period. The Maximum statistic is the size of the largest operation during the period. The Average statistic is the average size of an operation during the period. The SampleCount statistic provides a count of all operations.

Note

To calculate the average operations per second for a period, divide the SampleCount statistic by the number of seconds in the period. To calculate the average throughput (Bytes per second) for a period, divide the Sum statistic by the number of seconds in the period.

Units:

• Bytes for Minimum, Maximum, Average, and Sum.

• Count for SampleCount.

Valid statistics: Minimum, Maximum, Average, Sum, SampleCount
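The notes for ClientConnections and TotalIOBytes describe three divisions that turn raw statistics into averages. The following sketch writes them out; the function names and the sample values are hypothetical.

```python
def average_client_connections(sum_stat: float, period_seconds: int) -> float:
    """ClientConnections: divide the Sum statistic by the number of minutes in the period."""
    return sum_stat / (period_seconds / 60)


def average_ops_per_second(sample_count: float, period_seconds: int) -> float:
    """TotalIOBytes: divide the SampleCount statistic by the number of seconds in the period."""
    return sample_count / period_seconds


def average_throughput_bytes_per_second(sum_bytes: float, period_seconds: int) -> float:
    """TotalIOBytes: divide the Sum statistic by the number of seconds in the period."""
    return sum_bytes / period_seconds


# Hypothetical statistics for one 5-minute (300-second) period.
period = 300
print(average_client_connections(50, period))
print(average_ops_per_second(600, period))
print(average_throughput_bytes_per_second(3_000_000, period))
```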

Dimensions for Amazon EFS Metrics

Amazon EFS Dimensions

Amazon EFS metrics use the AWS/EFS namespace and provide metrics for a single dimension, FileSystemId.

A file system's ID can be found in the Amazon EFS management console, and it takes the form of fs-XXXXXXXX.

Elastic Load Balancing Metrics and Dimensions

Elastic Load Balancing supports three types of load balancers: Application Load Balancers, Network Load Balancers, and Classic Load Balancers. Elastic Load Balancing sends metrics to CloudWatch for all three types of load balancers.

Contents

• Application Load Balancer Metrics

• Metric Dimensions for Application Load Balancers

• Network Load Balancer Metrics

• Metric Dimensions for Network Load Balancers

• Classic Load Balancer Metrics

• Metric Dimensions for Classic Load Balancers

Application Load Balancer Metrics

The AWS/ApplicationELB namespace includes the following metrics.



ActiveConnectionCount

The total number of concurrent TCP connections active from clients to the load balancer and from the load balancer to targets.

Statistics: The most useful statistic is Sum.

ClientTLSNegotiationErrorCount

The number of TLS connections initiated by the client that did not establish a session with the load balancer. Possible causes include a mismatch of ciphers or protocols.

Statistics: The most useful statistic is Sum.

ConsumedLCUs

The number of load balancer capacity units (LCU) used by your load balancer. You pay for the number of LCUs that you use per hour. For more information, see LCU Details.

HealthyHostCount

The number of targets that are considered healthy.

Statistics: The most useful statistics are Average, Minimum, and Maximum.

HTTPCode_ELB_4XX_Count

The number of HTTP 4XX client error codes that originate from the load balancer. Client errors are generated when requests are malformed or incomplete. These requests have not been received by the target. This count does not include any response codes generated by the targets.

Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average all return 1.

HTTPCode_ELB_5XX_Count

The number of HTTP 5XX server error codes that originate from the load balancer. This count does not include any response codes generated by the targets.

Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average all return 1.

HTTPCode_Target_2XX_Count, HTTPCode_Target_3XX_Count, HTTPCode_Target_4XX_Count, HTTPCode_Target_5XX_Count

The number of HTTP response codes generated by the targets. This does not include any response codes generated by the load balancer.

Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average all return 1.

IPv6ProcessedBytes

The total number of bytes processed by the load balancer over IPv6.

Statistics: The most useful statistic is Sum.

IPv6RequestCount

The number of IPv6 requests received by the load balancer.

Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average all return 1.

NewConnectionCount

The total number of new TCP connections established from clients to the load balancer and from the load balancer to targets.

Statistics: The most useful statistic is Sum.

ProcessedBytes

The total number of bytes processed by the load balancer over IPv4 and IPv6.

Statistics: The most useful statistic is Sum.

RejectedConnectionCount

The number of connections that were rejected because the load balancer had reached its maximum number of connections.

Statistics: The most useful statistic is Sum.

RequestCount

The number of requests received by the load balancer. This includes requests over IPv4 and IPv6.

Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average all return 1.

RequestCountPerTarget

The average number of requests received by each target in a target group. You must specify the target group using the TargetGroup dimension.

Statistics: The only valid statistic is Sum. Note that this represents the average, not the sum.

RuleEvaluations

The number of rules processed by the load balancer given a request rate averaged over an hour.

Statistics: The most useful statistic is Sum.

TargetConnectionErrorCount

The number of connections that were not successfully established between the load balancer and target.

Statistics: The most useful statistic is Sum.

TargetResponseTime

The time elapsed, in seconds, after the request leaves the load balancer until a response from the target is received. This is equivalent to the target_processing_time field in the access logs.

Statistics: The most useful statistics are Average and pNN.NN (percentiles).

TargetTLSNegotiationErrorCount

The number of TLS connections initiated by the load balancer that did not establish a session with the target. Possible causes include a mismatch of ciphers or protocols.

Statistics: The most useful statistic is Sum.

UnHealthyHostCount

The number of targets that are considered unhealthy.

Statistics: The most useful statistics are Average, Minimum, and Maximum.

Metric Dimensions for Application Load Balancers

To filter the metrics for your Application Load Balancer, use the following dimensions.

Dimension | Description
AvailabilityZone | Filter the metric data by Availability Zone.
LoadBalancer | Filter the metric data by load balancer. Specify the load balancer as follows: app/load-balancer-name/1234567890123456 (the final portion of the load balancer ARN).
TargetGroup | Filter the metric data by target group. Specify the target group as follows: targetgroup/target-group-name/1234567890123456 (the final portion of the target group ARN).
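Since the LoadBalancer and TargetGroup dimension values are the final portion of the corresponding ARN, they can be derived from an ARN with simple string handling. The sketch below assumes ARNs of the form arn:aws:elasticloadbalancing:region:account-id:loadbalancer/... and arn:aws:elasticloadbalancing:region:account-id:targetgroup/...; the function names, account ID, and resource names are hypothetical.

```python
def load_balancer_dimension(load_balancer_arn: str) -> str:
    """Return the LoadBalancer dimension value: everything after ':loadbalancer/'."""
    _, separator, suffix = load_balancer_arn.partition(":loadbalancer/")
    if not separator or not suffix:
        raise ValueError("Not a load balancer ARN: " + load_balancer_arn)
    return suffix


def target_group_dimension(target_group_arn: str) -> str:
    """Return the TargetGroup dimension value, which keeps the 'targetgroup/' prefix."""
    index = target_group_arn.find(":targetgroup/")
    if index == -1:
        raise ValueError("Not a target group ARN: " + target_group_arn)
    return target_group_arn[index + 1:]


# Hypothetical ARNs.
lb_arn = ("arn:aws:elasticloadbalancing:us-west-2:123456789012:"
          "loadbalancer/app/my-load-balancer/1234567890123456")
tg_arn = ("arn:aws:elasticloadbalancing:us-west-2:123456789012:"
          "targetgroup/my-targets/1234567890123456")
print(load_balancer_dimension(lb_arn))
print(target_group_dimension(tg_arn))
```

The same approach would apply to a Network Load Balancer ARN, where the extracted value begins with net/ instead of app/.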

Network Load Balancer Metrics

The AWS/NetworkELB namespace includes the following metrics.


ActiveFlowCount

The total number of concurrent TCP flows (or connections) from clients to targets. This metric includes only the connections in the ESTABLISHED state. TCP connections are not terminated at the load balancer, so a client opening a TCP connection to a target counts as a single flow.

Statistics: The most useful statistics are Average, Maximum, and Minimum.

HealthyHostCount

The number of targets that are considered healthy.

Statistics: The most useful statistics are Average, Maximum, and Minimum.

NewFlowCount

The total number of new TCP flows (or connections) established from clients to targets in the time period.

Statistics: The most useful statistic is Sum.

ProcessedBytes

The total number of bytes processed by the load balancer, including TCP/IP headers.

Statistics: The most useful statistic is Sum.

TCP_Client_Reset_Count

The total number of reset (RST) packets sent from a client to a target. These resets are generated by the client and forwarded by the load balancer.

Statistics: The most useful statistic is Sum.

TCP_ELB_Reset_Count

The total number of reset (RST) packets generated by the load balancer.

Statistics: The most useful statistic is Sum.

TCP_Target_Reset_Count

The total number of reset (RST) packets sent from a target to a client. These resets are generated by the target and forwarded by the load balancer.

Statistics: The most useful statistic is Sum.

UnHealthyHostCount

The number of targets that are considered unhealthy.

Statistics: The most useful statistics are Average, Maximum, and Minimum.

Metric Dimensions for Network Load Balancers

To filter the metrics for your load balancer, use the following dimensions.

Dimension | Description
AvailabilityZone | Filter the metric data by Availability Zone.
LoadBalancer | Filter the metric data by load balancer. Specify the load balancer as follows: net/load-balancer-name/1234567890123456 (the final portion of the load balancer ARN).
TargetGroup | Filter the metric data by target group. Specify the target group as follows: targetgroup/target-group-name/1234567890123456 (the final portion of the target group ARN).

Classic Load Balancer Metrics

The AWS/ELB namespace includes the following metrics.


BackendConnectionErrors

The number of connections that were not successfully established between the load balancer and the registered instances. Because the load balancer retries the connection when there are errors, this count can exceed the request rate. Note that this count also includes any connection errors related to health checks.

Reporting criteria: There is a nonzero value

Statistics: The most useful statistic is Sum. Note that Average, Minimum, and Maximum are reported per load balancer node and are not typically useful. However, the difference between the minimum and maximum (or peak to average or average to trough) might be useful to determine whether a load balancer node is an outlier.

Example: Suppose that your load balancer has 2 instances in us-west-2a and 2 instances in us-west-2b, and that attempts to connect to 1 instance in us-west-2a result in back-end connection errors. The sum for us-west-2a includes these connection errors, while the sum for us-west-2b does not include them. Therefore, the sum for the load balancer equals the sum for us-west-2a.

HealthyHostCount

The number of healthy instances registered with your load balancer. A newly registered instance is considered healthy after it passes the first health check. If cross-zone load balancing is enabled, the number of healthy instances for the LoadBalancerName dimension is calculated across all Availability Zones. Otherwise, it is calculated per Availability Zone.

Reporting criteria: There are registered instances

Statistics: The most useful statistics are Average and Maximum. These statistics are determined by the load balancer nodes. Note that some load balancer nodes might determine that an instance is unhealthy for a brief period while other nodes determine that it is healthy.

Example: Suppose that your load balancer has 2 instances in us-west-2a and 2 instances in us-west-2b, us-west-2a has 1 unhealthy instance, and us-west-2b has no unhealthy instances. With the AvailabilityZone dimension, there is an average of 1 healthy and 1 unhealthy instance in us-west-2a, and an average of 2 healthy and 0 unhealthy instances in us-west-2b.

Metric

HTTPCode_Backend_2XX, HTTPCode_Backend_3XX, HTTPCode_Backend_4XX, HTTPCode_Backend_5XX

HTTPCode_ELB_4XX

HTTPCode_ELB_5XX

Description

[HTTP listener] The number of HTTP response codes generated by registered instances. This count does not include any response codes generated by the load balancer.



Sum

. Note that

Minimum

,

Maximum

, and

Average

are all 1.

Example: Suppose that your load balancer has 2 instances in uswest-2a and 2 instances in us-west-2b, and that requests sent to 1 instance in us-west-2a result in HTTP 500 responses. The sum for uswest-2a includes these error responses, while the sum for us-west-2b does not include them. Therefore, the sum for the load balancer equals the sum for us-west-2a.

[HTTP listener] The number of HTTP 4XX client error codes generated by the load balancer. Client errors are generated when a request is malformed or incomplete.



Sum

. Note that

Minimum

,

Maximum

, and

Average

are all 1.

Example: Suppose that your load balancer has us-west-2a and uswest-2b enabled, and that client requests include a malformed request URL. As a result, client errors would likely increase in all

Availability Zones. The sum for the load balancer is the sum of the values for the Availability Zones.

[HTTP listener] The number of HTTP 5XX server error codes generated by the load balancer. This count does not include any response codes generated by the registered instances. The metric is reported if there are no healthy instances registered to the load balancer, or if the request rate exceeds the capacity of the instances (spillover) or the load balancer.



Sum

. Note that

Minimum

,

Maximum

, and

Average

are all 1.

Example: Suppose that your load balancer has us-west-2a and uswest-2b enabled, and that instances in us-west-2a are experiencing high latency and are slow to respond to requests. As a result, the surge queue for the load balancer nodes in us-west-2a fills and clients receive a 503 error. If us-west-2b continues to respond normally, the sum for the load balancer equals the sum for us-west-2a.

Latency

[HTTP listener] The total time elapsed, in seconds, from the time the load balancer sent the request to a registered instance until the instance started to send the response headers.

[TCP listener] The total time elapsed, in seconds, for the load balancer to successfully establish a connection to a registered instance.

Statistics: The most useful statistic is Average. Use Maximum to determine whether some requests are taking substantially longer than the average. Note that Minimum is typically not useful.

Example: Suppose that your load balancer has 2 instances in us-west-2a and 2 instances in us-west-2b, and that requests sent to 1 instance in us-west-2a have a higher latency. The average for us-west-2a has a higher value than the average for us-west-2b.

RequestCount

The number of requests completed or connections made during the specified interval (1 or 5 minutes).

[HTTP listener] The number of requests received and routed, including HTTP error responses from the registered instances.

[TCP listener] The number of connections made to the registered instances.

Statistics: The most useful statistic is Sum. Note that Minimum, Maximum, and Average all return 1.

Example: Suppose that your load balancer has 2 instances in us-west-2a and 2 instances in us-west-2b, and that 100 requests are sent to the load balancer. There are 60 requests sent to us-west-2a, with each instance receiving 30 requests, and 40 requests sent to us-west-2b, with each instance receiving 20 requests. With the AvailabilityZone dimension, there is a sum of 60 requests in us-west-2a and 40 requests in us-west-2b. With the LoadBalancerName dimension, there is a sum of 100 requests.
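The arithmetic in the RequestCount example can be reproduced with a short sketch. The instance identifiers and the `sum_by_zone` helper below are hypothetical and exist only to mirror the numbers in the example. They do not represent how CloudWatch stores or aggregates data internally.

```python
from collections import defaultdict

# (Availability Zone, hypothetical instance ID, requests received)
per_instance_requests = [
    ("us-west-2a", "instance-1", 30),
    ("us-west-2a", "instance-2", 30),
    ("us-west-2b", "instance-3", 20),
    ("us-west-2b", "instance-4", 20),
]


def sum_by_zone(records):
    """Add up request counts per Availability Zone (the AvailabilityZone view)."""
    totals = defaultdict(int)
    for zone, _instance, count in records:
        totals[zone] += count
    return dict(totals)


zone_sums = sum_by_zone(per_instance_requests)
# The LoadBalancerName view is the sum across all zones.
load_balancer_sum = sum(zone_sums.values())
```

With these inputs, `zone_sums` holds 60 for us-west-2a and 40 for us-west-2b, and `load_balancer_sum` is 100, matching the example.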

SpilloverCount

The total number of requests that were rejected because the surge queue is full.

[HTTP listener] The load balancer returns an HTTP 503 error code.

[TCP listener] The load balancer closes the connection.

Statistics: The most useful statistic is Sum. Note that Average, Minimum, and Maximum are reported per load balancer node and are not typically useful.

Example: Suppose that your load balancer has us-west-2a and us-west-2b enabled, and that instances in us-west-2a are experiencing high latency and are slow to respond to requests. As a result, the surge queue for the load balancer node in us-west-2a fills, resulting in spillover. If us-west-2b continues to respond normally, the sum for the load balancer will be the same as the sum for us-west-2a.

SurgeQueueLength

The total number of requests that are pending routing. The load balancer queues a request if it is unable to establish a connection with a healthy instance in order to route the request. The maximum size of the queue is 1,024. Additional requests are rejected when the queue is full. For more information, see SpilloverCount.

Reporting criteria: There is a nonzero value.

Statistics: The most useful statistic is Maximum, because it represents the peak of queued requests. The Average statistic can be useful in combination with Minimum and Maximum to determine the range of queued requests. Note that Sum is not useful.

Example: Suppose that your load balancer has us-west-2a and us-west-2b enabled, and that instances in us-west-2a are experiencing high latency and are slow to respond to requests. As a result, the surge queue for the load balancer nodes in us-west-2a fills, with clients likely experiencing increased response times. If this continues, the load balancer will likely have spillovers (see the SpilloverCount metric). If us-west-2b continues to respond normally, the max for the load balancer will be the same as the max for us-west-2a.

UnHealthyHostCount

The number of unhealthy instances registered with your load balancer. An instance is considered unhealthy after it exceeds the unhealthy threshold configured for health checks. An unhealthy instance is considered healthy again after it meets the healthy threshold configured for health checks.

Reporting criteria: There are registered instances

Statistics: The most useful statistics are Average and Minimum. These statistics are determined by the load balancer nodes. Note that some load balancer nodes might determine that an instance is unhealthy for a brief period while other nodes determine that it is healthy.

Example: See HealthyHostCount.
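The relationship between SurgeQueueLength and SpilloverCount described above can be sketched as a bounded queue. This is a simplified model under stated assumptions: only the 1,024 limit comes from the text, and the function `enqueue_requests` is hypothetical and does not reflect the load balancer's actual implementation.

```python
MAX_SURGE_QUEUE = 1024  # maximum queue size stated in the SurgeQueueLength description


def enqueue_requests(queued, arriving):
    """Return (new_queue_length, spillover) after `arriving` requests that cannot be routed yet."""
    free_slots = MAX_SURGE_QUEUE - queued
    accepted = min(arriving, free_slots)
    # Requests that do not fit in the queue are rejected and counted as spillover.
    spillover = arriving - accepted
    return queued + accepted, spillover


# 1,000 requests are already queued and 50 more arrive before any can be routed.
queue_length, spillover_count = enqueue_requests(1000, 50)
```

In this model the queue fills to 1,024 and the remaining 26 requests are rejected, which is the situation SpilloverCount reports.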


The following metrics enable you to estimate your costs if you migrate a Classic Load Balancer to an Application Load Balancer. These metrics are intended for informational use only, not for use with CloudWatch alarms. Note that if your Classic Load Balancer has multiple listeners, these metrics are aggregated across the listeners.

These estimates are based on a load balancer with one default rule and a certificate that is 2K in size. If you use a certificate that is 4K or greater in size, we recommend that you estimate your costs as follows: create an Application Load Balancer based on your Classic Load Balancer using the migration tool and monitor the ConsumedLCUs metric for the Application Load Balancer. For more information, see Migrate from a Classic Load Balancer to an Application Load Balancer in the Elastic Load Balancing User Guide.


The estimated number of concurrent TCP connections active from clients to the load balancer and from the load balancer to targets.

EstimatedALBConsumedLCUs

The estimated number of load balancer capacity units (LCU) used by an Application Load Balancer. You pay for the number of LCUs that you use per hour. For more information, see LCU Details.

The estimated number of new TCP connections established from clients to the load balancer and from the load balancer to targets.

EstimatedProcessedBytes

The estimated number of bytes processed by an Application Load Balancer.

Metric Dimensions for Classic Load Balancers

To filter the metrics for your Classic Load Balancer, use the following dimensions.

AvailabilityZone

Filter the metric data by the specified Availability Zone.

LoadBalancerName

Filter the metric data by the specified load balancer.

Amazon EMR Metrics and Dimensions

Amazon EMR sends metrics to CloudWatch. All Amazon EMR job flows automatically send metrics in five-minute intervals. Metrics are archived for 15 months; after that period, the data is discarded. For more information, see Monitor Metrics with Amazon CloudWatch in the Amazon EMR Developer Guide.

Amazon EMR Metrics

Amazon EMR sends the following metrics to Amazon CloudWatch.

The AWS/ElasticMapReduce namespace includes the following metrics.


Note

Amazon EMR pulls metrics from a cluster. If a cluster becomes unreachable, no metrics are reported until the cluster becomes available again.


The following are Hadoop 1 metrics:

Cluster Status

IsIdle

Indicates that a cluster is no longer performing work, but is still alive and accruing charges. It is set to 1 if no tasks are running and no jobs are running, and set to 0 otherwise. This value is checked at five-minute intervals and a value of 1 indicates only that the cluster was idle when checked, not that it was idle for the entire five minutes. To avoid false positives, you should raise an alarm when this value has been 1 for more than one consecutive 5-minute check. For example, you might raise an alarm on this value if it has been 1 for thirty minutes or longer.

Use case: Monitor cluster performance

Units: Boolean

JobsRunning

The number of jobs in the cluster that are currently running.

Use case: Monitor cluster health

Units: Count

JobsFailed

The number of jobs in the cluster that have failed.

Units: Count

Map/Reduce

MapTasksRunning

The number of running map tasks for each job. If you have a scheduler installed and multiple jobs running, multiple graphs are generated.

Use case: Monitor cluster progress

Units: Count

MapTasksRemaining

The number of remaining map tasks for each job. If you have a scheduler installed and multiple jobs running, multiple graphs are generated. A remaining map task is one that is not in any of the following states: Running, Killed, or Completed.

Units: Count

MapSlotsOpen

The unused map task capacity. This is calculated as the maximum number of map tasks for a given cluster, less the total number of map tasks currently running in that cluster.

Use case: Analyze cluster performance

Units: Count

RemainingMapTasksPerSlot

The ratio of the total map tasks remaining to the total map slots available in the cluster.

Units: Ratio

ReduceTasksRunning

The number of running reduce tasks for each job. If you have a scheduler installed and multiple jobs running, multiple graphs are generated.

Units: Count

ReduceTasksRemaining

The number of remaining reduce tasks for each job. If you have a scheduler installed and multiple jobs running, multiple graphs are generated.

Units: Count

ReduceSlotsOpen

Unused reduce task capacity. This is calculated as the maximum reduce task capacity for a given cluster, less the number of reduce tasks currently running in that cluster.

Units: Count

Node Status

CoreNodesRunning

The number of core nodes working. Data points for this metric are reported only when a corresponding instance group exists.

Units: Count

CoreNodesPending

The number of core nodes waiting to be assigned. All of the core nodes requested may not be immediately available; this metric reports the pending requests. Data points for this metric are reported only when a corresponding instance group exists.

Units: Count

LiveDataNodes

The percentage of data nodes that are receiving work from Hadoop.

Units: Percent

TaskNodesRunning

The number of task nodes working. Data points for this metric are reported only when a corresponding instance group exists.

Units: Count

TaskNodesPending

The number of task nodes waiting to be assigned. All of the task nodes requested may not be immediately available; this metric reports the pending requests. Data points for this metric are reported only when a corresponding instance group exists.

Units: Count

LiveTaskTrackers

The percentage of task trackers that are functional.

Units: Percent

IO

S3BytesWritten

The number of bytes written to Amazon S3.

Use case: Analyze cluster performance, Monitor cluster progress

Units: Bytes

S3BytesRead

The number of bytes read from Amazon S3.

Units: Bytes

HDFSUtilization

The percentage of HDFS storage currently used.

Units: Percent

HDFSBytesRead

The number of bytes read from HDFS.

Units: Bytes

HDFSBytesWritten

The number of bytes written to HDFS.

Units: Bytes

MissingBlocks

The number of blocks in which HDFS has no replicas. These might be corrupt blocks.

Units: Count

TotalLoad

The total number of concurrent data transfers.

Units: Count

HBase

BackupFailed

Whether the last backup failed. This is set to 0 by default and updated to 1 if the previous backup attempt failed. This metric is only reported for HBase clusters.

Use case: Monitor HBase backups

Units: Count

MostRecentBackupDuration

The amount of time it took the previous backup to complete. This metric is set regardless of whether the last completed backup succeeded or failed. While the backup is ongoing, this metric returns the number of minutes after the backup started. This metric is only reported for HBase clusters.

Use case: Monitor HBase backups

Units: Minutes

TimeSinceLastSuccessfulBackup

The number of elapsed minutes after the last successful HBase backup started on your cluster. This metric is only reported for HBase clusters.

Units: Minutes
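The IsIdle guidance above (alarm only after several consecutive idle checks, for example thirty minutes) can be sketched as a check over the most recent data points. This is an illustrative sketch, not the CloudWatch alarm evaluation logic. The function name `idle_long_enough` and its parameters are hypothetical, and it assumes one IsIdle data point per five-minute check, ordered oldest to newest.

```python
def idle_long_enough(datapoints, required_minutes=30, period_minutes=5):
    """Return True if the most recent checks were all idle for the required duration."""
    # Thirty minutes of five-minute checks is six consecutive data points.
    needed = required_minutes // period_minutes
    if len(datapoints) < needed:
        return False
    # Only the trailing window matters; any 0 inside it resets the idle streak.
    return all(value == 1 for value in datapoints[-needed:])


recent_checks = [0, 1, 1, 1, 1, 1, 1]
should_alarm = idle_long_enough(recent_checks)
```

A single idle reading, or a streak interrupted by a 0, does not satisfy the check, which is the false positive the description warns about.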

The following metrics are available for Hadoop 2 AMIs:

Cluster Status

IsIdle

Indicates that a cluster is no longer performing work, but is still alive and accruing charges. It is set to 1 if no tasks are running and no jobs are running, and set to 0 otherwise. This value is checked at five-minute intervals and a value of 1 indicates only that the cluster was idle when checked, not that it was idle for the entire five minutes. To avoid false positives, you should raise an alarm when this value has been 1 for more than one consecutive 5-minute check. For example, you might raise an alarm on this value if it has been 1 for thirty minutes or longer.

Use case: Monitor cluster performance

Units: Boolean

ContainerAllocated

The number of resource containers allocated by the ResourceManager.

Units: Count

ContainerReserved

The number of containers reserved.

Units: Count

ContainerPending

The number of containers in the queue that have not yet been allocated.

Units: Count

ContainerPendingRatio

The ratio of pending containers to containers allocated (ContainerPendingRatio = ContainerPending / ContainerAllocated). If ContainerAllocated = 0, then ContainerPendingRatio = ContainerPending. The value of ContainerPendingRatio represents a number, not a percentage. This value is useful for scaling cluster resources based on container allocation behavior.

AppsCompleted

The number of applications submitted to YARN that have completed.

Units: Count

AppsFailed

The number of applications submitted to YARN that have failed to complete.

Use case: Monitor cluster progress, Monitor cluster health

Units: Count

AppsKilled

The number of applications submitted to YARN that have been killed.

Units: Count

AppsPending

The number of applications submitted to YARN that are in a pending state.

Units: Count

AppsRunning

The number of applications submitted to YARN that are running.

Units: Count

AppsSubmitted

The number of applications submitted to YARN.

Units: Count

Node Status

CoreNodesRunning

The number of core nodes working. Data points for this metric are reported only when a corresponding instance group exists.

Units: Count

CoreNodesPending

The number of core nodes waiting to be assigned. All of the core nodes requested may not be immediately available; this metric reports the pending requests. Data points for this metric are reported only when a corresponding instance group exists.

Units: Count

LiveDataNodes

The percentage of data nodes that are receiving work from Hadoop.

Units: Percent

MRTotalNodes

The number of nodes presently available to MapReduce jobs.

Use case: Monitor cluster progress

Units: Count

MRActiveNodes

The number of nodes presently running MapReduce tasks or jobs.

Units: Count

MRLostNodes

The number of nodes allocated to MapReduce that have been marked in a LOST state.

Use case: Monitor cluster health, Monitor cluster progress

Units: Count

MRUnhealthyNodes

The number of nodes available to MapReduce jobs marked in an UNHEALTHY state.

Units: Count

MRDecommissionedNodes

The number of nodes allocated to MapReduce applications that have been marked in a DECOMMISSIONED state.

Use case: Monitor cluster health, Monitor cluster progress

Units: Count

MRRebootedNodes

The number of nodes available to MapReduce that have been rebooted and marked in a REBOOTED state.

Use case: Monitor cluster health, Monitor cluster progress

Units: Count

IO

S3BytesWritten

The number of bytes written to Amazon S3.

Units: Bytes

S3BytesRead

The number of bytes read from Amazon S3.

Units: Bytes

HDFSUtilization

The percentage of HDFS storage currently used.

Units: Percent

HDFSBytesRead

The number of bytes read from HDFS.

Units: Bytes

HDFSBytesWritten

The number of bytes written to HDFS.

Units: Bytes

MissingBlocks

The number of blocks in which HDFS has no replicas. These might be corrupt blocks.

Units: Count

CorruptBlocks

The number of blocks that HDFS reports as corrupted.

Units: Count

TotalLoad

The total number of concurrent data transfers.

Units: Count

MemoryTotalMB

The total amount of memory in the cluster.

Units: Bytes

MemoryReservedMB

The amount of memory reserved.

Units: Bytes

MemoryAvailableMB

The amount of memory available to be allocated.

Units: Bytes

YARNMemoryAvailablePercentage

The percentage of remaining memory available to YARN (YARNMemoryAvailablePercentage = MemoryAvailableMB / MemoryTotalMB). This value is useful for scaling cluster resources based on YARN memory usage.

MemoryAllocatedMB

The amount of memory allocated to the cluster.

Units: Bytes

PendingDeletionBlocks

The number of blocks marked for deletion.

Units: Count

UnderReplicatedBlocks

The number of blocks that need to be replicated one or more times.

Units: Count

DfsPendingReplicationBlocks

The status of block replication: blocks being replicated, age of replication requests, and unsuccessful replication requests.

Units: Count

CapacityRemainingGB

The amount of remaining HDFS disk capacity.

Units: Bytes

HBase

HbaseBackupFailed

Whether the last backup failed. This is set to 0 by default and updated to 1 if the previous backup attempt failed. This metric is only reported for HBase clusters.

Units: Count

MostRecentBackupDuration

The amount of time it took the previous backup to complete. This metric is set regardless of whether the last completed backup succeeded or failed. While the backup is ongoing, this metric returns the number of minutes after the backup started. This metric is only reported for HBase clusters.

Use case: Monitor HBase backups

Units: Minutes

TimeSinceLastSuccessfulBackup

The number of elapsed minutes after the last successful HBase backup started on your cluster. This metric is only reported for HBase clusters.

Units: Minutes
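The two derived Hadoop 2 metrics above, ContainerPendingRatio and YARNMemoryAvailablePercentage, are defined by simple formulas. The sketch below restates them in code under two assumptions that are not stated in the text: the memory ratio is multiplied by 100 to express it as a percentage, and a total of zero is treated as an error. The function names are illustrative.

```python
def container_pending_ratio(container_pending, container_allocated):
    """ContainerPending / ContainerAllocated, with the documented zero-allocation rule."""
    if container_allocated == 0:
        # Documented special case: the ratio equals ContainerPending.
        return float(container_pending)
    return container_pending / container_allocated


def yarn_memory_available_percentage(memory_available_mb, memory_total_mb):
    """MemoryAvailableMB / MemoryTotalMB, scaled to a percentage (assumption)."""
    if memory_total_mb == 0:
        # Assumption: no meaningful percentage exists without any total memory.
        raise ValueError("memory_total_mb must be greater than zero")
    return 100.0 * memory_available_mb / memory_total_mb


ratio = container_pending_ratio(6, 3)
ratio_without_allocation = container_pending_ratio(4, 0)
memory_percentage = yarn_memory_available_percentage(2048, 8192)
```

A scaling rule might, for instance, add capacity when the ratio stays high or the available percentage stays low. The thresholds for such a rule are a design choice and are not given in this guide.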

Amazon EMR Dimensions

The following dimensions are available for Amazon EMR.

JobFlowId

The same as cluster ID, which is the unique identifier of a cluster in the form j-XXXXXXXXXXXXX. Find this value by clicking on the cluster in the Amazon EMR console.

JobId

The identifier of a job within a cluster. You can use this to filter the metrics returned from a cluster down to those that apply to a single job within the cluster. JobId takes the form job_XXXXXXXXXXXX_XXXX.
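As an illustration of how these dimensions are used, the following sketch builds the parameters for a CloudWatch GetMetricStatistics request that filters an Amazon EMR metric by one dimension. The helper `emr_metric_request`, the default 30-minute window, the choice of the Average statistic, and the sample cluster ID are all assumptions made for the example. The dictionary keys follow the GetMetricStatistics parameter names.

```python
from datetime import datetime, timedelta, timezone

EMR_NAMESPACE = "AWS/ElasticMapReduce"
EMR_DIMENSIONS = ("JobFlowId", "JobId")


def emr_metric_request(metric_name, dimension_name, dimension_value, minutes=30):
    """Build GetMetricStatistics parameters for one Amazon EMR metric and dimension."""
    if dimension_name not in EMR_DIMENSIONS:
        raise ValueError("unsupported dimension: " + dimension_name)
    end_time = datetime.now(timezone.utc)
    return {
        "Namespace": EMR_NAMESPACE,
        "MetricName": metric_name,
        "Dimensions": [{"Name": dimension_name, "Value": dimension_value}],
        "StartTime": end_time - timedelta(minutes=minutes),
        "EndTime": end_time,
        "Period": 300,  # Amazon EMR sends metrics at five-minute intervals
        "Statistics": ["Average"],
    }


# Placeholder cluster ID; substitute the value shown in the Amazon EMR console.
request = emr_metric_request("IsIdle", "JobFlowId", "j-EXAMPLE0000000")
```

The resulting dictionary could be passed as keyword arguments to an SDK call such as the `get_metric_statistics` method of a boto3 CloudWatch client.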

Amazon Elasticsearch Service Metrics and Dimensions

Amazon Elasticsearch Service sends data to CloudWatch every minute. You can create alarms using Amazon Elasticsearch Service Metrics and Dimensions. For more information, see Monitoring Cluster Metrics and Statistics with Amazon CloudWatch in the Amazon Elasticsearch Service Developer Guide.

Amazon Elasticsearch Service Metrics

The AWS/ES namespace includes the following metrics for clusters.

ClusterStatus.green

Indicates that all index shards are allocated to nodes in the cluster.

Relevant statistics: Minimum, Maximum

ClusterStatus.yellow

Indicates that the primary shards for all indices are allocated to nodes in a cluster, but the replica shards for at least one index are not. Single node clusters always initialize with this cluster status because there is no second node to which a replica can be assigned. You can either increase your node count to obtain a green cluster status, or you can use the Elasticsearch API to set the number_of_replicas setting for your index to 0. For more information, see Configuring Amazon Elasticsearch Service Domains and Update Indices Settings in the Elasticsearch documentation.

ClusterStatus.red

Indicates that the primary and replica shards of at least one index are not allocated to nodes in a cluster. A common cause for this state is a lack of free storage space on one or more of the data nodes in the cluster. In turn, a lack of free storage space prevents the service from distributing replica shards to the affected data node or nodes, and causes all new indices to start with a red cluster status. To recover, you must add EBS-based storage to existing data nodes, use larger instance types, or delete the indices and restore them from a snapshot. For more information, see Red Cluster Status.

Nodes

The number of nodes in the Amazon ES cluster.

Relevant statistics: Minimum, Maximum, Average

SearchableDocuments

The total number of searchable documents across all indices in the cluster.

Relevant statistics: Minimum, Maximum, Average

DeletedDocuments

The total number of deleted documents across all indices in the cluster.

CPUUtilization

The maximum percentage of CPU resources used for data nodes in the cluster.

Relevant statistics: Maximum, Average

FreeStorageSpace

The free space, in megabytes, for all data nodes in the cluster. Amazon ES throws a ClusterBlockException when this metric reaches 0. To recover, you must either delete indices, add larger instances, or add EBS-based storage to existing instances. To learn more, see Recovering from a Lack of Free Storage Space.

Note

FreeStorageSpace will always be lower than the value that the Elasticsearch _cluster/stats API provides. Amazon ES reserves a percentage of the storage space on each instance for internal operations.

Relevant statistics: Minimum

ClusterUsedSpace

The total used space, in megabytes, for a cluster. You can view this metric in the Amazon CloudWatch console, but not in the Amazon ES console.

ClusterIndexWritesBlocked

Indicates whether your cluster is accepting or blocking incoming write requests. A value of 0 means that the cluster is accepting requests. A value of 1 means that it is blocking requests.

Many factors can cause a cluster to begin blocking requests. Some common factors include the following: FreeStorageSpace is too low, JVMMemoryPressure is too high, or CPUUtilization is too high. To alleviate this issue, consider adding more disk space or scaling your cluster.

Relevant statistics: Maximum

Note

You can view this metric in the Amazon CloudWatch console, but not the Amazon ES console.

JVMMemoryPressure

The maximum percentage of the Java heap used for all data nodes in the cluster.


AutomatedSnapshotFailure

The number of failed automated snapshots for the cluster. A value of 1 indicates that no automated snapshot was taken for the domain in the previous 36 hours.

CPUCreditBalance

The remaining CPU credits available for data nodes in the cluster. A CPU credit provides the performance of a full CPU core for one minute. For more information, see CPU Credits in the Amazon EC2 Developer Guide. This metric is available only for the t2.micro.elasticsearch, t2.small.elasticsearch, and t2.medium.elasticsearch instance types.

KibanaHealthyNodes

A health check for Kibana. A value of 1 indicates normal behavior. A value of 0 indicates that Kibana is inaccessible. In most cases, the health of Kibana mirrors the health of the cluster.

Note

You can view this metric on the Amazon CloudWatch console, but not the Amazon ES console.

The AWS/ES namespace includes the following metrics for dedicated master nodes.

MasterCPUUtilization

The maximum percentage of CPU resources used by the dedicated master nodes. We recommend increasing the size of the instance type when this metric reaches 60 percent.

Relevant statistics: Average

MasterFreeStorageSpace

This metric is not relevant and can be ignored. The service does not use master nodes as data nodes.

MasterJVMMemoryPressure

The maximum percentage of the Java heap used for all dedicated master nodes in the cluster. We recommend moving to a larger instance type when this metric reaches 85 percent.

MasterCPUCreditBalance

The remaining CPU credits available for dedicated master nodes in the cluster. A CPU credit provides the performance of a full CPU core for one minute. For more information, see CPU Credits in the Amazon EC2 User Guide for Linux Instances. This metric is available only for the t2.micro.elasticsearch, t2.small.elasticsearch, and t2.medium.elasticsearch instance types.

MasterReachableFromNode

A health check for MasterNotDiscovered exceptions. A value of 1 indicates normal behavior. A value of 0 indicates that /_cluster/health/ is failing. Failures mean that the master node stopped or is not reachable. They are usually the result of a network connectivity issue or AWS dependency problem.

Note

You can view this metric on the Amazon CloudWatch console, but not the Amazon ES console.
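The recommendations for dedicated master nodes above (60 percent CPU, 85 percent JVM memory pressure, and a reachability value of 0) can be collected into a small check. This is a hedged sketch: the function `master_node_findings` and its message strings are hypothetical, and only the two thresholds and the meaning of 0 come from the metric descriptions.

```python
MASTER_CPU_LIMIT = 60.0  # percent, from the MasterCPUUtilization description
MASTER_JVM_LIMIT = 85.0  # percent, from the MasterJVMMemoryPressure description


def master_node_findings(cpu_utilization, jvm_memory_pressure, reachable_from_node):
    """Return a list of findings for one set of dedicated master node data points."""
    findings = []
    if cpu_utilization >= MASTER_CPU_LIMIT:
        findings.append("CPU: consider increasing the size of the instance type")
    if jvm_memory_pressure >= MASTER_JVM_LIMIT:
        findings.append("JVM: consider moving to a larger instance type")
    if reachable_from_node == 0:
        findings.append("Master: node stopped or not reachable")
    return findings


healthy = master_node_findings(35.0, 60.0, 1)
stressed = master_node_findings(72.5, 90.0, 0)
```

In practice these comparisons would more likely be expressed as CloudWatch alarms on each metric. The function form is used here only to make the thresholds explicit.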

The AWS/ES namespace includes the following metrics for EBS volumes.

ReadLatency

The latency, in seconds, for read operations on EBS volumes.

WriteLatency

The latency, in seconds, for write operations on EBS volumes.

ReadThroughput

The throughput, in bytes per second, for read operations on EBS volumes.

WriteThroughput

The throughput, in bytes per second, for write operations on EBS volumes.

DiskQueueDepth

The number of pending input and output (I/O) requests for an EBS volume.

ReadIOPS

The number of input and output (I/O) operations per second for read operations on EBS volumes.

WriteIOPS

The number of input and output (I/O) operations per second for write operations on EBS volumes.


Dimensions for Amazon Elasticsearch Service Metrics

To filter the metrics, use the following dimensions.

ClientId

The AWS account ID.

DomainName

The name of the search domain.


Amazon Elastic Transcoder Metrics and Dimensions

When you interact with Amazon Elastic Transcoder, it sends the following metrics to CloudWatch every minute.

Elastic Transcoder Metrics

The AWS/ElasticTranscoder namespace includes the following metrics.


Billed HD Output

The number of billable seconds of HD output for a pipeline.

Valid Dimensions: PipelineId

Unit: Seconds

Billed SD Output

The number of billable seconds of SD output for a pipeline.

Unit: Seconds

Billed Audio Output

The number of billable seconds of audio output for a pipeline.

Unit: Seconds

Jobs Completed

The number of jobs completed by this pipeline.

Unit: Count

Jobs Errored

The number of jobs that failed because of invalid inputs, such as a request to transcode a file that is not in the given input bucket.

Unit: Count

Outputs per Job

The number of outputs Elastic Transcoder created for a job.

Unit: Count

Standby Time

The number of seconds before Elastic Transcoder started transcoding a job.

Unit: Seconds

Errors

The number of errors caused by invalid operation parameters, such as a request for a job status that does not include the job ID.

Valid Dimensions: Operation

Unit: Count

Throttles

The number of times that Elastic Transcoder automatically throttled an operation.

Unit: Count

Dimensions for Elastic Transcoder Metrics

Elastic Transcoder metrics use the Elastic Transcoder namespace and provide metrics for the following dimension(s):

PipelineId

The ID of a pipeline. This dimension filters the data you request for an Elastic Transcoder pipeline.

Operation

This dimension filters the data you request for the APIs that Elastic Transcoder provides.

Amazon GameLift Metrics and Dimensions

Amazon GameLift Metrics for Fleets

The AWS/GameLift namespace includes the following metrics related to activity across a fleet or a group of fleets. The Amazon GameLift service sends metrics to CloudWatch every minute.

Instances

Metric: ActiveInstances

Description: Instances with ACTIVE status, which means they are running active server processes. The count includes idle instances and those that are hosting one or more game sessions. This metric measures current total instance capacity. This metric can be used with automatic scaling.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: DesiredInstances

Description: Target number of active instances that Amazon GameLift is working to maintain in the fleet. With automatic scaling, this value is determined based on the scaling policies currently in force. Without automatic scaling, this value is set manually. This metric is not available when viewing data for fleet metric groups.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: IdleInstances

Description: Active instances that are currently hosting zero (0) game sessions. This metric measures capacity that is available but unused. This metric can be used with automatic scaling.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: MaxInstances

Description: Maximum number of instances that are allowed for the fleet. A fleet's instance maximum determines the capacity ceiling during manual or automatic scaling up. This metric is not available when viewing data for fleet metric groups.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: MinInstances

Description: Minimum number of instances allowed for the fleet. A fleet's instance minimum determines the capacity floor during manual or automatic scaling down. This metric is not available when viewing data for fleet metric groups.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: PercentIdleInstances

Description: Percentage of all active instances that are idle (calculated as IdleInstances / ActiveInstances). This metric can be used for automatic scaling.

Units: Percent

Relevant CloudWatch statistics: Average, Minimum, Maximum
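The PercentIdleInstances calculation described above can be sketched as follows. This is a minimal illustration only; the function name and the choice to return 0 when there are no active instances are assumptions, not part of the Amazon GameLift service.

```python
def percent_idle_instances(idle_instances: float, active_instances: float) -> float:
    """Return IdleInstances / ActiveInstances expressed as a percentage."""
    if active_instances <= 0:
        # Assumption: with no active instances, report 0 instead of dividing by zero.
        return 0.0
    return 100.0 * idle_instances / active_instances


# Example: 3 idle instances out of 12 active instances.
print(percent_idle_instances(3, 12))
```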


Server Processes

Metric: ActiveServerProcesses

Description: Server processes with ACTIVE status, which means they are running and able to host game sessions. The count includes idle server processes and those that are hosting game sessions. This metric measures current total server process capacity.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: HealthyServerProcesses

Description: Active server processes that are reporting healthy. This metric is useful for tracking the overall health of the fleet's game servers.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: PercentHealthyServerProcesses

Description: Percentage of all active server processes that are reporting healthy (calculated as HealthyServerProcesses / ActiveServerProcesses).

Units: Percent

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: ServerProcessAbnormalTerminations

Description: Server processes that were shut down due to abnormal circumstances since the last report. This metric includes terminations that were initiated by the Amazon GameLift service. This occurs when a server process stops responding, consistently reports failed health checks, or does not terminate cleanly (by calling ProcessEnding()).

Units: Count

Relevant CloudWatch statistics: Sum, Average, Minimum, Maximum

Metric: ServerProcessActivations

Description: Server processes that successfully transitioned from ACTIVATING to ACTIVE status since the last report. Server processes cannot host game sessions until they are active.

Units: Count

Relevant CloudWatch statistics: Sum, Average, Minimum, Maximum

Metric: ServerProcessTerminations

Description: Server processes that were shut down since the last report. This includes all server processes that transitioned to TERMINATED status for any reason, including normal and abnormal process terminations.

Units: Count

Relevant CloudWatch statistics: Sum, Average, Minimum, Maximum

Game Sessions

Metric: ActivatingGameSessions

Description: Game sessions with ACTIVATING status, which means they are in the process of starting up. Game sessions cannot host players until they are active. High numbers for a sustained period of time may indicate that game sessions are not transitioning from ACTIVATING to ACTIVE status. This metric can be used with automatic scaling.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: ActiveGameSessions

Description: Game sessions with ACTIVE status, which means they are able to host players, and are hosting zero or more players. This metric measures the total number of game sessions currently being hosted. This metric can be used with automatic scaling.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: AvailableGameSessions

Description: Game session slots on active, healthy server processes that are not currently being used. This metric measures the number of new game sessions that could be started immediately. This metric can be used with automatic scaling.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: PercentAvailableGameSessions

Description: Percentage of game session slots on all active server processes (healthy or unhealthy) that are not currently being used (calculated as AvailableGameSessions / [ActiveGameSessions + AvailableGameSessions + unhealthy server processes]). This metric can be used with automatic scaling.

Units: Percent

Relevant CloudWatch statistics: Average
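As a worked illustration of the PercentAvailableGameSessions formula, the sketch below applies it to sample numbers. The function and parameter names are hypothetical, and returning 0 for an empty fleet is an assumption made only to avoid division by zero.

```python
def percent_available_game_sessions(available: float, active: float,
                                    unhealthy_processes: float) -> float:
    """AvailableGameSessions / (ActiveGameSessions + AvailableGameSessions
    + unhealthy server processes), expressed as a percentage."""
    total_slots = active + available + unhealthy_processes
    if total_slots <= 0:
        # Assumption: no slots at all is reported as 0 percent available.
        return 0.0
    return 100.0 * available / total_slots


# Example: 20 unused slots, 70 active game sessions, 10 unhealthy server processes.
print(percent_available_game_sessions(20, 70, 10))
```

Because unhealthy server processes are counted in the denominator, the percentage drops when processes become unhealthy even if the number of unused slots stays the same.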

Player Sessions

Metric: CurrentPlayerSessions

Description: Player sessions with either ACTIVE status (player is connected to an active game session) or RESERVED status (player has been given a slot in a game session but hasn't yet connected). This metric can be used with automatic scaling.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: PlayerSessionActivations

Description: Player sessions that transitioned from RESERVED status to ACTIVE since the last report. This occurs when a player successfully connects to an active game session.

Units: Count

Relevant CloudWatch statistics: Sum, Average, Minimum, Maximum

Amazon GameLift Metrics for Queues

The AWS/GameLift namespace includes the following metrics related to activity across a game session placement queue. The Amazon GameLift service sends metrics to CloudWatch every minute.

Metric: AverageWaitTime

Description: Average amount of time that game session placement requests in the queue with status PENDING have been waiting to be fulfilled.

Units: Seconds

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: PlacementsCanceled

Description: Game session placement requests that were canceled before timing out since the last report.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: PlacementsStarted

Description: New game session placement requests that were added to the queue since the last report.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: PlacementsSucceeded

Description: Game session placement requests that resulted in a new game session since the last report.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: PlacementsTimedOut

Description: Game session placement requests that reached the queue's timeout limit without being fulfilled since the last report.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Metric: QueueDepth

Description: Number of game session placement requests in the queue with status PENDING.

Units: Count

Relevant CloudWatch statistics: Average, Minimum, Maximum

Dimensions for Amazon GameLift Metrics

Amazon GameLift supports filtering metrics by the following dimensions.

Dimension: FleetId

Description: Unique identifier for a single fleet. This dimension is used with all metrics for instances, server processes, game sessions, and player sessions. It is not used with queue metrics.

Dimension: FleetMetricsGroup

Description: Unique identifier for a collection of fleets. A fleet is included in a fleet metric group by adding the metric group name to the fleet's attributes (see UpdateFleetAttributes()). This dimension is used with all metrics for instances, server processes, game sessions, and player sessions. It is not used with queue metrics.

Dimension: QueueName

Description: Unique identifier for a single queue. This dimension is used with metrics for game session placement queues only. It is not used with metrics for instances, server processes, game sessions, and player sessions.
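The following sketch shows one way the dimensions above might be applied when requesting GameLift metric data. It only builds the request parameters; the helper name, the queue name "example-queue", the one-hour window, and the chosen statistics are illustrative assumptions. The parameter names follow the CloudWatch GetMetricStatistics API, and the resulting dictionary could be passed to an SDK call such as boto3's `get_metric_statistics(**request)`.

```python
from datetime import datetime, timedelta, timezone

# Dimensions listed in the table above.
VALID_DIMENSIONS = {"FleetId", "FleetMetricsGroup", "QueueName"}


def gamelift_metric_request(metric_name, dimension_name, dimension_value,
                            end_time, minutes=60):
    """Build GetMetricStatistics parameters for one GameLift metric (hypothetical helper)."""
    if dimension_name not in VALID_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension_name}")
    return {
        "Namespace": "AWS/GameLift",
        "MetricName": metric_name,
        "Dimensions": [{"Name": dimension_name, "Value": dimension_value}],
        "StartTime": end_time - timedelta(minutes=minutes),
        "EndTime": end_time,
        # GameLift sends metrics every minute, so a 60-second period is used here.
        "Period": 60,
        "Statistics": ["Average", "Minimum", "Maximum"],
    }


# Queue metrics use the QueueName dimension; "example-queue" is a made-up name.
request = gamelift_metric_request(
    "QueueDepth", "QueueName", "example-queue",
    datetime(2017, 1, 1, 12, 0, tzinfo=timezone.utc),
)
print(request["Dimensions"])
```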


Amazon Inspector Metrics

For information about the Amazon Inspector metrics that you can use with CloudWatch, see Monitoring Amazon Inspector Using CloudWatch in the Amazon Inspector User Guide.

AWS IoT Metrics and Dimensions

When you interact with AWS IoT, it sends the following metrics to CloudWatch every minute.

AWS IoT Metrics

AWS IoT sends the following metrics to CloudWatch once per received request.

IoT Metrics

Metric: RulesExecuted

Description: The number of AWS IoT rules executed.

Rule Metrics

Metric: TopicMatch

Description: The number of incoming messages published on a topic on which a rule is listening. The RuleName dimension contains the name of the rule.

Metric: ParseError

Description: The number of JSON parse errors that occurred in messages published on a topic on which a rule is listening. The RuleName dimension contains the name of the rule.

Rule Action Metrics

Metric: Success

Description: The number of successful rule action invocations. The RuleName dimension contains the name of the rule that specifies the action. The ActionType dimension contains the type of action that was invoked.

Metric: Failure

Description: The number of failed rule action invocations. The RuleName dimension contains the name of the rule that specifies the action. The ActionType dimension contains the type of action that was invoked.


Message Broker Metrics

Metric: Connect.AuthError

Description: The number of connection requests that could not be authorized by the message broker. The Protocol dimension contains the protocol used to send the CONNECT message.

Metric: Connect.ClientError

Description: The number of connection requests rejected because the MQTT message did not meet the requirements defined in AWS IoT Limits. The Protocol dimension contains the protocol used to send the CONNECT message.

Metric: Connect.ServerError

Description: The number of connection requests that failed because an internal error occurred. The Protocol dimension contains the protocol used to send the CONNECT message.

Metric: Connect.Success

Description: The number of successful connections to the message broker. The Protocol dimension contains the protocol used to send the CONNECT message.

Metric: Connect.Throttle

Description: The number of connection requests that were throttled because the client exceeded the allowed connect request rate. The Protocol dimension contains the protocol used to send the CONNECT message.

Metric: Ping.Success

Description: The number of ping messages received by the message broker. The Protocol dimension contains the protocol used to send the ping message.

Metric: PublishIn.AuthError

Description: The number of publish requests the message broker was unable to authorize. The Protocol dimension contains the protocol used to publish the message.

Metric: PublishIn.ClientError

Description: The number of publish requests rejected by the message broker because the message did not meet the requirements defined in AWS IoT Limits. The Protocol dimension contains the protocol used to publish the message.

Metric: PublishIn.ServerError

Description: The number of publish requests the message broker failed to process because an internal error occurred. The Protocol dimension contains the protocol used to send the PUBLISH message.

Metric: PublishIn.Success

Description: The number of publish requests successfully processed by the message broker. The Protocol dimension contains the protocol used to send the PUBLISH message.

Metric: PublishIn.Throttle

Description: The number of publish requests that were throttled because the client exceeded the allowed inbound message rate. The Protocol dimension contains the protocol used to send the PUBLISH message.

Metric: PublishOut.AuthError

Description: The number of publish requests made by the message broker that could not be authorized by AWS IoT. The Protocol dimension contains the protocol used to send the PUBLISH message.

Metric: PublishOut.ClientError

Description: The number of publish requests made by the message broker that were rejected because the message did not meet the requirements defined in AWS IoT Limits. The Protocol dimension contains the protocol used to send the PUBLISH message.

Metric: PublishOut.Success

Description: The number of publish requests successfully made by the message broker. The Protocol dimension contains the protocol used to send the PUBLISH message.

Metric: Subscribe.AuthError

Description: The number of subscription requests made by a client that could not be authorized. The Protocol dimension contains the protocol used to send the SUBSCRIBE message.

Metric: Subscribe.ClientError

Description: The number of subscribe requests that were rejected because the SUBSCRIBE message did not meet the requirements defined in AWS IoT Limits. The Protocol dimension contains the protocol used to send the SUBSCRIBE message.

Metric: Subscribe.ServerError

Description: The number of subscribe requests that were rejected because an internal error occurred. The Protocol dimension contains the protocol used to send the SUBSCRIBE message.

Metric: Subscribe.Success

Description: The number of subscribe requests that were successfully processed by the message broker. The Protocol dimension contains the protocol used to send the SUBSCRIBE message.

Metric: Subscribe.Throttle

Description: The number of subscribe requests that were throttled because the client exceeded the allowed subscribe request rate. The Protocol dimension contains the protocol used to send the SUBSCRIBE message.

Metric: Unsubscribe.ClientError

Description: The number of unsubscribe requests that were rejected because the UNSUBSCRIBE message did not meet the requirements defined in AWS IoT Limits. The Protocol dimension contains the protocol used to send the UNSUBSCRIBE message.

Metric: Unsubscribe.ServerError

Description: The number of unsubscribe requests that were rejected because an internal error occurred. The Protocol dimension contains the protocol used to send the UNSUBSCRIBE message.

Metric: Unsubscribe.Success

Description: The number of unsubscribe requests that were successfully processed by the message broker. The Protocol dimension contains the protocol used to send the UNSUBSCRIBE message.

Metric: Unsubscribe.Throttle

Description: The number of unsubscribe requests that were rejected because the client exceeded the allowed unsubscribe request rate. The Protocol dimension contains the protocol used to send the UNSUBSCRIBE message.

Note

The message broker metrics are displayed in the AWS IoT console under Protocol Metrics.

Thing Shadow Metrics

Metric: DeleteThingShadow.Accepted

Description: The number of DeleteThingShadow requests processed successfully. The Protocol dimension contains the protocol used to make the request.

Metric: GetThingShadow.Accepted

Description: The number of GetThingShadow requests processed successfully. The Protocol dimension contains the protocol used to make the request.

Metric: UpdateThingShadow.Accepted

Description: The number of UpdateThingShadow requests processed successfully. The Protocol dimension contains the protocol used to make the request.

Note

The thing shadow metrics are displayed in the AWS IoT console under Protocol Metrics.


AWS IoT metrics use the AWS/IoT namespace and provide metrics for the following dimensions:

Dimension: ActionType

Description: The action type specified by the rule that was triggered by the request.

Dimension: Protocol

Description: The protocol used to make the request. Valid values are: MQTT or HTTP

Dimension: RuleName

Description: The name of the rule triggered by the request.

Amazon Kinesis Analytics Metrics

Kinesis Analytics sends metrics to CloudWatch. For more information, see Monitoring with Amazon CloudWatch Metrics in the Amazon Kinesis Analytics Developer Guide.

Metrics

The AWS/KinesisAnalytics namespace includes the following metrics.

Metric: Bytes

Description: The number of bytes read (per input stream) or written (per output stream).

Levels: Per input stream and per output stream

Metric: MillisBehindLatest

Description: Indicates how far behind from the current time an application is reading from the streaming source.

Levels: Application-level

Metric: Records

Description: The number of records read (per input stream) or written (per output stream).

Levels: Per input stream and per output stream


Amazon Kinesis Analytics provides metrics for the following dimensions.

Dimension: Flow

Description: Per input stream: Input. Per output stream: Output.

Dimension: Id

Description: Per input stream: Input Id. Per output stream: Output Id.

Amazon Kinesis Firehose Metrics

Kinesis Firehose sends metrics to CloudWatch. For more information, see Monitoring with Amazon CloudWatch Metrics in the Amazon Kinesis Firehose Developer Guide.

Service-level CloudWatch Metrics

The AWS/Firehose namespace includes the following service-level metrics.

Metric: DeliveryToElasticsearch.Bytes

Description: The number of bytes indexed to Amazon ES over the specified time period.

Units: Bytes

Metric: DeliveryToElasticsearch.Records

Description: The number of records indexed to Amazon ES over the specified time period.

Units: Count

Metric: DeliveryToElasticsearch.Success

Description: The sum of the successfully indexed records over the sum of records that were attempted.

Metric: DeliveryToRedshift.Bytes

Description: The number of bytes copied to Amazon Redshift over the specified time period.

Units: Bytes

Metric: DeliveryToRedshift.Records

Description: The number of records copied to Amazon Redshift over the specified time period.

Units: Count

Metric: DeliveryToRedshift.Success

Description: The sum of successful Amazon Redshift COPY commands over the sum of all Amazon Redshift COPY commands.

Metric: DeliveryToS3.Bytes

Description: The number of bytes delivered to Amazon S3 over the specified time period.

Units: Bytes

Metric: DeliveryToS3.DataFreshness

Description: The age (from getting into Kinesis Firehose to now) of the oldest record in Kinesis Firehose. Any record older than this age has been delivered to the S3 bucket.

Units: Seconds

Metric: DeliveryToS3.Records

Description: The number of records delivered to Amazon S3 over the specified time period.

Units: Count

Metric: DeliveryToS3.Success

Description: The sum of successful Amazon S3 put commands over the sum of all Amazon S3 put commands.

Metric: IncomingBytes

Description: The number of bytes ingested into the Kinesis Firehose stream over the specified time period.

Units: Bytes

Metric: IncomingRecords

Description: The number of records ingested into the Kinesis Firehose stream over the specified time period.

Units: Count

API-Level CloudWatch Metrics

The AWS/Firehose namespace includes the following API-level metrics.

Metric: DescribeDeliveryStream.Latency

Description: The time taken per DescribeDeliveryStream operation, measured over the specified time period.

Units: Milliseconds

Metric: DescribeDeliveryStream.Requests

Description: The total number of DescribeDeliveryStream requests.

Units: Count

Metric: ListDeliveryStreams.Latency

Description: The time taken per ListDeliveryStreams operation, measured over the specified time period.

Units: Milliseconds

Metric: ListDeliveryStreams.Requests

Description: The total number of ListDeliveryStreams requests.

Units: Count

Metric: PutRecord.Bytes

Description: The number of bytes put to the Kinesis Firehose delivery stream using PutRecord over the specified time period.

Units: Bytes

Metric: PutRecord.Latency

Description: The time taken per PutRecord operation, measured over the specified time period.

Units: Milliseconds

Metric: PutRecord.Requests

Description: The total number of PutRecord requests, which is equal to the total number of records from PutRecord operations.

Units: Count

Metric: PutRecordBatch.Bytes

Description: The number of bytes put to the Kinesis Firehose delivery stream using PutRecordBatch over the specified time period.

Units: Bytes

Metric: PutRecordBatch.Latency

Description: The time taken per PutRecordBatch operation, measured over the specified time period.

Units: Milliseconds

Metric: PutRecordBatch.Records

Description: The total number of records from PutRecordBatch operations.

Units: Count

Metric: PutRecordBatch.Requests

Description: The total number of PutRecordBatch requests.

Units: Count

Metric: UpdateDeliveryStream.Latency

Description: The time taken per UpdateDeliveryStream operation, measured over the specified time period.

Units: Milliseconds

Metric: UpdateDeliveryStream.Requests

Description: The total number of UpdateDeliveryStream requests.

Units: Count

Amazon Kinesis Streams Metrics and Dimensions

Kinesis Streams sends metrics to CloudWatch at two levels: the stream level and, optionally, the shard level. Stream-level metrics are for the most common monitoring use cases in normal conditions. Shard-level metrics are for specific monitoring tasks, usually related to troubleshooting. For more information, see Monitoring Amazon Kinesis with Amazon CloudWatch in the Amazon Kinesis Developer Guide.

Contents

• Basic Stream-level Metrics

• Enhanced Shard-level Metrics

• Dimensions for Amazon Kinesis Metrics

Basic Stream-level Metrics

The AWS/Kinesis namespace includes the following stream-level metrics.

Kinesis Streams sends these stream-level metrics to CloudWatch every minute. These metrics are always available.

Metric: GetRecords.Bytes

Description: The number of bytes retrieved from the Kinesis stream, measured over the specified time period. Minimum, Maximum, and Average statistics represent the bytes in a single GetRecords operation for the stream in the specified time period.

Shard-level metric name: OutgoingBytes

Dimensions: StreamName

Statistics: Minimum, Maximum, Average, Sum, Samples

Units: Bytes

Metric: GetRecords.IteratorAge

Description: This metric is deprecated. Use GetRecords.IteratorAgeMilliseconds.

Metric: GetRecords.IteratorAgeMilliseconds

Description: The age of the last record in all GetRecords calls made against a Kinesis stream, measured over the specified time period. Age is the difference between the current time and when the last record of the GetRecords call was written to the stream. The Minimum and Maximum statistics can be used to track the progress of Kinesis consumer applications. A value of zero indicates that the records being read are completely caught up with the stream.

Shard-level metric name: IteratorAgeMilliseconds

Dimensions: StreamName

Statistics: Minimum, Maximum, Average, Samples

Units: Milliseconds

Metric: GetRecords.Latency

Description: The time taken per GetRecords operation, measured over the specified time period.

Dimensions: StreamName

Units: Milliseconds

Metric: GetRecords.Records

Description: The number of records retrieved from the shard, measured over the specified time period. Minimum, Maximum, and Average statistics represent the records in a single GetRecords operation for the stream in the specified time period.

Shard-level metric name: OutgoingRecords

Dimensions: StreamName

Units: Count

Metric: GetRecords.Success

Description: The number of successful GetRecords operations per stream, measured over the specified time period.

Dimensions: StreamName

Statistics: Average, Sum, Samples

Units: Count

Metric: IncomingBytes

Description: The number of bytes successfully put to the Kinesis stream over the specified time period. This metric includes bytes from PutRecord and PutRecords operations. Minimum, Maximum, and Average statistics represent the bytes in a single put operation for the stream in the specified time period.

Shard-level metric name: IncomingBytes

Dimensions: StreamName

Units: Bytes

Metric: IncomingRecords

Description: The number of records successfully put to the Kinesis stream over the specified time period. This metric includes record counts from PutRecord and PutRecords operations. Minimum, Maximum, and Average statistics represent the records in a single put operation for the stream in the specified time period.

Shard-level metric name: IncomingRecords

Dimensions: StreamName

Units: Count

Metric: PutRecord.Bytes

Description: The number of bytes put to the Kinesis stream using the PutRecord operation over the specified time period.

Dimensions: StreamName

Units: Bytes

Metric: PutRecord.Latency

Description: The time taken per PutRecord operation, measured over the specified time period.

Dimensions: StreamName

Units: Milliseconds

Metric: PutRecord.Success

Description: The number of successful PutRecord operations per Kinesis stream, measured over the specified time period. Average reflects the percentage of successful writes to a stream.

Dimensions: StreamName

Units: Count

Metric: PutRecords.Bytes

Description: The number of bytes put to the Kinesis stream using the PutRecords operation over the specified time period.

Dimensions: StreamName

Units: Bytes

Metric: PutRecords.Latency

Description: The time taken per PutRecords operation, measured over the specified time period.

Dimensions: StreamName

Units: Milliseconds

Metric: PutRecords.Records

Description: The number of successful records in a PutRecords operation per Kinesis stream, measured over the specified time period.

Dimensions: StreamName

Units: Count

Metric: PutRecords.Success

Description: The number of PutRecords operations where at least one record succeeded, per Kinesis stream, measured over the specified time period.

Dimensions: StreamName

Units: Count

Metric: ReadProvisionedThroughputExceeded

Description: The number of GetRecords calls throttled for the stream over the specified time period. The most commonly used statistic for this metric is Average. When the Minimum statistic has a value of 1, all records were throttled for the stream during the specified time period. When the Maximum statistic has a value of 0 (zero), no records were throttled for the stream during the specified time period.

Dimensions: StreamName

Units: Count

Metric: WriteProvisionedThroughputExceeded

Description: The number of records rejected due to throttling for the stream over the specified time period. This metric includes throttling from PutRecord and PutRecords operations. The most commonly used statistic for this metric is Average. When the Minimum statistic has a non-zero value, records were being throttled for the stream during the specified time period. When the Maximum statistic has a value of 0 (zero), no records were being throttled for the stream during the specified time period.

Dimensions: StreamName

Units: Count

Enhanced Shard-level Metrics

The AWS/Kinesis namespace includes the following shard-level metrics.

Kinesis sends the following shard-level metrics to CloudWatch every minute. These metrics are not enabled by default. There is a nominal charge for enhanced metrics emitted from Kinesis. For more information, see Amazon CloudWatch Pricing.

Metric: IncomingBytes

Description: The number of bytes successfully put to the shard over the specified time period. This metric includes bytes from PutRecord and PutRecords operations. Minimum, Maximum, and Average statistics represent the bytes in a single put operation for the shard in the specified time period.

Stream-level metric name: IncomingBytes

Dimensions: StreamName, ShardId

Units: Bytes

Metric: IncomingRecords

Description: The number of records successfully put to the shard over the specified time period. This metric includes record counts from PutRecord and PutRecords operations. Minimum, Maximum, and Average statistics represent the records in a single put operation for the shard in the specified time period.

Stream-level metric name: IncomingRecords

Dimensions: StreamName, ShardId

Units: Count

Metric: IteratorAgeMilliseconds

Description: The age of the last record in all GetRecords calls made against a shard, measured over the specified time period. Age is the difference between the current time and when the last record of the GetRecords call was written to the stream. The Minimum and Maximum statistics can be used to track the progress of Kinesis consumer applications. A value of 0 (zero) indicates that the records being read are completely caught up with the stream.

Stream-level metric name: GetRecords.IteratorAgeMilliseconds

Dimensions: StreamName, ShardId

Statistics: Minimum, Maximum, Average, Samples

Units: Milliseconds

Metric: OutgoingBytes

Description: The number of bytes retrieved from the shard, measured over the specified time period. Minimum, Maximum, and Average statistics represent the bytes in a single GetRecords operation for the shard in the specified time period.

Stream-level metric name: GetRecords.Bytes

Dimensions: StreamName, ShardId

Units: Bytes

Metric: OutgoingRecords

Description: The number of records retrieved from the shard, measured over the specified time period. Minimum, Maximum, and Average statistics represent the records in a single GetRecords operation for the shard in the specified time period.

Stream-level metric name: GetRecords.Records

Dimensions: StreamName, ShardId

Units: Count

Metric: ReadProvisionedThroughputExceeded

Description: The number of GetRecords calls throttled for the shard over the specified time period. This exception count covers all dimensions of the following limits: 5 reads per shard per second or 2 MB per second per shard. The most commonly used statistic for this metric is Average. When the Minimum statistic has a value of 1, all records were throttled for the shard during the specified time period. When the Maximum statistic has a value of 0 (zero), no records were throttled for the shard during the specified time period.

Dimensions: StreamName, ShardId

Units: Count

Metric: WriteProvisionedThroughputExceeded

Description: The number of records rejected due to throttling for the shard over the specified time period. This metric includes throttling from PutRecord and PutRecords operations and covers all dimensions of the following limits: 1,000 records per second per shard or 1 MB per second per shard. The most commonly used statistic for this metric is Average. When the Minimum statistic has a non-zero value, records were being throttled for the shard during the specified time period. When the Maximum statistic has a value of 0 (zero), no records were being throttled for the shard during the specified time period.

Dimensions: StreamName, ShardId

Units: Count
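To make the per-shard write limits quoted above concrete (1,000 records per second or 1 MB per second per shard), the following sketch estimates the smallest shard count that keeps a workload under both limits. It is a rough illustration, not a sizing tool: the function name is hypothetical, "1 MB" is assumed to mean 1,048,576 bytes, and traffic is assumed to be spread evenly across shards, which real partition keys may not achieve.

```python
import math

# Per-shard write limits quoted in the WriteProvisionedThroughputExceeded description.
MAX_RECORDS_PER_SECOND_PER_SHARD = 1000
# Assumption: 1 MB is interpreted as 1,048,576 bytes.
MAX_BYTES_PER_SECOND_PER_SHARD = 1024 * 1024


def min_shards_for_writes(records_per_second: float, bytes_per_second: float) -> int:
    """Smallest shard count that stays within both write limits (even distribution assumed)."""
    by_records = math.ceil(records_per_second / MAX_RECORDS_PER_SECOND_PER_SHARD)
    by_bytes = math.ceil(bytes_per_second / MAX_BYTES_PER_SECOND_PER_SHARD)
    # A stream always has at least one shard.
    return max(1, by_records, by_bytes)


# 2,500 records per second at 512 KB per second is bound by the record limit.
print(min_shards_for_writes(2500, 512 * 1024))
```

If WriteProvisionedThroughputExceeded stays non-zero at the shard count this estimate suggests, uneven key distribution across shards is one possible cause to investigate.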

Dimensions for Amazon Kinesis Metrics

You can use the following dimensions to filter the metrics for Amazon Kinesis Streams.

Dimension: StreamName

Description: The name of the Kinesis stream.

Dimension: ShardId

Description: The shard ID within the Kinesis stream.

AWS Key Management Service Metrics and Dimensions

When you use AWS Key Management Service (AWS KMS) to import key material into a customer master key (CMK) and set it to expire, AWS KMS sends metrics and dimensions to CloudWatch. For more information, see Monitoring with Amazon CloudWatch in the AWS Key Management Service Developer Guide.

AWS KMS Metrics

The AWS/KMS namespace includes the following metrics.

SecondsUntilKeyMaterialExpiration

This metric tracks the number of seconds remaining until imported key material expires. This metric is valid only for CMKs whose origin is EXTERNAL and whose key material is or was set to expire.

The most useful statistic for this metric is Minimum, which tells you the smallest amount of time remaining for all data points in the specified statistic period. The only valid unit for this metric is Seconds.

Use this metric to track the amount of time that remains until your imported key material expires. When that amount of time falls below a threshold that you define, you might want to take action such as reimporting the key material with a new expiration date. You can create a CloudWatch alarm to notify you when that happens. For more information, see Creating CloudWatch Alarms to Monitor AWS KMS Metrics in the AWS Key Management Service Developer Guide.

Dimensions for AWS KMS Metrics

AWS KMS metrics use the AWS/KMS namespace and have only one valid dimension: KeyId. You can use this dimension to view metric data for a specific CMK or set of CMKs.
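As an illustration of alarming on SecondsUntilKeyMaterialExpiration with the Minimum statistic and the KeyId dimension, the sketch below builds parameters for the CloudWatch PutMetricAlarm API. The helper name, alarm name, key ID, SNS topic ARN, 60-second period, and 10-day threshold are all placeholder assumptions; the dictionary could be passed to an SDK call such as boto3's `put_metric_alarm(**alarm)`.

```python
def key_expiration_alarm(key_id: str, threshold_seconds: int, sns_topic_arn: str) -> dict:
    """Build PutMetricAlarm parameters for expiring key material (hypothetical helper)."""
    return {
        "AlarmName": f"kms-key-material-expiring-{key_id}",
        "Namespace": "AWS/KMS",
        "MetricName": "SecondsUntilKeyMaterialExpiration",
        "Dimensions": [{"Name": "KeyId", "Value": key_id}],
        # Minimum reports the smallest remaining time in each period.
        "Statistic": "Minimum",
        # Assumption: a 60-second period with a single evaluation period.
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_seconds),
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }


TEN_DAYS_IN_SECONDS = 10 * 24 * 60 * 60

# The key ID and topic ARN below are made-up placeholders.
alarm = key_expiration_alarm(
    "1234abcd-12ab-34cd-56ef-1234567890ab",
    TEN_DAYS_IN_SECONDS,
    "arn:aws:sns:us-east-1:111122223333:example-topic",
)
print(alarm["Threshold"])
```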

AWS Lambda Metrics and Dimensions

AWS Lambda sends metrics to CloudWatch every minute. For more information, see Troubleshooting and Monitoring AWS Lambda Functions with Amazon CloudWatch in the AWS Lambda Developer Guide.

AWS Lambda CloudWatch Metrics

The AWS/Lambda namespace includes the following metrics.

Metric: Invocations

Description: Measures the number of times a function is invoked in response to an event or invocation API call. This replaces the deprecated RequestCount metric. This includes successful and failed invocations, but does not include throttled attempts. This equals the billed requests for the function. Note that AWS Lambda only sends these metrics to CloudWatch if they have a nonzero value.

Units: Count

Metric: Errors

Description: Measures the number of invocations that failed due to errors in the function (response code 4XX). This replaces the deprecated ErrorCount metric. Failed invocations may trigger a retry attempt that succeeds. This includes:

• Handled exceptions (for example, context.fail(error))

• Unhandled exceptions causing the code to exit

• Out of memory exceptions

• Timeouts

• Permissions errors

This does not include invocations that fail due to invocation rates exceeding default concurrent limits (error code 429) or failures due to internal service errors (error code 500).

Units: Count

Metric: DeadLetterErrors

Description: Incremented when Lambda is unable to write the failed event payload to your configured Dead Letter Queues. This could be due to the following:

• Permissions errors

• Throttles from downstream services

• Misconfigured resources

• Timeouts

Units: Count

Metric: Duration

Description: Measures the elapsed wall clock time from when the function code starts executing as a result of an invocation to when it stops executing. This replaces the deprecated Latency metric. The maximum data point value possible is the function timeout configuration. The billed duration is rounded up to the nearest 100 milliseconds. Note that AWS Lambda only sends these metrics to CloudWatch if they have a nonzero value.

Units: Milliseconds

Metric: Throttles

Description: Measures the number of Lambda function invocation attempts that were throttled due to invocation rates exceeding the customer’s concurrent limits (error code 429). Failed invocations may trigger a retry attempt that succeeds.

Units: Count

Metric: IteratorAge

Description: Emitted for stream-based invocations only (functions triggered by an Amazon DynamoDB stream or Kinesis stream). Measures the age of the last record for each batch of records processed. Age is the difference between the time Lambda received the batch, and the time the last record in the batch was written to the stream.

Units: Milliseconds
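The Duration description notes that billed duration is rounded up to the nearest 100 milliseconds. A minimal sketch of that rounding, using a hypothetical helper name, looks like this:

```python
import math


def billed_duration_ms(duration_ms: float) -> int:
    """Round a measured duration up to the next multiple of 100 milliseconds."""
    return math.ceil(duration_ms / 100) * 100


# A 101 ms execution is billed as 200 ms; exactly 100 ms stays at 100 ms.
print(billed_duration_ms(101), billed_duration_ms(100))
```

This only illustrates the rounding rule stated above; it does not model pricing.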

Errors/Invocations Ratio

When calculating the error rate on Lambda function invocations, it’s important to distinguish between an invocation request and an actual invocation. It is possible for the error rate to exceed the number of billed Lambda function invocations. Lambda reports an invocation metric only if the Lambda function code is executed. If the invocation request yields a throttling or other initialization error that prevents the Lambda function code from being invoked, Lambda will report an error, but it does not log an invocation metric.

• Lambda emits Invocations=1 when the function is executed. If the Lambda function is not executed, nothing is emitted.

• Lambda emits a data point for Errors for each invoke request. Errors=0 means that there is no function execution error. Errors=1 means that there is a function execution error.

• Lambda emits a data point for Throttles for each invoke request. Throttles=0 means there is no invocation throttle. Throttles=1 means there is an invocation throttle.
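The relationship between the Errors and Invocations data points can be sketched as a simple ratio of their Sum statistics over the same periods. The function name and the sample values are illustrative assumptions. Because an error can be reported for a request that never produced an invocation data point, the ratio is not guaranteed to stay at or below 1.

```python
def error_rate(error_sums, invocation_sums):
    """Ratio of summed Errors to summed Invocations, or None when nothing was invoked."""
    total_invocations = sum(invocation_sums)
    if total_invocations == 0:
        # No invocation data points were emitted, so the ratio is undefined.
        return None
    return sum(error_sums) / total_invocations


# Per-minute Sum values (made-up sample data).
print(error_rate([1, 2], [2, 2]))
# Errors reported without matching invocations push the ratio above 1.
print(error_rate([3], [2]))
```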

Dimensions for AWS Lambda Metrics

Lambda data can be filtered along any of the dimensions in the following table.


AWS Lambda CloudWatch Dimensions

You can use the dimensions in the following table to refine the metrics returned for your Lambda functions.

Dimension: FunctionName

Description: Filters the metric data by Lambda function.

Dimension: Resource

Description: Filters the metric data by Lambda function resource.

Dimension: Version

Description: Filters the metric data by Lambda version.

Dimension: Alias

Description: Filters the metric data by Lambda alias.

Amazon Machine Learning Metrics and Dimensions

Amazon Machine Learning sends metrics to CloudWatch every five minutes. For more information, see Monitoring Amazon ML with Amazon CloudWatch Metrics in the Amazon Machine Learning Developer Guide.

Amazon ML Metrics

The AWS/ML namespace includes the following metrics.

Metric: PredictCount
Description: The number of observations received by Amazon ML, measured over the specified time period.
Units: Count

Metric: PredictFailureCount
Description: The number of invalid or malformed observations received by Amazon ML, measured over the specified time period.
Units: Count

Dimensions for Amazon Machine Learning Metrics

Amazon ML data can be filtered along any of the following dimensions in the table below.

Dimension: MLModelId
Description: The identifier of an Amazon ML model. All available statistics are filtered by MLModelId.

Dimension: RequestMode
Description: An indicator specifying whether observations were received as part of a batch prediction request or as real-time predict requests. All available statistics are filtered by RequestMode.


AWS OpsWorks Metrics and Dimensions

AWS OpsWorks sends metrics to CloudWatch for each active stack every minute. Detailed monitoring is enabled by default. For more information, see Monitoring in the AWS OpsWorks User Guide.

AWS OpsWorks Stacks Metrics

AWS OpsWorks Stacks sends the following metrics to CloudWatch every five minutes.

CPU Metrics

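Metric: cpu_idle
Description: The percentage of time that the CPU is idle.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: cpu_nice
Description: The percentage of time that the CPU is handling processes with a positive nice value, which have a lower scheduling priority. For more information about what this measures, see nice (Unix).
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: cpu_steal
Description: As AWS allocates hypervisor CPU resources among increasing numbers of instances, virtualization load rises, and can affect how often the hypervisor can perform requested work on an instance. cpu_steal measures the percentage of time that an instance is waiting for the hypervisor to allocate physical CPU resources.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: cpu_system
Description: The percentage of time that the CPU is handling system operations.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: cpu_user
Description: The percentage of time that the CPU is handling user operations.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: cpu_waitio
Description: The percentage of time that the CPU is waiting for input/output operations.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Memory Metrics

Metric: memory_buffers
Description: The amount of buffered memory.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: memory_cached
Description: The amount of cached memory.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: memory_free
Description: The amount of free memory.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: memory_swap
Description: The amount of swap space.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: memory_total
Description: The total amount of memory.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: memory_used
Description: The amount of memory in use.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Load Metrics

Metric: load_1
Description: The load averaged over a one-minute window.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: load_5
Description: The load averaged over a five-minute window.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Metric: load_15
Description: The load averaged over a 15-minute window.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None

Process Metrics

Metric: procs
Description: The number of active processes.
Valid Dimensions: The IDs of the individual resources for which you are viewing metrics: StackId, LayerId, or InstanceId.
Valid Statistics: Average, Minimum, Maximum, Sum, or Data Samples.
Unit: None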

Dimensions for AWS OpsWorks Metrics

AWS OpsWorks data can be filtered along any of the following dimensions in the table below.

Dimension: StackId
Description: Average values for a stack.

Dimension: LayerId
Description: Average values for a layer.

Dimension: InstanceId
Description: Average values for an instance.


Amazon Polly Metrics

Amazon Polly sends metrics to CloudWatch. For more information, see the Amazon Polly Developer Guide.

Amazon Polly Metrics

Amazon Polly produces the following metrics for each request. These metrics are aggregated and sent to CloudWatch at one-minute intervals, where they are available in the AWS/Polly namespace.

Metric: RequestCharacters
Description: The number of characters in the request. This is billable characters only and does not include SSML tags.
Valid Dimension: Operation
Valid Statistics: Minimum, Maximum, Average, SampleCount, Sum
Unit: Count

Metric: ResponseLatency
Description: The latency between when the request was made and the start of the streaming response.
Valid Dimension: Operation
Valid Statistics: Minimum, Maximum, Average, SampleCount
Unit: milliseconds

Metric: 2XXCount
Description: HTTP 200 level code returned upon a successful response.
Valid Dimension: Operation
Valid Statistics: Average, SampleCount, Sum
Unit: Count

Metric: 4XXCount
Description: HTTP 400 level error code returned upon an error. For each successful response, a zero (0) is emitted.
Valid Dimension: Operation
Unit: Count

Metric: 5XXCount
Description: HTTP 500 level error code returned upon an error. For each successful response, a zero (0) is emitted.
Valid Dimension: Operation
Unit: Count


Dimensions for Amazon Polly Metrics

Amazon Polly provides metrics for the following dimension.

Dimension: Operation
Description: Metrics are grouped by the API method they refer to. Possible values are SynthesizeSpeech, PutLexicon, DescribeVoices, etc.

Amazon Redshift Metrics and Dimensions

Amazon Redshift sends metrics to CloudWatch for each active cluster every minute. Detailed monitoring is enabled by default. For more information, see Monitoring Amazon Redshift Cluster Performance in the Amazon Redshift Cluster Management Guide.

Amazon Redshift Metrics

The AWS/Redshift namespace includes the following metrics.

Metric: CPUUtilization
Description: The percentage of CPU utilization. For clusters, this metric represents an aggregation of the CPU utilization values of all nodes (leader and compute).
Units: Percent
Dimensions: NodeID, ClusterIdentifier

Metric: DatabaseConnections
Description: The number of database connections to a cluster.
Units: Count
Dimensions: ClusterIdentifier

Metric: HealthStatus
Description: Indicates the health of the cluster. Every minute the cluster connects to its database and performs a simple query. If it is able to perform this operation successfully, the cluster is considered healthy. Otherwise, the cluster is unhealthy. An unhealthy status can occur when the cluster database is under extremely heavy load or if there is a configuration problem with a database on the cluster. The exception to this is when the cluster is undergoing maintenance. Even though your cluster might be unavailable due to maintenance tasks, the cluster remains in HEALTHY state. For more information, see Maintenance Windows in the Amazon Redshift Cluster Management Guide.
Note: In Amazon CloudWatch this metric is reported as 1 or 0, whereas in the Amazon CloudWatch console it is displayed with the words HEALTHY or UNHEALTHY for convenience. When this metric is displayed in the Amazon CloudWatch console, sampling averages are ignored and only HEALTHY or UNHEALTHY are displayed. In Amazon CloudWatch, values other than 1 and 0 may occur because of sampling issues. Any value below 1 for HealthStatus is reported as 0 (UNHEALTHY).
Units: 1/0 (HEALTHY/UNHEALTHY in the Amazon CloudWatch console)
Dimensions: ClusterIdentifier

Metric: MaintenanceMode
Description: Indicates whether the cluster is in maintenance mode.
Note: In Amazon CloudWatch this metric is reported as 1 or 0, whereas in the Amazon CloudWatch console it is displayed with the words ON or OFF for convenience. When this metric is displayed in the Amazon CloudWatch console, sampling averages are ignored and only ON or OFF are displayed. In Amazon CloudWatch, values other than 1 and 0 may occur because of sampling issues. Any value greater than 0 for MaintenanceMode is reported as 1 (ON).
Units: 1/0 (ON/OFF in the Amazon CloudWatch console)
Dimensions: ClusterIdentifier

Metric: NetworkReceiveThroughput
Description: The rate at which the node or cluster receives data.
Units: Bytes/second (MB/s in the Amazon CloudWatch console)
Dimensions: NodeID, ClusterIdentifier

Metric: NetworkTransmitThroughput
Description: The rate at which the node or cluster writes data.
Units: Bytes/second (MB/s in the Amazon CloudWatch console)
Dimensions: NodeID, ClusterIdentifier

Metric: PercentageDiskSpaceUsed
Description: The percent of disk space used.
Units: Percent
Dimensions: NodeID, ClusterIdentifier

Metric: ReadIOPS
Description: The average number of disk read operations per second.
Units: Count/second
Dimensions: NodeID

Metric: ReadLatency
Description: The average amount of time taken for disk read I/O operations.
Units: Seconds
Dimensions: NodeID

Metric: ReadThroughput
Description: The average number of bytes read from disk per second.
Units: Bytes (GB/s in the Amazon CloudWatch console)
Dimensions: NodeID

Metric: WriteIOPS
Description: The average number of write operations per second.
Units: Count/second
Dimensions: NodeID

Metric: WriteLatency
Description: The average amount of time taken for disk write I/O operations.
Units: Seconds
Dimensions: NodeID

Metric: WriteThroughput
Description: The average number of bytes written to disk per second.
Units: Bytes (GB/s in the Amazon CloudWatch console)
Dimensions: NodeID
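The notes on HealthStatus and MaintenanceMode say that sampled values between 0 and 1 can appear in CloudWatch, and give a rule for collapsing them. The following is a small sketch of those two rules as stated in the notes. The function names are hypothetical, and the sketch is not how the console itself is implemented.

```python
def health_status_label(value: float) -> str:
    """Map a HealthStatus data point to a label.

    Per the note above, any value below 1 is treated as 0 (UNHEALTHY).
    """
    return "HEALTHY" if value >= 1 else "UNHEALTHY"


def maintenance_mode_label(value: float) -> str:
    """Map a MaintenanceMode data point to a label.

    Per the note above, any value greater than 0 is treated as 1 (ON).
    """
    return "ON" if value > 0 else "OFF"


# A sampled average of 0.8 is not a clean 1, so the cluster counts as UNHEALTHY.
print(health_status_label(0.8))
# A sampled average of 0.2 is above 0, so maintenance mode counts as ON.
print(maintenance_mode_label(0.2))
```

Note that the two rules round in opposite directions: HealthStatus rounds down and MaintenanceMode rounds up, so a single shared threshold would misclassify one of them.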

Dimensions for Amazon Redshift Metrics

Amazon Redshift data can be filtered along any of the following dimensions in the table below.

Dimension: NodeID
Description: Filters requested data that is specific to the nodes of a cluster. NodeID will be either "Leader", "Shared", or "Compute-N", where N is 0, 1, ... for the number of nodes in the cluster. "Shared" means that the cluster has only one node, i.e. the leader node and compute node are combined.
Metrics are reported for the leader node and compute nodes only for CPUUtilization, NetworkTransmitThroughput, and ReadIOPS. Other metrics that use the NodeID dimension are reported only for compute nodes.

Dimension: ClusterIdentifier
Description: Filters requested data that is specific to the cluster. Metrics that are specific to clusters include HealthStatus, MaintenanceMode, and DatabaseConnections. In general, metrics for this dimension (e.g. ReadIOPS) that are also metrics of nodes represent an aggregate of the node metric data. You should take care in interpreting these metrics because they aggregate the behavior of leader and compute nodes.


Amazon RDS Metrics and Dimensions

Amazon Relational Database Service sends metrics to CloudWatch for each active database instance every minute. Detailed monitoring is enabled by default. For more information, see Monitoring a DB Instance in the Amazon Relational Database Service User Guide.

Amazon RDS Metrics

The AWS/RDS namespace includes the following metrics.

Metric: BinLogDiskUsage
Description: The amount of disk space occupied by binary logs on the master. Applies to MySQL read replicas.
Units: Bytes

Metric: BurstBalance
Description: The percent of General Purpose SSD (gp2) burst-bucket I/O credits available.
Units: Percent

Metric: CPUUtilization
Description: The percentage of CPU utilization.
Units: Percent

Metric: CPUCreditUsage
Description: [T2 instances] The number of CPU credits consumed by the instance. One CPU credit equals one vCPU running at 100% utilization for one minute or an equivalent combination of vCPUs, utilization, and time (for example, one vCPU running at 50% utilization for two minutes or two vCPUs running at 25% utilization for two minutes).
CPU credit metrics are available only at a five-minute frequency. If you specify a period greater than five minutes, use the Sum statistic instead of the Average statistic.
Units: Count

Metric: CPUCreditBalance
Description: [T2 instances] The number of CPU credits available for the instance to burst beyond its base CPU utilization. Credits are stored in the credit balance after they are earned and removed from the credit balance after they expire. Credits expire 24 hours after they are earned.
CPU credit metrics are available only at a five-minute frequency.
Units: Count

Metric: DatabaseConnections
Description: The number of database connections in use.
Units: Count

Metric: DiskQueueDepth
Description: The number of outstanding IOs (read/write requests) waiting to access the disk.
Units: Count

Metric: FreeableMemory
Description: The amount of available random access memory.
Units: Bytes

Metric: FreeStorageSpace
Description: The amount of available storage space.
Units: Bytes

Metric: ReplicaLag
Description: The amount of time a Read Replica DB instance lags behind the source DB instance. Applies to MySQL, MariaDB, and PostgreSQL Read Replicas.
Units: Seconds

Metric: SwapUsage
Description: The amount of swap space used on the DB instance.
Units: Bytes

Metric: ReadIOPS
Description: The average number of disk I/O operations per second.
Units: Count/Second

Metric: WriteIOPS
Description: The average number of disk I/O operations per second.
Units: Count/Second

Metric: ReadLatency
Description: The average amount of time taken per disk I/O operation.
Units: Seconds

Metric: WriteLatency
Description: The average amount of time taken per disk I/O operation.
Units: Seconds

Metric: ReadThroughput
Description: The average number of bytes read from disk per second.
Units: Bytes/Second

Metric: WriteThroughput
Description: The average number of bytes written to disk per second.
Units: Bytes/Second

Metric: NetworkReceiveThroughput
Description: The incoming (Receive) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
Units: Bytes/second

Metric: NetworkTransmitThroughput
Description: The outgoing (Transmit) network traffic on the DB instance, including both customer database traffic and Amazon RDS traffic used for monitoring and replication.
Units: Bytes/second
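The CPUCreditUsage description defines one CPU credit as one vCPU running at 100% utilization for one minute, or an equivalent combination. As a worked illustration of that definition only, the sketch below multiplies the three factors together. The function name cpu_credits and its parameters are hypothetical, and the sketch does not model how credits are earned or expire.

```python
def cpu_credits(vcpus: int, utilization_percent: float, minutes: float) -> float:
    """Credits consumed, using the definition: 1 credit = 1 vCPU at 100% for 1 minute."""
    return vcpus * (utilization_percent / 100.0) * minutes


# The three combinations named in the CPUCreditUsage description.
print(cpu_credits(1, 100, 1))  # one vCPU at 100% for one minute
print(cpu_credits(1, 50, 2))   # one vCPU at 50% for two minutes
print(cpu_credits(2, 25, 2))   # two vCPUs at 25% for two minutes
```

All three calls describe the same amount of work, which is why the description treats them as equivalent to a single credit.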

Dimensions for RDS Metrics

Amazon RDS data can be filtered along any of the following dimensions in the table below.


Dimension: DBInstanceIdentifier
Description: This dimension filters the data you request for a specific database instance.

Dimension: DBClusterIdentifier
Description: This dimension filters the data you request for a specific Amazon Aurora DB cluster.

Dimension: DBClusterIdentifier, Role
Description: This dimension filters the data you request for a specific Amazon Aurora DB cluster, aggregating the metric by instance role (WRITER/READER). For example, you can aggregate metrics for all READER instances that belong to a cluster.

Dimension: DatabaseClass
Description: This dimension filters the data you request for all instances in a database class. For example, you can aggregate metrics for all instances that belong to the database class db.m1.small.

Dimension: EngineName
Description: This dimension filters the data you request for the identified engine name only. For example, you can aggregate metrics for all instances that have the engine name mysql.

Amazon Route 53 Metrics and Dimensions

Amazon Route 53 sends metrics to CloudWatch. CloudWatch provides detailed monitoring of Amazon Route 53 by default. Amazon Route 53 sends one-minute metrics to CloudWatch. For more information, see Monitoring Health Checks Using Amazon CloudWatch in the Amazon Route 53 Developer Guide.

Note

To get Amazon Route 53 metrics using CloudWatch, you must choose US East (N. Virginia) as the region. Amazon Route 53 metrics are not available if you select any other region. You can also optionally specify a Region dimension. For more information, see Dimensions for Amazon Route 53 Metrics.

Amazon Route 53 Metrics

The AWS/Route53 namespace includes the following metrics.

Metric: ChildHealthCheckHealthyCount
Description: For a calculated health check, the number of health checks that are healthy among the health checks that Amazon Route 53 is monitoring.
Valid statistics: Average (recommended), Minimum, Maximum
Units: Healthy health checks

Metric: ConnectionTime
Description: The average time, in milliseconds, that it took Amazon Route 53 health checkers to establish a TCP connection with the endpoint. You can view ConnectionTime for a health check either across all regions or for a selected geographic region.
Units: Milliseconds

Metric: HealthCheckPercentageHealthy
Description: The percentage of Amazon Route 53 health checkers that consider the selected endpoint to be healthy. You can view HealthCheckPercentageHealthy only across all regions; data is not available for a selected region.
Units: Percent

Metric: HealthCheckStatus
Description: The status of the health check endpoint that CloudWatch is checking. 1 indicates healthy, and 0 indicates unhealthy. You can view HealthCheckStatus only across all regions; data is not available for a selected region.
Valid statistics: Minimum
Units: none

Metric: SSLHandshakeTime
Description: The average time, in milliseconds, that it took Amazon Route 53 health checkers to complete the SSL handshake. You can view SSLHandshakeTime for a health check either across all regions or for a selected geographic region.
Units: Milliseconds

Metric: TimeToFirstByte
Description: The average time, in milliseconds, that it took Amazon Route 53 health checkers to receive the first byte of the response to an HTTP or HTTPS request. You can view TimeToFirstByte for a health check either across all regions or for a selected geographic region.
Units: Milliseconds

Dimensions for Amazon Route 53 Metrics

Amazon Route 53 metrics use the AWS/Route53 namespace and provide metrics for HealthCheckId. When retrieving metrics, you must supply the HealthCheckId dimension.

In addition, for ConnectionTime, SSLHandshakeTime, and TimeToFirstByte, you can optionally specify Region. If you omit Region, CloudWatch returns metrics across all regions. If you include Region, CloudWatch returns metrics only for the specified region.

For more information, see Monitoring Health Checks Using CloudWatch in the Amazon Route 53 Developer Guide.
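The dimension rules above (HealthCheckId is required, and Region is optional for three metrics only) can be sketched as a small helper that builds the dimension list for a request. This is an illustration under stated assumptions: the function name route53_dimensions is hypothetical, the Name/Value pairs follow the general shape in which CloudWatch requests express dimensions, and rejecting Region for other metrics is a design choice of this sketch rather than documented service behavior.

```python
from typing import Dict, List, Optional

# Metrics for which the text above says Region may optionally be specified.
REGION_AWARE_METRICS = {"ConnectionTime", "SSLHandshakeTime", "TimeToFirstByte"}


def route53_dimensions(
    metric_name: str,
    health_check_id: str,
    region: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Build the dimension list for a Route 53 health check metric."""
    if not health_check_id:
        raise ValueError("The HealthCheckId dimension is required")
    dimensions = [{"Name": "HealthCheckId", "Value": health_check_id}]
    if region is not None:
        if metric_name not in REGION_AWARE_METRICS:
            # Assumption of this sketch: fail early instead of sending a
            # Region dimension for a metric that is not described as supporting it.
            raise ValueError(f"Region is not described as valid for {metric_name}")
        dimensions.append({"Name": "Region", "Value": region})
    return dimensions


# "example-health-check-id" is a placeholder, not a real health check ID.
print(route53_dimensions("ConnectionTime", "example-health-check-id", "us-west-2"))
print(route53_dimensions("HealthCheckStatus", "example-health-check-id"))
```

Omitting the region argument corresponds to the "across all regions" case described above.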


Amazon Simple Email Service Metrics and Dimensions

Amazon Simple Email Service sends certain data points to CloudWatch. These data points track important metrics related to your email sending activities. For more information, see Retrieving Amazon SES Event Data from CloudWatch in the Amazon Simple Email Service Developer Guide.

Amazon SES Metrics

The following metrics are available from Amazon SES.

Metric: Bounce
Description: The recipient's mail server permanently rejected the email. This event corresponds to hard bounces; soft bounces are included only when Amazon SES fails to deliver the email after retrying for a period of time.
Unit: count

Metric: Click
Description: The recipient clicked a link in the email. Only click events from HTML emails are tracked.
Unit: count

Metric: Complaint
Description: The recipient marked the email as spam.
Unit: count

Metric: Delivery
Description: Amazon SES successfully delivered the email to the recipient's mail server.
Unit: count

Metric: Open
Description: The recipient opened an email. Only open events from HTML emails are tracked.
Unit: count

Metric: Reputation.BounceRate
Description: The percentage of messages sent from your account that bounced. Only hard bounces are considered in this calculation.
Unit: percentage

Metric: Reputation.ComplaintRate
Description: The percentage of messages sent from your account that were reported by their recipients as spam.
Unit: percentage

Metric: Send
Description: Amazon SES accepted the email and attempted to send it. This value may be distinct from the Delivery metric, because messages could bounce or be rejected.
Unit: count


Dimensions for Amazon SES Metrics

CloudWatch uses the dimension names that you specify when you add a CloudWatch event destination to a configuration set in Amazon SES. For more information, see Set Up a CloudWatch Event Destination for Amazon SES Event Publishing.

Amazon Simple Notification Service Metrics and Dimensions

Amazon Simple Notification Service sends data points to CloudWatch for several metrics. All active topics automatically send five-minute metrics to CloudWatch. Detailed monitoring, or one-minute metrics, is currently unavailable for Amazon Simple Notification Service. A topic stays active for six hours from the last activity (for example, any API call) on the topic. For more information, see Monitoring Amazon SNS with Amazon CloudWatch in the Amazon Simple Notification Service Developer Guide.

Amazon Simple Notification Service Metrics

The AWS/SNS namespace includes the following metrics.

Metric: NumberOfMessagesPublished
Description: The number of messages published.
Units: Count

Metric: PublishSize
Description: The size of messages published.
Units: Bytes
Valid Statistics: Minimum, Maximum, Average and Count

Metric: NumberOfNotificationsDelivered
Description: The number of messages successfully delivered.
Units: Count

Metric: NumberOfNotificationsFailed
Description: The number of messages that Amazon SNS failed to deliver. This metric is applied after Amazon SNS stops attempting message deliveries to Amazon SQS, email, SMS, or mobile push endpoints. Each delivery attempt to an HTTP or HTTPS endpoint adds 1 to the metric. For all other endpoints, the count increases by 1 when the message is not delivered (regardless of the number of attempts). You can control the number of retries for HTTP endpoints; for more information, see Setting Amazon SNS Delivery Retry Policies for HTTP/HTTPS Endpoints.
Units: Count
Valid Statistics: Sum, Average

Metric: SMSSuccessRate
Description: The rate of successful SMS message deliveries.
Units: Count
Valid Statistics: Sum, Average, Data Samples
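The NumberOfNotificationsFailed description counts failures differently for HTTP/HTTPS endpoints than for every other endpoint type. The sketch below models that counting rule for a single message. It is an interpretation, not a specification: the function name and parameters are hypothetical, and it assumes that "each delivery attempt" refers to each failed attempt.

```python
def failed_notifications(protocol: str, failed_attempts: int, delivered: bool) -> int:
    """Contribution of one message to NumberOfNotificationsFailed (illustrative model)."""
    if protocol.lower() in {"http", "https"}:
        # Assumption: every failed attempt to an HTTP or HTTPS endpoint adds 1.
        return max(failed_attempts, 0)
    # All other endpoints: 1 if the message was never delivered,
    # regardless of how many attempts were made.
    return 0 if delivered else 1


print(failed_notifications("https", failed_attempts=3, delivered=False))
print(failed_notifications("sqs", failed_attempts=3, delivered=False))
print(failed_notifications("email", failed_attempts=2, delivered=True))
```

Under this reading, a retry policy with many attempts can make the metric grow faster for HTTP/HTTPS subscriptions than for other protocols, even when the number of undelivered messages is the same.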

Dimensions for Amazon Simple Notification Service Metrics

Amazon SNS sends the following dimensions to CloudWatch.

Dimension: Application
Description: Filters on application objects, which represent an app and device registered with one of the supported push notification services, such as APNS and GCM.

Dimension: Application,Platform
Description: Filters on application and platform objects, where the platform objects are for the supported push notification services, such as APNS and GCM.

Dimension: Country
Description: Filters on the destination country of an SMS message. The country is represented by its ISO 3166-1 alpha-2 code.

Dimension: Platform
Description: Filters on platform objects for the push notification services, such as APNS and GCM.

Dimension: TopicName
Description: Filters on Amazon SNS topic names.

Dimension: SMSType
Description: Filters on the message type of SMS message. Can be promotional or transactional.

Amazon SQS Metrics and Dimensions

Amazon SQS sends data points to CloudWatch for several metrics. All active queues automatically send five-minute metrics to CloudWatch. Detailed monitoring, or one-minute metrics, is currently unavailable for Amazon SQS. A queue stays active for six hours from the last activity (for example, any API call) on the queue. For more information, see Monitoring Amazon SQS with Amazon CloudWatch in the Amazon Simple Queue Service Developer Guide.

Amazon SQS Metrics

The AWS/SQS namespace includes the following metrics.

Metric: ApproximateAgeOfOldestMessage
Description: The approximate age of the oldest non-deleted message in the queue.
Units: Seconds
Valid Statistics: Average, Minimum, Maximum, Sum, Data Samples (displays as Sample Count in the Amazon SQS console)

Metric: ApproximateNumberOfMessagesDelayed
Description: The number of messages in the queue that are delayed and not available for reading immediately. This can happen when the queue is configured as a delay queue or when a message has been sent with a delay parameter.
Units: Count

Metric: ApproximateNumberOfMessagesNotVisible
Description: The number of messages that are "in flight." Messages are considered in flight if they have been sent to a client but have not yet been deleted or have not yet reached the end of their visibility window.
Units: Count

Metric: ApproximateNumberOfMessagesVisible
Description: The number of messages available for retrieval from the queue.
Units: Count

Metric: NumberOfEmptyReceives
Description: The number of ReceiveMessage API calls that did not return a message.
Units: Count

Metric: NumberOfMessagesDeleted
Description: The number of messages deleted from the queue.
Units: Count
Amazon SQS emits the NumberOfMessagesDeleted metric for every successful deletion operation that uses a valid receipt handle, including duplicate deletions. The following scenarios might cause the value of the NumberOfMessagesDeleted metric to be higher than expected:
• Calling the DeleteMessage action on different receipt handles that belong to the same message: If the message is not processed before the visibility timeout expires, the message becomes available to other consumers that can process it and delete it again, increasing the value of the NumberOfMessagesDeleted metric.
• Calling the DeleteMessage action on the same receipt handle: If the message is processed and deleted but you call the DeleteMessage action again using the same receipt handle, a success status is returned, increasing the value of the NumberOfMessagesDeleted metric.

Metric: NumberOfMessagesReceived
Description: The number of messages returned by calls to the ReceiveMessage API action.
Units: Count

Metric: NumberOfMessagesSent
Description: The number of messages added to a queue.
Units: Count

Metric: SentMessageSize
Description: The size of messages added to a queue.
Units: Bytes
Note that SentMessageSize does not display as an available metric in the CloudWatch console until at least one message is sent to the corresponding queue.

Dimensions for Amazon SQS Metrics

The only dimension that Amazon SQS sends to CloudWatch is QueueName. This means that all available statistics are filtered by QueueName.

Amazon Simple Storage Service Metrics and Dimensions

Amazon Simple Storage Service sends data points to CloudWatch for several metrics, such as object counts and bytes stored, once a day. For more information, see Monitoring Amazon S3 with CloudWatch in the Amazon Simple Storage Service Developer Guide.

Amazon S3 CloudWatch Metrics

The AWS/S3 namespace includes the following daily storage metrics for buckets.

Metric: BucketSizeBytes
Description: The amount of data in bytes stored in a bucket in the Standard storage class, Standard - Infrequent Access (Standard_IA) storage class, or the Reduced Redundancy Storage (RRS) storage class.
Valid storage type filters: StandardStorage, StandardIAStorage, or ReducedRedundancyStorage (see StorageType dimension)
Units: Bytes
Valid statistics: Average

Metric: NumberOfObjects
Description: The total number of objects stored in a bucket for all storage classes except for the GLACIER storage class.
Valid storage type filters: AllStorageTypes only (see StorageType dimension)
Units: Count
Valid statistics: Average


The AWS/S3 namespace includes the following request metrics.

Metric: AllRequests
Description: The total number of HTTP requests made to a bucket, regardless of type. If you use a metrics configuration with a filter, this metric returns only the HTTP requests made to the objects in the bucket that meet the filter requirements.
Units: Count
Valid statistics: Sum

Metric: GetRequests
Description: The number of HTTP GET requests made for objects in a bucket. This doesn't include list operations. Paginated list-oriented requests, such as List Multipart Uploads, List Parts, Get Bucket Object Versions, and others, are not included in this metric.
Units: Count

Metric: PutRequests
Description: The number of HTTP PUT requests made for objects in a bucket.
Units: Count

Metric: DeleteRequests
Description: The number of HTTP DELETE requests made for objects in a bucket. This also includes Delete Multiple Objects requests.
Units: Count

Metric: HeadRequests
Description: The number of HTTP HEAD requests made to a bucket.
Units: Count

Metric: PostRequests
Description: The number of HTTP POST requests made to a bucket.
Units: Count

Metric: ListRequests
Description: The number of HTTP requests that list the contents of a bucket.
Units: Count

Metric: BytesDownloaded
Description: The number of bytes downloaded for requests made to a bucket, where the response includes a body.
Units: Bytes
Valid statistics: Average (bytes per request), Sum (bytes per period), Sample Count, Min, Max

Metric: BytesUploaded
Description: The number of bytes uploaded to a bucket for requests that contain a request body.
Units: Bytes
Valid statistics: Average (bytes per request), Sum (bytes per period), Sample Count, Min, Max

Metric: 4xxErrors
Description: The number of HTTP 4xx client error status code requests made to a bucket with a value of either 0 or 1. The average statistic shows the error rate, and the sum statistic shows the count of that type of error, during each period.
Units: Count
Valid statistics: Average (reports per request), Sum (reports per period), Min, Max, Sample Count

Metric: 5xxErrors
Description: The number of HTTP 5xx server error status code requests made to a bucket with a value of either 0 or 1. The average statistic shows the error rate, and the sum statistic shows the count of that type of error, during each period.
Units: Count
Valid statistics: Average (reports per request), Sum (reports per period), Min, Max, Sample Count

Metric: FirstByteLatency
Description: The per-request time from the complete request being received by a bucket to when the response starts to be returned.
Units: Milliseconds
Valid statistics: Average, Sum, Min, Max, Sample Count

Metric: TotalRequestLatency
Description: The elapsed per-request time from the first byte received to the last byte sent to a bucket. This includes the time taken to receive the request body and send the response body, which is not included in FirstByteLatency.
Units: Milliseconds
Valid statistics: Average, Sum, Min, Max, Sample Count
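The 4xxErrors and 5xxErrors descriptions say each request reports a value of 0 or 1, so that the average is an error rate and the sum is an error count. The sketch below illustrates why that holds, using a plain list of per-request values. The function name and the sample data are made up for illustration; in practice CloudWatch computes these statistics for you.

```python
from typing import Sequence, Tuple


def summarize_error_flags(flags: Sequence[int]) -> Tuple[float, int]:
    """Return (average, sum) for per-request 0/1 error values in one period."""
    if not flags:
        # No requests in the period: report a zero rate and a zero count.
        return 0.0, 0
    total = sum(flags)
    return total / len(flags), total


# Four hypothetical requests in one period, one of which returned an error.
rate, count = summarize_error_flags([0, 1, 0, 0])
print(f"error rate: {rate}, error count: {count}")
```

Because the values are only 0 or 1, the sum counts the errors and dividing by the number of requests gives the fraction that failed.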

Amazon S3 CloudWatch Dimensions

The following dimensions are used to filter Amazon S3 metrics.

Dimension: BucketName
Description: Filters the data you request for the identified bucket only.

Dimension: StorageType
Description: Filters the data stored in a bucket by the type of storage. The types are StandardStorage for the Standard storage class, StandardIAStorage for the Standard_IA storage class, ReducedRedundancyStorage for the Reduced Redundancy Storage (RRS) class, and AllStorageTypes. Note that the AllStorageTypes type does not include the GLACIER storage class.

Dimension: FilterId
Description: Filters metrics configurations that you specify for request metrics on a bucket, for example, a prefix or a tag. You specify a filter ID when you create a metrics configuration.

AWS Shield Advanced Metrics

For information about the Shield Advanced metrics that you can use with CloudWatch, see Shield Advanced Metrics in the AWS WAF Developer Guide.

AWS Step Functions Metrics and Dimensions

The following metrics are available for AWS Step Functions. For more information, see Monitoring Step Functions Using CloudWatch in the AWS Step Functions Developer Guide.

Execution Metrics

The AWS/States namespace includes the following metrics for Step Functions executions:

Metric: ExecutionTime
Description: The interval, in milliseconds, between the time the execution starts and the time it closes.

Metric: ExecutionThrottled
Description: The number of StateEntered events in addition to retries.

Metric: ExecutionsAborted
Description: The number of aborted or terminated executions.

Metric: ExecutionsFailed
Description: The number of failed executions.

Metric: ExecutionsStarted
Description: The number of started executions.

Metric: ExecutionsSucceeded
Description: The number of successfully completed executions.

Metric: ExecutionsTimedOut
Description: The number of executions that time out for any reason.

Dimension for Step Functions Execution Metrics

Dimension: StateMachineArn
Description: The ARN of the state machine for the execution in question.

Activity Metrics

The AWS/States namespace includes the following metrics for Step Functions activities:

Metric: ActivityRunTime
Description: The interval, in milliseconds, between the time the activity starts and the time it closes.

Metric: ActivityScheduleTime
Description: The interval, in milliseconds, for which the activity stays in the schedule state.

Metric: ActivityTime
Description: The interval, in milliseconds, between the time the activity is scheduled and the time it closes.

Metric: ActivitiesFailed
Description: The number of failed activities.

Metric: ActivitiesHeartbeatTimedOut
Description: The number of activities that time out due to a heartbeat timeout.

Metric: ActivitiesScheduled
Description: The number of scheduled activities.

Metric: ActivitiesStarted
Description: The number of started activities.

Metric: ActivitiesSucceeded
Description: The number of successfully completed activities.

Metric: ActivitiesTimedOut
Description: The number of activities that time out on close.

Dimension for Step Functions Activity Metrics

Dimension: ActivityArn
Description: The ARN of the activity.

Lambda Function Metrics

The AWS/States namespace includes the following metrics for Step Functions Lambda functions:

Metric: LambdaFunctionRunTime
Description: The interval, in milliseconds, between the time the Lambda function starts and the time it closes.

Metric: LambdaFunctionScheduleTime
Description: The interval, in milliseconds, for which the Lambda function stays in the schedule state.

Metric: LambdaFunctionTime
Description: The interval, in milliseconds, between the time the Lambda function is scheduled and the time it closes.

Metric: LambdaFunctionsFailed
Description: The number of failed Lambda functions.

Metric: LambdaFunctionsHeartbeatTimedOut
Description: The number of Lambda functions that time out due to a heartbeat timeout.

Metric: LambdaFunctionsScheduled
Description: The number of scheduled Lambda functions.

Metric: LambdaFunctionsStarted
Description: The number of started Lambda functions.

Metric: LambdaFunctionsSucceeded
Description: The number of successfully completed Lambda functions.

Metric: LambdaFunctionsTimedOut
Description: The number of Lambda functions that time out on close.


Dimension for Step Functions Lambda Function Metrics

Dimension: LambdaFunctionArn
Description: The ARN of the Lambda function.

Amazon SWF Metrics and Dimensions

Amazon SWF sends data points to CloudWatch for several metrics. Some of the Amazon SWF metrics for CloudWatch are time intervals, always measured in milliseconds. These metrics generally correspond to stages of your workflow execution for which you can set workflow and activity timeouts, and have similar names. For example, the DecisionTaskStartToCloseTime metric measures the time it took for the decision task to complete after it began executing, which is the same time period for which you can set a DecisionTaskStartToCloseTimeout value.

Other Amazon SWF metrics report results as a count. For example, WorkflowsCanceled records a result as either one or zero, indicating whether or not the workflow was canceled. A value of zero does not indicate that the metric was not reported, only that the condition described by the metric did not occur.

For count metrics, minimum and maximum will always be either zero or one, but average will be a value ranging from zero to one. For more information, see Viewing Amazon SWF Metrics for CloudWatch using the AWS Management Console in the Amazon Simple Workflow Service Developer Guide.
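As a rough illustration of how the statistics behave for a count metric, the following Python sketch aggregates a series of WorkflowsCanceled data points. The sample values are invented for this example and do not come from a real workflow.

```python
# Hypothetical WorkflowsCanceled data points for one period:
# 1 means the workflow was canceled, 0 means it was not.
data_points = [0, 1, 0, 0, 1]

minimum = min(data_points)                      # always 0 or 1
maximum = max(data_points)                      # always 0 or 1
average = sum(data_points) / len(data_points)   # fraction of canceled workflows

print(minimum, maximum, average)
```

Read this way, the average is the fraction of reported workflows for which the condition occurred.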

Workflow Metrics

The AWS/SWF namespace includes the following metrics for Amazon SWF workflows:

Metric: DecisionTaskScheduleToStartTime
Description: The time interval, in milliseconds, between the time that the decision task was scheduled and the time it was picked up by a worker and started.

Metric: DecisionTaskStartToCloseTime
Description: The time interval, in milliseconds, between the time that the decision task was started and the time it was closed.

Metric: DecisionTasksCompleted
Description: The count of decision tasks that have been completed.

Metric: StartedDecisionTasksTimedOutOnClose
Description: The count of decision tasks that started but timed out on closing.

Metric: WorkflowStartToCloseTime
Description: The time, in milliseconds, between the time the workflow started and the time it closed.

Metric: WorkflowsCanceled
Description: The count of workflows that were canceled.

Metric: WorkflowsCompleted
Description: The count of workflows that completed.

Metric: WorkflowsContinuedAsNew
Description: The count of workflows that continued as new.

Metric: WorkflowsFailed
Description: The count of workflows that failed.

Metric: WorkflowsTerminated
Description: The count of workflows that were terminated.

Metric: WorkflowsTimedOut
Description: The count of workflows that timed out, for any reason.


Dimensions for Amazon SWF Workflow Metrics

Dimension: Domain
Description: The Amazon SWF domain that the workflow is running in.

Dimension: WorkflowTypeName
Description: The name of the workflow type for this workflow execution.

Dimension: WorkflowTypeVersion
Description: The version of the workflow type for this workflow execution.

Activity Metrics

The AWS/SWF namespace includes the following metrics for Amazon SWF activities:

Metric: ActivityTaskScheduleToCloseTime
Description: The time interval, in milliseconds, between the time when the activity was scheduled and when it closed.

Metric: ActivityTaskScheduleToStartTime
Description: The time interval, in milliseconds, between the time when the activity task was scheduled and when it started.

Metric: ActivityTaskStartToCloseTime
Description: The time interval, in milliseconds, between the time when the activity task started and when it was closed.

Metric: ActivityTasksCanceled
Description: The count of activity tasks that were canceled.

Metric: ActivityTasksCompleted
Description: The count of activity tasks that completed.

Metric: ActivityTasksFailed
Description: The count of activity tasks that failed.

Metric: ScheduledActivityTasksTimedOutOnClose
Description: The count of activity tasks that were scheduled but timed out on close.

Metric: ScheduledActivityTasksTimedOutOnStart
Description: The count of activity tasks that were scheduled but timed out on start.

Metric: StartedActivityTasksTimedOutOnClose
Description: The count of activity tasks that were started but timed out on close.

Metric: StartedActivityTasksTimedOutOnHeartbeat
Description: The count of activity tasks that were started but timed out due to a heartbeat timeout.

Dimensions for Amazon SWF Activity Metrics

Dimension: Domain
Description: The Amazon SWF domain that the activity is running in.

Dimension: ActivityTypeName
Description: The name of the activity type.

Dimension: ActivityTypeVersion
Description: The version of the activity type.


AWS Storage Gateway Metrics and Dimensions

AWS Storage Gateway sends data points to CloudWatch for several metrics. All active gateways automatically send five-minute metrics to CloudWatch. Detailed monitoring, or one-minute metrics, is currently unavailable for AWS Storage Gateway. For more information, see Monitoring Your AWS Storage Gateway in the AWS Storage Gateway User Guide.

AWS Storage Gateway Metrics

The AWS/StorageGateway namespace includes the following metrics.

You can use these metrics to get information about your gateways. Specify the GatewayId or GatewayName dimension for each metric to view the data for a gateway. Note that these metrics are measured in 5-minute intervals.

Metric

CacheHitPercent

CachePercentUsed

CachePercentDirty

CloudBytesDownloaded

CloudDownloadLatency

Description

Percent of application reads served from the cache. The sample is taken at the end of the reporting period.

Applies To..

File, Cached volumes and Tape.

Units: Percent

Percent use of the gateway's cache storage. The sample is taken at the end of the reporting period.


Units: Percent

Percent of the gateway's cache that has not been persisted to

AWS. The sample is taken at the end of the reporting period.

Units: Percent

The total number of compressed bytes that the gateway downloaded from AWS during the reporting period.

Use this metric with the

Sum statistic to measure throughput and with the

Samples

statistic to measure input/output operations per second (IOPS).

Units: Bytes

The total number of milliseconds spent reading data from AWS during the reporting period.


Average statistic to measure latency.


File, Cached volumes, Stored volumes and Tape.



Metric

CloudBytesUploaded

UploadBufferFree

CacheFree

UploadBufferPercentUsed

UploadBufferUsed

CacheUsed

QueuedWrites



Description

Units: Milliseconds

The total number of compressed bytes that the gateway uploaded to AWS during the reporting period.



Samples

statistic to measure IOPS.

Applies To..


Units: Bytes

The total amount of unused space in the gateway's upload buffer. The sample is taken at the end of the reporting period.

Units: Bytes

The total amount of unused space in the gateway's cache storage. The sample is taken at the end of the reporting period.

Cached volumes and Tape.

File, Cached volumes, and Tape.

Units: Bytes

Percent use of the gateway's upload buffer. The sample is taken at the end of the reporting period.


Units: Percent

The total number of bytes being used in the gateway's upload buffer. The sample is taken at the end of the reporting period.

Units: Bytes

The total number of bytes being used in the gateway's cache storage. The sample is taken at the end of the reporting period.

Units: Bytes

The number of bytes waiting to be written to AWS, sampled at the end of the reporting period for all volumes in the gateway.

These bytes are kept in your gateway's working storage.

Units: Bytes





Metric

ReadBytes

ReadTime

TotalCacheSize

WriteBytes

WriteTime



Description

The total number of bytes read from your on-premises applications in the reporting period for all volumes in the gateway.



Samples


Units: Bytes

The total number of milliseconds spent to do read operations from your on-premises applications in the reporting period for all volumes in the gateway.



Units: Milliseconds

The total size of the cache in bytes. The sample is taken at the end of the reporting period.

Applies To..



File, Cached volumes, and Tape.

Units: Bytes

The total number of bytes written to your on-premises applications in the reporting period for all volumes in the gateway.



Samples


Units: Bytes

The total number of milliseconds spent to do write operations from your on-premises applications in the reporting period for all volumes in the gateway.



Units: Milliseconds






Metric

TimeSinceLastRecoveryPoint

WorkingStorageFree

WorkingStoragePercentUsed

WorkingStorageUsed

Description

The time since the last available recovery point. For more information, see Using Volume

Recovery Points for Your Cached

Volumes Setup

Units: Seconds

The total amount of unused space in the gateway's working storage. The sample is taken at the end of the reporting period.

Applies To..

Cached volumes and Stored volumes.

Stored volumes only.

Units: Bytes

Percent use of the gateway's upload buffer. The sample is taken at the end of the reporting period.


Units: Percent

The total number of bytes being used in the gateway's upload buffer. The sample is taken at the end of the reporting period.


Units: Bytes

The following table describes the AWS Storage Gateway metrics that you can use to get information about your storage volumes. Specify the VolumeId dimension for each metric to view the data for a storage volume.

Metric

CacheHitPercent

CachePercentDirty

Description

Percent of application read operations from the volume that are served from cache. The sample is taken at the end of the reporting period.

When there are no application read operations from the volume, this metric reports 100 percent.

Units: Percent

The volume's contribution to the overall percentage of the gateway's cache

Cached volumes

yes yes

Stored volumes

no no


Metric

CachePercentUsed

ReadBytes



Description

that has not been persisted to AWS. The sample is taken at the end of the reporting period.

Use the

CachePercentDirty metric of the gateway to view the overall percentage of the gateway's cache that has not been persisted to AWS. For more information, see Monitoring Your

Gateway .

Units: Percent

The volume's contribution to the overall percent use of the gateway's cache storage. The sample is taken at the end of the reporting period.

Use the

CachePercentUsed metric of the gateway to view overall percent use of the gateway's cache storage. For more information, see Monitoring Your

Gateway .

Units: Percent

The total number of bytes read from your on-premises applications in the reporting period.


Sum

statistic to measure throughput and with the

Samples


Units: Bytes

Cached volumes

yes yes

Stored volumes

no yes




Metric

ReadTime

WriteBytes

WriteTime

QueuedWrites

Description

The total number of milliseconds spent to do read operations from your on-premises applications in the reporting period.


Average

statistic to measure latency.

Units: Milliseconds

The total number of bytes written to your on-premises applications in the reporting period.


Sum


Samples


Units: Bytes

The total number of milliseconds spent to do write operations from your on-premises applications in the reporting period.


Average


Units: Milliseconds

The number of bytes waiting to be written to AWS, sampled at the end of the reporting period.

Units: Bytes

Cached volumes

yes yes yes yes

Stored volumes

yes yes yes yes

The following table describes the metrics that you can use to get information about your file shares.

Metric

CacheHitPercent

Description

Percent of application read operations from the file shares that are served from cache. The sample is taken at the end of the reporting period.


Metric

CachePercentDirty

CachePercentUsed

ReadBytes

ReadTime

WriteBytes



Description

When there are no application read operations from the file share, this metric reports 100 percent.

Units: Percent

The file share's contribution to the overall percentage of the gateway's cache that has not been persisted to AWS. The sample is taken at the end of the reporting period.

Use the

CachePercentDirty

metric of the gateway to view the overall percentage of the gateway's cache that has not been persisted to AWS. For more information, see Monitoring Your Gateway .

Units: Percent

The file share's contribution to the overall percent use of the gateway's cache storage. The sample is taken at the end of the reporting period.

Use the

CachePercentUsed

metric of the gateway to view overall percent use of the gateway's cache storage. For more information, see Monitoring

Your Gateway .

Units: Percent

The total number of bytes read from your on-premises applications in the reporting period for a file share.


Sum


Samples


Units: Bytes

The total number of milliseconds spent to do read operations from your on-premises applications in the reporting period.


Average


Units: Milliseconds

The total number of bytes written to your on-premises applications in the reporting period.


Sum


Samples


Units: Bytes


Metric

WriteTime

Description

The total number of milliseconds spent to do write operations from your on-premises applications in the reporting period.


Average


Units: Milliseconds

Dimensions for AWS Storage Gateway Metrics

The Amazon CloudWatch namespace for the AWS Storage Gateway service is AWS/StorageGateway. Data is available automatically in 5-minute periods at no charge.

Dimension: GatewayId, GatewayName
Description: These dimensions filter the data you request to gateway-specific metrics. You can identify a gateway to work with by its GatewayId or its GatewayName. However, if the name of your gateway changed during the time range for which you want to view metrics, use the GatewayId. Throughput and latency data of a gateway is based on all the volumes for the gateway. For information about working with gateway metrics, see Measuring Performance Between Your Gateway and AWS.

Dimension: VolumeId
Description: This dimension filters the data you request to volume-specific metrics. Identify a storage volume to work with by its VolumeId. For information about working with volume metrics, see Measuring Performance Between Your Application and Gateway.

Amazon VPC NAT Gateway Metrics and Dimensions

NAT gateway metric data is provided at 1-minute frequency. For more information, see Monitoring Your NAT Gateway with Amazon CloudWatch in the Amazon VPC User Guide.

NAT Gateway Metrics

The following metrics are available from the NAT gateway service.

Metric

PacketsOutToDestination

Description

The number of packets sent out through the NAT gateway to the destination.

A value greater than zero indicates that there is traffic going to the internet from clients that are behind the NAT gateway.

Unit: Count


Metric

PacketsOutToSource

PacketsInFromSource

PacketsInFromDestination

BytesOutToDestination

BytesOutToSource

BytesInFromSource



Description

The number of packets sent through the NAT gateway to the clients in your VPC.

A value greater than zero indicates that there is traffic coming from the internet to clients that are behind the NAT gateway.

Unit: Count

The number of packets received by the NAT gateway from clients in your VPC.

If the value for


is less than the value for

PacketsInFromSource

, there may be packet loss during NAT gateway processing.

Unit: Count

The number of packets received by the NAT gateway from the destination.

If the value for



PacketsInFromSource

, there may be packet loss during NAT gateway processing.

Unit: Count

The number of bytes sent out through the NAT gateway to the destination.

A value greater than zero indicates that there is traffic going to the internet from clients that are behind the NAT gateway.

Unit: Bytes

The number of bytes sent through the NAT gateway to the clients in your VPC.

A value greater than zero indicates that there is traffic coming from the internet to clients that are behind the NAT gateway.

Units: Bytes

The number of bytes received by the NAT gateway from clients in your VPC.

If the value for

BytesInFromDestination


BytesInFromSource

, there may be data loss during NAT gateway processing.

Units: Bytes


Metric


ErrorPortAllocation

ActiveConnectionCount

ConnectionAttemptCount

ConnectionEstablishedCount

IdleTimeoutCount



Description

The number of bytes received by the NAT gateway from the destination.

If the value for



BytesInFromSource

, there may be data loss during NAT gateway processing.

Units: Bytes

The number of times the NAT gateway could not allocate a source port.

A value greater than zero indicates that too many concurrent connections are open through the NAT gateway.

Units: Count

The total number of concurrent active TCP connections through the NAT gateway.

A value of zero indicates that there are no active connections through the NAT gateway.

Units: Count

The number of connection attempts made through the NAT gateway.

If the value for




, this indicates that clients behind the NAT gateway attempted to establish new connections for which there was no response.

Unit: Count

The number of connections established through the NAT gateway.

If the value for




, this indicates that clients behind the NAT gateway attempted to establish new connections for which there was no response.

Unit: Count

The number of connections that have been idle for more than 350 seconds.

A value greater than zero indicates that there are connections that have been moved to an idle state. If the value for

IdleTimeoutCount increases, it may indicate that clients behind the

NAT gateway are re-using stale connections.

Unit: Count


Metric: PacketsDropCount
Description: The number of packets dropped by the NAT gateway. A value greater than zero may indicate an ongoing transient issue with the NAT gateway. If this value is high, see the AWS Service Health Dashboard.
Units: Count

Dimensions for NAT Gateway Metrics

You can filter the NAT gateway data using the following dimensions.

Dimension: NatGatewayId
Description: This dimension filters data by the NAT gateway ID.

Amazon VPC VPN Metrics and Dimensions

Amazon VPN sends data to CloudWatch as it becomes available. For more information, see Monitoring with CloudWatch in the Amazon VPC User Guide.

VPN Metrics

The following metrics are available from Amazon VPC VPN.

Metric: TunnelState
Description: The state of the tunnel. 0 indicates DOWN and 1 indicates UP.
Units: Boolean

Metric: TunnelDataIn
Description: The bytes received through the VPN tunnel. Each metric data point represents the number of bytes received after the previous data point. Use the Sum statistic to show the total number of bytes received during the period. This metric counts the data after decryption.
Units: Bytes

Metric: TunnelDataOut
Description: The bytes sent through the VPN tunnel. Each metric data point represents the number of bytes sent after the previous data point. Use the Sum statistic to show the total number of bytes sent during the period. This metric counts the data before encryption.
Units: Bytes


Dimensions for VPN Metrics

You can filter the Amazon VPC VPN data using the following dimensions.

Dimension: VpnId
Description: This dimension filters the data by the VPN connection.

Dimension: TunnelIpAddress
Description: This dimension filters the data by the IP address of the tunnel for the virtual private gateway.

AWS WAF Metrics and Dimensions

AWS WAF sends data to CloudWatch every minute. For more information, see Testing Web ACLs in the AWS WAF Developer Guide.

AWS WAF Metrics

The WAF namespace includes the following metrics.

Metric: AllowedRequests
Description: The number of allowed web requests.
Reporting criteria: There is a nonzero value
Units: Count
Dimensions: Rule, WebACL

Metric: BlockedRequests
Description: The number of blocked web requests.
Reporting criteria: There is a nonzero value
Units: Count
Dimensions: Rule, WebACL

Metric: CountedRequests
Description: The number of counted web requests. A counted web request is one that matches all of the conditions in a particular rule. Counted web requests are typically used for testing.
Reporting criteria: There is a nonzero value
Units: Count
Dimensions: Rule, WebACL


AWS WAF Dimensions

Dimension: Rule
Description: The name of the rule, or one of the following:

• ALL, which represents the set of all rules.

• Default_Action, which represents the action assigned to any request that does not match any rule with either an allow or block action.

Dimension: WebACL
Description: The name of the web ACL.

Amazon WorkSpaces Metrics and Dimensions

Amazon WorkSpaces sends data points to CloudWatch for several metrics every five minutes (five-minute metrics). Detailed monitoring, or one-minute metrics, is currently unavailable for Amazon WorkSpaces. For more information, see Monitoring Amazon WorkSpaces in the Amazon WorkSpaces Administration Guide.

Amazon WorkSpaces Metrics

The AWS/WorkSpaces namespace includes the following metrics.

Metric: Available (1)
Description: The number of WorkSpaces that returned a healthy status.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

Metric: Unhealthy (1)
Description: The number of WorkSpaces that returned an unhealthy status.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

Metric: ConnectionAttempt (2)
Description: The number of connection attempts.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

Metric: ConnectionSuccess (2)
Description: The number of successful connections.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

Metric: ConnectionFailure (2)
Description: The number of failed connections.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

Metric: SessionLaunchTime (2)
Description: The amount of time it takes to initiate a WorkSpaces session.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Second (time)

Metric: InSessionLatency (2)
Description: The round trip time between the WorkSpaces client and the WorkSpace.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Millisecond (time)

Metric: SessionDisconnect (2)
Description: The number of connections that were closed, including user-initiated and failed connections.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

Metric: UserConnected (3)
Description: The number of WorkSpaces that have a user connected.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

Metric: Stopped
Description: The number of WorkSpaces that are stopped.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

Metric: Maintenance (4)
Description: The number of WorkSpaces that are under maintenance.
Dimensions: DirectoryId, WorkspaceId
Statistics available: Average, Sum, Maximum, Minimum, Data Samples
Units: Count

(1) Amazon WorkSpaces periodically sends status requests to a WorkSpace. A WorkSpace is marked Available when it responds to these requests, and Unhealthy when it fails to respond to these requests. These metrics are available at a per-WorkSpace granularity, and also aggregated for all WorkSpaces in an organization.

(2) Amazon WorkSpaces records metrics on connections made to each WorkSpace. These metrics are emitted after a user has successfully authenticated via the WorkSpaces client and the client then initiates a session. The metrics are available at a per-WorkSpace granularity, and also aggregated for all WorkSpaces in a directory.

(3) Amazon WorkSpaces periodically sends connection status requests to a WorkSpace. Users are reported as connected when they are actively using their sessions. This metric is available at a per-WorkSpace granularity, and is also aggregated for all WorkSpaces in an organization.

(4) This metric applies to WorkSpaces that are configured with an AutoStop running mode. If you have maintenance enabled for your WorkSpaces, this metric captures the number of WorkSpaces that are currently under maintenance. This metric is available at a per-WorkSpace granularity, which describes when a WorkSpace went into maintenance and when it was removed.

Dimensions for Amazon WorkSpaces Metrics

Amazon WorkSpaces metrics are available for the following dimensions.

Dimension: DirectoryId
Description: Limits the data you receive to the WorkSpaces in the specified directory. The DirectoryId value is in the form of d-XXXXXXXXXX.

Dimension: WorkspaceId
Description: Limits the data you receive to the specified WorkSpace. The WorkspaceId value is in the form ws-XXXXXXXXXX.



Creating Amazon CloudWatch Alarms

You can create a CloudWatch alarm that watches a single metric. The alarm performs one or more actions based on the value of the metric relative to a threshold over a number of time periods. The action can be an Amazon EC2 action, an Auto Scaling action, or a notification sent to an Amazon SNS topic.

Alarms invoke actions for sustained state changes only. CloudWatch alarms do not invoke actions simply because they are in a particular state; the state must have changed and been maintained for a specified number of periods.

After an alarm invokes an action due to a change in state, its subsequent behavior depends on the type of action that you have associated with the alarm. For Amazon EC2 and Auto Scaling actions, the alarm continues to invoke the action for every period that the alarm remains in the new state. For Amazon SNS notifications, no additional actions are invoked.

Note

CloudWatch doesn't test or validate the actions that you specify, nor does it detect any Auto Scaling or Amazon SNS errors resulting from an attempt to invoke nonexistent actions. Make sure that your actions exist.

You can also add alarms to dashboards. When an alarm is on a dashboard, it turns red when it is in the ALARM state, making it easier for you to monitor its status proactively.

An alarm has three possible states:

• OK: The metric is within the defined threshold

• ALARM: The metric is outside of the defined threshold

• INSUFFICIENT_DATA: The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state

In the following figure, the alarm threshold is set to 3 units and the alarm is evaluated over 3 periods.

That is, the alarm goes to ALARM state if the oldest of the 3 periods being evaluated is breaching, and the 2 subsequent periods are either breaching or missing. In the figure, this happens with the third through fifth time periods, and the alarm's state is set to ALARM. At period six, the value dips below the threshold, and the state reverts to OK. Later, during the ninth time period, the threshold is breached again, but for only one period. Consequently, the alarm state remains OK.
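The evaluation rule described above can be sketched in Python as follows. This is a simplified model, not the CloudWatch implementation: it assumes that "breaching" means a value greater than the threshold, it represents a missing data point as None, and it reports every non-alarming window as OK rather than modeling INSUFFICIENT_DATA. The function name and the sample values are invented for the illustration.

```python
from typing import Optional, Sequence


def evaluate_alarm(window: Sequence[Optional[float]], threshold: float) -> str:
    """Return 'ALARM' or 'OK' for one evaluation window (oldest value first).

    Assumption: a data point breaches when it is greater than the threshold,
    and None stands for a missing data point.
    """
    oldest, *rest = window
    # The oldest period in the window must be breaching.
    if oldest is None or oldest <= threshold:
        return "OK"
    # Every later period must be either breaching or missing.
    for value in rest:
        if value is not None and value <= threshold:
            return "OK"
    return "ALARM"


# Hypothetical metric values for ten periods, threshold 3, 3 evaluation periods.
values = [1, 2, 4, 5, 4, 2, 2, 2, 5, 2]
states = [evaluate_alarm(values[i - 2:i + 1], 3) for i in range(2, len(values))]
print(states)
```

With these sample values, only the window covering periods three through five is in ALARM; the single breach in period nine is not sustained, so the state stays OK.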

Configuring How CloudWatch Alarms Treat Missing Data

Similar to how each alarm is always in one of three states, each specific data point reported to CloudWatch falls under one of three categories:

• good (within the threshold)

• bad (violating the threshold)

• missing

You can specify how alarms handle missing data points. Choose whether to treat missing data points as:

• Missing (the alarm looks back farther in time to find additional data points)

• Good ("Not Breaching," treated as a data point that is within the threshold)

• Bad ("Breaching," treated as a data point that is breaching the threshold)

• Ignored (the current alarm state is maintained)

The best choice depends on the type of metric. For a metric that continually reports data, such as CPUUtilization of an instance, you might want to treat missing data points as bad, because they may indicate something is wrong. But for a metric that generates data points only when an error occurs, such as ThrottledRequests in Amazon DynamoDB, you would want to treat missing data as good. The default behavior is missing.

Choosing the best option for your alarm prevents unnecessary and misleading alarm condition changes, and also more accurately indicates the health of your system.
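As an illustration of the four options, the sketch below classifies a single data point given a missing-data treatment. The treatment strings used here (missing, notBreaching, breaching, ignore) are the values accepted by the TreatMissingData parameter of PutMetricAlarm; the function itself, its name, and the assumption that "bad" means a value greater than the threshold are hypothetical.

```python
from typing import Optional

# How each treatment classifies a missing data point in this sketch.
MISSING_POINT_CATEGORY = {
    "missing": "missing",       # alarm looks back farther in time
    "notBreaching": "good",     # treated as within the threshold
    "breaching": "bad",         # treated as breaching the threshold
    "ignore": "ignored",        # current alarm state is maintained
}


def classify(value: Optional[float], threshold: float,
             treat_missing_data: str = "missing") -> str:
    """Classify one data point; None stands for a missing data point."""
    if value is None:
        return MISSING_POINT_CATEGORY[treat_missing_data]
    # Assumption: a reported value is bad when it exceeds the threshold.
    return "bad" if value > threshold else "good"


# A continually reported metric: a gap is suspicious, so treat it as bad.
print(classify(None, 80.0, "breaching"))
# An error-only metric: a gap means no errors, so treat it as good.
print(classify(None, 0.0, "notBreaching"))
```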

Note

If you treat missing data as missing and some data points in the current window are missing, CloudWatch looks back extra periods to find other existing data points to assess whether the alarm should change state. When this happens, if the furthest back period that is now being considered is not breaching, the alarm state does not go to ALARM.

High-Resolution Alarms



There is a higher charge for high-resolution alarms. For more information about high-resolution metrics, see Publish Custom Metrics (p. 42).

Percentile-Based CloudWatch Alarms and Low Data Samples

When you set a percentile as the statistic for an alarm, you can specify what to do when there is not enough data for a good statistical assessment. You can choose to have the alarm evaluate the statistic anyway and possibly change the alarm state. Or, you can have the alarm ignore the metric while the sample size is low, and wait to evaluate it until there is enough data to be statistically significant.

For percentiles between 0.5 and 1.00, this setting is used when there are fewer than 10/(1-percentile) data points during the evaluation period. For example, this setting would be used if there were fewer than 1000 samples for an alarm on a p99 percentile. For percentiles between 0 and 0.5, the setting is used when there are fewer than 10/percentile data points.
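The two formulas above can be worked through with a short Python sketch. The function name is hypothetical, and the percentile is passed as a decimal string so that exact fractions can be used instead of floating-point arithmetic.

```python
from fractions import Fraction


def low_sample_threshold(percentile: str) -> Fraction:
    """Number of data points below which the low-sample setting applies.

    `percentile` is a decimal string strictly between 0 and 1, such as "0.99".
    """
    p = Fraction(percentile)
    if not 0 < p < 1:
        raise ValueError("percentile must be strictly between 0 and 1")
    if p >= Fraction(1, 2):
        return 10 / (1 - p)   # upper percentiles: 10/(1-percentile)
    return 10 / p             # lower percentiles: 10/percentile


print(low_sample_threshold("0.99"))  # 1000
print(low_sample_threshold("0.1"))   # 100
```

At exactly 0.5 both formulas give the same result, 20, so the choice of branch at the boundary does not matter.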

Common Features of CloudWatch Alarms

The following features apply to all CloudWatch alarms:

• You can create up to 5000 alarms per region per AWS account. To create or update an alarm, you use the PutMetricAlarm API action (mon-put-metric-alarm command).

• You can list any or all of the currently configured alarms, and list any alarms in a particular state using DescribeAlarms (mon-describe-alarms). You can further filter the list by time range.

• You can disable and enable alarms by using DisableAlarmActions and EnableAlarmActions (mon-disable-alarm-actions and mon-enable-alarm-actions).

• You can test an alarm by setting it to any state using SetAlarmState (mon-set-alarm-state). This temporary state change lasts only until the next alarm comparison occurs.

• You can create an alarm using PutMetricAlarm (mon-put-metric-alarm) before you've created a custom metric. For the alarm to be valid, you must include all of the dimensions for the custom metric in addition to the metric namespace and metric name in the alarm definition.

• You can view an alarm's history using DescribeAlarmHistory (mon-describe-alarm-history). CloudWatch preserves alarm history for two weeks. Each state transition is marked with a unique time stamp. In rare cases, your history might show more than one notification for a state change. The time stamp enables you to confirm unique state changes.

• The number of evaluation periods for an alarm multiplied by the length of each evaluation period cannot exceed one day.
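The last limit in the list is simple arithmetic. As a minimal sketch (the function name is hypothetical), you could check a period and evaluation-period combination before creating the alarm:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def fits_evaluation_window(period_seconds, evaluation_periods):
    """True when period x evaluation periods does not exceed one day."""
    return period_seconds * evaluation_periods <= SECONDS_PER_DAY


print(fits_evaluation_window(300, 2))     # 5-minute period, 2 periods
print(fits_evaluation_window(3600, 24))   # 1-hour period, 24 periods
print(fits_evaluation_window(3600, 25))   # 25 hours in total
```

A 1-hour period with 24 evaluation periods totals exactly one day, which this sketch treats as allowed because the limit says the product cannot exceed one day.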

Note

Some AWS resources do not send metric data to CloudWatch under certain conditions. For example, Amazon EBS may not send metric data for an available volume that is not attached to an Amazon EC2 instance, because there is no metric activity to be monitored for that volume. If you have an alarm set for such a metric, you may notice its state change to Insufficient Data. This may simply indicate that your resource is inactive, and may not necessarily mean that there is a problem.

Set Up Amazon SNS Notifications

Amazon CloudWatch uses Amazon SNS to send email. First, create and subscribe to an SNS topic. When you create a CloudWatch alarm, you can add this SNS topic to send an email notification when the alarm changes state. For more information, see the Amazon Simple Notification Service Getting Started Guide .

Note

Alternatively, if you plan to create your CloudWatch alarm using the AWS Management Console, you can skip this procedure because you can create the topic through the Create Alarm Wizard.

Set Up an SNS Topic Using the AWS Management Console

First, create a topic, then subscribe to it. You can optionally publish a test message to the topic.

To create an SNS topic

1. Open the Amazon SNS console at https://console.aws.amazon.com/sns/v2/home.

2. On the Amazon SNS dashboard, under Common actions, choose Create Topic.

3. In the Create new topic dialog box, for Topic name, type a name for the topic (for example, my-topic).

4. Choose Create topic.

5. Copy the Topic ARN for the next task (for example, arn:aws:sns:us-east-1:111122223333:my-topic).

To subscribe to an SNS topic

1. Open the Amazon SNS console at https://console.aws.amazon.com/sns/v2/home.

2. In the navigation pane, choose Subscriptions, Create subscription.

3. In the Create subscription dialog box, for Topic ARN, paste the topic ARN that you copied in the previous task.

4. For Protocol, choose Email.

5. For Endpoint, type an email address that you can use to receive the notification, and then choose Create subscription.

6. From your email application, open the message from AWS Notifications and confirm your subscription.

Your web browser displays a confirmation response from Amazon Simple Notification Service.

To publish a test message to an SNS topic

1. Open the Amazon SNS console at https://console.aws.amazon.com/sns/v2/home.

2. In the navigation pane, choose Topics.

3. On the Topics page, select a topic, and choose Publish to topic.

4. In the Publish a message page, for Subject, type a subject line for your message, and for Message, type a brief message.

5. Choose Publish Message.

6. Check your email to confirm that you received the message.

Set Up an SNS Topic Using the AWS CLI

First you create an SNS topic, and then publish a message directly to the topic to test that you have properly configured it.

To set up an SNS topic

1. Create the topic using the create-topic command as follows.

aws sns create-topic --name my-topic

Amazon SNS returns a topic ARN with the following format:

{
    "TopicArn": "arn:aws:sns:us-east-1:111122223333:my-topic"
}

2. Subscribe your email address to the topic using the subscribe command. If the subscription request succeeds, you receive a confirmation email message.

aws sns subscribe --topic-arn arn:aws:sns:us-east-1:111122223333:my-topic --protocol email --notification-endpoint my-email-address

Amazon SNS returns the following:

{
    "SubscriptionArn": "pending confirmation"
}

3. From your email application, open the message from AWS Notifications and confirm your subscription.

Your web browser displays a confirmation response from Amazon Simple Notification Service.

4. Check the subscription using the list-subscriptions-by-topic command.

aws sns list-subscriptions-by-topic --topic-arn arn:aws:sns:us-east-1:111122223333:my-topic

{
    "Subscriptions": [
        {
            "Owner": "111122223333",
            "Endpoint": "[email protected]",
            "Protocol": "email",
            "TopicArn": "arn:aws:sns:us-east-1:111122223333:my-topic",
            "SubscriptionArn": "arn:aws:sns:us-east-1:111122223333:my-topic:64886986-bf10-48fb-a2f1-dab033aa67a3"
        }
    ]
}

5. (Optional) Publish a test message to the topic using the publish command.

aws sns publish --message "Verification" --topic-arn arn:aws:sns:us-east-1:111122223333:my-topic

{
    "MessageId": "42f189a0-3094-5cf6-8fd7-c2dde61a4d7d"
}

6. Check your email to confirm that you received the message.
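If you script this procedure, you might want to confirm programmatically that a subscription has been confirmed before you rely on it. The following is a minimal sketch that reads output shaped like the list-subscriptions-by-topic response shown above. The function name, the sample email addresses, and the placeholder value used for the unconfirmed subscription are assumptions made for illustration.

```python
import json


def confirmed_endpoints(cli_output):
    """Return the endpoints whose subscription has been confirmed."""
    subscriptions = json.loads(cli_output)["Subscriptions"]
    return [
        sub["Endpoint"]
        for sub in subscriptions
        # A confirmed subscription carries a full ARN. An unconfirmed one
        # carries a placeholder string instead (assumed here to be any
        # value that does not start with "arn:").
        if sub["SubscriptionArn"].startswith("arn:")
    ]


# Hypothetical sample output with one confirmed and one pending subscription.
sample_output = json.dumps({
    "Subscriptions": [
        {"Endpoint": "first@example.com", "Protocol": "email",
         "SubscriptionArn": "arn:aws:sns:us-east-1:111122223333:my-topic:"
                            "64886986-bf10-48fb-a2f1-dab033aa67a3"},
        {"Endpoint": "second@example.com", "Protocol": "email",
         "SubscriptionArn": "PendingConfirmation"},
    ]
})
print(confirmed_endpoints(sample_output))
```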

Create or Edit a CloudWatch Alarm

You can choose specific metrics to trigger the alarm and specify thresholds for those metrics. You can then set your alarm to change state when a metric exceeds a threshold that you have defined.

To create an alarm

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. In the navigation pane, choose Alarms, Create Alarm.

3. For the Select Metric step, do the following:

a. Choose a metric category (for example, EC2 Metrics).

b. Select an instance and metric (for example, CPUUtilization).

c. For the statistic, choose one of the statistics (for example, Average) or predefined percentiles, or specify a custom percentile (for example, p95.45).

d. Choose a period (for example, 1 Hour).

e. Choose Next.

4. For the Define Alarm step, do the following:

a. Under Alarm Threshold, type a unique name for the alarm and a description of the alarm. For Whenever, specify a threshold (for example, 80 percent of CPU utilization) and the number of periods.

b. Under Additional settings, for Treat missing data as, select how to have the alarm treat missing data points. For more information, see Configuring How CloudWatch Alarms Treats Missing Data (p. 175).

If the alarm uses a percentile as the monitored statistic, choose whether to evaluate or ignore cases with low sample rates. If you choose ignore, the current alarm state is maintained when the sample size is too low. For more information, see Percentile-Based CloudWatch Alarms and Low Data Samples (p. 176).

c. Under Actions, select the type of action to have the alarm perform when the alarm is triggered.

d. Choose Create Alarm.

You can also add alarms to a dashboard. For more information, see Add or Remove an Alarm from a CloudWatch Dashboard (p. 23).

To edit an alarm

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. In the navigation pane, choose Alarms.

3. Select the alarm, and then choose Actions, Modify.

4. In the Modify Alarm dialog box, update the alarm as necessary and choose Save Changes.

To update an email notification list that was created using the Amazon SNS console

1. Open the Amazon SNS console at https://console.aws.amazon.com/sns/v2/home.

2. In the navigation pane, choose Topics, and then select the ARN for your notification list (topic).

3. Do one of the following:

• To add an email address, choose Create subscription. For Protocol, choose Email. For Endpoint, type the email address of the new recipient. Choose Create subscription.

• To remove an email address, choose the Subscription ID. Choose Other subscription actions, Delete subscriptions.

4. Choose Publish to topic.

Create a CPU Usage Alarm that Sends Email

You can create a CloudWatch alarm that sends an email message using Amazon SNS when the alarm changes state from OK to ALARM.

The alarm changes to the ALARM state when the average CPU use of an EC2 instance exceeds a specified threshold for a specified number of consecutive periods.
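To make the "consecutive periods" rule concrete, the following is a simplified model of the evaluation, not the actual CloudWatch implementation. It assumes one data point per period and ignores the missing-data settings described earlier. The function name is hypothetical.

```python
def alarm_state(datapoints, threshold, evaluation_periods):
    """Simplified model: ALARM when the newest N data points all exceed the threshold."""
    if len(datapoints) < evaluation_periods:
        return "INSUFFICIENT_DATA"
    recent = datapoints[-evaluation_periods:]
    return "ALARM" if all(value > threshold for value in recent) else "OK"


# Average CPU per period, oldest first; threshold 70 percent, 2 periods.
print(alarm_state([50, 75, 82], 70, 2))  # last two periods both exceed 70
print(alarm_state([75, 82, 60], 70, 2))  # newest period is below 70
```

In this model a single high period is not enough to change the state. Every one of the most recent periods must breach the threshold.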

Set Up a CPU Usage Alarm Using the AWS Management Console

To create an alarm that sends email based on CPU usage

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. In the navigation pane, choose Alarms, Create Alarm.

3. Under EC2 Metrics, select a metric category (for example, Per-Instance Metrics).

4. Select a metric as follows:

a. Select a row with the instance and the CPUUtilization metric.

b. For the statistic, choose Average, choose one of the predefined percentiles, or specify a custom percentile (for example, p95.45).

c. Choose a period (for example, 5 minutes).

d. Choose Next.


5. Define the alarm as follows:

a. Under Alarm Threshold, type a unique name for the alarm (for example, myHighCpuAlarm) and a description of the alarm (for example, CPU usage exceeds 70 percent).

b. Under Whenever, for is, choose > and type 70. For for, type 2.

c. Under Additional settings, for Treat missing data as, choose bad (breaching threshold), as missing data points may indicate the instance is down.

d. Under Actions, for Whenever this alarm, select State is ALARM. For Send notification to, select an existing SNS topic or create a new one.

e. To create a new SNS topic, choose New list. For Send notification to, type a name for the SNS topic (for example, myHighCpuAlarm), and for Email list, type a comma-separated list of email addresses to be notified when the alarm changes to the ALARM state. Each email address is sent a topic subscription confirmation email. You must confirm the subscription before notifications can be sent.

f. Choose Create Alarm.

Set Up a CPU Usage Alarm Using the AWS CLI

To create an alarm that sends email based on CPU usage

1. Set up an SNS topic. For more information, see Set Up Amazon SNS Notifications (p. 177).

2. Create an alarm using the put-metric-alarm command as follows.

aws cloudwatch put-metric-alarm --alarm-name cpu-mon --alarm-description "Alarm when CPU exceeds 70%" --metric-name CPUUtilization --namespace AWS/EC2 --statistic Average --period 300 --threshold 70 --comparison-operator GreaterThanThreshold --dimensions Name=InstanceId,Value=i-12345678 --evaluation-periods 2 --alarm-actions arn:aws:sns:us-east-1:111122223333:my-topic --unit Percent

3. Test the alarm by forcing an alarm state change using the set-alarm-state command.

a. Change the alarm state from INSUFFICIENT_DATA to OK:

aws cloudwatch set-alarm-state --alarm-name cpu-mon --state-reason "initializing" --state-value OK

b. Change the alarm state from OK to ALARM:

aws cloudwatch set-alarm-state --alarm-name cpu-mon --state-reason "initializing" --state-value ALARM

c. Check that you have received an email notification about the alarm.
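If you create this kind of alarm for many instances, it can help to assemble the command from parameters rather than typing it each time. The following is a minimal sketch that only builds the command line and does not run it. The function name and its defaults are hypothetical, and the flags are the ones used in step 2 of this procedure.

```python
import shlex


def put_metric_alarm_command(alarm_name, description, instance_id, topic_arn,
                             threshold=70, period=300, evaluation_periods=2):
    """Assemble (but do not run) a put-metric-alarm command for CPU usage."""
    args = [
        "aws", "cloudwatch", "put-metric-alarm",
        "--alarm-name", alarm_name,
        "--alarm-description", description,
        "--metric-name", "CPUUtilization",
        "--namespace", "AWS/EC2",
        "--statistic", "Average",
        "--period", str(period),
        "--threshold", str(threshold),
        "--comparison-operator", "GreaterThanThreshold",
        "--dimensions", f"Name=InstanceId,Value={instance_id}",
        "--evaluation-periods", str(evaluation_periods),
        "--alarm-actions", topic_arn,
        "--unit", "Percent",
    ]
    # shlex.join quotes any argument that contains spaces.
    return shlex.join(args)


command = put_metric_alarm_command(
    "cpu-mon", "Alarm when CPU exceeds 70%", "i-12345678",
    "arn:aws:sns:us-east-1:111122223333:my-topic")
print(command)
```

Building the arguments as a list keeps the description, which contains spaces, as a single argument when the command is later run from a shell.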

Create a Load Balancer Latency Alarm that Sends Email

You can set up an Amazon SNS notification and configure an alarm that monitors latency exceeding 100 ms for your Classic Load Balancer.
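The console procedure in this section enters the threshold as 0.1 because the comparison is made in seconds, which is also the unit that the CLI example passes. As a minimal sketch of that conversion (the helper name is hypothetical):

```python
def latency_threshold_seconds(milliseconds):
    """Convert a latency target in milliseconds to a threshold in seconds."""
    return milliseconds / 1000


print(latency_threshold_seconds(100))  # threshold for a 100 ms target
```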


Set Up a Latency Alarm Using the AWS Management Console

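To create a load balancer latency alarm that sends email

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. In the navigation pane, choose Alarms, Create Alarm.

3. Under CloudWatch Metrics by Category, choose the ELB Metrics category.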

4. Select the row with the Classic Load Balancer and the Latency metric.

5. For the statistic, choose Average, choose one of the predefined percentiles, or specify a custom percentile (for example, p95.45).

6. For the period, choose 1 Minute.

7. Choose Next.

8. Under Alarm Threshold, type a unique name for the alarm (for example, myHighCpuAlarm) and a description of the alarm (for example, Alarm when Latency exceeds 100 ms).

9. Under Whenever, for is, select > and type 0.1. For for, type 3.

10. Under Additional settings, for Treat missing data as, choose ignore (maintain alarm state) so that missing data points do not trigger alarm state changes.

For Percentiles with low samples choose ignore (maintain the alarm state) so that the alarm evaluates only situations with adequate numbers of data samples.

11. Under Actions, for Whenever this alarm, choose State is ALARM. For Send notification to, choose an existing SNS topic or create a new one.

To create an SNS topic, choose New list. For Send notification to, type a name for the SNS topic (for example, myHighCpuAlarm), and for Email list, type a comma-separated list of email addresses to be notified when the alarm changes to the ALARM state. Each email address is sent a topic subscription confirmation email. You must confirm the subscription before notifications can be sent.


Set Up a Latency Alarm Using the AWS CLI

To create a load balancer latency alarm that sends email

1. Set up an SNS topic. For more information, see Set Up Amazon SNS Notifications (p. 177).

2. Create the alarm using the put-metric-alarm command as follows:

aws cloudwatch put-metric-alarm --alarm-name lb-mon --alarm-description "Alarm when Latency exceeds 100 ms" --metric-name Latency --namespace AWS/ELB --statistic Average --period 60 --threshold 0.1 --comparison-operator GreaterThanThreshold --dimensions Name=LoadBalancerName,Value=my-server --evaluation-periods 3 --alarm-actions arn:aws:sns:us-east-1:111122223333:my-topic --unit Seconds



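3. Test the alarm by forcing an alarm state change using the set-alarm-state command.

a. Change the alarm state from INSUFFICIENT_DATA to OK:

aws cloudwatch set-alarm-state --alarm-name lb-mon --state-reason "initializing" --state-value OK

b. Change the alarm state from OK to ALARM:

aws cloudwatch set-alarm-state --alarm-name lb-mon --state-reason "initializing" --state-value ALARM

c. Check that you have received an email notification about the alarm.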

Create a Storage Throughput Alarm that Sends Email

You can set up an SNS notification and configure an alarm that sends email when Amazon EBS exceeds 100 MB throughput.

Set Up a Storage Throughput Alarm Using the AWS Management Console

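To create a storage throughput alarm that sends email

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. In the navigation pane, choose Alarms, Create Alarm.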

3. Under EBS Metrics, choose a metric category.

4. Select the row with the volume and the VolumeWriteBytes metric.

5. For the statistic, choose Average. For the period, choose 5 Minutes. Choose Next.

6. Under Alarm Threshold, type a unique name for the alarm (for example, myHighWriteAlarm) and a description of the alarm (for example, VolumeWriteBytes exceeds 100,000 KiB/s).

7. Under Whenever, for is, choose > and type 100000. For for, type 15 consecutive periods.

A graphical representation of the threshold is shown under Alarm Preview.


9. Under Actions, for Whenever this alarm, choose State is ALARM. For Send notification to, choose an existing SNS topic or create one.

To create an SNS topic, choose New list. For Send notification to, type a name for the SNS topic (for example, myHighCpuAlarm), and for Email list, type a comma-separated list of email addresses to be notified when the alarm changes to the ALARM state. Each email address is sent a topic subscription confirmation email. You must confirm the subscription before notifications can be sent to an email address.


Set Up a Storage Throughput Alarm Using the AWS CLI

To create a storage throughput alarm that sends email

1. Create an SNS topic. For more information, see Set Up Amazon SNS Notifications (p. 177) .

2. Create the alarm.

aws cloudwatch put-metric-alarm --alarm-name ebs-mon --alarm-description "Alarm when EBS volume exceeds 100MB throughput" --metric-name VolumeReadBytes --namespace AWS/EBS --statistic Average --period 300 --threshold 100000000 --comparison-operator GreaterThanThreshold --dimensions Name=VolumeId,Value=my-volume-id --evaluation-periods 3 --alarm-actions arn:aws:sns:us-east-1:111122223333:my-alarm-topic --insufficient-data-actions arn:aws:sns:us-east-1:111122223333:my-insufficient-data-topic



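3. Test the alarm by forcing an alarm state change using the set-alarm-state command.

a. Change the alarm state from INSUFFICIENT_DATA to OK:

aws cloudwatch set-alarm-state --alarm-name ebs-mon --state-reason "initializing" --state-value OK

b. Change the alarm state from OK to ALARM:

aws cloudwatch set-alarm-state --alarm-name ebs-mon --state-reason "initializing" --state-value ALARM

c. Change the alarm state from ALARM to INSUFFICIENT_DATA:

aws cloudwatch set-alarm-state --alarm-name ebs-mon --state-reason "initializing" --state-value INSUFFICIENT_DATA

d. Check that you have received an email notification about the alarm.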

Create Alarms to Stop, Terminate, Reboot, or Recover an Instance

Using Amazon CloudWatch alarm actions, you can create alarms that automatically stop, terminate, reboot, or recover your EC2 instances. You can use the stop or terminate actions to help you save money when you no longer need an instance to be running. You can use the reboot and recover actions to automatically reboot those instances or recover them onto new hardware if a system impairment occurs.

Every alarm action you create uses alarm action ARNs. One set of ARNs is more secure because it requires you to have the EC2ActionsAccess IAM role in your account. This IAM role enables you to perform stop, terminate, or reboot actions. Previously, you could not execute an action if you were using an IAM role. Existing alarms that use the previous alarm action ARNs do not require this IAM role; however, we recommend that you change the ARN and add the role when you edit an existing alarm that uses these ARNs.

Note

The IAM role EC2ActionsAccess is not used in the recover action.

The EC2ActionsAccess IAM role enables AWS to perform alarm actions on your behalf. When you create an alarm action for the first time using the Amazon EC2 or Amazon CloudWatch consoles, AWS automatically creates this role for you.
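This guide does not list the two sets of alarm action ARNs. As an illustration only, the following sketch builds them in the formats documented for the PutMetricAlarm API action. Treat the exact formats as an assumption to verify against that reference. The function name is hypothetical.

```python
def ec2_action_arn(action, region, account_id=None):
    """Build an EC2 alarm action ARN (formats assumed from the API reference).

    Without account_id, the previous-style ARN is returned. With account_id,
    the ARN that relies on the EC2ActionsAccess IAM role is returned.
    """
    action = action.lower()
    if account_id is None:
        if action not in {"stop", "terminate", "recover"}:
            raise ValueError(f"no previous-style ARN assumed for {action!r}")
        return f"arn:aws:automate:{region}:ec2:{action}"
    # The recover action does not use the EC2ActionsAccess role.
    if action not in {"stop", "terminate", "reboot"}:
        raise ValueError(f"no role-based ARN assumed for {action!r}")
    return (f"arn:aws:swf:{region}:{account_id}:action/actions/"
            f"AWS_EC2.InstanceId.{action.capitalize()}/1.0")


print(ec2_action_arn("recover", "us-east-1"))
print(ec2_action_arn("stop", "us-east-1", "111122223333"))
```

An ARN built this way would be passed to the --alarm-actions option of put-metric-alarm, in the same position as the SNS topic ARNs used earlier in this section.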

There are a number of scenarios in which you might want to automatically stop or terminate your instance. For example, you might have instances dedicated to batch payroll processing jobs or scientific computing tasks that run for a period of time and then complete their work. Rather than letting those instances sit idle (and accrue charges), you can stop or terminate them, which can help you to save money. The main difference between using the stop and the terminate alarm actions is that you can easily restart a stopped instance if you need to run it again later, and you can keep the same instance ID and root volume. However, you cannot restart a terminated instance. Instead, you must launch a new instance.

You can add the stop, terminate, reboot, or recover actions to any alarm that is set on an Amazon EC2 per-instance metric, including basic and detailed monitoring metrics provided by Amazon CloudWatch (in the AWS/EC2 namespace), as well as any custom metrics that include the "InstanceId=" dimension, as long as the InstanceId value refers to a valid running Amazon EC2 instance.

Console Support

You can create alarms using the CloudWatch console or the Amazon EC2 console. The procedures in this documentation use the CloudWatch console. For procedures that use the Amazon EC2 console, see Create Alarms That Stop, Terminate, Reboot, or Recover an Instance in the Amazon EC2 User Guide for Linux Instances.

Permissions

If you are using an AWS Identity and Access Management (IAM) account to create or modify an alarm, you must have the following permissions:

• ec2:DescribeInstanceStatus and ec2:DescribeInstances: For all alarms on Amazon EC2 instance status metrics

• ec2:StopInstances: For alarms with stop actions

• ec2:TerminateInstances: For alarms with terminate actions

• ec2:DescribeInstanceRecoveryAttribute and ec2:RecoverInstances: For alarms with recover actions
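The permissions listed above can be collected into an IAM policy document. The following is a minimal sketch that generates one for a chosen set of alarm actions. The function name, the grouping keys, and the use of "Resource": "*" are assumptions made for illustration, so narrow the policy to suit your account.

```python
import json

# Permissions from the list above, grouped by the alarm action that needs them.
PERMISSIONS_BY_ACTION = {
    "status": ["ec2:DescribeInstanceStatus", "ec2:DescribeInstances"],
    "stop": ["ec2:StopInstances"],
    "terminate": ["ec2:TerminateInstances"],
    "recover": ["ec2:DescribeInstanceRecoveryAttribute", "ec2:RecoverInstances"],
}


def alarm_policy(*alarm_actions):
    """Return a policy document covering the given alarm actions."""
    actions = list(PERMISSIONS_BY_ACTION["status"])
    for name in alarm_actions:
        actions.extend(PERMISSIONS_BY_ACTION[name])
    return {
        "Version": "2012-10-17",
        "Statement": [
            # "Resource": "*" is an illustrative assumption.
            {"Effect": "Allow", "Action": actions, "Resource": "*"}
        ],
    }


print(json.dumps(alarm_policy("stop", "recover"), indent=2))
```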

If you have read/write permissions for Amazon CloudWatch but not for Amazon EC2, you can still create an alarm but the stop or terminate actions won't be performed on the instance. However, if you are later granted permission to use the associated Amazon EC2 APIs, the alarm actions you created earlier will be performed. For more information, see Permissions and Policies in the IAM User Guide.

If you want to use an IAM role to stop, terminate, or reboot an instance using an alarm action, you can only use the EC2ActionsAccess role. Other IAM roles are not supported. If you are using another IAM role, you cannot stop, terminate, or reboot the instance. However, you can still see the alarm state and perform any other actions such as Amazon SNS notifications or Auto Scaling policies.

Contents

• Adding Stop Actions to Amazon CloudWatch Alarms (p. 186)

• Adding Terminate Actions to Amazon CloudWatch Alarms (p. 187)

• Adding Reboot Actions to Amazon CloudWatch Alarms (p. 188)

• Adding Recover Actions to Amazon CloudWatch Alarms (p. 189)

• Viewing the History of Triggered Alarms and Actions (p. 190)

Adding Stop Actions to Amazon CloudWatch Alarms

You can create an alarm that stops an Amazon EC2 instance when a certain threshold has been met. For example, you may run development or test instances and occasionally forget to shut them off. You can create an alarm that is triggered when the average CPU utilization percentage has been lower than 10 percent for 24 hours, signaling that it is idle and no longer in use. You can adjust the threshold, duration, and period to suit your needs, plus you can add an SNS notification, so that you will receive an email when the alarm is triggered.

Amazon EC2 instances that use an Amazon Elastic Block Store volume as the root device can be stopped or terminated, whereas instances that use the instance store as the root device can only be terminated.


To create an alarm to stop an idle instance using the Amazon CloudWatch console

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. If necessary, change the region. From the navigation bar, select the region where your instance is running.

3. In the navigation pane, choose Alarms, Create Alarm.

4. For the Select Metric step, do the following:

a. Under EC2 Metrics, choose Per-Instance Metrics.

b. Select the row with the instance and the CPUUtilization metric.

c. For the statistic, choose Average.

d. Choose a period (for example, 1 Hour).

e. Choose Next.

5. For the Define Alarm step, do the following:

a. Under Alarm Threshold, type a unique name for the alarm (for example, Stop EC2 instance) and a description of the alarm (for example, Stop EC2 instance when CPU is idle too long).

b. Under Whenever, for is, choose < and type 10. For for, type 24 consecutive periods.

c. Under Notification, for Send notification to, choose an existing SNS topic or create a new one.

To create an SNS topic, choose New list. For Send notification to, type a name for the SNS topic (for example, Stop_EC2_Instance), and for Email list, type a comma-separated list of email addresses to be notified when the alarm changes to the ALARM state. Each email address will be sent a topic subscription confirmation email. You must confirm the subscription before notifications can be sent to an email address.

d. Choose EC2 Action.

e. Under EC2 Action, for Whenever this alarm, choose State is ALARM. For Take this action, choose Stop this instance.

f. If prompted, select Create IAM role: EC2ActionsAccess to automatically create an IAM role so that AWS can automatically stop the instance on your behalf when the alarm is triggered.

g. Choose Create Alarm.

Adding Terminate Actions to Amazon CloudWatch Alarms

You can create an alarm that terminates an EC2 instance automatically when a certain threshold has been met (as long as termination protection is not enabled for the instance). For example, you might want to terminate an instance when it has completed its work, and you don't need the instance again.

If you might want to use the instance later, you should stop the instance instead of terminating it.

For information about enabling and disabling termination protection for an instance, see Enabling Termination Protection for an Instance in the Amazon EC2 User Guide for Linux Instances.

To create an alarm to terminate an idle instance using the Amazon CloudWatch console

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. If necessary, change the region. From the navigation bar, select the region where your instance is running.

3. In the navigation pane, choose Alarms, Create Alarm.

4. For the Select Metric step, do the following:

a. Under EC2 Metrics, choose Per-Instance Metrics.

b. Select the row with the instance and the CPUUtilization metric.

c. For the statistic, choose Average.

d. Choose a period (for example, 1 Hour).

e. Choose Next.

5. For the Define Alarm step, do the following:

a. Under Alarm Threshold, type a unique name for the alarm (for example, Terminate EC2 instance) and a description of the alarm (for example, Terminate EC2 instance when CPU is idle for too long).

b. Under Whenever, for is, choose < and type 10. For for, type 24 consecutive periods.

c. Under Notification, for Send notification to, choose an existing SNS topic or create a new one.

To create an SNS topic, choose New list. For Send notification to, type a name for the SNS topic (for example, Terminate_EC2_Instance), and for Email list, type a comma-separated list of email addresses to be notified when the alarm changes to the ALARM state. Each email address will be sent a topic subscription confirmation email. You must confirm the subscription before notifications can be sent to an email address.

d. Choose EC2 Action.

e. Under EC2 Action, for Whenever this alarm, choose State is ALARM. For Take this action, choose Terminate this instance.

f. If prompted, select Create IAM role: EC2ActionsAccess to automatically create an IAM role so that AWS can automatically terminate the instance on your behalf when the alarm is triggered.

g. Choose Create Alarm.

Adding Reboot Actions to Amazon CloudWatch Alarms

You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically reboots the instance. The reboot alarm action is recommended for Instance Health Check failures (as opposed to the recover alarm action, which is suited for System Health Check failures). An instance reboot is equivalent to an operating system reboot. In most cases, it takes only a few minutes to reboot your instance. When you reboot an instance, it remains on the same physical host, so your instance keeps its public DNS name, private IP address, and any data on its instance store volumes.

Rebooting an instance doesn't start a new instance billing hour, unlike stopping and restarting your instance. For more information about rebooting an instance, see Reboot Your Instance in the Amazon EC2 User Guide for Linux Instances.

Important

To avoid a race condition between the reboot and recover actions, we recommend that you set the alarm threshold to 3 for 1 minute when creating alarms that reboot an Amazon EC2 instance.
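The recommended settings differ only in the number of periods: 3 for reboot alarms here, and 2 for the recover alarms described later in this section. A minimal sketch of the resulting timing, assuming a 1-minute period and continuously failing checks (the function name is hypothetical):

```python
def seconds_until_alarm(period_seconds, evaluation_periods):
    """Minimum time of continuous breaching before the alarm can fire."""
    return period_seconds * evaluation_periods


reboot_delay = seconds_until_alarm(60, 3)   # reboot alarm: 3 periods of 1 minute
recover_delay = seconds_until_alarm(60, 2)  # recover alarm: 2 periods of 1 minute
print(reboot_delay, recover_delay)
```

Because the two delays differ, the two actions are not both triggered at the same moment, which is the point of the recommendation.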

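To create an alarm to reboot an instance using the Amazon CloudWatch console

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. If necessary, change the region. From the navigation bar, select the region where your instance is running.

3. In the navigation pane, choose Alarms, Create Alarm.

4. For the Select Metric step, do the following:

a. Under EC2 Metrics, choose Per-Instance Metrics.

b. Select the row with the instance and the StatusCheckFailed_Instance metric.

c. For the statistic, choose Minimum.

d. Choose a period (for example, 1 Minute).

e. Choose Next.

5. For the Define Alarm step, do the following:

a. Under Alarm Threshold, type a unique name for the alarm (for example, Reboot EC2 instance) and a description of the alarm (for example, Reboot EC2 instance when health checks fail).

b. Under Whenever, for is, choose > and type 0. For for, type 3 consecutive periods.

c. Under Notification, for Send notification to, choose an existing SNS topic or create a new one.

To create an SNS topic, choose New list. For Send notification to, type a name for the SNS topic (for example, Reboot_EC2_Instance), and for Email list, type a comma-separated list of email addresses to be notified when the alarm changes to the ALARM state. Each email address will be sent a topic subscription confirmation email. You must confirm the subscription before notifications can be sent to an email address.

d. Choose EC2 Action.

e. Under EC2 Action, for Whenever this alarm, choose State is ALARM. For Take this action, choose Reboot this instance.

f. If prompted, select Create IAM role: EC2ActionsAccess to automatically create an IAM role so that AWS can automatically reboot the instance on your behalf when the alarm is triggered.

g. Choose Create Alarm.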


Adding Recover Actions to Amazon CloudWatch Alarms

You can create an Amazon CloudWatch alarm that monitors an Amazon EC2 instance and automatically recovers the instance if it becomes impaired due to an underlying hardware failure or a problem that requires AWS involvement to repair. Terminated instances cannot be recovered. A recovered instance is identical to the original instance, including the instance ID, private IP addresses, Elastic IP addresses, and all instance metadata.

When the StatusCheckFailed_System alarm is triggered, and the recover action is initiated, you will be notified by the Amazon SNS topic that you chose when you created the alarm and associated the recover action. During instance recovery, the instance is migrated during an instance reboot, and any data that is in-memory is lost. When the process is complete, information is published to the SNS topic you've configured for the alarm. Anyone who is subscribed to this SNS topic will receive an email notification that includes the status of the recovery attempt and any further instructions. You will notice an instance reboot on the recovered instance.

Examples of problems that cause system status checks to fail include:

• Loss of network connectivity

• Loss of system power

• Software issues on the physical host

• Hardware issues on the physical host

The recover action is only supported on:

• The C3, C4, M3, M4, R3, R4, T2, and X1 instance types

• Instances in a VPC

• Instances with shared tenancy (the tenancy attribute is set to default)

• Instances that use Amazon EBS storage exclusively

If your instance has a public IP address, it will retain the same public IP address after recovery.

To create an alarm to recover an instance using the Amazon CloudWatch console

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. If necessary, change the region. From the navigation bar, select the region where your instance is running.

3. In the navigation pane, choose Alarms, Create Alarm.

4. For the Select Metric step, do the following:

a. Under EC2 Metrics, choose Per-Instance Metrics.

b. Select the row with the instance and the StatusCheckFailed_System metric.

c. For the statistic, choose Minimum.

d. Choose a period (for example, 1 Minute).

Important

To avoid a race condition between the reboot and recover actions, we recommend that you set the alarm threshold to 2 for 1 Minute when creating alarms that recover an EC2 instance.

e. Choose Next.

5. For the Define Alarm step, do the following:

a. Under Alarm Threshold, type a unique name for the alarm (for example, Recover EC2 instance) and a description of the alarm (for example, Recover EC2 instance when health checks fail).

b. Under Whenever, for is, choose > and type 0. For for, type 2 consecutive periods.

c. Under Notification, for Send notification to, choose an existing SNS topic or create a new one.

To create an SNS topic, choose New list. For Send notification to, type a name for the SNS topic (for example, Recover_EC2_Instance), and for Email list, type a comma-separated list of email addresses to be notified when the alarm changes to the ALARM state. Each email address will be sent a topic subscription confirmation email. You must confirm the subscription before notifications can be sent to an email address.

d. Choose EC2 Action.

e. Under EC2 Action, for Whenever this alarm, choose State is ALARM. For Take this action, choose Recover this instance.

f. Choose Create Alarm.

Viewing the History of Triggered Alarms and Actions

You can view alarm and action history in the Amazon CloudWatch console. Amazon CloudWatch keeps the last two weeks' worth of alarm and action history.

To view the history of triggered alarms and actions

1. Open the CloudWatch console at https://console.aws.amazon.com/cloudwatch/.

2. In the navigation pane, choose Alarms.

3. Choose the alarm.

4. Choose the Details tab to view the most recent state transition along with the time and metric values.

5. Choose the History tab to view the most recent history entries.

Create a Billing Alarm to Monitor Your Estimated AWS Charges

You can monitor your estimated AWS charges using Amazon CloudWatch. When you enable the monitoring of estimated charges for your AWS account, the estimated charges are calculated and sent several times daily to CloudWatch as metric data.

Billing metric data is stored in the US East (N. Virginia) region and represents worldwide charges. This data includes the estimated charges for every service in AWS that you use, as well as the estimated overall total of your AWS charges.

You can choose to receive alerts by email when charges have exceeded a certain threshold. These alerts are triggered by CloudWatch and messages are sent using Amazon SNS.

Tasks

• Enable Billing Alerts (p. 191)

• Create a Billing Alarm (p. 192)

• Check the Alarm Status (p. 193)

• Delete a Billing Alarm (p. 193)

Enable Billing Alerts

Before you can create an alarm for your estimated charges, you must enable billing alerts, so that you can monitor your estimated AWS charges and create an alarm using billing metric data. After you enable billing alerts, you cannot disable data collection, but you can delete any billing alarms you created.

After you enable billing alerts for the first time, it takes about 15 minutes before you can view billing data and set billing alarms.

Requirements

• You must be signed in using root account credentials; IAM users cannot enable billing alerts for your AWS account.

• For consolidated billing accounts, billing data for each linked account can be found by logging in as the paying account. You can view billing data for total estimated charges and estimated charges by service for each linked account as well as for the consolidated account.

To enable the monitoring of estimated charges

1. Open the Billing and Cost Management console at https://console.aws.amazon.com/billing/home?#.

2. In the navigation pane, choose Preferences.

3. Choose Receive Billing Alerts.




4. Choose Save preferences.


Create a Billing Alarm

After you've enabled billing alerts, you can create a billing alarm. In this procedure, you create an alarm that sends an email message when your estimated charges for AWS exceed a specified threshold.

Note

This procedure uses the advanced options. To use the simple options, see Create a Billing Alarm (p. 13) in Monitor Your Estimated Charges Using CloudWatch.

To create a billing alarm using the CloudWatch console


2. If necessary, change the region to US East (N. Virginia). Billing metric data is stored in this region and represents worldwide charges.


3. In the navigation pane, choose Alarms, Billing, Create Alarm.

4. Choose show advanced to switch to the advanced options.

5. Under Alarm Threshold, replace the default name for the alarm (for example, My Estimated Charges) and type a description for the alarm (for example, Estimated Monthly Charges).

6. Under Whenever charges for, for is, choose >= and then type the monetary amount (for example, 200) that must be reached or exceeded to trigger the alarm and send an email.

Note

Under Alarm Preview, there is an estimate of your charges that you can use to set an appropriate amount.


8. Under Actions, for Whenever this alarm, choose State is ALARM. For Send notification to, choose an existing SNS topic or create a new one.

To create an SNS topic, choose New list. For Send notification to, type a name for the SNS topic, and for Email list, type a comma-separated list of email addresses where email notifications should be sent. Each email address is sent a topic subscription confirmation email. You must confirm the subscription before notifications can be sent to an email address.
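The console settings in this procedure correspond to the parameters of the CloudWatch PutMetricAlarm operation. The following is a minimal sketch, not a definitive implementation, of how those parameters might be assembled in Python. The helper name billing_alarm_params, the topic ARN, the six-hour period, and the Maximum statistic are assumptions chosen for illustration; the AWS/Billing namespace, the EstimatedCharges metric, and the Currency dimension are the names under which billing metric data is published.

```python
def billing_alarm_params(alarm_name, threshold_usd, topic_arn):
    """Return keyword arguments describing an alarm on total estimated charges.

    Hypothetical helper: it only builds a dictionary and makes no AWS calls.
    """
    return {
        "AlarmName": alarm_name,
        "AlarmDescription": "Estimated Monthly Charges",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",   # assumption: alarm on the highest reported estimate
        "Period": 21600,          # assumption: six hours, in seconds
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        # Matches the ">=" choice in step 6 of the console procedure.
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],
    }


params = billing_alarm_params(
    "My Estimated Charges",
    200,
    "arn:aws:sns:us-east-1:123456789012:billing-alerts",  # hypothetical topic ARN
)
print(params["ComparisonOperator"], params["Threshold"])
```

If boto3 is installed and credentials are configured, the dictionary could be passed to boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params). The region matters because billing metric data is stored in US East (N. Virginia).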



Check the Alarm Status

You can check the status of your billing alarm.

To check alarm status




4. Select the check box next to the alarm. Note that until the subscription is confirmed, it is shown as "Pending confirmation". After the subscription is confirmed, refresh the console to show the updated status.

Delete a Billing Alarm

You can delete your billing alarm when you no longer need it.

To delete a billing alarm




4. Select the check box next to the alarm and then choose Delete.

5. When prompted for confirmation, choose Yes, Delete.


Authentication and Access Control for Amazon CloudWatch

Access to Amazon CloudWatch requires credentials. Those credentials must have permissions to access AWS resources, such as retrieving CloudWatch metric data about your cloud resources. The following sections provide details about how you can use AWS Identity and Access Management (IAM) and CloudWatch to help secure your resources by controlling who can access them:

• Authentication (p. 194)

• Access Control (p. 195)

Authentication

You can access AWS as any of the following types of identities:

• AWS account root user – When you sign up for AWS, you provide an email address and password that is associated with your AWS account. These are your root credentials and they provide complete access to all of your AWS resources.

Important

For security reasons, we recommend that you use the root credentials only to create an administrator user, which is an IAM user with full permissions to your AWS account. Then, you can use this administrator user to create other IAM users and roles with limited permissions. For more information, see IAM Best Practices and Creating an Admin User and Group in the IAM User Guide.

• IAM user – An IAM user is simply an identity within your AWS account that has specific custom permissions (for example, permissions to view metrics in CloudWatch). You can use an IAM user name and password to sign in to secure AWS webpages like the AWS Management Console, AWS Discussion Forums, or the AWS Support Center.

In addition to a user name and password, you can also generate access keys for each user. You can use these keys when you access AWS services programmatically, either through one of the several SDKs or by using the AWS Command Line Interface (CLI). The SDK and CLI tools use the access keys to cryptographically sign your request. If you don’t use the AWS tools, you must sign the request yourself.

CloudWatch supports Signature Version 4, a protocol for authenticating inbound API requests. For more information about authenticating requests, see Signature Version 4 Signing Process in the AWS General Reference.

• IAM role – An IAM role is another IAM identity you can create in your account that has specific permissions. It is similar to an IAM user, but it is not associated with a specific person. An IAM role enables you to obtain temporary access keys that can be used to access AWS services and resources.

IAM roles with temporary credentials are useful in the following situations:

• Federated user access – Instead of creating an IAM user, you can use preexisting user identities from AWS Directory Service, your enterprise user directory, or a web identity provider. These are known as federated users. AWS assigns a role to a federated user when access is requested through an identity provider. For more information about federated users, see Federated Users and Roles in the IAM User Guide.

• Cross-account access – You can use an IAM role in your account to grant another AWS account permissions to access your account’s resources. For an example, see Tutorial: Delegate Access Across AWS Accounts Using IAM Roles in the IAM User Guide.

• AWS service access – You can use an IAM role in your account to grant an AWS service permissions to access your account’s resources. For example, you can create a role that allows Amazon Redshift to access an Amazon S3 bucket on your behalf and then load data stored in the bucket into an Amazon Redshift cluster. For more information, see Creating a Role to Delegate Permissions to an AWS Service in the IAM User Guide.

• Applications running on Amazon EC2 – Instead of storing access keys within the EC2 instance for use by applications running on the instance and making API requests, you can use an IAM role to manage temporary credentials for these applications. To assign an AWS role to an EC2 instance and make it available to all of its applications, you can create an instance profile that is attached to the instance. An instance profile contains the role and enables programs running on the EC2 instance to get temporary credentials. For more information, see Using Roles for Applications on Amazon EC2 in the IAM User Guide.
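The description of access keys above notes that requests made without the AWS tools must be signed manually with Signature Version 4. The following is a minimal sketch of only one step of that process, the derivation of the signing key from a secret access key; building the canonical request and the string to sign is omitted. The secret key and date are made-up values, and the function name derive_signing_key is hypothetical. The service name monitoring is assumed here as the signing name for CloudWatch.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str, service: str) -> bytes:
    """Chain four HMAC operations: date, region, service, then the fixed
    terminator string "aws4_request"."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Made-up secret key and date, for illustration only.
signing_key = derive_signing_key("EXAMPLE-SECRET-KEY", "20170101", "us-east-1", "monitoring")
print(signing_key.hex())
```

Because the date, region, and service are mixed into the key, a signing key derived for one region or day does not validate requests for another. In practice the SDKs and the AWS CLI perform all of these steps for you.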

Access Control

You can have valid credentials to authenticate your requests, but unless you have permissions you cannot create or access CloudWatch resources. For example, you must have permissions to create CloudWatch dashboard widgets, view metrics, and so on.

The following sections describe how to manage permissions for CloudWatch. We recommend that you read the overview first.

• Overview of Managing Access Permissions to Your CloudWatch Resources (p. 196)

• Using Identity-Based Policies (IAM Policies) for CloudWatch (p. 199)

• Amazon CloudWatch Permissions Reference (p. 205)

Overview of Managing Access Permissions to Your CloudWatch Resources

Every AWS resource is owned by an AWS account, and permissions to create or access a resource are governed by permissions policies. An account administrator can attach permissions policies to IAM identities (that is, users, groups, and roles), and some services (such as AWS Lambda) also support attaching permissions policies to resources.

Note

An account administrator (or administrator user) is a user with administrator privileges. For more information, see IAM Best Practices in the IAM User Guide.

When granting permissions, you decide who is getting the permissions, the resources they get permissions for, and the specific actions that you want to allow on those resources.

Topics

• CloudWatch Resources and Operations (p. 196)

• Understanding Resource Ownership (p. 197)

• Managing Access to Resources (p. 198)

• Specifying Policy Elements: Actions, Effects, and Principals (p. 199)

• Specifying Conditions in a Policy (p. 199)

CloudWatch Resources and Operations

CloudWatch doesn't have any specific resources for you to control access to. Therefore, there are no CloudWatch Amazon Resource Names (ARNs) for you to use in an IAM policy. For example, you can't give a user access to CloudWatch data for only a specific set of EC2 instances or a specific load balancer.

Permissions granted using IAM cover all the cloud resources you use or monitor with CloudWatch. In addition, you can't use IAM roles with the CloudWatch command line tools.

You use an * (asterisk) as the resource when writing a policy to control access to CloudWatch actions. For example:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics"],
      "Resource": "*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "true"
        }
      }
    }
  ]
}
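To make the effect of the condition concrete, the following is a minimal sketch of how the policy above could be read by a program. It is a deliberately simplified toy check, not IAM's actual evaluation logic: it handles only Allow statements, exact action names, and the aws:SecureTransport key. The function name is_allowed is hypothetical.

```python
import json

POLICY = json.loads("""
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics"],
      "Resource": "*",
      "Condition": {"Bool": {"aws:SecureTransport": "true"}}
    }
  ]
}
""")


def is_allowed(policy, action, secure_transport):
    """Toy check: True if some Allow statement lists the action and its
    aws:SecureTransport condition (if any) matches the request."""
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        if action not in statement["Action"]:
            continue
        wanted = statement.get("Condition", {}).get("Bool", {}).get("aws:SecureTransport")
        # Policy values are the strings "true" or "false".
        if wanted is not None and wanted != str(secure_transport).lower():
            continue
        return True
    return False


print(is_allowed(POLICY, "cloudwatch:ListMetrics", secure_transport=True))
print(is_allowed(POLICY, "cloudwatch:ListMetrics", secure_transport=False))
```

Under these assumptions, the same ListMetrics request is permitted over SSL and refused otherwise, and an action that the statement does not list is refused in both cases.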

For more information about ARNs, see ARNs in the IAM User Guide. For information about CloudWatch Logs ARNs, see Amazon Resource Names (ARNs) and AWS Service Namespaces in the Amazon Web Services General Reference. For an example of a policy that covers CloudWatch actions, see Using Identity-Based Policies (IAM Policies) for CloudWatch (p. 199).

| Action | ARN (with region) | ARN (for use with IAM role) |
|---|---|---|
| Stop | arn:aws:automate:us-east-1:ec2:stop | arn:aws:swf:us-east-1:customer-account:action/actions/AWS_EC2.InstanceId.Stop/1.0 Note: You must create at least one stop alarm using the Amazon EC2 or CloudWatch console to create the EC2ActionsAccess IAM role. After this IAM role is created, you can create stop alarms using the CLI. |
| Terminate | arn:aws:automate:us-east-1:ec2:terminate | arn:aws:swf:us-east-1:customer-account:action/actions/AWS_EC2.InstanceId.Terminate/1.0 Note: You must create at least one terminate alarm using the Amazon EC2 or CloudWatch console to create the EC2ActionsAccess IAM role. After this IAM role is created, you can create terminate alarms using the CLI. |
| Reboot | n/a | arn:aws:swf:us-east-1:customer-account:action/actions/AWS_EC2.InstanceId.Reboot/1.0 Note: You must create at least one reboot alarm using the Amazon EC2 or CloudWatch console to create the EC2ActionsAccess IAM role. After this IAM role is created, you can create reboot alarms using the CLI. |
| Recover | arn:aws:automate:us-east-1:ec2:recover | n/a |

Understanding Resource Ownership

The AWS account owns the resources that are created in the account, regardless of who created the resources. Specifically, the resource owner is the AWS account of the principal entity (that is, the root account, an IAM user, or an IAM role) that authenticates the resource creation request. CloudWatch does not have any resources that you can own.

Managing Access to Resources

A permissions policy describes who has access to what. The following section explains the available options for creating permissions policies.

Note

This section discusses using IAM in the context of CloudWatch. It doesn't provide detailed information about the IAM service. For complete IAM documentation, see What Is IAM? in the IAM User Guide. For information about IAM policy syntax and descriptions, see AWS IAM Policy Reference in the IAM User Guide.

Policies attached to an IAM identity are referred to as identity-based policies (IAM policies) and policies attached to a resource are referred to as resource-based policies. CloudWatch supports only identity-based policies (IAM policies).

Topics

• Identity-Based Policies (IAM Policies) (p. 198)

• Resource-Based Policies (p. 198)

Identity-Based Policies (IAM Policies)

You can attach policies to IAM identities. For example, you can do the following:

• Attach a permissions policy to a user or a group in your account – To grant a user permissions to create an Amazon CloudWatch resource, such as metrics, you can attach a permissions policy to a user or group that the user belongs to.

• Attach a permissions policy to a role (grant cross-account permissions) – You can attach an identity-based permissions policy to an IAM role to grant cross-account permissions. For example, the administrator in account A can create a role to grant cross-account permissions to another AWS account (for example, account B) or an AWS service as follows:

1. Account A administrator creates an IAM role and attaches a permissions policy to the role that grants permissions on resources in account A.

2. Account A administrator attaches a trust policy to the role identifying account B as the principal who can assume the role.

3. Account B administrator can then delegate permissions to assume the role to any users in account B.

Doing this allows users in account B to create or access resources in account A. The principal in the trust policy can also be an AWS service principal if you want to grant an AWS service permissions to assume the role.

For more information about using IAM to delegate permissions, see Access Management in the IAM User Guide.
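The cross-account steps above involve two separate documents on the role in account A: a trust policy naming who can assume the role, and a permissions policy describing what the role can do. The following is a minimal sketch of what the two might look like, built as Python dictionaries. The account ID, the function name cross_account_role_documents, and the choice of read-only CloudWatch actions are assumptions for illustration.

```python
import json


def cross_account_role_documents(trusted_account_id):
    """Return (trust_policy, permissions_policy) for a role in account A
    that principals in the trusted account (account B) may assume."""
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Step 2: identify account B as the principal.
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Step 1: permissions on resources in account A.
                "Action": ["cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics"],
                "Resource": "*",
            }
        ],
    }
    return trust_policy, permissions_policy


trust, permissions = cross_account_role_documents("111122223333")  # hypothetical account B ID
print(json.dumps(trust, indent=2))
```

Step 3 happens in account B: its administrator grants individual users permission to call sts:AssumeRole on the role in account A.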

For more information about using identity-based policies with CloudWatch, see Using Identity-Based Policies (IAM Policies) for CloudWatch (p. 199). For more information about users, groups, roles, and permissions, see Identities (Users, Groups, and Roles) in the IAM User Guide.

Resource-Based Policies

Other services, such as Amazon S3, also support resource-based permissions policies. For example, you can attach a policy to an Amazon S3 bucket to manage access permissions to that bucket. CloudWatch doesn't support resource-based policies.

Specifying Policy Elements: Actions, Effects, and Principals

For each CloudWatch resource, the service defines a set of API operations. To grant permissions for these API operations, CloudWatch defines a set of actions that you can specify in a policy. Some API operations can require permissions for more than one action in order to perform the API operation. For more information about resources and API operations, see CloudWatch Resources and Operations (p. 196) and CloudWatch Actions.

The following are the basic policy elements:

• Resource – You use an Amazon Resource Name (ARN) to identify the resource that the policy applies to. CloudWatch does not have any resources for you to control using policy resources, so you always use the wildcard character (*) in IAM policies. For more information, see CloudWatch Resources and Operations (p. 196).

• Action – You use action keywords to identify resource operations that you want to allow or deny. For example, the cloudwatch:ListMetrics permission grants the user permission to perform the ListMetrics operation.

• Effect – You specify the effect, either allow or deny, when the user requests the specific action. If you don't explicitly grant access to (allow) a resource, access is implicitly denied. You can also explicitly deny access to a resource, which you might do to make sure that a user cannot access it, even if a different policy grants access.

• Principal – In identity-based policies (IAM policies), the user that the policy is attached to is the implicit principal. For resource-based policies only, you specify the user, account, service, or other entity that you want to receive permissions. CloudWatch doesn't support resource-based policies.
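The interaction between the Effect values described above (implicit deny by default, allow when granted, explicit deny overriding any allow) can be sketched as follows. This is a simplified illustration under stated assumptions, not IAM's evaluation engine: it ignores conditions and resources, and the function name evaluate is hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(pattern, action):
    """Case-insensitive match of an action against a pattern that may contain '*'."""
    return fnmatchcase(action.lower(), pattern.lower())


def evaluate(statements, action):
    """Return "ExplicitDeny", "Allow", or "ImplicitDeny" for the action."""
    decision = "ImplicitDeny"  # nothing is allowed unless a statement says so
    for statement in statements:
        patterns = statement["Action"]
        if isinstance(patterns, str):
            patterns = [patterns]
        if not any(_matches(p, action) for p in patterns):
            continue
        if statement["Effect"] == "Deny":
            return "ExplicitDeny"  # a matching Deny always wins
        decision = "Allow"
    return decision


statements = [
    {"Effect": "Allow", "Action": "cloudwatch:*", "Resource": "*"},
    {"Effect": "Deny", "Action": "cloudwatch:DeleteAlarms", "Resource": "*"},
]
for action in ("cloudwatch:ListMetrics", "cloudwatch:DeleteAlarms", "ec2:StopInstances"):
    print(action, evaluate(statements, action))
```

In this sketch the order of the statements does not matter: the Deny on DeleteAlarms takes precedence even though the broader Allow appears first.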

To learn more about IAM policy syntax and descriptions, see AWS IAM Policy Reference in the IAM User Guide.

For a table showing all of the CloudWatch API actions and the resources that they apply to, see Amazon CloudWatch Permissions Reference (p. 205).

Specifying Conditions in a Policy

When you grant permissions, you can use the access policy language to specify the conditions when a policy should take effect. For example, you might want a policy to be applied only after a specific date. For more information about specifying conditions in a policy language, see Condition in the IAM User Guide.

To express conditions, you use predefined condition keys. There are no condition keys specific to CloudWatch. However, there are AWS-wide condition keys that you can use as appropriate. For a complete list of AWS-wide keys, see Available Keys for Conditions in the IAM User Guide.
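As an illustration of the date example mentioned above, the following sketch builds a statement that uses the AWS-wide aws:CurrentTime key with the DateGreaterThan operator, and then checks the condition locally against a given time. The start date, the function name date_condition_met, and the local check itself are assumptions for illustration; IAM evaluates the condition on the service side.

```python
from datetime import datetime, timezone

POLICY_START = "2017-07-01T00:00:00Z"  # hypothetical date after which the policy applies

statement = {
    "Effect": "Allow",
    "Action": "cloudwatch:ListMetrics",
    "Resource": "*",
    "Condition": {"DateGreaterThan": {"aws:CurrentTime": POLICY_START}},
}


def date_condition_met(stmt, now):
    """Return True if `now` is later than the statement's DateGreaterThan value."""
    threshold = stmt["Condition"]["DateGreaterThan"]["aws:CurrentTime"]
    start = datetime.strptime(threshold, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return now > start


print(date_condition_met(statement, datetime(2017, 8, 1, tzinfo=timezone.utc)))
print(date_condition_met(statement, datetime(2017, 6, 1, tzinfo=timezone.utc)))
```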

Using Identity-Based Policies (IAM Policies) for CloudWatch

This topic provides examples of identity-based policies that demonstrate how an account administrator can attach permissions policies to IAM identities (that is, users, groups, and roles) and thereby grant permissions to perform operations on CloudWatch resources.


Important

We recommend that you first review the introductory topics that explain the basic concepts and options available to manage access to your CloudWatch resources. For more information, see Access Control (p. 195).

The sections in this topic cover the following:

• Permissions Required to Use the CloudWatch Console (p. 200)

• AWS Managed (Predefined) Policies for CloudWatch (p. 203)

• Customer Managed Policy Examples (p. 203)

The following shows an example of a permissions policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudwatch:GetMetricStatistics", "cloudwatch:ListMetrics"],
      "Resource": "*",
      "Condition": {
        "Bool": {
          "aws:SecureTransport": "true"
        }
      }
    }
  ]
}

This sample policy has one statement that grants permissions to a group for two CloudWatch actions (cloudwatch:GetMetricStatistics and cloudwatch:ListMetrics), but only if the group uses SSL with the request ("aws:SecureTransport":"true"). For more information about the elements within an IAM policy statement, see Specifying Policy Elements: Actions, Effects, and Principals (p. 199) and IAM Policy Elements Reference in the IAM User Guide.


Permissions Required to Use the CloudWatch Console

For a user to work with the CloudWatch console, that user must have a minimum set of permissions that allows the user to describe other AWS resources in their AWS account. The CloudWatch console requires permissions from the following services:

• Auto Scaling

• CloudTrail

• CloudWatch

• CloudWatch Events

• CloudWatch Logs

• Amazon EC2

• Amazon ES

• IAM

• Kinesis

• Lambda

• Amazon S3

• Amazon SNS

• Amazon SQS




• Amazon SWF

If you create an IAM policy that is more restrictive than the minimum required permissions, the console won't function as intended for users with that IAM policy. To ensure that those users can still use the CloudWatch console, also attach the CloudWatchReadOnlyAccess managed policy to the user, as described in AWS Managed (Predefined) Policies for CloudWatch (p. 203).

You don't need to allow minimum console permissions for users that are making calls only to the AWS CLI or the CloudWatch API.

The full set of permissions required to work with the CloudWatch console is listed below:

• application-autoscaling:describeScalingPolicies

• autoscaling:describeAutoScalingGroups

• autoscaling:describePolicies

• cloudtrail:describeTrails

• cloudwatch:deleteAlarms

• cloudwatch:describeAlarmHistory

• cloudwatch:describeAlarms

• cloudwatch:getMetricData

• cloudwatch:getMetricDataForAccounts

• cloudwatch:getMetricStatistics

• cloudwatch:listMetrics

• cloudwatch:putMetricAlarm

• cloudwatch:putMetricData

• ec2:describeInstances

• ec2:describeTags

• ec2:describeVolumes

• es:describeElasticsearchDomain

• es:listDomainNames

• events:deleteRule

• events:describeRule

• events:disableRule

• events:enableRule

• events:listRules

• events:putRule

• iam:attachRolePolicy

• iam:createRole

• iam:getPolicy

• iam:getPolicyVersion

• iam:getRole

• iam:listAttachedRolePolicies

• iam:listRoles

• kinesis:describeStreams

• kinesis:listStreams

• lambda:addPermission




• lambda:createFunction

• lambda:getFunctionConfiguration

• lambda:listAliases

• lambda:listFunctions

• lambda:listVersionsByFunction

• lambda:removePermission

• logs:cancelExportTask

• logs:createExportTask

• logs:createLogGroup

• logs:createLogStream

• logs:deleteLogGroup

• logs:deleteLogStream

• logs:deleteMetricFilter

• logs:deleteRetentionPolicy

• logs:deleteSubscriptionFilter

• logs:describeExportTasks

• logs:describeLogGroups

• logs:describeLogStreams

• logs:describeMetricFilters

• logs:describeSubscriptionFilters

• logs:filterLogEvents

• logs:getLogEvents

• logs:putMetricFilter

• logs:putRetentionPolicy

• logs:putSubscriptionFilter

• logs:testMetricFilter

• s3:createBucket

• s3:listBuckets

• sns:createTopic

• sns:getTopicAttributes

• sns:listSubscriptions

• sns:listTopics

• sns:setTopicAttributes

• sns:subscribe

• sns:unsubscribe

• sqs:getQueueAttributes

• sqs:getQueueUrl

• sqs:listQueues

• sqs:setQueueAttributes

• swf:createAction

• swf:describeAction

• swf:listActionTemplates

• swf:registerAction

• swf:registerDomain


• swf:updateAction
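When a custom policy uses wildcards, it can be hard to see at a glance whether it covers the console permissions listed above. The following is a minimal sketch of a coverage check. The REQUIRED list is only a short excerpt of the full list, the candidate patterns are an example, and the function name missing_actions is hypothetical. Matching is case-insensitive on the assumption that IAM treats action names that way, which is why the lower-camel-case names in the list match patterns such as cloudwatch:Get*.

```python
from fnmatch import fnmatchcase

# Excerpt of the console permissions listed above (not the full set).
REQUIRED = [
    "cloudwatch:describeAlarms",
    "cloudwatch:getMetricStatistics",
    "cloudwatch:listMetrics",
    "cloudwatch:putMetricAlarm",
    "ec2:describeInstances",
    "sns:listTopics",
]


def missing_actions(allowed_patterns, required):
    """Return the required actions that no allowed pattern matches, in order."""
    lowered = [pattern.lower() for pattern in allowed_patterns]
    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), pattern) for pattern in lowered)
    ]


read_only_patterns = ["cloudwatch:Describe*", "cloudwatch:Get*", "cloudwatch:List*", "sns:List*"]
print(missing_actions(read_only_patterns, REQUIRED))
```

A non-empty result indicates console features that would not work as intended for a user holding only the candidate policy.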

AWS Managed (Predefined) Policies for CloudWatch

AWS addresses many common use cases by providing standalone IAM policies that are created and administered by AWS. These AWS managed policies grant necessary permissions for common use cases so that you can avoid having to investigate what permissions are needed. For more information, see AWS Managed Policies in the IAM User Guide.

The following AWS managed policies, which you can attach to users in your account, are specific to CloudWatch:

• CloudWatchFullAccess – Grants full access to CloudWatch.

• CloudWatchReadOnlyAccess – Grants read-only access to CloudWatch.

• CloudWatchActionsEC2Access – Grants read-only access to CloudWatch alarms and metrics as well as Amazon EC2 metadata. Grants access to the Stop, Terminate, and Reboot API actions for EC2 instances.

Note

You can review these permissions policies by signing in to the IAM console and searching for specific policies there.

You can also create your own custom IAM policies to allow permissions for CloudWatch actions and resources. You can attach these custom policies to the IAM users or groups that require those permissions.

Customer Managed Policy Examples

In this section, you can find example user policies that grant permissions for various CloudWatch actions. These policies work when you are using the CloudWatch API, AWS SDKs, or the AWS CLI.

Examples

• Example 1: Allow User Full Access to CloudWatch (p. 203)

• Example 2: Allow Read-Only Access to CloudWatch (p. 204)

• Example 3: Stop or Terminate an Amazon EC2 Instance (p. 204)

Example 1: Allow User Full Access to CloudWatch

The following policy grants a user access to all CloudWatch actions, CloudWatch Logs actions, and Amazon SNS actions, plus read-only access to Auto Scaling.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "autoscaling:Describe*",
        "cloudwatch:*",
        "logs:*",
        "sns:*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

Example 2: Allow Read-Only Access to CloudWatch

The following policy grants a user read-only access to CloudWatch, including the ability to view Auto Scaling actions, CloudWatch metrics, CloudWatch Logs data, and alarm-related Amazon SNS data.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "autoscaling:Describe*",
        "cloudwatch:Describe*",
        "cloudwatch:Get*",
        "cloudwatch:List*",
        "logs:Get*",
        "logs:Describe*",
        "sns:Get*",
        "sns:List*"
      ],
      "Effect": "Allow",
      "Resource": "*"
    }
  ]
}

Example 3: Stop or Terminate an Amazon EC2 Instance

The following policy allows a CloudWatch alarm action to stop or terminate an EC2 instance. In the sample below, the GetMetricStatistics, ListMetrics, and DescribeAlarms actions are optional. It is recommended that you include these actions to ensure that you have correctly stopped or terminated the instance.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "cloudwatch:PutMetricAlarm",
        "cloudwatch:GetMetricStatistics",
        "cloudwatch:ListMetrics",
        "cloudwatch:DescribeAlarms"
      ],
      "Sid": "00000000000000",
      "Resource": [
        "*"
      ],
      "Effect": "Allow"
    },
    {
      "Action": [
        "ec2:DescribeInstanceStatus",
        "ec2:DescribeInstances",
        "ec2:StopInstances",
        "ec2:TerminateInstances"
      ],
      "Sid": "00000000000000",
      "Resource": [
        "*"
      ],
      "Effect": "Allow"
    }
  ]
}
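The Sid values in the sample above are placeholders. IAM expects each Sid to be unique within a policy, so the placeholders would need to be replaced with distinct values before the policy is used. The following is a minimal sketch of a local check for repeated Sid values; the function name duplicate_sids and the abbreviated sample policy are assumptions for illustration.

```python
import json
from collections import Counter


def duplicate_sids(policy_json):
    """Return a sorted list of Sid values that appear in more than one statement."""
    policy = json.loads(policy_json)
    statements = policy["Statement"]
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    counts = Counter(s["Sid"] for s in statements if "Sid" in s)
    return sorted(sid for sid, count in counts.items() if count > 1)


# Abbreviated version of Example 3, keeping the placeholder Sid values.
sample = """
{
  "Version": "2012-10-17",
  "Statement": [
    {"Sid": "00000000000000", "Effect": "Allow",
     "Action": ["cloudwatch:PutMetricAlarm"], "Resource": ["*"]},
    {"Sid": "00000000000000", "Effect": "Allow",
     "Action": ["ec2:StopInstances", "ec2:TerminateInstances"], "Resource": ["*"]}
  ]
}
"""
print(duplicate_sids(sample))
```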


Amazon CloudWatch Permissions Reference

When you are setting up Access Control (p. 195) and writing permissions policies that you can attach to an IAM identity (identity-based policies), you can use the following table as a reference. The table lists each CloudWatch API operation and the corresponding actions for which you can grant permissions to perform the action. You specify the actions in the policy's Action field, and you specify a wildcard character (*) as the resource value in the policy's Resource field.

You can use AWS-wide condition keys in your CloudWatch policies to express conditions. For a complete list of AWS-wide keys, see Available Keys in the IAM User Guide.

Note

To specify an action, use the cloudwatch: prefix followed by the API operation name. For example: cloudwatch:GetMetricStatistics, cloudwatch:ListMetrics, or cloudwatch:* (for all CloudWatch actions).
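The naming rule in the note above extends to the other services covered by the tables in this section, each with its own prefix. The following is a small sketch that turns API operation names into policy action names. The PREFIXES mapping is taken from the prefixes used in the tables below; the function name policy_actions is hypothetical.

```python
# Service prefixes as they appear in the permission tables in this section.
PREFIXES = {
    "CloudWatch": "cloudwatch",
    "CloudWatch Events": "events",
    "CloudWatch Logs": "logs",
    "Amazon EC2": "ec2",
    "Auto Scaling": "autoscaling",
}


def policy_actions(service, operations):
    """Return action names of the form prefix:OperationName for one service."""
    prefix = PREFIXES[service]
    return [f"{prefix}:{operation}" for operation in operations]


actions = policy_actions("CloudWatch", ["GetMetricStatistics", "ListMetrics"])
print(actions)
```

The resulting list can be used directly as the value of a statement's Action field.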

Tables

• CloudWatch API Operations and Required Permissions

• CloudWatch Events API Operations and Required Permissions

• CloudWatch Logs API Operations and Required Permissions

• Amazon EC2 API Operations and Required Permissions

• Auto Scaling API Operations and Required Permissions

CloudWatch API Operations and Required Permissions for Actions

| CloudWatch API Operations | Required Permissions (API Actions) | Description |
|---|---|---|
| DeleteAlarms | cloudwatch:DeleteAlarms | Required to delete an alarm. |
| DeleteDashboards | cloudwatch:DeleteDashboards | Required to delete a dashboard. |
| DescribeAlarmHistory | cloudwatch:DescribeAlarmHistory | Required to view alarm history. |
| DescribeAlarms | cloudwatch:DescribeAlarms | Required to retrieve alarm information by name. |
| DescribeAlarmsForMetric | cloudwatch:DescribeAlarmsForMetric | Required to view alarms for a metric. |
| DisableAlarmActions | cloudwatch:DisableAlarmActions | Required to disable an alarm action. |
| EnableAlarmActions | cloudwatch:EnableAlarmActions | Required to enable an alarm action. |
| GetDashboard | cloudwatch:GetDashboard | Required to display data about existing dashboards. |
| GetMetricData | cloudwatch:GetMetricData | Required to view or list dashboards and view metric data in dashboard widgets. |
| GetMetricStatistics | cloudwatch:GetMetricStatistics | Required to view graphs in other parts of the CloudWatch console and in dashboard widgets. |
| ListDashboards | cloudwatch:ListDashboards | Required to view the list of CloudWatch dashboards in your account. |
| ListMetrics | cloudwatch:ListMetrics | Required to view or search metric names within the CloudWatch console and in the CLI. Required to select metrics on dashboard widgets. |
| PutDashboard | cloudwatch:PutDashboard | Required to create a dashboard or update an existing dashboard. |
| PutMetricAlarm | cloudwatch:PutMetricAlarm | Required to create or update an alarm. |
| PutMetricData | cloudwatch:PutMetricData | Required to create metrics and create or delete dashboards. |
| SetAlarmState | cloudwatch:SetAlarmState | Required to manually set an alarm's state. |

CloudWatch Events API Operations and Required Permissions for Actions

| CloudWatch Events API Operations | Required Permissions (API Actions) | Description |
|---|---|---|
| DeleteRule | events:DeleteRule | Required to delete a rule. |
| DescribeRule | events:DescribeRule | Required to list the details about a rule. |
| DisableRule | events:DisableRule | Required to disable a rule. |
| EnableRule | events:EnableRule | Required to enable a rule. |
| ListRuleNamesByTarget | events:ListRuleNamesByTarget | Required to list rules associated with a target. |
| ListRules | events:ListRules | Required to list all rules in your account. |
| ListTargetsByRule | events:ListTargetsByRule | Required to list all targets associated with a rule. |
| PutEvents | events:PutEvents | Required to add custom events that can be matched to rules. |
| PutRule | events:PutRule | Required to create or update a rule. |
| PutTargets | events:PutTargets | Required to add targets to a rule. |
| RemoveTargets | events:RemoveTargets | Required to remove a target from a rule. |
| TestEventPattern | events:TestEventPattern | Required to test an event pattern against a given event. |

CloudWatch Logs API Operations and Required Permissions for Actions

| CloudWatch Logs API Operations | Required Permissions (API Actions) | Description |
|---|---|---|
| CancelExportTask | logs:CancelExportTask | Required to cancel a pending or running export task. |
| CreateExportTask | logs:CreateExportTask | Required to export data from a log group to an Amazon S3 bucket. |
| CreateLogGroup | logs:CreateLogGroup | Required to create a new log group. |
| CreateLogStream | logs:CreateLogStream | Required to create a new log stream in a log group. |
| DeleteDestination | logs:DeleteDestination | Required to delete a log destination and disable any subscription filters to it. |
| DeleteLogGroup | logs:DeleteLogGroup | Required to delete a log group and any associated archived log events. |
| DeleteLogStream | logs:DeleteLogStream | Required to delete a log stream and any associated archived log events. |
| DeleteMetricFilter | logs:DeleteMetricFilter | Required to delete a metric filter associated with a log group. |
| DeleteRetentionPolicy | logs:DeleteRetentionPolicy | Required to delete a log group's retention policy. |
| DeleteSubscriptionFilter | logs:DeleteSubscriptionFilter | Required to delete the subscription filter associated with a log group. |
| DescribeDestinations | logs:DescribeDestinations | Required to view all destinations associated with the account. |
| DescribeExportTasks | logs:DescribeExportTasks | Required to view all export tasks associated with the account. |
| DescribeLogGroups | logs:DescribeLogGroups | Required to view all log groups associated with the account. |
| DescribeLogStreams | logs:DescribeLogStreams | Required to view all log streams associated with a log group. |
| DescribeMetricFilters | logs:DescribeMetricFilters | Required to view all metric filters associated with a log group. |
| DescribeSubscriptionFilters | logs:DescribeSubscriptionFilters | Required to view all subscription filters associated with a log group. |
| FilterLogEvents | logs:FilterLogEvents | Required to sort log events by log group filter pattern. |
| GetLogEvents | logs:GetLogEvents | Required to retrieve log events from a log stream. |
| PutDestination | logs:PutDestination | Required to create or update a destination log stream (such as a Kinesis stream). |
| PutDestinationPolicy | logs:PutDestinationPolicy | Required to create or update an access policy associated with an existing log destination. |
| PutLogEvents | logs:PutLogEvents | Required to upload a batch of log events to a log stream. |
| PutMetricFilter | logs:PutMetricFilter | Required to create or update a metric filter and associate it with a log group. |
| PutRetentionPolicy | logs:PutRetentionPolicy | Required to set the number of days to keep log events (retention) in a log group. |
| PutSubscriptionFilter | logs:PutSubscriptionFilter | Required to create or update a subscription filter and associate it with a log group. |
| TestMetricFilter | logs:TestMetricFilter | Required to test a filter pattern against a sampling of log event messages. |

Amazon EC2 API Operations and Required Permissions for Actions

| Amazon EC2 API Operations | Required Permissions (API Actions) | Description |
|---|---|---|
| DescribeInstanceStatus | ec2:DescribeInstanceStatus | Required to view EC2 instance status details. |
| DescribeInstances | ec2:DescribeInstances | Required to view EC2 instance details. |
| RebootInstances | ec2:RebootInstances | Required to reboot an EC2 instance. |
| StopInstances | ec2:StopInstances | Required to stop an EC2 instance. |
| TerminateInstances | ec2:TerminateInstances | Required to terminate an EC2 instance. |

Auto Scaling API Operations and Required Permissions for Actions

Auto Scaling API Operations

Scaling

autoscaling:Scaling

Required to scale an Auto Scaling group.

Trigger

autoscaling:Trigger

Required to trigger an Auto Scaling action.
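As an illustration of how the permissions in these tables are granted, the following Python sketch builds an IAM policy document that allows a handful of the read-only actions listed above. The choice of actions, the function name, and the use of "*" as the resource are assumptions made for the example, not a recommended policy.

```python
import json


def read_only_monitoring_policy():
    """Build an illustrative IAM policy document (hypothetical helper)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Permission names taken from the tables above.
                "Action": [
                    "logs:DescribeLogGroups",
                    "logs:DescribeLogStreams",
                    "logs:GetLogEvents",
                    "logs:FilterLogEvents",
                    "ec2:DescribeInstances",
                    "ec2:DescribeInstanceStatus",
                ],
                # Assumption: all resources. Narrow this in a real policy.
                "Resource": "*",
            }
        ],
    }


# Print the policy as JSON so it can be pasted into a policy editor.
print(json.dumps(read_only_monitoring_policy(), indent=2))
```

The printed JSON can be attached to a user, group, or role; write actions such as logs:PutLogEvents would need to be added separately.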

Logging Amazon CloudWatch API Calls in AWS CloudTrail

AWS CloudTrail is a service that captures API calls made by or on behalf of your AWS account. This information is collected and written to log files that are stored in an Amazon S3 bucket that you specify.

API calls are logged whenever you use the API, the console, or the AWS CLI. Using the information collected by CloudTrail, you can determine what request was made, the source IP address the request was made from, who made the request, when it was made, and so on.

To learn more about CloudTrail, including how to configure and enable it, see What is AWS CloudTrail? in the AWS CloudTrail User Guide.

Topics

• CloudWatch Information in CloudTrail (p. 211)

• Understanding Log File Entries (p. 213)

CloudWatch Information in CloudTrail

If CloudTrail logging is turned on, calls made to API actions are captured in log files. Every log file entry contains information about who generated the request. For example, if a request is made to create or update a CloudWatch alarm (PutMetricAlarm), CloudTrail logs the user identity of the person or service that made the request.

The user identity information in the log entry helps you determine the following:

• Whether the request was made with root or IAM user credentials

• Whether the request was made with temporary security credentials for a role or federated user

• Whether the request was made by another AWS service

For more information, see the CloudTrail userIdentity Element in the AWS CloudTrail User Guide.
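The three questions above can be answered from the type field of the userIdentity element. The following Python sketch maps that field to a coarse category. The function name and the category labels are hypothetical, and the mapping covers only the identity types named in the comments; other types fall through to "other".

```python
def describe_caller(user_identity):
    """Classify a CloudTrail userIdentity element by its "type" field."""
    identity_type = user_identity.get("type", "")
    if identity_type == "Root":
        return "root credentials"
    if identity_type == "IAMUser":
        return "IAM user credentials"
    if identity_type in ("AssumedRole", "FederatedUser"):
        # Role sessions and federated users use temporary credentials.
        return "temporary security credentials"
    if identity_type == "AWSService":
        return "another AWS service"
    return "other"


# Example with the shape used in the log entries later in this topic.
print(describe_caller({"type": "IAMUser", "userName": "someuser"}))
```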

You can store your log files in your bucket for as long as you want, but you can also define Amazon S3 lifecycle rules to archive or delete log files automatically. By default, your log files are encrypted by using Amazon S3 server-side encryption (SSE).

If you want to be notified upon log file delivery, you can configure CloudTrail to publish Amazon SNS notifications when new log files are delivered. For more information, see Configuring Amazon SNS Notifications for CloudTrail in the AWS CloudTrail User Guide.

You can also aggregate CloudTrail log files from multiple AWS regions and multiple AWS accounts into a single Amazon S3 bucket. For more information, see Receiving CloudTrail Log Files from Multiple Regions and Receiving CloudTrail Log Files from Multiple Accounts in the AWS CloudTrail User Guide.

When logging is turned on, the following API actions are written to CloudTrail:

CloudWatch

• DeleteAlarms

• DescribeAlarmHistory

• DescribeAlarms

• DescribeAlarmsForMetric

• DisableAlarmActions

• EnableAlarmActions

• PutMetricAlarm

• SetAlarmState

The CloudWatch GetMetricStatistics, ListMetrics, and PutMetricData API actions are not supported.

For more information about all of these actions, see the Amazon CloudWatch API Reference.

CloudWatch Events

• DeleteRule

• DescribeRule

• DisableRule

• EnableRule

• ListRuleNamesByTarget

• ListRules

• ListTargetsByRule

• PutRule

• PutTargets

• RemoveTargets

• TestEventPattern

For more information about these actions, see the Amazon CloudWatch Events API Reference.

CloudWatch Logs

Request and response elements are logged for these API actions:

• CancelExportTask

• CreateExportTask

• CreateLogGroup

• CreateLogStream

• DeleteDestination

• DeleteLogGroup

• DeleteLogStream

• DeleteMetricFilter

• DeleteRetentionPolicy

• DeleteSubscriptionFilter

• PutDestination

• PutDestinationPolicy

• PutMetricFilter

• PutRetentionPolicy

• PutSubscriptionFilter

• TestMetricFilter

Only Request elements are logged for these API actions:

• DescribeDestinations

• DescribeExportTasks

• DescribeLogGroups

• DescribeLogStreams

• DescribeMetricFilters

• DescribeSubscriptionFilters

The CloudWatch Logs GetLogEvents, PutLogEvents, and FilterLogEvents API actions are not supported.

For more information about these actions, see the Amazon CloudWatch Logs API Reference.
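When auditing, it can help to know in advance what CloudTrail records for a given CloudWatch Logs action. The following Python sketch encodes the three lists above as sets and looks up an action name. The set names, the function name, and the returned labels are hypothetical and exist only for this example.

```python
# Actions for which request and response elements are logged (from the list above).
REQUEST_AND_RESPONSE = {
    "CancelExportTask", "CreateExportTask", "CreateLogGroup", "CreateLogStream",
    "DeleteDestination", "DeleteLogGroup", "DeleteLogStream", "DeleteMetricFilter",
    "DeleteRetentionPolicy", "DeleteSubscriptionFilter", "PutDestination",
    "PutDestinationPolicy", "PutMetricFilter", "PutRetentionPolicy",
    "PutSubscriptionFilter", "TestMetricFilter",
}

# Actions for which only request elements are logged.
REQUEST_ONLY = {
    "DescribeDestinations", "DescribeExportTasks", "DescribeLogGroups",
    "DescribeLogStreams", "DescribeMetricFilters", "DescribeSubscriptionFilters",
}

# Actions that are not supported by CloudTrail logging.
NOT_SUPPORTED = {"GetLogEvents", "PutLogEvents", "FilterLogEvents"}


def logs_action_coverage(action):
    """Return what CloudTrail records for a CloudWatch Logs action name."""
    if action in REQUEST_AND_RESPONSE:
        return "request and response"
    if action in REQUEST_ONLY:
        return "request only"
    if action in NOT_SUPPORTED:
        return "not supported"
    return "unknown"


print(logs_action_coverage("CreateExportTask"))
print(logs_action_coverage("PutLogEvents"))
```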


Understanding Log File Entries

CloudTrail log files contain one or more log entries. Each entry lists multiple JSON-formatted events.

A log entry represents a single request from any source and includes information about the requested action, the date and time of the action, request parameters, and so on. The log entries are not an ordered stack trace of the public API calls, so they do not appear in any specific order. Log file entries for all API actions are similar to the examples below.

The following log file entry shows that a user called the CloudWatch PutMetricAlarm action.

{
  "Records": [{
      "eventVersion": "1.01",
      "userIdentity": {
        "type": "Root",
        "principalId": "EX_PRINCIPAL_ID",
        "arn": "arn:aws:iam::123456789012:root",
        "accountId": "123456789012",
        "accessKeyId": "EXAMPLE_KEY_ID"
      },
      "eventTime": "2014-03-23T21:50:34Z",
      "eventSource": "monitoring.amazonaws.com",
      "eventName": "PutMetricAlarm",
      "awsRegion": "us-east-1",
      "sourceIPAddress": "127.0.0.1",
      "userAgent": "aws-sdk-ruby2/2.0.0.rc4 ruby/1.9.3 x86_64-linux Seahorse/0.1.0",
      "requestParameters": {
        "threshold": 50.0,
        "period": 60,
        "metricName": "CloudTrail Test",
        "evaluationPeriods": 3,
        "comparisonOperator": "GreaterThanThreshold",
        "namespace": "AWS/CloudWatch",
        "alarmName": "CloudTrail Test Alarm",
        "statistic": "Sum"
      },
      "responseElements": null,
      "requestID": "29184022-b2d5-11e3-a63d-9b463e6d0ff0",
      "eventID": "b096d5b7-dcf2-4399-998b-5a53eca76a27"
    },
    ..additional entries
  ]
}
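Because log entries do not appear in any specific order, a script that reads a log file usually has to sort them itself. The following Python sketch parses a log file with the Records layout shown above, keeps the entries whose eventSource is monitoring.amazonaws.com, and orders them by eventTime. The function name is hypothetical, and the second sample record is invented for illustration; only the first mirrors the entry above.

```python
import json


def cloudwatch_events(log_text):
    """Return (eventTime, eventName, caller type) tuples, oldest first."""
    records = json.loads(log_text)["Records"]
    selected = [
        r for r in records if r.get("eventSource") == "monitoring.amazonaws.com"
    ]
    # eventTime values share one UTC ISO 8601 format, so string order is time order.
    selected.sort(key=lambda r: r["eventTime"])
    return [
        (r["eventTime"], r["eventName"], r["userIdentity"]["type"]) for r in selected
    ]


# Sample input: the first record follows the entry above, the second is hypothetical.
sample = json.dumps({"Records": [
    {"eventTime": "2014-03-23T21:50:34Z", "eventSource": "monitoring.amazonaws.com",
     "eventName": "PutMetricAlarm", "userIdentity": {"type": "Root"}},
    {"eventTime": "2014-03-23T21:48:02Z", "eventSource": "monitoring.amazonaws.com",
     "eventName": "DescribeAlarms", "userIdentity": {"type": "IAMUser"}},
]})

for event in cloudwatch_events(sample):
    print(event)
```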

The following log file entry shows that a user called the CloudWatch Events PutRule action.

{
  "eventVersion":"1.03",
  "userIdentity":{
    "type":"Root",
    "principalId":"123456789012",
    "arn":"arn:aws:iam::123456789012:root",
    "accountId":"123456789012",
    "accessKeyId":"AKIAIOSFODNN7EXAMPLE",
    "sessionContext":{
      "attributes":{
        "mfaAuthenticated":"false",
        "creationDate":"2015-11-17T23:56:15Z"
      }
    }
  },
  "eventTime":"2015-11-18T00:11:28Z",
  "eventSource":"events.amazonaws.com",
  "eventName":"PutRule",
  "awsRegion":"us-east-1",
  "sourceIPAddress":"AWS Internal",
  "userAgent":"AWS CloudWatch Console",
  "requestParameters":{
    "description":"",
    "name":"cttest2",
    "state":"ENABLED",
    "eventPattern":"{\"source\":[\"aws.ec2\"],\"detail-type\":[\"EC2 Instance State-change Notification\"]}",
    "scheduleExpression":""
  },
  "responseElements":{
    "ruleArn":"arn:aws:events:us-east-1:123456789012:rule/cttest2"
  },
  "requestID":"e9caf887-8d88-11e5-a331-3332aa445952",
  "eventID":"49d14f36-6450-44a5-a501-b0fdcdfaeb98",
  "eventType":"AwsApiCall",
  "apiVersion":"2015-10-07",
  "recipientAccountId":"123456789012"
}
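In the entry above, eventPattern is a JSON document stored as an escaped string inside requestParameters, so it needs a second decoding step before it can be inspected. The following Python sketch shows one way to do that. The function name is hypothetical, and returning None for an empty pattern is an assumption made for this example.

```python
import json


def decode_event_pattern(request_parameters):
    """Decode the eventPattern string of a PutRule entry into a dict."""
    raw = request_parameters.get("eventPattern", "")
    if not raw:
        # Assumption: treat a missing or empty pattern as "no pattern".
        return None
    return json.loads(raw)


# requestParameters values copied from the PutRule entry above.
request_parameters = {
    "name": "cttest2",
    "state": "ENABLED",
    "eventPattern": "{\"source\":[\"aws.ec2\"],\"detail-type\":[\"EC2 Instance State-change Notification\"]}",
    "scheduleExpression": "",
}

pattern = decode_event_pattern(request_parameters)
print(pattern["source"])
print(pattern["detail-type"])
```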

The following log file entry shows that a user called the CloudWatch Logs CreateExportTask action.

{
  "eventVersion": "1.03",
  "userIdentity": {
    "type": "IAMUser",
    "principalId": "EX_PRINCIPAL_ID",
    "arn": "arn:aws:iam::123456789012:user/someuser",
    "accountId": "123456789012",
    "accessKeyId": "AKIAIOSFODNN7EXAMPLE",
    "userName": "someuser"
  },
  "eventTime": "2016-02-08T06:35:14Z",
  "eventSource": "logs.amazonaws.com",
  "eventName": "CreateExportTask",
  "awsRegion": "us-east-1",
  "sourceIPAddress": "127.0.0.1",
  "userAgent": "aws-sdk-ruby2/2.0.0.rc4 ruby/1.9.3 x86_64-linux Seahorse/0.1.0",
  "requestParameters": {
    "destination": "yourdestination",
    "logGroupName": "yourloggroup",
    "to": 123456789012,
    "from": 0,
    "taskName": "yourtask"
  },
  "responseElements": {
    "taskId": "15e5e534-9548-44ab-a221-64d9d2b27b9b"
  },
  "requestID": "1cd74c1c-ce2e-12e6-99a9-8dbb26bd06c9",
  "eventID": "fd072859-bd7c-4865-9e76-8e364e89307c",
  "eventType": "AwsApiCall",
  "apiVersion": "20140328",
  "recipientAccountId": "123456789012"
}
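The from and to request parameters of CreateExportTask are expressed in milliseconds since January 1, 1970 00:00:00 UTC, which is hard to read in a log entry. The following Python sketch converts them to timezone-aware datetimes. The function name is hypothetical, and the input values are the placeholder values from the entry above.

```python
from datetime import datetime, timezone


def export_window(request_parameters):
    """Convert the from/to millisecond timestamps of an export task to UTC datetimes."""
    start = datetime.fromtimestamp(request_parameters["from"] / 1000, tz=timezone.utc)
    end = datetime.fromtimestamp(request_parameters["to"] / 1000, tz=timezone.utc)
    return start, end


# Placeholder values copied from the CreateExportTask entry above.
start, end = export_window({"from": 0, "to": 123456789012})
print(start.isoformat())
print(end.isoformat())
```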



Document History

The following table describes the important changes to the Amazon CloudWatch User Guide.

Change | Description | Release Date

NAT gateway metrics | Added metrics for Amazon VPC NAT gateway. For more information, see Amazon VPC NAT Gateway Metrics and Dimensions (p. 166). | 7 September 2017

High-resolution metrics | You can now optionally set up custom metrics as high-resolution metrics, with a granularity of as low as one second. For more information, see High-Resolution Metrics (p. 42). | 26 July 2017

Dashboard APIs | You can now create, modify, and delete dashboards using APIs and the AWS CLI. For more information, see Create a CloudWatch Dashboard (p. 19). | 6 July 2017

AWS Direct Connect metrics | Added metrics for AWS Direct Connect. For more information, see AWS Direct Connect Metrics and Dimensions (p. 58). | 29 June 2017

Amazon VPC VPN metrics | Added metrics for Amazon VPC VPN. For more information, see Amazon VPC VPN Metrics and Dimensions. | 15 May 2017

AppStream 2.0 metrics | Added metrics for AppStream 2.0. For more information, see AppStream 2.0 Metrics and Dimensions. | 8 March 2017

CloudWatch console color picker | You can now choose the color for each metric on your dashboard widgets. For more information, see Edit a Graph on a CloudWatch Dashboard (p. 21). | 27 February 2017

Alarms on dashboards | Alarms can now be added to dashboards. For more information, see Add or Remove an Alarm from a CloudWatch Dashboard (p. 23). | 15 February 2017

Added metrics for Amazon Polly | Added metrics for Amazon Polly. For more information, see Amazon Polly Metrics (p. 139). | 1 December 2016


Added metrics for Amazon Kinesis Analytics | Added metrics for Amazon Kinesis Analytics. For more information, see Amazon Kinesis Analytics Metrics (p. 121). | 1 December 2016

Added support for percentile statistics | You can specify any percentile, using up to two decimal places (for example, p95.45). For more information, see Percentiles (p. 7). | 17 November 2016

Added metrics for Amazon Simple Email Service | Added metrics for Amazon Simple Email Service. For more information, see Amazon Simple Email Service. | 2 November 2016

Updated metrics retention | Amazon CloudWatch now retains metrics data for 15 months instead of 14 days. | 1 November 2016

Updated metrics console interface | The CloudWatch console is updated with improvements to existing functionality and new functionality. | 1 November 2016

Added metrics for Amazon Elastic Transcoder | Added metrics for Amazon Elastic Transcoder. For more information, see Amazon Elastic Transcoder. | 20 September 2016

Added metrics for Amazon API Gateway | Added metrics for Amazon API Gateway. For more information, see Amazon API Gateway Metrics and Dimensions. | 9 September 2016

Added metrics for AWS Key Management Service | Added metrics for AWS Key Management Service. For more information, see AWS Key Management Service. | 9 September 2016

Added metrics for the new Application Load Balancers supported by Elastic Load Balancing | Added metrics for Application Load Balancers. For more information, see Elastic Load Balancing Metrics and Dimensions (p. 89). | 11 August 2016

Added new NetworkPacketsIn and NetworkPacketsOut metrics for Amazon EC2 | Added new NetworkPacketsIn and NetworkPacketsOut metrics for Amazon EC2. For more information, see Amazon EC2 Metrics and Dimensions. | 23 March 2016

Added new metrics for Amazon EC2 Spot fleet | Added new metrics for Amazon EC2 Spot fleet. For more information, see Amazon EC2 Spot Fleet Metrics and Dimensions (p. 74). | 21 March 2016

Added new CloudWatch Logs metrics | Added new CloudWatch Logs metrics. For more information, see Amazon CloudWatch Logs Metrics and Dimensions (p. 56). | 10 March 2016


Added Amazon Elasticsearch Service and AWS WAF metrics and dimensions | Added Amazon Elasticsearch Service and AWS WAF metrics and dimensions. For more information, see Amazon Elasticsearch Service Metrics and Dimensions (p. 107) and AWS WAF Metrics and Dimensions. | 14 October 2015

Added support for CloudWatch dashboards | Dashboards are customizable home pages in the CloudWatch console that you can use to monitor your resources in a single view, even those that are spread out across different regions. For more information, see Using Amazon CloudWatch Dashboards (p. 18). | 8 October 2015

Added AWS Lambda metrics and dimensions | Added AWS Lambda metrics and dimensions. For more information, see AWS Lambda Metrics and Dimensions. | 4 September 2015

Added Amazon EC2 Container Service metrics and dimensions | Added Amazon EC2 Container Service metrics and dimensions. For more information, see Amazon ECS. | 17 August 2015

Added Amazon Simple Storage Service metrics and dimensions | Added Amazon Simple Storage Service metrics and dimensions. For more information, see Amazon Simple Storage Service Metrics and Dimensions. | 26 July 2015

New feature: Reboot alarm action | Added the reboot alarm action and new IAM role for use with alarm actions. For more information, see Create Alarms to Stop, Terminate, Reboot, or Recover an Instance (p. 185). | 23 July 2015

Added Amazon WorkSpaces metrics and dimensions | Added Amazon WorkSpaces metrics and dimensions. For more information, see Amazon WorkSpaces. | 30 April 2015

Added Amazon Machine Learning metrics and dimensions | Added Amazon Machine Learning metrics and dimensions. For more information, see Amazon Machine Learning Metrics and Dimensions (p. 134). | 9 April 2015

New feature: Amazon EC2 instance recovery alarm actions | Updated alarm actions to include new EC2 instance recovery action. For more information, see Create Alarms to Stop, Terminate, Reboot, or Recover an Instance (p. 185). | 12 March 2015

Added Amazon CloudFront and Amazon CloudSearch metrics and dimensions | Added Amazon CloudFront and Amazon CloudSearch metrics and dimensions. For more information, see Amazon CloudFront Metrics and Dimensions (p. 53) and Amazon CloudSearch Metrics and Dimensions (p. 54). | 6 March 2015

Added Amazon Simple Workflow Service metrics and dimensions | Added Amazon Simple Workflow Service metrics and dimensions. For more information, see Amazon SWF. | 9 May 2014


Updated guide to add support for AWS CloudTrail | Added a new topic to explain how you can use AWS CloudTrail to log activity in Amazon CloudWatch. For more information, see Logging Amazon CloudWatch API Calls in AWS CloudTrail (p. 211). | 30 April 2014

Updated guide to use the new AWS Command Line Interface (AWS CLI) | The AWS CLI is a cross-service CLI with a simplified installation, unified configuration, and consistent command line syntax. The AWS CLI is supported on Linux/Unix, Windows, and Mac. The CLI examples in this guide have been updated to use the new AWS CLI. For information about how to install and configure the new AWS CLI, see Getting Set Up with the AWS Command Line Interface in the AWS Command Line Interface User Guide. | 21 February 2014

Added Amazon Redshift and AWS OpsWorks metrics and dimensions | Added Amazon Redshift and AWS OpsWorks metrics and dimensions. For more information, see Amazon Redshift Metrics and Dimensions (p. 140) and AWS OpsWorks Metrics and Dimensions (p. 135). | 16 July 2013

Added Amazon Route 53 metrics and dimensions | Added Amazon Route 53 metrics and dimensions. For more information, see Amazon Route 53 Metrics and Dimensions. | 26 June 2013

New feature: Amazon CloudWatch Alarm Actions | Added a new section to document Amazon CloudWatch alarm actions, which you can use to stop or terminate an Amazon Elastic Compute Cloud instance. For more information, see Create Alarms to Stop, Terminate, Reboot, or Recover an Instance (p. 185). | 8 January 2013

Updated EBS metrics | Updated the EBS metrics to include two new metrics for Provisioned IOPS volumes. For more information, see Amazon EBS Metrics and Dimensions (p. 85). | 20 November 2012

New billing alerts | You can now monitor your AWS charges using Amazon CloudWatch metrics and create alarms to notify you when you have exceeded the specified threshold. For more information, see Create a Billing Alarm to Monitor Your Estimated AWS Charges (p. 191). | 10 May 2012

New metrics | You can now access six new Elastic Load Balancing metrics that provide counts of various HTTP response codes. For more information, see Elastic Load Balancing Metrics and Dimensions (p. 89). | 19 October 2011

New feature | You can now access metrics from Amazon EMR. For more information, see Amazon EMR Metrics and Dimensions. | 30 June 2011


New feature | You can now access metrics from Amazon Simple Notification Service and Amazon Simple Queue Service. For more information, see Amazon Simple Notification Service Metrics and Dimensions (p. 148) and Amazon SQS Metrics and Dimensions (p. 149). | 14 July 2011

New feature | Added information about using the PutMetricData API to publish custom metrics. For more information, see Publish Custom Metrics (p. 42). | 10 May 2011

Updated metrics retention | Amazon CloudWatch now retains the history of an alarm for two weeks rather than six weeks. With this change, the retention period for alarms matches the retention period for metrics data. | 07 April 2011

New feature | Added ability to send Amazon Simple Notification Service or Auto Scaling notifications when a metric has crossed a threshold. For more information, see Alarms (p. 7). | 02 December 2010

New feature | A number of CloudWatch actions now include the MaxRecords and NextToken parameters, which enable you to control pages of results to display. | 02 December 2010

New feature | This service now integrates with AWS Identity and Access Management (IAM). | 02 December 2010