AWS Elastic Compute Cloud (EC2) is the grand-pappy of cloud computing products. In a nutshell, EC2 enables you to create any number of virtual computers with a huge variety of software preloaded on images.
There are several purchase options for EC2:
On-Demand Instances - need an instance? Rent an instance. Most expensive.
Spot Instances - you set a max price and if the prices rises about your instance gets nuked; A Spot instance runs as long as its capacity is available and your bid price is higher than the Spot price… may get terminated. If Amazon terminates then you don’t get charged for partial; if you terminate you DO get charged.
Dedicated Hosts - an entire physical server in an AWS datacenter dedicated to your use with targeted placement; enables visibility of cores, sockets and host ID and enables you to run MS, SUSE, REHL and other software bound to VMs, sockets or physical cores. Detailed reporting for the instance is provided by AWS Config.
Dedicated Instances - Instances that run on hardware that is dedicated to a single user.
Free Instance - free, yet small, and mostly
Reserved Instances have changed a lot in the last half of 2016. But before that, let’s talk about the easy stuff first. You can still get a contact for RI with 1 or 3 year terms and the best deal is still the all up front payment method. There is a marketplace for selling them if needed.
Scheduled RI - predictable reserved capacity for parts of a day, week or month
Standard RI - up to 75% off on-demand pricing; it is possible to modify the AZ, network platform, or instance size not instance type or region; changes can be made to the RI based if the Normalization Factor, which is the “Footprint” of the instance, is the same or less. Often is a better approach to buy more capacity than you need and distribute it across instances. Does not apply to SUSE or RHEL.
Converible RI - up to 45% off on-demand pricing but can be converted to the same or more expensive instance type and change OS, tenancy & payment options; can’t sell on the Reserve MarketPlace yet.
Each instance type offers different compute, memory, and storage capabilities and are grouped in instance families based on these capabilities. Instances can also be classified as burstable or consistent. The new mnemonic is DR. MCGIFT PX
|R||RAM||Memory Intensive Apps/DB|
|C||Compute||CPU intensive apps|
|G||Graphics||Video Encoding/3D Apps|
|T||Burstable Cheapness||Web servers/Small DB|
|P||Picts (graphics)||Machine Learning, Bitcoin|
|X||Extreme memory||HANA, Spark|
There are two types of storage-centric instance families:
D - Dense storage; mechanical disks; data warehousing & distributed file systems like Hadoop rock here; log or data processing too
I - IOPS storage; lots of SSD storage; big data or DB
T2 instances offer moderate baseline with burstable CPU using HVM virtualization and EBS root volume only. On-demand and reserved purchase options only.
Each instance size in the T2 family has an initial CPU credit and earns more each hours with a max credit balance but has a lower than full CPU baseline performance. Earned credit expire after 24 hours.
Because CPU can burn through credits and then will max out CPU quickly, T2 isn’t a great choice for ASG, at least using it with alarms based on CPU usage.
There are two types of memory optimzed instances:
R(am) family - For databases, HPC or Electronic Design Automation; R3 offers local SSD instance store
X family - for in-memory databases, like SAP HANA, Apache Spark or Presto and include Intel Scalable Memory Buffers, providing 300 GiB/s of sustainable memory-read bandwidth and 140 GiB/s of sustainable memory-write bandwidth
M(ain) family is the main choice or general purpose with consistent performance; m3 offer instance stores; m4 uses EBS optimized stores as the default
The C family is compute/processor optimized for batch processing, transcoding, web/app servers & HPC. C4 is EBS only and EBS optimized with require both HVM and VPC. Both C3 and C4 offer Intel 82599 VF advanced network and placement groups.
There are three advanced accelerated computing types:
F - uses Xilinx FPGA; uses AFI instead of AMIs
P - uses NVIDIA Tesla K80 GPUs and are designed for general purpose GPU computing using the CUDA or OpenCL programming model; for deep learning, graph databases, high performance databases, computational fluid dynamics, computational finance, seismic analysis, molecular modeling, genomics, rendering, and other server-side GPU compute workloads
G - Graphics and GPU optimized; good for video encoding; SSD instance store; 3D visualizations, graphics-intensive remote workstations, 3D rendering, video encoding, virtual reality, and other server-side graphics; does not support SR-IOV enhanced networking
A placement group is a logical group of instances within a single AZ that participate in a low latency 10Gbps network. The name for the group must be unique within your account and only certain types of instances can be launched in a placement group. Ideally only ONE type of instance per placement group. You can’t merge groups or move instances into a group so provision your placement group for the peak size of the placement. Placement groups can span subnets.
There are numerous limitations with placement groups:
Placement groups can’t span AZ; they can span VPC but might not have full bandwidth
Placement groups can be merged;
Can’t add an existing instance to a placement group
Enhanced networking uses SR-IOV (which reduces CPU interupts therefore improving overall performance). Some instances use Intel 82599 Virtual Function (10 Gbps) or Elastic Network Adapter (20 Gbps). This feature requires both an HVM AMI and the kernel module ixgbevf. Amazon Linux has ixgbevf on by default but other distros require configuration. Generally HPC requires enhanced networking require jumbo frames (up to 9000 bytes of data) instead of ethernet frames of 1500 bytes. C3/C4, D2, I2, M4, and R3 all support enhanced networking.
EBS-optimized instances enable you to get consistently high performance for your EBS volumes by eliminating contention between Amazon EBS I/O and other network traffic from your instance with options between 500 Mbps and 12,000 Mbps, depending on the instance type you use.
Instance storage - physically attached to the host but ephemeral; Root devices are deleted by default with the instance is terminated but this can be changed.
Tenancy for instances is follows:
Default - shared hardware; can’t be changed after launch
Dedicated - single tenant hardware; can be changed to Host
Host - Instance runs on dedicated host; can be changed to Dedicated
If the VPC is setup dedicated, then all instances are dedicated; can’t be changed at launch. So the tenancy of an instance can only be change between variants of dedicated tenancy hosting.
Performance for EBS is measured in IOPS which is a complicated metric. For SSDs, IOPS are measured in 256 Kibibytes, because they are so good a small, non-sequential reads. For HDDs, IOPS are measured in 1024 Kibibyte chunks, because the have better performance with large sequential reads.
Root devices are not encrypted and some instance types are EBS only.
Multiple IP addresses can be assigned to an interface for use cases including multiple SSL certs (one IP required per cert), might need to move one of the addresses, multiple subnets.
SG are attached to an interface NOT and IP address. You can’t move a primary interface but can move secondary interfaces. ENI have a fixed MAC address.
HVM - Emulates a bare-metal experience for virtualization; originally very slow but improved after CPU vendors added support for virtualization instructions and soon network & I/O as well. Enhanced networking features generally require HVM. Can run for Windows.
PV - Short for ParaVirtualization is the previous version of virtualization. Uses a special boot loader called PV-GRUB and does not run Windows. In general, poor performance is the name of the game yet attactive spot pricing can be attractive.
Status checks run every 5 minutes.
System Status Checks - Checks the host. generally networking, power, hardware and system software problems that can be resolved by stopping and restarting the instance or contacting AWS. Stop and start the VM to fix this issue.
Instance Status Checks - Checks the instance. configuration problems, out of memory, corrupted file system, or incompatible kernel. AWS does NOT help you with these sort of problems. Reboot the machine to fix these issues.
EC2 instances report CloudWatch metric every 5 minutes; selecting the “Enable CloudWatch detailed monitoring” to see the metrics in 1 minute intervals. Even if you have more granular data, CloudWatch rolls the data up into 1 minute intervals. Unlike many instance options, you can enable this feature after the instance starts. Autoscaling group metric aggregation is also reportable using detailed monitoring.
CPUUtilization, network (in/out), disk (read/write - ops and bytes) and status checks are reported to CloudWatch by default. RAM & diskspace is a custom metric.
CPUUtilization - duh
DiskReadOps & DiskWriteOps - total operations (not bytes) to and from disk (instance store)
DiskReadBytes & DiskWriteBytes - total bytes to and from the disk (instance store)
NetworkIn & NetworkOut - total bytes to and from the instance over the wire
NetworkPacketsIn & NetworkPacketsOut - number of packets in and out instead of bytes
StatusCheckFailed_Instance & StatusCheckFailed_System - has anything failed an instance or system check?
StatusCheckFailed - combination of both status checks… 1 = yes; 0 = no
EC2 Metric Dimensions & Custom Metrics
These metrics, except for AutoScalingGroupName, are only available with detailed monitoring therefore update every minutes vs the normal 5 minute interval. The metrics include: AutoScalingGroupName, ImageId, InstanceId, InstanceType
The big reason to use custom metrics and to push metrics to cloudwatch in general is the aggregation of data in CloudWatch instead of all over the place.. EC2 instances need a role to write information to CloudWatch.