Updated at 2016-01-07 20:18

Auto scaling groups (ASG) help you manage clusters of identical instances. You can set a group to automatically increase the number of servers based on your application load and to replace unhealthy instances. A setup consists of two parts:

  • An auto scaling group that maintains the instances and their number.
  • A launch configuration that specifies what kind of instances to use and how they are started.

An auto scaling group is a regional construct. A single auto scaling group can have instances in multiple availability zones of its region.

There are three ways to give the setup a public address. The first two only suit a single-server setup; only the ELB approach works if you have more than one instance in the group.

  • Allocate an Elastic IP that you associate with the instance on bootstrap.
  • Maintain a DNS entry that points to the current IP address of the instance.
  • Add an Elastic Load Balancer to the auto scaling group.

Scale up faster than you scale down. This protects you from fluctuation in the desired instance count, which saves money. For example, allow a scale-up every 5 minutes and a scale-down every 10 minutes.
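One way to express this asymmetry with simple scaling policies is a per-policy cooldown. The sketch below is an assumption about how you might set it up, not the only way: the policy names ScaleUpFast and ScaleDownSlow are hypothetical, and the calls are wrapped in a function so the group name can be passed in. The --cooldown value is in seconds.

```shell
# Sketch: allow a scale-up every 5 minutes and a scale-down every 10 minutes.
# The policy names are hypothetical. $1 = auto scaling group name.
put_asymmetric_policies() {
    # Add one instance, then wait 300 s before the next scaling activity.
    aws autoscaling put-scaling-policy \
        --policy-name ScaleUpFast \
        --auto-scaling-group-name "$1" \
        --scaling-adjustment 1 \
        --adjustment-type ChangeInCapacity \
        --cooldown 300

    # Remove one instance, then wait 600 s before the next scaling activity.
    aws autoscaling put-scaling-policy \
        --policy-name ScaleDownSlow \
        --auto-scaling-group-name "$1" \
        --scaling-adjustment=-1 \
        --adjustment-type ChangeInCapacity \
        --cooldown 600
}

# Usage: put_asymmetric_policies <GROUP_ID>
```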

The most important settings of an auto scaling group are:

  • Launch Configuration: template for the started instances. Image to be used, instance type, key pair, security group etc.
  • Health Check Strategy: how do we determine that an instance is healthy? Is it enough that the instance is just running, or does it need to respond when /health is called over HTTPS?
  • Minimum Size: minimum number of healthy instances that must be running.
  • Maximum Size: maximum number of instances that may be running.
  • Desired Capacity: how many instances do we want to be running right now?
  • Instance Count: how many instances are we currently running?
  • Scaling Plan: specifies whether you scale manually, on a schedule or based on policies.
  • Scaling Policies: rules that say "when alarm X fires, change the desired count by Y". One auto scaling group usually has multiple policies, e.g. for adding and removing instances.
  • Alarms: CloudWatch alarms, which fire when monitoring detects a specific event or pattern.
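Several of these settings map directly to flags of aws autoscaling create-auto-scaling-group. The following is a minimal sketch, assuming a launch configuration already exists. The sizes, the grace period and the function name are illustrative values, not recommendations.

```shell
# Sketch: create a group with explicit size limits and an EC2 health check.
# $1 = group name, $2 = launch configuration name,
# $3 = comma-separated subnet IDs (may span several availability zones).
create_worker_group() {
    aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name "$1" \
        --launch-configuration-name "$2" \
        --min-size 1 \
        --max-size 4 \
        --desired-capacity 2 \
        --health-check-type EC2 \
        --health-check-grace-period 300 \
        --vpc-zone-identifier "$3"
}

# Usage: create_worker_group <GROUP_ID> <LAUNCH_CONFIG_NAME> <SUBNET_IDS>
```

Use --health-check-type ELB instead if the group sits behind a load balancer and should follow its health check.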

Define your auto scaling groups as CloudFormation templates. That makes them a lot easier to replicate later.

Queue Based Auto Scaling

This example has several problems regarding optimal use of EC2 instance resources, so use it with care. It might not work well in your use case.

Create scaling policies that tell the group what to do when a condition triggers. You can find the auto scaling group name under EC2 > Auto Scaling Groups.

aws autoscaling put-scaling-policy \
    --policy-name WorkerSqsScaleUpPolicy \
    --auto-scaling-group-name <GROUP_ID> \
    --scaling-adjustment 1 \
    --adjustment-type ChangeInCapacity

aws autoscaling put-scaling-policy \
    --policy-name WorkerSqsZeroPolicy \
    --auto-scaling-group-name <GROUP_ID> \
    --scaling-adjustment 0 \
    --adjustment-type ExactCapacity
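The alarms further down need the ARN of each policy. put-scaling-policy prints it when the policy is created; if you did not save it, a lookup along these lines should recover it. The function name policy_arn is hypothetical.

```shell
# Sketch: print the ARN of an existing scaling policy.
# $1 = auto scaling group name, $2 = policy name.
policy_arn() {
    aws autoscaling describe-policies \
        --auto-scaling-group-name "$1" \
        --policy-names "$2" \
        --query 'ScalingPolicies[0].PolicyARN' \
        --output text
}

# Usage:
#   SCALE_UP_POLICY_ARN="$(policy_arn <GROUP_ID> WorkerSqsScaleUpPolicy)"
#   SCALE_ZERO_POLICY_ARN="$(policy_arn <GROUP_ID> WorkerSqsZeroPolicy)"
```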

Configure the instances to enable termination protection through the AWS API while they are processing work, so that termination is postponed until an instance has stopped working.
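The notes do not say which API call is meant. One candidate is Auto Scaling's instance scale-in protection, which keeps the group from terminating a protected instance when it scales in. The sketch below assumes that mechanism; the function names are hypothetical, and the worker is expected to know its own instance ID.

```shell
# Sketch: toggle scale-in protection around a unit of work.
# $1 = auto scaling group name, $2 = instance ID.
protect_instance() {
    aws autoscaling set-instance-protection \
        --auto-scaling-group-name "$1" \
        --instance-ids "$2" \
        --protected-from-scale-in
}

unprotect_instance() {
    aws autoscaling set-instance-protection \
        --auto-scaling-group-name "$1" \
        --instance-ids "$2" \
        --no-protected-from-scale-in
}

# Usage inside a worker:
#   protect_instance <GROUP_ID> <INSTANCE_ID>
#   ... process messages until the queue is empty ...
#   unprotect_instance <GROUP_ID> <INSTANCE_ID>
```

With this approach the WorkerSqsZeroPolicy can set the desired capacity to 0 while busy instances keep running until they remove their own protection.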

Create CloudWatch Alarms to act as conditions to trigger the policies.

# Add a new instance every minute as long as there are 1 or more messages in the queue.
aws cloudwatch put-metric-alarm \
    --alarm-name AddCapacityToProcessQueue \
    --metric-name ApproximateNumberOfMessagesVisible \
    --namespace "AWS/SQS" \
    --statistic Average \
    --period 60 \
    --evaluation-periods 1 \
    --threshold 1 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --dimensions Name=QueueName,Value=<SQS_QUEUE_NAME> \
    --alarm-actions <SCALE_UP_POLICY_ARN>

# Set the group size to zero once the queue has been empty for 10 minutes.
aws cloudwatch put-metric-alarm \
    --alarm-name RemoveCapacityFromProcessQueue \
    --metric-name ApproximateNumberOfMessagesVisible \
    --namespace "AWS/SQS" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 2 \
    --threshold 0 \
    --comparison-operator LessThanOrEqualToThreshold \
    --dimensions Name=QueueName,Value=<SQS_QUEUE_NAME> \
    --alarm-actions <SCALE_ZERO_POLICY_ARN>

# Verify that the alarms are in place.
aws cloudwatch describe-alarms \
    --alarm-names AddCapacityToProcessQueue RemoveCapacityFromProcessQueue

# Verify that the policies are in place.
aws autoscaling describe-policies \
    --auto-scaling-group-name <GROUP_ID>

# Send a few messages to the queue and see if it works.
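
A small helper for that last step might look like the following. It is a sketch only: the function name and the message bodies are made up, and it takes the queue URL rather than the queue name.

```shell
# Sketch: push a handful of test messages into the queue.
# $1 = SQS queue URL, $2 = number of messages (defaults to 3).
send_test_messages() {
    count="${2:-3}"
    i=1
    while [ "$i" -le "$count" ]; do
        aws sqs send-message \
            --queue-url "$1" \
            --message-body "test message $i"
        i=$((i + 1))
    done
}

# Usage: send_test_messages <SQS_QUEUE_URL> 5
```

After sending, the AddCapacityToProcessQueue alarm should move to the ALARM state once CloudWatch has picked up the queue metric, which can take a few minutes.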