Replacing EC2 On-Demand Instances With New Spot Instances

with an SMS text warning two minutes before interruption, using CloudWatch Events Rules And SNS

The EC2 Spot instance marketplace has had a number of enhancements in the last couple months that have made it more attractive for more use cases. Improvements include:

  • You can run an instance like you normally do for on-demand instances and add one option to make it a Spot instance! The instance starts up immediately if your bid price is sufficient given spot market conditions, and will generally cost much less than on-demand.

  • Spot price volatility has been significantly reduced. Spot prices are now based on long-term trends in supply and demand instead of hour-to-hour bidding wars. This means that instances are much less likely to be interrupted because of short-term spikes in Spot prices, leading to much longer running instances on average.

  • You no longer have to specify a bid price. The Spot Request will default to the instance type’s on-demand price in that region. This saves looking up pricing information and is a reasonable default if you are using Spot to save money over on-demand.

  • CloudWatch Events can now send a two-minute warning before a Spot instance is interrupted, through email, text, AWS Lambda, and more.

Putting these all together makes it easy to take instances you formerly ran on-demand and add an option to turn them into new Spot instances. They are much less likely to be interrupted than with the old spot market, and you can save a little to a lot in hourly costs, depending on the instance type, region, and availability zone.

Plus, you can get a warning a couple minutes before the instance is interrupted, giving you a chance to save work or launch an alternative. This warning could be handled by code (e.g., AWS Lambda) but this article is going to show how to get the warning by email and by SMS text message to your phone.
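As a rough illustration of handling the warning in code, here is a minimal sketch that runs on the Spot instance itself and polls the EC2 instance metadata service, which is a different mechanism from the CloudWatch Events approach used in the rest of this article. It assumes the instance allows IMDSv2 token requests; the function names and the five-second polling interval are made up for this example.

```shell
# Sketch only: run this on the Spot instance itself.
imds=http://169.254.169.254/latest

# Print the pending instance-action JSON and return 0 if an interruption
# is scheduled; return non-zero otherwise (the endpoint answers 404 then).
spot_interruption_pending() {
  local token
  token=$(curl -s -f -X PUT "$imds/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 60") || return 2
  curl -s -f "$imds/meta-data/spot/instance-action" \
    -H "X-aws-ec2-metadata-token: $token"
}

# Hypothetical handler loop: poll every 5 seconds, then run the given
# command once an interruption notice appears.
watch_for_interruption() {
  until spot_interruption_pending >/dev/null; do
    sleep 5
  done
  "$@"
}

# Example (the script path is a placeholder):
#   watch_for_interruption /path/to/save-my-work.sh
```

Polling from the instance keeps the reaction local, which may suit "save my work" style handlers; the SNS approach below is better suited to telling a human.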

WARNING!

You should not run a Spot instance unless you can withstand having the instance stopped for a while from time to time.

Make sure you can easily start a replacement instance if the Spot instance is stopped or terminated. This probably includes regularly storing important data outside of the Spot instance (e.g., S3).

You cannot currently re-start a stopped or hibernated Spot instance manually, though the Spot market may re-start it automatically if you configured it with interruption behavior “stop” (or “hibernate”) and if the Spot price comes back down below your max bid.

If you can live with these conditions and risks, then perhaps give this approach a try.

Start An EC2 Instance With A Spot Request

An aws-cli command to launch an EC2 instance can be turned into a Spot Request by adding a single parameter: --instance-market-options ...

The option parameters we will use do not specify a max bid, so it defaults to the on-demand price for the instance type in the region. We specify “stop” and “persistent” so that the instance will be restarted automatically if it is interrupted temporarily by a rising Spot market price that then comes back down.

Adjust the following options to suit. The important part for this example is the instance market options.

ami_id=ami-c62eaabe # Ubuntu 16.04 LTS Xenial HVM EBS us-west-2 (as of post date)
region=us-west-2
instance_type=t2.small
instance_market_options="MarketType='spot',SpotOptions={InstanceInterruptionBehavior='stop',SpotInstanceType='persistent'}"
instance_name="Temporary Demo $(date +'%Y-%m-%d %H:%M')"

instance_id=$(aws ec2 run-instances \
  --region "$region" \
  --instance-type "$instance_type" \
  --image-id "$ami_id" \
  --instance-market-options "$instance_market_options" \
  --tag-specifications \
    'ResourceType=instance,Tags=[{Key="Name",Value="'"$instance_name"'"}]' \
  --output text \
  --query 'Instances[*].InstanceId')
echo instance_id=$instance_id
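The launch above leaves the maximum price at its default. As a minimal sketch, assuming you would rather cap the price explicitly, the Spot options also accept a MaxPrice field. The helper name spot_market_options and the example price are made up for illustration.

```shell
# Build the --instance-market-options value, optionally with a max price.
# Usage: spot_market_options [max_price_usd_per_hour]
spot_market_options() {
  local max_price=${1:-}
  local spot_options="InstanceInterruptionBehavior=stop,SpotInstanceType=persistent"
  if [ -n "$max_price" ]; then
    spot_options="MaxPrice=$max_price,$spot_options"
  fi
  echo "MarketType=spot,SpotOptions={$spot_options}"
}

# Example (the price is illustrative only):
#   instance_market_options=$(spot_market_options 0.01)
```

Calling the helper with no argument produces the same settings used above, so the on-demand price remains the default cap.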

Other options can be added as desired. For example, specify an ssh key for the instance with an option like:

  --key-name "$USER"

and a user-data script with:

  --user-data file:///path/to/user-data-script.sh

If there is capacity, the instance will launch immediately and be available quickly. It can be used like any other instance that is launched outside of the Spot market. However, this instance has the risk of being stopped, so make sure you are prepared for this.
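To confirm that the launch really produced a Spot instance, something like the following sketch may help. It waits for the instance to reach the running state and then prints its lifecycle, Spot Request id, and state. The function name show_spot_instance is an assumption for this example.

```shell
# Wait for the instance to be running, then print one tab-separated line:
# lifecycle ("spot" for Spot instances), Spot Request id, and state name.
show_spot_instance() {
  local region=$1 instance_id=$2
  aws ec2 wait instance-running \
    --region "$region" \
    --instance-ids "$instance_id" || return
  aws ec2 describe-instances \
    --region "$region" \
    --instance-ids "$instance_id" \
    --output text \
    --query 'Reservations[].Instances[].[InstanceLifecycle,SpotInstanceRequestId,State.Name]'
}

# Example:
#   show_spot_instance "$region" "$instance_id"
```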

The next section presents a way to get the early warning before the instance is interrupted.

CloudWatch Events Two-Minute Warning For Spot Interruption

As mentioned above, Amazon recently released a feature where CloudWatch Events will send a two-minute warning before a Spot instance is interrupted. This section shows how to get that warning sent to an email address and/or SMS text to a phone number.

Create an SNS topic to receive Spot instance activity notices:

sns_topic_name=spot-activity

sns_topic_arn=$(aws sns create-topic \
  --region "$region" \
  --name "$sns_topic_name" \
  --output text \
  --query 'TopicArn'
)
echo sns_topic_arn=$sns_topic_arn

Subscribe an email address to the SNS topic:

email_address="YOUR@EMAIL.ADDRESS"

aws sns subscribe \
  --region "$region" \
  --topic-arn "$sns_topic_arn" \
  --protocol email \
  --notification-endpoint "$email_address"

IMPORTANT! Go to your email inbox now and click the link to confirm that you want to subscribe that email address to the SNS topic.
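If you are not sure whether the confirmation went through, a sketch like this can check from the command line. An unconfirmed subscription is listed with "PendingConfirmation" in place of a subscription ARN. The function names are assumptions for this example.

```shell
# List each subscription on the topic: protocol, endpoint, subscription ARN.
list_topic_subscriptions() {
  local region=$1 topic_arn=$2
  aws sns list-subscriptions-by-topic \
    --region "$region" \
    --topic-arn "$topic_arn" \
    --output text \
    --query 'Subscriptions[].[Protocol,Endpoint,SubscriptionArn]'
}

# Succeed only when no subscription is still waiting for confirmation.
all_subscriptions_confirmed() {
  local subs
  subs=$(list_topic_subscriptions "$@") || return
  case "$subs" in
    *PendingConfirmation*) return 1 ;;
    *) return 0 ;;
  esac
}

# Example:
#   all_subscriptions_confirmed "$region" "$sns_topic_arn" || echo "check your inbox"
```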

Subscribe an SMS phone number to the SNS topic:

phone_number="+1-999-555-1234" # Your phone number

aws sns subscribe \
  --region "$region" \
  --topic-arn "$sns_topic_arn" \
  --protocol sms \
  --notification-endpoint "$phone_number"
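Before wiring up CloudWatch Events, it may be worth checking that the subscriptions actually deliver. Here is a minimal sketch that publishes a test message to the topic and prints the resulting message id. The function name, subject, and message wording are made up for this example.

```shell
# Publish a test message to the topic and print the SNS message id.
send_test_notice() {
  local region=$1 topic_arn=$2
  aws sns publish \
    --region "$region" \
    --topic-arn "$topic_arn" \
    --subject "Spot activity test" \
    --message "Test message for $topic_arn sent at $(date -u +'%Y-%m-%dT%H:%M:%SZ')" \
    --output text \
    --query 'MessageId'
}

# Example:
#   send_test_notice "$region" "$sns_topic_arn"
```

The email copy only arrives once the email subscription has been confirmed.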

Grant CloudWatch Events permission to post to the SNS topic:

aws sns set-topic-attributes \
  --region "$region" \
  --topic-arn "$sns_topic_arn" \
  --attribute-name Policy \
  --attribute-value '{
    "Version": "2008-10-17",
    "Id": "cloudwatch-events-publish-to-sns-'"$sns_topic_name"'",
    "Statement": [{
      "Effect": "Allow",
      "Principal": {
        "Service": "events.amazonaws.com"
      },
      "Action": [ "SNS:Publish" ],
      "Resource": "'"$sns_topic_arn"'"
    }]
  }'

Create a CloudWatch Events Rule that filters for Spot instance interruption warnings for this specific instance:

rule_name_interrupted="ec2-spot-interruption-$instance_id"
rule_description_interrupted="EC2 Spot instance $instance_id interrupted"

event_pattern_interrupted='{
  "source": [
    "aws.ec2"
  ],
  "detail-type": [
    "EC2 Spot Instance Interruption Warning"
  ],
  "detail": {
    "instance-id": [ "'"$instance_id"'" ]
  }
}'

aws events put-rule \
  --region "$region" \
  --name "$rule_name_interrupted" \
  --description "$rule_description_interrupted" \
  --event-pattern "$event_pattern_interrupted" \
  --state "ENABLED"
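Real interruption warnings are hard to trigger on demand, so one way to gain some confidence in the event pattern is to check it against a hand-written event with the test-event-pattern command. In this sketch every value in the sample event other than the instance id is a placeholder, and the function names are assumptions for this example.

```shell
# Build a sample interruption-warning event for the given instance id.
# The id, account, time, region, and resource ARN are placeholders.
sample_interruption_event() {
  local instance_id=$1
  cat <<EOF
{
  "id": "00000000-0000-0000-0000-000000000000",
  "detail-type": "EC2 Spot Instance Interruption Warning",
  "source": "aws.ec2",
  "account": "123456789012",
  "time": "2018-02-11T08:56:26Z",
  "region": "us-west-2",
  "resources": [ "arn:aws:ec2:us-west-2:123456789012:instance/$instance_id" ],
  "detail": {
    "instance-id": "$instance_id",
    "instance-action": "stop"
  }
}
EOF
}

# Print the Result field reported for the pattern and the sample event.
pattern_matches_sample() {
  local region=$1 event_pattern=$2 instance_id=$3
  aws events test-event-pattern \
    --region "$region" \
    --event-pattern "$event_pattern" \
    --event "$(sample_interruption_event "$instance_id")" \
    --output text \
    --query 'Result'
}

# Example:
#   pattern_matches_sample "$region" "$event_pattern_interrupted" "$instance_id"
```

This only checks the pattern matching; it does not send anything to the SNS topic.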

Set the target of the CloudWatch Events rule to the SNS topic using an input transformer to make sensible text for an English reader:

sns_target_interrupted='[{
  "Id": "target-sns-'"$sns_topic_name"'",
  "Arn": "'"$sns_topic_arn"'",
  "InputTransformer": {
    "InputPathsMap": {
      "title": "$.detail-type",
      "source": "$.source",
      "account": "$.account",
      "time": "$.time",
      "region": "$.region",
      "instance": "$.detail.instance-id",
      "action": "$.detail.instance-action"
    },
    "InputTemplate":
      "\"<title>: <source> will <action> <instance> ('"$instance_name"') in <region> of <account> at <time>\""
  }
}]'

aws events put-targets \
  --region "$region" \
  --rule "$rule_name_interrupted" \
  --targets "$sns_target_interrupted"

Here’s a sample message for the two-minute interruption warning:

“EC2 Spot Instance Interruption Warning: aws.ec2 will stop i-0f47ef25380f78480 (Temporary Demo) in us-west-2 of 121287063412 at 2018-02-11T08:56:26Z”

Bonus: CloudWatch Events Alerts For State Changes

In addition to the two-minute interruption alert, we can send ourselves messages when the instance is actually stopped, when it is started again, and when it is running. This is done with a slightly different CloudWatch Events pattern and input transformer, but otherwise follows the same steps.

Create a CloudWatch Events Rule that filters for instance state-change notifications for this specific instance:

rule_name_state="ec2-instance-state-change-$instance_id"
rule_description_state="EC2 instance $instance_id state change"

event_pattern_state='{
  "source": [
    "aws.ec2"
  ],
  "detail-type": [
    "EC2 Instance State-change Notification"
  ],
  "detail": {
    "instance-id": [ "'"$instance_id"'" ]
  }
}'

aws events put-rule \
  --region "$region" \
  --name "$rule_name_state" \
  --description "$rule_description_state" \
  --event-pattern "$event_pattern_state" \
  --state "ENABLED"

And again, set the target of the new CloudWatch Events rule to the same SNS topic using another input transformer:

sns_target_state='[{
  "Id": "target-sns-'"$sns_topic_name"'",
  "Arn": "'"$sns_topic_arn"'",
  "InputTransformer": {
    "InputPathsMap": {
      "title": "$.detail-type",
      "source": "$.source",
      "account": "$.account",
      "time": "$.time",
      "region": "$.region",
      "instance": "$.detail.instance-id",
      "state": "$.detail.state"
    },
    "InputTemplate":
      "\"<title>: <source> reports <instance> ('"$instance_name"') is now <state> in <region> of <account> as of <time>\""
  }
}]'

aws events put-targets \
  --region "$region" \
  --rule "$rule_name_state" \
  --targets "$sns_target_state"

Here are a couple of sample messages for the instance state change notification:

“EC2 Instance State-change Notification: aws.ec2 reports i-0f47ef25380f78480 (Temporary Demo) is now stopping in us-west-2 of 121287063412 as of 2018-02-11T08:58:29Z”

“EC2 Instance State-change Notification: aws.ec2 reports i-0f47ef25380f78480 (Temporary Demo) is now stopped in us-west-2 of 121287063412 as of 2018-02-11T08:58:47Z”
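The state-change rule above reports every transition, including intermediate ones like "stopping". If that is too chatty, one option is to add a "state" filter to the event pattern. Here is a minimal sketch that builds such a pattern; the function name and the choice of states are assumptions for this example.

```shell
# Build a state-change event pattern limited to the listed states.
# Usage: state_change_pattern INSTANCE_ID STATE [STATE...]
state_change_pattern() {
  local instance_id=$1
  shift
  if [ "$#" -lt 1 ]; then
    echo "state_change_pattern: need at least one state" >&2
    return 1
  fi
  # Join the states into a quoted, comma-separated JSON list.
  local states="" state
  for state in "$@"; do
    states="${states:+$states, }\"$state\""
  done
  cat <<EOF
{
  "source": [ "aws.ec2" ],
  "detail-type": [ "EC2 Instance State-change Notification" ],
  "detail": {
    "instance-id": [ "$instance_id" ],
    "state": [ $states ]
  }
}
EOF
}

# Example: only report when the instance has stopped or is running again.
#   event_pattern_state=$(state_change_pattern "$instance_id" stopped running)
```

Running put-rule again with the same rule name and the new pattern updates the existing rule.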

Cleanup

If we terminate the EC2 Spot instance, the persistent Spot Request will start a replacement instance. To terminate it permanently, we need to first cancel the Spot Request:

spot_request_id=$(aws ec2 describe-instances \
  --region "$region" \
  --instance-ids "$instance_id" \
  --output text \
  --query 'Reservations[].Instances[].[SpotInstanceRequestId]')
echo spot_request_id=$spot_request_id

aws ec2 cancel-spot-instance-requests \
  --region "$region" \
  --spot-instance-request-ids "$spot_request_id"

Then terminate the EC2 instance:

aws ec2 terminate-instances \
  --region "$region" \
  --instance-ids "$instance_id" \
  --output text \
  --query 'TerminatingInstances[*].[InstanceId,CurrentState.Name]'

Remove the targets from the CloudWatch Events “interrupted” rule and delete the CloudWatch Events Rule:

target_ids_interrupted=$(aws events list-targets-by-rule \
  --region "$region" \
  --rule "$rule_name_interrupted" \
  --output text \
  --query 'Targets[*].[Id]')
echo target_ids_interrupted='"'$target_ids_interrupted'"'

aws events remove-targets \
  --region "$region" \
  --rule "$rule_name_interrupted" \
  --ids $target_ids_interrupted

aws events delete-rule \
  --region "$region" \
  --name "$rule_name_interrupted"

Remove the targets from the CloudWatch Events “state” rule (if you created it) and delete the CloudWatch Events Rule:

target_ids_state=$(aws events list-targets-by-rule \
  --region "$region" \
  --rule "$rule_name_state" \
  --output text \
  --query 'Targets[*].[Id]')
echo target_ids_state='"'$target_ids_state'"'

aws events remove-targets \
  --region "$region" \
  --rule "$rule_name_state" \
  --ids $target_ids_state

aws events delete-rule \
  --region "$region" \
  --name "$rule_name_state"
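The two rule cleanups above follow the same three steps, so they could be folded into one helper. This is a minimal sketch of that idea; the function name delete_rule_with_targets is an assumption for this example.

```shell
# Remove every target from a CloudWatch Events rule, then delete the rule.
delete_rule_with_targets() {
  local region=$1 rule_name=$2 target_ids
  target_ids=$(aws events list-targets-by-rule \
    --region "$region" \
    --rule "$rule_name" \
    --output text \
    --query 'Targets[*].[Id]') || return
  if [ -n "$target_ids" ]; then
    # Unquoted on purpose: each target id becomes its own argument.
    aws events remove-targets \
      --region "$region" \
      --rule "$rule_name" \
      --ids $target_ids || return
  fi
  aws events delete-rule \
    --region "$region" \
    --name "$rule_name"
}

# Example:
#   delete_rule_with_targets "$region" "$rule_name_interrupted"
#   delete_rule_with_targets "$region" "$rule_name_state"
```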

Delete the SNS Topic:

aws sns delete-topic \
  --region "$region" \
  --topic-arn "$sns_topic_arn"