28

We end up with a fair number of AWS EC2 snapshots where the AMI has been deleted, but the snapshot is left behind to rot. I'd like a non-manual way of identifying and deleting these orphans to save us money and space.

Ideally I'm thinking a bash script leveraging the CLI, but my AWS-fu is weak. I assume someone's done this before but I can't find a script that actually works.

In the best-case scenario this will also check volumes and clean those as well, but that may be better suited for a second question.

Alex

5 Answers

16

Largely inspired by the blog posts and gist already linked in the other answers, here is my take on the problem.

I used some convoluted JMESPath expressions to get the list of snapshots without needing tr.

Disclaimer: Use at your own risk. I did my best to avoid any problems and keep sane defaults, but I won't take any blame if it causes problems for you.

#!/bin/sh
# remove x if you don't want to see the commands
set -ex

# Some variable initialisation with sane defaults
DRUN='--dry-run'
DO_DELETE=${1:-'no'}
REGION=${2:-'eu-west-1'}
ACCOUNTID=${3:-'self'}

# Get two temporary files
SNAP_FILE=$(mktemp)
IMAGE_FILE=$(mktemp)

# Get the list of all snapshots and the list of snapshots referenced by available AMIs
aws --region "$REGION" ec2 describe-snapshots --owner-ids "$ACCOUNTID" --query 'Snapshots[*].[SnapshotId]' --output text > "$SNAP_FILE"
aws --region "$REGION" ec2 describe-images --owners "$ACCOUNTID" --filters Name=state,Values=available --query 'Images[*].BlockDeviceMappings[*].Ebs.[SnapshotId]' --output text > "$IMAGE_FILE"

# Check whether the printed commands should be dry-run (default) or not
if [ "$DO_DELETE" = "IAMSURE" ]
then
 DRUN=''
fi

# count each snapshot id, decrement when an AMI references it, print a delete command for those no AMI references
awk -v REGION="$REGION" -v DRUN="$DRUN" '
FNR==NR { snap[$1]++; next } # first file: count each snapshot and skip to the next line immediately

{ snap[$1]-- } # second file: decrement the counter when an AMI references the snapshot

END {
 for (s in snap) { # loop over the snapshots
   if (snap[s] > 0) { # still above 0 means no AMI references this snapshot
    cmd="aws --region " REGION " " DRUN " ec2 delete-snapshot --snapshot-id " s
    print(cmd)
  }
 }
}
' "$SNAP_FILE" "$IMAGE_FILE"
# Clean up the temp files
rm "$SNAP_FILE" "$IMAGE_FILE"

I hope the script itself is commented enough.
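
The counting logic can be tried without touching AWS. The sketch below runs the same two-file awk idiom on made-up snapshot IDs; the function name find_unreferenced and the sample IDs are assumptions for illustration, not part of the script above.

```sh
#!/bin/sh
# find_unreferenced ALL_FILE USED_FILE (hypothetical helper)
#   ALL_FILE:  one snapshot ID per line (every snapshot in the account)
#   USED_FILE: one snapshot ID per line (snapshots referenced by an AMI)
find_unreferenced() {
    awk '
    FNR==NR { snap[$1]++; next }   # first file: count every snapshot
    { snap[$1]-- }                 # second file: an AMI references this ID
    END { for (s in snap) if (snap[s] > 0) print s }
    ' "$1" "$2" | sort             # awk iterates in arbitrary order, so sort
}

# Made-up sample data standing in for the two aws calls
all=$(mktemp); used=$(mktemp)
printf 'snap-aaa\nsnap-bbb\nsnap-ccc\n' > "$all"
printf 'snap-bbb\n' > "$used"

find_unreferenced "$all" "$used"   # prints snap-aaa and snap-ccc
rm -f "$all" "$used"
```

Note that FNR==NR only identifies the first file when that file is non-empty, so the idiom assumes the snapshot list contains at least one line.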

Default usage (no parameters) prints the delete commands for orphaned snapshots in the current account and the eu-west-1 region. An extract:

aws --region eu-west-1 --dry-run ec2 delete-snapshot --snapshot-id snap-81e5856a
aws --region eu-west-1 --dry-run ec2 delete-snapshot --snapshot-id snap-95c68c7e
aws --region eu-west-1 --dry-run ec2 delete-snapshot --snapshot-id snap-a3bf50bd

You can redirect this output to a file for review before sourcing it to execute all the commands.

If you want the script to execute the commands instead of printing them, replace print(cmd) with system(cmd).
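
As a rough illustration of the difference between the two awk calls, the sketch below feeds harmless echo commands (stand-ins for the real delete commands) to a hypothetical run_or_print helper. In "print" mode it only shows each command; in "run" mode it passes each line to system() and reports a non-zero exit status on stderr.

```sh
#!/bin/sh
# run_or_print MODE (hypothetical helper): reads one command per line on stdin.
#   MODE=print  only prints each command
#   MODE=run    executes each command through awk system()
run_or_print() {
    awk -v MODE="$1" '{
        if (MODE == "run") {
            rc = system($0)                      # run the line as a shell command
            if (rc != 0) print "failed (" rc "): " $0 > "/dev/stderr"
        } else {
            print $0                             # just show the command
        }
    }'
}

# echo stands in for "aws ... ec2 delete-snapshot ..." here
printf 'echo deleting snap-aaa\necho deleting snap-bbb\n' | run_or_print print
printf 'echo deleting snap-aaa\necho deleting snap-bbb\n' | run_or_print run
```

Checking the return value of system() is optional, but it makes a failed deletion visible instead of silently moving on to the next snapshot.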

Usage is as follows, with a script named snap_cleaner:

For dry-run commands in the us-west-1 region:

./snap_cleaner no us-west-1

For usable commands in eu-central-1:

./snap_cleaner IAMSURE eu-central-1

A third parameter can be used to access another account (I prefer to switch role to the other account beforehand).

Stripped-down version of the script, with the awk program as a one-liner:

#!/bin/sh
set -ex

# Some variable initialisation with sane defaults
DRUN='--dry-run'
DO_DELETE=${1:-'no'}
REGION=${2:-'eu-west-1'}
ACCOUNTID=${3:-'self'}

# Get two temporary files
SNAP_FILE=$(mktemp)
IMAGE_FILE=$(mktemp)

# Get the list of all snapshots and the list of snapshots referenced by available AMIs
aws --region "$REGION" ec2 describe-snapshots --owner-ids "$ACCOUNTID" --query 'Snapshots[*].[SnapshotId]' --output text > "$SNAP_FILE"
aws --region "$REGION" ec2 describe-images --owners "$ACCOUNTID" --filters Name=state,Values=available --query 'Images[*].BlockDeviceMappings[*].Ebs.[SnapshotId]' --output text > "$IMAGE_FILE"

# Check whether the printed commands should be dry-run (default) or not
if [ "$DO_DELETE" = "IAMSURE" ]
then
 DRUN=''
fi

# count each snapshot id, decrement when an AMI references it, print a delete command for those no AMI references
awk -v REGION="$REGION" -v DRUN="$DRUN" 'FNR==NR { snap[$1]++; next } { snap[$1]-- } END { for (s in snap) { if (snap[s] > 0) { cmd="aws --region " REGION " " DRUN " ec2 delete-snapshot --snapshot-id " s; print(cmd) } } }' "$SNAP_FILE" "$IMAGE_FILE"
# Clean up the temp files
rm "$SNAP_FILE" "$IMAGE_FILE"
bgdnlp
Tensibai
5

I used the following script on GitHub by Rodrigue Koffi (bonclay7), and it works pretty well.

https://github.com/bonclay7/aws-amicleaner

Command:

amicleaner --check-orphans

According to the documentation blog post, it does more than clean orphans:

It actually does a bit more than that, as of today it allows:

  • Removing a list of images and associated snapshots
  • Mapping AMIs:
    • Using names
    • Using tags
  • Filtering AMIs:
    • used by running instances
    • from autoscaling groups (launch configurations) with a desired capacity set to 0
    • from launch configurations detached from autoscaling groups
  • Specifying how many AMIs you want to keep
  • Cleaning orphan snapshots
  • A bit of reporting
Tensibai
3

Here is one script which can help you find orphaned snapshots:

comm -23 <(echo $(ec2-describe-snapshots --region eu-west-1 | grep SNAPSHOT | awk '{print $2}' | sort | uniq) | tr ' ' '\n') <(echo $(ec2-describe-images --region eu-west-1 | grep BLOCKDEVICEMAPPING | awk '{print $3}' | sort | uniq) | tr ' ' '\n') | tr '\n' ' '

(from here)

You can also check this article on Server Fault.

P.S. Of course, you can change the region to match your own.

P.P.S. Here is updated code:

comm -23 \
<(echo $(aws ec2 describe-snapshots --region eu-west-1 | awk '/SNAPSHOT/ {print $2}' | sort -u) | tr ' ' '\n') \
<(echo $(aws ec2 describe-images --region eu-west-1 | awk '/BLOCKDEVICEMAPPING/ {print $3}' | sort -u) | tr ' ' '\n') | tr '\n' ' '

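A short explanation of what the code does:

echo $(aws ec2 describe-snapshots --region eu-west-1 | awk '/SNAPSHOT/ {print $2}' | sort -u) | tr ' ' '\n'

sends the list of snapshot IDs to STDOUT, one per line. This construction:

<(...)

is bash process substitution: it exposes the output of the enclosed command as a temporary file handle, so that comm can read from two "files" and compare them. The -23 option suppresses the lines unique to the second input and the lines common to both, leaving only the snapshot IDs that no image references.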
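
Process substitution is a bash feature. As a minimal sketch of the same comparison that also runs under plain sh, the example below uses explicit temporary files and made-up snapshot IDs; the helper name only_in_first is an assumption for illustration.

```sh
#!/bin/sh
# only_in_first A B (hypothetical helper):
# print the lines that appear in file A but not in file B.
only_in_first() {
    a=$(mktemp); b=$(mktemp)
    sort -u "$1" > "$a"      # comm requires sorted input
    sort -u "$2" > "$b"
    # -2 hides lines unique to the second file, -3 hides lines common to both
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Made-up sample data: all snapshots vs. snapshots referenced by images
snaps=$(mktemp); used=$(mktemp)
printf 'snap-ccc\nsnap-aaa\nsnap-bbb\n' > "$snaps"
printf 'snap-bbb\nsnap-zzz\n' > "$used"

only_in_first "$snaps" "$used"   # prints snap-aaa and snap-ccc
rm -f "$snaps" "$used"
```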

Romeo Ninov
2

Here is a GitHub Gist by Daniil Yaroslavtsev that does exactly what you are asking for.

It takes the list of all images and their snapshots and compares those IDs to the list of all snapshot IDs. Whatever remains is orphaned. The code works on the same principle as the answer above, but is better formatted and slightly more readable.

The code takes advantage of JMESPath through the --query Snapshots[*].SnapshotId option (you can also use the jp command-line utility for that, if it's already in your distribution). It then formats the output as text with --output text. Here is a link to the API reference and a few examples. It is slightly more elegant than a long chain of grep/awk/sort/uniq/tr pipes.

Warning by Todd Walton: Don't confuse jp with the 'jq' utility, which uses a different query language to parse JSON documents.
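
A minimal sketch of that approach, not the gist itself: the function names and the eu-west-1 default are assumptions, and the describe-images query is the one used in another answer on this page. The tr and grep steps are a defensive guess to cope with tab-separated IDs or "None" entries in the text output, so check the raw output on your own account before relying on it.

```sh
#!/bin/sh
# Hypothetical helpers; nothing is deleted, IDs are only listed.

# All snapshot IDs owned by the current account, one per line
list_snapshots() {
    aws --region "$1" ec2 describe-snapshots --owner-ids self \
        --query 'Snapshots[*].[SnapshotId]' --output text
}

# Snapshot IDs referenced by the block device mappings of our own AMIs
list_image_snapshots() {
    aws --region "$1" ec2 describe-images --owners self \
        --query 'Images[*].BlockDeviceMappings[*].Ebs.[SnapshotId]' --output text
}

# Keep only things that look like snapshot IDs, sorted and de-duplicated
clean_ids() {
    tr '\t' '\n' | grep '^snap-' | sort -u
}

# orphan_snapshots [REGION]: snapshots that no AMI references
orphan_snapshots() {
    region=${1:-eu-west-1}
    all=$(mktemp); used=$(mktemp)
    list_snapshots "$region" | clean_ids > "$all"
    list_image_snapshots "$region" | clean_ids > "$used"
    comm -23 "$all" "$used"
    rm -f "$all" "$used"
}
```

Calling orphan_snapshots us-west-1 would then list candidate IDs for review. A snapshot that no AMI references may still be a deliberate backup, so the list is a starting point rather than a delete list.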

Jiri Klouda
0

I've written a snapshots.py script that iterates over all snapshots (in a defined list of regions) and generates report.csv. This file contains information about the instance, AMI and volume referenced by each snapshot.

There is also a command to interactively remove dangling snapshots.

jazgot