Published: 20. 12. 2019   Category: Programming

Check free space on AWS RDS

The following bash script is an example of awscli usage for checking the free storage space of relational database servers deployed in the Amazon Web Services cloud.

As prerequisites, install awscli: pip3 install awscli and have setup your credentials. Download script from GitHub to your working directory:

wget https://raw.githubusercontent.com/BruXy/bash-utils/master/aws/rds_df.sh

Script has no parameter, so download it and run ./rds_df.sh, the example of output:

Name                Region     DB Engine               Type           Size  Avail  Used
---------------------------------------------------------------------------------------
prod-rds-us-east    us-east-1  postgres-9.4.21         db.m4.4xlarge  477G  303G   36.57 %
prod-rds-eu-london  eu-west-2  postgres-9.4.23         db.m4.2xlarge  187G  181G   3.01 %
london-dmapi-db-1   eu-west-2  aurora-postgresql-10.7  db.r4.large    954M  26G    N/A %
prod-rds-eu-west    eu-west-1  postgres-9.4.21         db.m4.4xlarge  477G  316G   33.80 %
development-rds-a   us-east-1  postgres-9.4.24         db.m4.2xlarge  187G  98G    47.52 %

It is getting list of DB instances, some basic info about them. The regions are defined as global array REGIONS. Change OUTPUT to 0, if you prefer rather CSV output with semicolon delimiter.

Feel free to modify and use this script for your purposes. Or get inspired for your scripts. It is using aws rds describe-db-instances to obtain a list of database instances and some info about them. The output of the command is parsed to global hash map RDS_INSTANCES. It is quite tricky because you cannot modify global variable in for-loop because it is running in its own subshell and this is why trick with while read <<< for-loop is used (honestly, in my script, it is implemented a little bit tortuously because I have additional while loop for modifying output of describe-db-instances). The next step is to query CloudWatch and obtaining metrics for FreeStorageSpace or FreeLocalStorage. The second metric is used for DB clusters and I am not sure why Size is smaller then Available space (not my cluster 😃). For obtaining actual free space, the average value of used metrics is obtained for the last 5 minutes.

#!/bin/bash
#
# Get list of AWS Relational Database Services (RDS) and free
# storage space for earch instance.
#
# Author: Martin 'BruXy' Bruchanov (bruchy at gmail)
#

###########
# Globals #
###########

REGIONS=(eu-west-1 eu-west-2 us-east-1)
declare -A RDS_INSTANCES
START="$(date -u -d '5 minutes ago' '+%Y-%m-%dT%T')"
END="$(date -u '+%Y-%m-%dT%T')"
OUTPUT=1 #0 for ';' delimited CSV, 1 for ASCII table
HEADER="Name;Region;DB Engine;Type;Size;Avail;Used"

#############
# Functions #
#############

#=== FUNCTION ============================================================
#        Name: get_metric
# Description: Query AWS CloudWatch metric for average value for the
#              interval between $START and $END
# Parameter 1: AWS Region.
# Parameter 2: RDS Name (DBInstanceIdentifier).
# Parameter 3: Metric Name.
#     Returns: Value or empty string to stdout.
#=========================================================================

function get_metric() {
local region=$1
local db_name=$2
local metric=$3

AWS_DEFAULT_REGION="$region" \
aws cloudwatch get-metric-statistics \
    --namespace AWS/RDS --metric-name "$metric" \
    --start-time $START --end-time $END --period 300 \
    --statistics Average \
    --dimensions "Name=DBInstanceIdentifier, Value=${db_name}" \
    --output=text \
    --query "Datapoints[].[Average]"
}

#=== FUNCTION ============================================================
#        Name: format_table
# Description: Format output as table or CSV, default according to $OUTPUT.
#=========================================================================

function format_table() {
if [ $OUTPUT -eq 1 ] ; then # Table format
    sed -e "1i $HEADER" | \
        column -s';' -t | \
        sed '1{p;s/./-/g}'
else # CSV format
    sed -e "1i $HEADER"
fi
}

########
# Main #
########

# List databases and some parameters in REGIONS:
# ----------------------------------------------
#
#   * DBInstanceIdentifier
#   * DBInstanceClass
#   * Engine
#   * EngineVersion
#   * AvailabilityZone
#   * AllocatedStorage (in gibibytes, converted to bytes)
#
# Output is available in hash arrad RDS_INSTANCES
#
# RDS_INSTANCES[Id]="Type Engine-Version Size Region"

while read line
do
id=$(cut -f 1 <<< "$line")
data=$(cut -f 2- <<< "$line")
RDS_INSTANCES[$id]="$data"
done <<< "$(
    for region in ${REGIONS[*]}
    do
        aws rds describe-db-instances --region=$region \
            --output=text --query \
            "DBInstances[].[DBInstanceIdentifier,DBInstanceClass,
            Engine,EngineVersion,AllocatedStorage,AvailabilityZone]" |
        while read DB TYPE ENGINE VERSION DISK AZ
        do
            printf "%s\t%s\t%s-%s\t%u\t%s\n" $DB $TYPE $ENGINE \
                $VERSION $[DISK*10**9] ${AZ:0:-1}
        done
    done
)"

# Display free space for each RDS instance
# ----------------------------------------
#

for i in ${!RDS_INSTANCES[@]}
do
name=$i
region=$(cut -f 4 <<< "${RDS_INSTANCES[$i]}")
total=$(cut -f 3 <<< "${RDS_INSTANCES[$i]}")
engine=$(cut -f 2 <<< "${RDS_INSTANCES[$i]}")
inst_type=$(cut -f 1 <<< "${RDS_INSTANCES[$i]}")
free_space=$(get_metric $region $name FreeStorageSpace )

if [ -z "$free_space" ] ; then
    free_space=$(get_metric $region $name FreeLocalStorage )
    # If FreeStorage is not available, it is probably DB cluster
    # and then allocated space is lower then FreeLocal so the
    # percentage does not make sense.
    used_percent='N/A'
else
    used_percent=$(bc <<< "scale=2;(($total-$free_space)*100)/$total+0.05")
fi

# Convert bytes to IEC (K, M, G, base 1024)
free_space=$(numfmt --to=iec <<< $free_space)
total=$(numfmt --to=iec <<< $total)

printf "%s;%s;%s;%s;%s;%s;%s %%\n" $name $region $engine \
    $inst_type $total $free_space $used_percent
done | format_table