Backup Your Digital Life To AWS S3

Todd Palmer
5 min read · Aug 13, 2021
Photo by benjamin lehman on Unsplash

More and more of our lives are purely digital, and a lot of us have outsourced the storage and backup of those digital lives to products and services that are easy. There is no doubt that these services come with massive benefits, including ease of use, redundancy, and experienced teams ensuring reliability and security well beyond our own capabilities. Yes, some of these services are free, but they come at a cost to our privacy. What if I told you that you could get most of those benefits with a small investment without giving up your privacy?

There are a lot of ways to solve this problem; this article outlines just one of them. It is not meant to be THE recommended way, only an option, allowing you to decide what works best for you.

Why AWS?

There are other solutions, so why AWS?

  • Privacy
  • Reliability
  • Reasonable Cost

AWS was started as a tool for businesses, but driven by the Amazon ethos it is provided as a self-service tool available to individuals. The benefit you get from that lineage is that businesses pay for reliable and private storage, since that is where they keep customer data. Those businesses are subject to laws, such as GDPR, and constraints that keep AWS focused on privacy and reliability; if it fails at either, it ceases to be valuable to businesses. AWS also continuously improves its services and drives costs down for its customers.

AWS S3 is available worldwide and has excellent service and reliability. Its costs are not the lowest, but they are extremely competitive and one of the best values.

Why NOT AWS?

This is definitely not for everyone. AWS services require significant learning and are geared towards businesses and developers. They are not consumer-facing products designed to be simple and easy.

This article assumes you have some basic experience with computering things, i.e. a tech bent. No, I would not recommend this for my Grandma, but that's mostly because she loves Korn Shell and that has been a major sticking point at Thanksgiving for years.

AWS is also not the cheapest. There are cheaper and easier solutions if you want to “set it and forget it”.

Make It Happen

Everything following assumes that you:

  • Have set up an AWS account
  • Have downloaded your AWS credentials and set up the AWS CLI with those credentials (a quick sanity check follows this list)
  • Are comfortable on the command line
  • Have some understanding of AWS or a curiosity to learn more. Not everything will be explained in this short article.
  • Bonus: scheduling, and creating an IAM user for credential-only access and least privilege for your backup process
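
If you want a quick sanity check that the CLI is wired up before going any further, these two commands (assuming you configured a default profile) should confirm it:

# Prints the account ID and ARN of the identity your credentials resolve to
aws sts get-caller-identity

# Shows which profile, region, and credential source the CLI is using
aws configure list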

AWS S3 stores data in what's called a bucket. From here on out we'll call our bucket stuff-backup, and the data we will store there will come from a folder named my-stuff.

Create And Configure Your Bucket

  • Create the bucket. Recommended practice is to create a bucket for each set of things you want to store instead of one bucket for all of them. So for photos and documents you would create two buckets, photos-backup and documents-backup.
aws s3api create-bucket --bucket stuff-backup --acl private
  • Set the bucket as non-public, which prevents any public access to the data. Don't. Skip. This. Bit.
aws s3api put-public-access-block --bucket stuff-backup --public-access-block-configuration BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true
  • Enable versioning, which will keep versions of your items as they change and allows you to use storage tiers to reduce costs
aws s3api put-bucket-versioning --bucket stuff-backup --versioning-configuration Status=Enabled
  • Enable encryption. This ensures all files stored in S3 are encrypted at rest with AES-256 using S3-managed keys.
aws s3api put-bucket-encryption --bucket stuff-backup --server-side-encryption-configuration '{"Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]}'
  • Add a lifecycle policy. This uses a file (below) to move your backups to the S3 Glacier storage tier and noncurrent versions to Glacier Deep Archive. This means your files will take longer to get back (up to 24–48 hours) but your costs will be orders of magnitude less.
aws s3api put-bucket-lifecycle-configuration --bucket stuff-backup --lifecycle-configuration file://bucket-lifecycle.json

Lifecycle Policy bucket-lifecycle.json

{
  "Rules": [
    {
      "ID": "Upload",
      "Status": "Enabled",
      "Filter": {
        "Prefix": ""
      },
      "Transitions": [
        {
          "Days": 30,
          "StorageClass": "GLACIER"
        }
      ]
    },
    {
      "ID": "old-versions-to-glacier",
      "Status": "Enabled",
      "Filter": {
        "Prefix": ""
      },
      "NoncurrentVersionTransitions": [
        {
          "NoncurrentDays": 60,
          "StorageClass": "DEEP_ARCHIVE"
        }
      ]
    }
  ]
}
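
If you want to confirm everything took effect, the matching get-* calls should echo back the configuration you just applied:

aws s3api get-public-access-block --bucket stuff-backup
aws s3api get-bucket-versioning --bucket stuff-backup
aws s3api get-bucket-encryption --bucket stuff-backup
aws s3api get-bucket-lifecycle-configuration --bucket stuff-backup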

Back It Up

Use the AWS S3 command line to back up my-stuff to S3. The --delete flag here makes sure that files deleted locally are marked as deleted in S3. Since we set up versioning, they are still there as older versions.

aws s3 sync --delete my-stuff s3://stuff-backup

Depending on how much you have to backup, this may take a while the first time. sync will only upload changes, so subsequent runs will be fast and efficient.
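
If you want to see what sync will do before letting it loose, especially with --delete in play, it supports a dry run that prints the planned uploads and deletes without changing anything:

aws s3 sync --dryrun --delete my-stuff s3://stuff-backup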

Bonus Material

Scheduling

To make this valuable, run this command with your scheduling tool of choice. For me that's Linux on a systemd-enabled distro.

Create the service (/etc/systemd/system/stuff-backup.service) which runs a shell script to do the work:

[Unit]
Description=Backup Stuff
[Service]
Type=oneshot
User=root
ExecStart=/root/stuff-backup.sh
[Install]
WantedBy=default.target
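
The service expects a script at /root/stuff-backup.sh to do the actual work. A minimal sketch, assuming the AWS CLI is installed for root and your data lives at /home/me/my-stuff (adjust both to your setup):

#!/bin/bash
# Stop on the first error so systemd records the run as failed.
set -euo pipefail

# Example path only; point this at the folder you actually back up.
SOURCE_DIR="/home/me/my-stuff"

# Mirror the folder to the bucket; locally deleted files get delete
# markers in S3 while older versions are retained thanks to versioning.
aws s3 sync --delete "$SOURCE_DIR" s3://stuff-backup

Remember to make it executable with chmod +x /root/stuff-backup.sh.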

Create the timer (/etc/systemd/system/stuff-backup.timer):

[Unit]
Description=Scheduled stuff-backup
[Timer]
Persistent=true
OnCalendar=*-*-* 02:35:00
Unit=stuff-backup.service
[Install]
WantedBy=timers.target

Enable them:

systemctl enable stuff-backup.service
systemctl enable stuff-backup.timer
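
To start the timer right away and confirm the next scheduled run, something like this works:

systemctl start stuff-backup.timer
systemctl list-timers stuff-backup.timer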

Security

It's always a good idea to follow least privilege for your processes, including this backup process. Create a separate AWS IAM user with credential-only access and give it permissions only for what it needs.

Here I will assume you have created that IAM user and put it into an IAM group named backups. You should execute your backup script or commands under those credentials.
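
If you haven't created them yet, a rough sketch of the CLI calls looks like this (stuff-backup-user is just an example name):

# Create the group and a dedicated user for the backup job
aws iam create-group --group-name backups
aws iam create-user --user-name stuff-backup-user
aws iam add-user-to-group --group-name backups --user-name stuff-backup-user

# Generate access keys for the user; store the output somewhere safe
aws iam create-access-key --user-name stuff-backup-user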

Create a policy for your new bucket

aws iam create-policy --policy-name stuff-s3-backup-access --policy-document file://access-policy.json --query 'Policy.Arn' --output text

This creates a new policy called stuff-s3-backup-access using the rights defined in the access-policy.json file below, granting the minimum access needed and scoping it to our newly created stuff-backup bucket. The command outputs the policy ARN you will need in the next step.

Access Policy access-policy.json

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:DeleteObject",
        "s3:ListBucket",
        "s3:GetObject",
        "s3:GetBucketLocation",
        "s3:PutObject",
        "s3:PutObjectAcl"
      ],
      "Resource": [
        "arn:aws:s3:::stuff-backup",
        "arn:aws:s3:::stuff-backup/*"
      ]
    }
  ]
}

Assign the policy to your group

Remember your backups group? This command attaches that policy to the group, giving all of its members the least-privilege access necessary to back up your data. You'll need the ARN output by the previous command when you created the policy.

aws iam attach-group-policy --group-name backups --policy-arn <copied-from-previous-command-output>
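
Finally, make sure the backup itself actually runs under the new user's credentials rather than your personal ones. One way (assuming you save the backup user's access keys in a named CLI profile called backup) is:

# Prompt for the backup user's access keys and save them under a profile
aws configure --profile backup

# Run the sync under that profile instead of your default credentials
aws s3 sync --delete my-stuff s3://stuff-backup --profile backup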
