Amazon S3 is one of the core services offered by AWS that has a wide variety of use cases, from serving static websites to hosting images, managing data, and much more. In this post, we will review the ins and outs of Amazon S3 Glacier, a special storage class of Amazon S3 that serves as a cost-effective tool for low-access, long-term storage such as archives required for compliance.
This post covers:
- The two S3 Glacier storage classes;
- Methods for getting data into S3 Glacier;
- Tips for restoring data from S3 Glacier.
Amazon S3 Glacier Storage Classes
The first storage class for Amazon S3 Glacier is the classic S3 Glacier class, which is intended for long-term storage that you don’t need to access quickly.
The second storage class is S3 Glacier Deep Archive. This storage class is intended for extremely long-term archival with low access needs.
Below are a few axes on which to compare S3 Glacier against S3 Glacier Deep Archive.
Amazon S3 Glacier vs Amazon S3 Glacier Deep Archive
A gigabyte of data in S3 Glacier Deep Archive costs only $0.00099 per month, meaning you can store a terabyte of data in Deep Archive for only $1.01 per month.
To put this in perspective, a GB of data in S3 Glacier, previously known as one of the cheapest storage solutions on the market, is charged at $0.004 per month. A terabyte in Amazon S3 Glacier will set you back about $4.10 per month. This is four times the cost of S3 Glacier Deep Archive, but it’s still only about one-sixth the price of storage in S3 Standard.
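The per-terabyte figures above come from simple multiplication. Here is a minimal Python sketch of that arithmetic, assuming 1 TB = 1,024 GB and the per-GB prices quoted in this post. The names `PRICE_PER_GB_MONTH` and `monthly_storage_cost` are made up for illustration.

```python
# Per-GB monthly storage prices quoted in this post (US East, N. Virginia).
PRICE_PER_GB_MONTH = {
    "GLACIER": 0.004,
    "DEEP_ARCHIVE": 0.00099,
}

GB_PER_TB = 1024  # assumption: a terabyte is counted as 1,024 GB


def monthly_storage_cost(storage_class: str, size_gb: float) -> float:
    """Return the monthly storage cost in USD for size_gb gigabytes."""
    return PRICE_PER_GB_MONTH[storage_class] * size_gb


if __name__ == "__main__":
    for storage_class in PRICE_PER_GB_MONTH:
        cost = monthly_storage_cost(storage_class, GB_PER_TB)
        print(f"1 TB in {storage_class}: ${cost:.2f} per month")
```

Swap in the prices for your own region before relying on the numbers.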
While the standard Amazon S3 storage classes let you retrieve files instantly, with Amazon S3 Glacier and S3 Glacier Deep Archive you will have to wait minutes, or more often hours or even days.
The classic Amazon S3 Glacier storage class has three options for your retrieval time:
- Expedited: Expedited retrievals take a few minutes to access the data, and are priced at $0.03 per GB and $0.01 per request. This is the fastest and the most expensive option.
- Standard: Standard retrieval makes your data accessible within 3-5 hours. Standard retrievals are priced at $0.01 per GB and $0.05 per 1,000 requests.
- Bulk: Bulk retrieval requests are the slowest option and usually take 5-12 hours before your data is accessible. Bulk retrievals are priced at $0.0025 per GB and $0.025 per 1,000 requests. This is the cheapest option and is great for restoring huge amounts of data that you don’t need immediately.
All prices above are for the US East (N. Virginia) region, us-east-1. Prices in other regions will vary.
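Because each tier has both a per-GB and a per-request charge, it can help to estimate a restore before starting it. The sketch below uses only the classic S3 Glacier prices listed above. The function name and table layout are illustrative assumptions, not part of any AWS tool.

```python
# (price per GB, price per request) for classic S3 Glacier retrievals,
# using the US East (N. Virginia) prices quoted in this post.
RETRIEVAL_PRICING = {
    "Expedited": (0.03, 0.01),
    "Standard": (0.01, 0.05 / 1000),
    "Bulk": (0.0025, 0.025 / 1000),
}


def retrieval_cost(tier: str, size_gb: float, requests: int) -> float:
    """Estimate the retrieval cost in USD for one restore job."""
    per_gb, per_request = RETRIEVAL_PRICING[tier]
    return per_gb * size_gb + per_request * requests


if __name__ == "__main__":
    for tier in RETRIEVAL_PRICING:
        print(tier, f"${retrieval_cost(tier, 1000, 2000):.2f}")
```

For 1,000 GB spread across 2,000 objects, this works out to roughly $50.00 for Expedited, $10.10 for Standard, and $2.55 for Bulk.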
The S3 Glacier Deep Archive has two options for the retrieval time:
- Standard tier with the retrieval time of up to 12 hours
- Bulk tier with the retrieval time of up to 48 hours
Further reading: Guide to Amazon S3 Storage Classes
Minimum Storage Duration
The final factor to compare is minimum storage duration. Because Amazon S3 Glacier is designed for long-term storage, AWS will charge you if you delete your data too quickly after storing it in Glacier.
For classic Amazon S3 Glacier, the minimum storage duration is 90 days. If an object is deleted before 90 days have passed, an early deletion fee of up to $0.012 per GB applies, which equals three months of storage at $0.004 per GB. The fee is prorated over the 90 days: if you delete your object 45 days after placing it into S3 Glacier, you pay half of the fee ($0.006 per gigabyte).
For S3 Glacier Deep Archive, the minimum storage duration is 180 days. As with S3 Glacier, a prorated fee is applied if an object is purged earlier.
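The proration can be sketched in a few lines of Python. This assumes a 30-day billing month and that the fee covers the unused part of the minimum duration, which matches the 45-day example above. The function name and defaults are hypothetical.

```python
def early_deletion_fee(
    size_gb: float,
    days_stored: int,
    min_days: int = 90,
    price_per_gb_month: float = 0.004,
) -> float:
    """Estimate the prorated early deletion fee in USD.

    Defaults describe classic S3 Glacier as quoted in this post.
    """
    # Days of the minimum duration that were never used.
    remaining_days = max(min_days - days_stored, 0)
    # Full fee per GB is the storage price for the whole minimum duration
    # (assumption: one month = 30 days, so 90 days = $0.012 per GB).
    full_fee_per_gb = price_per_gb_month * (min_days / 30)
    return size_gb * full_fee_per_gb * remaining_days / min_days


if __name__ == "__main__":
    # One GB deleted after 45 days in classic S3 Glacier.
    print(f"${early_deletion_fee(1, 45):.4f}")
```

For S3 Glacier Deep Archive, you could pass `min_days=180` and `price_per_gb_month=0.00099`, on the assumption that the same proration model applies.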
Amazon S3 Glacier vs Amazon S3 Glacier Deep Archive Comparison Table
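|Storage class|Storage price|Retrieval speeds|Early deletion fee|
|---|---|---|---|
|Amazon S3 Glacier|$0.004/GB per month|Expedited: minutes; Standard: 3-5 hours; Bulk: 5-12 hours|Prorated, if deleted within 90 days|
|Amazon S3 Glacier Deep Archive|$0.00099/GB per month|Standard: up to 12 hours; Bulk: up to 48 hours|Prorated, if deleted within 180 days|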
Now that you understand the basic storage classes with S3 Glacier, let’s review how to get data into S3 Glacier.
Storing Data in Amazon S3 Glacier
There are two ways to get data into Amazon S3 Glacier. The most common method is to transfer data that already exists in standard S3 storage to Glacier. However, since 2018, you can also upload data directly to S3 Glacier, bypassing standard S3 storage, with S3 PUT to Glacier.
Let’s take a closer look at each of these options.
Transitioning Data from S3 to S3 Glacier
Many companies use Amazon S3 Glacier to store formerly ‘hot’ data that has gone ‘cold’. Hot data is data that is accessed frequently and/or needs to be available quickly, and cold data is unlikely to be accessed often. One example of hot data that has gone cold is a month’s worth of weekly backups of your databases. Within the first 30 days of the files being created, you need them accessible so that you can recover rapidly. After 30 days, they are unlikely to be used but are kept available for extreme circumstances.
AWS has simplified this transition through the use of object lifecycle policies. Object lifecycle policies allow you to specify storage transition rules along with purging settings.
In our example, you could store the initial backups in S3 Standard. You could then set an object lifecycle policy that transitions each backup from S3 Standard to S3 Glacier 30 days after it was uploaded.
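As a rough illustration, the 30-day rule from this example could be expressed in Python. The sketch assumes `s3_client` is a boto3 S3 client (boto3 is not covered in this post), and the rule ID, bucket name, and prefix are hypothetical.

```python
def glacier_transition_rule(prefix: str, days: int = 30) -> dict:
    """Build a lifecycle configuration that moves objects under
    `prefix` to the GLACIER storage class after `days` days."""
    return {
        "Rules": [
            {
                "ID": f"to-glacier-after-{days}-days",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [{"Days": days, "StorageClass": "GLACIER"}],
            }
        ]
    }


def apply_lifecycle(s3_client, bucket: str, prefix: str, days: int = 30) -> None:
    """Attach the rule to a bucket. `s3_client` is assumed to be a
    boto3 S3 client, e.g. boto3.client("s3")."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=glacier_transition_rule(prefix, days),
    )


# Example call (bucket and prefix are placeholders):
# import boto3
# apply_lifecycle(boto3.client("s3"), "my-backup-bucket", "db-backups/")
```

Keeping the rule builder separate from the API call makes it easy to inspect the configuration before applying it.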
Configuring an Object Lifecycle Policy with MSP360
You can configure an object lifecycle policy to move objects from S3 Standard to S3 Glacier with MSP360 Backup. To set up an object lifecycle policy, follow the instructions in this article on Lifecycle Policies in MSP360 Backup.
Uploading Files Directly to Amazon S3 Glacier
Sometimes you may want to load data that is unlikely to be accessed directly into Amazon S3 Glacier. An example here is data that is stored solely for compliance purposes and never needs to be displayed quickly to end users.
For the first few years of S3 Glacier’s existence, it was difficult to upload files directly to Amazon S3 Glacier. Because of this, the recommended way to quickly move files to S3 Glacier was to set an object lifecycle policy of 0 days for a particular bucket or prefix. Any data that was loaded into that bucket or prefix would immediately be transitioned to S3 Glacier according to your policy.
Starting with version 6.0, MSP360 Backup supports direct upload to Glacier as well as S3 Intelligent-Tiering. To get more information, please refer to the corresponding blog post.
At AWS re:Invent 2018, AWS announced S3 PUT to Glacier. This unifies the S3 experience such that you can upload files directly to S3 Glacier similar to how you upload files to standard S3 storage classes.
In the sections below, you can see how to upload an object directly to Glacier using:
- MSP360 Explorer
- AWS Tools for PowerShell
- AWS CLI
Uploading Files Directly to S3 Glacier with MSP360 Explorer
Uploading Files Directly to S3 Glacier with AWS Tools for PowerShell
You can also use the AWS Tools for PowerShell to upload directly to S3 Glacier from your command line.
To do so, be sure to set your BucketName and File parameter to match the desired bucket and file you want.
Uploading Files Directly to S3 Glacier with the AWS CLI
Finally, you can use the AWS CLI to upload objects directly to Amazon S3 Glacier.
To do this, use the following steps:
1. Make sure you have installed the AWS CLI.
2. Use the following command to upload a file to Amazon S3 Glacier:
aws s3 cp myfile.jpg s3://my-bucket --storage-class GLACIER
3. Change “myfile.jpg” to match the file you want to upload, and “my-bucket” to the name of the bucket you want to use.
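If you script your uploads in Python rather than the shell, the same upload might look like the sketch below. It assumes `s3_client` is a boto3 S3 client, which this post does not otherwise cover. The file, bucket, and key names are placeholders.

```python
def upload_to_glacier(s3_client, path: str, bucket: str, key: str) -> None:
    """Upload a local file straight into the GLACIER storage class.

    `s3_client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    s3_client.upload_file(
        Filename=path,
        Bucket=bucket,
        Key=key,
        # Same effect as --storage-class GLACIER in the CLI command above.
        ExtraArgs={"StorageClass": "GLACIER"},
    )


# Example call (all names are placeholders):
# import boto3
# upload_to_glacier(boto3.client("s3"), "myfile.jpg", "my-bucket", "myfile.jpg")
```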
Vaults vs Buckets
You may remember that Amazon S3 Glacier used to be called Amazon Glacier. The S3 in the name is not the only thing that changed: AWS also changed the architecture of its storage service. In Amazon Glacier, data was stored in special containers called vaults. That architecture was unique, but also difficult to access and manage. Over time, AWS restructured its solution and created Amazon S3 Glacier, which stores data in the more familiar, user-friendly buckets.
That allowed third-party software developers to abandon vaults and use the more convenient buckets in Amazon S3 Glacier.
Restoring Objects from Amazon S3 Glacier
To get objects out of Amazon S3 Glacier, you need to submit a restore request for the object. Restoring creates a temporary copy of the object that can be accessed immediately, while the archived object itself stays in Glacier.
When making a restore request from Glacier, there are a few things to consider:
- Restore speed: As mentioned in the Glacier Storage Classes section above, AWS provides three options for restore speed: expedited, standard, and bulk. The right choice for you depends on your budget and how quickly you need the data.
- How long to keep the data: When restoring data from Glacier, you specify how many days to keep the restored data in S3. This helps you save on cost if you only need the restored data for a short period. If you need the restored data for longer, copy the restored object to a permanent location in S3.
- Cost and AWS Free Tier: The costs for restoring data will be affected by a few factors: the size of the data you’re restoring, the restore speed you use, and the retention time you specify. AWS does provide a Free Tier for S3 Glacier.
Further reading: Temporary Restore from Glacier with MSP360
Further reading: Amazon Glacier Pricing Explained
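The two choices you make in a restore request, speed and retention time, map onto two fields. Here is a minimal sketch, assuming `s3_client` is a boto3 S3 client (not something this post covers). The helper names, bucket, and key are hypothetical.

```python
VALID_TIERS = {"Expedited", "Standard", "Bulk"}


def build_restore_request(days: int, tier: str = "Standard") -> dict:
    """Build the restore parameters: how long to keep the restored
    copy (days) and which restore speed to use (tier)."""
    if tier not in VALID_TIERS:
        raise ValueError(f"unknown restore tier: {tier!r}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}


def request_restore(s3_client, bucket: str, key: str, days: int, tier: str = "Standard"):
    """Ask S3 to restore one archived object. `s3_client` is assumed
    to be a boto3 S3 client, e.g. boto3.client("s3")."""
    return s3_client.restore_object(
        Bucket=bucket,
        Key=key,
        RestoreRequest=build_restore_request(days, tier),
    )


# Example call (names are placeholders): keep the copy for 7 days, cheapest tier.
# import boto3
# request_restore(boto3.client("s3"), "my-bucket", "archive/2018.tar", 7, "Bulk")
```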
While basic restore functionality has been a core part of Amazon S3 Glacier since the beginning, AWS has been steadily adding features over time. Let’s check out the features available since 2018.
Upgrading Amazon S3 Glacier Restore Speed
As of November 2018, AWS allows you to upgrade the speed of an in-progress Glacier restore if it is taking too long.
To do this, you make a new restore request on the same object. The new restore request must use a faster restore speed than the existing restore; you cannot downgrade your restore speed.
You may not change other details about your restore after it has been started, such as the number of days to retain your objects after they are restored.
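The upgrade rule is easy to check before you send the second request. The sketch below only encodes the ordering described in this section (bulk is slowest, expedited is fastest). The function name is hypothetical and it makes no AWS calls.

```python
# Restore tiers ordered from slowest to fastest.
TIER_RANK = {"Bulk": 0, "Standard": 1, "Expedited": 2}


def is_valid_upgrade(current_tier: str, requested_tier: str) -> bool:
    """Return True only if requested_tier is strictly faster than
    the tier of the restore already in progress."""
    return TIER_RANK[requested_tier] > TIER_RANK[current_tier]


if __name__ == "__main__":
    print(is_valid_upgrade("Bulk", "Expedited"))  # faster: allowed
    print(is_valid_upgrade("Standard", "Bulk"))   # slower: not allowed
```

A check like this could run before resubmitting the restore request, so a downgrade attempt is caught locally instead of being rejected by the service.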
Receiving S3 Glacier Restore Notifications
One of the downsides of using Amazon S3 Glacier is that restoring an object takes a variable amount of time, so you had to keep checking whether the restore had completed.
In November 2018, AWS added the ability to receive S3 event notifications when the restore has finished. Rather than polling for the object to exist, you can receive a notification in an SQS queue, an SNS topic, or a Lambda function once your restore has completed.
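As one possible setup, the notification for a finished restore can be routed to an SQS queue. The sketch assumes `s3_client` is a boto3 S3 client, and the queue ARN and bucket name are placeholders. The queue also needs a policy that lets S3 send messages to it, which is not shown here.

```python
def restore_completed_notification(queue_arn: str) -> dict:
    """Build a notification configuration that sends a message to an
    SQS queue whenever a Glacier restore finishes."""
    return {
        "QueueConfigurations": [
            {
                "QueueArn": queue_arn,
                "Events": ["s3:ObjectRestore:Completed"],
            }
        ]
    }


def enable_restore_notifications(s3_client, bucket: str, queue_arn: str) -> None:
    """Apply the configuration. `s3_client` is assumed to be a boto3
    S3 client. Note that this call replaces any notification
    configuration already set on the bucket."""
    s3_client.put_bucket_notification_configuration(
        Bucket=bucket,
        NotificationConfiguration=restore_completed_notification(queue_arn),
    )


# Example call (ARN and bucket are placeholders):
# import boto3
# enable_restore_notifications(
#     boto3.client("s3"), "my-bucket",
#     "arn:aws:sqs:us-east-1:123456789012:restore-events",
# )
```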
In this post, we covered the powerful Amazon S3 Glacier service. S3 Glacier is a reliable, cost-effective way to store low-usage data for long periods. We covered the storage options in S3 Glacier, how to add data to S3 Glacier, and how to restore data from S3 Glacier.