Process for Faster Restores from Amazon S3 Glacier

This page describes the process for faster restores from Amazon S3 Glacier Flexible Retrieval, which uses Amazon S3 Batch Operations to optimize and accelerate the restore operation.

Process

The application owner or cloud ops administrator initiates a restore using Commvault Command Center, the CLI, or the REST API.

No special configurations or settings are required.

Commvault automates the Amazon S3 Glacier Faster Restore. Commvault automatically identifies the Amazon S3 objects that are required to restore the instances and passes those objects to the Commvault Cloud Archive Recall workflow.
The Cloud Archive Recall workflow synthesizes an S3 Batch Operations CSV format manifest and uploads it to s3BatchOperationsRestore/CVRestoreJobId-nnn/manifest.csv within the S3 bucket (cloud library) that is used for the restore.

The manifest file is placed in the S3 Standard storage class. Each row of the manifest contains the bucket name and the key of an object to restore, separated by a comma (bucket_name,objectkey_to_restore). For example:

cvltbkp,QXB5SZ_06.04.2023_19.51/CV_MAGNETIC/V_1/CHUNK_1/CHUNK_META_DATA_1.FOLDER/0

cvltbkp,QXB5SZ_06.04.2023_19.51/CV_MAGNETIC/V_5/CHUNK_4/CHUNK_META_DATA_4.FOLDER/0

cvltbkp,QXB5SZ_06.04.2023_19.51/CV_MAGNETIC/V_6/CHUNK_7/CHUNK_META_DATA_7.FOLDER/0

cvltbkp,QXB5SZ_06.04.2023_19.51/CV_MAGNETIC/V_4/CHUNK_8/CHUNK_META_DATA_4.FOLDER/0

Note

Commvault does not specify the (optional) S3 object version in the manifest file. Only the most recent version of each object is used for the recovery.

For the manifest file format, see Specifying a manifest.
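
The manifest layout described above can be sketched in a few lines of Python. This is only an illustration, not Commvault's implementation: the function name build_manifest is hypothetical, and the URL-encoding of keys follows the AWS requirement for S3 Batch Operations manifests rather than anything stated on this page.

```python
import csv
import io
from urllib.parse import quote


def build_manifest(bucket, keys):
    """Return CSV manifest text with one 'bucket,key' row per object.

    Hypothetical helper for illustration. No version ID column is
    written, which matches the note above: only the most recent
    version of each object is restored.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for key in keys:
        # AWS expects object keys in the manifest to be URL-encoded;
        # "/" is left as-is so the key layout stays readable.
        writer.writerow([bucket, quote(key, safe="/")])
    return buf.getvalue()


manifest = build_manifest("cvltbkp", [
    "QXB5SZ_06.04.2023_19.51/CV_MAGNETIC/V_1/CHUNK_1/CHUNK_META_DATA_1.FOLDER/0",
    "QXB5SZ_06.04.2023_19.51/CV_MAGNETIC/V_5/CHUNK_4/CHUNK_META_DATA_4.FOLDER/0",
])
print(manifest, end="")
```

The keys in the example contain only characters that do not need encoding, so the rows come out in the same form as the sample manifest shown earlier.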
The Cloud Archive Recall workflow compute node (typically, a Commvault MediaAgent or cloud access node) contacts the S3 Batch Operations regional service endpoint (https://account-id.s3-control.region.amazonaws.com), and does the following:
- Submits a new s3:CreateJob request
- Receives an S3 Batch Operations Job Id
Commvault passes the ConfirmationRequired = false parameter as part of the s3:CreateJob request.

For Amazon S3 Glacier Flexible Retrieval and Amazon S3 Glacier Deep Archive, a Restore Object batch operation creates a temporary copy of each requested object and deletes the copy after ExpirationInDays days have elapsed. Commvault sets ExpirationInDays to 7 because the data is not needed after the restore completes. To lower the storage cost of S3 Glacier Flexible Retrieval and S3 Glacier Deep Archive restores, reduce the number of days with the AutoCloudRecallExpireDays setting. For Amazon S3 Intelligent-Tiering, the objects are moved back into the Frequent Access tier.

For more information, see Restoring objects from the S3 Intelligent-Tiering Archive Access and Deep Archive Access tiers.

Note

When you restore an archived object from S3 Glacier Flexible Retrieval or S3 Glacier Deep Archive, you pay for both the archive and a copy that you restored temporarily. For more information about pricing, see Amazon S3 pricing.

The ExpirationInDays value can be configured using the AutoCloudRecallExpireDays entity setting.
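
As a rough sketch of the s3:CreateJob request described in this step, the following Python builds the request parameters in the shape that the boto3 s3control create_job call accepts. Everything not named on this page is an assumption made for illustration: the helper name, the account ID, the role ARN, the ETag, the retrieval tier, the priority, and the disabled completion report. The actual values that Commvault submits are not documented here.

```python
def build_restore_job_request(account_id, role_arn, manifest_arn, manifest_etag,
                              expiration_days=7, tier="BULK"):
    """Build keyword arguments for an S3 Batch Operations restore job.

    Hypothetical helper. 'tier' (BULK or STANDARD), Priority, and the
    disabled Report are illustrative assumptions, not Commvault values.
    """
    return {
        "AccountId": account_id,
        # The job starts without waiting for manual confirmation.
        "ConfirmationRequired": False,
        "Operation": {
            "S3InitiateRestoreObject": {
                # Lifetime of the temporary restored copy, in days.
                "ExpirationInDays": expiration_days,
                "GlacierJobTier": tier,
            }
        },
        "Manifest": {
            "Spec": {
                "Format": "S3BatchOperations_CSV_20180820",
                # Bucket and key only; no version ID column.
                "Fields": ["Bucket", "Key"],
            },
            "Location": {"ObjectArn": manifest_arn, "ETag": manifest_etag},
        },
        "Report": {"Enabled": False},
        "Priority": 10,
        "RoleArn": role_arn,
    }


request = build_restore_job_request(
    account_id="111122223333",                               # placeholder
    role_arn="arn:aws:iam::111122223333:role/example-role",  # placeholder
    manifest_arn="arn:aws:s3:::cvltbkp/s3BatchOperationsRestore/CVRestoreJobId-nnn/manifest.csv",
    manifest_etag="example-etag",                            # placeholder
)
print(request["Operation"])
```

With AWS credentials in place, a request like this could be submitted with boto3.client("s3control").create_job(**request), and the job ID read from the JobId field of the response.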
The Cloud Archive Recall workflow polls the Amazon S3 objects that are being restored by periodically running a HEAD Object request (s3:HeadObject) against each object until all objects are restored.

The default polling interval is 240 minutes (4 hours) and can be customized using the nCloudChunkRecallSleepIntervalMins entity setting.

Commvault tested restore performance with Amazon S3 Glacier Faster Restores and found that the following polling intervals minimize restore time:
- S3 Glacier Flexible Retrieval: 15 minutes
- S3 Glacier Deep Archive: 540 minutes
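
The polling behavior can be pictured with a minimal Python sketch. It is not the workflow's actual code: wait_for_restore and the is_restored callback are hypothetical names, and the check itself is passed in as a function so that the loop can be shown without calling AWS. In practice the callback would issue a HEAD Object request for the key.

```python
import time


def wait_for_restore(keys, is_restored, interval_minutes=240, sleep=time.sleep):
    """Poll until every key is restored; return the number of polling rounds.

    Hypothetical helper. 'is_restored' is any callable that takes an
    object key and returns True once that object has been restored.
    """
    pending = set(keys)
    rounds = 0
    while pending:
        # Re-check only the objects that were not yet restored.
        pending = {key for key in pending if not is_restored(key)}
        rounds += 1
        if pending:
            sleep(interval_minutes * 60)
    return rounds


# Simulated run: "a" is ready on the first check, "b" on the third.
checks = {"a": 0, "b": 0}
ready_after = {"a": 1, "b": 3}


def fake_is_restored(key):
    checks[key] += 1
    return checks[key] >= ready_after[key]


slept = []
rounds = wait_for_restore(["a", "b"], fake_is_restored,
                          interval_minutes=15, sleep=slept.append)
print(rounds, slept)
```

Objects that are already restored drop out of the pending set, so later rounds only re-check what is still outstanding. A shorter interval notices completion sooner at the cost of more HEAD Object requests.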
The x-amz-restore field that is returned by s3:HeadObject shows whether a restore activity is currently in progress for the object or is completed with the expiry timer started:

x-amz-restore: ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"

If the object restoration is in progress, the header returns the value ongoing-request="true". For more information, see s3:HeadObject – Response Elements (x-amz-restore).
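
As an illustration of how the x-amz-restore value might be interpreted, the following Python sketch splits the header into its two fields. The function name parse_restore_header is hypothetical, and the parsing relies only on the header format shown above.

```python
import re
from email.utils import parsedate_to_datetime


def parse_restore_header(value):
    """Return (ongoing, expiry) parsed from an x-amz-restore header value.

    'ongoing' is True while the restore is still in progress. 'expiry'
    is a datetime when an expiry-date is present, otherwise None.
    """
    fields = dict(re.findall(r'([\w-]+)="([^"]*)"', value))
    ongoing = fields.get("ongoing-request") == "true"
    expiry_text = fields.get("expiry-date")
    expiry = parsedate_to_datetime(expiry_text) if expiry_text else None
    return ongoing, expiry


ongoing, expiry = parse_restore_header(
    'ongoing-request="false", expiry-date="Fri, 21 Dec 2012 00:00:00 GMT"'
)
print(ongoing, expiry)
```

When boto3 is used, the same string is available in the Restore field of the head_object response. An object is ready to read once ongoing is False and an expiry date is present.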
The Cloud Archive Recall workflow reports success to the Commvault Job Manager, which then continues to restore the compute or containerized instance to the preferred location.