Monitoring and Troubleshooting Faster Restores from Amazon S3 Glacier

This page describes tools for monitoring and troubleshooting faster restores for Amazon S3 Glacier.

Commvault Jobs

Each restore appears as two jobs in the Commvault Command Center Job Manager:

The initial restore job (parent)
The Cloud Storage Archive Recall job (child)

Commvault Log Files

Active faster restores from S3 Glacier (which are initiated by the Cloud Storage Archive Recall workflow) cannot be monitored using the Commvault Job Monitor. To monitor restore progress interactively, consult the following log files on the MediaAgent that is performing the restore:

WorkflowEngine.log: On the host that executes the Cloud Archive Recall workflow. Contains details of the initiation of the workflow.
WorkflowCustom.log: On the host that executes the Cloud Archive Recall workflow. Contains detailed parameters and the command-line options that are passed to the CloudChunkRecall.exe utility on the MediaAgent. Summarizes the number of successful and unsuccessful objects for troubleshooting partial or complete restore failures.
CloudChunkRecall.log: Contains detailed information, including the Amazon S3 Batch Operations job ID, the total number of objects requested, the total number of objects restored, and the total number of objects that remain. Increasing the debug level to 3 enables detailed diagnostic reporting on the progress of the Cloud Archive Recall restore.
CloudActivity.log: Contains API requests made to the Amazon S3 service endpoints. Increasing the debug level to 3 includes full API request headers for troubleshooting.
CloudStats.log: Contains performance metrics on the volume of data transferred and the transfer speeds (throughput, latency) observed between the S3 endpoint and the Commvault MediaAgent.
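
When reviewing these logs for one restore, it can help to filter for a job ID or for error keywords. The following is a minimal sketch, not a Commvault tool: the function name matching_lines is hypothetical, and it assumes nothing about the log line format beyond plain text, so it only does case-insensitive substring matching.

```python
from typing import Iterable, Iterator


def matching_lines(lines: Iterable[str], terms: Iterable[str]) -> Iterator[str]:
    """Yield log lines that contain any of the given terms (case-insensitive)."""
    lowered = [term.lower() for term in terms]
    for line in lines:
        text = line.lower()
        if any(term in text for term in lowered):
            # Strip the trailing newline so callers can print or collect cleanly.
            yield line.rstrip("\n")
```

For example, opening CloudChunkRecall.log from the MediaAgent log directory (the exact path depends on your installation) and passing the file handle together with ["342", "error"] yields only the lines that mention job 342 or contain the word "error".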

Logging Amazon S3 API Calls Using AWS CloudTrail

Amazon S3 (including S3 Batch Operations) is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, a role, or an AWS service in Amazon S3. You can use CloudTrail to capture a subset of API calls for Amazon S3 as events, including the submission of the s3:CreateJob request.

If you create a trail, you can enable continuous delivery of CloudTrail events to an Amazon S3 bucket, or you can view the most recent events in the CloudTrail console in Event history.

Using the information collected by CloudTrail, you can determine the s3:CreateJob request that was made to Amazon S3, the IP address from which the request was made, who made the request, when the request was made, and additional details (such as the location of the Completion reports).
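
As an illustration of the details listed above, the following sketch extracts the who, when, and where of CreateJob requests from one CloudTrail log file that has already been parsed from JSON. It relies on the standard CloudTrail record fields (eventName, eventTime, sourceIPAddress, userIdentity, requestParameters). The function name create_job_events is hypothetical, and matching on the event name "CreateJob" alone is an assumption that you may want to tighten by also checking eventSource.

```python
from typing import Any, Dict, List


def create_job_events(log_document: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Summarize CreateJob events from one parsed CloudTrail log file."""
    summaries = []
    # CloudTrail log files wrap their events in a top-level "Records" list.
    for record in log_document.get("Records", []):
        if record.get("eventName") != "CreateJob":
            continue
        summaries.append({
            "when": record.get("eventTime"),
            "who": record.get("userIdentity", {}).get("arn"),
            "source_ip": record.get("sourceIPAddress"),
            # Request details, which may include the completion report settings.
            "request": record.get("requestParameters"),
        })
    return summaries
```

Load a delivered log file with json.load and pass the resulting dictionary to the function. Records with other event names are skipped.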

S3 Batch Operation Completion Report

Commvault requests that Amazon S3 write a completion report to the Commvault cloud storage location from which the restore is performed. The report is written under the following prefix:

s3BatchOperationsRestore/CVRestoreJobId-Commvault-Cloud-Storage-Archive-Recall-Workflow-Job-ID

To investigate failures, you can use the Commvault Restore Job Id to locate the appropriate CVRestoreJobId-CVWorkflowJobID prefix and completion report. For more information about how unsuccessful operations are handled and logged, see Tracking job failure.

The s3BatchOperationsRestore/CVRestoreJobId-Commvault-Cloud-Storage-Archive-Recall-Workflow-Job-ID folder contains a job folder for each s3:CreateJob that is executed. For example:

s3://source-bucket/s3BatchOperationsRestore/
   CVRestoreJobId-342/
      job-410b054c-be59-47ae-b04b-a713c148bedb/
         manifest.json
         manifest.json.md5
      results/
         e2ce4b092a4a670a58fa8d412e5a975658b5d49b.csv

For information about interpreting completion reports, see Examples: S3 Batch Operations completion reports.
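
A downloaded completion report can be tallied with a few lines of code. The sketch below assumes the report CSV has no header row and that its columns begin with bucket, key, version ID, and task status, as in the AWS completion report examples. Verify this against your own report before relying on it. The function name summarize_report is hypothetical.

```python
import csv
import io
from collections import Counter
from typing import List, Tuple


def summarize_report(csv_text: str) -> Tuple[Counter, List[str]]:
    """Count task statuses and collect the keys of objects that did not succeed."""
    status_counts: Counter = Counter()
    failed_keys: List[str] = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 4:
            continue  # Skip blank or malformed lines.
        # Assumed column order: bucket, key, version ID, task status, ...
        key, status = row[1], row[3].lower()
        status_counts[status] += 1
        if status != "succeeded":
            failed_keys.append(key)
    return status_counts, failed_keys
```

The returned counter gives the number of objects per status, and the list of keys shows which objects to investigate further in CloudChunkRecall.log.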

Note

Commvault will not clean up or remove the uploaded manifest files or job completion reports, so that other data analytics activities can be performed. Commvault recommends configuring S3 Lifecycle policies to archive or delete these files in accordance with your business data strategy.
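
As one possible starting point, the sketch below builds an S3 Lifecycle configuration that expires objects under the s3BatchOperationsRestore/ prefix. The rule ID, the 90-day retention period, and the function name report_expiration_rule are illustrative assumptions, not Commvault recommendations. Choose values that match your own data strategy.

```python
from typing import Any, Dict


def report_expiration_rule(prefix: str = "s3BatchOperationsRestore/",
                           days: int = 90) -> Dict[str, Any]:
    """Build a lifecycle configuration that expires objects under a prefix."""
    return {
        "Rules": [
            {
                "ID": "expire-batch-restore-reports",  # Illustrative rule name.
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }

# Applying it requires boto3 and AWS credentials (not executed here):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="source-bucket",
#       LifecycleConfiguration=report_expiration_rule(),
#   )
```

Note that put_bucket_lifecycle_configuration replaces the bucket's entire existing lifecycle configuration, so merge this rule with any rules already in place before applying it.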

Tracking Job Status and Completion Reports for Faster Restores from Amazon S3 Glacier

Commvault requests that the Amazon S3 Batch Operations job write a completion report to the Amazon S3 bucket that the restore is running from. You can find the reports under the following prefix:

s3BatchOperationsRestore/CVRestoreJobId-Commvault-CloudArchiveRecallWorkflow-Job-Id/CompletionReport/Amazon-S3-Batch-Operation-Job-ID

Commvault summarizes the number of successful and unsuccessful objects restored in Log Files/WorkflowCustom.log on the MediaAgent that performs the restore, for troubleshooting partial or complete restore failures.

To investigate failures, you can use the Commvault Cloud Archive Recall Workflow Job Id to locate the appropriate CVRestoreJobId-nnn prefix and completion report.

For more information about how S3 Batch Operations handles and logs unsuccessful operations, see Tracking job failure.
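
To locate a report from the IDs found in the logs, the prefix pattern described in this section can be assembled programmatically. This is a minimal sketch that follows the pattern as written on this page. The function name completion_report_prefix is hypothetical, and the exact form of the batch job ID segment should be confirmed against what appears in your bucket.

```python
def completion_report_prefix(workflow_job_id: int, batch_job_id: str) -> str:
    """Build the completion report prefix from the two job identifiers."""
    return (
        "s3BatchOperationsRestore/"
        f"CVRestoreJobId-{workflow_job_id}/"
        "CompletionReport/"
        f"{batch_job_id}"
    )
```

The resulting string can be used as the prefix when listing objects in the bucket, for example in the S3 console search box or in a list-objects request.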
