Verifying the Deduplicated Data
Use Data Verification on the Deduplication Engine to verify the deduplicated data managed by the deduplication database (DDB).
About This Task
Deduplicated Data Verification cross-verifies the unique data blocks on disk with the information contained in the DDB and the CommServe database. Verifying deduplicated data ensures that all jobs that are written as unique data blocks to the storage media are valid for restore or Auxiliary Copy operations.
The jobs containing invalid data blocks are marked with the Failed status. These invalid unique data blocks will not be referenced by the subsequent jobs. As a result, new baseline data for the invalid unique data blocks is written to the storage media.
For storage policies with a silo copy, if the backup data volumes have been moved to silo storage, the necessary volumes are automatically restored to disk during the data verification job.
Tip: By default, the system-created data verification schedule policy is not configured with data mover MediaAgents that use a cloud library, because read operations from the cloud are very slow and are not cost effective.
During the data verification process, the DDB is consulted until the verification process is complete. Therefore, before you run a data verification job, make sure that the following jobs are not running against the storage policy that is in effect for the data verification:
- DDB Move
- DDB Reconstruction
If either of the jobs listed above is running, the data verification job does not start and an error message is generated. Wait for those jobs to complete, and then run the data verification job.
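The precondition above can be sketched as a small check. This is only an illustrative model: the job type names come from the text, but the function names and the tuple shape used for running jobs are hypothetical and do not correspond to any product API.

```python
# Hypothetical model of the precondition described above.
# Job type names are taken from the text; everything else is illustrative.
BLOCKING_JOB_TYPES = {"DDB Move", "DDB Reconstruction"}


def blocking_jobs(running_jobs, storage_policy):
    """Return the running jobs that would keep data verification from starting.

    running_jobs: iterable of (job_type, storage_policy_name) tuples.
    """
    return [
        (job_type, policy)
        for job_type, policy in running_jobs
        if policy == storage_policy and job_type in BLOCKING_JOB_TYPES
    ]


def can_start_verification(running_jobs, storage_policy):
    """True only when no DDB Move or DDB Reconstruction job is active for the policy."""
    return not blocking_jobs(running_jobs, storage_policy)
```

For example, a running DDB Move job against the same storage policy makes `can_start_verification` return `False`, while an ordinary backup job does not affect the result.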
- From the CommCell Browser, expand Storage Resources > Deduplication Engines > storage_policy_copy.
- Right-click the appropriate deduplication database, and then click All Tasks > Run Data Verification.
- In the Data Verification dialog box, select the appropriate options to verify deduplicated data:
  - Run a full or incremental data verification job.
    - For a full data verification job, clear the Run Incremental Verification check box.
    - For an incremental data verification job, select the Run Incremental Verification check box.
  - In the Data Verification Options area, choose one of the following options to run deduplicated data verification:
Option: Verification of Deduplication Database
When to use: Use this option to identify only unreadable or inaccessible data blocks so that new backups refer only to valid data blocks. This option does not ensure that the existing backup jobs are restorable.
Applies to: Full and Incremental Data Verification Job

Option: Quick Verification of Deduplication Database
When to use: Use this option for a quick verification by checking the presence of the data blocks on the disk so that the new backup jobs refer only to the valid data blocks. The job also identifies all the files that can be defragmented and logs the details in the DDBMountpathInfo.log in the data mover MediaAgents. In comparison with the Complete Verification of Existing Jobs on Disk and Deduplication Database and Verification of Deduplication Database options, this option is faster because it does not read the data blocks on the disk. Instead, it ensures that both the DDB and disk are in sync.
Applies to: Full Data Verification Job

Option: Verification of Existing Jobs on Disk and Deduplication Database
When to use: Use this option if you want to verify all existing backups and to ensure that the new backups refer only to valid data blocks. This option ensures that existing backup jobs are restorable and can be copied during Auxiliary Copy operations.
Applies to: Full and Incremental Data Verification Job

Option: Reclaim idle space on Mount Paths with no drill hole capability*
When to use: For v10 and v11 deduplication database, use this option to reclaim idle space on disk mount paths that do not support sparse files by deleting the following:
- Invalid data blocks
- Orphan chunks
Exception: The idle space reclamation on mount paths with no drill hole capability is not supported if the DDB store is using a mount path that is shared using the DataServer-IP.
Applies to: Full Data Verification Job

*Contact Customer Support to enable this feature.
For more information, see Data Verification Options.
- Click OK.
Note: Deduplicated data verification is automatically associated with the System Created DDB Verification schedule policy. This schedule policy runs an incremental deduplicated data verification job every day at 3:00 AM on all the active DDBs available in the CommCell, using the Verification of Existing Jobs on Disk and Deduplication Database option.
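To reason about when the system-created schedule will next fire, the daily 3:00 AM rule in the note can be sketched as follows. This is a minimal illustration that assumes naive local time; which clock or time zone the schedule policy actually uses is not stated here, and the function name is hypothetical.

```python
from datetime import datetime, time, timedelta

# 3:00 AM daily, per the note above. Time zone handling is deliberately
# omitted; which clock the schedule follows is an assumption.
DAILY_RUN_TIME = time(3, 0)


def next_scheduled_run(now):
    """Return the next 3:00 AM occurrence strictly after `now`."""
    candidate = datetime.combine(now.date(), DAILY_RUN_TIME)
    if candidate <= now:
        # Today's 3:00 AM has already passed (or is exactly now): use tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

Calling `next_scheduled_run(datetime(2024, 1, 1, 2, 0))` yields 3:00 AM on the same day, whereas any time at or after 3:00 AM rolls over to the following day.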
A deduplicated data verification job is displayed in the Job Controller window.
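As a quick reference, the applicability of the options described in the procedure above can be captured in a small lookup. The option names and their applicability are taken from this page; the dictionary and helper function are an illustrative sketch, not part of any product interface.

```python
# Option names and applicability come from the options described in the
# procedure. The structure itself is hypothetical and for illustration only.
OPTION_APPLIES_TO = {
    "Verification of Deduplication Database": {"full", "incremental"},
    "Quick Verification of Deduplication Database": {"full"},
    "Verification of Existing Jobs on Disk and Deduplication Database": {"full", "incremental"},
    # Requires Customer Support to enable, per the footnote in the procedure.
    "Reclaim idle space on Mount Paths with no drill hole capability": {"full"},
}


def options_for(job_type):
    """Return the option names valid for a 'full' or 'incremental' job."""
    if job_type not in ("full", "incremental"):
        raise ValueError("job_type must be 'full' or 'incremental'")
    return [name for name, kinds in OPTION_APPLIES_TO.items() if job_type in kinds]
```

With this lookup, `options_for("incremental")` returns only the two options that support incremental verification, which makes it clear that quick verification and idle space reclamation require a full job.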
- By default, a deduplicated data verification job uses 20 streams during the Validate Data phase and 50 streams during the Verify Data phase. You can modify the default values by using the Maximum number of threads to be used during Validate Deduplicated Data phase of Data Verification Job parameters in the Media Management Configuration dialog box. For instructions, see Media Management Configuration.
- While the DDB data verification job is running, data pruning for the DDB does not occur until phase 1 of the DDB data verification is complete. However, you can manually delete the retained jobs. For more information on deleting a job, see Delete a job from the Copy.
- While the DDB data verification job is running, you can run backups and auxiliary copy operations if the DDB and the data mover MediaAgents are at v11. The DDB store version can be v9.0, v10.0, or v11.0.
- An incremental DDB data verification job can run only if the DDB and the data mover MediaAgents are at v11. The DDB store version can be v9.0, v10.0, or v11.0.
- When the job is complete, you can view the deduplicated data verification job history from the CommServe node and the data verification status for the backup jobs from the storage policy level.
- If a deduplicated data verification job goes into the pending state, the job attempts to run five times, once every 20 minutes. If the job exceeds five attempts, the job status is marked as failed.
- If a deduplicated data verification job that uses the Verification of Existing Jobs on Disk and Deduplication Database option is killed during the Verify Data phase, any backup job that was not verified during that phase does not have its data verification status updated. In this scenario, rerun the deduplicated data verification job.
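The pending-state retry behavior described in the list above (five attempts, 20 minutes apart) can be modeled roughly as shown below. The sketch assumes that the first retry happens 20 minutes after the job enters the pending state and that the job is marked failed once all five attempts are used; both the exact timing and the function names are assumptions made for illustration.

```python
from datetime import datetime, timedelta

MAX_ATTEMPTS = 5                       # from the text above
RETRY_INTERVAL = timedelta(minutes=20)  # from the text above


def retry_times(pending_since):
    """Assumed retry timestamps: one attempt per interval after entering pending."""
    return [pending_since + RETRY_INTERVAL * n for n in range(1, MAX_ATTEMPTS + 1)]


def status_after(failed_attempts):
    """Return 'pending' while attempts remain, 'failed' once all are used (assumed)."""
    return "failed" if failed_attempts >= MAX_ATTEMPTS else "pending"
```

Under these assumptions, a job that enters the pending state at 3:00 AM would make its last attempt at 4:40 AM, so a job that is still pending well after that window is worth investigating.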