Performing a Data Verification Operation on Deduplicated Data

Use the Data Verification feature on the Deduplication Engine to verify the deduplicated data managed by the deduplication database (DDB).

Before You Begin

During the data verification process, the DDB is consulted until the verification process is complete. Therefore, before you run a data verification job, make sure that the following jobs are not running against the storage policy that is in effect for the data verification:

  • DDB Move

  • DDB Reconstruction

If either of these jobs is running, the data verification job does not start and an error message is generated. Wait for the job to complete, and then run the data verification job.
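The prerequisite check above can be sketched in a few lines of Python. This is only an illustration: the `running_jobs` list, the storage policy names, and the `blocking_jobs` helper are hypothetical and do not correspond to any product API.

```python
# Job types that prevent a data verification job from starting,
# as listed in the prerequisites above.
BLOCKING_JOB_TYPES = {"DDB Move", "DDB Reconstruction"}


def blocking_jobs(running_jobs, storage_policy):
    """Return the running jobs that would block data verification.

    `running_jobs` is a hypothetical list of (job_type, storage_policy)
    tuples, used here only for illustration.
    """
    return [
        (job_type, policy)
        for job_type, policy in running_jobs
        if policy == storage_policy and job_type in BLOCKING_JOB_TYPES
    ]


# Hypothetical sample data.
running = [
    ("DDB Move", "SP_Primary"),
    ("Backup", "SP_Primary"),
    ("DDB Reconstruction", "SP_Archive"),
]

conflicts = blocking_jobs(running, "SP_Primary")
if conflicts:
    print("Wait for these jobs to complete:", conflicts)
else:
    print("Data verification can start.")
```

Only jobs that run against the same storage policy count as conflicts, so the DDB Reconstruction job on the other policy in the sample data is ignored.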

Procedure

By default, the deduplicated data verification is automatically associated with the System Created DDB Verification schedule policy. This schedule policy runs an incremental deduplicated data verification job every day at 11:00 AM on all the active DDBs in the CommCell that have the Verification of Existing Jobs on Disk and Deduplication Database check box selected. You can also run the data verification job manually as follows:

  1. From the CommCell Browser, expand Storage Resources > Deduplication Engines > storage_policy_copy.

  2. Right-click the appropriate deduplication database, and then click All Tasks > Run Data Verification.


  3. In the Data Verification dialog box, select the appropriate options to verify deduplicated data:

    1. Choose whether to run a full or an incremental data verification job.

      • For a full data verification job, clear the Run Incremental Verification check box.

      • For an incremental data verification job, select the Run Incremental Verification check box.

        Default: Selected.

        Note

        Incremental DDB data verification runs only if the DDB and the data mover MediaAgents are at v11. The DDB store version can be v9.0, v10.0, or v11.0.

    2. In the Data Verification Options area, choose one of the following options to run deduplicated data verification:

      Option: Verification of Deduplication Database

      When to use: Use this option to identify only unreadable or inaccessible data blocks so that new backups refer only to valid data blocks.

      This option does not ensure that the existing backup jobs are restorable.

      Applies to: Full and Incremental Data Verification Job

      Option: Quick Verification of Deduplication Database

      When to use: Use this option for a quick verification by checking the presence of the data blocks on the disk so that the new backup jobs refer only to the valid data blocks. The job also identifies all the files that can be defragmented and logs the details in the DDBMountpathInfo.log in the data mover MediaAgents.

      In comparison with the Complete Verification of Existing Jobs on Disk and Deduplication Database and Verification of Deduplication Database options, this option is faster because it does not read the data blocks on the disk. Instead, it ensures that both the DDB and disk are in sync.

      Applies to: Full and Incremental Data Verification Job

      Option: Verification of Existing Jobs on Disk and Deduplication Database

      When to use: Use this option if you want to verify all existing backups and to ensure that the new backups refer only to valid data blocks.

      This option validates if the existing backup jobs are valid for restores and can be copied during Auxiliary Copy operations.

      Applies to: Full and Incremental Data Verification Job

      Option: Reclaim idle space on Mount Paths

      When to use: Use this option to reclaim idle space on disk mount paths that do not support sparse files by deleting the following:

        • Invalid data blocks

        • Orphan chunks

      After selecting this option, an Orphan Chunk Listing phase is run during the space reclamation job and the orphan chunks and blocks are pruned during the defragmentation phase.

      However, if you want to modify the behavior of this option, then you can use the following script:

        qoperation execscript -sn SetKeyIntoGlobalParamTbl.sql -si EnableOrphanChunkListing -si <y or n> -si <Valid Values>

      Where Valid Values are:

        1: Space that is reclaimable from the orphan chunks and orphan blocks is calculated.

        2: Orphan chunks are pruned.

        4: Orphan blocks are pruned.

        6: This is the default value; both orphan chunks and orphan blocks are pruned.

      Applies to: Full and Incremental Data Verification Job

    3. In the No of Streams to be used in Parallel area, choose one of the following options:

      To configure a specific number of streams to use for verifying backups during the data verification operation, click Number of Streams and type the number.

      To use the maximum number of streams during the data verification operation, click Allow Maximum.

      1. If no streams are specified and the Allow Maximum check box is selected, then 20 streams are used during the Verify Data phase and 50 streams are used during the Validate Data phase.

      2. If the number of streams specified is less than 50, for example 15, then 15 streams are used during both the Verify Data phase and the Validate Data phase.

      3. If the number of streams specified is more than 50, then 50 streams are used during both the Verify Data phase and the Validate Data phase.

  4. Click OK.
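The stream rules in step 3 can be summarized in a short Python sketch. The function and constant names are hypothetical and are not part of the product. The procedure does not say what happens when exactly 50 streams are specified, so the sketch assumes that the value is used as is.

```python
MAX_STREAMS = 50             # upper limit stated in the procedure
DEFAULT_VERIFY_STREAMS = 20  # Verify Data phase value when Allow Maximum is selected


def resolve_streams(number_of_streams=None, allow_maximum=False):
    """Return (verify_data_streams, validate_data_streams).

    Hypothetical helper that mirrors the three rules in step 3.
    """
    if number_of_streams is None:
        if not allow_maximum:
            raise ValueError("Specify a number of streams or select Allow Maximum.")
        # Rule 1: no streams specified and Allow Maximum selected.
        return DEFAULT_VERIFY_STREAMS, MAX_STREAMS
    if number_of_streams < 1:
        raise ValueError("The number of streams must be at least 1.")
    # Rules 2 and 3: use the specified number, capped at 50, in both phases.
    # Assumption: a value of exactly 50 is used unchanged.
    capped = min(number_of_streams, MAX_STREAMS)
    return capped, capped


print(resolve_streams(allow_maximum=True))  # (20, 50)
print(resolve_streams(15))                  # (15, 15)
print(resolve_streams(80))                  # (50, 50)
```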

For more information, see Data Verification Options.
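The valid values for EnableOrphanChunkListing listed under Reclaim idle space on Mount Paths look like combinable flags, because the default value 6 equals 2 + 4. The following Python sketch models them that way and fills in the placeholders of the documented script. The flag interpretation, the enum, and the `build_command` helper are assumptions made for illustration. The sketch accepts only the four values that the procedure lists.

```python
from enum import IntFlag


class OrphanChunkListing(IntFlag):
    """Hypothetical names for the documented values 1, 2, and 4."""
    CALCULATE_RECLAIMABLE_SPACE = 1
    PRUNE_ORPHAN_CHUNKS = 2
    PRUNE_ORPHAN_BLOCKS = 4


# Documented default (6): prune both orphan chunks and orphan blocks.
DEFAULT_VALUE = (
    OrphanChunkListing.PRUNE_ORPHAN_CHUNKS | OrphanChunkListing.PRUNE_ORPHAN_BLOCKS
)

# Only the values listed in the procedure are accepted here.
VALID_VALUES = {1, 2, 4, 6}


def build_command(y_or_n, value):
    """Fill in the <y or n> and <Valid Values> placeholders of the script."""
    if y_or_n not in ("y", "n"):
        raise ValueError("Expected 'y' or 'n'.")
    if int(value) not in VALID_VALUES:
        raise ValueError("Expected one of 1, 2, 4, or 6.")
    return (
        "qoperation execscript -sn SetKeyIntoGlobalParamTbl.sql "
        f"-si EnableOrphanChunkListing -si {y_or_n} -si {int(value)}"
    )


print(int(DEFAULT_VALUE))  # 6
print(build_command("y", DEFAULT_VALUE))
```

The sketch only builds the command string. It does not run the script or contact a CommServe.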

Result

A deduplicated data verification job is displayed in the Job Controller window. You can view the deduplicated data verification job history from the CommServe node and the data verification status for the backup jobs from the storage policy level.

Notes

  • The deduplicated data verification job will appear to resume from the beginning if the job is suspended and resumed during the first phase.

  • When the DDB data verification job is running, you can run backups and auxiliary copy operations if the DDB and the data mover MediaAgents are at v11. The DDB store version can be v9.0, v10.0, or v11.0.

  • If a deduplicated data verification job goes into the pending state, the job attempts to run up to five times, once every 20 minutes. If the job exceeds five attempts, the job status is marked as failed.

  • If a deduplicated data verification job that uses the Verification of Existing Jobs on Disk and Deduplication Database option is killed during the Verify Data phase, the data verification status is not updated for any backup job that was not verified during that phase. In this scenario, rerun the deduplicated data verification job.
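The pending-state retry behavior described in the notes can be illustrated with a small Python sketch. The helper names are hypothetical, and the sketch assumes that the first retry happens 20 minutes after the job enters the pending state, which the notes do not state explicitly.

```python
from datetime import datetime, timedelta

MAX_ATTEMPTS = 5                        # attempts allowed, per the notes
RETRY_INTERVAL = timedelta(minutes=20)  # time between attempts, per the notes


def retry_times(pending_since):
    """Return the times at which a pending job would attempt to run.

    Assumption: the first attempt occurs one interval after the job
    enters the pending state.
    """
    return [pending_since + RETRY_INTERVAL * n for n in range(1, MAX_ATTEMPTS + 1)]


def status_after(attempts_used):
    """Return 'failed' once the job exceeds five attempts, else 'pending'."""
    return "failed" if attempts_used > MAX_ATTEMPTS else "pending"


start = datetime(2024, 1, 1, 11, 0)
for attempt, when in enumerate(retry_times(start), start=1):
    print(f"Attempt {attempt}: {when:%H:%M}")
print(status_after(6))  # failed
```

With these assumptions, a job that goes pending at 11:00 makes its fifth and last attempt at 12:40.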
