Media Management Configuration: Deduplication


Use the following parameters to configure deduplication.

Parameter

Description

Config param to convert reconstruction to full automatically on invalid DDB backup job

Definition: Specifies that the DDB reconstruction job is converted to a full reconstruction job when the last DDB backup job is invalid.

Default Value: 0

Range: 0 (disabled) or 1 (enabled)

Config param to mark jobs verification failed for read errors

Definition: Specifies that, during restore, auxiliary copy, or synthetic full operations, if a data chunk with read errors is found, the Data Verification Status of the job is marked as Failed.

Default Value: 1

Range: 0 (disabled) or 1 (enabled)

Days to keep DDB on source location after successful move partition job

Definition: Specifies the number of days to keep DDB files on the source MediaAgent after a successful move partition job.

Default Value: 0 days

Range: 0 to 30 days

DDB horizontal scaling threshold free space percentage

Definition: Specifies the threshold for the percentage of free space that must be available on a deduplication database partition disk. When the available free space falls below this threshold, the software creates a new deduplication database.

Default Value: 0

Range: 0 to 50

DDB horizontal scaling threshold number of subclients per DDB

Definition: Specifies the threshold for the number of subclients that are associated with a deduplication database after which the software creates a new deduplication database for the storage policy copy. Configure a value other than zero (0) for the parameter to take effect.

Default Value: 0
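The two horizontal scaling thresholds can be read as a simple decision rule. The following Python sketch is illustrative only: the DdbStatus type, its field names, and the needs_new_ddb function are hypothetical, and it assumes that the two thresholds are evaluated independently, that either one can trigger a new DDB, and that reaching the subclient threshold counts as crossing it. The software's actual evaluation logic might differ.

```python
from dataclasses import dataclass


@dataclass
class DdbStatus:
    """Hypothetical snapshot of one DDB; field names are illustrative only."""
    free_space_pct: float   # free space on the DDB partition disk, in percent
    subclient_count: int    # subclients currently associated with the DDB


def needs_new_ddb(status: DdbStatus,
                  free_space_threshold_pct: int = 0,
                  subclient_threshold: int = 0) -> bool:
    """Return True if either threshold suggests that a new DDB is needed.

    Assumptions: the thresholds are independent, either one can trigger
    scaling, and a subclient threshold of 0 means "not in effect".
    """
    if not 0 <= free_space_threshold_pct <= 50:
        raise ValueError("free space threshold must be between 0 and 50")
    if subclient_threshold < 0:
        raise ValueError("subclient threshold cannot be negative")

    # With the default of 0, free space can never fall below the threshold.
    low_space = status.free_space_pct < free_space_threshold_pct
    # The subclient threshold only takes effect for a non-zero value.
    too_many = (subclient_threshold > 0
                and status.subclient_count >= subclient_threshold)
    return low_space or too_many
```

With both parameters left at the default of 0, this sketch never reports that a new DDB is needed, which matches the statement that a non-zero value must be configured for the subclient threshold to take effect.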

DDB Reconstruction prune replay batch count

Definition: Specifies the number of prunable data blocks that can be sent in a batch to a deduplication database.

Default Value: 1000

Range: 1 to 100,000
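To illustrate what a batch count means in practice, the following Python sketch splits a list of prunable block identifiers into batches no larger than the configured value. The function name and the use of integer block identifiers are assumptions made for illustration; the sketch does not represent how the software sends batches internally.

```python
from typing import Iterator, List, Sequence


def prune_replay_batches(block_ids: Sequence[int],
                         batch_count: int = 1000) -> Iterator[List[int]]:
    """Yield successive batches of at most batch_count block identifiers."""
    if not 1 <= batch_count <= 100_000:
        raise ValueError("batch count must be between 1 and 100,000")
    for start in range(0, len(block_ids), batch_count):
        # Slice out the next batch; the final batch may be smaller.
        yield list(block_ids[start:start + batch_count])
```

For example, 2,500 prunable blocks with the default batch count of 1000 produce two batches of 1,000 blocks and one batch of 500 blocks.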

DDB Reconstruction prune replay retry count

Definition: Specifies the number of times that DDB Reconstruction replays the pruned data blocks before it reports a failure.

Default Value: 100,000

Range: 1 to 100,000

Deduplication pruning batch size

Definition: Specifies the number of job records that can be sent in a batch for deduplication pruning from the CommServe to the MediaAgent.

Default Value: 1000

Range: 0 to 100,000

Enable horizontal scaling of DDBs

Definition: When you enable horizontal scaling, the software regularly monitors all deduplication databases (DDBs), including those that existed before horizontal scaling was enabled. When a DDB reaches the system threshold limits, the software marks the DDB as full and stops associating new subclients with it. However, the existing subclients continue to back up to the current DDB.

When you create a storage pool with deduplication, the software creates a DDB with the name StoragePoolName_DDBStoreID. When you perform a backup operation for a subclient, the software renames the DDB to StoragePoolName_SubclientDataType_DDBStoreID. The value of SubclientDataType is Files for File System agents, VMs for virtual machines, and Databases for databases. The software also renames the DDBs that existed before you enabled horizontal scaling. However, the existing subclients of all data types continue to back up to the renamed current DDB. When you perform a backup operation for a subclient of a different data type, the software creates a new DDB for that data type. Any new subclients are associated with a DDB based on their data type.

When a DDB of a data type is marked full upon reaching the system threshold limits, the software automatically creates a new DDB for the data type and associates any new subclients of the data type to the new DDB.

Default Value: 1

Range: 0 (disabled) or 1 (enabled)

Usage: Horizontal scaling improves deduplication efficiency because similar data types deduplicate more efficiently than dissimilar data types.

When you create a storage pool with deduplication, if one of the following conditions is true, then horizontal scaling does not apply for the copy:

  • You select a library that is associated with a MediaAgent that is on Service Pack 14 or an earlier version.

  • The deduplication database or a deduplication database partition is hosted on a MediaAgent that is on Service Pack 14 or an earlier version.

When you enable horizontal scaling, the auxiliary copy operation for a storage policy copy uses the Use Scalable Resource Allocation option by default, and you cannot turn off the option. For more information, see Performing an Auxiliary Copy Operation.

You can review and update values for the DDB horizontal scaling threshold free space percentage and DDB horizontal scaling threshold number of subclients per DDB parameters in the current Deduplication tab.

Note:

  • Any new subclients from the client computers with Service Pack 14 and more recent service packs associate to the new deduplication databases that are created with horizontal scaling for backups.

  • The existing and any new subclients from the client computers with Service Pack 13 and earlier service packs continue to use the earlier deduplication database for backups.

  • The software performs a weekly check on the full DDB. When the number of records in the full DDB exceeds the system threshold, the software starts associating a few subclients to the new DDB.

  • You can use the Move Clients to New DDB workflow to move the subclients of a client computer from the previous full DDB to a new DDB created for their data type. For instructions, see Moving Clients from a Full Deduplication Database to a New Deduplication Database.
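The DDB naming convention described for horizontal scaling can be sketched as a small helper. In the following Python example, the function name and the dictionary keys (file_system, virtual_machine, database) are hypothetical; only the labels Files, VMs, and Databases and the two name patterns come from the description above.

```python
from typing import Optional

# Data type labels from the naming convention; the keys are illustrative.
DATA_TYPE_LABELS = {
    "file_system": "Files",
    "virtual_machine": "VMs",
    "database": "Databases",
}


def ddb_name(storage_pool: str, store_id: int,
             data_type: Optional[str] = None) -> str:
    """Build a DDB name following the documented patterns.

    Without a data type (before the first backup), the pattern is
    StoragePoolName_DDBStoreID. With a data type, the pattern is
    StoragePoolName_SubclientDataType_DDBStoreID.
    """
    if data_type is None:
        return f"{storage_pool}_{store_id}"
    label = DATA_TYPE_LABELS[data_type]  # raises KeyError for unknown types
    return f"{storage_pool}_{label}_{store_id}"
```

For example, a DDB with store ID 12 in a storage pool named Pool1 is named Pool1_12 at creation and Pool1_VMs_12 after the first backup of a virtual machine subclient.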

Maximum allowed substore configuration

Definition: Specifies the maximum number of partitions that you can configure in a deduplication database (DDB).

Default Value: 4

Range: 1 to 6

Maximum number of DDBs allowed per MediaAgent

Definition: Specifies the maximum number of deduplication database (DDB) partitions that a MediaAgent in the CommCell environment can host. When the number of DDB partitions on a MediaAgent exceeds this limit, you cannot create a storage pool with a DDB partition on the MediaAgent. When horizontal scaling of DDBs is enabled, the software does not create a new DDB if any MediaAgent that the DDB is associated with reaches the threshold.

Default Value: 50

Range: 10 to 500
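The per-MediaAgent limit can be pictured as a check across every MediaAgent that would host a partition of the new DDB. The following Python sketch is a rough illustration: the function name, the mapping of MediaAgent names to partition counts, and the interpretation that a MediaAgent at the limit blocks creation are all assumptions.

```python
from typing import Iterable, Mapping


def can_create_ddb(partitions_per_media_agent: Mapping[str, int],
                   target_media_agents: Iterable[str],
                   max_per_media_agent: int = 50) -> bool:
    """Return True only if every target MediaAgent is below the limit.

    Assumption: a MediaAgent that has already reached the limit blocks
    the creation of a new DDB that would place a partition on it.
    """
    if not 10 <= max_per_media_agent <= 500:
        raise ValueError("limit must be between 10 and 500")
    return all(
        partitions_per_media_agent.get(name, 0) < max_per_media_agent
        for name in target_media_agents
    )
```

In this sketch, a single MediaAgent at the limit is enough to prevent the new DDB, which mirrors the statement that the software does not create a new DDB if any associated MediaAgent reaches the threshold.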

Maximum number of streams allowed during the deduplication database reconstruction job

Definition: Specifies the number of data streams that are used while running the deduplication database reconstruction jobs.

Default Value: 50

Range: 5 to 400

Option to retain extra DDB backups for every partition

Definition: Specifies the number of DDB backups that can be retained for each partition.

Default Value: 1

Range: 1 to 4

Usage: If multiple DDB backups for each partition are retained and the most recent DDB backup is corrupted, then you can use one of the retained DDB backups for DDB reconstruction.
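The benefit of retaining more than one DDB backup per partition can be sketched as a selection step: choose the most recent retained backup that is still usable. The DdbBackup type, its fields, and the function name in the following Python example are hypothetical, and the way the software actually validates and selects a backup is not described here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass
class DdbBackup:
    """Hypothetical record of one retained DDB backup for a partition."""
    completed_at: datetime
    is_valid: bool  # False if the backup is corrupted or otherwise unusable


def pick_backup_for_reconstruction(
        backups: Iterable[DdbBackup]) -> Optional[DdbBackup]:
    """Return the most recent valid backup, or None if none is usable."""
    valid = [b for b in backups if b.is_valid]
    return max(valid, key=lambda b: b.completed_at, default=None)
```

With only one backup retained (the default), a corrupted most recent backup leaves nothing to fall back on, so the function returns None. With two or more retained backups, an older valid backup can still be selected.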