Seeding a Deduplicated Storage Policy

DDB Seeding is a predefined workflow that allows you to transfer the initial baseline backup between two sites using an available removable disk drive such as a USB drive. This workflow can be started manually from the CommCell Console.

Use DDB Seeding when remote office sites are separated from the data center across a WAN and data needs to be either backed up remotely or replicated periodically to a central data center site. Once the initial baseline is established, all subsequent backups and Auxiliary Copy operations consume less network bandwidth because only the changes are transferred.

This workflow is used to transfer only the initial baseline backup between two sites. It cannot be used for subsequent backups.

Download this workflow from the Commvault Store. For instructions, see Download Workflows from Commvault Store.

Seeding a Deduplicated Storage - Overview (1)

How Does It Work?

When the DDB Seeding predefined workflow is run, the following operations are performed:

  1. Data is backed up from the client computers to the Primary Copy located on the Remote Office site.

  2. A secondary (temporary) copy that points to the USB drive is created. This copy is used to facilitate the transfer of baseline data from a Remote Office to a Data Center site.

  3. The data from the primary copy is copied to the secondary copy on the USB drive through an Auxiliary Copy operation.

  4. After the Auxiliary Copy operation, the workflow job goes into a suspended state.

  5. The USB drive is shipped to the Data Center site.

  6. Once the USB drive is delivered to the Data Center site, the administrator must plug the USB drive into the MediaAgent located at the Data Center site, and then resume the workflow job from the Job Controller.

  7. The secondary copy (on the USB drive) is associated as a source copy to the global deduplication policy.

  8. Data is then copied from the secondary copy to the global deduplication policy through an Auxiliary Copy operation.

  9. Once the baseline is established, an e-mail notifies the user that the transfer of the initial baseline backup is complete and that subsequent backups and Auxiliary Copy operations can begin.
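The sequence above can be pictured as a job that runs its source-side steps, suspends while the drive is in transit, and finishes its destination-side steps once it is resumed. The following is a minimal Python sketch of that flow, not the workflow's actual implementation: the class `SeedingJob`, the step labels, and the `skip_source_side` parameter are hypothetical names chosen for illustration, and the sketch omits the user-input prompt that the real job shows after it is resumed.

```python
from enum import Enum, auto


class JobState(Enum):
    RUNNING = auto()
    SUSPENDED = auto()
    COMPLETED = auto()


# Hypothetical step labels that paraphrase the numbered list above.
SOURCE_SIDE_STEPS = [
    "back up clients to the primary copy",
    "create temporary secondary copy on the USB drive",
    "auxiliary copy: primary copy -> USB drive",
]
DESTINATION_SIDE_STEPS = [
    "associate USB copy as source for the global deduplication policy",
    "auxiliary copy: USB drive -> global deduplication policy",
    "send completion e-mail",
]


class SeedingJob:
    """Toy model of the seeding job; it performs no real backup work."""

    def __init__(self, skip_source_side=False):
        self.log = []
        self.state = JobState.RUNNING
        self.skip_source_side = skip_source_side

    def start(self):
        if self.skip_source_side:
            # Assumption: the drive is already at the Data Center.
            self._run_destination_side()
        else:
            self.log.extend(SOURCE_SIDE_STEPS)
            # The job pauses here while the drive is shipped.
            self.state = JobState.SUSPENDED

    def resume(self):
        if self.state is not JobState.SUSPENDED:
            raise RuntimeError("only a suspended job can be resumed")
        self.state = JobState.RUNNING
        self._run_destination_side()

    def _run_destination_side(self):
        self.log.extend(DESTINATION_SIDE_STEPS)
        self.state = JobState.COMPLETED
```

In this model, `SeedingJob().start()` leaves the job suspended until `resume()` is called, while `SeedingJob(skip_source_side=True).start()` records only the destination-side steps.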

To perform the DDB Seeding process without a workflow, refer to Manually Seeding a Deduplicated Storage Policy.

Prerequisites

Skip this section if you already have disk libraries and storage policies configured on both sites.

Before you run the workflow, make sure that you have the following configuration set up.

  1. Configure the following on the MediaAgent that is located at the Remote Office site:

    1. Create a disk library. See Disk Library - Getting Started for instructions.

      For example, RemoteOfficeMediaAgent is the MediaAgent at the Remote Office site and RemoteOfficeLibrary is the library.

      Seeding a Deduplicated Storage - Prerequisites (1)

    2. Create a new seeding storage policy and associate RemoteOfficeLibrary with the Primary Copy. For instructions, see Storage Policy - Getting Started.

      For example, Seeding_Storage_Policy is created using RemoteOfficeLibrary as a library.

      Seeding a Deduplicated Storage - Prerequisites (2)

  2. Configure the following on the MediaAgent that is located at the Data Center site:

    1. Create a local disk library to be used for the Auxiliary Copy operation. See Disk Library - Getting Started for instructions.

      For example, DataCenterMediaAgent is the MediaAgent located at the Data Center site and DataCenterLibrary is the library.

      Seeding a Deduplicated Storage - Prerequisites (3)

    2. Create a global deduplication policy and specify DataCenterLibrary as the library. See Creating a Global Deduplication Policy for instructions.

      For example, Seeding_Global_Deduplication_Policy is created using DataCenterLibrary as a library.

      Seeding a Deduplicated Storage - Prerequisites (4)

Procedure

  1. From the CommCell Browser, go to Workflows.

  2. Right-click DDB Seeding and then click All Tasks > Execute.

  3. In the DDB Seeding Options dialog box, specify the following:

    1. From the Run workflow on list, select the workflow engine to execute the workflow.

    2. From the StoragePolicy To Seed list, select the seeding storage policy that you created on the Remote Office MediaAgent.

      Note

      An error occurs if you select a global deduplication policy instead of a regular deduplicated storage policy.

    3. From the Target Global Deduplication StoragePolicy for seeding list, select the global deduplication policy that you created on the Data Center MediaAgent.

    4. If you already performed the remote office site (source-side) operation, the USB drive is available at the Data Center MediaAgent, and you want to perform only the Data Center (destination-side) operations, select the Skip SourceSide Operation? check box.

      For example, suppose that the USB drive with the data is already available at the Data Center MediaAgent, and the Auxiliary Copy operation fails during the workflow job with an error such as No Resource Found or MediaAgent Offline. You might then have to run the workflow job again. In this case, setting the skipSourceSideOperation value to true skips the operations that were already performed at the Remote Office site and continues with the operations at the Data Center site.

    5. If you did not select the Skip SourceSide Operation? check box, the RemoteOffice MediaAgent and USB Drive dialog box appears.

      Complete the following steps:

      1. From the Remote Office MediaAgent list, select the source MediaAgent.

      2. In the USB Drive Location text box, enter the path to the USB drive.

      3. Click Next.

        The Backup before seeding dialog box appears.

      4. For Which type of backup do you wish to run, select one of the following options: Incremental, Full, or None.

      5. Select the Fail the workflow if any backup job fails check box if you want the workflow to fail when the selected backup job fails.

      6. Click Finish.

  4. You can view the progress of the DDB Seeding job from the Job Controller window.

    Separate backup and auxiliary copy jobs are run to copy the data.

    When the auxiliary copy operation is complete, the DDB Seeding job status changes to Suspended.

    Seeding a Deduplicated Storage - Procedure (1)

    Important

    At this point, you can perform backup operations at the Remote Office site. However, do not perform auxiliary copy jobs from the Remote Office site until the DDB Seeding predefined workflow completes.

  5. Remove the USB drive and ship it to the Data Center.

  6. Once the USB drive is available at the Data Center, insert the USB drive into the Data Center MediaAgent (DataCenterMediaAgent).

  7. Resume the DDB Seeding job by right-clicking the DDB Seeding operation in the Job Controller and then clicking Resume.

    When you resume the DDB Seeding job, the job goes into the Waiting state with the following message:

    Error Code: [19:857]
    Description: waiting on user input [Mount Path and MA]
    Source: prodcs, Process: Workflow

  8. Right-click the DDB Seeding job, and then perform the following:

    1. Click Action.

    2. In the Actions for Job <ID> dialog box, double-click the Mount Path and MA entry.

    3. In the DataCenter MediaAgent and USB Drive dialog box, select the appropriate settings:

      1. From the DataCenter MediaAgent list, select the destination MediaAgent.

      2. In the USB Drive Location text box, enter the path to the USB drive.

      3. Click OK.

    4. In the Actions for Job <ID> dialog box, click Close.

    The DDB Seeding job resumes automatically, and the auxiliary copy operation copies the baseline data from the USB drive to the global deduplication copy.

  9. When the Auxiliary Copy job is complete, the DDB Seeding job sends an e-mail indicating that the DDB Seeding workflow was successful.

  10. You can now perform a client backup followed by an auxiliary copy operation.

    When the auxiliary copy operation is performed, a minimum amount of data is transferred between the Remote Office MediaAgent and the Data Center MediaAgent.

    Any backup job or Auxiliary Copy operation that is started at the Remote Office site verifies the data signatures against the seeded deduplication database on the source MediaAgent. If a signature is already present in the deduplication database, the corresponding data blocks are already available at the Data Center site and are not transferred.

    The screen below displays the Auxiliary Copy operation details with minimum data transfer on the network.

    Seeding a Deduplicated Storage - Procedure (2)
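The signature check described in the last step can be illustrated with a small Python sketch. It is a conceptual model only: SHA-256 stands in for whatever signature the product actually computes, and the function names `signature` and `blocks_to_transfer`, as well as the sample blocks, are hypothetical.

```python
import hashlib


def signature(block: bytes) -> str:
    # Assumption: SHA-256 is used here purely for illustration.
    return hashlib.sha256(block).hexdigest()


def blocks_to_transfer(blocks, seeded_signatures):
    """Return the blocks whose signatures are not in the seeded set."""
    known = set(seeded_signatures)
    pending = []
    for block in blocks:
        sig = signature(block)
        if sig not in known:
            pending.append(block)
            known.add(sig)  # a repeated new block is sent only once
    return pending


# Signatures recorded while seeding the baseline.
baseline = [b"block-A", b"block-B", b"block-C"]
seeded = {signature(b) for b in baseline}

# Next backup: two unchanged blocks and one new block.
next_backup = [b"block-A", b"block-B", b"block-D"]
to_send = blocks_to_transfer(next_backup, seeded)
print(to_send)  # [b'block-D']
```

Only the block whose signature is missing from the seeded set crosses the network in this model, which is why the first auxiliary copy after seeding moves so little data.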
