
Seeding a Deduplicated Storage Policy for Remote Office Branch Office (ROBO)

Data transfers across high-latency networks such as Wide Area Networks (WANs) can be time-consuming, especially during the transfer of baseline backups, where most of the data is unique and must be transferred.

This section explains how to manually transfer a baseline backup between two sites using readily available removable disks such as USB disks. As part of this process, a pre-seeded source-side deduplication database (DDB) is created. It is used to look up signatures locally instead of across the network, which speeds up signature lookups and improves the overall data transfer speed.

This is useful in a Remote Office Branch Office (ROBO) environment, where Remote Office sites are separated from the Data Center site by a WAN and data must either be backed up remotely or periodically replicated to the central Data Center site. Once the initial baseline is established, all subsequent backup and Auxiliary Copy operations consume less network bandwidth because only the changes are transferred over the network.
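
The effect of a pre-seeded source-side DDB can be illustrated with a toy model. The following Python sketch is not how the product implements deduplication: the fixed 4-byte block size, the use of SHA-256 as the signature, and the function names are all assumptions chosen for illustration. It only shows the idea that a block is sent over the WAN when its signature is missing from the local signature set.

```python
import hashlib

BLOCK_SIZE = 4  # toy block size (assumption); real deduplication blocks are far larger


def signatures(data: bytes, block_size: int = BLOCK_SIZE) -> list[tuple[str, bytes]]:
    """Split data into fixed-size blocks and pair each block with its SHA-256 signature."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    return [(hashlib.sha256(block).hexdigest(), block) for block in blocks]


def transfer(data: bytes, source_side_ddb: set[str]) -> int:
    """Return the number of bytes that would cross the WAN.

    A block is sent only if its signature is missing from the local
    (source-side) signature set; sent signatures are then recorded.
    """
    sent = 0
    for sig, block in signatures(data):
        if sig not in source_side_ddb:
            sent += len(block)
            source_side_ddb.add(sig)
    return sent


baseline = b"AAAABBBBCCCCDDDD"
changed = b"AAAABBBBCCCCEEEE"  # one block differs from the baseline

# Without seeding: the local signature set is empty, so the whole baseline is sent.
unseeded_bytes = transfer(baseline, set())

# With seeding: the local set already holds the baseline signatures,
# so only the changed block is sent.
seeded_ddb = {sig for sig, _ in signatures(baseline)}
seeded_bytes = transfer(changed, seeded_ddb)

print(unseeded_bytes, seeded_bytes)  # prints "16 4"
```

In this model the lookup itself never leaves the source site, which is the reason the seeded database is placed on the Remote Office MediaAgent.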

How It Works

In this process, a storage policy is configured with three copies: primary, secondary, and tertiary.

  • The primary copy (Copy01) points to a local disk library at the Remote Office site.
  • The secondary copy (Copy02) is a temporary copy that points to a USB drive and is used to transfer the baseline data to the Data Center site.
  • The tertiary copy (Copy03) points to a local disk library at the Data Center site.

The seeding process works as follows:

  • At the Remote Office site, data is first backed up to the primary copy.
  • The data is copied to the secondary copy, which resides on the USB disk, through an Auxiliary Copy operation.
  • The USB disk is physically shipped to the Data Center site.
  • At the Data Center (or DR site), the data is copied from the secondary copy (that is, from the USB disk) to the tertiary copy through an Auxiliary Copy operation, so the baseline data is never transferred over the WAN.
  • After the data is copied to the tertiary copy, the source-side DDB is manually copied from the destination MediaAgent to the source MediaAgent.
  • Once the baseline is established, the secondary copy is no longer required and is deleted. The source for the tertiary copy is then changed to the primary copy.
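
The data path described above can be summarized in a small model. This Python sketch is purely illustrative: the `Copy` class and the `crosses_wan` helper are hypothetical names, not part of any product API. It records where each copy physically sits when an Auxiliary Copy runs and checks which hops cross the WAN.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    site: str  # site where the copy's storage is attached when the job runs


def crosses_wan(source: Copy, target: Copy) -> bool:
    """An Auxiliary Copy crosses the WAN when source and target sit at different sites."""
    return source.site != target.site


copy01 = Copy("Copy01", "Remote Office")
copy02_before_shipping = Copy("Copy02", "Remote Office")  # USB disk at the Remote Office
copy02_after_shipping = Copy("Copy02", "Data Center")     # same USB disk after shipping
copy03 = Copy("Copy03", "Data Center")

# During seeding, both Auxiliary Copy hops are local to one site.
seeding_hops = [(copy01, copy02_before_shipping), (copy02_after_shipping, copy03)]
# After seeding, Copy03 takes Copy01 as its source, so this hop uses the WAN.
steady_state_hops = [(copy01, copy03)]

baseline_over_wan = any(crosses_wan(s, t) for s, t in seeding_hops)
changes_over_wan = any(crosses_wan(s, t) for s, t in steady_state_hops)
print(baseline_over_wan, changes_over_wan)  # prints "False True"
```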

Before You Begin

Complete the following configurations before seeding:

  1. Configure a disk library (Remote_Office_Library) on MediaAgent01, located at the Remote Office site.

    For instructions, see Disk Libraries - Getting Started.

  2. Configure a disk library (USB_Library) that is shared between the MediaAgent01 and MediaAgent02 computers, using the USB drive connected at the Remote Office site as the mount path.
    1. On the ribbon in the CommCell Console, click the Storage tab, and then click Expert Storage Configuration.
    2. Under Available MediaAgents, select both the Remote Office MediaAgent (MediaAgent01) and the Data Center MediaAgent (MediaAgent02), click Add, and then click OK.
    3. Click OK on the warning message.
    4. In the Expert Storage Configuration dialog box, click Start > Add > Disk Library.
    5. In the Add Disk Library dialog box, specify the name for the library (USB_Library), and then click OK.
    6. In the Shared Mount Path dialog box, specify the following:
      1. From the MediaAgent list, select the MediaAgent configured at the Remote Office site (MediaAgent01).
      2. Select Local Path.
      3. In the Folder box, type the USB drive location. For example, if the USB drive is connected as the G: drive, type G:\.
      4. Click OK.
  3. Configure a local disk library (Data_Center_Library) on MediaAgent02, located at the Data Center site, for use by the Auxiliary Copy operation.

    For instructions, see Disk Libraries - Getting Started.

Configuration

Pre-configure your setup with the following steps, which create a new storage policy and its copies.

  1. Create a deduplication-enabled storage policy on MediaAgent01 and specify Remote_Office_Library as the library to which the Primary Copy (Copy01) should be associated.

    For instructions, see Creating a Storage Policy with Deduplication.

  2. Create a secondary copy (Copy02) on the MediaAgent01 computer and specify USB_Library as the library to which the secondary copy should be associated.
    1. Right-click the storage policy that you just created, and then click All Tasks > Create New Copy.
    2. From the Library list, select USB_Library.
    3. From the MediaAgent list, select MediaAgent01.
    4. Select the Enable Deduplication check box.
    5. On the Deduplication tab, click Configure and specify the location to host the DDB.
    6. Click OK.
  3. Create a tertiary copy (Copy03) on the MediaAgent02 computer and specify Data_Center_Library as the library to which the tertiary copy should be associated.
    1. Right-click the storage policy, and then click All Tasks > Create New Copy.
    2. From the Library list, select Data_Center_Library.
    3. From the MediaAgent list, select MediaAgent02.
    4. Select the appropriate option:
      • Select the Enable Deduplication check box when transferring data from one Remote Office site.
      • Select the Use Global Deduplication Policy check box when transferring data from multiple Remote Office sites, and then select a Global Deduplication Policy that is assigned to Data_Center_Library.

        For instructions to create a Global Deduplication Policy, see Creating a Storage Policy with Global Deduplication.

    5. On the Deduplication tab, click Configure and specify the location to host the DDB.
    6. On the Copy Policy tab, in the Source Copy area, select the Specify Source for Auxiliary Copy check box, and then select the secondary copy (Copy02) from the list.

      A message appears that asks whether to continue with a different source copy for the auxiliary copy operation.

    7. Click Yes to continue.
    8. Click OK.

Procedure

  1. Run backups on all clients associated with the primary copy (Copy01) of the seeding storage policy.

    For more information on how to run backups, see the documentation for the specific agent.

  2. Run an Auxiliary Copy operation to copy all the jobs from the primary copy (Copy01) to the secondary copy (Copy02).

    For instructions, see Run an Auxiliary Copy.

    Note: The backup data is transferred to the USB drive.

  3. After the auxiliary copy job completes, unplug the USB drive and ship it to the Data Center site.
  4. After the USB disk is available at the Data Center, plug it into the Data Center MediaAgent (MediaAgent02), and then perform the following steps to add the USB drive as a mount path on the USB disk library.
    1. From the CommCell Browser, expand Storage Resources > Libraries > USB_Library.
    2. Right-click the mount path, and click Properties.
    3. In the Mount Path Properties dialog box, on the Sharing tab, select the mount_path and then click Share.
    4. In the Share Mount Path dialog box, specify the following:
      1. From the MediaAgent list, select the Data Center MediaAgent (MediaAgent02).
      2. Select Local Path.
      3. In the Folder box, type the USB drive location.
      4. Click OK.
    5. Click OK.
  5. Enable DASH Copy on the tertiary copy (Copy03).
    1. From the CommCell Browser, expand Policies > Storage Policies > storage_policy.
    2. Right-click the tertiary copy (Copy03), and then click Properties.
    3. In the Copy Properties dialog box, click the Deduplication tab.
    4. On the Advanced tab, specify the following selections:
      1. Select the Enable DASH Copy (Transfer only unique data segments to target) check box.
      2. Select Network Optimized Copy.
      3. Select the Enable source side disk cache check box.
    5. Click OK.
  6. Run an auxiliary copy operation to copy data from the secondary copy (USB disk) to the tertiary copy (Copy03).
    1. From the CommCell Console, expand Policies > Storage Policies.
    2. Right-click the storage_policy and click All Tasks > Run Auxiliary Copy.
    3. Under Copy Selection, select Select A Copy, and then select the tertiary copy (Copy03) from the list.
    4. Click OK.

    Important: Do not run auxiliary copy jobs from the Remote Office site until the source-side database has been copied from the MediaAgent02 computer to the MediaAgent01 computer after seeding. The auxiliary copy operation in this step seeds the deduplicated storage policy and creates a source-side DDB on the MediaAgent02 computer under the Job Results folder.

  7. Manually copy the seeded source-side database from the Job Results folder of MediaAgent02 back to MediaAgent01 (Remote Office site).

    If a Global Deduplication Storage Policy (GDSP) is used and multiple storage policies point to the same GDSP, you must copy the seeded database from the Job Results folder for each seeded copy to the Job Results folder of the respective source MediaAgent.

    Example: Copy CV_CLDB_AUX_<copy_id> from the Job Results folder of MediaAgent02 to the Job Results folder of MediaAgent01:

    <Installation_Directory>\iDataAgent\JobResults\CV_CLDB\CV_CLDB_AUX_<copy_id>

  8. Delete the seeded database from the Job Results folder of MediaAgent02.
  9. After the seeding process, re-associate the primary copy as the source copy for the tertiary copy.
    1. Right-click Tertiary_Copy (Copy03) and then click Properties.
    2. Click the Copy Policy tab.
    3. Clear the Source Copy check box so that Copy01 (the primary copy) is used as the source copy during Auxiliary Copy operations.
  10. Delete the secondary copy (Copy02) because it is no longer needed.
    1. From the CommCell Browser, select Policies > Storage Policies > storage_policy_name.
    2. Right-click the secondary_copy and then click All Tasks > Delete.
    3. A message appears that asks whether you are sure that you want to delete the storage policy copy.

      Click Yes.

    4. In the Enter Confirmation text dialog box, type erase and reuse media, and then click OK.
  11. Run a full backup followed by an Auxiliary Copy job.

    Only a minimal amount of data is transferred between MediaAgent01 and MediaAgent02.

    Any Auxiliary Copy jobs started at the Remote Office site now verify data signatures against the seeded source-side DDB on the source MediaAgent. If a signature is already present in the source-side DDB, the corresponding data blocks are already available at the Data Center site and are not transferred.
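
The manual copy and cleanup in steps 7 and 8 could be scripted along the following lines. This is a minimal Python sketch under stated assumptions: the two Job Results folders are passed in as paths that the script can reach (for example, through a mapped drive or network share, which this procedure does not describe), and the function names are hypothetical. Only the CV_CLDB\CV_CLDB_AUX_<copy_id> layout is taken from the example in step 7.

```python
import shutil
from pathlib import Path


def copy_seeded_ddb(src_job_results: Path, dst_job_results: Path, copy_id: str) -> Path:
    """Copy CV_CLDB/CV_CLDB_AUX_<copy_id> from one Job Results folder to another."""
    relative = Path("CV_CLDB") / f"CV_CLDB_AUX_{copy_id}"
    src = src_job_results / relative
    dst = dst_job_results / relative
    if not src.is_dir():
        raise FileNotFoundError(f"Seeded database folder not found: {src}")
    if dst.exists():
        # Refuse to overwrite an existing source-side database.
        raise FileExistsError(f"Destination already exists: {dst}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copytree(src, dst)
    return dst


def delete_seeded_ddb(src_job_results: Path, copy_id: str) -> None:
    """Remove the seeded folder from the source Job Results folder (step 8)."""
    shutil.rmtree(src_job_results / "CV_CLDB" / f"CV_CLDB_AUX_{copy_id}")
```

A caller would pass the Job Results folder of MediaAgent02 as `src_job_results` and that of MediaAgent01 as `dst_job_results`, and would call `delete_seeded_ddb` only after confirming that the copy succeeded.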

Last modified: 3/1/2018 7:33:13 PM