
Hardware Specifications for Deduplication Mode

Review the suggested hardware specifications for MediaAgents that host deduplicated data. You can use these specifications to set up partitioned deduplication databases (DDBs) in any of the configurations described in the Scaling and Resiliency section.

For details about supported platforms, see Building Block Guide - Deduplication System Requirements.

Important:

  • The following hardware requirements apply to MediaAgents with deduplication. They do not apply to tape libraries, to MediaAgents without deduplication, or to setups that use third-party deduplication applications.
  • On this page, for ease of use, the terms node and MediaAgent are used interchangeably.
  • The suggested workloads are not software limitations, but rather design guidelines for sizing under specific conditions.
  • The TB values are base-2.
  • The index cache disk recommendation is for unstructured data types such as files, VMs, and granular messages. Structured data types, such as applications and databases, need significantly less index cache.

Capacity

Use the following table to calculate the storage requirements.

Please contact your software vendor for assistance with more advanced design cases.

Back-end Size¹⁰ for Disk Storage

| Grid | Extra Large (1 DDB Disk) | Extra Large (2 DDB Disks) | Large (1 DDB Disk) | Large (2 DDB Disks) | Medium (1 DDB Disk) | Medium (2 DDB Disks) | Small (1 DDB Disk) | Extra Small (1 DDB Disk) |
|------|--------------------------|---------------------------|--------------------|---------------------|---------------------|----------------------|--------------------|--------------------------|
| 1 Node | Up to 250 TB | Up to 500 TB | Up to 150 TB | Up to 300 TB | Up to 75 TB | Up to 150 TB | Up to 50 TB | Up to 25 TB |
| 2 Node | Up to 500 TB | Up to 1000 TB | Up to 300 TB | Up to 600 TB | Up to 150 TB | Up to 300 TB | Up to 100 TB | Up to 50 TB |
| 3 Node | Up to 750 TB | Up to 1500 TB | Up to 450 TB | Up to 900 TB | Up to 225 TB | Up to 450 TB | — | — |
| 4 Node | Up to 1000 TB | Up to 2000 TB | Up to 600 TB | Up to 1200 TB | Up to 300 TB | Up to 600 TB | — | — |

Back-end Size¹⁰ for Cloud Storage⁹

| Grid | Extra Large (1 DDB Disk) | Extra Large (2 DDB Disks) | Large (1 DDB Disk) | Large (2 DDB Disks) | Medium (1 DDB Disk) | Medium (2 DDB Disks) | Small (1 DDB Disk) | Extra Small (1 DDB Disk) |
|------|--------------------------|---------------------------|--------------------|---------------------|---------------------|----------------------|--------------------|--------------------------|
| 1 Node | Up to 500 TB | Up to 1000 TB | Up to 300 TB | Up to 600 TB | Up to 150 TB | Up to 300 TB | Up to 100 TB | Up to 50 TB |
| 2 Node | Up to 1000 TB | Up to 2000 TB | Up to 600 TB | Up to 1200 TB | Up to 300 TB | Up to 600 TB | Up to 200 TB | Up to 100 TB |
| 3 Node | Up to 1500 TB | Up to 3000 TB | Up to 900 TB | Up to 1800 TB | Up to 450 TB | Up to 900 TB | — | — |
| 4 Node | Up to 2000 TB | Up to 4000 TB | Up to 1200 TB | Up to 2400 TB | Up to 600 TB | Up to 1200 TB | — | — |

CPU/RAM

| Components | Extra Large | Large | Medium | Small | Extra Small |
|------------|-------------|-------|--------|-------|-------------|
| CPU/RAM | 16 CPU cores, 128 GB RAM (or 16 vCPUs/128 GB) | 12 CPU cores, 64 GB RAM (or 12 vCPUs/64 GB) | 8 CPU cores, 32 GB RAM (or 8 vCPUs/32 GB) | 4 CPU cores, 24 GB RAM (or 4 vCPUs/24 GB) | 2 CPU cores, 16 GB RAM (or 2 vCPUs/16 GB) |

Disk Layout

| Components | Extra Large | Large | Medium | Small | Extra Small |
|------------|-------------|-------|--------|-------|-------------|
| OS or Software Disk | 400 GB SSD-class disk | 400 GB usable disk, min 4 spindles 15K RPM or higher, OR SSD-class disk | 400 GB usable disk, min 4 spindles 15K RPM | 300 GB usable disk, min 2 spindles 15K RPM | 200 GB usable disk, min 2 spindles 15K RPM |
| DDB Disk | 2 TB SSD-class disk/PCIe IO cards¹,² with 2 GB controller cache memory⁷ | 1.2 TB SSD-class disk/PCIe IO cards¹,² with 2 GB controller cache memory⁷ | 600 GB SSD-class disk/PCIe IO cards¹,² with 2 GB controller cache memory⁷ | 400 GB SSD-class disk/PCIe or SATA interface² | 200 GB SSD-class disk/PCIe or SATA interface² |
| Supported Number of DDB Disks | Up to 2 | Up to 2 | Up to 2 | Up to 1 | Up to 1 |
| Index Cache Disk³,⁴ | 2 TB SSD-class disk¹ | 1 TB SSD-class disk¹ | 1 TB SSD-class disk¹ | 400 GB SSD-class disk⁸ | 400 GB SSD-class disk⁸ |

For Linux, the DDB volume must be configured by using the Logical Volume Manager (LVM) package.⁶ See Building Block Guide - Deduplication Database.

Deploying MediaAgent on Cloud / Virtual Environments

Installing the MediaAgent software in a cloud or virtual environment is supported. For more details, see the AWS and Azure sizing documentation.

Configuration

After determining the number of MediaAgents for your setup by using the Back-end Size tables, create a network storage pool using these MediaAgents. For more information about creating a network storage pool, see Network Storage Pool. If you have multiple sites, create a network storage pool per site for cross-site replication. For information about cross-site replication, see Cross-site Replication.

Scaling and Resiliency

You can scale your setup up or down by adding or removing MediaAgents as needed to process your data. The following factors affect the number of MediaAgents required in your setup:

  • The back-end size of the data. For example, on an extra large MediaAgent, each 2 TB DDB disk supports up to 250 TB of back-end data for disk storage and up to 500 TB for cloud storage.
  • Resiliency for backups allows for node failover by automatically redirecting the backup process to another node if one node is temporarily unavailable.
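As an illustration of the first factor, the per-node capacities in the Back-end Size tables can be turned into a simple node-count estimate. The helper below is hypothetical (not a vendor tool), and the 500 TB per-node figure assumes extra large MediaAgents with 2 DDB disks backing disk storage:

```python
import math

# Per-node back-end capacity from the disk storage table above:
# extra large MediaAgent with 2 DDB disks (an assumption for this sketch).
PER_NODE_TB = 500

def nodes_required(backend_tb, per_node_tb=PER_NODE_TB, max_nodes=4):
    # Round up: a partially filled node still counts as a whole node.
    n = math.ceil(backend_tb / per_node_tb)
    if n > max_nodes:
        # A grid tops out at 4 nodes in the tables above.
        raise ValueError("back-end size exceeds a single grid; add another grid")
    return n

print(nodes_required(1200))  # prints 3 (a 3-node grid covers up to 1500 TB)
```

Adding a node for resiliency on top of this raw capacity estimate gives failover headroom, per the second factor above.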

You can use these scaling and resiliency factors to set up partitioned deduplication databases in any of following configurations:

  • Partition mode:

    In this mode, a single storage pool is configured using all the MediaAgents in a grid, with one, two, or four partitions.

  • Partition extended mode:

    In this mode, the MediaAgents host partitions from multiple storage pools (up to 20 storage pools per grid). Each storage pool can be configured with one, two, or four partitions.

    You can use the partition extended mode in the following scenarios:

    • When you want the primary copy of the data on disk and the secondary copy in the cloud. In this case, create one disk storage pool and one cloud storage pool using the same MediaAgents.
    • In multi-tenancy scenarios, where the total back-end size of all tenants together is within the limit of the grid. In this case, to segregate each tenant's data, configure the partitions in extended mode by creating a separate storage pool for each tenant using the same MediaAgents.

Footnotes:

  1. SSD class disk indicates PCIe-based cards or internal drives with dedicated endurance ratings. Use MLC (Multi-Level Cell) class or better SSDs. Mixed-use enterprise-class SSDs are recommended.
  2. A dedicated RAID 1 or RAID 10 group is recommended.
  3. To improve indexing performance, store your index data on a solid-state drive (SSD). The following agents and use cases require the best possible indexing performance:
    • Exchange Mailbox Agent
    • Virtual Server Agents
    • NAS filers running NDMP backups
    • Backing up large file servers
    • SharePoint Agents
  4. The index cache directory must be on a local drive. Network drives are not supported.
  5. Use dedicated volume for Index Cache Disk and DDB Disk.
  6. For Linux, host the DDB on LVM volumes. This enables DDB backups to use LVM software snapshots. Thin-provisioned logical volumes are recommended for DDB volumes to get better query and insert performance during DDB backups.
  7. If the RAID cache is shared with other components, then 2 GB of controller cache might not be sufficient.
  8. Spinning disk can be used.
  9. A larger deduplication block size is used to calculate the back-end size for cloud storage.
  10. Back-end storage size (BET) requirements range from approximately 1.0 to 1.6 times the front-end data size (FET). This factor varies in direct proportion to the required retention and the daily change rate of the front-end data. For example, if the data is retained for fewer days, the predicted back-end storage requirement is reduced; whereas if extended retention rules are applied to a larger portion of the managed data, back-end storage consumption can increase. The FET estimate can be used in the storage policy design to help size the appropriate resources for the use case.

    The following are two examples of commonly used settings for the backup retention.

    Case 1:

    • 80% VM/File data
    • 20% Database data
    • Daily change rate 2%
    • Compression rate 50%
    • Daily backups are retained for 30 days

    Factoring in these parameters, the back-end storage size requirements can range from 1.0 - 1.2 times the front-end data size.

    Case 2:

    • 80% VM/File data
    • 20% Database data
    • Daily change rate 2%
    • Compression rate 50%
    • Daily backups are retained for 30 days
    • 8 weekly backups are retained for 1 year
    • 9 monthly backups are retained for 1 year
    • 1 yearly backup is retained

    Factoring in these parameters, the back-end storage size requirements can range from 1.4 - 1.6 times the front-end data size.
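    The arithmetic behind such ratios can be sketched with a deliberately simplified model. The function below is an assumption for illustration only, not the vendor's sizing formula: it counts one compressed baseline plus compressed daily changes, and ignores metadata, DDB backups, and other overheads, so it lands somewhat below the published ranges.

    ```python
    def estimate_backend_tb(front_end_tb, change_rate, compression, retained_days):
        # One deduplicated, compressed baseline copy of the front-end data.
        baseline = front_end_tb * compression
        # Each retained daily backup adds roughly the changed blocks, compressed.
        dailies = front_end_tb * change_rate * compression * retained_days
        return baseline + dailies

    # Case 1 parameters: 2% daily change, 50% compression, 30 daily backups.
    fet = 100  # TB of front-end data (hypothetical)
    bet = estimate_backend_tb(fet, change_rate=0.02, compression=0.5, retained_days=30)
    print(bet / fet)  # 0.8 under this crude model; real overheads push it toward 1.0 - 1.2
    ```

    Extended retention, such as the weekly, monthly, and yearly copies in Case 2, keeps additional recovery points on disk, which is why its published ratio rises to 1.4 - 1.6.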

    Please contact your software vendor for assistance with more advanced design cases.

Last modified: 4/16/2020 6:33:51 PM