Hardware Specifications for Deduplication Mode

Review the suggested hardware specifications for MediaAgents that host deduplicated data. These specifications can be used for setting up partitioned deduplication databases (DDBs) in any of the configurations mentioned under the Scaling and Resiliency section.

For more details about the supported platforms, see Building Block Guide - Deduplication System Requirements.

Important

  • The following hardware requirements apply to MediaAgents with deduplication. They do not apply to tape libraries, to MediaAgents without deduplication, or to setups that use third-party deduplication applications.

  • On this page, for ease of use, the terms node and MediaAgent are used interchangeably.

  • The suggested workloads are not software limitations; they are design guidelines for sizing under specific conditions.

  • The TB values are base-2.

  • The index cache disk recommendation is for unstructured data types such as files, VMs, and granular messages. Structured data types, such as applications and databases, need significantly less index cache. The recommendations given are per MediaAgent.

  • The hardware specifications are listed for the most common configurations. You can add as many MediaAgents as required depending on the storage capacity.

Capacity

Use the following tables to calculate the storage requirements.

Contact your software vendor for assistance with more advanced design cases.

Back-end Size for Disk Storage

| Grid ⁹ ¹⁰ | Extra large (1 DDB location per node) | Extra large (2 DDB locations per node) | Large (1 DDB location per node) | Large (2 DDB locations per node) | Medium (1 DDB location per node) | Medium (2 DDB locations per node) | Small (1 DDB location per node) | Extra small (1 DDB location per node) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 Node | Up to 250 TB | Up to 500 TB | Up to 150 TB | Up to 300 TB | Up to 75 TB | Up to 150 TB | Up to 50 TB | Up to 25 TB |
| 2 Node | Up to 500 TB | Up to 1000 TB | Up to 300 TB | Up to 600 TB | Up to 150 TB | Up to 300 TB | Up to 100 TB | Up to 50 TB |
| 3 Node | Up to 750 TB | Up to 1500 TB | Up to 450 TB | Up to 900 TB | Up to 225 TB | Up to 450 TB | — | — |
| 4 Node | Up to 1000 TB | Up to 2000 TB | Up to 600 TB | Up to 1200 TB | Up to 300 TB | Up to 600 TB | — | — |

Back-end Size for Cloud Storage

| Grid ⁸ ⁹ ¹⁰ | Extra large (1 DDB location per node) | Extra large (2 DDB locations per node) | Large (1 DDB location per node) | Large (2 DDB locations per node) | Medium (1 DDB location per node) | Medium (2 DDB locations per node) | Small (1 DDB location per node) | Extra small (1 DDB location per node) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 Node | Up to 500 TB | Up to 1000 TB | Up to 300 TB | Up to 600 TB | Up to 150 TB | Up to 300 TB | Up to 100 TB | Up to 50 TB |
| 2 Node | Up to 1000 TB | Up to 2000 TB | Up to 600 TB | Up to 1200 TB | Up to 300 TB | Up to 600 TB | Up to 200 TB | Up to 100 TB |
| 3 Node | Up to 1500 TB | Up to 3000 TB | Up to 900 TB | Up to 1800 TB | Up to 450 TB | Up to 900 TB | — | — |
| 4 Node | Up to 2000 TB | Up to 4000 TB | Up to 1200 TB | Up to 2400 TB | Up to 600 TB | Up to 1200 TB | — | — |
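Read together, the two back-end size tables are effectively a lookup from MediaAgent size and storage type to a per-node capacity. The following is a minimal sketch of that lookup in Python; the dictionary values are transcribed from the tables, and the function name `nodes_needed` is illustrative, not part of the product.

```python
import math

# Per-node back-end capacity in TB, transcribed from the tables above
# (1 DDB location per node). Values are design guidelines, not hard limits.
PER_NODE_TB = {
    ("extra_large", "disk"): 250,
    ("large", "disk"): 150,
    ("medium", "disk"): 75,
    ("small", "disk"): 50,
    ("extra_small", "disk"): 25,
    ("extra_large", "cloud"): 500,
    ("large", "cloud"): 300,
    ("medium", "cloud"): 150,
    ("small", "cloud"): 100,
    ("extra_small", "cloud"): 50,
}

def nodes_needed(backend_tb, size, storage, ddb_locations=1):
    """Smallest node count whose combined capacity covers backend_tb.

    Per the tables, 2 DDB locations per node double the per-node
    capacity for the extra large, large, and medium configurations;
    small and extra small support only 1 DDB location per node.
    """
    per_node = PER_NODE_TB[(size, storage)] * ddb_locations
    return math.ceil(backend_tb / per_node)

print(nodes_needed(400, "extra_large", "disk"))     # 2
print(nodes_needed(400, "extra_large", "disk", 2))  # 1
```

Note that the grids in the tables stop at 4 nodes for extra large through medium and at 2 nodes for small and extra small; footnote 10 covers extrapolating beyond that.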

Parallel Data Stream Transfers

| Grid | Extra large | Large | Medium | Small | Extra small |
| --- | --- | --- | --- | --- | --- |
| 1 Node | 100 | 100 | 75 | 50 | 25 |
| 2 Node | 200 | 200 | 150 | 100 | 50 |
| 3 Node | 300 | 300 | 225 | — | — |
| 4 Node | 400 | 400 | 300 | — | — |

CPU/RAM

| Components | Extra large | Large | Medium | Small | Extra small |
| --- | --- | --- | --- | --- | --- |
| CPU/RAM | 16 CPU cores, 128 GB RAM (or 16 vCPUs/128 GB) | 12 CPU cores, 64 GB RAM (or 12 vCPUs/64 GB) | 8 CPU cores, 32 GB RAM (or 8 vCPUs/32 GB) | 4 CPU cores, 24 GB RAM (or 4 vCPUs/24 GB) | 2 CPU cores, 16 GB RAM (or 2 vCPUs/16 GB) |

Disk Layout

| Components | Extra large | Large | Medium | Small | Extra small |
| --- | --- | --- | --- | --- | --- |
| OS or Software Disk | 400 GB SSD-class disk | 400 GB usable disk, min. 4 spindles at 15K RPM or higher, or SSD-class disk | 400 GB usable disk, min. 4 spindles at 15K RPM | 300 GB usable disk, min. 2 spindles at 15K RPM | 200 GB usable disk, min. 2 spindles at 15K RPM |
| DDB Disk | 2 TB SSD-class disk/PCIe IO cards ¹, ²; 2 GB controller cache memory ⁷. For Linux, configure the DDB volume by using the Logical Volume Management (LVM) package ⁶ (see Building Block Guide - Deduplication Database). | 1.2 TB SSD-class disk/PCIe IO cards ¹, ²; 2 GB controller cache memory ⁷. For Linux, configure the DDB volume by using the LVM package ⁶ (see Building Block Guide - Deduplication Database). | 600 GB SSD-class disk/PCIe IO cards ¹, ²; 2 GB controller cache memory ⁷. For Linux, configure the DDB volume by using the LVM package ⁶ (see Building Block Guide - Deduplication Database). | 400 GB SSD-class disk, PCIe or SATA interface ². For Linux, configure the DDB volume by using the LVM package ⁶ (see Building Block Guide - Deduplication Database). | 200 GB SSD-class disk, PCIe or SATA interface ². For Linux, configure the DDB volume by using the LVM package ⁶ (see Building Block Guide - Deduplication Database). |
| Supported Number of DDB Disks | Up to 2 | Up to 2 | Up to 2 | Up to 1 | Up to 1 |
| Index Cache Disk ³, ⁴ | 2 TB SSD-class disk ¹ | 1 TB SSD-class disk ¹ | 1 TB SSD-class disk ¹ | 400 GB SSD-class disk ⁸ | 400 GB SSD-class disk ⁸ |

Deploying MediaAgent in Cloud/Virtual Environments

Installing a MediaAgent in a cloud or virtual environment is supported. For more details about AWS and Azure sizing, see the sizing documentation for the respective cloud platform.

Configuration

After determining the number of MediaAgents for your setup by using the back-end-size table, you must create a network storage pool using these MediaAgents. For more information about creating a network storage pool, see Network Storage Pool.

If you have multiple sites, then create a network storage pool per site for cross-site replication. For information about cross-site replication, see Cross-site Replication.

Scaling and Resiliency

You can scale your setup up or down by adding or removing MediaAgents as needed to process your data. The following factors affect the number of MediaAgents required in your setup:

  • The back-end size of the data. For example, on an extra large MediaAgent, each 2 TB DDB disk holds up to 250 TB of back-end data for disk storage and up to 500 TB for cloud storage.

  • Resiliency for backups allows for node failover: if a node is temporarily unavailable, the backup process is automatically redirected to another node.
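The first factor above is simple arithmetic: divide the back-end size by the capacity that one DDB disk serves, and check the result against the supported DDB disk count. A hedged sketch for an extra large MediaAgent (the 250 TB/500 TB figures and the 2-disk limit come from the tables; `ddb_disks_required` is an illustrative name, not a product function):

```python
import math

# Capacity served by one 2 TB DDB disk on an extra large MediaAgent,
# from the bullet above: 250 TB for disk storage, 500 TB for cloud.
TB_PER_DDB_DISK = {"disk": 250, "cloud": 500}
MAX_DDB_DISKS = 2  # "Supported Number of DDB Disks" for extra large

def ddb_disks_required(backend_tb, storage):
    disks = math.ceil(backend_tb / TB_PER_DDB_DISK[storage])
    if disks > MAX_DDB_DISKS:
        raise ValueError("back-end size exceeds one node; add MediaAgents")
    return disks

print(ddb_disks_required(300, "disk"))   # 2
print(ddb_disks_required(300, "cloud"))  # 1
```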

You can use these scaling and resiliency factors to set up partitioned deduplication databases in any of the following configurations:

  • Partition mode: In this mode, a single storage pool is configured using all the MediaAgents in a grid, with 1, 2, 4, or 6 partitions. Use a 6-partition DDB for grids with 6 or more nodes.

  • Partition extended mode: In this mode, the MediaAgents host partitions from multiple storage pools (up to 20 storage pools per grid). Each storage pool can be configured with 1, 2, 4, or 6 partitions.

    You can use the partition extended mode in the following scenarios:

    • When you want the primary copy of the data on disk and the secondary copy in the cloud. In this case, create 1 disk storage pool and 1 cloud storage pool using the same MediaAgents.

    • In multi-tenancy scenarios, where the total back-end size of all tenants together is within the limit of the grid. In this case, to segregate the data for each tenant, configure the partitions in extended mode by creating a separate storage pool for each tenant using the same MediaAgents.

Footnotes

  1. SSD-class disk indicates PCIe-based cards or internal drives with dedicated endurance ratings. Use MLC (multi-level cell) class or better SSDs; mixed-use enterprise-class SSDs are recommended.

  2. A dedicated RAID 1 or RAID 10 group is recommended.

  3. To improve indexing performance, store your index data on a solid-state drive (SSD). The following agents and use cases require the best possible indexing performance:

    • Exchange Mailbox Agent

    • Virtual Server Agents

    • NAS filers running NDMP backups

    • Backing up large file servers

    • SharePoint Agents

  4. The index cache directory must be on a local drive. Network drives are not supported.

  5. Use a dedicated volume each for the Index Cache Disk and the DDB Disk.

  6. If the RAID cache is shared by other components, then a 2 GB controller cache might not be sufficient.

  7. A spinning disk can be used only for the small and extra small configurations.

  8. A higher deduplication block size is used to calculate the back-end size for cloud storage. When you use cloud storage for a secondary copy whose source is disk storage, refer to Back-end Size for Disk Storage for the storage requirements.

  9. Back-end storage size (BET) requirements range from approximately 1.0 to 1.6 times the front-end data size (FET). This factor varies in direct proportion to the retention required and the daily change rate of the front-end data. For example, retaining data for fewer days reduces the predicted back-end storage requirement, whereas applying extended retention rules to a larger portion of the managed data increases back-end storage consumption. The FET estimate can be used in the storage policy design to size the appropriate resources for the use case.

    The following are examples of commonly used settings for the backup retention.

    Example 1:

    • 80% VM/File data

    • 20% Database data

    • Daily change rate 2%

    • Compression rate 50%

    • Daily backups are retained for 30 days

    Factoring in these parameters, the back-end storage size requirements can range from 1.0 - 1.2 times the front-end data size.

    Example 2:

    • 80% VM/File data

    • 20% Database data

    • Daily change rate 2%

    • Compression rate 50%

    • Daily backups are retained for 30 days

    • 8 weekly backups are retained for 1 year

    • 9 monthly backups are retained for 1 year

    • 1 yearly backup is retained

    Factoring in these parameters, the back-end storage size requirements can range from 1.4 - 1.6 times the front-end data size.

    Contact your software vendor for assistance with more advanced design cases.

  10. As specified in the Scaling and Resiliency section above, you can add as many nodes as required based on size. To compute the size for 6 or more nodes, multiply the single-node values in the tables by the number of nodes.
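Footnotes 9 and 10 together give a back-of-the-envelope sizing flow: estimate the back-end size from the front-end size, then scale the single-node capacity linearly for larger grids. A minimal sketch under those stated factors (both function names are illustrative):

```python
import math

def estimate_backend_tb(frontend_tb, factor=1.2):
    """Footnote 9: back-end size is roughly 1.0-1.6x the front-end size,
    depending on retention and daily change rate. The 1.0-1.2x band fits
    the 30-day retention example; 1.4-1.6x fits the extended-retention one."""
    assert 1.0 <= factor <= 1.6, "factor outside the documented range"
    return frontend_tb * factor

def capacity_tb(nodes, single_node_tb):
    """Footnote 10: for 6 or more nodes, multiply the single-node
    capacity from the tables by the node count."""
    return nodes * single_node_tb

# Example: 1000 TB front end with extended retention, extra large
# MediaAgents on disk storage (250 TB per node, from the tables).
backend = estimate_backend_tb(1000, factor=1.5)
nodes = math.ceil(backend / 250)
print(backend, nodes)  # backend=1500.0, nodes=6
```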
