Requirements for File Indexing for Virtual Machines

A file indexing job requires one or more nodes configured with the recommended specifications. These nodes run the worker processes that scan and index the file-level data in VM backups.

Hardware Requirements

File indexing one or more VMs or VM groups requires the recommended number of nodes, each configured with the following specifications (or greater):

  • Large node: 12 CPU cores, 64 GB RAM (or 12 vCPUs/64 GB)

Commvault recommends configuring 1 node per 150 VMs (or 150 million files), assuming a medium-sized VM (1 million files).
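The sizing rule above can be sketched as a small calculation. This is an illustrative sketch only, not a Commvault tool: the function name nodes_required is hypothetical, and taking the larger of the VM-based and file-based results when both figures are known is an assumption, since the guidance states the two limits as alternatives.

```python
def nodes_required(vm_count, total_files=None,
                   vms_per_node=150, files_per_node=150_000_000):
    """Estimate the number of file indexing nodes, rounded up to whole nodes.

    Defaults follow the guidance of 1 node per 150 VMs (or 150 million files).
    """
    def ceil_div(a, b):
        # Integer ceiling division without floating-point error.
        return -(-a // b)

    by_vms = ceil_div(vm_count, vms_per_node)
    if total_files is None:
        return by_vms
    # Assumption: when both counts are known, size for the stricter limit.
    by_files = ceil_div(total_files, files_per_node)
    return max(by_vms, by_files)


print(nodes_required(1000))              # 1000 / 150 = 6.67, rounded up to 7
print(nodes_required(100, 300_000_000))  # the file count dominates: 2
```

Rounding up matters here because a fractional node cannot be provisioned, so any remainder requires one more node.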

For optimal performance, Commvault strongly recommends using Windows nodes to file index Windows VMs. To size for this, perform the calculation above separately for Windows VMs and Linux VMs to obtain the number of nodes needed for each operating system type. To ensure that Windows nodes are always selected for indexing Windows VMs, add the following registry key to all access nodes configured on the hypervisor or VM group:

Property    Value
Name        bEnforceCatalogOSMatch
Category    VirtualServer
Type        Integer
Value       1

Note

Do not set the path for the FBR cache mount point to the root directory or to the Commvault ransomware protection-enabled directory.

Use Case Scenario

Suppose a user wants to file index 1,000 VMs, each containing an average of approximately 1 million files.

Hardware Specifications for File Indexing Nodes

Configure 8 to 10 nodes with the following specifications:

  • 12 CPU cores, 64 GB RAM (or 12 vCPUs/64 GB)

Configure Usage of the Nodes for File Indexing Job

For the nodes to be used for running file indexing jobs, add the following additional setting to all access nodes configured on the VM group. The file indexing nodes are then used, along with the access nodes on the VM group, to run file indexing jobs.

  • To the virtual server, add the ExtraFileIndexingProxies additional setting with the following values.

    For information about adding the additional setting from the CommCell Console, see Adding a CommCell Setting.

    Property    Value
    Name        ExtraFileIndexingProxies
    Category    VirtualServer
    Type        String
    Value       Comma-separated list of file indexing node names
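As an illustration of the value format, the following sketch assembles a comma-separated string from a list of node names. The helper build_proxy_list and the node names idx-node-01 and idx-node-02 are hypothetical. Trimming whitespace and joining without spaces after the commas are assumptions; confirm the exact format that the setting expects before using the result.

```python
def build_proxy_list(node_names):
    """Join node names into one comma-separated string.

    Assumptions: names are trimmed, blanks and duplicates are dropped,
    and no space follows each comma.
    """
    cleaned = []
    for name in node_names:
        name = name.strip()
        if not name:
            continue  # skip blank entries
        if "," in name:
            raise ValueError(f"node name must not contain a comma: {name!r}")
        if name not in cleaned:
            cleaned.append(name)  # keep first occurrence, preserve order
    return ",".join(cleaned)


# Hypothetical node names, for illustration only.
print(build_proxy_list(["idx-node-01", " idx-node-02 ", "idx-node-01"]))
```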

Performance Considerations

For best performance, ensure that VMs are file indexed on like-OS nodes.

  • When configuring file indexing nodes as described above, ensure that the mix of Windows and Linux nodes is proportionate to the ratio of Windows and UNIX guest VMs. For example, if there are 600 UNIX guest VMs and 400 Windows guest VMs, then provision 6 Linux file indexing nodes and 4 Windows file indexing nodes.

  • Additional configuration changes might be needed when configuring a Linux node for file indexing. For more information, see Automatic Configuration of a MediaAgent as a Linux Access Node.

Note

  • If a Linux node attempts to file index a Windows VM that has a dynamic volume, a FAT/FAT32/ReFS file system, or Microsoft Windows Storage Spaces configured, the file indexing job goes to the Pending state and, on the next attempt, is redistributed to a Windows access node that is available at the subclient level.

  • For RHEL/OEL/Rocky 9.x XFS 5.12, for file indexing of Linux VMs, the VSA and MediaAgent must be at the same Commvault release.

  • The Linux access node provides UNIX file system support for ext2, ext3, ext4, XFS, JFS, Btrfs, HFS, and HFS Plus. To enable extended file system recovery for UNIX-based virtual machines, deploy a Linux access node or convert a Linux MediaAgent to an access node. For more information, see Converting a Linux MediaAgent to a Linux Access Node for VMware.

Example Scenarios for Calculating the Number of Required Nodes

Scenario 1

Goal:

File index 1,000 Windows VMs with a total file count of 1 billion files.

Number of (Windows) nodes required:

N = 1000 VMs / 150 = 6.67, rounded up to 7

Total number of recommended nodes = 7

Scenario 2

Goal:

File index 450 VMs with 1 million files each, comprising 150 Windows VMs and 300 Linux VMs.

Number of nodes required:

N_Windows = 150 VMs / 150 = 1

N_Linux = 300 VMs / 150 = 2

Total number of recommended nodes = 3
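The per-operating-system calculation in these scenarios can be reproduced with a short script. This is a sketch under the stated ratio of 150 VMs per node; the function name nodes_per_os and the constant VMS_PER_NODE are hypothetical, and each VM is assumed to be medium-sized (about 1 million files).

```python
import math

VMS_PER_NODE = 150  # from the guidance of 1 node per 150 medium-sized VMs


def nodes_per_os(vm_counts):
    """Map each OS type to its node count, rounding up to whole nodes."""
    return {
        os_name: math.ceil(count / VMS_PER_NODE)
        for os_name, count in vm_counts.items()
    }


# Scenario 2: 150 Windows VMs and 300 Linux VMs.
plan = nodes_per_os({"Windows": 150, "Linux": 300})
total = sum(plan.values())
print(plan, total)  # {'Windows': 1, 'Linux': 2} 3
```

Calculating each operating system separately, and rounding up each result on its own, keeps enough like-OS nodes available for each VM type.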
