A file indexing job requires one or more nodes to be configured with the recommended specifications. These nodes run the worker processes that perform the scan and index operations of the file-level data from VM backups.
Hardware Requirements
File indexing one or more VMs or VM groups requires the recommended number of nodes, each configured with the following specifications (or greater):
- Large node: 12 CPU cores, 64 GB RAM (or 12 vCPUs/64 GB)
Commvault recommends configuring 1 node per 150 VMs (or 150 million files), assuming a medium-sized VM (1 million files).
For optimal performance, Commvault strongly recommends configuring Windows nodes for file indexing Windows VMs. To do so, perform the above calculation separately for Windows VMs and for Linux VMs to obtain the number of nodes needed for each operating system type. Also, to ensure that Windows nodes are always picked for indexing Windows VMs, add the following registry key to all access nodes configured on the hypervisor or VM group:
Property | Value |
---|---|
Name | |
Category | VirtualServer |
Type | Integer |
Value | 1 |
Note
File Indexing jobs will continue to work even with fewer nodes than the recommended/calculated number of nodes, but more slowly.
For instructions about adding an additional setting from the CommCell Console, see Adding or Modifying Additional Settings from the Command Center.
Configuring Nodes for File Indexing Usage
By default, a file indexing job uses the access nodes that are already configured on the hypervisor or VM group. Any additional nodes needed to meet the recommended number of nodes must be configured and added as a comma-separated list in Additional Settings on all access nodes.
Property | Value |
---|---|
Name | |
Category | VirtualServer |
Type | String |
Value | Comma-separated list of node identifiers |
Note
The node identifier is the Client/Server identifier field shown in the Overview section for the node.
For instructions about adding an additional setting from the CommCell Console, see Adding or Modifying Additional Settings from the CommCell Console.
Guest OS-Specific Considerations
- As stated above, for performance reasons, it is strongly recommended to configure a Windows node when file indexing Windows VMs.
- A Linux node can file index a Windows VM if the VM has basic disks and an NTFS file system. For all other Windows VM configurations, a Windows node is mandatory.
- A Linux node is required to file index Linux VMs. The node must run the same operating system version as the Linux VM or a higher one. For example, to browse a VM at RHEL 9, the Linux access node must be at RHEL 9 or higher.
- Additional configuration changes may be needed when configuring a Linux node for file indexing. For details, see Automatic Configuration of a MediaAgent as a Linux File Recovery Enabler (FREL).
Note
- If a Linux node attempts to file index a Windows VM that has a dynamic volume, a FAT/FAT-32/ReFS file system, or MS Windows storage spaces configured, the file indexing job goes to the Pending state and is then redistributed, on the next attempt, to a Windows access node available at the subclient level.
- To file index Linux VMs on RHEL/OEL/Rocky 9.x with XFS 5.12, the VSA and MediaAgent must be at the same Commvault release.
- The FREL provides UNIX file system support for ext2, ext3, ext4, XFS, JFS, Btrfs, HFS, and HFS Plus. To enable extended file system recovery for UNIX-based virtual machines, deploy a FREL or convert a Linux MediaAgent to a File Recovery Enabler. For more information, see Converting a Linux MediaAgent to a VMware File Recovery Enabler for Linux.
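The guest OS rules above can be summarized in a short Python sketch. This is only an illustration of the decision logic described in this section, not a Commvault API: the function name `eligible_node_types` and its parameters are hypothetical, and the sketch does not model the Linux version requirement or release-matching constraints.

```python
def eligible_node_types(guest_os, disk_type="basic", filesystem="ntfs",
                        storage_spaces=False):
    """Return node OS types that can file index a VM, preferred type first.

    Hypothetical helper: names and parameters are illustrative assumptions.
    """
    guest = guest_os.lower()
    if guest == "linux":
        # Linux VMs require a Linux node.
        return ["linux"]
    if guest == "windows":
        # A Linux node is only an option for basic disks with NTFS
        # and no storage spaces; a Windows node is always preferred.
        linux_ok = (
            disk_type.lower() == "basic"
            and filesystem.lower() == "ntfs"
            and not storage_spaces
        )
        return ["windows", "linux"] if linux_ok else ["windows"]
    raise ValueError(f"unsupported guest OS: {guest_os!r}")


# A Windows VM with a dynamic volume must be indexed by a Windows node.
print(eligible_node_types("windows", disk_type="dynamic"))
```

Returning the preferred node type first mirrors the recommendation to use Windows nodes for Windows VMs even when a Linux node would also work.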
Example Scenarios for Calculating the Number of Required Nodes
Scenario 1
Goal:
File index 1,000 Windows VMs with a total file count of 1 billion files.
Number of (Windows) nodes required:
N = 1000 VMs / 150 = 6.67, rounded up to 7
Total number of recommended nodes = 7
Scenario 2
Goal:
File index 450 VMs with 1 million files each, comprising 150 Windows VMs and 300 Linux VMs.
Number of nodes required:
N_Windows = 150 VMs / 150 = 1
N_Linux = 300 VMs / 150 = 2
Total number of recommended nodes = 3
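The calculations in these scenarios can be sketched in Python as follows. This is a minimal illustration, not a Commvault tool: the function name `recommended_nodes` is hypothetical, and taking the larger of the VM-based and file-based counts is an assumption about how the "150 VMs (or 150 million files)" guideline combines.

```python
VMS_PER_NODE = 150              # guideline: 1 node per 150 VMs
FILES_PER_NODE = 150_000_000    # guideline: 1 node per 150 million files


def ceil_div(a, b):
    """Integer division rounded up, without floating point."""
    return -(-a // b)


def recommended_nodes(vm_count, total_files=0):
    """Estimate nodes for one OS type (hypothetical helper).

    Assumption: use whichever limit, VMs or files, demands more nodes.
    """
    if vm_count < 0 or total_files < 0:
        raise ValueError("counts must be non-negative")
    return max(ceil_div(vm_count, VMS_PER_NODE),
               ceil_div(total_files, FILES_PER_NODE))


# Scenario 1: 1,000 Windows VMs, 1 billion files in total.
scenario_1 = recommended_nodes(1000, 1_000_000_000)

# Scenario 2: compute Windows and Linux separately, then add them up.
n_windows = recommended_nodes(150, 150 * 1_000_000)
n_linux = recommended_nodes(300, 300 * 1_000_000)

print(scenario_1, n_windows + n_linux)  # prints "7 3"
```

Running the calculation per operating system type, as in Scenario 2, keeps the Windows and Linux node counts separate.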