After installing the Hadoop package on the data access nodes, you must configure a pseudo-client.
Procedure
-
From the CommCell Browser, right-click Client Computers, point to New Client > File System, and then click Hadoop.
The Create Hadoop Client dialog box appears.
-
Enter the HDFS user and library path details to connect to the Hadoop cluster:
-
In the Client Name box, type a name for the pseudo-client.
-
In the Instance Name box, type a name for the instance.
-
In the HDFS URI box, keep the default value, which Commvault recommends. The URI is automatically fetched from the core-site.xml file that is located in the hadoop_installation_directory/conf/ directory.
-
Optional: In the HDFS User box, type the Hadoop user name if a non-root user account is used to manage the Hadoop cluster.
-
In the Hadoop Native Library Path box, type the path to the directory that contains the Hadoop native library (libhdfs.so), or click Browse to select it.
Syntax:
/path_containing_hadoop_install_directory_native_folder
Example:
/usr/hadoop-2.6.1/lib/native
-
In the JVM Library Path box, type the path to the directory that contains the JVM library (libjvm.so), or click Browse to select it.
Syntax:
/path_containing_jvm_jre_folder
Example:
/usr/lib/jvm/java-XX.xx-openjdk-XX.xx.x86_64/jre/lib/amd64/server
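Before you fill in these boxes, it can help to confirm the values on a data access node. The following Python sketch is not part of the Commvault software. It assumes a standard core-site.xml layout, in which the default file system URI is stored under the fs.defaultFS property (fs.default.name in older Hadoop releases), and it only checks whether a directory contains a given library file. The function names read_hdfs_uri and has_library are illustrative.

```python
import os
import xml.etree.ElementTree as ET

# Property names that can hold the default file system URI in core-site.xml.
# fs.default.name is the deprecated name used by older Hadoop releases.
URI_PROPERTIES = ("fs.defaultFS", "fs.default.name")


def read_hdfs_uri(core_site_path):
    """Return the default file system URI from core-site.xml, or None if absent."""
    root = ET.parse(core_site_path).getroot()
    found = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name in URI_PROPERTIES and value:
            found[name] = value
    # Prefer the current property name over the deprecated one.
    for name in URI_PROPERTIES:
        if name in found:
            return found[name]
    return None


def has_library(directory, library_name):
    """Return True if library_name exists as a file directly inside directory."""
    return os.path.isfile(os.path.join(directory, library_name))
```

For example, has_library("/usr/hadoop-2.6.1/lib/native", "libhdfs.so") returns True only when the library is present in that directory, which suggests the directory is a suitable value for the Hadoop Native Library Path box. The same check with "libjvm.so" applies to the JVM Library Path box.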
-
On the Hadoop tab, specify the data access nodes that will be part of the instance.
-
In the Master Node list, select one of the data access nodes as a master node for the instance.
-
Under Data Access Nodes, select the required data access nodes, and then click Add to include them in the instance.
-
In the Number of Data Readers box, enter the number of data streams.
Tip: For optimal sharing of the backup load, the number of data readers must be greater than the number of data access nodes.
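The tip above can be expressed as a simple pre-check on the planned values. This is a minimal sketch, not a Commvault API: the function name check_instance_plan and the node names in the usage note are hypothetical, and the check only covers the rule stated in the tip plus basic sanity checks on the node list.

```python
def check_instance_plan(data_access_nodes, data_readers):
    """Return a list of problems found in a planned Hadoop instance layout."""
    problems = []
    if not data_access_nodes:
        problems.append("select at least one data access node")
    if len(set(data_access_nodes)) != len(data_access_nodes):
        problems.append("a data access node is listed more than once")
    # Rule from the tip: data readers must outnumber data access nodes.
    if data_readers <= len(data_access_nodes):
        problems.append(
            "number of data readers must be greater than "
            "the number of data access nodes"
        )
    return problems
```

Calling check_instance_plan(["node1", "node2"], 3) returns an empty list, while the same nodes with 2 data readers return one problem that describes the rule from the tip.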
-
On the Storage Device tab, select a storage policy from the Storage Policy list.
-
Optional: To create a new storage policy, click Create Storage Policy, and then follow the instructions in the storage policy creation wizard.
-
Optional: To perform LAN-free backups and restores, select a grid storage policy.
For more information, see GridStor® (Alternate Data Paths) - Overview.
-
Click OK.