Step 1: Review the following requirements and supported features.
Step 2: Install the Hadoop Package on Data Access Nodes
Review each of the following topics to prepare for the installation and to select the best installation method for your environment.
- Prepare the Installation on UNIX, Linux, and Macintosh Computers
Note: When you upgrade the existing data access nodes (and the master node) by using push installation, the Hadoop package is installed automatically on all of the nodes.
If you also want to use the data access nodes as MediaAgents, install the MediaAgent package on those nodes. For instructions, see Installing the MediaAgent.
Notes:
- All of the participating data access nodes and the master node must be at the same service pack level.
- The system clocks on all of the participating data access nodes and the master node must be synchronized.
Step 3: Configure All Nodes that You Want to Use for Backup and Restore Operations as HDFS Clients on the Hadoop Cluster, so that the Nodes have Access to the HDFS File System.
Verify that the following commands run without errors:
hdfs dfs -ls /
hadoop classpath --glob
Note: Verify that the Hadoop bin path is set correctly in the root user's environment and that you can run the preceding commands successfully as the root user. Start the Commvault services from that same environment.
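The verification in this step can be scripted so that it is repeatable on every node. The following is a minimal sketch, not part of the product: the helper names `run_check` and `verify_hdfs_client` are hypothetical, and the sketch assumes that Python 3 is available on the node and that you run it as the root user from the same environment that starts the Commvault services.

```python
import shutil
import subprocess

# The two commands that this step asks you to verify.
HDFS_CLIENT_CHECKS = [
    ["hdfs", "dfs", "-ls", "/"],
    ["hadoop", "classpath", "--glob"],
]


def run_check(command):
    """Run one command and return (ok, detail). Hypothetical helper."""
    # A missing executable usually means the Hadoop bin path is not on PATH.
    if shutil.which(command[0]) is None:
        return False, f"{command[0]} not found on PATH"
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode != 0:
        return False, result.stderr.strip() or f"exit code {result.returncode}"
    return True, "ok"


def verify_hdfs_client(checks=HDFS_CLIENT_CHECKS):
    """Return a list of (command_string, ok, detail), one entry per check."""
    report = []
    for command in checks:
        ok, detail = run_check(command)
        report.append((" ".join(command), ok, detail))
    return report


if __name__ == "__main__":
    for name, ok, detail in verify_hdfs_client():
        print(f"{'PASS' if ok else 'FAIL'}: {name} ({detail})")
```

A `FAIL` line that reports "not found on PATH" suggests that the Hadoop bin path is missing from the environment. Any other failure shows the error text that the command wrote to standard error.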
Step 4: In Secure Hadoop Environments, Provide the Keytab File Location in the Configuration File on Data Access Nodes
For Kerberos authentication, a keytab file is used to authenticate to the Key Distribution Center (KDC). Add the keytab file location as a property in the hdfs-site.xml configuration file on all data access nodes, including the master node. The hdfs-site.xml file is located under the hadoop_installation_directory/conf/ directory.
Example:
<property>
  <name>hadoop.user.keytab.file</name>
  <value>/etc/krb5.keytab</value>
</property>
To change the default path of the keytab file, see Changing the Default Path of the Keytab File.
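To confirm that the keytab property is present on a node, you can read hdfs-site.xml and check that the referenced file exists. The following is a minimal sketch under stated assumptions: the function names `read_hadoop_property` and `check_keytab` are hypothetical, and the sketch assumes the standard Hadoop configuration layout in which each `<property>` element contains a `<name>` and a `<value>`.

```python
import os
import xml.etree.ElementTree as ET

# Property name taken from the example in this step.
KEYTAB_PROPERTY = "hadoop.user.keytab.file"


def read_hadoop_property(config_path, property_name):
    """Return the <value> text for property_name, or None if it is absent."""
    root = ET.parse(config_path).getroot()
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == property_name:
            value = prop.findtext("value")
            return value.strip() if value else None
    return None


def check_keytab(config_path):
    """Return (ok, message) describing the keytab setting in config_path."""
    keytab = read_hadoop_property(config_path, KEYTAB_PROPERTY)
    if keytab is None:
        return False, f"{KEYTAB_PROPERTY} is not set in {config_path}"
    if not os.path.isfile(keytab):
        return False, f"keytab file {keytab} does not exist"
    if not os.access(keytab, os.R_OK):
        return False, f"keytab file {keytab} is not readable"
    return True, f"keytab file {keytab} found"
```

For example, `check_keytab("hadoop_installation_directory/conf/hdfs-site.xml")` returns a pass or fail flag and a short message, where the path is the placeholder used in this step and must be replaced with the actual location on the node. The sketch checks only that the file exists and is readable; it does not validate the keytab against the KDC.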
Step 5: Prepare for Your First Backup and Restore
- Configure a storage device.
  - To configure a disk library, see Disk Libraries - Getting Started.
  - To configure a tape library, see Tape Libraries - Getting Started.
- Create an appropriate subclient based on your requirements.
Requirement
More Information
Back up Hadoop data
Archive Hadoop data
- Decide whether you want the following additional functionality:
  - You want to be notified of events that require attention. For more information, see Alerts and Notifications - Overview.
  - You want to report on and analyze critical data about your CommCell operations. For more information about reports, see Reports Overview.
  - You want to manage security. For more information, see User Account and Password Management - Getting Started.
Step 6: Run Your First Backup and Restore
Step 7: What to Do Next
Configure data retention and data aging. For more information, see Data Aging - Getting Started.