Getting Started with HBase
Verify that your environment meets the requirements for protecting HBase data.
Step 1: Review the System Requirements
Review the requirements for HBase. For more information, see System Requirements.
Step 2: Install the Hadoop Package on Data Access Nodes
To prepare for the installation and to select the best installation method for your environment, review the following topics:
- Prepare the Installation on UNIX, Linux, and Macintosh Computers
- Preinstallation Checklist for Hadoop on Linux
- Installation Methods
Note: When you upgrade the existing data access nodes (and the master node) using push installation, the Hadoop package is installed automatically on all nodes.
If you want to also use the data access nodes as MediaAgents, then install the MediaAgent package on the nodes. For instructions, see Installing the MediaAgent.
Step 3: Review Requirements for Data Access Nodes
Review the following requirements for data access nodes:
- The HBase gateway role must be installed on the master node.
- All data access nodes and the master node that you include in backups must share the same Job Results folder. For information about changing the path of the Job Results folder, see Changing the Path of the Job Results Directory.
- All data access nodes and the master node must have the same version and service pack of the Commvault software.
- All data access nodes and the master node must be time-synchronized (for example, by using NTP).
Step 4: Configure the Nodes Used for Backup and Restore Operations as HDFS Clients So That They Can Access the HDFS File System
After you set the correct Hadoop bin path for the root user, verify that you are able to run the following commands as the root user without any errors. (Start the Commvault services from the same environment.)
hdfs dfs -ls /
hadoop classpath --glob
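Setting the Hadoop bin path typically means adding the Hadoop binaries to the root user's PATH. A minimal sketch of a profile fragment, assuming Hadoop is installed under /usr/local/hadoop (an assumed location; adjust it for your installation):

```shell
# Append to /root/.bashrc (or the root user's shell profile).
# /usr/local/hadoop is an assumed installation directory; adjust as needed.
export HADOOP_HOME=/usr/local/hadoop
export PATH="$PATH:$HADOOP_HOME/bin"
```

After sourcing the profile, re-run the two verification commands above, and start the Commvault services from the same shell so that they inherit the updated environment.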
Step 5: In Secure Hadoop Environments, Provide the Keytab File Location in the Configuration File on Data Access Nodes
For Kerberos authentication, a keytab file is used to authenticate to the Key Distribution Center (KDC). Add the keytab file location as a property in the hbase-site.xml configuration file on all data access nodes, including the master node. The hbase-site.xml file is located under the hadoop_installation_directory/conf/ directory.
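The added entry might look like the following sketch. The property name `hbase.keytab.file` is an assumption for illustration only; use the property name that your Commvault version requires:

```xml
<!-- hbase-site.xml on every data access node, including the master node -->
<!-- NOTE: the property name below is illustrative; confirm the exact
     name required by your Commvault version before using it. -->
<property>
  <name>hbase.keytab.file</name>
  <value>/etc/krb5.keytab</value>
</property>
```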
where /etc/krb5.keytab is the keytab file location
Step 6: For Cloudera Distributions, Provide the Super Username with Necessary Permissions in the Configuration File on the Master Node
Edit the hdfs-site.xml file located under the hadoop_installation_directory/conf/ directory on the master node, and add the following entry.
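As a sketch, the entry pairs a property name with the super user name. Both the property name `hdfs.superuser` and the value `hdfs` below are assumptions for illustration; substitute the exact entry given in your Commvault documentation:

```xml
<!-- hdfs-site.xml on the master node -->
<!-- NOTE: both the property name and the value are illustrative;
     use the entry specified in your Commvault documentation. -->
<property>
  <name>hdfs.superuser</name>
  <value>hdfs</value>
</property>
```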
Step 7: Prepare for Your First Backup Operation and Restore Operation
- Open the CommCell Console.
- Configure a storage device.
- Create a pseudoclient for Hadoop.
- Add an HBase App to a Hadoop instance.
- Plan your backup by considering the following configuration items.
Decide whether to schedule backups. Scheduled backups ensure that data is backed up automatically at regular intervals. Gather the following information before you configure schedules:
- The schedule name
- The schedule frequency
- The schedule start time and start date
- The schedule exceptions
Create schedules for the backups.
Decide whether to encrypt the data for transmission over non-secure networks or for storage on media. Gather the following information before you configure encryption:
- The encryption type
- The encryption level (client, instance, subclient)
Configure the encryption.
Step 8: Run the First Backup Operation and Restore Operation
Step 9: What to Do Next
Configure data retention and data aging. For more information, see Data Aging - Getting Started.
Last modified: 2/15/2019 12:06:22 PM