Getting Started with HBase

Verify that your environment meets the requirements for protecting HBase data.

Step 1: Review the System Requirements

Review the requirements for HBase. For more information, see System Requirements.

Step 2: Install the Hadoop Package on Data Access Nodes

To prepare for the installation and to select the best installation method for your environment, review the installation topics for the Hadoop package.

If you also want to use the data access nodes as MediaAgents, install the MediaAgent package on the nodes. For instructions, see Installing the MediaAgent.

Step 3: Review Requirements for Data Access Nodes

Review the following requirements for data access nodes:

  • The HBase gateway role must be installed on the master node.

  • All data access nodes and the master node must have the same version and service pack of the Commvault software.

  • All data access nodes and the master node must be time synced (see the spot-check below).
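
A quick way to spot-check time sync, assuming passwordless SSH to each node and a nodes.txt file that lists the host names (both are assumptions for this sketch):

# Print each node's clock as UTC epoch seconds; the values should match closely.
for host in $(cat nodes.txt); do
    echo -n "$host: "; ssh "$host" date -u +%s
done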

Step 4: To Access the HDFS File System, Configure the Nodes Used for Backup and Restore Operations as HDFS Clients on the Hadoop Cluster

After you set the correct Hadoop bin path for the root user, verify that you can run the following commands as the root user without errors. (Start the Commvault services from the same environment.)

hdfs dfs -ls /
hadoop classpath --glob
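
If these commands are not found, the Hadoop bin directory is probably missing from the root user's PATH. A minimal sketch of setting it in /root/.bashrc, assuming Hadoop is installed under /opt/hadoop (adjust the path to your distribution's layout):

# Assumed install location; verify against your Hadoop distribution.
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin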

Step 5: In Secure Hadoop Environments, Provide the Keytab File Location in the Configuration File on Data Access Nodes

For Kerberos authentication, a keytab file is used to authenticate to the Key Distribution Center (KDC). Add the keytab file location as a property in the hbase-site.xml configuration file on all data access nodes, including the master node. The hbase-site.xml file is located under the hadoop_installation_directory/conf/ directory.

Example

<property>
<name>hbase.client.keytab.file</name>
<value>/etc/krb5.keytab</value>
</property>

where /etc/krb5.keytab is the keytab file location.
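
To confirm that the keytab is readable and contains the expected principal, you can inspect and test it with the standard Kerberos tools; the principal shown below is a hypothetical example, so substitute your cluster's principal:

# List the principals stored in the keytab.
klist -kt /etc/krb5.keytab
# Obtain a ticket using the keytab (hypothetical principal shown).
kinit -kt /etc/krb5.keytab hbase/$(hostname -f)@EXAMPLE.COM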

To change the default path of the keytab file, see Changing the Default Path of the Keytab File.

Step 6: For Cloudera Distributions, Provide the Superuser Name with the Necessary Permissions in the Configuration File on the Master Node

Edit the hdfs-site.xml file located under the hadoop_installation_directory/conf/ directory on the master node, and add the following entry.

Example

<property>
<name>hbase.superuser</name>
<value>hbase</value>
</property>
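
As a quick sanity check (not part of the documented procedure), you can confirm the identity that HBase resolves for the current user with the HBase shell's whoami command:

# Run whoami non-interactively through the HBase shell.
echo "whoami" | hbase shell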

Step 7: Prepare for Your First Backup Operation and Restore Operation

  1. Open the CommCell Console.

  2. Configure a storage device.

  3. Create a pseudoclient for Hadoop.

  4. Add an HBase App to a Hadoop instance.

  5. Plan your backup by considering the following configuration items.

    Decide whether to schedule backups. Scheduling ensures that the data is backed up automatically at regular intervals. Gather the following information before you configure schedules:

    • The schedule name

    • The schedule frequency

    • The schedule start time and start date

    • The schedule exceptions

    Then create the schedules for the backups. For more information, see Scheduling.

    Decide whether to encrypt the data for transmission over non-secure networks or storage media. Gather the following information before you configure encryption:

    • The encryption type

    • The encryption level (client, instance, or subclient)

    Then configure the encryption. For more information, see Data Encryption.

Step 8: Run the First Backup Operation and Restore Operation

  1. Perform a full backup operation.

  2. Perform an in-place restore operation.

Step 9: What to Do Next

Configure data retention and data aging. For more information, see Data Aging - Getting Started.