Performing Hive Restore Operations Using Workflows

The Hive Restore workflow restores Hive databases. You can configure the Hive Restore workflow from the CommCell Console.

The workflow supports in-place and out-of-place restores at the table level. In-place table level restores are supported only if none of the tables to be restored already exist in the database.
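The in-place rule above amounts to checking that the set of requested tables and the set of existing tables do not overlap. The following Python sketch is illustrative only: the function name and its inputs are hypothetical and are not part of the workflow, which performs its own checks.

```python
def in_place_restore_allowed(tables_to_restore, existing_tables):
    """Return True when none of the tables to restore already exist.

    Both arguments are plain iterables of table names (hypothetical inputs
    used only for illustration).
    """
    # Hive identifiers are case-insensitive, so compare names in lower case.
    requested = {name.lower() for name in tables_to_restore}
    present = {name.lower() for name in existing_tables}
    return requested.isdisjoint(present)
```

For example, restoring table1 into a database that already contains table1 would not qualify for an in-place restore under this check.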

Data access nodes are inherited from the default subclient of the destination instance.

You can manually execute the workflow from the CommCell Console.

If the name of the Hive external table does not contain .db as the file name extension, you might have to restore the table manually.

Before You Begin

Download the workflow from the Commvault Store.

How Does It Work?

The predefined workflow automates the following operations:

The browse operation runs on the source client at the backup set level to fetch the databases that are backed up.
During a restore operation, the workflow creates the database on the destination client if the database does not already exist. Tables are restored to the existing or new database only if the tables to be restored are not already present in it.
When the restore operation is successful, the workflow sends an email to the user that executed the workflow.

If the restore operation fails, the workflow sends a message to the Job Controller indicating the job ID of the operation.
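The sequence described above can be pictured as a small decision flow. The Python sketch below simulates it with in-memory dictionaries; it is not the workflow's implementation. All names (run_hive_restore, the dictionaries, and the notification labels) are hypothetical, and the actual browse, restore, and notification steps are performed by the workflow itself.

```python
def run_hive_restore(source_dbs, destination, source_db, tables=None, dest_db=None):
    """Simulate the restore decision flow (hypothetical model, not product code).

    source_dbs:  {database name: [table names]}, standing in for browse results
    destination: {database name: set of table names}, updated in place
    tables:      tables to restore; None or empty means the entire database
    dest_db:     destination database name; defaults to the source name
    """
    dest_db = dest_db or source_db  # in-place restore by default

    # The browse step must find the database in the backup.
    if source_db not in source_dbs:
        return ("failed", "job_controller_message")

    # No table selection means the whole database is restored.
    selected = list(tables) if tables else list(source_dbs[source_db])

    # Create the destination database only if it does not exist yet.
    existing = destination.setdefault(dest_db, set())

    # Tables are restored only if none of them is already present.
    if any(name in existing for name in selected):
        return ("failed", "job_controller_message")

    existing.update(selected)
    return ("success", "email")
```

In this model, a second in-place run for a table that was already restored returns the failure branch, while the same table restored to a different destination database succeeds.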

Procedure

From the CommCell Browser, go to Workflows.

The Workflows window appears.
Right-click Hive Restore, and then go to All Tasks > Execute.

The Hive Restore dialog box is displayed.

From the Run workflow on list, select the engine to use to execute the workflow.

For the browse operation, provide values for the workflow inputs:

Workflow input	Description
Source Client Name	Name of the Hadoop source client where the backup operation was run.
To Time	The point in time up to which the backed-up data is browsed.
Instance Name	Name of the Hadoop source instance.
Browse from Copy Precedence	The copy from which the data must be accessed.

Click OK on the information box.

The Select the database to restore dialog box appears.

To select the source database details, provide values for the workflow inputs:

Workflow input	Description
Source Database Name	Name of the Hadoop source database. For example, hadoop52.
Source Client Name	Name of the Hadoop source client where the backup operation was run.
Source Instance Name	Name of the Hadoop source instance.

If the Hive source database is not automatically populated (when the Hive database directory does not have the .db extension), provide values for the workflow inputs:

Workflow input	Description
Source Database Location	The location of the Hadoop source database on HDFS. For example, /hive/custom_path/mydb.
Source Database Name	The Hadoop source database. For example, hadoop52.
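To anticipate whether the source database will be populated automatically, you can check whether the database directory name ends with .db. The Python sketch below shows one way to do that check. The function name is hypothetical, and the second path in the usage note is an assumed example, not a value taken from the workflow.

```python
import posixpath


def has_db_extension(database_location):
    """Return True if the last component of an HDFS path ends with ".db"."""
    # Strip a trailing slash so "/warehouse/sales.db/" is handled as well.
    name = posixpath.basename(database_location.rstrip("/"))
    return name.endswith(".db")
```

With this check, /hive/custom_path/mydb is reported as lacking the extension, so the Source Database Location and Source Database Name inputs would have to be entered manually, whereas a path such as /apps/hive/warehouse/firstdatabase.db passes.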

Click Next.

The Select tables to restore dialog appears.
To select the required tables for the restore operation, provide values for the workflow inputs:

Workflow input	Description
Table/s	Name of the tables to be restored.

If no tables are selected, then the entire database is restored.
Click Next.

The Select Destination Client dialog appears.
To specify the destination database details, provide values for the workflow inputs:

Workflow input	Description
Destination Database Name	Name of the Hadoop destination database. For example, hadoop72.
Destination Client Name	Name of the Hadoop destination client.

The destination details are auto-populated for default in-place restore operations.
Click Next.

The Select Destination Instance dialog appears.

To select the destination instance details, provide values for the workflow inputs:

Workflow input	Description
Destination Instance Name	Name of the Hadoop destination instance.
Hive Node	Name of the node where the Hive application is installed.
Connection String	The connection URL that is used to connect to the Hive application.
Hive User	The Hive user name. If no input is provided, the default user name is `hive`.
Hive Password	The password that the Hive user uses to connect to the Hive application.
HDFS User	The HDFS user name, if a non-default user is used.
Table Location	The destination location on HDFS to which the table data is restored. For example, hdfs://ns1/apps/hive/warehouse/firstdatabase.db/table1.
Number of Access Nodes	The number of data access nodes that will participate in the restore operation.
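The following Python sketch shows how values for some of these inputs might be assembled. It assumes that the Connection String is a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database and that HiveServer2 listens on its default port 10000; the source does not state the expected URL format, so confirm it for your environment. The function name, the host name hivenode01, and the warehouse_root parameter are hypothetical.

```python
def build_restore_inputs(hive_node, database, table, warehouse_root,
                         hive_user="", port=10000):
    """Assemble example values for the destination instance inputs.

    Assumption: the connection string is a HiveServer2 JDBC URL.
    warehouse_root is the HDFS directory that holds the <database>.db folders.
    """
    connection_string = f"jdbc:hive2://{hive_node}:{port}/{database}"
    # Table data is placed under <warehouse_root>/<database>.db/<table>.
    table_location = f"{warehouse_root.rstrip('/')}/{database}.db/{table}"
    return {
        "Connection String": connection_string,
        # The workflow falls back to the user "hive" when no user is given.
        "Hive User": hive_user or "hive",
        "Table Location": table_location,
    }
```

Calling build_restore_inputs("hivenode01", "firstdatabase", "table1", "hdfs://ns1/apps/hive/warehouse") yields the table location used in the example above, hdfs://ns1/apps/hive/warehouse/firstdatabase.db/table1.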

Click Next.

The Summary dialog appears.
Click Finish.

On the Job Initiation tab, specify whether you want to run the job immediately or to schedule it.
Click OK.

What to Do Next

You can view all workflow jobs that have executed. See Viewing the Workflow Job History.