The Hive Restore workflow restores Hive databases. You can configure the Hive Restore workflow from the CommCell Console.
The workflow supports in-place and out-of-place restores at the table level. In-place table-level restores are supported only if none of the tables to be restored already exist in the database.
Data access nodes are inherited from the default subclient of the destination instance.
You can manually execute the workflow from the CommCell Console.
If the name of the Hive external table does not have the .db file name extension, you might have to manually restore the table.
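The .db check described above can be illustrated with a short sketch. It is not the workflow's own detection logic, which is not documented here. The function name needs_manual_database_input is hypothetical, and the sketch assumes that only the last component of the database directory path is inspected.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def needs_manual_database_input(database_location: str) -> bool:
    """Return True when the database directory name lacks the .db extension.

    Hypothetical helper: assumes the check looks only at the final
    path component of the database location on HDFS.
    """
    # Drop an optional scheme and authority (for example hdfs://ns1)
    # and keep only the path portion.
    path = PurePosixPath(urlparse(database_location).path)
    return path.suffix != ".db"


# A custom location without the extension would need manual input.
print(needs_manual_database_input("/hive/custom_path/mydb"))
# A warehouse-style directory that ends in .db would not.
print(needs_manual_database_input("hdfs://ns1/apps/hive/warehouse/firstdatabase.db"))
```

Under these assumptions, the first call reports that manual input is needed and the second does not.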
Before You Begin
Download the workflow from the Commvault Store.
How Does It Work?
The predefined workflow automates the following operations:
-
The browse operation runs on the source client at the backup set level to fetch the databases that are backed up.
-
During a restore operation, if the database does not exist on the destination client, a new database is created. Tables are restored to the existing or new database only if the tables to be restored are not already present in it.
-
When the restore operation is successful, the workflow sends an email to the user that executed the workflow.
If the restore operation fails, the workflow sends a message to the Job Controller indicating the job ID of the operation.
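The database and table rules above can be sketched as a small pre-check. This is an illustration under stated assumptions, not the workflow's implementation: the function plan_restore and the dictionary that maps database names to sets of table names are hypothetical, and the sketch covers table-level restores only.

```python
from typing import Dict, List, Set


def plan_restore(
    destination_catalog: Dict[str, Set[str]],
    destination_db: str,
    tables: List[str],
) -> dict:
    """Decide what a table-level restore would need to do.

    destination_catalog is a hypothetical snapshot of the destination:
    database name -> names of the tables that already exist in it.
    """
    existing_tables = destination_catalog.get(destination_db)

    # A missing database is created before tables are restored.
    create_database = existing_tables is None

    # Tables are restored only if none of them already exist.
    conflicts = sorted(set(tables) & (existing_tables or set()))
    if conflicts:
        raise ValueError(
            f"Tables already exist in {destination_db}: {', '.join(conflicts)}"
        )

    return {"create_database": create_database, "tables_to_restore": list(tables)}


# The destination database exists and table2 is not present, so the
# plan restores table2 without creating a database.
print(plan_restore({"hadoop72": {"table1"}}, "hadoop72", ["table2"]))
```

Raising an error on a conflict mirrors the stated restriction that tables are restored only when they are not already present. How the workflow itself reports such a conflict is not described in this document.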
Procedure
-
From the CommCell Browser, go to Workflows.
The Workflows window appears.
-
Right-click Hive Restore, and then go to All Tasks > Execute.
The Hive Restore dialog box is displayed.
-
From the Run workflow on list, select the engine to use to execute the workflow.
-
For the browse operation, provide values for the workflow inputs:
Workflow input
Description
Source Client Name
Name of the Hadoop source client where the backup operation was run.
To Time
The point in time up to which data is browsed.
Instance Name
Name of the Hadoop source instance.
Browse from Copy Precedence
The copy from which the data must be accessed.
-
Click OK on the information box.
The Select the database to restore dialog box appears.
-
To select the source database details, provide values for the workflow inputs:
Workflow input
Description
Source Database Name
Name of the Hadoop source database. For example, hadoop52.
Source Client Name
Name of the Hadoop source client where the backup operation was run.
Source Instance Name
Name of the Hadoop source instance.
-
If the Hive source database is not automatically populated (when the Hive database directory does not have the .db extension), provide values for the workflow inputs:
Workflow input
Description
Source Database Location
The location of the Hadoop source database on HDFS. For example, /hive/custom_path/mydb.
Source Database Name
The Hadoop source database. For example, hadoop52.
-
Click Next.
The Select tables to restore dialog appears.
-
To select the required tables for the restore operation, provide values for the workflow inputs:
Workflow input
Description
Table/s
Names of the tables to be restored.
If no tables are selected, then the entire database is restored.
-
Click Next.
The Select Destination Client dialog appears.
-
To specify the destination database details, provide values for the workflow inputs:
Workflow input
Description
Destination Database Name
Name of the Hadoop destination database. For example, hadoop72.
Destination Client Name
Name of the Hadoop destination client.
The destination details are auto-populated for default in-place restore operations.
-
Click Next.
The Select Destination Instance dialog appears.
-
To select the destination instance details, provide values for the workflow inputs:
Workflow input
Description
Destination Instance Name
Name of the Hadoop destination instance.
Hive Node
Name of the node where the Hive application is installed.
Connection String
The connection URL that is used to connect to the Hive application.
Hive User
The Hive user name. If no input is provided, the default user name is hive.
Hive Password
The password that the Hive user uses to connect to the Hive application.
HDFS User
The HDFS user name, if a non-default user is used.
Table Location
The destination location to restore the table data, on the HDFS file system. For example, hdfs://ns1/apps/hive/warehouse/firstdatabase.db/table1
Number of Access Nodes
The number of data access nodes that will participate in the restore operation.
-
Click Next.
The Summary dialog appears.
-
Click Finish.
-
On the Job Initiation tab, specify whether you want to run the job immediately or to schedule it.
-
Click OK.
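As a way to prepare the destination instance values before you run the procedure, the inputs can be collected in a small structure. This is a hedged sketch: the class DestinationInputs, its field names, and the rule that at least one access node is required are assumptions made for illustration. The host name hivenode1 and the connection string are placeholder values. Only the default Hive user name, hive, comes from the input descriptions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DestinationInputs:
    """Hypothetical container for the destination instance inputs."""

    destination_instance_name: str
    hive_node: str
    connection_string: str
    table_location: str
    number_of_access_nodes: int
    hive_user: Optional[str] = None
    hive_password: Optional[str] = None
    hdfs_user: Optional[str] = None

    def effective_hive_user(self) -> str:
        # The documented default user name is hive when no input is given.
        return self.hive_user or "hive"

    def validate(self) -> None:
        # Assumption: a restore needs at least one data access node.
        if self.number_of_access_nodes < 1:
            raise ValueError("Number of Access Nodes must be at least 1")


inputs = DestinationInputs(
    destination_instance_name="hadoop72",  # placeholder instance name
    hive_node="hivenode1",  # placeholder host name
    connection_string="jdbc:hive2://hivenode1:10000/default",  # placeholder URL
    table_location="hdfs://ns1/apps/hive/warehouse/firstdatabase.db/table1",
    number_of_access_nodes=2,
)
inputs.validate()
print(inputs.effective_hive_user())
```

Because no Hive user is supplied in this example, effective_hive_user falls back to hive.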
What to Do Next
You can view all workflow jobs that have executed. See Viewing the Workflow Job History.