Performing Hive Restore Operations Using Workflows

The Hive Restore workflow restores Hive databases. You can configure the Hive Restore workflow from the CommCell Console.

The workflow supports in-place and out-of-place restores at the table level. In-place table level restores are supported only if none of the tables to be restored already exist in the database.
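
The in-place restriction amounts to a disjointness check between the tables already in the destination database and the tables selected for restore. The following is a minimal Python sketch of that check, assuming the table names are available as plain strings. The function names are hypothetical and are not part of the workflow.

```python
def can_restore_in_place(existing_tables, tables_to_restore):
    """Return True when none of the requested tables already exist.

    Hypothetical helper for illustration; not a workflow API.
    """
    # Hive table names are case-insensitive, so compare in lowercase.
    existing = {name.lower() for name in existing_tables}
    requested = {name.lower() for name in tables_to_restore}
    return existing.isdisjoint(requested)


def conflicting_tables(existing_tables, tables_to_restore):
    """List the requested tables that would block an in-place restore."""
    existing = {name.lower() for name in existing_tables}
    return sorted(t for t in tables_to_restore if t.lower() in existing)
```

If `conflicting_tables` returns a non-empty list, those tables must be dropped or renamed first, or the restore must be run out-of-place.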

Data access nodes are inherited from the default subclient of the destination instance.

You can manually execute the workflow from the CommCell Console.

If the name of a Hive external table does not include the .db file name extension, you might have to restore the table manually.
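
To see in advance whether a location is likely to need manual input, you can test its last path component for the .db extension. This is a small Python sketch that assumes the check is a plain suffix test on the HDFS path. The helper name is hypothetical, and the workflow's actual detection logic is not documented here.

```python
from pathlib import PurePosixPath


def has_db_extension(location: str) -> bool:
    """Return True when the last path component ends in .db.

    Hypothetical helper; assumes a plain suffix test is sufficient.
    """
    # HDFS paths use forward slashes, so PurePosixPath is appropriate
    # regardless of the operating system this script runs on.
    return PurePosixPath(location).suffix == ".db"
```

For example, `has_db_extension("/hive/custom_path/mydb")` is false, so a location like that would have to be entered manually.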

Before You Begin

Download the workflow from the Commvault Store.

How Does It Work?

The predefined workflow automates the following operations:

  1. The browse operation runs on the source client at the backup set level to fetch the databases that are backed up.

  2. During the restore operation, the workflow creates the database on the destination client if it does not already exist. Tables are then restored to the new or existing database, provided that the tables to be restored are not already present in it.

  3. When the restore operation is successful, the workflow sends an email to the user who executed the workflow.

    If the restore operation fails, the workflow sends a message to the Job Controller indicating the job ID of the operation.
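
The decision logic in the steps above can be sketched in Python as follows. This is an illustrative model only: the function names, the dictionary that stands in for the destination client, and the message strings are all assumptions, and the code does not call any Commvault API.

```python
def plan_restore(destination, database, tables):
    """Return the ordered actions for restoring tables into a database.

    destination: hypothetical stand-in for the destination client,
                 mapping database name -> set of existing table names.
    """
    actions = []
    if database not in destination:
        # Step 2: the database is created only when it does not exist.
        actions.append(f"create database {database}")
        existing = set()
    else:
        existing = destination[database]

    # Tables are restored only if none of them are already present.
    conflicts = sorted(set(tables) & existing)
    if conflicts:
        raise ValueError(f"tables already exist: {', '.join(conflicts)}")

    actions.extend(f"restore table {database}.{t}" for t in tables)
    return actions


def notify(succeeded, user_email, job_id):
    """Step 3: choose the notification channel from the job outcome."""
    if succeeded:
        return f"email to {user_email}: restore completed"
    return f"Job Controller message: restore job {job_id} failed"
```

Calling `plan_restore({}, "hadoop72", ["t1"])` yields a create-database action followed by one restore action, while a destination that already contains `t1` raises an error instead of overwriting the table.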

Procedure

  1. From the CommCell Browser, go to Workflows.

    The Workflows window appears.

  2. Right-click Hive Restore, and then go to All Tasks > Execute.

    The Hive Restore dialog box is displayed.

  3. From the Run workflow on list, select the engine to use to execute the workflow.

    1. For the browse operation, provide values for the workflow inputs:

      | Workflow input | Description |
      |---|---|
      | Source Client Name | Name of the Hadoop source client where the backup operation was run. |
      | To Time | The point in time up to which data is browsed. |
      | Instance Name | Name of the Hadoop source instance. |
      | Browse from Copy Precedence | The copy from which the data must be accessed. |

    2. Click OK on the information box.

      The Select the database to restore dialog box appears.

    3. To select the source database details, provide values for the workflow inputs:

      | Workflow input | Description |
      |---|---|
      | Source Database Name | Name of the Hadoop source database. For example, hadoop52. |
      | Source Client Name | Name of the Hadoop source client where the backup operation was run. |
      | Source Instance Name | Name of the Hadoop source instance. |

    4. If the Hive source database is not automatically populated (when the Hive database directory does not have the .db extension), provide values for the workflow inputs:

      | Workflow input | Description |
      |---|---|
      | Source Database Location | The location of the Hadoop source database on HDFS. For example, /hive/custom_path/mydb. |
      | Source Database Name | The Hadoop source database. For example, hadoop52. |

    5. Click Next.

      The Select tables to restore dialog appears.

    6. To select the required tables for the restore operation, provide values for the workflow inputs:

      | Workflow input | Description |
      |---|---|
      | Table/s | Names of the tables to be restored. If no tables are selected, the entire database is restored. |

    7. Click Next.

      The Select Destination Client dialog appears.

    8. To provide the destination database details, provide values for the workflow inputs:

      | Workflow input | Description |
      |---|---|
      | Destination Database Name | Name of the Hadoop destination database. For example, hadoop72. |
      | Destination Client Name | Name of the Hadoop destination client. |

      The destination details are auto-populated for default in-place restore operations.

    9. Click Next.

      The Select Destination Instance dialog appears.

    10. To select the destination instance details, provide values for the workflow inputs:

      | Workflow input | Description |
      |---|---|
      | Destination Instance Name | Name of the Hadoop destination instance. |
      | Hive Node | Name of the node where the Hive application is installed. |
      | Connection String | The connection URL that is used to connect to the Hive application. |
      | Hive User | The Hive user name. If no input is provided, the default user name is hive. |
      | Hive Password | Password that the Hive user uses to connect to the Hive application. |
      | HDFS User | The HDFS user name, if a non-default user is used. |
      | Table Location | The destination location on the HDFS file system to which the table data is restored. For example, hdfs://ns1/apps/hive/warehouse/firstdatabase.db/table1 |
      | Number of Access Nodes | The number of data access nodes that will participate in the restore operation. |

    11. Click Next.

      The Summary dialog appears.

    12. Click Finish.

  4. On the Job Initiation tab, specify whether you want to run the job immediately or to schedule it.

  5. Click OK.
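
Before you run the workflow, it can help to collect the destination inputs in one place and check them. The following Python sketch is one way to do that. The class and function names are hypothetical, and the `jdbc:hive2://` form of the connection string is the standard HiveServer2 JDBC URL format, assumed here rather than confirmed as the form that the workflow expects.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from urllib.parse import urlsplit


@dataclass
class DestinationInputs:
    """Hypothetical container mirroring the destination instance inputs."""
    instance_name: str
    hive_node: str
    connection_string: str  # assumed form: jdbc:hive2://<host>:<port>/<db>
    table_location: str
    number_of_access_nodes: int
    hive_user: str = ""

    def effective_hive_user(self) -> str:
        # The default user name is hive when no input is provided.
        return self.hive_user or "hive"


def split_table_location(location: str):
    """Split an HDFS table location into (nameservice, parent dir, table)."""
    parts = urlsplit(location)
    path = PurePosixPath(parts.path)
    return parts.netloc, path.parent.name, path.name


inputs = DestinationInputs(
    instance_name="hadoop72",
    hive_node="hivenode.example.com",  # placeholder host name
    connection_string="jdbc:hive2://hivenode.example.com:10000/default",
    table_location="hdfs://ns1/apps/hive/warehouse/firstdatabase.db/table1",
    number_of_access_nodes=2,
)
```

With these values, `inputs.effective_hive_user()` falls back to `hive`, and `split_table_location(inputs.table_location)` separates the nameservice, the database directory, and the table name so that each can be checked against the intended destination.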

What to Do Next

You can view all workflow jobs that have executed. See Viewing the Workflow Job History.
