Rebooting Nodes Safely in a HyperScale X Cluster

To improve infrastructure resilience and reduce manual recovery efforts, you can perform a safe reboot on one or more nodes within a Commvault HyperScale X cluster.

A safe reboot automates several validation steps and operations to ensure system stability.

When you perform a safe reboot, the system automatically:

  • Stops Commvault services and unmounts the CVFS vdisk on all nodes.
  • Prompts you to confirm that no jobs are currently running.
  • Performs a cluster health check.
  • Validates SSH key-based authentication.
  • Prompts you to confirm that the storage pool will be offline during a safe cluster reboot.

Before You Begin

Before rebooting:

  1. Stop or suspend all running jobs associated with the MediaAgents in the cluster.

    For more information, see Controlling Jobs.

  2. Set the MediaAgents to maintenance mode.

    For instructions, see Setting the MediaAgent on Maintenance Mode.

Procedure

  1. Use an SSH client (for example, PuTTY on Windows) to log in to the node using the root or cvbackupadmin user credentials.

  2. Run the appropriate safe reboot command:

    • To reboot a single node:

        cvnode --safe_reboot

    • To reboot all nodes in the cluster:

        cvcluster --safe_reboot

  3. When prompted, confirm the following:

    • That no jobs are currently running (y).

    • That cluster performance may temporarily degrade during the reboot (y).

    • (For cluster reboots only) That the storage pool will remain offline during the operation (y).

  4. After the reboot, the required Commvault services and Commvault File System (CVFS) services are automatically restarted on each node.

  5. Disable maintenance mode for the MediaAgent to bring it back online.

    For more information, see Setting the MediaAgent on Maintenance Mode.
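The login and reboot steps above can be combined into a single invocation from an administrator workstation. The following is a minimal sketch, not a Commvault-provided tool: the function names, the scope labels node and cluster, and the host name hsx-node1 are assumptions made for illustration. Only cvnode --safe_reboot and cvcluster --safe_reboot come from the procedure. The helpers only print the command so that you can review it before running it.

```shell
#!/bin/sh
# Hypothetical helper: print the safe reboot command for a scope.
# "node" and "cluster" are labels chosen for this sketch.
safe_reboot_cmd() {
    case "$1" in
        node)    echo "cvnode --safe_reboot" ;;
        cluster) echo "cvcluster --safe_reboot" ;;
        *)       echo "usage: safe_reboot_cmd node|cluster" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print the ssh invocation that would run the
# safe reboot on a remote node. Arguments: scope, user, host.
ssh_safe_reboot_cmd() {
    scope=$1
    user=$2
    host=$3
    cmd=$(safe_reboot_cmd "$scope") || return 1
    # ssh -t allocates a terminal so the (y) prompts can be answered.
    echo "ssh -t ${user}@${host} '${cmd}'"
}
```

For example, ssh_safe_reboot_cmd node root hsx-node1 prints an ssh command line that you can inspect and then paste into a terminal. The confirmation prompts still have to be answered interactively; the sketch does not attempt to automate them.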

Additional Commands

The following commands are available for safe shutdown and reboot operations.

  • cvnode --lastreboot_status

    Displays the status of the current or most recent safe reboot operation on the node.

  • cvnode --lastreboot_resume

    Resumes a previously failed safe reboot operation on the node.

  • cvnode --safe_shutdown

    Initiates a safe shutdown operation on the node.

  • cvnode --lastshutdown_resume

    Resumes a previously failed safe shutdown operation on the node.

  • cvnode --lastshutdown_status

    Displays the status of the current or most recent safe shutdown operation on the node.

  • cvcluster --lastreboot_status

    Displays the status of the current or most recent safe reboot operation on the cluster.

  • cvcluster --lastreboot_resume

    Resumes a previously failed safe reboot operation on the cluster.

  • cvcluster --safe_shutdown

    Initiates a safe shutdown on all the nodes in the cluster.

    The node that runs the command is identified as the local node, and all other nodes are identified as remote nodes. During the safe shutdown, the remote nodes are shut down first, followed by the local node.

    After the shutdown, power on the nodes in the following order: first power on all remote nodes and wait until they fully boot, and then power on the local node. This ensures that the master (control) node resumes the safe shutdown process and restarts the cluster services.

  • cvcluster --lastshutdown_resume

    Resumes a previously failed safe shutdown operation on the cluster.

  • cvcluster --lastshutdown_status

    Displays the status of the current or most recent safe shutdown operation on the cluster.
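The status and resume commands listed above follow a regular naming pattern: the tool (cvnode or cvcluster), then an option built from the operation (reboot or shutdown) and the action (status or resume). The sketch below assembles the command from those three parts, written with the same double-hyphen form as --safe_reboot in the procedure. The function name last_op_cmd and its argument labels are hypothetical, and the function only prints the command rather than running it, because the output format of these commands is not described here.

```shell
#!/bin/sh
# Hypothetical helper: print the status or resume command for the last
# safe reboot or safe shutdown. Arguments:
#   scope  : node | cluster
#   op     : reboot | shutdown
#   action : status | resume
last_op_cmd() {
    scope=$1
    op=$2
    action=$3
    # Reject anything outside the combinations listed in this section.
    case "$scope"  in node|cluster)    ;; *) return 1 ;; esac
    case "$op"     in reboot|shutdown) ;; *) return 1 ;; esac
    case "$action" in status|resume)   ;; *) return 1 ;; esac
    echo "cv${scope} --last${op}_${action}"
}
```

For example, last_op_cmd cluster shutdown resume prints cvcluster --lastshutdown_resume, which is the command to run on the local node after a failed safe cluster shutdown.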
