Rebooting Nodes Safely in a HyperScale X Cluster

To improve infrastructure resilience and reduce manual recovery effort, you can perform a safe reboot of one or more nodes in a Commvault HyperScale X cluster.

To keep the system stable, a safe reboot automates the following checks and operations:

  • Stops Commvault services and unmounts the CVFS vdisk on all the nodes
  • Prompts the user to confirm that no jobs are currently running
  • Performs a cluster health check
  • Validates SSH key-based authentication
  • Prompts the user to confirm that the storage pool will be offline during a safe cluster reboot
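
The safe reboot performs the SSH validation itself, but it can be useful to confirm key-based access by hand beforehand. The following is a minimal sketch of one way to do that, not the check the safe reboot runs internally. The helper name check_ssh_keys and the host names passed to it are hypothetical; the ssh options BatchMode and ConnectTimeout are standard OpenSSH client options.

```shell
#!/bin/sh
# Sketch only: manually confirm non-interactive SSH from this node to its peers.
# check_ssh_keys is a hypothetical helper; the host names passed to it are
# placeholders for the other nodes in your cluster.
check_ssh_keys() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts, so the login can succeed
        # only through a non-interactive method such as key-based
        # authentication. ConnectTimeout bounds the wait per host.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host" >&2
            failed=1
        fi
    done
    # Non-zero exit status if any host could not be reached without a prompt.
    return "$failed"
}
```

For example, check_ssh_keys node2 node3 (placeholder host names) prints one line per host and returns a non-zero status if any login fails.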

Procedure

  1. Stop or suspend all running jobs associated with the MediaAgents in the cluster. For more information, see Controlling Jobs.
  2. Put the MediaAgents in maintenance mode. For more information, see Setting the MediaAgent on Maintenance Mode.

  3. Using an SSH client program, such as PuTTY on Windows, log on to the node using the root or cvbackupadmin user credentials.

  4. Run the safe reboot command.

    • To safely reboot a single node, run the following command:

      # cvnode --safe_reboot
      
    • To safely reboot all the nodes in the cluster, run the following command:

      # cvcluster --safe_reboot
      
  5. A prompt to confirm that there are no jobs running on the nodes is displayed. Type y to confirm and proceed.

  6. A prompt indicating that cluster performance may degrade during the reboot operation is displayed. Type y to proceed.

  7. For a cluster reboot, a prompt indicating that the storage pool will remain offline during the reboot operation is displayed. Type y to confirm and proceed.

    After the reboot operation, the required Commvault services and Commvault File System (CVFS) services are started automatically on the node.

  8. Disable maintenance mode for the MediaAgents to bring them online. For more information, see Setting the MediaAgent on Maintenance Mode.
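
Step 4 of the procedure can be wrapped in a small script so that the right command is chosen for the scope of the reboot. This is a minimal sketch, assuming it is run on a cluster node by the root or cvbackupadmin user. The helper names safe_reboot_cmd and run_safe_reboot are hypothetical; only the two commands from step 4 come from the procedure. The script does not answer the confirmation prompts in steps 5 through 7; the operator still types y at each one.

```shell
#!/bin/sh
# Sketch only: print the safe reboot command for the requested scope.
# safe_reboot_cmd is a hypothetical helper name.
safe_reboot_cmd() {
    case "$1" in
        node)    echo "cvnode --safe_reboot" ;;
        cluster) echo "cvcluster --safe_reboot" ;;
        *)       echo "usage: safe_reboot_cmd node|cluster" >&2; return 1 ;;
    esac
}

# run_safe_reboot (hypothetical) stops early if the tool is not on PATH,
# then hands control to the interactive command so the operator can
# answer its confirmation prompts.
run_safe_reboot() {
    cmd=$(safe_reboot_cmd "$1") || return 1
    tool=${cmd%% *}   # first word of the command, e.g. cvnode
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "$tool not found; run this on a HyperScale X node" >&2
        return 127
    fi
    $cmd              # intentionally unquoted so the option is a separate word
}
```

Calling run_safe_reboot node reboots only the node you are logged on to, and run_safe_reboot cluster reboots every node in the cluster.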

Additional Safe Reboot and Shutdown Commands

The following commands are available for safe shutdown and reboot operations:

  • cvnode --lastreboot_status: Displays the status of the current or recent safe reboot operation on the node.

  • cvnode --lastreboot_resume: Resumes a previously failed safe reboot operation on the node.

  • cvnode --safe_shutdown: Initiates a safe shutdown operation on the node.

  • cvnode --lastshutdown_resume: Resumes a previously failed safe shutdown operation on the node.

  • cvnode --lastshutdown_status: Displays the status of the current or recent safe shutdown operation on the node.

  • cvcluster --lastreboot_status: Displays the status of the current or recent safe reboot operation on the cluster.

  • cvcluster --lastreboot_resume: Resumes a previously failed safe reboot operation on the cluster.

  • cvcluster --safe_shutdown: Initiates a safe shutdown on all the nodes in the cluster.

    The node that runs the command is identified as the local node, and all other nodes are identified as remote nodes. During the safe shutdown, the remote nodes are shut down first, followed by the local node.

    After the shutdown, power on the nodes in reverse order: first power on all remote nodes and wait until they fully boot, and then power on the local node. This ensures that the master (control) node resumes the safe shutdown process and restarts the cluster services.

  • cvcluster --lastshutdown_resume: Resumes a previously failed safe shutdown operation on the cluster.

  • cvcluster --lastshutdown_status: Displays the status of the current or recent safe shutdown operation on the cluster.
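
The status and resume commands follow a regular naming pattern: the tool (cvnode or cvcluster), the operation (reboot or shutdown), and the action (status or resume). The sketch below builds the command line from those three parts. It assumes the options take the same double-hyphen form as --safe_reboot in the procedure, and the helper name last_op_cmd is hypothetical. It only prints the command; it does not run it or interpret its output.

```shell
#!/bin/sh
# Sketch only: build a status or resume command from its three parts.
# Usage: last_op_cmd node|cluster reboot|shutdown status|resume
# last_op_cmd is a hypothetical helper name.
last_op_cmd() {
    case "$1" in
        node)    tool=cvnode ;;
        cluster) tool=cvcluster ;;
        *)       echo "scope must be node or cluster" >&2; return 1 ;;
    esac
    case "$2" in
        reboot|shutdown) ;;
        *) echo "operation must be reboot or shutdown" >&2; return 1 ;;
    esac
    case "$3" in
        status|resume) ;;
        *) echo "action must be status or resume" >&2; return 1 ;;
    esac
    # Assumes the double-hyphen option form used by --safe_reboot.
    echo "$tool --last${2}_${3}"
}
```

For example, last_op_cmd cluster shutdown resume prints the command that resumes a previously failed safe shutdown on the cluster, which you can then review and run on the node.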
