To improve infrastructure resilience and reduce manual recovery effort, you can perform a safe reboot on one or more nodes in a Commvault HyperScale X cluster. A safe reboot runs several checks and operations automatically to keep the system stable.
A safe reboot automates the following operations:
- Stops Commvault services and unmounts the CVFS vdisk on all the nodes
- Prompts the user to confirm that no jobs are currently running
- Performs a cluster health check
- Validates SSH key-based authentication
- Prompts the user to confirm that the storage pool will be offline during a safe cluster reboot
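
The safe reboot validates SSH key-based authentication on its own. If you want to spot-check passwordless SSH before starting, the following is a minimal sketch, not the product's validation logic. The function name `check_ssh_keys`, the `SSH_BIN` variable, and the node names are hypothetical. The `BatchMode` and `ConnectTimeout` options are standard OpenSSH client options.

```shell
#!/usr/bin/env bash
# Hypothetical pre-check: NOT the validation that the safe reboot itself runs.
# SSH_BIN is an assumed override so the check can be exercised without real nodes.
SSH_BIN=${SSH_BIN:-ssh}

# check_ssh_keys USER HOST...
# Prints "OK host" or "FAIL host" for each host and returns non-zero if any host failed.
check_ssh_keys() {
    local user=$1; shift
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts, so the login succeeds
        # only when key-based authentication works.
        if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true 2>/dev/null; then
            echo "OK $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `check_ssh_keys root node2 node3` (placeholder node names) reports one line per node and returns a non-zero exit status if any node rejects key-based login.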
Procedure
- Stop or suspend all running jobs associated with the MediaAgents in the cluster. For more information, see Controlling Jobs.
- Set the MediaAgents to maintenance mode. For more information, see Setting the MediaAgent on Maintenance Mode.
- Using an SSH client program, such as PuTTY on Windows, log on to the node using the `root` or `cvbackupadmin` user credentials.
- Run the safe reboot command.
  - To safe reboot a single node, run the following command:
    `# cvnode --safe_reboot`
  - To reboot all the nodes in the cluster, run the following command:
    `# cvcluster --safe_reboot`
- A prompt to confirm that there are no jobs running on the nodes is displayed. Type `y` to confirm and proceed.
- A prompt indicating that cluster performance may degrade during the reboot operation is displayed. Type `y` to proceed.
- For a cluster reboot, a prompt indicating that the storage pool will remain offline during the reboot operation is displayed. Type `y` to confirm and proceed.
  After the reboot operation, the required Commvault services and Commvault File System (CVFS) services are started automatically on the node.
- Disable maintenance mode for the MediaAgent to bring it online. For more information, see Setting the MediaAgent on Maintenance Mode.
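
The choice between the node-level and cluster-level command in the procedure can be sketched as a small wrapper. This is an illustrative sketch only: the function names `safe_reboot_cmd` and `run_safe_reboot` and the `RUN` variable are hypothetical and not part of the product. The only product commands used are the two shown in the procedure. By default the wrapper prints the command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: maps a scope to one of the two commands from the procedure.
safe_reboot_cmd() {
    case "$1" in
        node)    echo "cvnode --safe_reboot" ;;
        cluster) echo "cvcluster --safe_reboot" ;;
        *)       echo "usage: safe_reboot_cmd node|cluster" >&2; return 2 ;;
    esac
}

# Hypothetical wrapper: prints the command unless RUN=1 is set (an assumed
# convention), so an accidental call does not reboot anything.
run_safe_reboot() {
    local cmd
    cmd=$(safe_reboot_cmd "$1") || return $?
    if [ "${RUN:-0}" = "1" ]; then
        # Intentionally unquoted so the string splits into command and option.
        # The confirmation prompts described in the procedure still appear.
        $cmd
    else
        echo "DRY RUN: $cmd"
    fi
}
```

Calling `run_safe_reboot cluster` prints the cluster-level command for review. Setting `RUN=1` on a node where you are logged on as described in the procedure would run it, and the interactive prompts still need to be answered by hand.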
Additional Safe Reboot and Shutdown Commands
The following commands are available for safe shutdown and reboot operations:
| Command | Description / Additional Options |
|---|---|
| | Displays the status of the current or recent safe reboot operation on the node. |
| | Resumes a previously failed safe reboot operation on the node. |
| | Initiates a safe shutdown operation on the node. |
| | Resumes a previously failed safe shutdown operation on the node. |
| | Displays the status of the current or recent safe shutdown operation on the node. |
| | Displays the status of the current or recent safe reboot operation on the cluster. |
| | Resumes a previously failed safe reboot operation on the cluster. |
| | Initiates a safe shutdown on all the nodes in the cluster. The node that runs the command is identified as the local node, and all other nodes are identified as remote nodes. During the safe shutdown, the remote nodes are shut down first, followed by the local node. After the shutdown, power on the nodes in this order: first power on all remote nodes and wait until they fully boot, and then power on the local node. This ensures that the master (control) node resumes the safe shutdown process and restarts the cluster services. |
| | Resumes a previously failed safe shutdown operation on the cluster. |
| | Displays the status of the current or recent safe shutdown operation on the cluster. |
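
The power-on sequence after a cluster safe shutdown (all remote nodes first, then the local node) can be written down as a short helper for a runbook. This is a sketch under stated assumptions: the function name `power_on_order` and the node names are hypothetical, and the helper only prints an ordering. It does not power on or contact any node.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the power-on order after a cluster safe shutdown.
# $1 is the local node (the node that ran the shutdown command);
# the remaining arguments are all nodes in the cluster.
power_on_order() {
    local local_node=$1; shift
    local node
    for node in "$@"; do
        # Remote nodes are listed first, in the order given.
        [ "$node" = "$local_node" ] || echo "$node"
    done
    # The local node is powered on last, after the remote nodes have fully booted.
    echo "$local_node"
}
```

For a three-node cluster where the shutdown was started from `node1` (a placeholder name), `power_on_order node1 node1 node2 node3` lists `node2` and `node3` before `node1`.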