Performing an In-Place Cassandra Restore

You can perform an in-place restore operation of the data and commit log.

If you enabled incremental backups on the Cassandra cluster, then after the restore or dead node replacement, Cassandra might create several files under the directory named "backups" as part of the compaction process. If you disable incremental backups before the restore, then you can reduce the following:

- The space used for the files that are created under the "backups" directory
- The application size of the next incremental backup

sstableloader Considerations

To perform a repair-free restore, use the sstableloader option. The sstableloader uses the Cassandra APIs to stream data across the nodes. With the sstableloader option, you do not need to perform a repair operation after the restore operation completes.

When you select the option to use the sstableloader, you must provide a path that has enough free space to hold the sstables for the selected keyspaces and column families that you want to restore. The software automatically creates the path when it performs the restore.
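Because the staging path is created during the restore, a free-space check has to target a parent directory that already exists. The following is a minimal sketch of such a check, not part of the product. The directory `/tmp` and the 1 GiB requirement are placeholder assumptions, and the required size must be estimated separately for the keyspaces and column families that you select.

```shell
#!/bin/sh
# Succeeds if the file system that holds directory $1 has at least $2 KiB available.
has_free_space() {
    # df -P forces one output line per file system; -k reports sizes in 1024-byte blocks.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$2" ]
}

# Placeholder values (assumptions): an existing parent of the staging path
# and an estimated size of the sstables to restore, in KiB.
STAGING_PARENT="${STAGING_PARENT:-/tmp}"
REQUIRED_KB="${REQUIRED_KB:-1048576}"   # 1 GiB

if has_free_space "$STAGING_PARENT" "$REQUIRED_KB"; then
    echo "enough free space under $STAGING_PARENT"
else
    echo "not enough free space under $STAGING_PARENT"
fi
```

Run the check on each node that will stage data, because free space can differ between nodes.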

If the tables are not deleted or truncated before the sstableloader restore runs, the new files are streamed into the data directory while the old files are still present. The old files are not deleted until compaction occurs, which increases space consumption. To reduce space utilization, delete or truncate the tables that you are restoring before you run the sstableloader restore.
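As an illustration of the truncate step, the sketch below generates one CQL `TRUNCATE` statement per table so that the statements can be reviewed before they are run. The keyspace name `my_keyspace` and the table names `users` and `orders` are hypothetical, and the helper function is not part of the product.

```shell
#!/bin/sh
# Hypothetical names: replace with the keyspace and tables that you are restoring.
KEYSPACE="my_keyspace"
TABLES="users orders"

# Print one TRUNCATE statement per table: first argument is the keyspace,
# remaining arguments are table names.
build_truncate_cql() {
    ks="$1"
    shift
    for t in "$@"; do
        printf 'TRUNCATE %s.%s;\n' "$ks" "$t"
    done
}

# Print the statements for review only. After review, each statement can be
# run with cqlsh, for example: cqlsh -e "TRUNCATE my_keyspace.users;"
build_truncate_cql "$KEYSPACE" $TABLES
```

If `auto_snapshot` is enabled in `cassandra.yaml`, Cassandra takes a snapshot of a table when it is truncated, so the space is typically not released until that snapshot is cleared.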

Before You Begin

To disable incremental backups, on the command line, type the following command:
```
nodetool disablebackup
```
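The `nodetool` utility acts on the node that it connects to, so the setting usually has to be changed on every node in the cluster. The sketch below only prints the commands for a list of nodes so that you can review them. The node addresses are hypothetical, and running the commands from one machine assumes that remote JMX access is enabled. Otherwise, run `nodetool disablebackup` locally on each node.

```shell
#!/bin/sh
# Hypothetical node addresses: replace with the nodes of your cluster.
NODES="10.0.0.1 10.0.0.2 10.0.0.3"

# Print one nodetool command per node: first argument is the nodetool
# subcommand, remaining arguments are host names or addresses.
nodetool_cmds() {
    sub="$1"
    shift
    for node in "$@"; do
        printf 'nodetool -h %s %s\n' "$node" "$sub"
    done
}

# Commands that disable incremental backups, then commands that report the status.
nodetool_cmds disablebackup $NODES
nodetool_cmds statusbackup $NODES
```

The `statusbackup` subcommand reports whether incremental backups are currently running on a node, which is a quick way to confirm the change.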
You can optionally configure the software to:
- Use a specified replication factor and strategy. For more information, see Overwriting the Cassandra Restore Replication Factor.
- Set a specified OS user and group for the restored data. For more information, see Using a Different OS User Name on the Cassandra Destination Node and Setting the OS User Group for Cassandra Restored Files.
If you want to restore column families that use user-defined types and those types do not exist in the keyspace at restore time, you can have the software restore the types first and then the column families. Make sure the keyspace does not exist before you start the restore.

Procedure

From the CommCell Browser, expand Client Computers > client > Big Data Apps > instance.
Right-click the subclient, and then click Browse and Restore.
On the Browse and Restore Options dialog box, select the restore options.
- To restore the latest backup, select Latest Backup.
- To restore to a point-in-time, select Time Range, and then type the date and time in the Start and End boxes.
Click View Content.
On the Browse page, determine what data to restore.

The Browse page displays all the cluster nodes in the left pane.
1. Expand the nodes to view the keyspaces and the ColumnFamily entities that each keyspace contains.
2. To include data from all the nodes in the restore, perform the following operations:
  - For a keyspace, right-click the keyspace, and then click Select Keyspace from all nodes.
  - For a ColumnFamily, right-click the ColumnFamily, and then click Select ColumnFamily from all nodes.
  Click Recover All Selected.
  
  The Restore Options dialog box appears.
On the General tab, select the restore options:
1. To invoke the sstableloader to load the data into the database for a repair-free restore, select the Use SSTableLoader Tool check box.
  
  When you select this option, you must provide a path in the Staging Location box.
2. Optional: To recover the database without using a staging location (stage-free recovery), select the Run Stage Free Recover check box.
  
  A stage-free recovery process creates a 3DFS share on the MediaAgent and mounts the share on the client computers. The sstableloader is then run on the mounted directory. You do not need to select the Use SSTableLoader Tool or Staging Location options.
  
  Do not use the same MediaAgent for both 3DFS configuration and other configurations like IP library, because both of them use port 2049 by default.
  
  Stage-free recovery is performed using a Linux MediaAgent. If you used a Windows MediaAgent for the backup, click the Data Path tab, and then in the Use MediaAgent box, select a Linux MediaAgent for the recovery.
  
  If there is a firewall between the Cassandra client nodes and the MediaAgent, see Configuring Access to the MediaAgent for Cassandra Restores.
3. To restore the commit log, select the Restore Commit Log check box.
  
  Note
  
  If a large number of commit logs must be replayed, you can set the nCassandraStartWaitTime additional setting to make the software wait a specified number of seconds before it verifies that the Cassandra service has come back up.
4. Select the database recover option.
  - To have the software recover the database, select Recover.
  - To have the Commvault software restore the database files for the selected keyspaces/ColumnFamily to the staging path, and to prevent the software from recovering the database after the restore, select Do Not Recover.
5. In the Staging Location box, type the full path to the staging location.
  
  The Cassandra user must have access to this staging location.
6. In the Number of Streams box, type the number of streams the software uses for the restore operation.
Click OK.

Result

The software restores the data. If you chose Do Not Recover, you must manually run the sstableloader after the restore completes, to recover the database.
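For the Do Not Recover case, the manual step might look like the following sketch, which prints an `sstableloader` command for review instead of running it. The `-d` option takes a comma-separated list of initial hosts, and the tool expects a directory whose last two path components are the keyspace and the table. The staging path, host addresses, and names below are hypothetical, and the directory layout under the staging location is an assumption that you should confirm on disk before loading.

```shell
#!/bin/sh
# Hypothetical values: replace with your staging location and cluster nodes.
STAGING="/opt/cassandra_staging"
HOSTS="10.0.0.1,10.0.0.2"

# Print an sstableloader command.
# Arguments: initial hosts, staging path, keyspace, table.
build_loader_cmd() {
    printf 'sstableloader -d %s %s/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Assumed layout: <staging>/<keyspace>/<table>. Verify it before running the command.
build_loader_cmd "$HOSTS" "$STAGING" "my_keyspace" "users"
```

Run the printed command once for each restored table, as a user that can read the staging location.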

What to Do Next

If you disabled incremental backups, use the following command to enable them:

```
nodetool enablebackup
```