Hadoop Backups

You can back up, archive, and restore the data that resides on a Hadoop cluster.

Backup Types

The following backup types are available for Hadoop data:

  • Full
  • Incremental
  • Differential
  • Synthetic full


Notes:

    • The file access time is not preserved during backup operations.
    • Hadoop does not support UNIX ctime timestamps for files, so incremental backups do not include files for which only the attributes have changed.
    • We recommend that you run synthetic full backups with multiple streams, because single-stream synthetic full backups are slower. Also, if you run a synthetic full backup with a single stream, you cannot restore from it with multiple streams and nodes. For more information on multiple-stream synthetic full backups, see Configuring Multiple Streams for Synthetic Full Backups.
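
The following Python sketch illustrates why a file with only attribute changes is skipped when change detection relies on modification time alone. It is not the product's implementation: the FileInfo type, the select_changed helper, the paths, and the timestamps are all hypothetical, and it assumes the common definitions in which an incremental backup selects files changed since the most recent backup of any type and a differential backup selects files changed since the last full backup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical file record; not a product or Hadoop API type."""
    path: str
    mtime: float  # modification time, in seconds since the epoch
    mode: int     # permission bits; changing them does not update mtime


def select_changed(files, reference_time):
    """Return the paths whose modification time is later than reference_time."""
    return [f.path for f in files if f.mtime > reference_time]


# Hypothetical backup times, for illustration only.
last_full = 1000.0
last_incremental = 2000.0

files = [
    FileInfo("/data/a.csv", mtime=500.0, mode=0o644),   # unchanged since the full backup
    FileInfo("/data/b.csv", mtime=1500.0, mode=0o644),  # modified after the full backup
    FileInfo("/data/c.csv", mtime=2500.0, mode=0o644),  # modified after the last incremental
    FileInfo("/data/d.csv", mtime=500.0, mode=0o600),   # permissions changed, mtime unchanged
]

# Assumed definitions: incremental compares against the most recent backup,
# differential compares against the last full backup.
incremental = select_changed(files, last_incremental)
differential = select_changed(files, last_full)

print(incremental)   # ['/data/c.csv']
print(differential)  # ['/data/b.csv', '/data/c.csv']
```

In this sketch, /data/d.csv appears in neither list, because a permission change leaves the modification time untouched and no ctime value is available to reveal it.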

Single Node Scans

The scan phase runs only on the master node, because a multi-threaded scan is resource-intensive and may slow down the scan job.

Shared Job Results

All data access nodes and the master node that you include in backups must share the same Job Results directory. For information about changing the path of the Job Results directory, see Changing the Path of the Job Results Directory.

Note: All of the participating data access nodes and the master node must be at the same service pack level.
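
As a rough illustration of the two requirements above, the following Python sketch checks a hand-written inventory of nodes for a mismatched Job Results directory or service pack level. The node names, paths, service pack labels, and the find_mismatches helper are hypothetical; the sketch does not query the product and only shows the kind of consistency check you might run before a backup.

```python
def find_mismatches(nodes, keys=("job_results_dir", "service_pack")):
    """Return {key: set of distinct values} for every key on which the nodes disagree."""
    mismatches = {}
    for key in keys:
        values = {node[key] for node in nodes}
        if len(values) > 1:
            mismatches[key] = values
    return mismatches


# Hypothetical inventory, maintained by hand for illustration only.
nodes = [
    {"name": "master", "job_results_dir": "/shared/jobresults", "service_pack": "SP10"},
    {"name": "node1", "job_results_dir": "/shared/jobresults", "service_pack": "SP10"},
    {"name": "node2", "job_results_dir": "/shared/jobresults", "service_pack": "SP9"},
]

problems = find_mismatches(nodes)
for key, values in problems.items():
    print(f"Nodes disagree on {key}: {sorted(values)}")
```

An empty result from find_mismatches means that every listed node reports the same Job Results directory and the same service pack level.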

Last modified: 12/13/2017 6:15:21 AM