Backup Operations for IBM Spectrum Scale (GPFS)
You can back up and restore the data on a GPFS cluster.
The following backup types are available for GPFS data:
- Synthetic full
Note: We recommend that you run synthetic full backups with multiple streams, because single-stream synthetic full backups are slower. Also, if you run a synthetic full backup with a single stream, you cannot restore with multiple streams and nodes. For more information about multiple-stream synthetic full backups, see Configuring Multiple Streams for Synthetic Full Backups.
The GPFS command mmapplypolicy is used to collect the list of changed files that are eligible for backup. Before the mmapplypolicy command runs, the GPFS volumes are suspended and resumed to ensure that all pending I/O activity is flushed to disk. The suspend and resume operations can run only on nodes that belong to the cluster; GPFS does not support suspend and resume operations on a remote cluster. Therefore, the master node where the scan phase runs must be part of the GPFS cluster.
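The sequence above can be sketched as a small planning helper. The command names (mmfsctl, mmapplypolicy) are real GPFS administration commands, but the exact options shown and the cluster-membership check are illustrative assumptions, not the product's actual implementation:

```python
def plan_scan_phase(device, policy_file, master_node, cluster_nodes):
    """Return the ordered commands for the scan phase, after verifying
    that the master node belongs to the GPFS cluster (suspend and resume
    cannot be run against a remote cluster)."""
    if master_node not in cluster_nodes:
        raise ValueError(
            f"master node {master_node!r} is not a member of the cluster; "
            "suspend/resume must run on a cluster node")
    return [
        # Flush pending I/O and suspend writes (option name assumed here).
        f"mmfsctl {device} suspend-write",
        # Collect the list of changed files eligible for backup.
        f"mmapplypolicy {device} -P {policy_file}",
        # Resume normal I/O on the file system.
        f"mmfsctl {device} resume",
    ]
```

The point of the sketch is the ordering: the flush-and-suspend step must complete before mmapplypolicy scans, and the file system is resumed afterward.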
GPFS snapshots preserve the file system at a point in time. With GPFS snapshots, you can perform scan and backup operations at the same time that users update files, and still obtain snapshots that are point-in-time consistent.
The scan phase creates GPFS snapshots of the subclient content and of all the independent filesets that are under the subclient content. Then, a GPFS scan that is based on the mmapplypolicy command runs on the snapshot paths. For example, if /gpfs1 is the subclient content and /gpfs1/dir1/fileset1 is an independent fileset under /gpfs1, then the scan creates one snapshot for /gpfs1 and one snapshot for /gpfs1/dir1/fileset1.
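The snapshot enumeration described above can be illustrated with a small hypothetical helper that mirrors the /gpfs1 example: one snapshot target for the subclient content, plus one for each independent fileset under it. The function name and interface are assumptions for illustration:

```python
def snapshot_targets(subclient_content, independent_filesets):
    """Return the paths that need their own GPFS snapshot: the subclient
    content itself, plus every independent fileset under it (an
    independent fileset is snapshotted separately from its parent)."""
    targets = [subclient_content]
    for fileset in independent_filesets:
        # Include only filesets that lie under the subclient content.
        if fileset.startswith(subclient_content.rstrip("/") + "/"):
            targets.append(fileset)
    return targets
```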
The path where the snapshots are created is visible on all the GPFS cluster nodes. The scan phase runs only on the master node.
The backup phase reads files from the snapshot. At the end of the backup phase, the master node deletes all the snapshots that were created.
Shared Job Results
All data access nodes and the master node that you include in backups must share the same Job Results directory. For information about changing the path of the Job Results directory, see Changing the Path of the Job Results Directory.
Note: All of the participating data access nodes and the master node must be at the same service pack level.
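A pre-backup validation of this requirement might look like the following sketch. The check itself and the level strings are assumptions; the product performs its own version compatibility checks:

```python
def check_service_pack_levels(node_levels):
    """Verify that the master node and all participating data access
    nodes report the same service pack level; node_levels maps a node
    name to its service pack level string."""
    levels = set(node_levels.values())
    if len(levels) > 1:
        detail = ", ".join(sorted(f"{n}={v}" for n, v in node_levels.items()))
        raise RuntimeError("service pack level mismatch: " + detail)
    return levels.pop()
```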
Last modified: 7/8/2019 5:11:09 AM