Streams are logical data channels that connect the client data to the storage media.
Commvault lets you define multiple streams so that backups and restores run in parallel, which improves the rate at which data can be written to or retrieved from the storage media.
During backups, the streams are created as follows:
- Data Reader
Streams originate from the source file or application and are created by the read operations (data readers) that read the source data.
- Data Stream
Once the data is read from the client, it is processed by the iDataAgent, and then sent to the MediaAgent as Data Streams.
- Device Stream
The MediaAgent then processes the data, divides the data into chunks, and writes the data to the storage through Device Streams.
The following diagram illustrates the stream movement process from client computer to storage media:
The number of data readers determines the number of parallel read operations while the data is backed up. If necessary, you can configure multiple data readers per subclient.
However, the number of readers that can perform parallel read operations depends on the number of physical disks that are available: one reader per physical disk. For example, if a physical disk has two partitions, setting the number of readers to two has no effect. On the other hand, multiple data readers on SAN or RAID arrays speed up backup operations.
Additionally, you can configure multiple data readers for a disk array that contains several physical drives that are logically addressed as a single drive. This allows you to take advantage of the fast read access from the array.
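The one-reader-per-physical-disk rule can be sketched as a simple cap. This is only an illustration of the arithmetic described above, not Commvault code; the function name and parameters are hypothetical, and `physical_disks` is assumed to count the underlying drives of an array even when they are addressed as a single logical drive.

```python
def effective_data_readers(configured_readers: int, physical_disks: int) -> int:
    """Return how many configured readers can actually read in parallel.

    Assumption: at most one useful reader per physical disk.
    """
    return min(configured_readers, physical_disks)


# One physical disk with two partitions: a second reader adds nothing.
print(effective_data_readers(configured_readers=2, physical_disks=1))  # 1

# A RAID array backed by six physical drives: all four readers are useful.
print(effective_data_readers(configured_readers=4, physical_disks=6))  # 4
```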
Data Streams for backup jobs are the network streams that run from the client to the MediaAgent.
For File System, data streams are generated by the data readers in the subclient.
For database applications such as Oracle and SQL, you can use the streaming capability of the application to set up streams for the subclient.
Job Streams for data are network streams that run from the client to the MediaAgent. The number of concurrent job streams that can run in an environment is based on the number of streams configured on the MediaAgent.
When the data streams are received by the MediaAgent, the data is divided into chunks that are written to media using Device Streams.
Note: The maximum number of concurrent streams that can run in an environment is based on the number of streams that the MediaAgents are configured to accept, the number of streams that a library accepts, and the number of storage policy device streams that are configured. The Commvault software always uses the lowest stream value throughout the data movement process.
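The "lowest stream value" rule in the note amounts to taking a minimum across the three limits. The following is a minimal sketch of that rule under that reading; the function and parameter names are hypothetical and do not correspond to any Commvault API.

```python
def max_concurrent_streams(mediaagent_streams: int,
                           library_streams: int,
                           policy_device_streams: int) -> int:
    """Return the limit that actually applies: the lowest of the three values."""
    return min(mediaagent_streams, library_streams, policy_device_streams)


# The MediaAgent accepts 100 streams and the storage policy allows 20,
# but the library accepts only 10, so 10 streams run concurrently.
print(max_concurrent_streams(100, 10, 20))  # 10
```

In this example, raising the MediaAgent or storage policy value would change nothing; only the library limit is the bottleneck.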
The following concepts are important to understand before you configure device streams.
- MediaAgent Level
You can set the maximum number of concurrent read/write operations on the MediaAgent by using the Maximum number of parallel data transfer operations setting. This value controls the maximum number of streams that the MediaAgent can manage.
For more information, see Setting the Maximum Number of Parallel Data Transfer Operations.
- Storage Policy Level
Storage Policy data streams are logical channels that connect client data to the media where data that is secured by backup operations is stored. For a storage policy, the number of device streams that is configured must be equal to the number of drives or writers of all libraries that are defined in the storage policy copy. No benefit is gained if the number of device streams is greater than the total number of resources that are available.
- Disk Library Level
For a disk library, the number of device streams is based on the total number of mount path writers for all mount paths in the library. For example, if you have a disk library with two mount paths that have five writers each, a total of ten device streams can be written to the library. When you increase the number of mount path writers, more job streams can be written to device streams.
- Tape Library Level
For tape libraries, one sequential write operation can be performed to each drive. If there are eight drives in the library, then no more than eight device streams are used. By default, each data stream writes to one device stream. To allow multiple data streams to be written to a single tape drive, you can enable multiplexing. The multiplexing factor determines how many data streams can be written to a single device stream. If the multiplexing factor is set to four, and there are eight drives, a total of thirty-two data streams can be written to eight device streams.
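The disk library and tape library examples above reduce to two small calculations. The sketch below reproduces them under the assumptions stated in the text (device streams equal the sum of mount path writers; data streams equal drives multiplied by the multiplexing factor). The function names are hypothetical and are not part of any Commvault interface.

```python
from typing import Sequence


def disk_library_device_streams(writers_per_mount_path: Sequence[int]) -> int:
    """Total device streams for a disk library: sum of writers over all mount paths."""
    return sum(writers_per_mount_path)


def tape_library_data_streams(drives: int, multiplexing_factor: int = 1) -> int:
    """Data streams a tape library can accept.

    Each drive provides one device stream; the multiplexing factor is the
    number of data streams written to each device stream (1 = no multiplexing).
    """
    return drives * multiplexing_factor


# Two mount paths with five writers each.
print(disk_library_device_streams([5, 5]))  # 10

# Eight drives without multiplexing, then with a multiplexing factor of four.
print(tape_library_data_streams(8))     # 8
print(tape_library_data_streams(8, 4))  # 32
```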
Streams for Deduplication Database
You can set the number of streams that backups and auxiliary copy operations use to access the deduplication database (DDB) by using the Maximum number of parallel data transfer operations for deduplication database parameter in the Media Management Configuration dialog box. For setups in Service Pack 2 or later, the number of streams to the DDB is set to 200 by default. However, if you upgraded from the previous software version, the default number of streams to the DDB is set to 50, which you can modify. For instructions on how to modify the value, see Media Management Configuration Parameters.
Using Multiple Streams to Restore Data
By default, a restore operation uses a single stream. For faster restore operations, you can configure the restore operation to use multiple streams, and you can define alternate data paths. For more information, see GridStor® (Alternate Data Paths) - Overview.
Restoring data using multiple streams is supported by the following agents:
Increasing the Number of Tunnels per Network Route
For multi-streaming via tunnels, see Increasing the Number of Tunnels per Network Route.
Last modified: 3/1/2018 8:56:17 PM