Data Multiplexing - Overview
In a typical storage policy configuration, many clients/subclients can point to the same storage policy. Each storage policy copy has one or more streams, related to the number of drives in a drive pool. Only one data protection operation can use a stream at any one time, so only one subclient can send data to a given media/drive at any one time.
This limitation has its disadvantages. Backing up one client/subclient to a single piece of media does not fully utilize the drive's throughput, as the backing up of client data can be much slower than actual speeds of the tape.
In a large enterprise with many clients, many data protection operations may need to be performed within a fixed backup window. This may lead to high hardware costs if the drives or media used for those data protection operations are underutilized.
To optimally use the high speed tape drives available today, data from several clients/subclients can be multiplexed and written to media.
Chunk Size of Data That is Multiplexed
Multiplexed data chunk sizes are determined by the type of data that is being multiplexed: file system data or database data.
- If the first backup is a file system type backup, all other backups joining multiplexing will have a chunk size of 4 GB.
- If the first backup is a database type backup, all other backups joining multiplexing will have a chunk size of 16 GB.
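The rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not product code: the function name, the type labels "file_system" and "database", and the lookup table are hypothetical names introduced for the example.

```python
# Chunk sizes stated in this section, in gigabytes.
CHUNK_SIZE_GB = {"file_system": 4, "database": 16}


def multiplexed_chunk_size_gb(backup_types):
    """Return the chunk size shared by all backups in one multiplexed group.

    backup_types lists the backup types in the order they joined;
    the first entry decides the chunk size for every backup that follows.
    """
    if not backup_types:
        raise ValueError("at least one backup is required")
    return CHUNK_SIZE_GB[backup_types[0]]


print(multiplexed_chunk_size_gb(["file_system", "database"]))  # 4
print(multiplexed_chunk_size_gb(["database", "file_system"]))  # 16
```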
Multiplexed data is aged when all jobs (multiplexed) on a single chunk have met the defined retention rules of their associated storage policy copy. For more information, see Data Aging.
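As a minimal sketch of the aging rule, assume each job on a chunk carries a simple yes/no flag for whether it has met its retention rules. The function and the dictionary layout are hypothetical and only illustrate the "all jobs must qualify" condition.

```python
def chunk_can_be_aged(jobs_on_chunk, retention_met):
    """Return True only when every job on the chunk has met retention.

    jobs_on_chunk: job identifiers multiplexed onto one chunk.
    retention_met: dict mapping job identifier to True/False.
    """
    return all(retention_met[job] for job in jobs_on_chunk)


retention_met = {"J1": True, "J2": False, "J3": True}
print(chunk_can_be_aged(["J1", "J3"], retention_met))        # True
print(chunk_can_be_aged(["J1", "J2", "J3"], retention_met))  # False
```

In this model a single job that is still within retention keeps the whole chunk on the media.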
- Data Multiplexing is not supported if the storage policy copy is enabled with Deduplication. However, a Silo copy supports Data Multiplexing even if the storage policy copy is enabled with Deduplication.
- Multiplexed data can be copied to a deduplicated storage policy copy. See Deduplicating Multiplexed Data for instructions.
- An Auxiliary Copy can be configured with Data Multiplexing when the source copy is enabled for Deduplication.
How Data Multiplexing Works
During a data protection operation, agent data is transferred to media over a data pipeline. This data is transferred by data movers that read agent data then write the data to the media.
During data multiplexing, many such data movers must read and write data to the same piece of media. To achieve this, these data movers are composed of two components: data receivers and data writers. During data multiplexing, one data receiver per backup stream reads the data coming through the data pipeline. One data writer per media receives data from multiple data receivers and then writes the data to the media.
In the sample image that follows, Subclient_A and Subclient_B are being backed up at the same time and their data is being multiplexed. Multiple data receivers read the data and then one data writer writes the data to a single piece of media.
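The receiver/writer arrangement can be modeled with a short Python sketch. The function names, the block labels, and the round-robin interleaving order are assumptions made for illustration; the text does not describe how the writer actually orders data from the receivers.

```python
from itertools import zip_longest


def data_receiver(subclient, blocks):
    """One receiver per backup stream: tags each block with its source."""
    for block in blocks:
        yield (subclient, block)


def data_writer(receivers):
    """One writer per media: interleaves blocks from all receivers.

    Round-robin order is an assumption chosen for this sketch.
    """
    media = []
    for group in zip_longest(*receivers):
        # zip_longest pads exhausted receivers with None; skip the padding.
        media.extend(item for item in group if item is not None)
    return media


media = data_writer([
    data_receiver("Subclient_A", ["a1", "a2", "a3"]),
    data_receiver("Subclient_B", ["b1", "b2"]),
])
for source, block in media:
    print(source, block)
```

Tagging each block with its source is what lets a later restore pick out one subclient's data from the interleaved media.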
Determining the Multiplexing Factor
The multiplexing factor should be determined by analyzing your network configuration and by examining your needs for maximizing disk throughput to decrease the total amount of time it takes to protect your data. The multiplexing factor is determined by the following:
- Network card speed
- Network switch speed
- Drive speed
The following examples will help you determine the multiplexing factor. Keep in mind that these are only hypothetical examples.
- Consider a network configuration that involves three clients, without and with multiplexing.
- Consider what happens when a fourth client is added to the example and the multiplexing factor is set to four.
- A fifth client is added and the multiplexing factor is set to five. Is this over-multiplexing?
- If you have over-multiplexed, either set the multiplexing factor lower and multiplex fewer clients, or add some gigabit Ethernet switches to your network.
- In another example, client disk speeds are fast and they become slower after multiplexing.
Note that the maximum multiplexing factor that can be set from the CommCell Console is 10 and the system displays a warning message when the multiplexing factor is set to 5 or above.
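The limits in the note above can be expressed as a small check. This is a hypothetical helper, not a CommCell Console API; the lower bound of 1 is an assumption, since the text states only the maximum and the warning threshold.

```python
MAX_MULTIPLEXING_FACTOR = 10  # highest value accepted, per the note above
WARNING_THRESHOLD = 5         # a warning is shown at this value or above


def check_multiplexing_factor(factor):
    """Return True if the factor would trigger a warning, else False.

    Raises ValueError for values outside 1..10. The lower bound of 1
    is an assumption made for this sketch.
    """
    if not 1 <= factor <= MAX_MULTIPLEXING_FACTOR:
        raise ValueError(
            f"multiplexing factor must be between 1 and {MAX_MULTIPLEXING_FACTOR}"
        )
    return factor >= WARNING_THRESHOLD


print(check_multiplexing_factor(3))  # False
print(check_multiplexing_factor(5))  # True
```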
Perform a Multiplexed Data Protection Operation
Once the multiplexing factor is set on the primary copy of the storage policy whose subclients are to be backed up, all data protection operations of the storage policy can run at the same time, to the same piece of media.
Perform Data Multiplexing Using a Disk Library
Data Multiplexing can be performed on a disk library by setting the maximum number of streams on the disk storage policy to a value equal to the number of data protection operations that are to be performed simultaneously. For more information on setting the number of data streams, see Storage Policy Copy Properties.
De-Multiplexing Multiplexed Data
De-multiplexing separates the data for selected clients/subclients from the larger set of multiplexed clients. The software does not require de-multiplexing; however, if you want to de-multiplex the data that you have multiplexed, you can create a deduplication-enabled storage policy copy for the clients/subclients, and then perform an Auxiliary Copy operation on that copy.
Be sure to adhere to Best Practices when using the data multiplexing feature.
Multiplexing and Data Streams
Data Multiplexing is performed differently based on whether or not you are performing multiple stream data protection operations.
Data Multiplexing With Single Stream Data Protection Operations
In the following example, J1, J2, J3, and J4 have been run as single stream data protection operations. There are two drives available, D1 and D2.
If there is no data multiplexing:
J1 will use D1, and J2 will use D2. J3 and J4 will go into a waiting state until J1 and J2 have completed.
If data multiplexing is used with a multiplexing factor of two:
J1 and J2 will use D1. J3 and J4 will use D2.
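The single stream example can be reproduced with a small placement model. The function below is a hypothetical sketch: it fills each drive up to the multiplexing factor before moving to the next drive, an ordering assumed here because it matches the outcome described in the example.

```python
def assign_jobs(jobs, drives, factor=1):
    """Place single stream jobs on drives, at most `factor` jobs per drive.

    Returns (running, waiting): running maps each drive to the jobs
    writing to it, and waiting lists the jobs that found no free slot.
    """
    running = {drive: [] for drive in drives}
    waiting = []
    for job in jobs:
        for drive in drives:
            if len(running[drive]) < factor:
                running[drive].append(job)
                break
        else:
            # No drive had a free slot, so the job waits.
            waiting.append(job)
    return running, waiting


jobs = ["J1", "J2", "J3", "J4"]
drives = ["D1", "D2"]
print(assign_jobs(jobs, drives, factor=1))
# ({'D1': ['J1'], 'D2': ['J2']}, ['J3', 'J4'])
print(assign_jobs(jobs, drives, factor=2))
# ({'D1': ['J1', 'J2'], 'D2': ['J3', 'J4']}, [])
```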
Data Multiplexing With Multiple Stream Data Protection Operations
The following examples illustrate data multiplexing with data protection operations that use multiple streams.
Data Multiplexing with File System Multiple Stream Data Protection Operations
In the following example, there are two jobs, J1 and J2. Each job was run with three streams. There are two drives, D1 and D2.
If there is no data multiplexing:
- J1 has three streams, and each stream uses D1, but they run one after another.
- J2 also has three streams, and each stream uses D2, and they also run one after another.
If there is data multiplexing with a multiplexing factor of three:
- The three streams of J1 can run concurrently to D1.
- The three streams of J2 can run concurrently to D2.
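One way to picture the difference is to count how many sequential rounds a job's streams need on its drive. The helper below is a hypothetical sketch that groups streams into rounds of at most the multiplexing factor; the stream labels are invented for the example.

```python
def schedule_streams(streams, factor=1):
    """Group a job's streams into sequential rounds on one drive.

    Each round holds at most `factor` streams that run concurrently.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [streams[i:i + factor] for i in range(0, len(streams), factor)]


j1_streams = ["J1-S1", "J1-S2", "J1-S3"]
print(schedule_streams(j1_streams, factor=1))
# [['J1-S1'], ['J1-S2'], ['J1-S3']]  (three rounds, one after another)
print(schedule_streams(j1_streams, factor=3))
# [['J1-S1', 'J1-S2', 'J1-S3']]      (one round, all concurrent)
```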
Data Multiplexing with Database Multi Streaming
In the following example, a three stream database data protection operation is performed with a multiplexing factor of three. J1, J2, and J3 are database data protection operations, and each used three streams. There are three drives available, D1, D2, and D3.
If there is no data multiplexing:
D1 - J1
D2 - J1
D3 - J1
The second and third jobs (J2 and J3) must wait for the necessary resources.
If there is data multiplexing with a multiplexing factor of three:
The first job (J1) uses three drives, D1, D2, and D3:
D1 - J1
D2 - J1
D3 - J1
The second and third jobs (J2 and J3) are multiplexed and use the same drives as J1:
D1 - J1, J2, J3
D2 - J1, J2, J3
D3 - J1, J2, J3
Therefore, J1, J2, and J3 use D1, D2, and D3 in parallel.
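The database example can be modeled in the same spirit. The sketch below assumes that each job has as many streams as there are drives and needs one slot on every drive at the same time; the function name and that all-or-nothing rule are assumptions made to match the example, not documented behavior.

```python
def assign_db_jobs(jobs, drives, factor=1):
    """Place multi-stream database jobs, one stream per drive.

    A job starts only if every drive still has a free slot (fewer than
    `factor` jobs); otherwise it waits for resources.
    """
    load = {drive: [] for drive in drives}
    waiting = []
    for job in jobs:
        if all(len(load[drive]) < factor for drive in drives):
            for drive in drives:
                load[drive].append(job)
        else:
            waiting.append(job)
    return load, waiting


jobs = ["J1", "J2", "J3"]
drives = ["D1", "D2", "D3"]
print(assign_db_jobs(jobs, drives, factor=1))
# ({'D1': ['J1'], 'D2': ['J1'], 'D3': ['J1']}, ['J2', 'J3'])
print(assign_db_jobs(jobs, drives, factor=3))
# ({'D1': ['J1', 'J2', 'J3'], 'D2': ['J1', 'J2', 'J3'], 'D3': ['J1', 'J2', 'J3']}, [])
```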
Impact of Data Multiplexing on Data Recovery Operations
The following data recovery operations can be performed on multiplexed data without significant degradation of performance:
- Data recovery operations using CommCell Console
- Data recovery operations using Media Explorer
Note: If you have not enabled data multiplexing but are experiencing slower restores from secondary copies, configure the Disable the Chunk Concatenation parameter.
It is recommended that you keep the following in mind when performing data multiplexing:
- Using Multiplexing Options to Improve the Data Movement Performance
Multiplexing does not improve the performance of an individual backup operation. However, running multiple backups in parallel to a single tape drive results in better utilization of the tape drive, especially when the backups are from slower clients. This improves overall throughput and shortens the backup window.
For LAN backups, make sure that the network between the clients and MediaAgent is capable of supporting multiple simultaneous backups.
- Typical multiplexing factors are set between 2 and 5 on 100BaseT networks.
- Typical multiplexing factors are set between 5 and 8 on 1000BaseT networks.
You can determine the multiplexing factor by analyzing the network configuration and the required disk throughput. Do not over-multiplex; doing so is counterproductive and slows down backups as well as restores. Set the multiplexing factor to approximately the ratio of tape drive throughput to client source speed. For example, if the tape drive has a rated speed of 40 Mb/sec and clients are able to supply data at about 12 Mb/sec, the ratio is about 3.3, so a multiplexing factor of 3 is advisable. A typical multiplexing factor is between 2 and 5.
Note: Restores from multiplexed data on tape libraries will be slower.
- Use different storage policies for file system and database type data before performing data multiplexing. That way, chunk sizes will not differ between the types of data.
- If possible, use the Restore by Jobs option to restore multiplexed data, especially when restoring a large amount of data. This will provide the optimum performance during the restore operation as there are fewer tape rewinds to secure the data.
- It is recommended that you perform data multiplexing for jobs that have similar speeds (e.g., two database jobs), instead of mixing faster jobs (e.g., file systems) with slower jobs (e.g., databases). Mixing faster and slower jobs results in data stored on media that is not uniform. Hence, data recovery operations of slower clients will incur an added performance penalty.
- Multiplexing is recommended if you are planning to recover:
  - Individual items, files, and folders.
  - Entire computers or databases.
- It is not recommended under the following conditions:
  - If you are planning to recover scattered folders, as multiplexing will further scatter the data. It also adds extra tape mounts and rewinding/forwarding on the media.
  - For clients that undergo very frequent restore requests.
- The multiplexing factor is determined based on the ratio of how fast the tape drive is compared to the disk. For example, consider the following speeds:
  - Tape write speed = 80 GB per hour
  - Disk read speed (backup) = 25 GB per hour
  - Tape read speed = 80 GB per hour
  - Disk write speed (restore) = 60 GB per hour
Tape write speed/disk read speed (backup) = 80/25 = 3.2
Tape read speed/disk write speed (restore) = 80/60 = 1.33
These ratios are unitless because the GB per hour units cancel. It is recommended that you use the lower of the two ratios as the multiplexing factor if you want no-penalty data recovery operations.
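The ratio arithmetic used in these recommendations can be sketched as follows. Both helpers are hypothetical. Rounding the ratio down to a whole number and clamping it to the range 1 to 10 are assumptions made for this sketch: rounding down is consistent with the earlier 40/12 example that advises a factor of 3, and 10 is the console maximum mentioned earlier in this section.

```python
import math


def throughput_ratio(tape_speed, disk_speed):
    """Ratio of tape drive speed to disk speed (same units for both)."""
    if tape_speed <= 0 or disk_speed <= 0:
        raise ValueError("speeds must be positive")
    return tape_speed / disk_speed


def suggested_factor(ratio, maximum=10):
    """Round a ratio down to a whole factor, clamped to 1..maximum.

    Rounding down is an assumption intended to avoid over-multiplexing.
    """
    return max(1, min(maximum, math.floor(ratio)))


# Earlier example: 40 Mb/sec tape drive, clients supplying about 12 Mb/sec.
print(suggested_factor(throughput_ratio(40, 12)))  # 3

# Backup and restore ratios from the list above (GB per hour).
backup_ratio = throughput_ratio(80, 25)    # 3.2
restore_ratio = throughput_ratio(80, 60)   # about 1.33
lower = min(backup_ratio, restore_ratio)
print(round(lower, 2))          # 1.33
print(suggested_factor(lower))  # 1
```

Under these assumptions, the no-penalty recommendation for the sample speeds works out to a factor of 1, which means the restore side, not the backup side, is the limiting ratio.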
Last modified: 3/1/2018 8:56:12 PM