Data Multiplexing - Overview

Introduction

In a typical storage policy configuration, many clients/subclients can point to the same storage policy. Each storage policy copy has one or more streams, related to the number of drives in a drive pool. Only one data protection operation can use a given stream at any one time. Therefore, only one data protection operation can be sent to a media/drive at any one time.

This limitation has its disadvantages. Backing up one client/subclient to a single piece of media does not fully utilize the drive's throughput, because client data is often supplied much more slowly than the tape drive can write it.

In a large enterprise with many clients, many data protection operations may need to be performed within a fixed backup window. This may lead to high hardware costs if the drives or media used for those data protection operations are underutilized.

To optimally use the high speed tape drives available today, data from several clients/subclients can be multiplexed and written to media.

How Data Multiplexing Works

During a data protection operation, agent data is transferred to media over a data pipeline. This data is transferred by data movers that read agent data then write the data to the media.

During data multiplexing, many such data movers must read and write data to the same piece of media. To achieve this, each data mover consists of two components: data receivers and data writers. One data receiver per backup stream reads the data coming through the data pipeline. One data writer per media receives data from multiple data receivers and then writes the data to the media.

In the sample image that follows, Subclient_A and Subclient_B are being backed up at the same time and their data is being multiplexed. Multiple data receivers read the data and then one data writer writes the data to a single piece of media.

[Figure: data multiplexing]
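The receiver/writer arrangement described above can be sketched in a few lines of Python. This is only an illustration, not the product's implementation: the function names `multiplex` and `demultiplex` are hypothetical, and the round-robin order in which the writer takes blocks from the receivers is an assumption (the text does not say how blocks are interleaved).

```python
from collections import deque


def multiplex(streams):
    """Interleave blocks from several backup streams onto one piece of media.

    `streams` maps a stream name to its list of data blocks. Each entry plays
    the role of one data receiver; the returned list plays the role of the
    media filled by the single data writer.
    Round-robin ordering is an assumption made for this sketch.
    """
    receivers = deque((name, deque(blocks)) for name, blocks in streams.items())
    media = []
    while receivers:
        name, blocks = receivers.popleft()
        if blocks:
            # Tag each block with its owner so it can be separated later.
            media.append((name, blocks.popleft()))
        if blocks:
            # This receiver still has data; put it back in the rotation.
            receivers.append((name, blocks))
    return media


def demultiplex(media, name):
    """Return only the blocks that belong to one stream, in original order."""
    return [block for owner, block in media if owner == name]


media = multiplex({
    "Subclient_A": ["a1", "a2", "a3"],
    "Subclient_B": ["b1", "b2"],
})
```

Here `media` holds the blocks of Subclient_A and Subclient_B interleaved on one piece of media, and `demultiplex(media, "Subclient_B")` recovers the blocks of a single subclient.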

Determining the Multiplexing Factor

The multiplexing factor should be determined by analyzing your network configuration and by examining your needs for maximizing disk throughput to decrease the total amount of time it takes to protect your data. The multiplexing factor is determined by the following:

  • Network card speed

  • Network switch speed

  • Drive speed

The following examples will help you determine the multiplexing factor. Keep in mind that these are only hypothetical examples.

  1. Analyze a network configuration that involves three clients, without and with multiplexing.

  2. See what happens when a fourth client is added to the example and the multiplexing factor is set to four.

  3. A fifth client is added, and the multiplexing factor is set to five. Is this over-multiplexing?

  4. If you have over-multiplexed, either set the multiplexing factor lower and multiplex fewer clients, or add some gigabit Ethernet switches to your network.

  5. In another example, client disk speeds are fast, and they become slower after multiplexing.

Note that the maximum multiplexing factor that can be set from the CommCell Console is 10 and the system displays a warning message when the multiplexing factor is set to 5 or above.
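The limits in the note above can be expressed as a small check. This is a hypothetical helper written for illustration, not a CommCell Console API. The upper limit of 10 and the warning threshold of 5 come from the text; the lower bound of 1 is an assumption.

```python
MAX_MULTIPLEXING_FACTOR = 10  # maximum stated in the text
WARNING_THRESHOLD = 5         # a warning is shown at 5 or above, per the text


def check_multiplexing_factor(factor):
    """Validate a multiplexing factor and report whether it triggers a warning.

    Returns True when the factor would produce a warning.
    The lower bound of 1 is an assumption of this sketch.
    """
    if not 1 <= factor <= MAX_MULTIPLEXING_FACTOR:
        raise ValueError(
            f"multiplexing factor must be between 1 and {MAX_MULTIPLEXING_FACTOR}"
        )
    return factor >= WARNING_THRESHOLD
```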

Perform a Multiplexed Data Protection Operation

Once the multiplexing factor is set on the primary copy of the storage policy whose subclients are to be backed up, all data protection operations of the storage policy can run at the same time, to the same piece of media.

Perform Data Multiplexing Using a Disk Library

Data Multiplexing can be performed on a disk library by setting the maximum number of streams on the disk storage policy to a value equal to the number of data protection operations that are to be performed simultaneously. For more information on setting the number of data streams, see Storage Policy Copy Properties.

De-Multiplexing Multiplexed Data

De-multiplexing segregates/de-multiplexes the data for selected clients/subclients from the larger list of clients. The software does not require de-multiplexing; however, if you want to de-multiplex the data that you have multiplexed, you can create a deduplication enabled storage policy copy for the clients/subclients, and then perform an Auxiliary Copy operation on that copy.

Be sure to adhere to Best Practices when using the data multiplexing feature.

Multiplexing and Data Streams

Data Multiplexing is performed differently based on whether or not you are performing multiple stream data protection operations.

Data Multiplexing With Single Stream Data Protection Operations

In the following example, J~1~, J~2~, J~3~, and J~4~ have been run as single stream data protection operations. There are two drives available, D~1~ and D~2~.

If there is no data multiplexing:

J~1~ will use D~1~, and J~2~ will use D~2~. J~3~ and J~4~ will go into a waiting state until J~1~ and J~2~ have completed.

If data multiplexing is used with a multiplexing factor of two:

J~1~ and J~2~ will use D~1~. J~3~ and J~4~ will use D~2~.
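The single stream example can be reproduced with a short simulation. This is a sketch under stated assumptions: the function `assign_single_stream_jobs` is hypothetical, a factor of 1 is used to represent "no data multiplexing", and each drive is filled up to the multiplexing factor before the next drive is used, which matches the placement shown in the example.

```python
def assign_single_stream_jobs(jobs, drives, multiplexing_factor=1):
    """Place single stream jobs on drives.

    Returns (assignment, waiting), where `assignment` maps each drive to the
    jobs writing to it and `waiting` lists jobs that found no free slot.
    """
    assignment = {drive: [] for drive in drives}
    waiting = []
    for job in jobs:
        for drive in drives:
            # A drive accepts jobs until it reaches the multiplexing factor.
            if len(assignment[drive]) < multiplexing_factor:
                assignment[drive].append(job)
                break
        else:
            # No drive had a free slot, so the job must wait.
            waiting.append(job)
    return assignment, waiting


jobs = ["J1", "J2", "J3", "J4"]
drives = ["D1", "D2"]
without_multiplexing = assign_single_stream_jobs(jobs, drives)
with_multiplexing = assign_single_stream_jobs(jobs, drives, multiplexing_factor=2)
```

With a factor of 1, J1 and J2 take D1 and D2 while J3 and J4 wait. With a factor of 2, J1 and J2 share D1, J3 and J4 share D2, and nothing waits.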

Data Multiplexing With Multiple Stream Data Protection Operations

The following examples illustrate data multiplexing with data protection operations that use multiple streams.

Data Multiplexing with File System Multiple Stream Data Protection Operations

In the following example, there are two jobs, J~1~ and J~2~. Each job was run with three streams. There are two drives, D~1~ and D~2~.

If there is no data multiplexing:

  • J~1~ has three streams, and each stream uses D~1~, but they run one after another.

  • J~2~ also has three streams, and each stream uses D~2~, and they also run one after another.

If there is data multiplexing with a multiplexing factor of three:

  • The three streams of J~1~ can run concurrently to D~1~.

  • The three streams of J~2~ can run concurrently to D~2~.

Data Multiplexing with Database Multi Streaming

In the following example, a three stream database data protection operation is performed with a multiplexing factor of three. J~1~, J~2~, and J~3~ are database data protection operations, and each used three streams. There are three drives available, D~1~, D~2~, and D~3~.

If there is no data multiplexing:

D~1~ - J~1~

D~2~ - J~1~

D~3~ - J~1~

The second and third jobs (J~2~ and J~3~) must wait for the necessary resources.

If there is data multiplexing with a multiplexing factor of three:

The first job (J~1~) uses three drives, D~1~, D~2~, and D~3~:

D~1~ - J~1~

D~2~ - J~1~

D~3~ - J~1~

The second and third jobs (J~2~ and J~3~) are multiplexed and use the same drives as J~1~:

D~1~ - J~1~, J~2~, J~3~

D~2~ - J~1~, J~2~, J~3~

D~3~ - J~1~, J~2~, J~3~

Therefore, J~1~, J~2~, and J~3~ use D~1~, D~2~, and D~3~ in parallel.
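The database example can be sketched the same way. The function `assign_database_jobs` is hypothetical, and it assumes that each stream of a database job needs its own drive and that a job starts only when all of its streams can be placed, which is one reading of the example above rather than documented behavior.

```python
def assign_database_jobs(jobs, streams_per_job, drives, multiplexing_factor=1):
    """Place multi-stream database jobs on drives.

    Returns (load, waiting), where `load` maps each drive to the jobs that
    have a stream on it and `waiting` lists jobs that could not start.
    """
    load = {drive: [] for drive in drives}
    waiting = []
    for job in jobs:
        # Drives that can still accept another stream.
        free = [d for d in drives if len(load[d]) < multiplexing_factor]
        if len(free) >= streams_per_job:
            # One stream of the job goes to each selected drive.
            for drive in free[:streams_per_job]:
                load[drive].append(job)
        else:
            waiting.append(job)
    return load, waiting


jobs = ["J1", "J2", "J3"]
drives = ["D1", "D2", "D3"]
no_mux = assign_database_jobs(jobs, 3, drives)
mux_three = assign_database_jobs(jobs, 3, drives, multiplexing_factor=3)
```

Without multiplexing, J1 occupies all three drives and J2 and J3 wait. With a factor of three, every drive carries one stream from each of J1, J2, and J3.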

Impact of Data Multiplexing on Data Recovery Operations

The following data recovery operations can be performed on multiplexed data without significant degradation of performance:

  • Data recovery operations using CommCell Console

  • Data recovery operations using Media Explorer

Note

If you have not enabled data multiplexing but are experiencing slower restores from the secondary copies, configure the Disable the Chunk Concatenation parameter.

Chunk Size of Data That is Multiplexed

Multiplexed data chunk sizes are determined by the type of data that is being multiplexed: file system data or database data.

  • If the first backup is a file system type backup, all other backups joining multiplexing will have a chunk size of 4 GB.

  • If the first backup is a database type backup, all other backups joining multiplexing will have a chunk size of 16 GB.

Multiplexed data is aged when all jobs (multiplexed) on a single chunk have met the defined retention rules of their associated storage policy copy. For more information, see Data Aging.
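The chunk size rule and the aging rule above can be written as two small functions. This is an illustrative sketch only: the names `multiplexed_chunk_size_gb` and `chunk_can_be_aged` and the type labels `"file_system"` and `"database"` are hypothetical, while the 4 GB and 16 GB values come from the text.

```python
# Chunk sizes from the text, keyed by the type of the first backup.
CHUNK_SIZE_GB = {"file_system": 4, "database": 16}


def multiplexed_chunk_size_gb(backup_types):
    """Return the chunk size used by every backup in a multiplexed set.

    `backup_types` lists the backup types in the order they joined; the
    first one decides the chunk size for all the others.
    """
    return CHUNK_SIZE_GB[backup_types[0]]


def chunk_can_be_aged(jobs_on_chunk, retention_met):
    """A chunk is eligible for aging only when every job on it has met retention.

    `retention_met` maps a job name to True or False.
    """
    return all(retention_met[job] for job in jobs_on_chunk)
```

For example, a database backup that joins a file system backup already in progress is written in 4 GB chunks, and a chunk shared by two jobs stays on the media until both jobs have met their retention rules.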

Note

  • Data Multiplexing is not supported if the storage policy copy is enabled with Deduplication. However, a Silo copy supports Data Multiplexing even if the storage policy copy is enabled with Deduplication.

  • Multiplexed data can be copied to a deduplicated storage policy copy. See Deduplicating Multiplexed Data for instructions.

  • An Auxiliary Copy can be configured with Data Multiplexing when the source copy is enabled for Deduplication.

Best Practices

It is recommended that you keep the following in mind when performing data multiplexing:

  • Using Multiplexing Options to Improve the Data Movement Performance

    Multiplexing does not improve the performance of an individual backup operation. However, running multiple backups in parallel to a single tape drive results in better utilization of the tape drive, especially when the backups are from slower clients. This improves overall throughput and shortens the backup window.

    For LAN backups, make sure that the network between the clients and MediaAgent is capable of supporting multiple simultaneous backups.

    • Typical multiplexing factors are set between 2 and 5 on 100BaseT networks.

    • Typical multiplexing factors are set between 5 and 8 on 1000BaseT networks.

    You can determine the multiplexing factor by analyzing the network configuration and the required disk throughput. Do not over-multiplex; doing so is counter-productive and slows down backups as well as restores. Set the multiplexing factor to approximately the ratio of tape drive throughput to client source speed. For example, if the tape drive has a rated speed of 40 Mb/sec and clients are able to supply the data at about 12 Mb/sec, the ratio is about 3.3, so a multiplexing factor of 3 is advisable. A typical multiplexing factor is between 2 and 5.

    Note

    Restores from multiplexed data on tape libraries will be slower.

  • Use different storage policies for file system and database type data before performing data multiplexing. This avoids mixing data types that use different chunk sizes.

  • If possible, use the Restore by Jobs option to restore multiplexed data, especially when restoring a large amount of data. This provides the optimum performance during the restore operation as there are fewer tape rewinds to secure the data.

  • It is recommended that you perform data multiplexing for jobs that have similar speeds (e.g., two database jobs), instead of mixing faster jobs (e.g., file systems) with slower jobs (e.g., databases). Mixing faster and slower jobs results in data stored on media that is not uniform. Hence, data recovery operations of slower clients will have an added performance penalty.

  • Multiplexing is recommended if you are planning to recover:

    • Individual items, files and folders.

    • Entire computers or databases.

  • Multiplexing is not recommended under the following conditions:

    • If you are planning to recover scattered folders, because multiplexing further scatters the data. It also adds extra tape mounts and rewinding/forwarding on the media.

    • For clients that undergo very frequent restore requests.

  • The multiplexing factor is determined based on the ratio of the tape drive speed to the disk speed. For example, consider the following speeds:

    • Tape write speed = 80 GB per hour

    • Disk read speed (backup) = 25 GB per hour

    • Tape read speed = 80 GB per hour

    • Disk write speed (restore) = 60 GB per hour

    Tape write speed/disk read speed (backup) = 80/25 = 3.2

    Tape read speed/disk write speed (restore) = 80/60 = 1.33

    It is recommended that you use the lower of the two ratios as the multiplexing factor if you want no-penalty data recovery operations.
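The ratio calculation in the best practices above can be sketched as follows. The function `multiplexing_factor` is hypothetical. Rounding the ratio down to a whole number (with a floor of 1) is an assumption; it is consistent with the 40 Mb/sec and 12 Mb/sec example, where a ratio of about 3.3 leads to a recommended factor of 3.

```python
import math


def multiplexing_factor(tape_write, disk_read, tape_read=None, disk_write=None):
    """Suggest a multiplexing factor from drive and disk speeds.

    All speeds must use the same unit. If the restore speeds are supplied,
    the lower of the backup and restore ratios is used, as recommended for
    no-penalty data recovery operations.
    Rounding down to a whole number is an assumption of this sketch.
    """
    ratio = tape_write / disk_read  # backup ratio
    if tape_read is not None and disk_write is not None:
        ratio = min(ratio, tape_read / disk_write)  # restore ratio
    return max(1, math.floor(ratio))


backup_only = multiplexing_factor(80, 25)            # ratio 3.2
no_penalty = multiplexing_factor(80, 25, 80, 60)     # lower ratio is 1.33
```

With the speeds listed above, considering backup alone suggests a factor of 3, while taking the lower restore ratio gives a factor of 1, meaning that no-penalty restores would rule out multiplexing for those particular speeds.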
