Hybrid File Store Disk Cache Guidelines

Review the following disk cache guidelines for Hybrid File Store:

Disk Type

Use a dedicated physical drive for the disk cache location. Hybrid File Store (HFS) fills all of the available space, so sharing the drive with other files is not advisable. In addition, other partitions on the same drive reduce performance because they compete for IO bandwidth and increase disk seek times. For small applications that transfer little data, it is acceptable to share the physical drive with other partitions.

The speed of HFS depends mostly on the IO speed of the disk cache. Therefore, use the fastest drive available. Multiple SSDs in a RAID configuration are usually faster than a single SSD, SSDs are faster than hard drives, and a RAID of hard drives is faster than a single hard drive. In general, data transferred into HFS through an NFS or SMB share is first saved to the disk cache, along with a small amount of internal metadata, and is then read by the backup subsystem. The drive must be fast enough to sustain that transfer at the expected performance level.

Disk Size

The minimum recommended disk cache size is 1 TB. Beyond that, the ideal size is difficult to quantify and depends on several factors:

  1. The sustained amount of data written to and read from HFS.

  2. The amount of expected burst data.

  3. The speed of backups to the disk library.

  4. The amount of data written that must be cached for faster reads later.

When data is written to HFS over NFS or SMB, it is first stored in the disk cache and then backed up asynchronously.

As the data is backed up, the old data can be removed to make room for new data. If backups are slower than the incoming user data, the cache eventually starts to fill up and HFS throttles the user data. The larger the disk cache, the longer it takes before the user data is throttled. Under these conditions, up to 1 TB may be required to give the backup subsystem time to back up the incoming data, to let HFS find old data to remove, and to store internal files. Faster throughput requires a larger cache for HFS to keep up. Any excess space is used to store burst data and to keep more data available for reading.
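The relationship between cache size, incoming data rate, and backup rate can be illustrated with a rough calculation. The following Python sketch is not part of HFS. It assumes a simplified model in which the cache fills at the difference between the two rates, and it ignores internal metadata and the time HFS needs to find old data to remove. The function name and the workload numbers are hypothetical.

```python
TB = 10**12  # decimal units are assumed here
MB = 10**6


def seconds_until_throttle(free_cache_bytes, ingest_bytes_per_s, backup_bytes_per_s):
    """Estimate how long the cache can absorb writes before it fills.

    Returns None when the backup rate keeps up with the incoming data,
    because the cache never fills under this simplified model.
    """
    net_fill_rate = ingest_bytes_per_s - backup_bytes_per_s
    if net_fill_rate <= 0:
        return None
    return free_cache_bytes / net_fill_rate


# Hypothetical workload: 1 TB of free cache, 500 MB/s incoming, 300 MB/s backup.
eta = seconds_until_throttle(1 * TB, 500 * MB, 300 * MB)
print(eta)  # 5000.0 seconds, roughly 83 minutes
```

Under the same assumptions, doubling the free cache space doubles the time before throttling begins, which matches the guidance that a larger cache delays throttling.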

Note that the disk cache can be smaller than the files being stored. Files stored in HFS are split into 10 MB extents as they are written, and each extent is stored in the disk cache and backed up separately. For example, if a client uploads a 10 TB file to HFS, a 1 TB disk cache is sufficient because the cache never has to hold the entire file at once.
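The arithmetic behind this example can be sketched in Python. The 10 MB extent size comes from the text above. The use of decimal units, the helper name, and the idea that the whole cache is available for extents (ignoring internal metadata) are assumptions made for illustration only.

```python
MB = 10**6   # decimal units are assumed here
TB = 10**12
EXTENT_SIZE = 10 * MB  # extent size stated in the guidelines


def extent_count(size_bytes, extent_size=EXTENT_SIZE):
    """Number of extents needed to cover size_bytes, rounding up."""
    return (size_bytes + extent_size - 1) // extent_size


file_extents = extent_count(10 * TB)   # extents in a 10 TB file
cache_extents = (1 * TB) // EXTENT_SIZE  # whole extents that fit in a 1 TB cache

print(file_extents)   # 1000000
print(cache_extents)  # 100000
```

In this simplified view, at most about one tenth of the file's extents are in the cache at any moment. The rest have either been backed up and removed or have not been written yet.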
