Fujitsu Laboratories Ltd. has developed a high-speed access technology for magnetic tape storage, which is attracting renewed interest as a low-cost, large-capacity alternative to hard disk drives (HDDs).
Traditionally, magnetic tape storage has been used mainly for backup purposes, but because of its high capacity and low cost, as well as the acceleration of transfer speeds and the spread of the Linear Tape File System (LTFS) in recent years, it is expected to find increasing use for archival purposes.
Fujitsu has expanded the functionality of LTFS by developing a file system that virtually integrates multiple tape cartridges. By managing data and controlling access order in accordance with the characteristics of tape, the new technology improves random read performance, achieving reads 4.1 times faster than conventional methods.
This technology could accelerate the adoption of magnetic tape storage technology as an archival medium in anticipation of the exponential growth of data in the future.
Fujitsu Laboratories is currently running a verification trial that assumes the technology will be applied in its own operations, with plans to commercialize the technology by the end of fiscal year 2022.
About the Newly Developed Technology
While magnetic tape storage is ideally suited to reading from and writing to sequential areas of tape, its ability to randomly access discontinuous locations remains limited. This weakness in random reads is a roadblock to broadening the use of tape in high-volume data archiving applications.
LTFS makes it possible to access data on tape on a file-by-file basis, in the same way as data on HDDs, USB memory, etc. In general, however, when a large amount of data is managed across multiple tape cartridges on LTFS, the data is held under a separate directory for each tape cartridge.
Fujitsu developed a new file system on LTFS that virtually integrates multiple tape cartridges. This virtually integrated file system consolidates multiple tape cartridges into one, allowing users to access the data they need without thinking about each individual tape cartridge. In addition, the following newly developed technologies have been applied to this file system to achieve high-speed magnetic tape access performance.
1. Access Order Control with Physical Location
On a magnetic tape, data is laid out in lengthwise units called wraps. Writing proceeds in a write-once fashion and turns back at the end of each wrap to continue on the next. As a result, the distance between two logical addresses can differ greatly from the distance between their physical locations. The virtually integrated file system accepts multiple random read requests and processes them starting with the closest physical location on the tape, rather than in order of logical address.
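The ordering idea can be illustrated with a minimal sketch. It assumes each request already carries an estimated physical location as a wrap index and a longitudinal position; the `Request` type, the `seek_cost` function, and the wrap-change penalty are hypothetical stand-ins chosen for illustration, not details of Fujitsu's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    name: str
    wrap: int    # hypothetical: index of the wrap that holds the file
    lpos: float  # hypothetical: longitudinal position along the tape


def seek_cost(head, req, wrap_change_penalty=1.0):
    """Toy cost model: longitudinal travel plus a fixed penalty for changing wraps."""
    head_wrap, head_lpos = head
    cost = abs(req.lpos - head_lpos)
    if req.wrap != head_wrap:
        cost += wrap_change_penalty
    return cost


def order_by_physical_distance(requests, head=(0, 0.0)):
    """Greedy nearest-first ordering: always serve the physically closest request next."""
    pending = list(requests)
    ordered = []
    while pending:
        nxt = min(pending, key=lambda r: seek_cost(head, r))
        pending.remove(nxt)
        ordered.append(nxt)
        head = (nxt.wrap, nxt.lpos)  # the head is now where that file was read
    return ordered


requests = [
    Request("A", wrap=0, lpos=900.0),
    Request("B", wrap=1, lpos=100.0),  # on a later wrap, but physically near the start
    Request("C", wrap=0, lpos=120.0),
]
print([r.name for r in order_by_physical_distance(requests)])  # ['B', 'C', 'A']
```

In this toy example, file B sits on a later wrap than A and C, yet it is served first because it is physically nearest to the starting head position.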
When writing to magnetic tape, writing and error checking are performed in parallel, and when an error occurs, only the portion in error is automatically rewritten after the end of the data already written. Therefore, it is difficult to predict the physical location where a write finished from the change in file size alone. Instead, the physical location of each file is estimated by periodically measuring the head position after files are written.
Also, when accessing the magnetic tape, it takes time to align the head to the start position. Therefore, when two read requests are close to each other on the same wrap, the system does not read the two files separately. Instead, it reads all the files between them in one pass and discards the unneeded ones.
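One way to picture this estimation is as a log of head-position samples that is consulted later. The following is a rough sketch under stated assumptions: the `PositionLog` class, its methods, and the rule of using the most recent sample at or before a file's logical offset are all hypothetical, and the sampling interval is left to the caller.

```python
import bisect


class PositionLog:
    """Hypothetical log of head positions sampled after files are written."""

    def __init__(self):
        self._offsets = []    # logical byte offsets at which samples were taken (ascending)
        self._positions = []  # (wrap, lpos) measured at each sample

    def record(self, logical_offset, wrap, lpos):
        # Samples are assumed to arrive in increasing logical order, as writes are append-only.
        self._offsets.append(logical_offset)
        self._positions.append((wrap, lpos))

    def estimate(self, logical_offset):
        """Return the most recent measured position at or before the given logical offset."""
        i = bisect.bisect_right(self._offsets, logical_offset) - 1
        if i < 0:
            return None  # nothing was measured before this offset
        return self._positions[i]


log = PositionLog()
log.record(0, wrap=0, lpos=0.0)
log.record(1000, wrap=0, lpos=5.0)
log.record(2000, wrap=0, lpos=11.5)  # rewrites made this stretch physically longer
print(log.estimate(1500))  # (0, 5.0)
```

Because the positions are measured rather than computed from file sizes, any extra tape consumed by rewrites is reflected in later samples.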
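The read-through behavior can be sketched as a planning step over the files on one wrap. This is an illustrative sketch only: the `plan_spans` function, the `(name, start, end)` tuples, and the `max_gap` threshold are assumptions, since the source does not say how "close" is decided.

```python
def plan_spans(files_on_wrap, wanted, max_gap):
    """Group wanted files on one wrap into sequential read spans.

    files_on_wrap: list of (name, start, end) sorted by start position.
    wanted: set of requested file names.
    max_gap: hypothetical threshold; wanted files closer than this share one span.
    Returns a list of spans, each a list of (name, keep) pairs in read order.
    """
    spans = []
    current = []            # span being built
    skipped = []            # unwanted files seen since the last wanted file
    last_wanted_end = None
    for name, start, end in files_on_wrap:
        if name not in wanted:
            if current:
                skipped.append(name)
            continue
        if current and start - last_wanted_end <= max_gap:
            # Close enough: read straight through and discard the files in between.
            current.extend((s, False) for s in skipped)
        elif current:
            # Too far: finish this span and reposition the head for the next one.
            spans.append(current)
            current = []
        current.append((name, True))
        skipped = []
        last_wanted_end = end
    if current:
        spans.append(current)
    return spans


files = [("a", 0, 10), ("b", 10, 12), ("c", 12, 20), ("d", 20, 400), ("e", 400, 410)]
print(plan_spans(files, wanted={"a", "c", "e"}, max_gap=5))
# [[('a', True), ('b', False), ('c', True)], [('e', True)]]
```

Here files a and c are read in a single pass with b read and discarded, while e is far enough away that a separate head alignment is assumed to be cheaper.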
2. Multiple File Aggregation Function
LTFS maintains an index of every file on the magnetic tape, and the impact of this index on performance increases exponentially as the number of files increases. When using tapes for archival purposes, users write and access files of various sizes, but writing large numbers of small files can significantly degrade read performance.
Therefore, Fujitsu has developed a mechanism that stores files smaller than a specified size together as large files on LTFS, so that users can access them without worrying about where the files are located. In addition, because the metadata of user files is managed in the virtually integrated file system, operations other than reading data, such as displaying a file list, adding extended attributes, or deleting files, can be performed quickly without accessing the magnetic tape.
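The aggregation mechanism can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the `AggregatingStore` class, its single open container for small files, and the use of byte arrays as stand-ins for large files on tape are not taken from Fujitsu's design.

```python
class AggregatingStore:
    """Toy model: small files share a container, metadata lives off-tape."""

    def __init__(self, small_limit):
        self.small_limit = small_limit  # files below this size are aggregated
        self.containers = []            # bytearrays standing in for large files on tape
        self.metadata = {}              # name -> (container index, offset, length)
        self._open = None               # container currently collecting small files

    def write(self, name, data):
        if len(data) < self.small_limit:
            if self._open is None:
                self.containers.append(bytearray())
                self._open = len(self.containers) - 1
            idx = self._open
        else:
            # Large files get a container of their own.
            self.containers.append(bytearray())
            idx = len(self.containers) - 1
        offset = len(self.containers[idx])
        self.containers[idx] += data
        self.metadata[name] = (idx, offset, len(data))

    def read(self, name):
        idx, offset, length = self.metadata[name]
        return bytes(self.containers[idx][offset:offset + length])

    def list_files(self):
        # Served from metadata alone; no container (tape) access is needed.
        return sorted(self.metadata)

    def delete(self, name):
        # Metadata-only delete; the bytes stay in the container in this sketch.
        del self.metadata[name]


store = AggregatingStore(small_limit=10)
store.write("a.txt", b"abc")
store.write("big.bin", b"x" * 20)
store.write("b.txt", b"de")
print(len(store.containers))  # 2
print(store.read("b.txt"))    # b'de'
```

The user addresses files by name only, and listing or deleting touches the metadata map rather than the containers, which mirrors the point that such operations do not require tape access.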
Fujitsu constructed a hierarchical storage system combining HDDs and magnetic tapes using Ceph (1), an open source distributed storage software, and evaluated the access performance of the system. With the conventional method, reading 100 files at random from a total of 50,000 individual 100 MB files stored on magnetic tape took 5,400 seconds. With the new technology, the same read completed in 1,300 seconds, which is 4.1 times faster than the conventional method. In addition, while the conventional method required 2.5 seconds to move 256 individual 1 MB files from the HDD onto magnetic tape, the new technology completed the move in 1.3 seconds, which is 1.9 times faster than the conventional method.
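The reported ratios follow directly from the measured times. The short calculation below reproduces them from the figures quoted above; the variable names are illustrative, and the per-file averages are derived values that are not stated in the release.

```python
# Random read of 100 files out of 50,000 x 100 MB files on tape (seconds)
conventional_read_s = 5400
new_read_s = 1300
files_read = 100

# Moving 256 x 1 MB files from HDD to tape (seconds)
conventional_move_s = 2.5
new_move_s = 1.3

read_speedup = conventional_read_s / new_read_s
move_speedup = conventional_move_s / new_move_s

print(f"read speedup: {read_speedup:.2f}x")  # read speedup: 4.15x
print(f"move speedup: {move_speedup:.2f}x")  # move speedup: 1.92x
print(f"avg per file: {conventional_read_s / files_read:.0f} s -> {new_read_s / files_read:.0f} s")
# avg per file: 54 s -> 13 s
```

The read ratio of about 4.15 matches the reported 4.1 when truncated to one decimal place, and the move ratio of about 1.92 matches the reported 1.9.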
This technology enables high-speed tape access for the workloads that occur in archive applications, such as random reads and writes of files of various sizes, and is expected to provide a cost-effective infrastructure for long-term archiving of large volumes of data.