EMC is not using off-the-shelf SSDs; instead, it created custom flash modules (CFMs). These modules deliver high flash performance, reliability, and storage density with low latency. The individual state of each flash cell is visible to the controllers, and the system's operating environment takes advantage of this visibility to better manage I/O and more efficiently schedule housekeeping operations such as free space management.
The D5 dispenses with conventional shared storage host connections in favor of a significantly higher-performance direct memory connection over PCIe. In addition, the D5 is the first shared NVMe device. The platform features a PCIe host connection that can extend up to 50m and is compatible with the PCIe slots available in most commodity server products.
EMC developed a nonblocking PCIe mesh that provides up to 96 physical host connections. Each server is attached to the mesh via a DSSD-designed PCIe Client Card.
This design decision allows the D5 to deliver the same latencies as server-based flash while providing the advantages of shared storage: efficient resource allocation, the ability to handle large data sets, data sharing, and access to enterprise-class data services.
A system designed to scale to the 10M IOPS range over time, handle hundreds of terabytes of data, and support thousands of devices that respond in microseconds would be significantly held back if it used an I/O stack built for HDDs. With the D5, EMC has implemented a new I/O stack that allows applications to talk much more directly with the underlying memory-mapped media. Also, big data/analytics solutions must be able to simultaneously handle structured, unstructured, and semistructured data. The D5 supports three different access methods: a block driver, a direct memory API for object and key value access, and an HDFS plug-in.
The block driver provides compatibility with legacy applications such as relational databases while offering very low-latency data access over memory-mapped PCIe.
The Flood Direct Memory API provides the same low-latency access for object and key value stores over the PCIe fabric for custom applications, or it can support DSSD- or ISV-developed plug-ins for key applications.
The first of these plug-ins was jointly developed by DSSD and Cloudera for Hadoop (HDFS) application environments and is available now.
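To make the distinction between the access methods concrete, the toy sketch below contrasts a block-style interface (fixed-size sectors addressed by logical block number, as a legacy database expects) with a key-value interface (variable-size objects addressed by key). These are illustrative in-memory stand-ins; the class and method names are invented and are not the actual Flood API.

```python
BLOCK_SIZE = 512  # typical sector size assumed for illustration

class BlockDevice:
    """Block access model: fixed-size sectors addressed by logical block number."""
    def __init__(self, nblocks: int):
        self._data = bytearray(nblocks * BLOCK_SIZE)

    def write(self, lba: int, buf: bytes) -> None:
        assert len(buf) == BLOCK_SIZE  # block I/O is always sector-sized
        self._data[lba * BLOCK_SIZE:(lba + 1) * BLOCK_SIZE] = buf

    def read(self, lba: int) -> bytes:
        return bytes(self._data[lba * BLOCK_SIZE:(lba + 1) * BLOCK_SIZE])

class ObjectStore:
    """Key-value access model: variable-size objects addressed by key."""
    def __init__(self):
        self._objects = {}

    def put(self, key: str, value: bytes) -> None:
        self._objects[key] = bytes(value)

    def get(self, key: str) -> bytes:
        return self._objects[key]
```

The point of offering both (plus an HDFS plug-in) is that structured data maps naturally onto fixed blocks, while unstructured and semistructured data fits the object/key-value model better.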
With traditional storage architectures, data moving from the application to persistent storage is copied into server CPU memory and then forwarded to persistent storage from there.
The D5 uses dedicated CPUs, outside the data path, to move data directly from the application to persistent flash storage via PCIe direct memory access using NVMe.
This design can move an order of magnitude more data than conventional storage architectures that use the traditional store-and-forward method.
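The difference between the two data paths can be sketched as a toy copy-counting model: the conventional path stages data in server CPU memory before forwarding it to storage, while a direct DMA path transfers it in a single step. This is a conceptual illustration only, not the D5's actual data-movement code.

```python
def store_and_forward(app_buf: bytearray):
    """Conventional path: application buffer -> CPU memory -> storage (two copies)."""
    cpu_memory = app_buf.copy()      # copy 1: staged in server CPU memory
    device = cpu_memory.copy()       # copy 2: forwarded to persistent storage
    return device, 2

def direct_dma(app_buf: bytearray):
    """Direct path: application buffer -> storage via PCIe DMA (one transfer)."""
    device = app_buf.copy()          # single transfer, no CPU staging
    return device, 1
```

Halving the number of data movements (and taking the server CPU out of the data path) is what lets the same hardware sustain far higher aggregate throughput.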
These technologies combine to enable the D5 to deliver consistent low-latency performance (100 microsecond response times) against massive big data workloads even as I/O scales to millions of IOPS.
For data protection, the D5 uses a flash-optimized RAID design called Cubic RAID. This method uses an interlocking multidimensional grid approach to data protection. Other RAID systems, even those that utilize more than one dimension, are limited in the kinds of drive-level failures they can tolerate. The D5 was designed with a unified approach to correcting errors across thousands of NAND die (as opposed to correcting errors across drives).
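The idea behind a multidimensional parity grid can be shown with a minimal two-dimensional XOR example: each cell is protected by parity along more than one axis, so a lost cell can be rebuilt from either its row or its column. This is a simplified sketch of the general technique, not Cubic RAID's actual (proprietary) layout.

```python
from functools import reduce
from operator import xor as bxor

def parity(values):
    """XOR parity of a list of byte values."""
    return reduce(bxor, values, 0)

# A 3x3 grid of data cells (stand-ins for NAND die contents), with
# parity computed along both rows and columns.
grid = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
row_parity = [parity(row) for row in grid]
col_parity = [parity(col) for col in zip(*grid)]

def recover_from_row(r, c):
    """Rebuild cell (r, c) from its row's surviving cells plus row parity."""
    return parity([grid[r][k] for k in range(len(grid[r])) if k != c] + [row_parity[r]])

def recover_from_col(r, c):
    """Rebuild cell (r, c) from its column's surviving cells plus column parity."""
    return parity([grid[k][c] for k in range(len(grid)) if k != r] + [col_parity[c]])
```

Because every cell sits at the intersection of multiple parity groups, the grid can survive failure combinations that a single-dimension scheme cannot, which is the property Cubic RAID generalizes across thousands of NAND die.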
The appliance provides up to 36 flash modules with 144TB raw (100TB usable) capacity, in a five rack-unit chassis that can be accessed redundantly by up to 48 direct-attached servers.
EMC DSSD D5 will be generally available in March 2016.