Ultra320 SCSI

1. Introduction

Ultra320 SCSI - Page 1

Source: Adaptec

- Introduction

SCSI celebrates its 20th anniversary by moving to the seventh generation of the bus, which raises the maximum data transfer rate to 320 MB/sec. Over the past two decades the protocol has evolved from an 8-bit, single-ended interface transferring data at 5 MB/sec to a 16-bit, differential interface transferring data at 160 MB/sec. For the first time, the SCSI protocol itself has been revised to reduce the time spent on processing overhead, resulting in increased performance. Ultra320 SCSI launched at the end of 2001.

The three electrical signaling levels of SCSI:

SE = Single-ended
HVD (or Differential SCSI) = High-voltage differential, based on EIA-485
LVD = Low-voltage differential

2. Features

Source: Adaptec

- SCSI Feature Sets

Mandatory features for Ultra320 SCSI as currently defined include:

• 320 megabyte per second transfer rate using DT data transfers
• 32-bit CRC
• Simple domain validation
• Backward compatibility
• Packetized transfers only
• A free-running clock
• Skew compensation
• A training pattern
• Transmitter precompensation with cutback

Optional features for Ultra320 SCSI currently include:

• AAF (Adaptive Active Filter)
• QAS (Quick Arbitration and Selection)
• Fairness
• AIP (Asynchronous Information Protection)

- What's New

Ultra320 SCSI introduces additional technologies that will reduce overhead and improve performance. These changes will allow data to transfer safely and reliably at 320 MB/sec. Ultra320 SCSI includes the following key features:

Double Transfer Speed: The burst transfer rate across the SCSI bus doubles to 320 MB/sec, which raises the point at which disk drives saturate the bus. This results in increased performance, especially in environments that use extended transfer lengths or have many devices on a single bus.
Packetized SCSI: This adds support for the packet protocol. Packetized devices decrease command overhead by transferring commands, data, and status using DT (double transition) data phases instead of slower asynchronous phases. This improves performance by maximizing bus utilization and minimizing command overhead. The packet protocol also enables multiple commands to be transferred in a single connection. In Ultra160 SCSI, by contrast, data is transferred in a synchronous phase at 160 MB/sec, while command and status information is still transferred in slower asynchronous phases and limited to a single transfer per connection.
Quick Arbitration and Selection (QAS): This reduces the overhead of handing control of the SCSI bus from one device to another. This improvement reduces command overhead and maximizes bus utilization.
Read and Write Data Streaming: This minimizes the overhead of data transfer by allowing the target to send one data stream LUN Q-TAG (LQ) packet followed by multiple data packets. In a non-streaming transfer, there is one data LQ packet for each data packet. Write data streaming performance also increases because the bus turn-around delay (from DT data in to DT data out) is not incurred between each LQ and data packet.
Flow Control: This allows the initiator to optimize its pre-fetching of data during writes and its flushing of data FIFOs during reads. The target indicates when the last packet of a data stream will be transferred, which allows the initiator to terminate the data pre-fetch or begin flushing data FIFOs sooner than was previously possible.
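
As a rough illustration of the data streaming saving described above, the following Python sketch counts the packets that cross the bus for a transfer of a given number of data packets. The function name and the packet counts used are illustrative assumptions, not part of the standard; the sketch only encodes the rule stated in the text (one LQ packet per data packet without streaming, one LQ packet per stream with it).

```python
def packets_on_bus(data_packets: int, streaming: bool) -> int:
    """Total LQ + data packets for one transfer (assumes data_packets >= 1).

    Non-streaming: every data packet is preceded by its own LQ packet.
    Streaming: a single data stream LQ packet precedes all data packets.
    """
    lq_packets = 1 if streaming else data_packets
    return lq_packets + data_packets


# Hypothetical transfer split into 8 data packets.
non_streaming = packets_on_bus(8, streaming=False)
with_streaming = packets_on_bus(8, streaming=True)
print(non_streaming, with_streaming)
```

The longer the stream, the closer the streaming case gets to carrying nothing but data packets.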

- Ultra320 SCSI lines up with PCI-X

Faster I/O performance will saturate the PCI bus, so most host implementations are tied to PCI-X. Disk drive media rates continue to increase: later this year, drive data rates are expected to exceed 40 MB/sec. At that rate, the four drives found in an average server can together sustain more than 160 MB/sec, so SCSI will need to jump past Ultra160 SCSI to support their sustained throughput.

Under standard PCI, the host bus has a maximum speed of 66 MHz, which allows a maximum transfer rate of 533 MB/sec across a 64-bit PCI bus. With Ultra160 SCSI, two SCSI channels on a single device achieve a maximum transfer rate of 320 MB/sec, leaving plenty of headroom before saturating the PCI bus. At 320 MB/sec per channel, however, two SCSI channels can achieve 640 MB/sec, which will saturate a 64-bit / 66 MHz PCI bus. PCI-X doubles the performance of the host bus from 533 MB/sec to a maximum of 1066 MB/sec, and its protocol improvements make the bus more efficient than PCI. Together PCI-X and Ultra320 SCSI provide the bandwidth necessary for today’s applications.
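
The host bus arithmetic above can be checked with a short Python sketch. It assumes one bus-wide transfer per clock cycle and nominal clocks of 66.67 MHz for PCI and 133.33 MHz for PCI-X; those exact clock values and all function names are assumptions made for illustration, chosen because they reproduce the 533 MB/sec and 1066 MB/sec figures quoted in the text.

```python
def peak_bandwidth_mb_s(clock_mhz: float, width_bits: int) -> float:
    """Peak rate in MB/sec, assuming one bus-wide transfer per clock cycle."""
    return clock_mhz * width_bits / 8


# Assumed nominal clocks (not stated in the article).
PCI_64_66 = peak_bandwidth_mb_s(66.67, 64)     # roughly 533 MB/sec
PCIX_64_133 = peak_bandwidth_mb_s(133.33, 64)  # roughly 1066 MB/sec


def saturates(channels: int, channel_mb_s: float, host_mb_s: float) -> bool:
    """True when the combined SCSI channels can outrun the host bus."""
    return channels * channel_mb_s > host_mb_s


print(saturates(2, 160, PCI_64_66))    # dual-channel Ultra160 on PCI
print(saturates(2, 320, PCI_64_66))    # dual-channel Ultra320 on PCI
print(saturates(2, 320, PCIX_64_133))  # dual-channel Ultra320 on PCI-X
```

These are peak figures only; real buses lose some of this bandwidth to protocol overhead.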

3. Conclusion

Source: Adaptec

- Conclusion

Ultra320 SCSI is sure to add to the legacy of past SCSI technologies. SCSI has come a long way from its original 5 MB/sec transfer rate. At 320 MB/sec, Ultra320 SCSI is only the latest step in SCSI evolution. As technology continues to move into the 21st century, the industry can continue to look forward to new and faster SCSI technology. Ultra640 is already in development.

With new technologies such as packetized SCSI, QAS, training patterns, and transmitter precompensation, SCSI will continue to deliver performance safely and reliably for generations to come. As performance continues to grow, so will the applications that can take full advantage of greater I/O performance. PCI-X accelerates performance across the host bus to 1066 MB/sec, and Ultra320 SCSI is there to take full advantage of this available bandwidth.

And as always, SCSI maintains its backward compatibility, allowing customers to protect their investment while giving them the ability to grow as their needs increase. No other I/O technology can provide these advantages. SCSI continues to increase its performance, features, enhancements, and market share. Ultra320 SCSI is the newest example of SCSI’s continued commitment to providing the industry with the I/O bandwidth necessary for an increasing number of performance-hungry applications. SCSI will continue to evolve and, with Ultra640 SCSI already on the roadmap, it will be impossible to replace.

4. Detailed Features

Source: Maxtor

- SCSI features

Additional detail about the features described below is available in the ANSI standard document SCSI Parallel Interface-4 (SPI-4). The latest draft of this standard is available at ftp://ftp.t10.org/t10/document.00/00-378r0.pdf.

DT (or “double-transition”) data transfers: DT transfers use both the asserting and negating transitions of the ACK and REQ signals on the SCSI bus for clocking data transfers. This allows the transfer rate to be doubled without increasing the frequency of the clock signal. Because DT transfers are defined for use only with wide (16-bit) transfers, each transition of the clock signal transfers two bytes of data.
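
The relationship between clock frequency and transfer rate can be sketched in a few lines of Python. The 40 MHz and 80 MHz clock values below are assumptions (they are not given in this article) chosen to match the 160 MB/sec and 320 MB/sec rates of Ultra160 and Ultra320; the function name is likewise illustrative.

```python
def transfer_rate_mb_s(clock_mhz: float,
                       bus_width_bits: int = 16,
                       double_transition: bool = True) -> float:
    """Burst rate in MB/sec for a given REQ/ACK clock frequency.

    DT clocking moves data on both clock edges, so it performs two
    transfers per clock cycle; single-transition clocking performs one.
    """
    transfers_per_cycle = 2 if double_transition else 1
    bytes_per_transfer = bus_width_bits // 8
    return clock_mhz * transfers_per_cycle * bytes_per_transfer


print(transfer_rate_mb_s(40))                           # assumed Ultra160 clock
print(transfer_rate_mb_s(80))                           # assumed Ultra320 clock
print(transfer_rate_mb_s(80, double_transition=False))  # same clock, one edge only
```

The last call shows why DT matters: at the same clock frequency, using only one edge would halve the rate.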

CRC (or “cyclic redundancy check”): CRC is an algorithm that a sender uses to generate check bytes from transferred data. These check bytes are then transmitted immediately following the data. The recipient calculates check bytes from the received data and compares the result to the check bytes received following the data. If the two sets of check bytes match, the data is correct. In this manner CRC provides improved data reliability. CRC is defined for use only with DT transfers.
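
The generate, append, and compare cycle described above can be sketched in Python. This uses the standard library's `zlib.crc32` as a stand-in 32-bit CRC; the exact bit ordering and framing that SPI-4 specifies on the wire are not reproduced here, and the helper names and big-endian byte layout are assumptions for illustration.

```python
import zlib


def append_crc(data: bytes) -> bytes:
    """Sender side: compute 4 check bytes and send them right after the data."""
    return data + zlib.crc32(data).to_bytes(4, "big")


def check_crc(frame: bytes) -> bool:
    """Receiver side: recompute the check bytes and compare with those received."""
    data, received = frame[:-4], frame[-4:]
    return zlib.crc32(data).to_bytes(4, "big") == received


sector = bytes(range(256)) * 2             # 512 bytes of sample data
frame = append_crc(sector)

corrupted = bytearray(frame)
corrupted[10] ^= 0x04                      # flip one bit "in transit"

print(check_crc(frame))                    # intact frame
print(check_crc(bytes(corrupted)))         # corrupted frame
```

A single parity bit would miss any even number of flipped bits in a byte; a 32-bit CRC catches a far wider class of errors, which is why it accompanies the faster DT transfers.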

Note: DT clocking, CRC, and other protocol components were developed for Ultra160 and patented by Quantum, and are offered to all under “no-fee” license agreements.

Simple domain validation (also known as “physical layer integrity checking”): Simple domain validation defines how an initiator can use the INQUIRY command to query targets to determine their capabilities (e.g., maximum transfer rate), the system configuration (e.g., the width of the bus), and the basic functionality of the system components. It also defines how the initiator can use the READ BUFFER and WRITE BUFFER commands to send known data patterns to the targets and read them back for simple data integrity validation.
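
The write-then-read-back check can be sketched as follows. Everything here is a simulation: `FakeTarget`, its `stuck_low_mask` fault model, and the test patterns are hypothetical stand-ins invented for illustration, not a real SCSI API or the patterns a real initiator uses.

```python
class FakeTarget:
    """Hypothetical target with an echo buffer.

    stuck_low_mask simulates data lines that always read as 0.
    """

    def __init__(self, stuck_low_mask: int = 0x00):
        self._buffer = b""
        self._mask = stuck_low_mask

    def write_buffer(self, data: bytes) -> None:
        # Bits on a faulty line are lost on the way to the target.
        self._buffer = bytes(b & ~self._mask & 0xFF for b in data)

    def read_buffer(self) -> bytes:
        return self._buffer


# Illustrative known patterns: alternating extremes, alternating bits, a ramp.
PATTERNS = [
    bytes([0x00, 0xFF] * 32),
    bytes([0x55, 0xAA] * 32),
    bytes(range(64)),
]


def validate_path(target: FakeTarget, patterns=PATTERNS) -> bool:
    """Return True only if every pattern survives the round trip unchanged."""
    for pattern in patterns:
        target.write_buffer(pattern)
        if target.read_buffer() != pattern:
            return False
    return True


print(validate_path(FakeTarget()))                     # healthy path
print(validate_path(FakeTarget(stuck_low_mask=0x01)))  # one bad data line
```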

Backward compatibility: Backward compatibility means that a device supporting a new feature set can be used in physical configurations with devices that only support transfer rates and protocols previously defined for the SCSI interface. Examples include: the ability for transceivers to operate in “single-ended” mode (as opposed to the LVD, or “low-voltage differential”, mode required by the higher transfer rates), the ability to tolerate five-volt single-ended signaling from older devices, and the ability to function properly with the current cable plant specifications (i.e., 25 meters in a point-to-point configuration or 12 meters with up to 16 devices on the bus).

Information unit transfers (or “IU transfers”, also known as “packetized” or “packetization”): IU transfers provide a protocol to significantly increase overall system performance. Some of the elements of the protocol that provide this performance increase include:

• A method for non-data transfers (like commands sent from the initiator to the target and status sent from the target to the initiator) to occur at the maximum negotiated data rate of up to 320 megabytes per second for Ultra320 SCSI, as opposed to those same transfers occurring in asynchronous mode at five megabytes per second;
• A method to transfer SPI information units for a number of I/O processes without an intervening physical disconnection (e.g., an initiator could send several packets each containing a queued command to the target during a single physical connection without intervening BUS FREE phases);
• Minimizing the overhead required by eliminating several bus phase changes per I/O process. For example, a typical WRITE operation using normal data group transfers would require ARBITRATION, SELECTION, COMMAND, DATA OUT, STATUS, and MESSAGE IN phases. The same WRITE operation using IU transfers would only require ARBITRATION, SELECTION, DATA OUT, and DATA IN phases. The command and data would be transferred during the DATA OUT phase, and the STATUS and COMMAND COMPLETE message information would be transferred during the DATA IN phase, all at the maximum data rate.
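
The phase comparison in the last bullet can be written out as data. The Python sketch below simply encodes the two phase sequences named in the text and marks which phases run asynchronously in a non-packetized transfer; the variable and function names are illustrative assumptions.

```python
# Phase sequences for one WRITE operation, as listed in the text.
DATA_GROUP_WRITE = ["ARBITRATION", "SELECTION", "COMMAND",
                    "DATA OUT", "STATUS", "MESSAGE IN"]
IU_WRITE = ["ARBITRATION", "SELECTION", "DATA OUT", "DATA IN"]

# Information phases that run asynchronously when transfers are not packetized.
ASYNC_PHASES = {"COMMAND", "STATUS", "MESSAGE IN"}


def async_phases(sequence):
    """Return the phases in a sequence that would run at the slow asynchronous rate."""
    return [phase for phase in sequence if phase in ASYNC_PHASES]


print(len(DATA_GROUP_WRITE), async_phases(DATA_GROUP_WRITE))
print(len(IU_WRITE), async_phases(IU_WRITE))
```

With IU transfers the sequence is two phases shorter, and none of the remaining phases is one of the slow asynchronous ones.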

QAS (or “Quick Arbitration and Selection”): QAS allows for increased overall system performance by providing a method for arbitration to occur without intervening BUS FREE phases. QAS can only be enabled if information unit transfers are enabled.

Note: Packetized transfers and QAS can each save several microseconds per operation, as this is the scale of the time it takes to perform functions like arbitration and bus turnaround. For example, it takes 3.2 microseconds to transfer one sector of data (512 bytes) at 160 megabytes per second for Ultra160 SCSI. Since this time goes down to 1.6 microseconds at 320 megabytes per second for Ultra320 SCSI, it’s possible for the overhead required for a single-sector READ command to be several times greater than the time required to transfer the data for the command when using normal data group transfers (i.e., “non-packetized” or standard parallel SCSI transfer mode).
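
The sector timing in this note is simple to reproduce. In the Python sketch below, the 5-microsecond overhead figure is a hypothetical value chosen only to illustrate "several microseconds"; it is not a measured or specified number, and the function name is an assumption.

```python
def transfer_time_us(num_bytes: int, rate_mb_s: float) -> float:
    """Time to move num_bytes at rate_mb_s.

    1 MB/sec equals 1 byte per microsecond, so bytes / (MB/sec) gives microseconds.
    """
    return num_bytes / rate_mb_s


SECTOR_BYTES = 512
ultra160_us = transfer_time_us(SECTOR_BYTES, 160)
ultra320_us = transfer_time_us(SECTOR_BYTES, 320)

# Hypothetical per-command overhead for arbitration, bus turnaround, etc.
ASSUMED_OVERHEAD_US = 5.0
overhead_ratio = ASSUMED_OVERHEAD_US / ultra320_us

print(ultra160_us, ultra320_us, overhead_ratio)
```

Under that assumed overhead, a single-sector command spends roughly three times as long on overhead as on moving data, which is the imbalance packetized transfers and QAS are meant to reduce.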

SCSI bus fairness (or simply “fairness”): Fairness prevents a device from “hogging” the bus by guaranteeing that all devices have an opportunity to arbitrate. Fairness must be enabled when QAS is enabled, as “hogging” could potentially be more of an issue with that protocol.

AIP (or “Asynchronous Information Protection”): AIP provides an enhanced error detection method for the COMMAND, MESSAGE, and STATUS asynchronous information transfer phases. In systems without AIP, these phases transfer information on the lower eight data bits of a SCSI bus with only parity protection on those transfers. AIP transfers error detection information (a BCH Hamming code) on the upper eight data bits of the data bus simultaneously with the information transfer. The protection code will detect all errors of three bits or fewer, all errors of an odd number of bits, and 98.4% of all possible errors.

Free-running clock (sometimes called “FRC”): A free-running clock is used to improve data integrity of the clock signal by removing intersymbol interference (or “ISI”). ISI is the effect of a transition on a signal line on transitions immediately before or after it on the same line. A pulse (or “symbol”) will cause a nearby preceding pulse to shift forward in time, and it will cause a nearby subsequent pulse to shift backward in time (i.e., a pulse will “interfere” with the placement in time of adjacent pulses). By having a clock running at a constant frequency, this effect is neutralized. The free-running clock is restricted for use with packetized DT transfers at a transfer rate of 320 megabytes per second or greater.

Skew compensation of data signals relative to the clock signal: Skew is the difference in time between one signal on a bus arriving at a point (e.g., a recipient’s connector) relative to a second signal launched by the sender at the same time on another line on the same bus. This is caused by any combination of several factors, including differences in PCB trace or cable length and different electrical characteristics of the different signal paths. A device looks for the state of the data signals during a “data valid” window in time established by the clock. If a data transition is skewed so much relative to the clock that it falls outside of the window, the device will not accurately detect the data. One of the largest numbers in the error budget for Ultra320 is skew. At this transfer rate a one-nanosecond difference in the time a signal arrives at the recipient relative to the clock could be the difference between good data and an error. For Ultra320 the receiving device performs skew compensation on all data signals simultaneously while examining a known data pattern (see the description of the training pattern that follows for more detail). By knowing when data transitions should occur on the signal lines, the receiving device determines any shift of the data signals in time required to make the signals fall at, or near, the center of the data valid window. This shift is then applied to the signals on all subsequent data transmissions.
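
The idea of measuring each line against a known pattern and then shifting it toward the window center can be sketched numerically. The Python below is a toy model, not how a receiver is actually built: the arrival offsets, the window half-width, and the function names are all made-up values for illustration, and it allows negative shifts, whereas real hardware would typically only add delay.

```python
def in_window(arrival_ps, half_width_ps, center_ps=0):
    """Per-line check: does the signal land inside the data valid window?"""
    return [abs(t - center_ps) <= half_width_ps for t in arrival_ps]


def corrections_ps(arrival_ps, center_ps=0):
    """Per-line shift that would move each measured arrival to the window center."""
    return [center_ps - t for t in arrival_ps]


def apply_corrections(arrival_ps, shifts_ps):
    """Apply the stored shifts to later transmissions on the same lines."""
    return [t + s for t, s in zip(arrival_ps, shifts_ps)]


# Hypothetical arrival offsets (picoseconds, relative to the window center)
# measured on three data lines while the known pattern is received.
measured = [-900, 150, 1200]
HALF_WIDTH_PS = 1000  # assumed window half-width

print(in_window(measured, HALF_WIDTH_PS))
shifts = corrections_ps(measured)
print(in_window(apply_corrections(measured, shifts), HALF_WIDTH_PS))
```

In this toy case the third line starts outside the window and would be misread; after the stored shifts are applied, all three lines land at the center.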

Training pattern: The training pattern is a pre-determined pattern that is transmitted from the sender to the receiver at a specified time. Because the receiver knows what the pattern will be (i.e., exactly when data transitions should occur), it can use portions of this pattern to perform skew compensation. Other portions of this pattern are used by devices implementing adaptive active filtering (a.k.a. “AAF” or receiver equalization, described later in this section) to set the gain of the amplification and other signal adjustments. The definition of the training pattern in the most recent draft of the ANSI standard allows the target to control how often the pattern is sent. The pattern may be sent before each data transmission, or after some period of time or an event such as a bus reset caused by a new device being added to the system.

Transmitter precompensation with cutback: Transmitter precompensation with cutback is an “open loop” method of trying to compensate for signal loss on the first pulse of a transition by “boosting” the amplitude of the first part of a transition, or “cutting back” the signal for the remainder of the transition. This method compensates for some of the signal loss that is most severe on the first part of a transition. Transmitter precomp is called “open loop” because there is no standard method for the transmitter to receive feedback from the receiver as to how much cutback should be used in any particular case or to adjust dynamically to changes in configurations (e.g., “hot swapping” of devices in systems).
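
A simplified model of cutback can make the description more concrete. In the Python sketch below, the first bit after each transition is driven at full amplitude and any following bits of the same value are driven at a reduced level. The 0.67 cutback ratio, the normalized amplitudes, and the function name are hypothetical values for illustration only; the actual drive levels are defined by the standard and are not reproduced here.

```python
def precomp_levels(bits, full=1.0, cutback=0.67):
    """Return a signed drive level for each bit of a sequence.

    The first bit after a transition uses the full amplitude; repeated bits
    of the same value use the reduced (cut back) amplitude.
    """
    levels = []
    previous = None
    for bit in bits:
        amplitude = full if bit != previous else cutback
        levels.append(amplitude if bit else -amplitude)
        previous = bit
    return levels


print(precomp_levels([1, 1, 1, 0, 1, 0, 0]))
```

Because the levels depend only on the transmitted bits and a fixed ratio, nothing in this scheme reacts to what the receiver actually sees, which is what "open loop" means in the text.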

AAF (or “Adaptive Active Filter”, also known as “receiver equalization with filtering”): AAF uses the training pattern for adaptive equalization of the received signal while removing unwanted noise components of the signal with a filter. This method significantly improves the quality of the received signal (background on Quantum’s development and additional detail of this feature are in the next section below). Using the training pattern to perform this adjustment of signal amplitude provides for an inherent “closed loop” system that adjusts signal quality for different cable plants and changes in other conditions. In addition, a standard method has been developed for a receiver to disable transmitter precomp in a transmitter. This method was developed because a transmitter-receiver nexus where the receiver implements AAF provides better signal quality when transmitter precomp is disabled, and significantly better signal quality than a nexus with transmitter precomp only.