Know your I/O: The Technology


We've all heard that SATA is good for sequential I/O and SCSI/SAS is better for random I/O, but why is that? And what are these new RAID variants that are showing up? Fibre Channel vs. iSCSI vs. FCoE? Books have been written on these topics, but this should help familiarize you with some of the high-level details.

SATA vs SAS

When it gets down to brass tacks, there are a few protocol differences that make SAS the more robust protocol. When it comes to performance, though, the primary difference between the two is cost, not any inherent superiority. A 15,000 RPM drive will provide faster random I/O performance than a 7,200 RPM drive, regardless of which one is SATA and which is SAS. It's simply a fact of the marketplace that the 15K drive will be SAS, and the 7.2K drive will be SATA.
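To put rough numbers on that: a drive's random I/O ceiling is dominated by seek time plus rotational latency, and rotational latency is purely a function of spindle speed. A quick back-of-the-envelope sketch in Python (the seek times here are illustrative assumptions, not vendor specs):

```python
# Back-of-the-envelope random-IOPS estimate:
#   IOPS ~ 1 / (avg seek time + avg rotational latency)
# Average rotational latency is half a revolution, so it falls straight
# out of the RPM. The seek times are illustrative assumptions, not specs.

def avg_rotational_latency_ms(rpm):
    return (60_000 / rpm) / 2          # half a revolution, in milliseconds

def est_random_iops(rpm, avg_seek_ms):
    return 1000 / (avg_seek_ms + avg_rotational_latency_ms(rpm))

for rpm, seek_ms in [(7_200, 8.5), (15_000, 3.5)]:
    print(f"{rpm:>6} RPM: ~{est_random_iops(rpm, seek_ms):.0f} IOPS "
          f"(rotational latency {avg_rotational_latency_ms(rpm):.1f} ms)")
```

Run that and the 15K drive comes out a bit more than twice as fast on random I/O, purely from spindle speed and seek mechanics; the interface is nowhere in the math. So why does the marketplace insist the fast drive is SAS and the big drive is SATA?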

This has a lot to do with tradition, and a healthy pinch of protocol differences.

First, tradition. Since the early days of enterprise computing on Intel hardware, the SCSI bus has reigned supreme. When the need arose for an interconnect able to connect more than two nodes to the same storage bus, Fibre Channel won out. SCSI and FC allowed attaching many disks to a single bus; ATA allowed for... 2. The ATA spec was for, ahem, desktops. That's a mindset that is hard to break.

As for protocol differences, SAS handles massive device counts far better than plain SATA can. SAS allows a single device to present multiple virtual devices to a server, where SATA can't. So in enterprise storage, especially shared storage, SAS provides the level of virtualization a flexible storage environment needs. What's more, if a storage environment includes non-disk devices, such as a tape library with its associated robotics, SATA has no way of handling them. And finally, SAS has richer error descriptors than SATA, which improves the ability to deal with errors (depending on implementation, of course).

Combine the two, and you get a marketplace where everyone in the enterprise market already has SAS. While SATA drives can plug into a SAS backplane, why bother with SATA when you can use SAS? To my knowledge there is nothing stopping the storage vendors from offering 10K or even 15K SATA drives; I know people who'd love to use those drives at home. 10K SATA drives do exist (the Western Digital VelociRaptor), though I only know of that one maker marketing them.

The storage makers impose the performance restriction. Disks can be big, fast, cheap, or error-free, and you can only pick two of those attributes. SATA drives are, in general, aimed at the desktop and mid-market storage segments. Desktop-class drives are Big and Cheap. Mid-market storage-class drives are Big and Error-Free. Enterprise and high-performance-computing drives are Fast and Error-Free. The storage vendors have decided that SATA=Big, and SAS=Fast. And since Fast is what determines your random I/O performance, that is why SAS beats out SATA in random I/O.

Now you know.

New RAID

If you're reading this blog, I'm assuming you know WTF RAID1 and RAID5 are, and can make a solid guess about RAID10, RAID01, and RAID50. In case you don't, the quick run-down (with a placement sketch after the list):

  • RAID10: A set of mirrored drives which are then striped.

  • RAID01: A set of striped drives, which are then mirrored.

  • RAID50: A set of discrete RAID5 sets, which are then striped.
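To make the stripe/mirror ordering concrete, here's a minimal placement sketch; the four-disk geometry, the disk labels, and the two-way layout are assumptions purely for illustration:

```python
# Minimal placement sketch for a four-disk layout. The disk labels and the
# two-way mirror / two-way stripe geometry are assumptions for illustration.

def raid10_placement(chunk, disks=("d0", "d1", "d2", "d3")):
    """RAID10: mirror pairs (d0,d1) and (d2,d3), striped across the pairs."""
    return (disks[0], disks[1]) if chunk % 2 == 0 else (disks[2], disks[3])

def raid01_placement(chunk, disks=("d0", "d1", "d2", "d3")):
    """RAID01: stripe sets (d0,d1) and (d2,d3), mirrored as wholes."""
    member = chunk % 2                  # position within each stripe set
    return (disks[member], disks[2 + member])

for chunk in range(4):
    print(f"chunk {chunk}: RAID10 -> {raid10_placement(chunk)}, "
          f"RAID01 -> {raid01_placement(chunk)}")
```

The placement difference is what matters when drives fail: in RAID10 a second failure only kills the array if it hits the surviving half of the same mirror pair, while in RAID01 the first failure takes its entire stripe set offline.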

The new kid on the block, though not on the RAID spec-sheets, is RAID6. RAID6 is RAID5 with a second parity stripe-set. This means it can survive a double disk failure. You didn't really see RAID6 much until SATA drives started to penetrate the enterprise storage marketplace, and there is a good reason for that.
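For a feel of how the parity works: RAID5's single parity chunk is a plain XOR across the data chunks in a stripe, which is what lets it rebuild any one missing chunk. RAID6 adds a second, independently computed syndrome (typically Reed-Solomon, not shown in this sketch) so two missing chunks can be rebuilt. The chunk values below are made up for illustration:

```python
from functools import reduce

# RAID5's parity chunk is a plain XOR across the stripe's data chunks, so
# any ONE missing chunk can be rebuilt by XORing everything that survives.
# RAID6 adds a second, independently computed syndrome (typically
# Reed-Solomon, not shown here) so that TWO missing chunks can be rebuilt.

def xor_parity(chunks):
    return reduce(lambda a, b: a ^ b, chunks)

data = [0x11, 0x22, 0x33, 0x44]    # made-up data chunks in one stripe
parity = xor_parity(data)          # what gets written to the parity chunk

lost = 2                           # pretend the disk holding chunk 2 died
survivors = [c for i, c in enumerate(data) if i != lost]
rebuilt = xor_parity(survivors + [parity])
assert rebuilt == data[lost]       # 0x33 comes back from the wreckage
```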

Back when SATA started showing up in enterprise markets, the storage makers hadn't yet managed to get the build quality of their SATA drives up to the standards they'd set for their SCSI drives. Therefore, they failed more often. And since SATA drives were a lot bigger than their SCSI and FC brethren, a higher error rate meant a vastly higher chance of hitting an error during a RAID5 rebuild. Thus RAID6 entered the marketplace as a way to sell SATA drives and still survive their still-intrinsic faults.

These days SATA drives aren't as bad as they used to be, but the storage vendors are still wary of sacrificing the 'Cheap' in their products in the quest for lower error rates. The reason the mid-market drives are 'only' at 750GB while the consumer-grade drives are hitting 2TB is that very error-rate problem. An 8-drive RAID5 array of those consumer-grade 2TB disks gives a niiiice size of 14TB, but the odds of it being able to rebuild after replacing the bad drive are very, very small. A 20-drive RAID5 array of 750GB mid-market drives (yes, I know a 20-drive RAID5 array is a bad idea, bear with me please) gives the same size, but has a far higher chance of surviving a rebuild.
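You can put rough numbers on that rebuild risk. Assuming the commonly quoted unrecoverable-read-error (URE) rates of one per 10^14 bits for consumer drives and one per 10^15 bits for enterprise-class drives (assumptions for this sketch; check your drive's datasheet), the probability of reading every surviving bit cleanly during a RAID5 rebuild looks like this:

```python
# P(clean rebuild) = (1 - URE rate) ** bits_read, where bits_read covers
# every bit on the surviving drives. The URE rates are the commonly quoted
# datasheet figures (1e-14 consumer, 1e-15 enterprise) -- assumptions here.

TB_BITS = 1e12 * 8                        # bits in a (decimal) terabyte

def rebuild_survival(drives, drive_tb, ure_per_bit):
    bits_read = (drives - 1) * drive_tb * TB_BITS  # RAID5 reads all survivors
    return (1 - ure_per_bit) ** bits_read

print(f"8 x 2TB consumer (URE 1e-14):      "
      f"{rebuild_survival(8, 2.0, 1e-14):.0%} chance of a clean rebuild")
print(f"20 x 750GB mid-market (URE 1e-15): "
      f"{rebuild_survival(20, 0.75, 1e-15):.0%} chance of a clean rebuild")
```

Under these assumptions the consumer array fails its rebuild roughly two times out of three, while the mid-market array succeeds about nine times in ten, and real-world rebuild stress only makes the picture worse.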

Storage Area Networking

Storage Area Networks came into the Intel datacenter on the heels of clustering, and soon expanded to include storage consolidation. A SAN allows you to share storage between multiple hosts, which is a central part of clustering for high availability. It also lets a single storage array provide storage to a number of independent servers, so you can manage your storage centrally as a sharable resource rather than as a static resource assigned to each server at time of purchase.

Fibre Channel was the technology that really launched the SAN into the wide marketplace. Earlier technologies allowed sharing between two servers (SCSI), or involved now-obscure interconnects borrowed from mainframes. Fibre Channel managed to hit the right feature set to make it mainstream.

Fibre Channel had some nice ideas that have been carried forward into the newer iSCSI and SAS protocols. First and foremost, FC was designed for storage from the outset. The packet size was specified with this in mind, and in-order arrival was a big focus. FC beat the pants off of Ethernet for this kind of thing, a difference that showed up clearly during the early iSCSI years. What's more, FC was markedly faster than Ethernet, and supported higher data contention before bottlenecking.

The Ethernet-based iSCSI protocol came about as a way to provide the benefits of a SAN without the eye-bleeding cost-per-port of Fibre Channel. The early years of iSCSI were somewhat buggy, for reasons that can be summarized by a quote from a friend of mine who spent a while building embedded firmware for iSCSI NICs:

"An operating system makes assumptions about its storage drivers. All data will be sent and received in order. They don't handle it well when that doesn't happen, some are worse than others. So when you start basing your storage stack on a network [TCP/IP] that has out-of-order arrival as an assumption, you get problems. If you absolutely have to go iSCSI, which I don't recommend, go with a hardware iSCSI adapter. Don't go with a software iSCSI driver."

That advice is over five years old, but it's illustrative of the time. These days operating systems have caught up to the fact that storage I/O can arrive out of order, and the software stacks are now much better than they were. In fact, if you can structure your network for it (increasing the MTU of your Ethernet from the standard 1500 bytes to something significantly larger than 4096 bytes, a configuration known as 'jumbo frames'), iSCSI provides a very simple and cheap way of getting the benefits of Storage Area Networking.
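If you do go the software-iSCSI route on Linux, it's worth sanity-checking that jumbo frames are actually in effect on the interface carrying storage traffic. A minimal sketch; the interface name, the 9000-byte target, and the sysfs path are Linux-specific assumptions:

```python
from pathlib import Path

# Linux exposes an interface's MTU at /sys/class/net/<iface>/mtu. The
# interface name "eth0" and the 9000-byte jumbo-frame target are
# assumptions -- adjust both for your environment.

IFACE = "eth0"
TARGET_MTU = 9000

mtu = int(Path(f"/sys/class/net/{IFACE}/mtu").read_text())
if mtu >= TARGET_MTU:
    print(f"{IFACE}: MTU {mtu}, jumbo frames look good for iSCSI")
else:
    print(f"{IFACE}: MTU {mtu}, jumbo frames NOT enabled; expect degraded iSCSI")
```

Remember that every switch port and the target's NIC need the larger MTU too; a single standard-MTU hop in the path quietly defeats the whole exercise.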

Serial Attached SCSI is taking the same idea as iSCSI: make a SAN out of cheaper parts. Unlike Fibre Channel, SAS is strictly a copper-based cabling system, which restricts its distance; think of it as a rack-local or row-local SAN. However, if you want a lot of storage in a relatively small space, SAS can provide it. SAS switches similar to Fibre Channel switches are already on the market and can connect multiple hosts to multiple targets. Like Fibre Channel, SAS can also connect tape libraries.

Something new has been coming for a while: Fibre Channel over Ethernet, or FCoE. Unlike iSCSI, FCoE is not based on TCP/IP; it is an Ethernet-layer protocol. The prime benefit of FCoE is doing away with the still-very-expensive-per-port Fibre Channel ports in favor of standard Ethernet ports. It will still require some enhancements on the Ethernet network (part of why ratification has taken this long is standardizing exactly what needs to be done), but it should be markedly cheaper to implement and maintain than traditional Fibre Channel. Unsurprisingly, Cisco is very into FCoE, and Brocade is somewhat lukewarm.


This post is one of the Know your I/O series:

  • Know your I/O: Access Patterns

  • Know your I/O: The Components

  • Know your I/O: The Technology

  • Know your I/O: Caching

  • Know your I/O: Putting it together, Blackboard

  • Know your I/O: Putting it together, Exchange 2007 Upgrade
