Wednesday, October 22, 2008

An old theme made new

Yesterday on Slashdot was a link to an article that sounds a lot like one I published two years ago tomorrow. The main point of the article is that between the unrecoverable-read-error rate of your standard SATA drive (1 in 10^14 bits, or one error per 12.5TB read) and the ever-increasing sizes of SATA drives, RAID5 arrays can get to 12.5TB pretty quickly. Heck, high-end home media servers chock full of HD content can get there very fast.

While it doesn't say this on the specs page for that new Seagate drive, if you look on page 18 of the accompanying manual you can see the same 1-in-10^14 "Nonrecoverable read error" rate I talked about two years ago. So, no improvement in reliability there. However... for their enterprise-class "Savvio" drives, they list a "Nonrecoverable Read Error" rate of 1 in 10^16 bits (one error per 1.25PB), which is better than the 1 in 10^15 (125TB) they were doing two years ago on their FC disks. So clearly, enterprise users are juuuust fine for large RAID5 arrays.
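To put numbers on that gap, here's my own back-of-the-envelope for the odds of hitting at least one unrecoverable error while reading a full 12.5TB array. It treats the spec-sheet rate as an independent per-bit probability, which is a simplification, but it shows the shape of the problem:

    # Back-of-the-envelope only: treats the spec rate as an independent
    # per-bit error probability, which is a simplification.
    def ure_chance(tb_read, rate_bits):
        """Odds of at least one unrecoverable read error while reading
        tb_read terabytes, given a 1-in-rate_bits URE spec."""
        bits = tb_read * 8 * 1e12   # decimal terabytes -> bits
        return 1 - (1 - 1 / rate_bits) ** bits

    for label, rate in [("consumer SATA, 10^14", 1e14),
                        ("FC two years ago, 10^15", 1e15),
                        ("Savvio, 10^16", 1e16)]:
        print(f"{label}: {ure_chance(12.5, rate):.1%} over a 12.5TB read")

At 10^14 that works out to about a 63% chance; at 10^16 it's about 1%. Two very different worlds for the same 12.5TB array.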

As I said before, the people who are going to be bitten by this are the ones running home media servers. Whiteboxed homebrew servers for small/medium businesses will be at risk too. So, those of you who have to justify buying the really expensive disks when there are el-cheapo 1.5TB drives out there? You can use this!



Tuesday, September 23, 2008

That darned 32-bit limit

Today I learned that the disk-space counters NetWare provides over SNMP use signed 32-bit integers. These stats are published in a table at OID .1.3.6.1.4.1.23.2.28.2.14.1.3. Having just expanded our FacShare volume past 2TB, it now shows negative space according to the monitors. It's a simple integer overflow: Novell is apparently using a signed integer for a number that can never legitimately be negative.
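A quick sketch of the failure, assuming the counter is denominated in kilobytes (which the 2TB rollover point strongly suggests; the volume sizes below are hypothetical):

    import struct

    def as_signed32(raw_kb):
        """Reinterpret an unsigned 32-bit counter the way a monitor
        that assumes a signed Integer32 will display it."""
        return struct.unpack('<i', struct.pack('<I', raw_kb & 0xFFFFFFFF))[0]

    # Hypothetical volume sizes; 2TB = 2**31 KB is exactly where
    # a signed 32-bit integer wraps negative.
    for tb in (1.9, 2.0, 2.5):
        kb = int(tb * 2**30)    # terabytes -> kilobytes (binary units)
        print(f"{tb}TB volume -> monitor shows {as_signed32(kb):,} KB")

Cross 2^31 KB and the number wraps straight to the bottom of the signed range, which is exactly the negative free space our monitors started showing.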

I've pointed this out in an enhancement request. This being NetWare, they may not choose to fix it if it's more than a two-line fix. We'll see.

This also means that volumes over 4TB can't be effectively monitored with SNMP at all: with the counter denominated in kilobytes, even an unsigned 32-bit integer tops out at 2^32 KB, which is 4TB. Since NSS can have up to 8TB volumes on NetWare, this could eventually be a problem. We're not there yet.



Monday, October 23, 2006

A needed bit of info about SATA drives

Anandtech ran an article recently about enterprise storage. In it they go over SATA vs. SCSI vs. SAS. Most of it I already knew, but towards the back was a kernel of information that I hadn't caught before.

We know that, generally speaking, SATA drives can't quite keep up with the same kinds of workloads that SCSI can, thanks to differences in the manufacturing process, quality control, and the like. I don't fully understand it, which irks me, but there it is. One of those differences is something called the 'nonrecoverable read error' rate.

Take a look at this Seagate drive. It's almost the last thing on the spec page: the Nonrecoverable Read Error rate is 1 bit in 10^14 bits, or 1 bad read per 12.5TB. Mainline SCSI and FC drives push that out to 1 in 10^15 or even 10^16. With the SATA drive, on average, every 12.5TB of data transferred includes an unreadable bit.

We don't see this as a problem in most enterprise situations because everything runs in some form of redundant array. When a drive reports a bad sector, the RAID5 logic (usually in the RAID controller) goes to the parity data to fill in the real value; RAID1 goes to the mirror. No biggie. The problem comes with RAID5 rebuilds, when the entire surviving array has to be read in order to regenerate the contents of the replaced disk. If you have 14 500GB drives in your RAID5 array, a rebuild means transferring around 7TB of data. If a bad bit shows up during the rebuild process (7TB against a 12.5TB error interval is a 56% expectation), game over. That's a from-tape restore.
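For the curious, here's the arithmetic behind that 56%, as my own sketch with the same caveat as above: it treats the spec rate as an independent per-bit probability. Strictly speaking, 56% is the expected number of errors; the chance of hitting at least one is somewhat lower:

    def rebuild_risk(array_tb, rate_bits=1e14):
        """Expected URE count, and the chance of at least one URE,
        for a rebuild that transfers array_tb terabytes."""
        bits = array_tb * 8 * 1e12              # decimal TB -> bits
        expected = bits / rate_bits             # the simple 7/12.5 figure
        at_least_one = 1 - (1 - 1 / rate_bits) ** bits
        return expected, at_least_one

    exp_errors, prob = rebuild_risk(7)          # 14 x 500GB RAID5 rebuild
    print(f"expected UREs: {exp_errors:.2f}, chance of at least one: {prob:.0%}")

Call it 0.56 expected errors, or a bit over a 40% chance of at least one. Either way, not odds I'd want riding on a 7TB rebuild.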

This is why systems such as RAID6 are showing up. RAID6 is a double-parity scheme, so a nonrecoverable read error during a single-disk rebuild doesn't take down the whole array. You lose two disks' worth of capacity to parity, but you can still run a 30-disk array without much risk.

One more reason why SATA isn't quite ready for realtime data applications. Nearline, yes, but not realtime. This'll play hob with our ideas for our BCC cluster.


