May 2010 Archives

I'm on vacation right now, so posting is light. However, that still isn't stopping my subconscious from bringing work here. Therefore, I NEED this vacation. Anyway.

This morning just before waking up I had a dream. I was in a car, driving, with another person. On our way to somewhere I needed to drop off a letter at the post-office. Whilst driving there my brain was trying to figure out how that would work. Post office.. port 25... obviously that would be a teller window with 'port 25' over it. Right?

Right?

Er, no. Even in its dream-fogged state my brain realized that wasn't how it was supposed to work, so when we got there I dropped said letter into a regular old blue post-box.

Which in turn brought visions of SMTP routing and spam filters.

*headdesk*

This vacation. I neeeeeds it.

More than OpenFiler

I've received better requirements than I had before, and OpenFiler by itself doesn't meet them. The requirements are, roughly:
  • Must support both file-based and block-based storage serving.
  • Must have some kind of non-hierarchical backup capability.
  • Able to create a mirror copy of the storage in a remote location.
This distills down to:
  • Must support both iSCSI and SMB serving.
  • Must have snapshots, or some other copy-on-write technology.
  • DRBD or some other replication technology.
Since OpenFiler's SMB integration just doesn't work in our environment, I can't use just that. Also, Samba's habit of requiring an smb-daemon reset to add shares makes it annoying to work with. We can't risk pissing off the Access database users (not to mention PST users), who'd be most peeved at having to do DB recovery on their files after a reset. Nothing a little change-management can't fix, but our users are already used to instant gratification.

Another option, less free, is to use a combination of Windows Server 2008 and KernSafe iStorage. It has the features we need, and the entire environment is still cheaper per GB than the other storage options we already have.

A second option is to combine OpenFiler in pure iSCSI mode with a Windows Server 2008 instance in the ESX cluster to front-end the iSCSI storage for SMB sharing. This has its problems as well, as filers are memory hungry, and we're currently bandwidth-constrained in the ESX cluster (this is changing, but we're still a month or two out from fixing that). Once you amortize utilized resources for this ESX-based filer you get a price that's pretty close to the KernSafe/Windows combo, if not a bit more expensive.

I'm open to other ideas, but in the meantime KernSafe's free option has enough of the right features that I can at least test the thing.

A network problem

I have a server attempting to talk SMTP to our internal smart-host, but it seems our hardware load-balancer is getting in the way. When sniffing the switch-port the server is on, the conversation goes like this:

Server -> Mailer [SYN]
Mailer -> Server [SYN, ACK]
Server -> Mailer [ACK]
Mailer -> Server [RST, ACK]
[3 seconds pass]
Mailer -> Server [SYN, ACK]
Server -> Mailer [RST]
[6 seconds pass]
Mailer -> Server [SYN, ACK]
Server -> Mailer [RST]

What's going on here?

Well, the first three packets are the classic TCP three-way handshake. The Mailer then issues a Reset-Acknowledge packet, which shuts down the conversation. Then things get weird. Three seconds pass, and the Mailer retransmits the second packet. The Server, having shut down the TCP conversation like it was told to in the 4th packet, just issues a RESET packet telling the sender there is no connection to ACK and to stop trying. This repeats 6 seconds later.
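
One detail worth noting about the timing: gaps of 3 and then 6 seconds are what a doubling retransmission timer with a 3-second starting value would produce, and 3 seconds was the conventional initial TCP retransmission timeout at the time (RFC 1122). That reading is an assumption layered on top of the capture, not something the trace confirms, and the function and variable names below are made up for illustration. A minimal sketch in Python:

```python
def retransmit_gaps(initial_rto=3.0, retries=2, factor=2.0):
    """Gaps, in seconds, between successive retransmissions under
    exponential backoff. The 3-second start and doubling factor are
    assumptions, not values read from the capture."""
    return [initial_rto * factor ** attempt for attempt in range(retries)]


# Gaps seen in the sniffer trace between the repeated [SYN, ACK] packets.
observed_gaps = [3.0, 6.0]

# True if the observed gaps fit a plain doubling backoff.
print(retransmit_gaps() == observed_gaps)
```

If the gaps do fit, whatever is sending those SYN-ACKs is behaving like a TCP stack that never saw the handshake complete, which is one more reason to get a sniffer on the Mailer side.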

So how did the Mailer forget it had torn down the TCP connection? That is the mystery. I haven't had a chance to get a sniffer on the Mailer side of things yet, so I'm not certain what it's seeing. It could be the load-balancer is throwing a fit, and the follow-on packets at 3 and 6 seconds are from the Mailer server itself somehow.

Strange things.

Worst-case thinking

Worst-case thinking is something that Sysadmins are kind of prone to. We all know what level of disaster would cause us to lose everything, and it's not a good feeling. At my last job I was asked once what my worst-case scenario was. And it was a truck-bomb in the wrong spot that would cause our datacenter to suddenly drop a few floors, as well as do serious damage to most of our offices (and note, this was asked AFTER 9/11).

Fixing that was easy: don't allow traffic on that road. But that wasn't an option for us. So we just lived with it.

Having been around enough people who worry about this, I know how the thinking goes: if we mitigate the worst case, we also mitigate the bad cases. Let's take a look at this, shall we?

If we HAD been able to stop traffic on that road, it would have done nothing for certain other just as costly incidents. A direct hit by a tornado would render the building structurally uncertain for a week or two as the engineers assessed its soundness, and that would cost us quite a lot thank you. A sprinkler release on the floor above the datacenter could cause water to fall into the datacenter, which would be bad. A fire on the same floor as the DC would cause a sprinkler release in the datacenter (no FM-200 system there!) and short a bunch of stuff out. None of this would have been mitigated by stopping traffic on that one road.

WWU is the kind of enterprise where physical presence is required for most of our business. That narrows the disasters worth planning for to the ones that would limit our ability to teach without also affecting the classrooms themselves. As it happens, cutting two fiber runs would stop most network-based instruction, so that's the disaster we plan for. This building sinking into the bog it was built on is... a dark fantasy, and only likely in the kind of earthquake that'd also do serious damage to campus itself.

So yes. Good risk-management involves looking at the probable risks, not fixating on the worst-case risks and hoping good overall coverage follows from that.

A note of small interest

Today I discovered that Movable Type, the blogging software I'm using, has a CAS plugin. The only reason I care is because WWU uses CAS as our single-sign-on solution for any web-service we provide (that in turn can be CASified). Awww.

I've been playing around with OpenFiler the last week. It seems to fit our need for a free-to-us software package that allows us to serve both CIFS and iSCSI from the same host, in an easy to manage package. I haven't done much serious testing with it, but I have done enough to get a feel for how it works.

One thing is pretty clear: if we join this thing to the domain, certain UI elements become unusable due to timeouts building the page. Because we have so many groups in our AD tree, and because it has to list every single group in the system in one big pick-list on the Share Permission screen, that page takes a very very long time to load. Long enough that it won't show the network-based permissions dialog at the bottom of the page, which is critical for enabling CIFS sharing in the first place. Unless I can find a tweak somewhere, that's a pretty serious road-block for CIFSiness.

iSCSI, on the other hand, just flies like a dream. I haven't had a chance to try out real complexity with it, since I lack enough test servers with GigE NICs capable of an MTU larger than 1500 bytes, so I can't say how robust it is. But I can say that I can saturate the GigE NIC in the OpenFiler box.

This does suggest a solution, though. We'll need another (%!#$!) server, upon which we'll install a Winders of some flavor and use an iSCSI presentation for the storage. Or, if we feel like we need more hand-holding in our lives, a Linux box of some flavor and hand-roll the Samba config needed.

This thing can also do NFS, but we have limited demand for that. The same for 1980's style FTP. There is also a WebDAV option, but I shiver at the notion of turning that on; the WebDAV setup in our existing file-cluster has already caused enough hair loss thank you.

It can also do snapshots. Since this thing is Linux based, these are LVM-level snapshots. That could be useful.

File-systems are restricted to Ext3 and XFS, which is good to a point. These are not the filesystems you want for multi-million-file shares. However, if all you want is bulk storage for disk images, they're just peachy. Or a departmental share space (hundreds of thousands of files). Neither of these is terribly great at handling the "bajillion files in one directory" problem, but we have few of those as it is.

But if we can't figure out a way to make the CIFS sharing useful, file-system choice is mostly moot.

Anyway, more testing!

Not all Cat 5 is Cat 5

Yesterday I had a need for a 40 foot ethernet cable. I needed to connect two switches in different rack-rows, and needed to follow the cable-trays. That's not a size we use, well, ever, so finding one took some doing. But find one I did, and I rejoiced. It even had "Cat 5" on the side of it, as well as a slightly disturbing "100Mb" marking which just shows how old this particular cable was.

I got it chased through the floors, run through both racks, and connected the switches. Nada. No blinkenlights. I connected to the console port on one switch to see if it was somehow old enough that it had to be told about a crossover cable, but no dice. Nothing, at all, in the port diagnostics either.

Scratching my head, I looked at the cable end. And found that this particular cable only had 2 pairs of wires connected to the plug. Not 4. 2.

*headdesk*

I don't know where this thing was used before, but that ain't Ethernet. Or at least not any I can use.

And I still need a 40ft cable.

Screen size

XKCD made an observation last week:

[XKCD comic image]

I find it impressive because I can sit on my couch and watch well-detailed images, not upright in my dining room chair at the laptop, or staring down at the iPhone on the table or in my lap. Apparent pixel size makes a big difference here.

Let's take a look at the 24" Apple Cinema display. It has a 1920x1200 native resolution. If you put that at the arm's length recommended by ergonomists, it's 28" away for me. What does that mean? It means an apparent horizontal pixel width of 0.022274 degrees, that's what. And the math:

Actual horizontal width per pixel = 20.9" (the actual width of the screen) / 1920 pixels = 0.010885"
Distance to that pixel = 28"
Angle = arctan( 0.010885 / 28 ) = 0.022274 degrees

Let's say I have a 42" HD-TV at home that sits 8 feet from the couch. That's a 1920 horizontal resolution at 96 inches, giving an apparent horizontal pixel width of 0.011376 degrees, markedly smaller than the Apple display at 28".
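
The same calculation can be sketched in Python for anyone who wants to plug in their own screen. The function names are made up for this example, and the TV width is derived from its 42" diagonal on the assumption of a 16:9 panel; the 20.9" width for the Cinema display is the figure used above.

```python
import math


def width_from_diagonal(diagonal_in, aspect=(16, 9)):
    """Screen width in inches for a given diagonal and aspect ratio."""
    w, h = aspect
    return diagonal_in * w / math.hypot(w, h)


def apparent_pixel_deg(screen_width_in, horizontal_pixels, distance_in):
    """Angle, in degrees, that one pixel subtends at the viewing distance."""
    pixel_width_in = screen_width_in / horizontal_pixels
    return math.degrees(math.atan(pixel_width_in / distance_in))


# 24" Cinema display: 20.9" wide (figure from the text), 1920 px, 28" away.
cinema_deg = apparent_pixel_deg(20.9, 1920, 28)

# 42" HD-TV: width assumed from a 16:9 diagonal, 1920 px, 96" away.
tv_deg = apparent_pixel_deg(width_from_diagonal(42), 1920, 96)

print(f"Cinema display at 28 in: {cinema_deg:.4f} degrees")
print(f"42 in HD-TV at 96 in:    {tv_deg:.4f} degrees")
```

The results should land very close to the figures quoted above; the last digit or two can differ depending on how the intermediate values are rounded.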

An iPhone at 12 inches has an apparent pixel size of 0.034817 degrees. Just so you know.

Generally speaking, smaller apparent pixel sizes allow you to cram more detail into a given viewing angle. However, there are limits here. As human eyeballs age, their ability to distinguish very fine detail fades; a 16yo may be perfectly happy with a 1920x1200 monitor and 9pt type, but their 60-something grand-parents most definitely won't be and their parents would have to squint hard. 

When I was hunting for an HD-TV I found a few articles describing how far away from the TV you had to be to tell the difference between 720p and 1080p. What that distance was depended on two things: how old the viewer was, and how wide the screen was. For a 42" TV at 96 inches, only 5-year-olds can tell the difference between 720p and 1080p. If they're on the couch that is, and not parked on the floor 3 feet from the TV.

For comparison, a 720p 42" TV at 96" gives an apparent pixel size of 0.017066 degrees. A 1080p panel with the same apparent pixel size would have to be 54.9 inches wide (a 63" panel).
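
Running that comparison the other way round is a short sketch too: take the physical pixel size of the 720p panel, then ask how big a 1080p panel has to be for its pixels to be the same size. This assumes 16:9 panels with 1280 and 1920 horizontal pixels, and the helper names are invented for illustration.

```python
import math

ASPECT_W, ASPECT_H = 16, 9  # assumed aspect ratio for both panels


def width_from_diagonal(diagonal_in):
    """Screen width in inches for a 16:9 panel of the given diagonal."""
    return diagonal_in * ASPECT_W / math.hypot(ASPECT_W, ASPECT_H)


def diagonal_from_width(width_in):
    """Diagonal in inches for a 16:9 panel of the given width."""
    return width_in * math.hypot(ASPECT_W, ASPECT_H) / ASPECT_W


distance_in = 96

# Physical width of one pixel on a 42" 720p (1280 px wide) panel.
pixel_720_in = width_from_diagonal(42) / 1280
angle_720_deg = math.degrees(math.atan(pixel_720_in / distance_in))

# A 1080p panel with pixels of that same physical size, seen from the
# same distance, has the same apparent pixel size.
width_1080_in = pixel_720_in * 1920
diagonal_1080_in = diagonal_from_width(width_1080_in)

print(f"720p pixel at 96 in: {angle_720_deg:.4f} degrees")
print(f"Equivalent 1080p panel: {width_1080_in:.1f} in wide, "
      f"{diagonal_1080_in:.0f} in diagonal")
```

Since 1920 is one and a half times 1280, the equivalent panel is simply one and a half times the size in each dimension.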

I've read reports of display makers showing off HD+ TV displays with in excess of 2000 vertical pixels. These aren't really commercially available in no small part due to the lack of media available at that resolution, but also the fact that the panels would have to be very large indeed for the average consumer to notice a difference from 1080p in their actual living rooms. Eyeballs are so limiting.

So I am impressed with HD TV, even though I've been a daily user of display tech capable of more detail for years before I got one. Scaling displays up to that size takes work, and getting them affordable to the likes of me takes very high manufacturing quality. The fact they've done it is woot-worthy.

But don't get me started on 3D.