January 2013 Archives

Text-mode email preferences

By SysAdmin1138 on January 29, 2013 10:15 AM | 1 Comment

Or, those bloody pine/mutt users.

In my experience there are two largish pools of residual ASCII-only email users:

People who first got on email back when ASCII-only was the only option and have never bothered to update.
People who first got on email back when ASCII-only was the only option, and vehemently hate HTML in email (it's dangerous!) and actively resist changing their reading mode.

This is one of the areas where sysadmins tend to show off their inherent conservatism, though I've noticed that this twitch only extends to sysadmins who were around back in the ASCII-only Internet (very roughly, anyone born after 1980 is not likely to be such a person). A lot of us considered our preferences vindicated in the wake of such memorable events as Melissa and Anna Kournikova that completely broke whole email systems, as text-mode users were effectively immune to them(1). We may have a reputation as gadget fiends, but a sizable percentage of us only use GUIs when absolutely unavoidable (Windows Core was made for these people).

With the advent of ubiquitous webmail, and with the cost of doing email right increasing every year and forcing ever-larger companies to outsource their email handling to Google or Microsoft, the HTML-in-email boat has sailed. Screen readers, and outright mail readers for the blind, used to be an argument against HTML in mail, but even those can handle it in these advanced-computing times. Sure, things like bullets are hard to read, but then, text-mode bullets are just as hard:

The problems from this weekend were:

 * Too many cooks in the kitchen.
 * One too many recipe books. More than one in fact.
 * An out-dated copy of the recipe in someone's hands.
   * This resulted in too little salt.
   * Frank refuses to ever eat anything we make again.
 * Who forgot the plates??

Where the problems come in is using advanced formatting, such as you find in all of those 'newsletters' you get whenever you sign up for a site for any reason. Those are barely intelligible to such readers. Mail with bolding, italics, underlines, lists and indents is quite machine-readable now.

This brings up another sub-set, the text-rendering preference: the mail-reader is quite able to handle HTML but is set to display the plain-text part if one exists. This is most commonly experienced in organizations that allow email "stationery". In my experience, these are not HTML-deniers by and large; they just hate crimes against good formatting-sense(2).
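For the curious, that "show the plain-text part if one exists" preference is easy to sketch with Python's standard `email` package. This is only an illustration of the idea, not how any particular mail-reader does it; the sample message, the addresses, and the `preferred_body` helper are all made up for the example.

```python
from email import message_from_string, policy

# A made-up multipart/alternative message carrying both renderings.
RAW = """\
From: newsletter@example.com
To: reader@example.com
Subject: Weekly update
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/plain; charset="utf-8"

Plain-text version of the update.
--BOUNDARY
Content-Type: text/html; charset="utf-8"

<html><body><p>HTML version of the <b>update</b>.</p></body></html>
--BOUNDARY--
"""


def preferred_body(raw, prefer_plain=True):
    """Return (subtype, text) for the body part the reader prefers."""
    # policy.default gives us the modern EmailMessage API, including get_body().
    msg = message_from_string(raw, policy=policy.default)
    order = ("plain", "html") if prefer_plain else ("html", "plain")
    part = msg.get_body(preferencelist=order)
    if part is None:
        return None, None
    return part.get_content_subtype(), part.get_content()
```

Calling `preferred_body(RAW)` picks the plain-text part; passing `prefer_plain=False` picks the HTML part instead. A message with only an HTML part would still come back as HTML either way, which matches the "if one exists" behavior.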

As with anything, the choices you make sometimes get wrapped up in the perception of your own identity. I saw this in my Novell days, I'm seeing some of it now in some Windows users, and I've been seeing it in myself when it comes to Linux usage. By now, continued use of text-mode mail-readers is the result of laziness or identity. As a result, they're not going to change without a lot of convincing.

(1): In my case, I was working for a company using Novell GroupWise which broke the propagation vector. Still HTML-in-mail, but also dodged those bullets. The smug in Novell circles was mighty those weeks.

(2): Which I understand. I was one for ages.

A new StackExchange: The Workplace

By SysAdmin1138 on January 27, 2013 5:08 PM

http://workplace.stackexchange.com/

It's in beta, not released, but kinda right up my alley. I've been giving office-politics advice across the dinner table, conference table, beers, hallway, and the water-cooler for a long time now, and now they have a SE site where that kind of advice is right at home. Awesome. Their scope isn't well defined yet, but this is the kind of place to ask questions that are manifestly off-topic on ServerFault.

Questions like "what methods are useful for convincing a skeptical management that this technical project needs funding?" I've been fighting that fight since 2001, so I have a variety of methods I've used to climb that mountain. It's also the kind of fight that Senior Sysadmins get to have, but can't ask about on SF.

I have hopes for this site.

I also have fears. There are two diametrically opposed forces facing this site:

It's aimed at Professionals, who need help navigating workplace issues.
The internet isn't as anonymous as it once was, and higher-ups don't take their underlings airing the company dirty laundry on the Internet very well.

The second point rings rather true for me. It is very much not hard to trace this handle back to my real name, and my LinkedIn profile to get a short list of companies I just might be talking about if I ask a question. Once they have my nick associated with my company, suddenly I'm in Corporate Speechistan where it can mean my job if I say the wrong things.

For an example of how this can go bad, take this poor chap. He's stumbled across a co-worker who is potentially thieving company resources, but he isn't sure that's what's happening, and even if it is he doesn't know how to respond. I can say with confidence that none of the managers I've ever worked for would be pleased to see this associated with where I was working, and all of them would be saying, "You really should have come to me with that."

I'm perfectly happy giving advice to people having problems in the office, and I'm even happy to do it on a site as well indexed as the SE sites are. I just won't be asking questions there.

Which is unfortunate. According to their Area51 stats, they're doing excellently well in every metric but question-rate. There is a reason for this.

Multi-disk failures: follow-up

By SysAdmin1138 on January 14, 2013 8:00 AM

By far the biggest criticisms of that piece are the following two ideas.

That's what the background scan process is for. It comes across a bad sector, it reallocates the block. That gets rid of the bad block w-a-y early so you don't ever actually get this problem.

And

That never happens with ZFS. It checksums blocks so it'll even recover the lost data as it's reallocating it.

Which are both very true. That's exactly what those background scanning processes are for, to catch this exact kind of bit-rot before it gets bad enough to trigger the multi-disk failure case I illustrated. Those background processes are important.

Even so, they also have their own failure modes.

Some only run when externally initiated I/O is quiet, which never happens for some arrays.
Some run constantly, but at low I/O priority. So for very big storage systems, each GB of space may only get scanned once a month if that often.
Some run just fine, thank you; they're just built wrong.
- They only mark a sector as bad if it completely fails to read it; sectors that read just fine after the 1st or 2nd retry are passed.
- They use an ERROR_COUNTER with thresholds set too high.
- Successful retry-reads don't increment ERROR_COUNTER.
- Scanning I/O doesn't use the same error-recovery heuristics as Recovery I/O. If Recovery I/O rereads a sector 16 times before declaring defeat, but Scanning only tries 3 times, you can hit an ERROR_COUNTER overflow during a RAID Recovery you didn't expect.
Some are only run on-demand (ZFS), and, well, never are. Or are run rarely because it's expensive.
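To make the "built wrong" case concrete, here is a toy model of a scanner with an error counter. It is a sketch under my own assumptions, not any vendor's actual logic: the `ScanPolicy` fields, the threshold values, and the sample sector data are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScanPolicy:
    max_attempts: int        # read attempts the scanner makes per sector
    count_soft_errors: bool  # do successful retry-reads increment the counter?
    error_threshold: int     # counter value at which the disk gets failed


def scan_disk(attempts_needed, policy):
    """Scan a disk and return (error_counter, disk_failed).

    attempts_needed[i] is how many read attempts sector i needs before it
    returns data (1 means it reads cleanly on the first try).
    """
    error_counter = 0
    for needed in attempts_needed:
        if needed > policy.max_attempts:
            # Hard error: the scanner gave up on this sector.
            error_counter += 1
        elif needed > 1 and policy.count_soft_errors:
            # Soft error: the read succeeded, but only after retries.
            error_counter += 1
    return error_counter, error_counter >= policy.error_threshold


# Hypothetical disk: mostly healthy, with a small cluster of marginal sectors.
SECTORS = [1] * 95 + [2, 2, 3, 5, 10]

LENIENT = ScanPolicy(max_attempts=3, count_soft_errors=False, error_threshold=3)
STRICT = ScanPolicy(max_attempts=3, count_soft_errors=True, error_threshold=3)
```

With these invented numbers, `scan_disk(SECTORS, LENIENT)` counts 2 errors and passes the disk, while `scan_disk(SECTORS, STRICT)` counts 5 and fails it. Same disk, same scan, different verdict, purely because of whether successful retries count.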

I had mentioned I had seen this kind of fault recently. I have. My storage systems use just these background scanning processes, and it still happened to me.

Those background scanning processes are not perfect, even ZFS's. It's a balance between the ultimate paranoia of "if there is any error ever, fail it!" and the prudence of "rebuilds are expensive, so only do them when we need to." Where your storage systems fall on that continuum is something you need to be aware of.

Disks age! Bad blocks tend to come in groups, so if each block is only getting scanned every few weeks, or worse every other month, a bad spot can take a disk out well before the scanning process detects it. This is the kind of problem that a system with 100 disks faces; back when it was a 24-disk system things worked fine, but as it grew and I/O loads increased those original 24 disks aren't scanned as often as they should be.
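The arithmetic behind that is simple enough to sketch. The function below assumes the scanner gets a fixed aggregate throughput budget for the whole array, which is my simplification; the disk size and scan rate used in the note afterwards are invented round numbers, not measurements from my systems.

```python
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1_000_000  # decimal units, as disk vendors count them


def full_scan_days(disk_count, disk_size_tb, scan_mb_per_s):
    """Days needed for one complete pass over every block in the array.

    Assumes a fixed aggregate scan rate (scan_mb_per_s) shared by all
    disks, so adding disks stretches the interval between visits.
    """
    total_mb = disk_count * disk_size_tb * MB_PER_TB
    return total_mb / scan_mb_per_s / SECONDS_PER_DAY
```

Assuming 2 TB disks and a 50 MB/s scan budget, `full_scan_days(24, 2, 50)` comes out to roughly 11 days, while `full_scan_days(100, 2, 50)` is roughly 46 days. The scan budget didn't change; the array just outgrew it.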

As I said at the end of the piece, this only touches on one way you can get multi-disk failures. There are others, definitely.

Is going physical, instead of cloud, a good option for startups?

By SysAdmin1138 on January 11, 2013 9:10 AM

Inspired by this closed post over on programmers-stackexchange.

Pretty much, for an application that is resource intensive, is going physical a good idea for startups?

Disclaimer: The application I'm working on right now is such a 'resource intensive' application, and we've gone physical (well, more hybrid) and we're a startup. But sort of not a startup. The company was founded way back in 2004ish and the application they built back then didn't have any web presence. Also, it was 2004; the Cloud wasn't even a word yet. Factor in a couple of other issues (like: the cloud vendors at the time couldn't do what we needed them to do) and physical made a lot of sense for us. The answer to this question for an already-existing company that already has a physical-based product but is looking to radically change that product will be different than for a group of four people with a big idea and some cash.

The question of whether or not to stay with what we're doing came up a lot during the building process of our new App, so I've given this issue a heck of a lot of thought. Therefore, I have Opinions about it.

There is a reason that conventional wisdom for startups is to do it all in the cloud somewhere. Several reasons, actually.

When you're building a product, it's nigh impossible to predict how much hardware you're going to need so the flexibility allowed by the cloud is extremely attractive.
If things crash and burn, having all of your compute in the cloud means fewer assets to liquidate.
You don't need a hardware expert, just OS-experienced people.
You don't need to find a location for the hardware.
If you manage to get a rocket-launch, you can make your infrastructure bigger by throwing money at it a lot faster than you could if you were physical-based.

All very valid reasons.

Before I get into the details, it's important to note that there are three levels of going physical:

Renting Servers from a colo or managed services provider. Which is the option the P.SE questioner was asking about.
Renting Space from a colo and filling it with your own gear. Which is what we do. But then, even Amazon goes this route.
Building your own datacenter on company property. This covers everything from my old job at WWU to Google. And probably Amazon in some regions/AZ's.

There are a few edge cases where cloud becomes less of a good idea, and the P.SE asker has one of them: the application under development works best on configs that the cloud-vendors consider unusual. For instance, if whatever you're doing requires GPU processing, the cloud options for that are rather scanty right now.

The cloud vendors provide virtual server configs for a wide range of use-cases that they believe cover most expected needs. If whatever you're working on doesn't fit into their idea of "most", then physical begins to look attractive.

Need lots and lots of RAM but not that much CPU horse-power?

Need GPU, or multiple GPU?

Need extremely fast I/O?

Some cloud vendors can handle all of these, but the prices can be rather off-putting. But is it worth it to muck around with physical?

Buying your own servers, configuring them how you need, then shipping them off to an MSP gives you a good environment for your hardware and professional hardware techs when something goes wrong. It still costs, of course, but you get just what you need. It is entirely possible to beat the cloud costs going this route. But, you do lose the flexibility cloud gives you.
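If you want to put a number on "beat the cloud costs", a back-of-the-envelope break-even is the place to start. This sketch is deliberately crude: it ignores staff time, hardware refresh, and the value of flexibility, and every dollar figure in the note below is a placeholder I made up, not a quote from any vendor.

```python
import math


def break_even_months(hardware_cost, hosting_per_month, cloud_per_month):
    """Months until buying hardware and paying an MSP beats renting cloud.

    Returns None when the monthly hosting fee is not lower than the cloud
    bill, because then the up-front purchase is never paid back.
    """
    monthly_saving = cloud_per_month - hosting_per_month
    if monthly_saving <= 0:
        return None
    return math.ceil(hardware_cost / monthly_saving)
```

With placeholder figures of 12,000 up front, 300 a month at the MSP, and 1,300 a month for the equivalent cloud config, `break_even_months(12_000, 300, 1_300)` gives 12 months. Whether your startup will still need that exact config in 12 months is the part the math can't answer.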

Going the route we did, rent space at a colo and install purchased hardware, is something that I'd only recommend when building your production environment. Or if funding isn't an issue, I'd recommend going there early on in the development cycle to start working on deployment issues. This is the option that does need a hardware expert. My company has me, and you can hire people like me either full-time or on a part-time basis through contracting companies. When you hit go-live, you will want full-time coverage though, so plan for that.

Going the self-hosting route is an even more complex decision to make. Plenty of startups began with a few servers in a basement or living-room of someone's house, but that doesn't scale (housemates rapidly become tired of the noise). Building a server-closet in the office (or home-office, such as those startups that began in 2004, ahem) is another route plenty of startups take, but scale issues continue; that closet is going to get hot if more than a few big servers get in there. Going this route means you're also assuming the environmental control aspects of managing computing hardware in addition to the computing hardware itself.

Which is several ways of saying:

If your case is special enough, it can overcome some of the negatives associated with the physical tiers.

Our case is just special enough, if you add in the physical infrastructure we already had.

Someone just starting out won't have that. The special will have to be more special than what we're doing. Only you can figure that out, but there are a few warning signs that you just might be that special:

Your application requires server builds that are either very expensive, or not present in the cloud providers.
The regulatory regime you're planning to operate under doesn't trust the Cloud yet.
You need extremely reliable I/O performance (though this is getting better in cloudistan)
You need extremely fast I/O performance, such as provided by SSDs.

THEN you may want to go physical!

Or embrace the power of "and": go hybrid!
