August 2009 Archives

Fabric merges

When doing a fabric merge with Brocade gear, when they say the zone configuration needs to be exactly the same on both switches, they mean it. The merge process does no parsing; it just compares the zone configs literally. If the metaphorical diff returns anything, it doesn't merge. So if one zone lists the same two nodes in a swapped order but is otherwise identical, it won't merge.
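
The compare really is that literal. A small sketch (with made-up WWNs and a simplified config format) of what a no-parsing, diff-style comparison does with two functionally identical zone configs that list their members in different orders:

```python
import difflib

# Two zone configs that are functionally identical, but list the
# members of the zone in a different order. WWNs are made up.
switch_a = """zone: esx_hosts
10:00:00:05:1e:aa:aa:aa
10:00:00:05:1e:bb:bb:bb
"""
switch_b = """zone: esx_hosts
10:00:00:05:1e:bb:bb:bb
10:00:00:05:1e:aa:aa:aa
"""

# A literal line-by-line diff, roughly what the merge check amounts to.
diff = list(difflib.unified_diff(switch_a.splitlines(),
                                 switch_b.splitlines(), lineterm=""))
print("merge ok" if not diff else "fabric segments: configs differ")
```

Reordered-but-equivalent members produce a non-empty diff, so the fabric segments instead of merging.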

Yes, this is very conservative. And I'm glad for it, since a failure here would have brought down our ESX cluster, and that's a very wince-worthy collection of highly visible services. But it took a lot of hacking to get the config on the switch I was trying to merge into the fabric exactly right.

LDAP result size

It turns out that Microsoft changed how it handles the MaxPageSize value in the domain controller LDAP policy. Our Server 2003 DC returns just over 34,000 objects for a certain query, but the Server 2008 DC in the same domain, and therefore subject to the same LDAP policy, returns around 20,000 objects. This is breaking things.

The question is: Is this a result of a new policy in Server 2008, or did Microsoft code in a new, lower max-value for this particular policy value? Whatever it is, the upper limit seems to be 20K. Don't know yet.

Trying KDE

In light of the announcement that openSUSE 11.2 and later will have KDE as the default desktop, I installed milestone 6 of 11.2 (barely beta quality, but much improved from earlier milestones) and had a look at KDE for the first time since, oh, SLE9. I should further note that the Slackware servers I've run at home since college have had GNOME by preference pretty much since GNOME became available.

My first reaction? It probably matched what a lot of long-time Windows users had when they saw Vista for the first time and wanted to do something other than run a web browser:
Dear God, I can't find anything, and none of my hotkeys work. THIS SUCKS!
In time I calmed down. I installed GNOME, and lo, it was good. I also noted certain settings that I needed on the KDE side. I switched back and explored some more. Found out where they kept the hotkeys. Found out where the sub-pixel hinting settings were hiding. This made my fingers and eyeballs happier.

Now that I've tried it for a while (in fact, I'm posting from it right now), it's not that bad. Another desktop dialect I can learn. I've gotten over the initial clueless flailing and have grasped the beginnings of KDE's basic metaphor. I still prefer the GNOME side, but we'll see where I go in the future.

On databases and security

Charles Stross has a nice blog post up about the UK DNA database, database security, and the ever-dropping price of gene sequencing and replication. The UK has a government DNA database of anyone ever booked for anything by the police. Because of how these things work, lots of entities have access to it for good reasons. Like the US No Fly List, being on it is seen as a black mark on your trustworthiness. He posits some scenarios for injecting data into the DNA database through wireless and other methods.

Another thing he points out is that the gear required to reproduce DNA is really coming down in price. In the not-too-distant future, it is entirely possible that the organized criminal will be able to plant DNA at the scene of a crime. This could result in anything from pranks ("How'd the Prime Minister get to Edinburgh and back to London in time to jiz on a shop window?") to outright frame jobs.

Which is to say, once DNA reproduction gets into the hands of the criminal element, it'll no longer be a good single-source biometric identifier. Presuming, of course, that the database backing it hasn't been perved.

Didn't know that

The integrated network card in the HP DL380-G2 doesn't have a Windows Server 2008 driver. Anywhere. And the forum post that says you can use the 2003 driver on it lies, unless there is some even sneakier way of getting a driver in than I know of.

This is a problem, as that's one of our Domain Controllers. But not much of one, since it's one of the three DCs in the empty root (our forest is old enough for that particular bit of now-discredited advice) and all it does is global-catalog work. And act as our ONLY DOMAIN CONTROLLER on campus. In the off chance that a backhoe manages to cut BOTH fiber routes to campus, it's the only GC up there.

Also, since it couldn't boot from a USB DVD drive, I had to do a parallel install of 2008 on it. That meant I still had my perfectly working 2003 install available, so I just dcpromoed the 2003 install and there we are!

Once we get a PCI GigE card for that server I can try getting 2008 working again.

SANS Virtualization

Mr. Tom Liston of ISC Diary fame is at the SANS Virtualization Summit right now. He has been tweeting it. I wish I were there, but there is zero chance of convincing my boss to send me, even if this were a year in which out-of-state travel was allowed.

Mostly just quotes so far, but there have been a few interesting ones.

"When your server is a file, network access equals physical access" - Michael Berman, Catbird

From earlier: "You can tell how entrenched virtualization has become when the VM admin has become the popular IT scapegoat" - Gene Kim

On VMsprawl: "The 'deploy all you want, we'll right click and make more' mentality." Herb Goodfellow, Guident.

I expect to see more as the week progresses.

WINS... the Windows Internet Name Service. Introduced, I believe, in Windows NT 3.5 to allow Windows name resolution to work across different IP subnets. NetBIOS relies on broadcasts for name resolution; WINS made it work across subnets by using a unicast query to the WINS server to find addresses. In theory, DNS in Active Directory (now nine years old!) replaced it.

Not for us.

There are two things that drive the continued existence of WINS on our network, and they will ensure that I'll be installing the Server 2008 WINS server when I upgrade our Domain Controllers in the next two weeks:
  1. We still have a lot of non-domained workstations
  2. Our DNS environment is mind-bogglingly fragmented
Here is a partial list of the domains we have, and this is just the ones we're serving with DHCP; there are a lot more, but I got bored making the list. The thing is, a lot of those networks, and especially the labs, contain 100% domained workstations. Since we only have the one domain, this means all those computers are in a flat DNS structure. In effect, each domained workstation on campus has two DNS names: the one on our BIND servers, and the one in the MS-DNS servers.

That said, for those machines that AREN'T in the domain, the only way they can find anything is to use WINS. We will be using it until the University President says unto the masses, "Thou Shalt Domain Thy PC, Or Thou Shalt Be Denied Service." Until then, WINS will continue to be the best way to find Windows resources on campus.

Legal key recovery

Remember this? About the UK's new law stating that failing to reveal decryption keys on demand could result in a jail sentence?

Well, it happened. We have yet to see what size of rubber hose is being used, but these two are being sized up.

One of the increasingly annoying things that IT shops have to put up with is web-based administration portals using self-signed SSL certificates. Browsers are increasingly making this setup painful, and for good reason. Which is why I try to get these pages signed with a real certificate whenever the product allows me to.

HP's Command View EVA administration portal annoyingly overwrites the custom SSL files when it does an upgrade, so you'll have to do this every time you apply a patch or otherwise update your CV install.
  1. Generate an SSL certificate with the correct data.
  2. Extract the certificate into base-64 form (a.k.a. PEM format) in separate 'certificate' and 'private key' files.
  3. On your command view server overwrite the %ProgramFiles%\Hewlett-Packard\sanworks\Element Manager for StorageWorks HSV\server.cert file with the 'certificate' file
  4. Overwrite the %ProgramFiles%\Hewlett-Packard\sanworks\Element Manager for StorageWorks HSV\server.pkey file with the 'private key' file
  5. Restart the CommandView service
At that point, CV should be using your generated certificates. Keep these copied somewhere else on the server so you can quickly copy them back in when you update Command View.
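
Step 2 is easy to fumble by hand. Here's a sketch of splitting a combined PEM bundle into the two pieces Command View wants; the function name and bundle layout are my own assumptions, not anything HP ships:

```python
def split_pem(bundle):
    """Split a PEM bundle string into (certificate, private_key) strings."""
    cert_lines, key_lines = [], []
    current = None
    for line in bundle.splitlines(keepends=True):
        # PEM sections are delimited by BEGIN/END marker lines.
        if "BEGIN CERTIFICATE" in line:
            current = cert_lines
        elif "BEGIN" in line and "PRIVATE KEY" in line:
            current = key_lines
        if current is not None:
            current.append(line)
            if "-----END" in line:
                current = None
    return "".join(cert_lines), "".join(key_lines)
```

Feed it the PEM bundle your CA hands back, then write the two halves over server.cert and server.pkey in the paths above.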

Non-paid work hours

Ars Technica has an article up today about workers who put in a lot of unpaid hours thanks to their mobile devices. This isn't a new dynamic by any means; we saw a lot of it crop up when corporate web-mail started becoming ubiquitous, and before that with the few employees using remote desktop software (PCAnywhere, anyone?) to read email from home over corporate dialup. The BlackBerry introduced the phenomenon to the rest of the world, and the smartphone revolution is bringing it to the masses.

My old workplace was union, so it was figuring out how to compensate employees for after-hours call-outs shortly after we got web-mail working. There were a few state laws and similar rulings that directed how it should be handled, and ultimately they decided on no less than 2 hours of overtime pay for issues handled over the phone, and no less than 4 hours of overtime pay for issues requiring a site visit. Yet there was no payment for being officially on-call with a mandatory response time; it was held that actually responding to the call was the payment. Even if being on-call meant not being able to go to a child's 3-hour dance recital.

Now that I'm an exempt employee, I don't get anything like overtime. If I spend 36 hours in a weekend shoving an upgrade into our systems through sheer force of will, I don't automatically get Monday off or a whonking big extra line-item on my next paycheck. It's between me and my manager how many hours I need to put in that week.

As for on-call, we don't have a formal on-call schedule. All of us agree we don't want one, and strive to make the informal one work for us all. No one wants to plan family vacations around an on-call schedule, or skip out of town sporting events for their kids just so they can be no more than an hour from the office just in case. It works for us, but all it'll take to force a formal policy is one bad apple.

For large corporations with national or global workforces, such gentlemen's agreements aren't really doable. Therefore, I'm not at all surprised to see some lawsuits being spawned because of it. Yes, some industries come with on-call rotations baked in (systems administration being one of them). Others, such as tech writing, don't generally have much after-hours work, and yet I've seen second-hand such after-hours work (working on docs, conference calls, etc.) consume an additional 6 hours a day.

Paid/unpaid after-hours work gets even more exciting if there are serious timezone differences involved. East Coast workers with the home office on the West Coast will probably end up with quite a few 11pm conference calls. Reverse the locations, and the West Coast resident will likely end up with a lot of 5am conference calls. Companies that have drunk deeply from the off-shoring well have had to deal with this, though with the benefit of different labor laws in their off-shored countries.

"Work" is now very flexible. Certain soulless employers will gleefully take advantage of that, which is where the lawsuits come from. In time, we may get better industry standard practice for this sort of thing, but it's still several years away. Until then, we're on our own.

This article was primarily aimed at K-12, which is a very different environment from higher ed. For one, the budgets are a lot smaller per-pupil. However, some of the questions do apply to us as well.

As it happens, part of our mandate is to prepare our students for the Real World (tm). And until very recently, Real World meant MS Office. We've been installing OpenOffice alongside MS Office on our lab images for some time, and according to our lab managers they've seen a significant increase in OO usage. I'm sure part of this is due to the big interface change Microsoft pushed with Office 2007, but it may also reflect a shift in mind-share on the part of our incoming students. Parallel installs just work; so long as you have the disk space and CPU power, they are very easy to set up.

Our choice of lab OS image has many complexities, not the least of which is a lack of certain applications. There are certain applications, of which Adobe Photoshop is but one, that don't have Linux versions yet. Because of this, Windows will remain.

We could do something like allow dual-boot workstations, or make a certain percentage of each lab Linux stations. Hard drives are big enough these days that we could dual-boot like that and still allow local-partition disk imaging, and it would give students a choice of environments to work in. Now that we're moving to a Windows environment, interoperability actually gets easier (Samba). Novell's NCP client for Linux was iffy performance-wise, and we had political issues surrounding CIFS usage.

However... one of the obstacles here is the lack of Linux workstation experience on the part of our lab managers. Running lab workstations is a constant cat-and-mouse game between students trying to do what they want, malware attempting to sneak in, and the manager attempting to keep a clean environment. You really want your lab manager to be good at defensive desktop management, and that skill-set is very operating-system dependent. Thus the reluctance regarding wide deployment of Linux in our labs.

Each professor can help urge OSS usage by not mandating file formats for homework submissions. The University as a whole can help urge it by retraining ITS staff in Linux management, not just literacy. Certain faculty can promote it in their own classes, which some already do. But then, we have the budget flexibility to dual-stack if we really want to.

Identity Management in .EDU land

We have a few challenges when it comes to an identity management system. As with any attempt to automate identity management, it is the exceptions that kill projects. This is an extension of the 80/20 rule: 80% of the cases will be dead easy to manage, and the 20% that are special is where most of the business-rules meeting-time will be spent.

In our case, we have two major classes of users:
  • Students
  • Employees
And a few minor classes littered about like Emeritus Professors. I don't quite know enough about them to talk knowledgeably.

The biggest problem we have is how to handle the overlaps. Student workers. Staff who take classes. We have a lot of student workers, but staff who take classes are another story. The existence of these types of people makes it impossible to treat the two big classes as exclusive.

Banner handles this case pretty well, from what I understand. The systems I manage, however, are another story. With eDirectory and the Novell Client, we had two big contexts named Students and Users; whichever one your object was in determined the login script you ran. Active Directory was, until recently, Employee-only because of Exchange. We put the students in there (with no mailboxes, of course) two years ago, largely because we could and it made the student-employee problem easier to manage.

One of the thorniest questions we have right now is defining, "when is a student a student with a job, and when is a student an employee taking classes." Unfortunately, we do not have a handy business rule to solve that. A rule, for example, like this one:
If a STUDENT is taking less than M credit-hours of classes, and is employed in a job-class of C1-F9, then they shall be reclassed EMPLOYEE.
That would be nice. But we don't have it, because the manual exception-handling process this kicks off is not quite annoying enough to warrant the expense of deciding on an automatable threshold. Because this is a manual process, people rarely get moved back across the Student/Employee line in a timely way. If the migration process were automated, certain individuals would probably flop over the line every other quarter.
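
If such a rule did exist, automating it would be trivial. A sketch, with an entirely made-up threshold and job-class range (no such values exist in our real systems):

```python
# Hypothetical: the credit-hour threshold M and the C1-F9 job-class
# range are placeholders from the rule above, not real policy.
CREDIT_THRESHOLD_M = 6

def classify(credit_hours, job_class=None):
    """Return 'EMPLOYEE' or 'STUDENT' per the hypothetical rule:
    a lightly-enrolled person holding a C1-F9 job is an EMPLOYEE."""
    employed = job_class is not None and "C1" <= job_class <= "F9"
    if credit_hours < CREDIT_THRESHOLD_M and employed:
        return "EMPLOYEE"
    return "STUDENT"
```

The hard part was never the code; it's getting everyone to agree on the value of M.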

This is one nice example of the sorts of discussions you have to have when rolling out an identity management automation system. If we were given umpty thousand dollars to deploy Novell IDM to replace our home-built system, we'd have to start having these kinds of discussions again, even though we've had some kind of identity provisioning system since the early '90s. Because we DO have an existing one, some of the thornier questions of data ownership and workflow are already solved. We'd just have to work through the current manual-intervention edge cases.

Another nice How-To

On Novell Cool-Solutions:

Setting up Novell LDAP Libraries for C#

Another one of those things I went, "Ooh! USEFUL! Oh wait, we don't care any more. Drat. I bet I can blog that, though." So I am.

Use Visual Studio for developing applications (we did)? Need to talk to eDirectory? Why not use LDAP and Novell's tools for doing so! We've used their elderly ActiveX controls to do great things, and this should do about half of what we do with those. File manipulation will need another library, though.

Update: And how to set it up to use SSL. It requires Mono.

Permission differences

In part, this blog post could have been written in 1997. We haven't exactly beaten down the door migrating away from NetWare.

Anyway, there are two areas that are vexing me regarding the different permissioning models between how Novell does it, and how Microsoft does it. The first has been around since the NT days, and relates to the differences (vast differences) between NTFS and the Trustee model. The second has to do with Active Directory permissions.

First, NTFS. As most companies contemplating a move from NetWare to Microsoft undoubtedly find out, Microsoft does permissions differently. First and foremost, NTFS doesn't have the concept of the 'visibility list', which is what allows NetWare to do this:

Grant a permission w-a-y down a directory tree.
Members of that rights grant will be able to browse from volume-root to that directory. They will see each directory entry along the path, and nothing else. Even if they have no rights to the intervening directories.

NTFS doesn't do that. In order to fake it you need two things:
  • Access-Based Enumeration turned on on the share (default in Server 2008, an add-on option in Server 2003)
  • A specific rights grant on each directory between the share and the directory with the rights grant. The "Read" simple right granted to "this directory only".
Unfortunately, the second one is tricky. In order to grant it you have to add an Advanced right, because the "Read" simple right grants read to "this directory, files, and subdirectories" when what you want is "this directory only". What this grant does is let you see that directory entry in the parent directory's listing.

Example: say I grant the group "StateAuditors" write access to this directory:

\\Share\DocTeam\StandardsOffice\Accounting\Procedures

If I just grant the right directly on "Procedures", the StateAuditors won't be able to get to that directory by way of that share. I could create a new share at that spot, and it'd work. Otherwise, I'll have to grant the above-mentioned rights on each of DocTeam, StandardsOffice, and Accounting.

It can be done, and it can even be scripted, but it represents a significant change in thinking required when it comes to handling permissions. As most permissions are handled by our Desktop group, this will require retraining on their part.
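
The chain of "this folder only" grants can be computed rather than eyeballed. A sketch, using the directory names from the example above (the D:\Share root is an assumption):

```python
from pathlib import PureWindowsPath

def intermediate_grants(share_root, target):
    """List the directories strictly between the share root and the
    target; each needs a "Read, this folder only" grant so the group
    can browse down from the share to the target."""
    root = PureWindowsPath(share_root)
    needs_grant = []
    for parent in PureWindowsPath(target).parents:
        if parent == root:
            break
        needs_grant.append(str(parent))
    return list(reversed(needs_grant))

print(intermediate_grants(
    r"D:\Share",
    r"D:\Share\DocTeam\StandardsOffice\Accounting\Procedures"))
```

For the example path, this lists DocTeam, StandardsOffice, and Accounting, which is exactly the set of extra grants described above.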

Second, AD permissions. AD, unlike eDirectory, does not allow the permissions short-cut of assigning a right to an OU. In eDirectory, that grant gave everything in the OU access to the resource. In AD, you can't grant the permission in the first place without a lot of trouble, and it won't work like you expect even if you do manage to assign it.

This is going to be a problem with printers. In the past, when creating new print objects for Faculty/Staff printers, I'd grant the users.wwu OU rights to use the printer. As students aren't in the access list, they can't print to it unless they're in a special printer-access group. All staff can print, but only special students can. As it should be. No biggie.

AD doesn't allow that. In order to allow "all staff but no students" to print to a printer, I'd have to come up with a group of some kind that contains all staff. That's going to be too unwieldy for words, so we have to go to the 'printer access group for everyone' model. Since I'm the one that sets up printer permissions, this is something *I* have to keep in mind.

Exchange transport-rules

| 1 Comment
Exchange 2007 supports a limited set of regular expressions in its transport-rules. The Microsoft TechNet page describing them is here. Unfortunately, I believe I've stumbled into a bug. We recently migrated our anti-spam to ForeFront, and part of what ForeFront does is header markup. There is a spamminess number in the header:
X-SpamScore: 66
That ranges from deeply negative to over a hundred. With this we can structure transport-rules to handle spammy email. In theory, the following trio of regexes should catch anything with a score of 15 or higher:
Those of you who speak Unix regex are quirking an eyebrow at that, I know. Like I said, Microsoft didn't do the full Unix regex treatment. The "\d" flag "matches any single numeric digit." Parentheses "act as grouping delimiters," and "the pipe ( | ) character performs an OR function."

Unfortunately, for reasons that do not match the documentation the above trio of regexes is returning true on this:
X-SpamScore: 5
It's the second recipe that's doing it, and it looks to be the combination of paren and \d that's the problem. For instance, the following rule:
returns true for any single numeric value, but returns false for "56". Where this rule:
only returns true for 56 and 57. To me this says there is some kind of interaction between the \d and the () constructs that's causing the change in behavior. I'll be calling Microsoft to see if this is working as designed and just documented incorrectly, or a true bug.
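
The rule screenshots didn't survive into this archive, so the patterns below are my reconstruction of what a "15 or higher" trio plausibly looks like, checked here against standard regex semantics (the behavior Exchange apparently isn't delivering):

```python
import re

# Reconstructed patterns -- the originals are lost. One for the teens,
# one for any other two-digit score of 20+, one for 100 and up.
patterns = [
    r"X-SpamScore: 1(5|6|7|8|9)$",
    r"X-SpamScore: (2|3|4|5|6|7|8|9)\d$",
    r"X-SpamScore: \d\d\d$",
]

def is_spammy(header):
    """True if any pattern matches, i.e. the score reads as 15+."""
    return any(re.search(p, header) for p in patterns)

print(is_spammy("X-SpamScore: 66"))   # True
print(is_spammy("X-SpamScore: 5"))    # False under standard semantics
```

Under a standard engine, "X-SpamScore: 5" matches none of the three, which is exactly why the Exchange behavior looks like a bug.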

The obsolescence of Word

Ars Technica had a nice opinion essay posted today called, "The prospects of Microsoft Word in the wiki-based world." In case you didn't catch it, the actual page name for the link is, "microsoft-word-1983---2008-rest-in-peace.ars". Clearly, they're predicting the death of Word as a major force.

And it isn't OpenOffice that's doing it, it's the cloud. Google Docs. MediaWiki. Anything with a RichEditor text interface. And for those things that just aren't usable in those interfaces, there are specialized tools that do that job better than Word does.

The second page of the essay goes into some detail about how the author was able to replace an old school file-server with a MediaWiki. MediaWiki, it seems, is an excellent document-management product. Most people already know how to use it (thank you Wikipedia), anything entered is indexed with the built in search tools, and there is integrated change-tracking. Contrast this with a standard File Server, where indexing is a recent add-on if it exists at all, change tracking is done at the application level if at all, and files just get lost and forgotten. MediaWiki just does it better.

I never expected "MediaWiki is the Word killer" to be made as an argument, but there are some good points in there. I do very little editing in any word processor at work. I do much more spreadsheet work, as that's still a pretty solid data-manipulation interface. Tech Services has a wiki now, and we're slooowly increasing usage of it.

And yet, there are still some areas of my life where I use a stand-alone word processor. If I really, truly need better typesetting than JavaScript- and CSS-driven HTML can provide, a stand-alone is the only way to go. If I'm actually going to print something off, perhaps because I have to fax it, I'm more likely to use a word processor. There are some cultural areas where solidly typeset documentation is still a must: wedding invitations, birth announcements, resumes, cover letters. And even these are going ever more electronic.

The last time I seriously job-searched (back in 2003) I spent hours polishing the formatting of my resume. Tweaking margins so the text would flow cleanly from one page to the next. Picking a distinctive yet readable font. Fine-tuning the spacing to help fit the text better. Inserting subtle graphic elements like horizontal lines. Inserting small graphics, such as my CNE logo. In the end I had a fine-looking document! I even emailed it to HR when I applied. The cover letter got much the same treatment, but with less focus on detailed formatting.

If I were to start looking today, it is vastly more likely that I'd attach the document (a PDF by preference, to preserve formatting, but DOC is still doable) to an online job application submission system of some kind. Or worse yet, be presented a size-limited ASCII text-entry field I'd have to cut-and-paste my resume into. The same would go for the cover letter. One of these two still encourages finely tuned type-setting like I did in 2003. The other explicitly strips everything but line feeds out.

Even six years ago there was no actual paper involved.

So I'll close with this. If you need typesetting, which is distinct from text formatting, then you still need offline tools for processing words. This is because you're doing more than simple word processing; you're also processing the format of it all. But if all you're doing is bolding, highlighting, changing text sizes, and creating the odd table, then the online tools as they exist now are well and truly all you need. It has been a sysadmin adage for years that most people could use WordPad instead of Word for most of what they do, and these days everything WordPad can do is in your browser.

Robust NTP environments

Due to my background as a NetWare guy, time synchronization is something I pay attention to. Early versions of NDS were touchy about it, since time-stamps were used in the conflicting-edits resolution process. NetWare didn't use a full NTP client for this; Novell built their own variant based on NTP code and called it TimeSync. Unlike NTP, TimeSync did what it could to ensure the entire environment was within a second or two of a single time. Because of the lower time resolution, it synced a lot faster than NTP did, and this was considered a good thing, since out-of-sync time was considered an outage.

With that in mind, it is no surprise that I like to have a solid time-sync process in place on my networks. One of the principles of Novell's TimeSync config was the concept of a time-group: a group of servers that coordinated time among themselves, and a bunch of clients that polled the members of that group for correct time. Back before internet connections were as ubiquitous as air, this was a good way for an office network to maintain a consensus time. Later on, TimeSync gained the ability to talk over TCP/IP and to use NTP sources for external time, which allowed TimeSync to hook into Coordinated Universal Time (UTC).

You can create much the same kind of network with NTP as you could with TimeSync. It requires more than one time server, but your clients only have to directly speak with one of the time servers in the group. Yet the same type of robustness can be had.

The concept is founded in the "peer" association for NTP. The definition of this verb is rather dry:
For type s addresses (only), this command mobilizes a persistent symmetric-active mode association with the specified remote peer.
And doesn't tell you much. This is much clearer:
Symmetric active/passive mode is intended for configurations where a clique of low-stratum peers operate as mutual backups for each other. Each peer operates with one or more primary reference sources, such as a radio clock, or a set of secondary (stratum 2) servers known to be reliable and authentic. Should one of the peers lose all reference sources or simply cease operation, the other peers will automatically reconfigure so that time and related values can flow from the surviving peers to all hosts in the subnet. In some contexts this would be described as a "push-pull" operation, in that the peer either pulls or pushes the time and related values depending on the particular configuration.
Unlike TimeSync, if all the peers lose their upstreams (the internet connection is down), the entire infrastructure goes out of sync. This can be mitigated somewhat through judicious use of the 'maxpoll' parameter; set it high enough and it can be hours (or days, if you set it really high) before a peer even notices it can't talk to its upstream, and it will continue to report in-sync time to clients.

It is also a very good idea to use ACLs in your ntp.conf file to restrict what types of associations clients can mobilize. It is quite possible to be evil to NTP servers. You can turn on enough options to allow troubleshooting, but not allow config changes.

It is a very good idea for your peers to be cryptographically associated with each other as well. NTP has at least two methods for this: the symmetric keys that have been around since v3, and v4's Autokey. Symmetric keys are a preshared-key system and somewhat easier to set up; Autokey is public-key based; either is preferable to nothing.

Here is a pair of /etc/ntp.conf files for a hypothetical set of WWU time-servers (items like drift-file and logging options have been omitted):
# First time-server's /etc/ntp.conf (the <angle-bracket> names are placeholders)
server <upstream-ntp-1> maxpoll 13
server <upstream-ntp-2> maxpoll 13
peer <time-server-2> key 1

enable auth monitor
keys /etc/ntp.keys
trustedkey 1
requestkey 1

restrict default ignore
restrict <campus-network> mask <campus-netmask> nomodify nopeer

# Second time-server's /etc/ntp.conf
server <upstream-ntp-1> maxpoll 13
server <upstream-ntp-2> maxpoll 13
peer <time-server-1> key 1

enable auth monitor
keys /etc/ntp.keys
trustedkey 1
requestkey 1

restrict default ignore
restrict <campus-network> mask <campus-netmask> nomodify nopeer

The 'maxpoll' values ensure that, once time has been synchronized for long enough, the time between polls of the upstream NTP servers will be 137 minutes. Hopefully, any internet outages will be shorter than that. Setting maxpoll even higher allows longer polling intervals, and therefore more internet-outage tolerance. This can get QUITE long; I've seen NTP servers that poll twice a week.
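
The arithmetic behind that: once settled, ntpd polls at two-to-the-maxpoll seconds.

```python
# NTP poll intervals are 2**maxpoll seconds once the clock settles.
for maxpoll in (6, 10, 13, 17):
    seconds = 2 ** maxpoll
    print(f"maxpoll {maxpoll}: {seconds} s ({seconds / 3600:.1f} h)")
```

maxpoll 13 works out to 8192 seconds, the 137 minutes mentioned above; the twice-a-week pollers are running around maxpoll 18.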

The key settings set up a symmetric-key crypto system. The "key 1" option on the peer line indicates that the designated association should use crypto validation. The actual data passed isn't encrypted; the crypto is used for identity validation. This prevents spoofing of time, which can lead to wildly off time values.

The 'restrict' lines tell ntpd to ignore off-campus requests for time (it'll still listen, but return access-denied to all requests), allow on-campus users to get time and do time tracing but nothing else, and allow full access to the peer time-server. In theory, inbound NTP traffic should be stopped at the border firewall, but just in case, this will deny any requests that get through.

This is a two-server setup, but three or more servers could easily be involved. For a network our size (large) and complexity (simple), two to three time-servers is probably all we need. The peered time-servers will all report in-sync so long as one of them still considers itself in-sync with an upstream time-server.

Because peers sync time amongst themselves, clients only have to talk to a single time-server to get valid time. Of course, that introduces a single-point-of-failure in the system if that time-host ever has to go down. Because of this, I strongly recommend configuring NTP clients to use at least two upstreams.
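
A matching client-side config sketch (hostnames are placeholders):

```
# Client /etc/ntp.conf -- list both campus time-servers so the loss
# of either one doesn't leave the client without a time source.
server <time-server-1> iburst
server <time-server-2> iburst
```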

Enjoy high quality time!