April 2009 Archives

Conflicting email priorities

As mentioned in the Western Front, we're finally migrating students to the new hosted Exchange system Microsoft runs. They've since changed the name from Exchange Labs to OutlookLive. It has taken us about two quarters longer than we intended to start the migration process, but it is finally under way.

Unfortunately for us, we got hit with a problem related to conflicting mail priorities. But first, a bit of background.

ATUS was getting a lot of complaints from students that the current email system (sendmail, with SquirrelMail) was getting snowed under with spam. The open-source tools we used for filtering spam were nowhere near as effective as the very expensive software in front of the Faculty/Staff Exchange system. More importantly, they were vastly less effective than the experience Gmail and Hotmail give. Something had to change.

The choice was either to pay between $20K and $50K for an anti-spam system that actually worked, or to outsource our email for free to either Google or Microsoft. $20K... or free. The choice was dead simple. Long story short, we picked Microsoft's offering.

Then came the problem of managing the migration. That took its own time, as the Microsoft service wasn't quite ready for the .EDU regulatory environment. We ran into FERPA-related problems that required us to get legal opinions from our own staff and the Registrar about what constitutes published information, and then to design systems to accommodate those opinions. Microsoft's stuff didn't make that easy. Since then, they've rolled out new controls that ease this. Plus, as the article mentioned, we had to engineer the migration process itself.

Now we're migrating users! But there was another curveball we didn't see coming, though we should have. The server that student email was on has been WWU's smart-host for a very long time. It also ran the previously mentioned crappy anti-spam filtering. Being the smart-host, it was the server that all of our internal mail blasts (such as campus notifications of the type Virginia Tech taught us to be aware of) relayed through. These mail blasts are deemed critical, so this smart-host was put onto the OutlookLive safe-senders list.

Did I mention that we're forwarding all mail sent to the old cc.wwu.edu addresses to the new students.wwu.edu addresses? The perceptive just figured it out. Once a student is migrated, the spam stream headed for their now-old cc.wwu.edu address gets forwarded on to OutlookLive by way of a server that bypasses the spam checker. Some students are now dealing with hundreds of spam messages a day in their inboxes.

The obvious fix is to take the old mail server off of the bypass list. That can't be done, because critical emails that have to be delivered are still being sent via the old mail server. The next obvious fix, turning off forwarding for students who request it, won't work either: the ERP system has all of the old cc.wwu.edu addresses hard-coded in right now, and the forwards are how messages from that system get to the students.wwu.edu addresses. So we geeks are now setting up a brand new smart-host, and are in the process of finding all of the things that relay through the old server and changing their settings to relay through the new smart-host.

Some of these settings require service restarts of critical systems, such as Blackboard, that we don't normally do during the middle of a quarter. Some are dead simple, such as changing a single .ini entry. Still others require our developers to compile new code with the new address built in, and publish the updated code to the production web servers.
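
Verifying each re-pointed system is its own little chore. A quick, low-tech check is to hand-send a test message through the new smart-host from the box in question and watch for it to land. Here's a minimal PowerShell sketch of that, using System.Net.Mail; the host and address names below are placeholders, not our real ones.

```powershell
# Hand-send a test message through the new smart-host to confirm relay works.
# Host name and addresses are placeholders for illustration only.
$smartHost = "smarthost-new.wwu.edu"
$smtp = New-Object System.Net.Mail.SmtpClient($smartHost, 25)

# Four-argument Send: from, to, subject, body. If the relay refuses us, this throws.
$smtp.Send("relay-test@wwu.edu", "postmaster@wwu.edu", "Smart-host relay test",
           "If this arrives, relay through $smartHost is working.")

Write-Host "Message handed off to $smartHost without error."
```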

Of course, the primary sysadmin for the old mail server was called for Federal jury duty last week and has been in Seattle all this time. I think he comes back Monday. His grep-fu is strong enough to tell us everything that relays through the old server. I don't have a login on that server, so I can't try it out myself.

Changing smart-hosts is a lot of work. Once we get the key systems working through the new smart-host (Exchange 2007, as it happens), we can tell Microsoft to de-list the old mail-server from the bypass list. This hopefully will cut down the spam flow to the students to only one or two a day at most. And it will allow us to do our own authorized spamming of students through a channel that doesn't include a spam checker. Valuable!

Windows 7 RC is out

And they're saying that Win7 will ship well before the Jan-2010 timeframe (three years after Vista) mentioned before. Well, we kind of expected that. We've also been doing a lot with our network to make it more Win7 (and Vista) friendly, since we know we'll get a LOT of Win7 once it shows up for real.

The biggest concern is that Microsoft still hasn't fixed the issue that makes the Novell Client for Vista so darned slow. This is a major deal-breaker for us, so we've been informed from on high to Do Something so our Vista/Win7 clients can have fast file-serving and printing.

That "Something" has been to turn on the CIFS stack on our NetWare servers, with domain integrated login. The Vista and Win7 clients will have to turn their LanMan Authentication Level from the default (and secure) setting of, "Send NTLMv2 Response Only" to at most, "Send NTLM Response Only." The NetWare CIFS stack can't handle NTLMv2, nor will it ever. Those people who have been suffering through the NCV get downright bouncy when they see how fast it is.

Printing... we'll see. A LOT of the printing in fac/staff land is direct-IP, which has no Novell dependencies. There are a few departments out there with enough print volume that a print server is a good idea, so I'm hoping an iPrint client for Win7 comes out pretty fast.

All in all, we're expecting uptake of Win7 to be a lot faster than Vista ever was. In this sense Win7 is a lot like Win98SE. All the press saying that Win7 is a lot better than Vista will help drive the push away from WinXP.

LinuxFest Northwest

I went to LinuxFest Northwest this weekend. It was interesting! OpenSUSE, Ubuntu, Fedora, and FreeBSD were all there passing out CDs. I learned stuff.

In one of the sessions I went to, "Participate or Die!", the presenter (a Fedora guy) was asked whether he was seeing any change in the gender imbalance at Linux events. He hemmed and hawed and ultimately said, 'not really'. I've been thinking about that myself, as I've noticed a similar thing at BrainShare.

Looking at what I've seen amongst my friends, the women ARE out there. They're just not well represented in the ranks of the code-monkeys. In closed-source shops, I see women in many places.

I have known several women involved with technical writing and user-factors; in fact, I don't know any men in those roles. In all but the largest and best-funded open-source projects, tech-writing is largely done by the programmers themselves or by the end-user community on a wiki. Except for the wiki, the same can be said for interface design. As the tech-writers I know lament, programmers do a half-assed job of doc, and they write for other programmers, not for somewhat clueless end-users. At the same time, the UI choices of some projects can be downright tragic for anyone but fellow code-monkeys. This is the reason the large pay-for-it closed-source software firms employ dedicated Technical Writers with actual English degrees (you hope) to produce their doc.

I've also known some women involved with QA. In closed-source shops QA is largely done in house, and there may be an NDA-covered beta among trusted customers towards the end of the dev-cycle. In the land of small-project open-source, QA is again done by the developers and maybe a small cadre of bug-finders. The fervent hope, if not outright expectation, is that bug-finders will occasionally submit a patch to fix the bug as well.

I also know women involved with defining the spec for software. Generally this is for internal clients, but the same applies elsewhere. These are the women who meet with internal customers to figure out what they need, and write it up as a feature list that developers can code against. These women are also frequently the liaison between the programmers and the requesting unit. In the land of open-source, the spec is typically generated in the head of a programmer who has a problem that needs solving, who then goes out to solve it, and ultimately publishes the problem-resolution as an open-source project in case anyone else wants that problem solved too.

All of this underlines one of the key problems Linux has been trying to shed of late. For years Linux was made by coders, for coders, and it certainly looked and behaved like it. It has only been in recent years that a concerted effort has been made to make Linux desktops look and feel comprehensible to tech-shy non-nerds. Ubuntu's success comes in large part from these efforts, and Novell has spent a lot of time doing the same through user-factors studies.

Taking a step back, I see women in every step of the closed-source software development process, though they are underrepresented in the ranks of the code-monkeys. The open-source dev-process, on the other hand, is almost entirely about the code-monkey in all but the largest projects. It is therefore no surprise that women are conspicuously absent at Linux conventions.

I've read a lot of press about the struggles Computer Science departments have in attracting women to their programs. At the same time, Linux as an ecosystem has a hard time attracting women as active devs. Once more women start getting degreed, or stop being scared off in the formative teen years (something Linux et al. can help with), we'll see more of them among the code-slingers.

Something else that might help would be to tweak the CompSci course offerings, perhaps partnering with the English or Business departments to produce courses aimed at the hard problem of translating user requirements into geek, and geek into manuals. The Software Engineering process involves more than just writing code in teams; it involves:
  • Building the spec
  • Working to the spec
  • Testing the produced code against the spec
  • Writing the How-To manual against the deliverable code
  • Delivery
This is an interdisciplinary process involving more than just programmers. The programmers need to be familiar with each step, of course, but it would be better if other people gave those steps the same focus that programmers give to producing the code. Doing this in class might just inspire more non-programmers to participate in projects in such key areas as helping guide the UI, writing how-tos and man-pages, and creatively torturing software to make the bugs squeal. It might even inspire some people who never really looked at programming before, due to unfortunate cultural blinders, to give it a try. That would REALLY help.

Novell wants your BrainShare input

Just posted on the Cool Solutions community page:

Novell BrainShare 2010 Advisory Board

Since BrainShare took 2009 off, they're planning on bringing it back in 2010, and they're looking for end-user input on what it should look like. Should it stay in Salt Lake City? Should events be dropped? Should events be added? This looks to be an online collaboration rather than a physical presence, so proximity to Provo, UT shouldn't be a problem. Though proximity to the US Mountain timezone may be a good idea.

If you get selected for the board, a perk is a pass for BrainShare 2010.

A new version of BIND

I saw on the SANS log today that the ISC is starting work on BIND10. A list of the new stuff can be found here. A couple of those items are very interesting to me. Specifically the Modularity and Clustering items.

Modularity:
...the selection of a variety of back-ends for data storage, be it the current in-memory database, a traditional SQL-based server, an embedded database engine or back-ends for specific applications such as a high performance, pre-compiled answer database.
Which makes me think of eDirectory-backed DNS. Novell has had this for ages on NetWare, and from what I recall it was based on BIND. But... BIND8. BIND10 would formalize this in the Linux base, which would in turn allow Novell to publish a more 'pure' eDir-integrated BIND.

Clustering:
run on multiple but related systems simultaneously, using a pluggable, open-source architecture to enable backbone communications between individual members of the cluster. These coordination services would enable a server farm to maintain consistency and coherence.
This is exactly what AD-integrated DNS and the DNS on NetWare have been doing for over 8 years now. Glad to see BIND catch up.

The big thing about using a database of some kind as the back-end for DNS is that you no longer have to create secondary servers and muck about with zone transfers. For domains that change on a second-by-second basis, such as an AD DNS domain with dynamic updates enabled and thousands of computers powering on in the morning, it is entirely possible for a BIND secondary server to be missing many, many DNS updates. Microsoft has known about this issue, which is why they have their own directory-integrated DNS service.
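
One quick way to see that lag in action is to compare the SOA serial number a secondary is handing out against the master's copy; if the secondary's serial trails during the morning power-on rush, zone transfers aren't keeping up. A minimal PowerShell sketch, with placeholder zone and server names:

```powershell
# Compare the SOA serial a secondary is serving against the master's copy.
# Zone and server names below are placeholders.
$zone = "example.wwu.edu"
foreach ($server in "dns-master.wwu.edu", "dns-secondary.wwu.edu") {
    Write-Host "--- $server ---"
    # nslookup's SOA output includes a "serial = NNNN" line; pull just that out.
    nslookup -type=SOA $zone $server | Select-String "serial"
}
```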

This also shows just how creaky the NetWare DNS service really is. That's based on BIND8 code, which is now over 10 years old. Very creaky.

I'm looking forward to BIND10. It is a needed update that addresses DNS as it is done today, and would better enable BIND to handle large Active Directory domains.

Zen Asset Inventory

A while back we installed Zen Asset Inventory (but not Asset Management), since it came with our Novell bundle and inventory is a nice thing to have. At the beginning of this quarter it started to crash while inventorying certain workstations. After we sent the logs to Novell, it turned out to be crashing on a lot of workstations.

Novell said that the reason for the crashes was excessive duplicate workstations. ZAM is supposed to handle this, but it seems 2 years of quarterly lab reimaging finally overwhelmed the de-dup process. The fix is fairly straightforward, but very labor intensive:
  1. Clean out the Zenworks database
  2. Force a WorkstationOID change on all workstations
The second took quite a while. Those steps are:
  1. Stop the Collection Client service
  2. Delete a specific registry key
  3. Start the Collection Client service
These three steps can be done by way of PowerShell (or the PsTools suite of command-line utilities if you want to rock it old school), one workstation at a time. As we have on the order of 3,700 workstations, this took a few days, and I'm sure I missed some. I did get all of the lab machines, though. That's important.
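
For the curious, the per-workstation reset boils down to something like the sketch below. The "Collection Client" name comes from the steps above; the registry key path is a placeholder, since the real key is whatever Novell support tells you holds the WorkstationOID, and the remote machine needs the Remote Registry service running.

```powershell
# Reset-WorkstationOID.ps1 -- stop the Collection Client on a remote machine,
# delete the key holding the WorkstationOID, then start the service back up.
# The registry key path below is a placeholder, not the verified location.
param([string]$Computer)

$oidKey = "SOFTWARE\Novell\ZAM\Collection Client"   # placeholder path

# WMI service control works against remote machines without extra tooling.
$svc = Get-WmiObject Win32_Service -ComputerName $Computer |
       Where-Object { $_.DisplayName -eq "Collection Client" }
[void]$svc.StopService()
Start-Sleep -Seconds 10

# Delete the key remotely (requires the Remote Registry service).
$hklm = [Microsoft.Win32.RegistryKey]::OpenRemoteBaseKey("LocalMachine", $Computer)
$hklm.DeleteSubKeyTree($oidKey)
$hklm.Close()

[void]$svc.StartService()
```

Strobing through the labs (and later the whole domain) was then just a matter of looping that over a list of computer names pulled from AD, one at a time.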

Cleaning out the database proved to be more complicated than I thought. At first I thought I just had to delete all the workstations from the Manager tool. But that would be wrong. Actually looking at the database tables showed a LOT of data in a supposedly clean database.

The very first thing I tried was to remove all the workstations from the database by way of the Manager and restart inventory. The theory was that this would eliminate all the duplicate entries, so we'd just start the clock ticking again until imaging caught us out. Since I had modified our imaging procedures, this shouldn't happen again anyway. Tada!

Only the inventory process started crashing. Crap.

The second thing I tried was to strobe through the Lab workstations with the WorkstationOID-reset script I worked up in PowerShell (this is not something I could have done without an Active Directory domain, by the way). These are the stations with the most images, and getting them reset should clear the problem. Couple that with a clearing of the database by way of the Manager, and we should be good!

Only the inventory process started crashing. It took a bit longer, but it still crashed pretty quickly.

Try number three... run the PowerShell script across the ENTIRE DOMAIN. This took close to four days. Empty the database via the Manager again, restart.

It crashed. It took until the second day to crash, but it still crashed.

As I had reset the WorkstationOID on all domained machines (or at least a very large percentage of them), the remaining dups were probably in the non-domained labs I have no control over. So why the heck was I still getting duplication problems with a supposedly clean database? I went into SQL Studio to look at the database tables themselves. The NC_Workstation table alone had over 15,000 workstations in it. Whaaa?

However, this would explain the duplication problems I'd been having! If the de-dup processing was running against historical data that already included a freighter full of duplicates, it was going to crash. Riiiiight. So, how do I clean out the tables? Due to foreign-key references and full tables elsewhere, I had to build a script that would purge the leaf tables first, then the core tables. The leaf tables (things like NC_BIOS) could be Truncated, which is handy when a table contains over a million rows. The core tables (NC_Component) have to be deleted row by row, which for the 2.7-million-row NC_Component table took close to 24 hours to fully delete and reindex.
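
The purge approach itself is nothing fancy. Here's a rough sketch of the idea, driven from PowerShell with the .NET SQL client: truncate the leaf tables nothing references, then chew through the big core table in batches. The connection string and table lists are illustrative, not the complete real set.

```powershell
# Purge inventory data: TRUNCATE leaf tables with no inbound foreign keys,
# then delete the big core table in batches so the transaction log stays sane.
# Connection string and table names here are illustrative only.
$connString = "Server=ZAMDB;Database=NCSystem;Integrated Security=SSPI"
$conn = New-Object System.Data.SqlClient.SqlConnection($connString)
$conn.Open()
$cmd = $conn.CreateCommand()
$cmd.CommandTimeout = 0          # some of these statements run for hours

# Leaf tables can simply be truncated.
foreach ($table in @("NC_BIOS")) {
    $cmd.CommandText = "TRUNCATE TABLE $table"
    [void]$cmd.ExecuteNonQuery()
}

# Core tables referenced by foreign keys have to be deleted the slow way.
do {
    $cmd.CommandText = "DELETE TOP (10000) FROM NC_Component"
    $deleted = $cmd.ExecuteNonQuery()
    Write-Host "Removed $deleted rows from NC_Component"
} while ($deleted -gt 0)

$conn.Close()
```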

With a squeaky clean database, and the large majority of WorkstationOID values reset enterprise wide, I have restarted the inventory process. The Zenworks database is growing at a great pace as the Component tables repopulate. This morning we have 3,750 workstations and growing. We inventoried close to 3,300 stations yesterday and didn't get a single inventory crash. This MAY have fixed it!

I'm keeping these SQL scripts for later use if I need 'em.

The key learning here? Removing workstations from the Manager doesn't actually purge them from the database itself.

Reminders of the technical past

Today marks the anniversary of the Columbine shootings. This has had me confused for some time, as I clearly remember getting alpha-numeric pagers at my old job right before the Columbine shootings, and yet I also clearly remember getting them in November or something. The pagers we got were an AT&T model with a NewsAlert thingy: four mail-slots of news pushed down by the pager provider. As it happens, I didn't care about the news feeds, but they just CAME with the pagers so we had to live with them. We'd had them all of two days before the school shooting happened, and we were ALL getting updates every 15-30 minutes. That was before most of us had figured out how to turn off the vibrate alert for new items; the barrage prompted a lot of us to figure it out in a hurry.

It was on the drive in to work today that I finally remembered which school shooting it was: Jonesboro. Which was March, not November. Still, that was it. The experience was similar to the twitter-bombing after a major event, when everyone on my follow list comments on it within 30 minutes. Only this was in 1998, the medium was the alpha-numeric pager, and the source was a major news outlet of some kind.

A Mac botnet?

Ars Technica has an article up about a detected botnet based on Mac OSX machines. This is interesting stuff, since you don't SEE this kind of thing all that often. OSX is the #2 operating system after Windows, but it is a distant #2. Also interesting is that the infection vector appears to be pirated software, a vector that brings a tear of nostalgia to my eye for its sheer antiquity. Clearly this would be a slow-growing botnet, but that's OK, since a large percentage of Mac users don't bother with AV software because they're not running Windows and "don't need it".

What would be more impressive would be a drive-by downloader a la IE, but with Safari instead. I don't remember hearing about anything other than proof-of-concept on that front, though.

Windows 7 forces major change

I've said before that you'll have to pry the login-script out of our cold, dead hands. The simple Novell login-script is the single most pervasive workstation-management tool we have, since EVERYONE needs the Novell Client to talk to their file servers. It's one reason we have computer labs when others are paring theirs down or getting rid of them. People can live without the Zen agents if they work at it, but they can't live without the Novell Client. Therefore, we do a lot of our workstation management through the login-script.

The Vista client has been vexing in this regard because it is so painfully slow in our clustered environment. The reason it is slow is the same reason the first WinXP clients were slow: the Microsoft and Novell name-resolution processes compete in bad ways. As each drive letter we map is its own virtual server, every time you attempt to display a Save/Open box or open Windows Explorer, Windows has to resolve, time out, and re-resolve each and every drive letter. This means that opening a Save/Open box on a Vista machine running the Novell Client can take upwards of 5 minutes thanks to the timeouts. Novell knows about this issue and has reported it to Microsoft. This is something Microsoft has to fix, and they haven't yet.

This is vexing enough that certain highly influential managers want to make sure the same thing doesn't happen again for Windows 7. As anyone who follows any piece of the tech media knows, Windows 7 has been deemed "Vista done right," and we expect a much faster uptake of Win7 than of Vista. So we need to make sure our network can accommodate that on release day. Make it so, said the highly placed manager. Yessir, we said.

So last night I turned CIFS on for all the file services on the cluster. It was that or migrate our entire file-serving function to Windows. The choice, as you can expect, was an easy one.

This morning our Mac users have been decidedly gleeful, as CIFS has long-password support where AFP didn't. The one sysadmin here in tech services running Vista as his primary desktop has uninstalled the Novell Client and is also cheerful. Happily for us, the directive from said highly placed manager was accompanied by a strong suggestion to all departments that domaining PCs into the AD domain would be a Really Good Idea. This allows us to use the AD login-script, as well as group policies, for those Windows machines that lack a Novell Client.
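
For machines that drop the Novell Client, the drive letters the Novell login-script used to map get handled from the AD side instead. Our real login scripts are plain batch files full of net use lines, but here's the same idea as a PowerShell sketch, with placeholder server and share names standing in for the cluster's CIFS volumes:

```powershell
# Map the drive letters the Novell login-script used to handle, but against
# the CIFS shares on the cluster. Server and share names are placeholders.
$net = New-Object -ComObject WScript.Network

$mappings = @{ "P:" = "\\cluster-cifs\SHARED"
               "U:" = "\\cluster-cifs\USERS" }

foreach ($letter in $mappings.Keys) {
    # Clear any stale mapping first, then map persistently ($true).
    if (Test-Path $letter) { $net.RemoveNetworkDrive($letter, $true, $true) }
    $net.MapNetworkDrive($letter, $mappings[$letter], $true)
}
```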

Ultimately, I expect the Novell Client to slowly fade away as a mandatory install. So that clientless-future I said we couldn't take part in? Microsoft managed to push us there.

MyWeb statistics

An interesting inversion has happened recently: the Fac/Staff MyWeb has overtaken the Student MyWeb in terms of data moved. Most of the student-side top-transfer material is mp3 files and other media files, while on the fac/staff side the top transfers seem to be big zip files for classwork.

It would not surprise me to learn that some faculty have figured out that you can get around Copy Services by putting together PDFs of the articles they want to pass out to students. I already know that many faculty use MyWeb for distributing things to their students; not everyone loves Blackboard.

Going on vacation

Next week I'll be on vacation. Expect postings to be nearly nonexistent. It could happen that I'll be struck with a blogable thought and need to post about it. Or not! And now for some quick bites.
  • The House budget is out and... ow. President Shepard has been vocal.
  • SLES11 shipped, which I forgot to mention. Still no hints on what OES-Next will be based on.
  • OES2-SP2's beta was open a while ago, but doesn't seem to be any more. This will be based on SLES10, very likely SLES10-SP3. I'm not doing it again because the stars are not in the correct alignment.

Open-sourcing eDirectory?

The topic of open-sourcing eDirectory comes up every so often. The answer is always the same: it can't be done. Novell NDS, and the eDirectory that followed it, use technology licensed from RSA, and RSA will not allow their code to be open-sourced. And that's it.

However... it isn't the RSA technology that allows eDirectory to scale as far as it does. To the best of my knowledge, that's pure Novell IP, based on close to 20 years of distributed-directory experience. The RSA stuff is used in the password process, specifically the NDS Password, as well as in authenticating eDirectory servers to the tree and to each other. The RSA code is a key part of the glue holding the directory together.

If Novell really wanted to, they could produce another directory that scales as far as eDirectory does. This directory would be fundamentally incompatible with eDir because it would have to be made without any RSA code, which eDirectory requires. This hypothetical open-source directory could scale as far as eDir does, but would have to use a security process that is also open-source.

This would take a lot of engineering on Novell's part. The RSA stuff has been central to both NDS and eDir for all of those nearly 20 years, and the dependency tree is probably very large. The RSA code is even involved in the NCP protocol that eDir uses to talk to other eDir servers, so a new network protocol would probably have to be created from scratch. At the pace Novell is developing software these days, this project would probably take 2-3 years.

Since it would take a lot of developer time, I don't see Novell creating an open-source eDir-clone any time soon. Too much effort for what'll essentially be a good-will move with little revenue generating potential. That's capitalism for you.