August 2007 Archives

SSL puzzler resolved!

| 3 Comments
The anonymous commenter on the previous post nailed it. Disabling TLS in Firefox & Seamonkey removed the delay. Now that it's been pointed out, I realize I knew this! It's one of the reasons we're discouraging students from using Vista! I just didn't know that the newer Mozilla browsers were also affected by NetWare's inability to do TLSv1.

Here's an interesting thing

Novell is putting together a Best Practices guide for migrating to OES2 from NetWare. Obviously this is OES2-Linux, as there is not much that needs migrating when going from OES-NW to OES2-NW. They're soliciting community input for the guides, and will be offering Cool Solutions reward points for contributions.

This is interesting. I know that the Novell Support Forum Sysops tend to build up their own micro guides based on problems people report in the forums, and this is a way to better formalize that. Some of the sysops have taken to using the Cool Solutions Wiki as a place to park boiler-plate answers and forward questioners to those pages. This is an interesting concept.

More interesting as OES2 isn't out yet, even in an open-beta form. Where are we going to get our experience from, eh? This implies that shortly we'll have at least an open beta to try out. I hope so.

I can't contribute much to this document because my main migration is contingent on AFP being eDir integrated, and they've said that'll not happen until probably SP1. If I do anything it'll be the eDir servers, and those are relatively easy migrations. DFS is the only sticking point for that.

An SSL puzzler

| 1 Comment
One thing I've noticed lately is that hitting NetWare SSL webpages gives me a 20-60 second lag if I hit them with Seamonkey or Firefox. IE6 doesn't give the same lag. In order to see what's happening at the network level I broke out Wireshark.

Weirdly, the IE6 trace has 6 packets until the SSLv3 Server Hello, while the Seamonkey trace takes 16 packets (and a big delay) to get there. Some other differences in the Seamonkey trace (Firefox shows the same delay, so I'm assuming similar reasons):
  • Uniformly, packet 6 in the Seamonkey trace is a FIN, ACK from the client
  • Packets 7-10 are connection tear-down
  • Packets 11-13 are connection setup
  • Packet 14 is an SSLv2 Client Hello (it was SSLv3 up there in packet 4)
  • Packet 15 is an ACK from the server
  • Packet 16 is the SSLv3 Server Hello
So what is going on that the NetWare SSL provider is not responding? It looks to me like the client, Seamonkey, is timing out and falling back to an older SSL spec. What's strange is that in the Seamonkey trace, the SSL Server Hello lists protocol SSLv3 even after the SSLv2 Hello.

Another difference in the traces is that the first SSLv3 Client Hello in the Seamonkey trace includes 28 Cipher Suites, to IE's 11. Wireshark can only identify 12 of them (for the curious, most of the identifiable ciphers are different from the IE ones). I can only suppose that the NetWare SSL provider gets this Hello and goes +++OUT OF CHEESE ERROR+++ and waits to get more sensible data.
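For the curious, here is roughly what I suspect the client is doing, sketched in Python. This is a minimal sketch under assumptions: modern Python's ssl module no longer speaks SSLv2/SSLv3 at all, so it illustrates the try-newest-then-fall-back pattern with TLS versions instead; the host, port, and timeout are placeholders, not anything from our environment.

```python
import socket
import ssl
import time

def try_handshake(host, port, tls_version, timeout=5.0):
    """Attempt one handshake pinned to a single protocol version.

    Returns the handshake time in seconds, or None on failure/timeout.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE  # diagnostics only, not for real use
    ctx.minimum_version = tls_version
    ctx.maximum_version = tls_version
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return time.monotonic() - start
    except (OSError, ssl.SSLError):
        return None

def connect_with_fallback(host, port, versions, timeout=5.0):
    """Mimic the browser: walk from newest to oldest until one works."""
    for version in versions:
        elapsed = try_handshake(host, port, version, timeout)
        if elapsed is not None:
            return version, elapsed
    return None, None
```

The delay I'm seeing would then be the first attempt's timeout: the server sits on the newer Hello, the client gives up, tears down the connection, and retries with the older protocol.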

This is a tricky one. Tomorrow I'll delve into the Novell KB database and see if I can find anything like it. If that fails, and delving the support forums fails too, I'll put in a call.

PS: I'd post some packet traces, but Wireshark here on openSUSE 10.2 crashes hard every time I try to bring up a 'browse files' window. This makes saving traces difficult.

When ads are ironic

I was browsing my feeds at lunch when I saw this gem:
[image: an apropos ad]
That's right. On the Slashdot article about the extensive point-and-click wiretap network the FBI has built for wireless providers, is an ad for a wireless provider. I REALLY love the tag line, "Your world. Delivered."

Should that be, "Your world. Delivered to the FBI."? Heee!

Patent trolls

I see that Polaris IP is suing several large companies over patent infringement. The patent? Email auto-responders. I wonder why Novell wasn't included in the suit since I was doing JUST THAT with GroupWise 4.1 in 1997.

Oh wait. That's not infringement, that's prior art. My bad.

Dynamic Storage Technology, more data

Two days ago Novell posted an AppNote on Dynamic Storage Technology, formerly known as 'shadow volumes'.

Setting up Dynamic Storage Technology with Open Enterprise Server 2

One thing I noticed right at the top of the article is a little blurb that reads:
This article was written for Novell Open Enterprise Server 2. Sign up here to be notified when the Novell Open Enterprise Server 2 open beta becomes available.
Which tells me that the public beta is probably pretty near, and that OES2 release will probably not be "end of Q3" like Jason Williams indicated a while back. I could be wrong, of course. As soon as I get the public beta code there is some serious testing I need to do.

Anyway, back to the article. This is a click-by-click guide for setting up DST, including screenshots of the new iManager 2.7. Unsurprisingly, Novell re-themed the iManager interface. There is a gotcha at step 17, where you have to edit a local config file on the OES server to get it going; that would probably trip up most people trying to set up DST by going solely on the UI.

This is a very good article describing it all. I recommend it!

ZenCM (a.k.a. Zen 10)

| 2 Comments
We're taking a look at ZenCM right now, as it's the Zen that supports Vista. I saw quite a lot about it back at Brainshare, so I have some expectations. Now that it is out, there is a manual out. We like manuals. They tell us what to expect.

Some quick hits on differences that'll cause us to do things differently from previous Zen versions:
  • No NetWare support, so we'll need at least one new server to drive this thing.
  • Application deployment won't work with NetWare as a file source, so all those MSIs will have to be hosted on Windows somewhere (no Linux support yet, though that's probably SP1 stuff).
  • Inventory has been rolled into the central product, so no need for a separate Inventory server like the past (the past product was the Inventory side of Zen Asset Management. ZAM now is just Asset Management).
  • No AXT-based installs, just MSI and Simple Application.
  • The MSI installs will FORCE us to start looking at ApplicationStudio, whether we like it or not.
  • Introduces a single point of failure for Zen policy, as it no longer uses eDir for a policy repository, and instead uses the ZenCM server. Which in our case will be a single server, as we're not $$$ enough for two, nor do our users segregate into large chunks well.
Yeah. We've done just the weensiest bit of application packaging. We have zero experience with this internally. I predict that this will be THE major stumbling block with the new architecture. Yes, I know, the whole bloody industry is moving to MSI-based installs, but that doesn't mean we've been keeping up. We haven't. I have a buddy who works full time as an application install writer, so I know how complex it can get. This is a skillset that large organizations really need, and we really don't have it.

There is also internal disgruntlement about the abandonment of eDir as the policy repository, but I understand the sound business reasons for it. The loss of the ability to have NetWare be the file source for application installs is also causing grumbling, as that's the infrastructure that's well built for large-scale deployments. The fact that we need yet another server for this is causing grumbling too, though in the end we'll be using the server that was slated to be the real server for Zen Asset Inventory, as it's beefy enough to handle it (we think).

And then the imaging question... we've not been using Zen's imaging because we've been using another product. However, that other product plays hob with Workstation Import, and from the sounds of it that'll still be a problem with ZenCM. So, no device-based policies for us.

All in all, we'll end up going with it because 1) it's free to us, and 2) we've always been using it. That said, there will be a push to use the built-in AD tools as they are perceived to be less clunky. No agent to install, that sort of thing.

Measuring sysadmin productivity

| 1 Comment
There was another thread on Slashdot today that caught my attention:

http://ask.slashdot.org/askslashdot/07/08/25/1753220.shtml

The asker asked:
RailGunSally writes "I am a (strictly technical) member of a large *nix systems admin team at a Fortune 150. Our new IT Management Overlord is a hardcore bean-counter from hell. We in the trenches have been tasked with providing 'metrics' on absolutely everything from system utilization to paper clip recycling. Of course, measuring productivity is right up there at the top of the list. We're stumped as to a definition of the basic unit of productivity for a *nix admin. There is a school of thought in our group that holds that if the PHBs are simple enough to want to operate purely from pie charts and spreadsheets, then we should just graph some output from /dev/random and have done with it. I personally love the idea, but I feel the need for due diligence, so I put the question to the Slashdot community: How does one reasonably quantify admin productivity?"
I don't have a "bean-counter from hell" boss, but this is a topic I've spent a bit of time thinking about, starting at my last job. How do you measure the productivity of a sysadmin? The question at my previous job was how you determine which employee holds more value than another. This is not an easy thing.

Productivity at its most abstract is the rate at which an employee adds value to an organization. The tricky part is determining how to measure that rate and the value itself. In manufacturing it is easier, as 'widgets-per-hour' is generally OK. IBM and Microsoft attempted to do this to programming back in the development phase for OS/2, with the infamous "KLOC", or "thousand lines of code."

System Administration is something that doesn't lend itself well to such quantification. A significant part of our job is quite literally, fire-watch; do nothing until something breaks and then spring into action to contain and correct the damage. While we're waiting for something to break, we're also working on projects to get new or upgraded systems online.

What I have seen done is being required to account for every minute of my day. Every moment has to be chargeable against something: a project, a department, or some other time-tracking category. It is also my experience that such managers take a dim view of entries such as these:

9:50-10:00 Bathroom
11:45-12:00 Time-sheet entry
15:45-16:00 Time-sheet entry

The questioner asked, "what is the basic unit of productivity for a *nix admin?"

I could come up with a funny name for this fictional unit, but in essence there isn't one. To fully quantify an admin's productivity requires fully quantified metrics for:
  • The impact of server and service downtime.
  • The value gained from meetings.
  • The seasonal variations in business (in our case, when are classes in session? When are finals? When do grades need to be reported? When are parents on campus? Things like that.)
  • Bureaucratic friction (how much 'process' is required to get things done?)
I have yet to run into a business where the above are fully quantified. Only with knowledge of all of the above could you determine the productivity of any single cog in the whole mechanism.

Trying to reduce the complexity of the problem to certain 'proxy' metrics, metrics that are easy to track but tend to mirror the much more complex underlying measure, is the method of choice in these circumstances. Yet which proxy metric will do? Trouble-tickets resolved per week is one, but it overlooks the differing complexity of trouble-tickets (a misplaced file versus installing BlackBoard 9.4). Projects completed is another, but as with trouble-tickets the complexity of projects differs, and projects can be canned from on high without notice.
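To put numbers on that: here's a raw tickets-closed count versus a complexity-weighted one, in a Python sketch. The tickets and the 1-10 weights are entirely made up for illustration.

```python
# Hypothetical week of tickets: (description, complexity weight on a 1-10 scale).
# A misplaced file is trivial; a BlackBoard install is a week of work.
alice = [("misplaced file", 1), ("misplaced file", 1),
         ("password reset", 1), ("printer queue stuck", 2)]
bob = [("install BlackBoard 9.4", 10)]

def raw_count(tickets):
    """The naive proxy metric: tickets closed, full stop."""
    return len(tickets)

def weighted_score(tickets):
    """A slightly less naive proxy: sum of complexity weights."""
    return sum(weight for _, weight in tickets)

# By raw count Alice looks 4x as productive; by weight Bob comes out ahead.
print(raw_count(alice), raw_count(bob))            # 4 1
print(weighted_score(alice), weighted_score(bob))  # 5 10
```

Of course, now someone has to assign the weights, which is itself a squishy judgment call, and gameable.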

It is for reasons like this that Unions really like seniority. It is a simple supposition:

IF (timeAtCompany($NAME)) > (timeAtCompany($OTHERNAME)) THEN moreValuable($NAME)

Plus, it is hard for managers to game. Time of service is easy!

Yet every single tech worker I've spoken with hates this system, because we've all seen its flaw. If you've spent any amount of time at a company with more than 4 IT workers, at least one of them will be not very good, just marking time until retirement, or there for some reason other than doing a good job. These people tend to have a lot of years of service, so they are hard to get rid of. Just because you've been at a company in one general role is no guarantee of increased knowledge, skill, or value.
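The union supposition, and its flaw, in a few lines of Python. The roster and the 'value' scores are of course invented.

```python
# Toy roster: years of service vs. a (hypothetical) measure of actual value.
staff = {
    "veteran marking time": {"years": 22, "value": 2},
    "solid mid-career":     {"years": 8,  "value": 7},
    "sharp newcomer":       {"years": 2,  "value": 9},
}

# The seniority rule: whoever has more time at the company is more valuable.
by_seniority = max(staff, key=lambda name: staff[name]["years"])
# What we'd actually want to reward, if we could ever measure it.
by_value = max(staff, key=lambda name: staff[name]["value"])

print(by_seniority)  # veteran marking time
print(by_value)      # sharp newcomer
```

Seniority is trivially computable and hard to game, which is exactly its appeal; the two rankings just don't have to agree.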

Sysadmin productivity is not something that can be measured easily. It is similar to trying to measure the productivity of a department-level Project Manager. It can be done, but it is a very squishy measurement.

Which just means we'll end up justifying every minute we're at work, and have the boss decide what productivity means through intuition.

openSUSE news

| 1 Comment
They have a new news-portal site, which is nifty:

http://news.opensuse.org

They had a nice article today about the results of a recent desktop linux survey. Bucky had a nice bit of analysis about it too.

This is nice to see. As I've mentioned before, I'm using openSUSE 10.2 at work (and right now). I can't use SLED because we're not entitled to it, nor will my boss pay for it. That said, I probably could do what a co-worker is doing and run SLES10sp1 instead ;). OpenSUSE now has 10.3 beta 2 out, which I'm not going to test quite yet as I don't have a test system for it; I wish I did have one.

One of the nice things that'll be in 10.3 (or rather, not in) is they're doing away with zmd for updates, and using a libzypp based process instead. This cheers me, as I've had a lot of trouble with the zmd one. It's better than it was in 10.0/1, but still not good.

Also, of course, openSUSE code forms the basis for what'll eventually become SLED.

Politics of passwords

| 1 Comment
It has been a common theme here for some years now to increase our password security. Two years ago (or was it three?) we rolled out Universal Passwords in an effort to gain more flexibility in the passwords we support. We've had password sync between Novell, AD, and Solaris for y-e-a-r-s, so our users have grown used to single sign-on. I've talked a couple of times about password complexity and how it works in a multiple system environment, twice in October (the 16th, and 17th). I've even talked about why Novell had to resort to Universal Passwords, because the NDS Password was too secure.

When we got our new Vice Provost, we got a person who wasn't familiar with the history of our organization. These sorts of things are always a mixed blessing. In this case, he wanted to get password aging going. The previous incumbent had considered it, but the project was on perma-hold while he worked certain political issues. The new guy managed to make a convincing argument to the University leadership, and the fiat to do password aging came down from the very top. And So It Shall Be. And Is. As with our existing password sync systems, this is a system we built from internal components and uses no Novell IDM stuff at all. It works for us.

Yesterday we got asked to make certain that the Novell password was case-sensitive.

I thought it already was, since Universal Passwords are case-sensitive. But testing showed that you could set a mixed-case password on an account and still log in to Novell with the all-lower-case version. It won't allow workstation login on domained PCs, as the AD password is mixed-case. Students who only ever log in through web services sometimes got a shock when using a lab for the first time and the password they'd been using for months didn't work.

There are two things working against us here.
  1. We did NOT set the "NMAS Authentication = On" setting in the Client we push. This means that while we are setting a universal password, none of our Novell Clients have been told to use them.
  2. LDAP logins to eDir 8.7.3 use the NDS password by default, and those are caseless. This means that anything using an LDAP bind, which is all of our web sites that require authentication, will have a caseless password.
We're fixing the first through a registry setting we'll be pushing out. The second is much harder, as it'll require either turning off NDS passwords, or upgrading to eDir 8.8 where the LDAP server can be configured to use Universal Passwords first by default.
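For illustration only, the two comparison rules side by side in Python. This is not how eDir stores or checks anything internally; it just shows why a web (LDAP/NDS) login succeeded where a lab (Universal Password) login failed.

```python
def nds_check(stored, attempt):
    """NDS-password style: case is ignored (illustrative only)."""
    return stored.lower() == attempt.lower()

def universal_check(stored, attempt):
    """Universal Password style: case must match exactly."""
    return stored == attempt

stored = "MixedCase7"
print(nds_check(stored, "mixedcase7"))        # True  -- the web login "works"
print(universal_check(stored, "mixedcase7"))  # False -- the lab workstation rejects it
```

Same account, same typed password, two different answers, which is exactly the shock our web-only students were getting.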

Looking at what would break if we turn NDS passwords off, I got a large list. We have some older servers in the tree (NetWare 6.0, and one lone NetWare 5.1 out there), and some console utilities would just plain break. Plus, at least one of us is still using ArcServe of an unknown version, and I have zero clue whether that would break if we remove NDS passwords (I'm guessing so, but I have no proof). Also, all older clients, such as the DOS boot disks used by our desktop group for imaging and any lingering Win9x we have out there, would break. Not Good.

The list of what'll break if we go to eDir 8.8 is shorter. As that allows the continued setting of the NDS Password, the amount of broken things out there is reduced. We'll have to put a specific dsrepair.nlm on all servers in the tree, but that is easier than working around breaking things. So, we're going to go to eDir 8.8.

This is not without its own problems, as some things DO still break. That lone NetWare 5.1 server will have to go. I've been assured that it is redundant and can go, but it'll need to ACTUALLY go. The NetWare 6.0 servers should be fine, as they're all at a DS rev that'll work with 8.8. Some of the 8.7.3 servers are still at 8.7.3.0 and should get updated for safety's sake. Also, all administrative workstations need to have NICI 2.7.x installed on them in order to understand the new eDir dialect, but that's a minor detail.

We won't be able to take advantage of some of the other nifty things eDir 8.8 introduces, as we're still 95% NetWare when it comes to replica holders. Encrypted replication and multiple eDir instances will have to wait.

I HOPE to get eDir 8.8 in before classes start, as the downtime required for DIB conversion is not trivial, and the first 4 weeks of class are always pretty hard on the DS servers due to updates.

Events while I was gone

We did some work with the UPS/Generator transfer switch. That caused a spew of SMS messages that I got out in the boonies.

Also, Novell has released the Novell Vista Client. This is the full release, not a beta. It even is fully localized!

The end of classes is nigh, so we're gearing up for the mass of upgrades that'll be going in while we have no classes being taught.
  • Get a new VCS version on the EVA
  • Upgrade BlackBoard to a newer rev
  • Move BlackBoard back-end database to SQL2005
  • BIQuery upgrade
  • Banner work
  • A lot of disaster-recovery testing and setup
Gonna be a busy few weeks, there.

Updates

We've finally moved the MSA up to Bond Hall. I've spent the better part of the last two days undressing and redressing racks as a result of this. It has been interesting.

One of the things we learned is that it would be a really good idea to invest in power cables of custom length, especially since we have the nice power strips up there in BH. Four-footers would do in most cases, with a minimum of cable to coil and stow. What we have instead are longer cables, which leads to black power cables crammed everywhere there is space, and that complicates chasing them when the time comes.

We have four racks over in Bond Hall in which to stuff things. Because of this, those are some of our densest racks.

The fibre channel bridged the distance just peachy. I don't know what the total distance is, but it is well under 10km so we could get both the fibre switch in BH and the fibre switch up here talking. Servers up here can see the MSA and talk to it at full speed. And do so without incurring any extra Brocade licensing costs to make it all work. That was nice to see.

Right now we're doing some tests on backup-to-disk to the MSA. The performance is... underwhelming. I'm not yet sure if the MSA is the bottleneck, or the speed of data coming from the backup sources. I do know that the MSA is all too easy to saturate for I/O, and it wouldn't surprise me in the least if B2D pegs the needle. Early indications are that the MSA is a welcome addition to our existing backup strategy, but it won't come close to replacing tape.
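One crude way to separate the two suspects is to bypass the backup software entirely and time a raw sequential write to the B2D target. A Python sketch; the path and sizes here are placeholders, and a real probe should write far more data than any cache in the path can absorb.

```python
import os
import time

def write_throughput(path, total_mb=256, chunk_mb=4):
    """Time a sequential write to `path` and return the rate in MB/s.

    If this number is well below what the backup sources can feed,
    the target (e.g. a B2D volume on the MSA) is the bottleneck.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    start = time.monotonic()
    with open(path, "wb") as f:
        for _ in range(total_mb // chunk_mb):
            f.write(chunk)
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually hit the array
    elapsed = time.monotonic() - start
    return total_mb / elapsed
```

Run the same probe against local disk and against the MSA-hosted volume; if the MSA number is the low one, no amount of tuning on the source side will help.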

And finally, I'll be on vacation all next week. So there won't be any updates until the week of the 21st.

NW65SP6 is in

I put it into WUF on Tuesday night. It went well enough, though not hitch-free. The third server I did hung hard on a PORTAL.NLM error, so I ended up having to backrev that NLM to the SP5 version (and yes, I did use the post-SP6 PORTAL.NLM; it was the one that hung). I also managed to get all the fibre cards flashed to a new BIOS to support an EVA maintenance we'll be doing during intersession.

I must say that the Novell Wiki on NW65SP6 is a fine and wonderful thing. This is a page that the Novell Support Forum sysops keep up to date with their distilled knowledge. Once upon a time the NetWare minimum patch list was useful for this, but that's fallen by the wayside in recent years. This wiki page includes abends that people have run into, the patches that fix known problems with SP6 (biiiig problems with NDPS, by the way), and suggestions learned through other people's hard experience. I recommend it.