July 2006 Archives

NW65SP5 rolls on

Today it hit the NDS servers, tomorrow... WUF!


NW65SP5 incoming

In the next week I intend to put SP5 for NetWare 6.5 onto our servers. Since SP5 came out in January, things have progressed quite a bit in the realm of post-SP5 patches. So here is the list of patches I'll be applying:

SERVER.EXE update that fixes a lot of memory allocator problems. Highly recommended.
Post-SP5 NSS update that is in 'public' release. Includes a fix for an AFPTCP abend we've had a time or two.
Security fix.
Fix of 'Salvage' bug introduced by SP5. Highly recommended.
Security fix.
Specific to the server brand I'm running. Fixes some memory problems. Hey, anything helps.
Many fixes to NDS/iPrint. NDPSM is the big one for us; several of the listed fixes address problems we are seeing.
It's a LibC fix. How could I NOT apply it?
Post SP5 TSA fixes. Anything that improves backup speeds.
Fixes bug introduced in SP5 in NetStorage.
Fixes NetStorage WebDAV problems, and a memory leak in interrupted uploads/downloads in NetStorage.
Security fix in iPrint.

The patches in red are the ones I recommend for everyone. Everything else I'm applying due to policies here, and demonstrated need. Everyone probably doesn't need the LibC patch, but I absolutely do, so it's in there. The NetStorage patches are because I use that product, which not everyone does. The httpstk and nile patches are because all of my servers face the public, and I need to be able to defend against such malicious SSL trickery.

With all of the above, SP5 is looking pretty solid so far. In the support forums, at least. We'll see how well it works around here. We've already pushed the eDir update, so departments who are not me have already put SP5 into production here at WWU, and I haven't heard any problems yet.


Still not quite fixed!

Just had this error happen when publishing the last blog post:

Failed to create identity for [username] on local server. rc: 116, errno 7, h_errno: 10053, clienterrno: 10053

At least OpenSSH is now giving full error details. The fact this is happening at all shows that the libc problem with accessing remote servers isn't fixed yet.

rc: 116              ENCP generic NCP error, see 'h_errno'
errno 7              not useful
h_errno: 10053       Winsock error, connection aborted
clienterrno: 10053   Winsock error
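
The breakdown above can be turned into a small lookup for the next time this error shows up. This is only a minimal Python sketch, not anything that ships with OpenSSH or LibC: the function name `decode_identity_error` and the `MEANINGS` table are made up for illustration, and the table holds only the meanings listed above.

```python
import re

# Meanings copied from the breakdown above; anything else is reported as "unknown".
MEANINGS = {
    ("rc", 116): "ENCP generic NCP error, see 'h_errno'",
    ("errno", 7): "not useful",
    ("h_errno", 10053): "Winsock error, connection aborted",
    ("clienterrno", 10053): "Winsock error",
}

def decode_identity_error(line):
    """Return (field, code, meaning) tuples for each error field found in the line."""
    results = []
    # Matches 'rc: 116', 'errno 7', 'h_errno: 10053' and 'clienterrno: 10053'.
    pattern = r"\b(rc|errno|h_errno|clienterrno):?\s+(\d+)"
    for field, value in re.findall(pattern, line):
        code = int(value)
        results.append((field, code, MEANINGS.get((field, code), "unknown")))
    return results

if __name__ == "__main__":
    sample = ("Failed to create identity for [username] on local server. "
              "rc: 116, errno 7, h_errno: 10053, clienterrno: 10053")
    for field, code, meaning in decode_identity_error(sample):
        print(f"{field} {code}: {meaning}")
```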

Happily, I see that there is a new LibC out there that was released while I was on vacation. You can find it here. And encouragingly, the TID says:
- Fixed a connection problem commonly seen with Apache and remote home directories where a 111 or Winsock 10053 error would be returned. We now will purge the cached connection information and start over. BUG142979.
I'm not getting the 111 error, but I'm absolutely getting the Winsock error. I think I'll just integrate this into my SP5 build.


Hardware replacement cycles

We just got the numbers for maintenance next year for our EVA. And, as I sort of suspected, Year 4 begins the painful payments. This seems to be a bit of a trend in the industry. Everyone Knows that hardware should be replaced on a 3-year cycle. But reality means that a server gets pulled back from high performance production work after 3ish years and redeployed as a mid-line or low-line server for the 2-3 years it has left.

Hardware companies know this. They have almost two decades of PC experience now, and know how the replacement cycle works. This is why 'extended maintenance' on something generally costs a heckova lot. I ran into this at OldJob as well: a certain key server was taking too long to cut over to the new hardware, so we had to extend the expensive premier support another year, and ended up taking it on the chin.

In this case, the maintenance cost for another year on the EVA is about a third the purchase cost of a new one. This is intentional, I believe. HP doesn't want to maintain old hardware longer than they have to, and therefore provides an incentive for customers to stay current. As it happens, we had been looking at ways to in-place upgrade the EVA to a newer model that can handle larger hard drives.
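
To put the "about a third" figure in perspective: at that rate, three more years of maintenance add up to the price of a new unit. A minimal Python sketch of that arithmetic follows. The price of 100 is a placeholder (the real quote isn't given here), "about a third" is modeled as exactly 1/3, and the function name is made up for illustration.

```python
from fractions import Fraction

def years_until_maintenance_matches_purchase(purchase_cost, yearly_fraction):
    """Count whole years of maintenance until the running total reaches the purchase cost."""
    yearly = purchase_cost * yearly_fraction
    total, years = 0, 0
    while total < purchase_cost:
        total += yearly
        years += 1
    return years

if __name__ == "__main__":
    # Placeholder price; only the ratio matters for the result.
    print(years_until_maintenance_matches_purchase(100, Fraction(1, 3)))
```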

Part of the specific problem with this university is finance. Because we're a public university in Washington State, monies that can be used for financial commitments beyond a two-year horizon are devilishly hard to obtain. We're funded through a mix of state funding, alumni giving, a blizzard of grant money (most of which requires renewal each year), and other philanthropic giving. Money that can be used for payments longer than the two-year fiscal period has to come from State funds, and those by all internal accounts haven't increased bar inflationary increases (and sometimes not even those) in a very long time.

Because of our financial situation, we can't use certain instruments that other institutions and businesses use to spread out the pain. OldJob was a great fan of three-year leases (they were moving to four due to server replacement realities). We can't encumber funds from year to year as a way to save up for big purchases. We can't issue bonds. Everything except for a base few items that make it into the general budget has to be purchased with cash, or cash over two years.

This, by the way, is what scuttled the Sophos bid for our AntiVirus/AntiSpam contract. They refused to give us a bid with a two-year contract, insisting that a three-year contract was better. The three-year contract did indeed cost a lot less than anyone else's bid. But the RFP specifically said "2 year contract", Sophos did not meet that, and Sophos was dropped from consideration.

We have a lot of older ML530s (G2, I think) still in various forms of service. This is a machine with a 1GHz CPU in it, to give you an idea of how old it is. They're also really big, so only four of them fit into a rack. This makes me twitch, but they'll be leaving us in the not too distant future. I'm betting that HP will be cranking up the maintenance renewals on those to painful levels sometime in the next 12 months, which will help expedite their departures.

Not that we have the money to replace them.

Just across my desk

Our Microsoft rep just notified us that Internet Explorer 7 will be delivered by Automatic Update in "Q4 2006". This will be interesting. I haven't been following the discussions about how IE7 breaks sites, or whatnot. But now it becomes a lot more urgent. We have a few web portals that'll have to be QAed against IE7 well before that particular date.


TimeSync oops

One of the things that greeted me once I started looking at server health was that TimeSync in eDir was +1:51 from true. All the servers agreed on this, which is by design. But still... nearly two minutes? Aye!

It turns out that the server configured as Reference forgot which server it was synching against. Since it didn't have anything to sync against, it went with its local clock-chip instead. Unfortunately, clocks in servers are famously unreliable beasties, and it drifted nearly two minutes in the time I was away.

So I fixed it. I also told the two PRIMARY servers under my control to sync against the same NTP timesource in their configured-sources list, so at least the correct time will be involved in time negotiations.
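
For a quick outside check of how far a clock has wandered, a bare-bones SNTP query is enough. The following is a minimal Python sketch, run from any machine with Python rather than on the NetWare servers themselves. The function names are made up for illustration, it ignores network delay, and it is no substitute for a real NTP client.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_OFFSET = 2208988800

def parse_transmit_time(packet):
    """Extract the server's transmit timestamp from an NTP reply as Unix seconds."""
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    # Bytes 40-47 hold the transmit timestamp: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32

def clock_offset(server, timeout=5.0):
    """Return (server time - local time) in seconds, ignoring network delay."""
    # First byte 0x1b: leap indicator 0, version 3, mode 3 (client).
    request = b"\x1b" + 47 * b"\0"
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(request, (server, 123))
        reply, _ = sock.recvfrom(512)
    return parse_transmit_time(reply) - time.time()
```

Calling `clock_offset("ntp.example.edu")` (a placeholder hostname) on a machine whose clock runs 1:51 fast would return roughly -111.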

I'd like to convert to XNTP, but that'll require ALL of the servers in the tree to modify their configured sources to accommodate that change. We're not quite there yet.



I have returned. I'm now going through the vast expanse of e-mail that stacked up in my absence. But things have already moved along since I was out, and I haven't even gotten all the way through the stacked up e-mail.
  • The NovusHR update was worked while I was gone, and the folk who managed it feel good about the whole process. This is good, as this was the first test of 'in production' procedures while we're in test.
  • The SymantecAV replacement of McAfee has proceeded apace. The lack of "oh crap!" mail suggests the process has been largely error free. [and as an aside, Sophos was notified of the RFP for this upgrade process, but they never submitted a bid that fit all the requirements and were therefore eliminated as a contender]
  • We're getting another UPS for our data-center to help share the load. The first round of rack moves to accommodate the new equipment will happen tomorrow!
And I should change my voice-mail message while I'm thinking about it.