May 2005 Archives

The future of NetWare, my opinion

We've heard from a number of sources that there will be a NetWare 7. I trust that this will be the case for a number of reasons, chief among them being that Open Enterprise Server 2.0 will need it. So we will have at least one more full-rev of the NetWare kernel. Yay!

Since we will have at least another rev in the 18-24 month timeframe, this opens the door for software vendors to decide to jump ship. I strongly suspect that the Veritases, Computer Associates, and Legatos of the world will support NetWare 7 thanks to a strong installed base. So this keeps us in a NetWare kernel for at least the next 2-5 years.

But NW7 is the make-or-break point of NetWare as a viable kernel. NetWare is hard to develop against, even with all the POSIX improvements they keep putting in. Once the backup vendors stop supporting NetWare as a backup-engine the writing will be on the wall.

If OES 2.0 manages to solve the interoperability problems between NW and Linux (and by that I mean differences of install, look and function of management tools, and the like), which needs doing anyway, it'll make it even harder for the software vendors to justify continuing development towards NetWare. One of the biggest reasons that someone would choose NW over the Linux kernel in a brand new OES install would be file-serving performance; should Novell manage to make Linux perform as well as NetWare in that regard, dooooooooom.

But I suspect that NetWare will remain in some form for a number of years. NetWare is a kernel that was designed from the bolts up to be a file-server, and that shows in the performance it cranks out. As an iSCSI back-end, it's an off-the-shelf OS that works great. So NetWare in this form may last as long as a v8.

But if the file-serving performance gap between hardware-identical NetWare and Linux servers ever closes to within a percentage point, the sheer market-share of Linux will drive companies to migrate fileserving to that half of the OES platform. We're not there yet, and we're still 3-5 years away from getting there.

Final Words:

NetWare as a kernel has at least 5 years of commercial viability ahead of it. At that point, market forces may very well force another round of NetWare -> Linux conversions. But we won't know until we get there.

More fruit

Space data. Gotta love it.

Sample chart

That went well

Remember what I was talking about yesterday? About converting my volume-space tracker to use a database back-end? Well, I did that yesterday, and it worked. It took about three hours to:
  • Reactivate the commented-out database bits in the script
  • Convert the database bits to use dbi:odbc instead of dbi:oracle
  • Handle a lookup for a volume-name/server-name pair
  • Generate a new unique-ID for a new volume-name/server-name pair
  • Get the database table created, indexed, and linked appropriately
I was very happy to see that the problems I was having with the directory quotas didn't happen here. Perhaps it has something to do with only INSERTing 15 rows at a whack instead of 23,000. Who knows. Whatever it was, this works for the scale I'm at.

The thing I worked on this morning was importing the existing CSV file the script has been dumping to, and getting it into the table. That took about 45 minutes to do the required transformations in Perl. I couldn't use the native import tools since I ran smack into some data-type issues, and the CSV file has a servername/volumename pair instead of the uniqueID. Once I got those bugs worked out, it imported just peachy.
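
For anyone facing a similar import, here is a rough Python sketch of that kind of transformation (my actual script was Perl and isn't shown here). The column order, the timestamp format, and the `volume_ids` mapping are all assumptions for illustration, not the real file layout.

```python
import csv
import io
from datetime import datetime

def transform_rows(csv_text, volume_ids):
    """Turn CSV rows keyed by server/volume into rows keyed by a numeric ID.

    Assumed input columns: timestamp, server, volume, total, free, used.
    """
    rows = []
    for stamp, server, volume, total, free, used in csv.reader(io.StringIO(csv_text)):
        # Normalise the timestamp so the database sees one unambiguous format.
        when = datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")
        # Replace the server/volume pair with its ID from the lookup table.
        vol_id = volume_ids[(server.lower(), volume.upper())]
        # Cast the numeric columns explicitly to avoid data-type surprises.
        rows.append((when.isoformat(sep=" "), vol_id, int(total), int(free), int(used)))
    return rows

# Hypothetical sample data, not taken from the real tracker.
sample = '"5/10/2005 11:28:54","wuf-stu2","STU2",1000,400,600\n'
ids = {("wuf-stu2", "STU2"): 1}
print(transform_rows(sample, ids))
```

Doing the casts and the ID lookup in the script, rather than in the database's import tool, is what sidesteps the data-type complaints.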

Once this all hits production, my boss is going to be s-o-o-o happy. He loves data like this. He has already used the volume-space tracker to justify several thousand dollars worth of infrastructure upgrades to his boss. Yay!


That _admin directory is a bounty of goodness. I've managed to get my directory-quota tracker mostly done; now it's getting minor tweaks as I figure out the best way to run reports and how to gather the data itself. Since this data doesn't change much from day-to-day, it'll be a week or three before we get enough data to do usability thingies with it.

Meanwhile, there are a couple of other projects I can set my mind to. Tracking of Trustee assignments would be nifty from an auditing and documentation point of view. Re-rigging my disk-space tracker to use the database instead of a CSV file would be a perfectly useful thing to do, especially since it was originally developed with a DB as a back-end anyway.

However, I'm running into a perl issue. The disk-space tracker was written in perl and works great from there. Worked great at OldJob, and is probably still working great as I type; it has data back to 1999 in that table. For some reason, the Perl DBI modules for SQL-server aren't working well for me.

In order to improve space usage in the database and to better link the various tables together, I have a central lookup table.




Since the key bit of data all that stuff in _admin lacks is what server and volume we're talking about, I need to provide that link when I drop stuff in the database. The field-list for the volume disk-space tracker I have looks like this:

DateTime, VolServID(FK: UniqueID), TotSpace, FreeSpace, UsedSpace

What this means is that the data-gathering script needs to figure out what UniqueID to use for the Server/Volume combination it just queried. This is part of where my problem seems to lie. For reasons unknown to me, the find-unique-ID part works great, but the INSERT to the freespace table freezes as I bind parameters. And the freeze happens in the DBI code itself, not my code, and it isn't giving me useful trace info. This is why I ended up using C# for the quota-tracker, by the way. I don't want to port this script into C# since that means learning how to do SNMP in C#, and I sense pain ahead.
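
To make the moving parts concrete, here is a minimal Python sketch of the lookup-then-INSERT flow. It uses an in-memory SQLite database as a stand-in for the real SQL Server back-end; the table names `volserv` and `freespace` and the `ServerName`/`VolumeName` columns are made up for the example, and only the freespace field list comes from above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the real SQL Server database
conn.executescript("""
    CREATE TABLE volserv (
        UniqueID   INTEGER PRIMARY KEY,
        ServerName TEXT NOT NULL,
        VolumeName TEXT NOT NULL,
        UNIQUE (ServerName, VolumeName)
    );
    CREATE TABLE freespace (
        DateTime  TEXT NOT NULL,
        VolServID INTEGER NOT NULL REFERENCES volserv (UniqueID),
        TotSpace  INTEGER, FreeSpace INTEGER, UsedSpace INTEGER
    );
""")

def unique_id(server, volume):
    """Return the UniqueID for a server/volume pair, creating it if needed."""
    row = conn.execute(
        "SELECT UniqueID FROM volserv WHERE ServerName = ? AND VolumeName = ?",
        (server, volume)).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO volserv (ServerName, VolumeName) VALUES (?, ?)",
        (server, volume))
    return cur.lastrowid

def record_sample(stamp, server, volume, total, free):
    """Insert one measurement using bound parameters."""
    conn.execute(
        "INSERT INTO freespace VALUES (?, ?, ?, ?, ?)",
        (stamp, unique_id(server, volume), total, free, total - free))

# Hypothetical measurements for illustration only.
record_sample("2005-05-20 08:00:00", "wuf-stu2", "STU2", 1000, 400)
record_sample("2005-05-21 08:00:00", "wuf-stu2", "STU2", 1000, 350)
print(conn.execute("SELECT * FROM freespace").fetchall())
```

The point of the lookup table is that the wide server and volume names are stored once, and every measurement row carries only a small integer key.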

ZEN Security hole

FrSIRT posted it, which is where I found it.

They link to the details.

In short, there are a couple of reliable overflows in ZenRem32.EXE that will allow server and workstation compromise. This executable is part of the Remote Desktop portion. According to the advisories, this may also be included in servers with Zen for Servers installed on them. The default port for this product is TCP/1761 and UDP/1761, though it can be configured to use a different port.

This is a critical flaw for us in .edu land, where firewalls are scarce on the ground.

Hidden features

This one snuck past all of us:


That was in a winsock patch readme. Very useful things, there. Especially since NetWare doesn't like using DNS resolvers most of the time.

Code monkey

The reason there haven't been any posts lately is due to two factors: things being relatively quiet at work, and me banging on a programming problem. The previous post shows the beginning of it.

What I was trying to do was to parse in the XML contained in "dirinfo.xml" on the _admin volume for specific servers, and dump the data into a database. I had the perl working for two days, but somewhere I irretrievably broke it in a way that resisted all efforts of finding the problem. Setting the DBI to max-debug, I noticed that the hang happened in the middle of two bind_param requests. Since the perl debugger won't step into calls like "$sql->bind_param(1, holyGroats);" I couldn't tell where the problem was dying. I ultimately gave up.

So I've been spending the last three days attempting the same thing in C#.Net instead. This was done under the presumption that if I went end-to-end with Microsoft products, it might work better. So far, that's holding true. Right now I have C# code that'll duplicate what I was doing with the perl script. This code is about twice as long as the perl code thanks to the level of exception-handling I have to throw in, and having to work in a strongly-typed language instead of a loosely-typed language.

The biggest stumbling block was figuring out how to parse in the XML and manipulate it. In perl this was very easy, and took me about 35 minutes to figure out. In C#, XML parsing is about as complex as, say, Perl regex syntax, and about as powerful. In other words, I was drowning in options. It took two days to figure out how to get at the data I need, and I'm very sure that it isn't done 'correctly' for what I am doing. If Novell ever changes the format of this XML file this program will break hard. The fact that object-oriented programming makes my brain leak out my ears didn't help any either.

I don't program for a living, and it shows.

Now to start commenting like a madman.

Programming side-effects

Once I get into a programming mode, I have to remember to surface twice an hour to check e-mail and other things. Time really flies when I'm solving problems. Happily, these succumbed in only a few days.

I've managed to get my perl script to upload data into the MS-SQL database I want it to. And I've spent the last few hours hammering Access to give it the reports I need. This is subtly different from the perl/Oracle/Crystal combination I had at my last job, and I'm having to learn a few new dialects. But it is bearing fruit.

I now have a report that'll dump out all users who are at or above 90% disk-quota utilization. Not handy for me, but I know a few groups of people who'd find it very useful. One of the interesting side-effects of looking at directories and not users is that "orphaned" directories, where the user has been deleted but the user-dir hasn't, show up in the list. In fact, number 11 on this list is an ex-employee who was high enough in the scheme of things that deleting their crap may cause unfortunate side-effects.
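
The report itself lives in Access, but the logic behind it is simple enough to sketch in a few lines of Python. The directory names and byte counts below are hypothetical, and the function name is mine.

```python
def over_threshold(dirs, threshold=0.90):
    """Return (name, utilization) pairs at or above the threshold, highest first."""
    flagged = []
    for name, quota, used in dirs:
        if quota <= 0:
            continue  # no quota set, so utilization is undefined
        ratio = used / quota
        if ratio >= threshold:
            flagged.append((name, ratio))
    return sorted(flagged, key=lambda item: item[1], reverse=True)

# Hypothetical directories: (name, quota in bytes, bytes used).
dirs = [("alice", 1000, 950), ("bob", 1000, 100), ("carol", 2000, 1800), ("dave", 0, 50)]
for name, ratio in over_threshold(dirs):
    print(f"{name}: {ratio:.0%}")
```

Because the input is keyed on directory rather than user object, a directory whose owner was deleted still gets flagged like any other.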

Directory quota parsing

After hammering on it for a day or two I managed to get my perl skilz to give me a program that'll parse the DirInfo.XML file in the _admin area. This gives quite handy output.

"5/10/2005 11:28:54","wuf-stu2","STU2","tstts2",536870912,4096

The CSV file that contains all of the student-side directory quotas is about 1.2MB in size. A lot of data there, but useful data. This can easily be extended to FacStaff-side user directory quotas, and FacStaff-side shared directory quotas.
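
Reading a line like the sample back in is straightforward. Here is a small Python sketch; the field names are my guesses for this example (timestamp, server, volume, directory, quota in bytes, bytes used), since the columns aren't labeled in the file.

```python
import csv
from collections import namedtuple

# Field names are guesses; the CSV itself carries no header row.
QuotaRow = namedtuple("QuotaRow", "stamp server volume directory quota used")

def parse_line(line):
    """Parse one CSV line into a QuotaRow with integer byte counts."""
    stamp, server, volume, directory, quota, used = next(csv.reader([line]))
    return QuotaRow(stamp, server, volume, directory, int(quota), int(used))

row = parse_line('"5/10/2005 11:28:54","wuf-stu2","STU2","tstts2",536870912,4096')
print(row.directory, row.quota // 2**20, "MiB quota")
```

Under that reading, the sample directory has a 512 MiB quota with 4 KiB in use.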

It's in CSV just to prove I can get the data into the format I need. I hope to dump this to a SQL-ized database of some form in order to make queries of what'll be a large dataset run quickly. With logs this size, it won't take long before things get really big.

Neat reporting tricks

I've played around with the stuff on _admin a few times in the past, but not seriously since the NW65 update. And I have to say that they've really improved things. You can extract a lot of information from there.

All this data comes from the _ADMIN directory that all NetWare servers have if NSS is running. This is where more and more OS and filesystem meta-data is being stashed. NW5 (or was it 5.1?) introduced NSS and the _ADMIN directory, and not much was in there back then. NW6 had more interesting data, and NW65 has even more detailed data. This behaves sort of like the /proc filesystem on Linux, but not quite.

The thing that grabbed my attention most recently is the per-volume NSS data. The path of:

Gives you six files that hold meta-data for that volume.
  • DirInfo.xml: Directory quota information
  • FileEvents.xml: Not quite sure
  • ModifiedFilesList.xml: If you've turned on the MFL for your volume, this file keeps that list
  • TrusteeInfo.xml: The list of all trustees on a volume. Useful!
  • UserInfo.xml: Per-user disk-quota and disk usages, if user-quotas are turned on
  • VolumeInfo.xml: Misc statistics for the volume

Each of those files is in reality a pre-built query that executes when you open the file and read the contents. Novell calls this the Virtual File Services.

It was the DirInfo.xml file that got me. We use directory quotas to manage disk-space, so on volumes like user home-directory volumes all root-level directories have a quota on them. It gives output like this:


Which, as you can see, can be quite useful. Especially if we grab it once a day, throw it in a database, and track how each directory grows over time. Very useful indeed.
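
As a sketch of what "track how each directory grows over time" could mean once the daily grabs are stored, here is a bit of Python that compares the first and last sample per directory. The snapshot data is hypothetical and the function name is mine.

```python
from collections import defaultdict

def growth_per_directory(samples):
    """Return bytes grown between the first and last sample of each directory.

    samples: iterable of (date, directory, used_bytes), with ISO dates so
    that sorting the strings sorts chronologically.
    """
    history = defaultdict(list)
    for date, directory, used in sorted(samples):
        history[directory].append(used)
    return {d: values[-1] - values[0] for d, values in history.items()}

# Hypothetical daily snapshots.
samples = [
    ("2005-05-10", "tstts2", 4096),
    ("2005-05-11", "tstts2", 8192),
    ("2005-05-12", "tstts2", 20480),
    ("2005-05-10", "other", 1000),
    ("2005-05-12", "other", 900),
]
print(growth_per_directory(samples))
```

A negative number simply means the directory shrank over the period.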

Victory on NDPS?

I just got shipped official fix-code for the printer-pooling bugs we've been facing since September. The debug build I had been running for the last few weeks was pretty stable, though a couple of breakpoints were left in certain unusual places that caused the server to dump to the debugger suddenly. Oops. This build isn't a debug build and is a lot smaller as a result. I don't expect to have any problems with this one.

Subversive stickers

A hold-over from BrainShare.

Linux Desktop Decal Kit

They had these at the Upgrade Depot at BrainShare, and completely ran out. And got a lot of comments from people wanting them. Now, you can have your own!

An obscure request

A friend is looking for a certain someone to help convince his management that people like that really exist.
I'd like to find at least one good Unix developer, who can design and implement software that meets my (apparently high) quality standards. Expertise in TCP/IP networking is required. Familiarity with embedded system/appliance kinds of products would be helpful, as would experience with network protocol implementation, IPsec, TLS, secure configuration of Unix, and kernel development. Familiarity with FreeBSD would also be plus. If you are such a person, or know such a person, send me mail.
Or just leave a comment! I'll forward you on.

More yummy stats.

The following numbers are from the Student side of our cluster. These are the NSS filesystem stats, colors are mine.
*****  Buffer Cache Statistics *****
Min cache buffers: 512
Num hash buckets: 262144
Min OS free cache buffers: 256
Num cache pages allocated: 140470
Cache hit percentage: 95%
Cache hit: 45220557
Cache miss: 2235514
Cache hit percentage(user): 93%
Cache hit(user): 30771799
Cache miss(user): 2160008
Cache hit percentage(sys): 99%
Cache hit(sys): 14448758
Cache miss(sys): 75506
Percent of buckets used: 39%
Max entries in a bucket: 8
Total entries: 138424
The line that worries me is the "cache hit percentage" sitting at 95%. The student side has a bit over 1TB on it, and these stats were captured during 'real usage' times when no backups were being run and real users were accessing things. So I have high confidence these are real. So we need to throw some more memory into these servers.
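
For the curious, the percentages can be re-derived from the raw hit and miss counters. This Python sketch assumes the hit percentage is simply hits divided by total lookups; I don't know for certain that NSS computes it this way, but the numbers line up.

```python
def hit_percentage(hits, misses):
    """Cache hit rate as a percentage of all lookups."""
    return 100.0 * hits / (hits + misses)

# Counters copied from the statistics above.
stats = {
    "overall": (45220557, 2235514),
    "user": (30771799, 2160008),
    "sys": (14448758, 75506),
}
for name, (hits, misses) in stats.items():
    print(f"{name}: {hit_percentage(hits, misses):.2f}%")
```

The results come out at roughly 95.3%, 93.4%, and 99.5%, matching the reported 95%, 93%, and 99% once the fraction is dropped. The user and sys counters also add up exactly to the overall hit and miss totals, so it's the user-side misses that drag the overall figure down.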

What kind of side-effects does this present as? Since not as much data is being cached as we need, this means that the server is going to disk more often than it otherwise should. This will present as slower file accesses. Not beastly slow, since this is FC-backed not SCSI-backed, but slower than it could be.

Yummy stats

April has finished, and now we have yummy stats to play with! The continuing trend of MyWeb-Students dominating the two has gone another month. No surprise there. One little itty bitty surprise is how much of the total traffic out of myweb are media files of some form. So far nothing needing a DMCA notice.

MyWeb--Students (April):
Top 3 file-types by bytes:
  1. WMV @ 27.49%
  2. MP3 @ 16.44%
  3. JPG @ 16.31%
Top 3 file-types by hits:
  1. JPG @ 48.31%
  2. html @ 18.16%
  3. GIF @ 13.42%
Top 3 requested files, by bytes:
  1. /ferrym/videos/oregoonshootout3.wmv 6.03GB, 7.47%
  2. /ferrym/videos/pisshippie.avi 3.74GB, 4.63%
  3. /~hullp/Dan%20and%20Amy%20Wedding.wmv 3.34GB, 4.13%
Top 3 requested files, by hits:
  1. /~castonn/more%20car%20pics/sig32.jpg @ 3.48%
  2. /%7Egriffi4/karahotty.wma @ 2.02%
  3. /ferrym/pictures/jesus/jesus01.jpg @ 0.99%
MyWeb--FacStaff (April):
Top 3 file-types, by bytes:
  1. PDF @ 22.40%
  2. MHT @ 13.41%
  3. MOV @ 13.05%
Top 3 file-types, by hits:
  1. html @ 49.76%
  2. gif @ 11.92%
  3. jpg @ 8.90%
Top 3 requested files, by bytes:
  1. /singlem/images/katie_3/Photo%20Album1.mht 174.71MB, 13.42%
  2. /~bowkerb/ATUS/Utilities/ServicePacks/W2KSP4.EXE 129.20MB, 9.30%
  3. /~nesheij/movies/ 99.92MB, 7.67%
Top 3 requested files, by hits:
  1. /~riedesg/sysadmin1138/atom.xml @ 7.51%
  2. /~riedesg/sysadmin1138/banner_blogwise.gif @ 1.85%
  3. /~walkers/wwwmfgiscom.css @ 1.84%