Thursday, January 28, 2010

Evolving best-practice

As of this morning, everyone's home-directory is now on the Microsoft cluster. The next Herculean task is to sort out the shared volume. And this, this is the point where past-practice runs smack into both best-practice, and common-practice.

You see, since we've been a NetWare shop since, uh, I don't know when, we have certain habits ingrained into our thinking. I've already commented on some of it, but that thinking will haunt us for some time to come.

The first item I've touched on already, and that's how you set permissions at the top of a share/volume. In the Land of NetWare, practically no one has any rights to the very top level of the volume. This runs contrary to both the Microsoft and POSIX/Unix ways of doing it, since both environments require a user to have at least read rights to that top level for anything to work at all. NetWare got around this problem by creating traverse rights based on rights granted lower down the directory structure; granting a right 4 directories deep gave an implicit 'read' to the top of the volume. Microsoft and POSIX both don't do this weirdo 'implicit' thing.
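On NTFS you end up emulating that implicit traverse with an explicit grant at the top. A minimal sketch with icacls; the path and group name are invented for illustration:

```powershell
# Grant the group just enough to list and pass through the top level.
# No (OI)/(CI) flags means "This folder only", so nothing deeper is exposed.
# RD = list folder, RA/REA = read attributes, RC = read permissions, X = traverse.
icacls D:\FacShare /grant "WWU\AllStaff:(RD,RA,REA,RC,X)"
```

This gives everyone enough access at the root to browse down to wherever their real rights start, which is the part NetWare did for free.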

The second item is the fact that Microsoft Windows allows you to declare a share pretty much anywhere, where NetWare limited the 'share' to the volume. This changed a bit when Novell introduced CIFS to NetWare, as that introduced the ability to declare a share anywhere; however, NCP networking still required root-of-volume only. At the same time, Novell also allowed a 'map root' to pretend there is a share anywhere, but it isn't conceptually the same. The side-effect of being able to declare a share anywhere is that, if you're not careful, Windows networks suffer share-proliferation to a very great extent.

In our case, past-practice has been to restrict who gets access to top-level directories, greatly limit who can create top-level directories, and generally grow more permissive/specific rights-wise the deeper you get in a directory tree. Top level is zilch, first tier of directories is probably read-only, second tier is read/write. Also, we have one (1) shared volume upon which everyone resides for ease of sharing.

Now, common-practice among Microsoft networks is something I'm not that familiar with. What I do know is that shares proliferate, and many, perhaps most, networks have the shares as the logical equivalent of what we use top-level directories for. Where we may have a structure like this, \\cluster-facshare\facshare\HumRes, Microsoft networks tend to develop structures like \\cluster-facshare\humres instead. Microsoft networks rely a lot on browsing to find resources. It is common for people to browse to \\cluster-facshare\ and look at the list of shares to get what they want. We don't do that.

One thing that really gets in the way of this model is Apple OSX. You see, the Samba version on OSX machines can't browse cluster-shares. If we had 'real' servers instead of virtual servers, this browse-to-the-resource trick would work. But since we have a non-trivial number of Macs all over the place, we have to pay attention to the fact that all a Mac sees when it browses to \\cluster-facshare\ is a whole lot of nothing. We're already running into this, and we only have our user-directories migrated so far. We have to train our Mac users to enter the share name as well. For this reason, we really need to stick to the top-level-directory model as much as possible, instead of the more commonly encountered MS model of shares. Maybe a future Mac-Samba version will fix this. But 10.6 hasn't fixed it, so we're stuck for another year or two. Or maybe until Apple shoves Samba 4 into OSX.

Since we're on a fundamentally new architecture and can't use common-practice, our sense of best-practice is still evolving. We come up with ideas. We're trying them out. Time will tell just how far our heads are up our butts, since we can't tell from here just yet. So far we're making extensive use of advanced NTFS permissions (those permissions beyond just Read, Modify, Full Control) in order to do what we need to do. Since this is a deviation from how the Windows industry does things, it is pretty easy for someone who is not completely familiar with how we do things to mess things up out of ignorance. We're doing it this way due to past-practice and all those Macs.

In 10 years I'm pretty sure we'll look a lot more like a classic Windows network than we do now. 10 years is long enough for even end-users to change how they think, and is long enough for industry-practice to erode our sense of specialness more into a compliant shape.

In the mean time, as the phone ringing off the hook today foretold, there is a LOT of learning, decision-making, and mind-changing to go through.

Labels: , ,

Thursday, January 21, 2010

Migrating knowledge bases

This morning we moved the main class volume from NetWare to Windows. We knew we were going to have problems with this since some departments hadn't migrated key groups into AD yet, so the rights-migration script we wrote just plain missed bits. Those have been fixed all morning.

However, it is becoming abundantly clear that we're going to have to retrain a large portion of campus Desktop IT in just what it means to be dealing with Windows networking. We'd thought we'd done a lot of that already, but it turns out we were wrong. It doesn't help that some departments had delegated 'access control' rights to professors, who set up creative permissioning schemes; this morning the very heated calls were coming in from the professors, not the IT people.

There are two things that are tripping people up. One has been tripping people up on the Exchange side since forever, but the second one is new.
  1. In AD, you have to log out and back in again for new group-memberships to take.
  2. NTFS permissions do not grant the pass-through right that NSS permissions do. So if you grant a group rights to \Science\biology\BIOL1234, members of that group will NOT be able to pass through \Science and \Science\biology to get to BIOL1234.
We have a few spots here and there where for one reason or another rights were set at the 2nd level directories instead of the top level dirs. Arrangements like that just won't work in NTFS without busting out the advanced permissions.
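Scripted out, the workaround for that pass-through gap looks something like this icacls sketch; the paths and group names are invented for illustration:

```powershell
# Members need at least list/traverse on each parent directory,
# "This folder only" (no inheritance flags), before the real grant
# on the leaf directory does them any good.
icacls E:\Classes\Science /grant "WWU\BIOL1234-Students:(RD,X)"
icacls E:\Classes\Science\biology /grant "WWU\BIOL1234-Students:(RD,X)"
# The actual working rights, inheriting to everything below the leaf:
icacls E:\Classes\Science\biology\BIOL1234 /grant "WWU\BIOL1234-Students:(OI)(CI)M"
```

Every grant at a lower level implies two more grants above it, which is exactly the bookkeeping NSS never made us do.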

An area we haven't had problems yet, but I'm pretty certain we will have some are places where rights are granted and then removed. With NSS that could be done two ways: an Inherited Rights Filter, or a direct trustee grant with no permissions. With NTFS the only way to do that is to block rights inheritance, copy the rights you want, and remove the ones you don't. That sounds simple, but here is the case I'm worried about:


At 'HumRes' the group is granted 'read' rights, and the HR director is granted Modify directly to their user (bad practice, I know. But it's real-world).
At 'JobReview' the group is granted 'Modify'
At 'VPIT' Inheritance is Blocked and rights copied.
At 'JohnSmith' the HR user AngieSmith is granted the DENY right due to a conflict of interest.

Time passes. The old director retires, the new director comes in. IT Person gets informed that the new director can't see everything even though they have Modify to the entire \Humres tree. That IT person will go to us and ask, "WTH?" and we will reply with, "Inheritance is blocked at that level, you will need to explicitly grant Modify for the new director on that directory."
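For the record, that whole scenario (and the eventual fix) can be sketched with icacls; all paths and account names here are invented for illustration:

```powershell
# The original setup:
icacls D:\HumRes /grant "WWU\HumRes-Staff:(OI)(CI)R"
icacls D:\HumRes /grant "WWU\HRDirector:(OI)(CI)M"   # direct user grant; bad practice
icacls D:\HumRes\JobReview /grant "WWU\HumRes-Staff:(OI)(CI)M"
# Block inheritance at VPIT, copying the inherited ACEs in place:
icacls D:\HumRes\JobReview\VPIT /inheritance:d
# Deny AngieSmith on the JohnSmith folder:
icacls D:\HumRes\JobReview\VPIT\JohnSmith /deny "WWU\AngieSmith:(OI)(CI)F"

# Years later: the new director's tree-wide Modify stops dead at VPIT,
# because the copied ACEs name the *old* director. The fix:
icacls D:\HumRes\JobReview\VPIT /grant "WWU\NewDirector:(OI)(CI)M"
```

The sleeper part is that nothing about the VPIT directory advertises that inheritance is blocked there; you only find out when someone's rights mysteriously stop.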

So this is a bit of a sleeper issue.

Meanwhile, we're dealing with a community of users who know in their bones that granting access to 'JohnSmith' means they can browse down from \HumRes to that directory just on that access-grant alone. Convincing them that it doesn't work that way, and working with them to rearrange directory structures to accommodate that lack will take time. Lots of time.

Labels: ,

Saturday, January 16, 2010

The things you learn

We had cause to learn this one the hard way this past week. We didn't know that Windows Server 2008 (64-bit) and Symantec Endpoint Protection just don't mix well. It affected SMBv1 clients; SMBv2 clients (Vista, Win7) were unaffected.

The presentation of it at the packet-level was pretty specific, though. XP clients (and Samba clients) would get to the second step of the connection setup process for mapping a drive and time out.

  1. -> Syn
  2. <- Syn/Ack
  3. -> NBSS, Session Request, to $Server<20> from $Client<00>
  4. <- NBSS, Positive Session Response
  5. -> SMB, Negotiate Protocol Request
  6. <- Ack
  7. [70+ seconds pass]
  8. -> FIN
  9. <- FIN/Ack
Repeat two more times, and 160+ seconds later the client times out. The timeouts between the retries are not consistent, so the time it takes varies. Also, sometimes the server issues the correct "Negotiate Protocol Response" packet and the connection continues just fine. There was no sign in any of the SEP logs that it was dropping these connections, and the Windows Firewall was quiet as well.

In the end it took a call to Microsoft. Once we got to the right network person, they knew immediately what the problem was.

ForeFront is now going on those servers. It really should have been on a month ago, but because these cluster nodes were supposed to go live for fall quarter they were fully staged up in August, before we even had the ForeFront clients. We never remembered to replace SEP with ForeFront.

Labels: ,

Wednesday, December 23, 2009

That TCP Windowing fault

Here is the smoking gun, let me show you it.

That's an entire TCP segment. Packet 339 there is the end of the TCP window as far as the NetWare side is concerned. Packet 340 is a delayed ACK, which is a normal TCP timeout. Then follows a somewhat confusing series of packets and the big delay in packet 345.

That pattern, a 200ms delay and then, 5 packets later, a delay measurable in full seconds, is common throughout the capture. The delays seem to happen on boundaries between TCP windows. Not all windows, but some windows. Looking through the captures, it seems to happen when the window has an odd number of packets in it. The Windows server is ACKing after every two packets, which is expected. It's when it has to throw a Delayed ACK into the mix, such as for the odd packet at the end of a 27-packet window, that we get our unstable state.

The same thing happened on a different server (NW65SP8) before I turned off "Receive Window Auto Tuning" on the Server 2008 server. After I turned that off, the SP8 server stopped doing that and started streaming at expectedly high data-rates. The rates still aren't as good as they were when doing the same backup to the Server 2003 server, but at least it's a lot closer. 28 hours for this one backup versus 21, instead of over 5 days before I made the change.

The packets you see are for an NW65 SP5 server after the update to the Windows server. Clearly there are some TCP/IP updates in the later NetWare service-packs that help it talk to Server 2008's TCP/IP stack.

Labels: , ,

Tuesday, December 22, 2009

It's the little things

One thing that Microsoft's Powershell does that I particularly like is that they alias common unix commands. No longer do I have to chant to myself:
ls on linux, dir on dos.
ls on linux, dir on dos.
ls works in powershell. The command line options are different, of course, but at least I don't have to retrain my fingers when doing a simple file list. This is especially useful when I'm fast switching between the two environments. Such as when I'm using, say, Perl to process a text file, and then Powershell to do a large Exchange operation.
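You can ask PowerShell where those familiar names actually point:

```powershell
# Each Unix-ism is just an alias onto a native cmdlet.
Get-Alias ls, cat, rm, cp | Format-Table Name, Definition
# ls -> Get-ChildItem, cat -> Get-Content, rm -> Remove-Item, cp -> Copy-Item
```

Which is also why the command-line options differ: `ls -la` fails because the thing answering is Get-ChildItem, not /bin/ls.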

Labels: ,

Monday, December 07, 2009

Account lockout policies

This is another area where how Novell and Microsoft handle a feature differ significantly.

Since NDS was first released back at the dawn of the commercial internet (a.k.a. 1993) Novell's account lockout policies (known as Intruder Lockout) were set-able based on where the user's account existed in the tree. This was done per Organizational-Unit or Organization. In this way, users in .finance.users.tree could have a different policy than .facilities.users.tree. This was the case in 1993, and it is still the case in 2009.

Microsoft only got a hierarchical tree with Active Directory in 2000, and for years they didn't get around to making account lockout policies granular. For the most part, there is a single lockout policy for the entire domain with no exceptions; 'Administrator' is subjected to the same lockout as 'Joe User'. With Server 2008, Microsoft finally got some kind of granular policy capability in the form of "Fine Grained Password and Lockout Policies."

This is where our problem starts. You see, with the Novell system we'd set our account lockout policies to lock after 6 bad passwords in 30 minutes for most users. We kept our utility accounts in a spot where they weren't allowed to lock, but gave them really complex passwords to compensate (as they were all used programatically in some form, this was easy to do). That way the account used by our single-signon process couldn't get locked out and crash the SSO system. This worked well for us.

Then the decision was made to move to a true blue solution and we started to migrate policies to the AD side where possible. We set the lockout policy for everyone. And we started getting certain key utility accounts locked out on a regular basis. We then revised the GPOs driving the lockout policy, removing them from the Default Domain Policy and creating a new "ILO policy" that we applied individually to each user container. This solved the lockout problem!

Since all three of us went to class for this 7-9 years ago, we'd forgotten that AD lockout policies are monolithic and only work when specified in Default Domain Policy. They do NOT work per-user the way they are in eDirectory. By doing it the way we did, no lockout policies were being applied anywhere. Googling on this gave me the page for the new Server 2008-era granular policies. Unfortunately for us, it requires the domain to be brought to the 2008 Functional Level, which we can't do quite yet.
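Once we get there, the 2008-era granular policies are Password Settings Objects. With the later ActiveDirectory PowerShell module, a no-lockout policy for utility accounts would look something like this sketch; the policy and group names are invented:

```powershell
Import-Module ActiveDirectory

# LockoutThreshold 0 means "never lock", which is what we want for
# programmatic accounts that compensate with very long passwords.
New-ADFineGrainedPasswordPolicy -Name "UtilityAccounts-PSO" `
    -Precedence 10 `
    -LockoutThreshold 0 `
    -MinPasswordLength 20 `
    -ComplexityEnabled $true

# Apply it to the group holding the utility accounts.
Add-ADFineGrainedPasswordPolicySubject -Identity "UtilityAccounts-PSO" `
    -Subjects "UtilityAccounts"
```

The Precedence value settles conflicts when a user falls under multiple PSOs, which is roughly the granularity eDirectory gave us by container placement.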

What's interesting is a certain Microsoft document that suggested settings of 50 bad logins every 30 minutes as a way to avoid DoSing your needed accounts. That's way more than 6 every 30.

Getting the domain functional level raised just got more priority.

Labels: , , , , , ,

Monday, November 16, 2009

Packet size and latency

The event-log parser I'm working on has run into a serious performance wall. Parsing 60 minutes worth of security events takes 90 minutes. The bulk of that time is consumed in the 'get the event logs' part of the process, the 'parse the event logs' portion takes 5% of that time. Looking at packets, I see why.

I'm using PowerShell 2 for this script, as it has the very lovely Get-WinEvent cmdlet. It is lovely because I can give it filter parameters, so it'll only give me the events I'm interested in and not all the rest. In practice, this reduces the number of events I'm parsing by 40%. Also nice is that it returns a static list of events, not a pointer list into the ring-buffer that is the usual Windows Event Log, so $Event[12345].TimeCreated will stay static.

The reason the performance is so bad is that each event-log entry is individually delivered via RPC calls. Looking at packets, I see that the average packet size is around 200 bytes. Happily, the interval between an RPC-Response and the next RPC-Request is a fraction of a millisecond, and the RPC-Response times are about half a millisecond, so at least the network is crazy-responsive. But that packet size and the serial nature of the requests mean that the overall throughput is really bad.

If there were a way to phrase the Get-WinEvent command to only populate the attributes I'm interested in and not any of the others (such as .Message, the free-text message portion of the event, which is quite large and noticeably laggy to retrieve), it could go a LOT faster. Since I don't have PowerShell installed on my domain controllers right now, I can't see how much running it directly on them would improve things. I suspect by a lot, since it should be able to use direct-API methods to extract event-log data rather than RPC-based methods.
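For reference, the filtering looks something like this sketch; the DC name and the property index for the username are assumptions on my part. Note that the Select-Object step only slims the objects after they've already come over the wire, which is exactly the limitation I'm complaining about:

```powershell
# Pull only the Security events we care about from the last hour.
$events = Get-WinEvent -ComputerName dc1 -FilterHashtable @{
    LogName   = 'Security'
    Id        = 4624, 4634          # logon / logoff
    StartTime = (Get-Date).AddHours(-1)
}

# Keep just the cheap properties; .Message is never touched.
$slim = $events | Select-Object TimeCreated, Id,
    @{ Name = 'User'; Expression = { $_.Properties[5].Value } }
```

The FilterHashtable is applied server-side, which is why it beats piping everything through Where-Object; the per-event RPC delivery is the part I can't filter away.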

As it is, we may have to resort to log-shipping. With over a GB of security event-log generated per DC, per day, that log server is going to have to have large logs. It has to be at least Vista or Windows 2008 in order to get shipped logs in the first place. Sad for us, we don't really have any spare hardware or VM space for a new server like that.

And finally, yes. There are 3rd party tools that do a lot of this. We need something both free and scalable, and that's a tricky combination. Windows generates a LOT of login/logout data in a network our size, keeping up is a challenge.

Labels: ,

Wednesday, October 28, 2009

Filesystem drop-boxes on NTFS

We have a need to provide dropboxes on our file-servers. Some professors don't find Blackboard's dropbox functionality meets their needs, so they rock it 1990's style. On NetWare/OES, this is a simple thing. Take this directory structure:


And a group called PHYS-1234.CLASSES.WWU

Under NetWare, you set an Inherited Rights Filter or explicitly remove inherited rights, grant the PHYS-1234.CLASSES.WWU group the "C" trustee right to the directory, and grant the professor's user object full rights to it. This allows students to copy files into the directory, but not see anything in it. On the day the assignment is due, the professor revokes the class-group's rights and ta-da: a classic dropbox. Dead simple, and we've probably been doing it this way since 1996, if not earlier.

It's not so simple on Windows.

First of all, Windows has different rights for directories and files. They use the same bits, but the bits mean different things for files and directories. For instance, one bit means "Create Files" on a directory, allowing users with that right to create files in the directory (analogous to the "C" NSS trustee right), and "Write Data" on a file, granting the ability to modify data in the file (analogous to the "M" NSS trustee right). So, this bit on a directory grants Create, but this bit on a file grants Modify. Right.
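You can see that overlap directly in the .NET enumeration these NTFS permissions are exposed through; this is just an observation from poking at the framework, not part of the migration itself:

```powershell
# CreateFiles (meaningful on a directory) and WriteData (meaningful
# on a file) are literally the same bit in the NTFS access mask.
[int][System.Security.AccessControl.FileSystemRights]::CreateFiles  # 2
[int][System.Security.AccessControl.FileSystemRights]::WriteData    # 2
```

Same value, two names, and which meaning applies depends entirely on whether the ACE sits on a directory or a file.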

To create a dropbox on NTFS, several things need to happen:
  • Inherited rights need to be copied to the directory, and inheritance blocked. (There is no Inherited Rights Filter on NTFS)
  • Extraneous rights need to be deleted from the directory. (again with the no IRFs)
  • The class group needs to be granted the 'Read' rights suite to "This Folder Only", as well as "Create files".
    • Traverse Folder
    • List Folder
    • Read Attributes
    • Read Extended Attributes
    • Read Permissions
  • "CREATOR OWNER" (a.k.a. S-1-3-0) needs to be granted the 'Read' rights suite to "Subfolders and files only"
The key thing to remember here is that "Subfolders and files only" is an inheritance setting, where "This Folder Only" is a direct rights grant. Files created in this directory will get the rights defined under CREATOR OWNER. If the professor wishes to remove student visibility to their files, they'll have to take ownership of each file. I have found that Windows Explorer really, really likes being able to view files it just wrote, and this rights config allows that.

This series of actions will create a drop box in which students can then copy their files and still see them, but then can't do anything with it. This is because Delete is a separate right that is not being granted, and the users are not getting the "Write Data" right either. Once the file is in the directory, it is stuck as far as that user's concerned. If a user attempts to save over the invisible file of another user, perhaps the file names are predictable, they'll get access-denied since they don't have Write Data or Delete to that invisible file.

If you're scripting this, and for this kind of operation I strongly recommend it, use icacls. It'd look something like this:

icacls PHYS-1234 /inheritance:d
icacls PHYS-1234 /remove CAS-Section
icacls PHYS-1234 /grant Classes-PHYS-1234:(rx,wd)
icacls PHYS-1234 /grant ProfessorSmith:(M)
icacls PHYS-1234 /grant *S-1-3-0:(oi)(ci)(io)(rx)

(rx,wd) = Read-Execute & Write-Data
(M) = The "Modify" simple right. Essentially Read/Write without access-control.
(oi) = Object-Inherit, a.k.a. Files
(ci) = Container-Inherit, a.k.a. Directories
(io) = Inherit-Only, which makes the grant apply to "Subfolders and files only" rather than the directory itself
(rx) = Read-Execute
*S-1-3-0 = The SID of "CREATOR OWNER". An explicit grant to this SID works better than using the name, in my experience.

This hasn't been battle tested yet, but it seems to work from my pounding on it.

Labels: ,

Thursday, October 22, 2009

Windows 7 releases!

Or rather, its retail availability is today. We're on a Microsoft agreement, so we've had it since late August. And boy do I know that. I've been having a trickle of calls and emails ever since the beta released about various ways Win7 isn't working in my environment and whether I have any thoughts about that. Well, I do. As a matter of fact, Technical Services and ATUS both have thoughts on that:

Don't use it yet. We're not ready. Things will break. Don't call us when it does.

But as with any brand new technology there is demand. Couple that with the loose 'corporate controls' inherent in a public Higher Ed institution and we have it coming in anyway. And I get calls when people can't get to stuff.

The main generator of calls is our replacement of the Novell Login Script. I've spoken about how we feel about our login script in the past. Back on July 9, 2004 I had a long article about that. The environment has changed, but it still largely stands. Microsoft doesn't have a built-in login script the same way NetWare/OES has had since the '80s, but there are hooks we can leverage. One of my co-workers has built a cunning .VBS file that we're using for our login script, and it does the kinds of things we need out of a login script:
  • Run a series of small applications we need to run, which drive the password change notification process among other things.
  • Maps drives based on group membership.
  • Maps home directories.
  • Allows shelling out to other scripts, which allows less privileged people to manage scripts for their own users.
A fair amount of engineering went into that script, and it works. Mostly. And that's the problem. It works well enough that at least one department on campus decided to put Vista in their one computer lab and rely on this script to get drive mappings. So I got calls shortly after quarter-start to the effect of, "Your script don't work, how can this be fixed?" To which my reply was (summarized), "You're on Vista and we told y'all not to do that. This isn't working because of XYZ; you'll have to live with it." And they have, for which I am grateful.

Which brings me to XYZ and Win7.

The main incompatibility has to do with the NetWare CIFS stack, which I've described before. The NetWare CIFS stack doesn't speak NTLMv2, only LM and NTLM, which in this instance makes it similar to much older Samba versions. This conflicts with Vista and Windows 7, which both default their LAN Manager Authentication Level to "NTLMv2 responses only." That means out of the box, both Vista and Win7 require changes to talk to our NetWare servers at all. This is fine so long as they're domained, since we've set a Group Policy to change that level down to something the NetWare servers speak.
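Under the hood, that Group Policy just sets a registry value. A sketch of the equivalent manual change, for un-domained machines:

```powershell
# 1 = "Send LM & NTLM - use NTLMv2 session security if negotiated",
# low enough on the LAN Manager Authentication Level scale (0-5)
# for the NetWare CIFS stack to understand. Run elevated; takes
# effect for new connections.
Set-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\Lsa' `
    -Name 'LmCompatibilityLevel' -Value 1
```

The Vista/Win7 default is effectively level 3 (NTLMv2 only), which is the whole source of the conflict.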

That's not all of it, though. Windows 7 introduced some changes into the SMB/CIFS stack that make talking to NetWare a bit less of a sure thing even with the LAN Man Auth level set right. Perhaps this is SMB2 negotiations getting in the way. I don't know. But for whatever reason, the NetWare CIFS stack and Win7 don't get along as well as the Vista's SMB/CIFS stack did.

The main effect of this is that the user's home-directory will fail to mount a lot more often on Win7 than on Vista, and other static drive mappings will fail more often as well. It is for reasons like these that we are not recommending removing the Novell Client and relying on our still-in-testing Windows login script.

That said, I can understand why people are relying on the crufty script rather than the just-works Novell Login Script. Due to how our environment works, The Vista/Win7 Novell Client is dog slow. Annoyingly slow. So annoyingly slow that not getting some drives when you log in is preferable to dealing with it.

This will all change once we move the main file-serving cluster to Windows 2008. At that point the Windows script should Just Work (tm), and getting rid of the Novell Client will allow a more functional environment. We are not at that point yet.

Labels: , , ,

Wednesday, September 30, 2009

I have a degree in this stuff

I have a CompSci degree. This qualified me for two things:
  • A career in academics
  • A career in programming
You'll note that Systems Administration is not on that list. My degree has helped my career by getting me past the "4 year degree in a related field" requirement of jobs like mine. An MIS degree would be more appropriate, but there were very few of those back when I graduated. It has indirectly helped me in troubleshooting, as I have a much better foundation about how the internals work than your average computer mechanic.

Anyway. Every so often I stumble across something that causes me to go Ooo! ooo! over the sheer computer science of it. Yesterday I stumbled across Barrelfish, and this paper. If I weren't sick today I'd have finished it, but even as far as I've gotten into it I can see the implications of what they're trying to do.

The core concept behind the Barrelfish operating system is to assume that each computing core does not share memory and has access to some kind of message-passing architecture. This has the side effect of having each computing core run its own kernel, which is why they're calling Barrelfish a 'multikernel operating system'. In essence, they're treating the insides of your computer like the distributed network it is, and using already existing distributed-computing methods to improve it. The kinds of multi-core we're doing now (SMP, ccNUMA) use shared-memory techniques rather than message passing, and it seems that shared memory doesn't scale as far as message passing once core counts go higher.

They go into a lot more detail in the paper about why this is. A big one is the heterogeneity of CPU architectures out there in the marketplace, and they're not talking just AMD vs Intel vs CUDA; this is also Core vs Core2 vs Nehalem. This heterogeneity in the marketplace makes it very hard for a traditional operating system to be optimized for a specific platform.

A multikernel OS would use a discrete kernel for each microarchitecture. These kernels would communicate with each other using OS-standardized message-passing protocols. On top of these microkernels would sit the abstraction called an Operating System, upon which applications would run. Due to the modularity at the base of it, it would take much less effort to provide an optimized microkernel for a new microarchitecture.

The use of message passing is very interesting to me. Back in college, parallel computing was my main focus. I ended up not pursuing that area of study in large part because I was a strictly C student in math, parallel computing was a largely academic endeavor when I graduated, and you needed to be at least a B student in math to hack it in grad school. It still fired my imagination, and there was squee when the Pentium Pro was released and you could do 2 CPU multiprocessing.

In my Databases class, we were tasked with creating a database-like thingy in code and to write a paper on it. It was up to us what we did with it. Having just finished my Parallel Computing class, I decided to investigate distributed databases. So I exercised the PVM extensions we had on our compilers thanks to that class. I then used the six Unix machines I had access to at the time to create a 6-node distributed database. I used statically defined tables and queries since I didn't have time to build a table parser or query processor and needed to get it working so I could do some tests on how optimization of table positioning impacted performance.

Looking back on it 14 years later (eek) I can see some serious faults about my implementation. But then, I've spent the last... 12 years working with a distributed database in the form of Novell's NDS and later eDirectory. At the time I was doing this project, Novell was actively developing the first version of NDS. They had some problems with their implementation too.

My results were decidedly inconclusive. There was a noise factor in my data that I was not able to isolate, and it drowned out what differences there were between my optimized and non-optimized runs (in hindsight, I needed larger tables by an order of magnitude or more). My analysis paper was largely an admission of failure. So when I got an A on the project I was confused enough that I went to the professor and asked how this was possible. His response?
"Once I realized you got it working at all, that's when you earned the A. At that point the paper didn't matter."
Dude. PVM is a message passing architecture, like most distributed systems. So yes, distributed systems are my thing. And they're talking about doing this on the motherboard! How cool is that?

Both Linux and Windows are adopting more message-passing architectures in their internal structures, as they scale better on highly parallel systems. In Linux this involved reducing the use of the Big Kernel Lock in anything possible, as invoking the BKL forces the kernel into single-threaded mode and that's not a good thing with, say, 16 cores. Windows 7 involves similar improvements. As more and more cores sneak into everyday computers, this becomes more of a problem.

An operating system working without the assumption of shared memory is a very different critter. Operating state has to be replicated to each core to facilitate correct functioning, you can't rely on a common memory address to handle this. It seems that the form of this state is key to performance, and is very sensitive to microarchitecture changes. What was good on a P4, may suck a lot on a Phenom II. The use of a per-core kernel allows the optimal structure to be used on each core, with changes replicated rather than shared which improves performance. More importantly, it'll still be performant 5 years after release assuming regular per-core kernel updates.

You'd also be able to use the 1.75GB of GDDR3 on your GeForce 295 as part of the operating system if you really wanted to! And some might.

I'd burble further, but I'm sick so not thinking straight. Definitely food for thought!

Labels: , , , , ,

Monday, September 21, 2009

Quarter start: printing

Today is go-live for the new Microsoft/PCounter based printing system. It hasn't gone off perfectly, but most of the problems so far have been manageable. Also, it's only Monday. The true peak load for printing will be Wednesday between 11:00 and 12:00. Wednesday is when classes start.

So far the big problem is that some of the disk images used for the labs included printers they weren't supposed to, a side effect of how Microsoft does printing. All in all it's a pretty small thing, but it does ruin the clean look. The time between when Summer session stopped and when all the images had to be applied (last Friday) was the same as we get every year, but this year included major changes of a kind we haven't seen since we converted from queue-based printing to NDPS printing back around 2002. So yeah, these kinds of QA things can get dropped under that kind of time pressure in a just-plain-new environment.

Also, the Library doesn't have their release stations up yet. They'll have them there by the end of the day, but the fact remains that they're on the old system until then. Due to the realities of accounting, each student was given only 50 pages this morning on the old system. Which means that some users are already whacking their heads on the limit. They'll have to go to one of the ATUS labs to print, as they're all on the new system and their quotas are much higher there. If Libraries doesn't have it by tomorrow, something will have to give.

Labels: ,

Friday, September 18, 2009

It's the little things

My attention was drawn to something yesterday that I just hadn't registered before. Perhaps because I see it so often I didn't twig to it being special in just that place.

Here are the Received: headers of a bugzilla message I got yesterday. It's just a sample:
Received: from ( by ( with Microsoft SMTP Server (TLS) id 8.1.393.1; Tue, 15 Sep 2009 13:58:10 -0700
Received: from ( by ( with Microsoft SMTP Server (TLS) id 8.1.393.1; Tue, 15 Sep 2009 13:58:09 -0700
Received: from mail97-va3 (localhost.localdomain []) by (Postfix) with ESMTP id 6EFC9AA0138 for me; Tue, 15 Sep 2009 20:58:09 +0000 (UTC)
Received: by mail97-va3 (MessageSwitch) id 12530482889694_15241; Tue, 15 Sep 2009 20:58:08 +0000 (UCT)
Received: from ( []) by (Postfix) with ESMTP id 5F7101A58056 for me; Tue, 15 Sep 2009 20:58:07 +0000 (UTC)
Received: from ([]) by with ESMTP; Tue, 15 Sep 2009 14:57:58 -0600
Received: from (localhost []) by (Postfix) with ESMTP id A56EECC7CE for me; Tue, 15 Sep 2009 14:57:58 -0600 (MDT)
For those who haven't read these kinds of headers before, read from the bottom up. The mail flow is:
  1. Originating server was, which mailed to...
  2. running Postfix, who forwarded it on to Novell's outbound mailer...
  3., who attempted to send to us and sent to the server listed in our MX record...
  4. running Postfix, who forwarded it on to another mailer on the same machine...
  5. mail97-va3 running something called MessageSwitch, who sent it on to the internal server we set up...
  6. running Exchange 2007, who sent it on to the Client Access Server...
  7. for 'terminal delivery'. Actually it went on to one of the Mailbox servers, but that doesn't leave a record in the SMTP headers.
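The bottom-up reading convention is easy enough to mechanize. A quick sketch in Python that pulls the hops out of a header blob and returns them in delivery order; the hostnames here are hypothetical stand-ins, since the real ones are redacted above:

```python
import re

def hops(raw_headers: str) -> list[str]:
    """Return 'from -> by' pairs in chronological (delivery) order."""
    received = re.findall(r"Received: from (\S+).*? by (\S+)", raw_headers, re.DOTALL)
    # Each relay prepends its header, so the raw order is newest-first;
    # reverse to read the mail flow oldest-hop-first.
    return [f"{frm} -> {by}" for frm, by in reversed(received)]

sample = (
    "Received: from edge.example.edu by cas.example.edu with ESMTP\n"
    "Received: from origin.example.com by edge.example.edu with ESMTP\n"
)
print(hops(sample))
# ['origin.example.com -> edge.example.edu', 'edge.example.edu -> cas.example.edu']
```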
Why is this unusual? Because steps 4 and 5 are at Microsoft's Hosted ForeFront mail security service. The perceptive will notice that step 4 indicates that the server is running Postfix.

Postfix. On a Microsoft server. Hur hur hur.

Keep in mind that Microsoft purchased the ForeFront product line lock, stock, and barrel. If that company had been using non-MS products as part of their primary message flow, then Microsoft probably kept that up. The next versions just might move to more explicitly MS-branded servers. Or not, you never know. Microsoft has been making placating noises toward Open Source lately. They may keep it.

Labels: , , ,

Tuesday, September 08, 2009

DNS and AD Group Policy

This is aimed a bit more at local WWU users, but it is more widely applicable.

Now that we're moving to an environment where the health of Active Directory plays a much greater role, I've been taking a real close look at our DNS environment. As anyone who has ever received any training on AD knows, DNS is central to how AD works. AD uses DNS the way WinNT used WINS, the way IPX used SAPs, or the way NetWare uses SLP. Without it, things break all over the place.

As I've stated in a previous post, our DNS environment is very fragmented. As we domain more and more machines, the '' domain becomes the spot where the vast majority of computing resources are resolvable. Right now, the BIND servers are authoritative for the reverse-lookup domains, which is why the IP address I use for managing my AD environment resolves to something not in the domain. What's more, the BIND servers are the DNS servers we pass out to every client.

That said, we've done the work to make it function. The BIND servers have delegation records to indicate that the AD DNS root domain of is to be handled by the AD DNS servers. Windows clients are smart enough to notice this and register their workstation names against the AD DNS servers rather than the BIND servers. However, the reverse-lookup domains are authoritative on the BIND servers, so the clients' attempts to register their reverse-lookup records all fail. Every client on our network has Event Log entries to this effect.

Microsoft has DNS settings as a possible target for management through Group Policy. This could be used to help ensure our environment stays safe, but will require analysis before we do anything. Changes will not be made without a testing period. What can be done, and how can it help us?

Primary DNS Suffix
Probably the simplest setting of the lot. This would allow us to force all domained machines to consider to be their primary DNS domain and treat it accordingly for Dynamic DNS updates and resource lookups.

Dynamic Update
This forces/allows clients to register their names into the domain's DNS domain of Most already do this, and this is desirable anyway. We're unlikely to deviate from default on this one.

DNS Suffix Search List
This specifies the DNS suffixes that will be applied to all lookup attempts that don't end in a period. This is one area we probably should use, but we don't yet know what to set. is at the top of the list for inclusion, but what else? seems logical, and is where a lot of central resources are located. But most of those are in now. So. Deserves thought.

Primary DNS Suffix Devolution
This determines whether to include the component parts of the primary DNS suffix in the DNS search list. If we set the primary DNS suffix to be, then the DNS resolver will also look in, and I believe the default here is 'True'.

Register PTR Records
If the domains remain on the BIND servers, we should probably set this to False. At least so long as our BIND servers refuse dynamic updates that is.

Registration Refresh Interval
Determines how frequently to update Dynamic registrations. Deviation from default seems unlikely.

Replace Addresses in Conflicts
This is a setting for handling how multiple registrations for the same IP (here defined as multiple A records pointing to the same IP) are to be handled. Since we're using insecure DNS updates at the moment, this setting deserves some research.

DNS Servers
If the Win/NW side of Tech Services wishes to open warfare with the Unix side of Tech Services, we'll set this to use the AD DNS servers for all domained machines. This setting overrides client-side DNS settings with the DNS servers defined in the Group Policy. No exceptions. A powerful tool. If we set this at all, it'll almost certainly be the BIND DNS servers. But I don't think we will. Also, Microsoft may have removed this from the Server 2008 GPO, as it isn't listed on this page.

Register DNS Records with Connection-Specific DNS Suffix
If a machine has more than one network connection (very, very few non-VMware host machines will), this allows them to register those connections against their primary DNS suffix. Due to the relative dearth of such configs, we're unlikely to change this from default.

TTL Set in the A and PTR Records
Since we're likely to turn off PTR updates, this setting is moot.

Update Security Level
As more and more stations domain, there will come a time when we may wish to cut out the non-domained stations from updating into If that time comes, we'll set this to 'secure only'. Until then, we won't touch it.

Update Top Level Domain Zones
This allows clients to update a TLD like .local. Since our tree is not rooted in a TLD, this doesn't apply to us.

Some of these can have wide-ranging effects, but they are helpful. I'm very interested in the search-list settings, since each of our desktop techs has tens of DNS domains to choose from depending on their duty area. Something here might greatly speed up resource resolution times.

Labels: , ,

Exchange Transport Rules, update

Remember this from a month ago? As threatened in that post I did go ahead and call Microsoft. To my great pleasure, they were able to reproduce this problem on their side. I've been getting periodic updates from them as they work through the problem. I went through a few cycles of this during the month:

MS Tech: Ahah! We have found the correct regex recipe. This is what it is.
Me: Let's try it out shall we?
MS Tech: Absolutely! Do you mind if we open up an Easy Assist session?
Me: Sure. [does so, sends a few messages through, finds an edge case that the supplied regex doesn't handle]. Looks like we're not there yet.
MS Tech: Indeed. Let me try some more things out in the lab and get back to you.

They've finally come up with a set of rules to match this text definition: "Match any X-SpamScore header with a signed integer value between 15 and 30".

Reading the KB article on this you'd think these ORed patterns would match:
But you'd be wrong. The rule that actually works is:
Except if ^-
Yes, that 'except if' is actually needed, even though the first rule should never match a negative value. You really need to have the $ inside the parens for the first statement, or it doesn't match right; this won't work: ^1(5|6|7|8|9)$. The same goes for the second statement with the \d$ construct. The last statement doesn't need the 0$ in parens, but it is there to match the pattern of the previous two statements of having the $ in the paren.
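For what it's worth, in a standards-compliant regex engine the "broken" and "working" forms are equivalent, which underscores that this is an Exchange quirk. A quick Python check of the intended match set, assuming the three working statements are ^1(5$|6$|7$|8$|9$), ^2\d$, and ^3(0$), combined with the "except if ^-" exception:

```python
import re

# The three ORed patterns, with $ inside the parens as Exchange demands:
patterns = [r"^1(5$|6$|7$|8$|9$)", r"^2\d$", r"^3(0$)"]

def matches_spam_range(value: str) -> bool:
    """True for X-SpamScore values 15 through 30."""
    # The "except if ^-" clause excludes signed-negative scores outright.
    if re.search(r"^-", value):
        return False
    return any(re.search(p, value) for p in patterns)

assert all(matches_spam_range(str(n)) for n in range(15, 31))
assert not any(matches_spam_range(s) for s in ["14", "31", "-20", "150"])
print("15-30 match; everything else rejected")
```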


In the end, regexes in Exchange 2007 Transport Rules are still broken, but they can be made to work if you pound on them enough. We will not be using them because they are broken, and when Microsoft gets around to fixing them the hack-ass recipes we cook up will probably break at that time as well. A simple value list is what we're using right now, and it works well for 16-30. It doesn't scale as well for 31+, but there does seem to be a ceiling on what X-SpamScore can be set to.

Labels: , ,

Wednesday, September 02, 2009

A clustered file-system, for windows?

Yesterday I ran into this:

On the surface it looks like NTFS behaving like OCFS. But Microsoft has a warning on this page:
In Windows Server® 2008 R2, the Cluster Shared Volumes feature included in failover clustering is only supported for use with the Hyper-V server role. The creation, reproduction, and storage of files on Cluster Shared Volumes that were not created for the Hyper-V role, including any user or application data stored under the ClusterStorage folder of the system drive on every node, are not supported and may result in unpredictable behavior, including data corruption or data loss on these shared volumes. Only files that are created for the Hyper-V role can be stored on Cluster Shared Volumes. An example of a file type that is created for the Hyper-V role is a Virtual Hard Disk (VHD) file.

Before installing any software utility that might access files stored on Cluster Shared Volumes (for example, an antivirus or backup solution), review the documentation or check with the vendor to verify that the application or utility is compatible with Cluster Shared Volumes.
So unlike OCFS2, this multi-mount NTFS is only for VMs and not for general file-serving. In theory you could use this in combination with Network Load Balancing to create a high-availability cluster with even higher uptime than failover clusters already provide. The devil is in the details, though, and Microsoft alludes to them.

A file system being used for Hyper-V isn't a complex locking environment. You'll have as many locks as there are VHD files, and they won't change often. Contrast this with a file server, where you can have thousands of locks that change by the second. Additionally, unless you disable Opportunistic Locking, you are at grave risk of corrupting files used by more than one user (Access databases!) if you are using the multi-mount NTFS.

Microsoft will have to promote awareness of this type of file system into the SMB layer before this can be used for file-sharing. SMB has its own lock layer, and it will have to coordinate with the SMB layers on the other nodes for this to work right. This may never happen; we'll see.

Labels: , ,

Tuesday, August 25, 2009

LDAP result size

It turns out that Microsoft changed how it handles the MaxPageSize value for the domain controller LDAP policy. The Server 2003 DC returns just over 34,000 objects for a certain query, but the Server 2008 DC in the same domain, and therefore subject to the same LDAP policy, returns around 20,000 objects. This is breaking things.

The question is: Is this a result of a new policy in Server 2008, or did Microsoft code in a new, lower max-value for this particular policy value? Whatever it is, the upper limit seems to be 20K. Don't know yet.
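Whatever the server-side cap turns out to be, the standard workaround is the paged-results control: ask for pages no larger than the limit and loop on the returned cookie. A sketch of that loop in Python, where fetch_page is a hypothetical stand-in for the real LDAP search-with-control call (RFC 2696 style), which varies by library:

```python
def paged_search(fetch_page, page_size=1000):
    """Collect all entries by looping a paged-results style query.

    fetch_page(cookie, page_size) must return (entries, next_cookie),
    with an empty next_cookie when the server has no more pages --
    mirroring the LDAP Simple Paged Results control.
    """
    results, cookie = [], None
    while True:
        entries, cookie = fetch_page(cookie, page_size)
        results.extend(entries)
        if not cookie:
            return results

# Demo against a fake directory of 34,000 entries:
data = list(range(34_000))

def fake_fetch(cookie, size):
    start = cookie or 0
    page = data[start:start + size]
    nxt = start + size if start + size < len(data) else b""
    return page, nxt

print(len(paged_search(fake_fetch)))  # 34000
```

The point is that the client never trips the MaxPageSize ceiling, because no single response exceeds the page size.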


Tuesday, August 18, 2009

Didn't know that

The integrated network card in the HP DL380-G2 doesn't have a Windows Server 2008 driver. Anywhere. And the forum post that says you can use the 2003 driver on it lies, unless there is some even sneakier way of getting a driver in than I know of.

This is a problem, as that's one of our Domain Controllers. But not much of one, since it's one of the three DCs in the empty root (our forest is old enough for that particular bit of discredited advice) and all it does is global-catalog work. And act as our ONLY DOMAIN CONTROLLER on campus. On the off chance that a back-hoe manages to cut BOTH fiber routes to campus, it's the only GC up there.

Also, since it couldn't boot from a USB-DVD drive I had to do a parallel install of 2008 on it. So I still had my perfectly working 2003 install available. So I just dcpromoed the 2003 install and there we are!

Once we get a PCI GigE card for that server I can try getting 2008 working again.

Labels: ,

Thursday, August 13, 2009

Why we still use WINS when we have AD

WINS... the Windows Internet Name Service. Introduced in, I believe, Windows NT 3.5 in order to allow Windows name resolution to work across different IP subnets. NetBIOS relies on broadcasts for name resolution, and WINS allowed it to work by using a unicast to the WINS server to find addresses. In theory, DNS in Active Directory (now nine years old!) replaced it.

Not for us.

There are two things that drive the continued existence of WINS on our network, and will ensure that I'll be installing the Server 2008 WINS server when I upgrade our Domain Controllers in the next two weeks:
  1. We still have a lot of non-domained workstations
  2. Our DNS environment is mind-bogglingly fragmented
Here is a list of the domains we have, and these are just the ones we're serving with DHCP. There are a lot more:
There are more we're serving with DHCP, I just got bored making the list. The thing is, a lot of those networks, and especially the labs, contain 100% domained workstations. Since we only have the one domain, this means all those computers are in a flat DNS structure. In effect, each domained workstation on campus has two DNS names: the one on our BIND servers, and the one in the MS-DNS servers.

That said, for those machines that AREN'T in the domain, the only way they can find anything is to use WINS. We will be using it until the University President says unto the masses, "Thou Shalt Domain Thy PC, Or Thou Shalt Be Denied Service." Until then, WINS will continue to be the best way to find Windows resources on campus.

Labels: ,

Thursday, August 06, 2009

Permission differences

In part, this blog post could have been written in 1997. We haven't exactly beaten down the door migrating away from NetWare.

Anyway, there are two areas that are vexing me regarding the different permissioning models between how Novell does it, and how Microsoft does it. The first has been around since the NT days, and relates to the differences (vast differences) between NTFS and the Trustee model. The second has to do with Active Directory permissions.

First, NTFS. As most companies contemplating a move from NetWare to Microsoft undoubtedly find out, Microsoft does permissions differently. First and foremost, NTFS doesn't have the concept of the 'visibility list', which is what allows NetWare to do this:

Grant a permission w-a-y down a directory tree.
Members of that rights grant will be able to browse from volume-root to that directory. They will see each directory entry along the path, and nothing else. Even if they have no rights to the intervening directories.

NTFS doesn't do that. In order to fake it you need two things:
  • Access Based Enumeration turned on on the share (the default in Server 2008, an add-on option in Server 2003)
  • A specific rights grant on each directory between the share and the directory with the rights grant. The "Read" simple right granted to "this directory only".
Unfortunately, the second one is tricky. In order to grant it you have to add an Advanced right, because the "read" simple right grants read to, "This directory, files, and subdirectories," when what you want is, "this directory only". What this does is grant you the right to see that directory-entry in the previous directory's list.

Example: if I grant the group "StateAuditors" write access to this directory:


If I just grant the right directly on "Procedures", the StateAuditors won't be able to get to that directory by way of that share. I could just create a new share on that spot, and it'd work. Otherwise, I'll have to grant the above mentioned rights to each of DocTeam, StandardsOffice, and Accounting.

It can be done, and it can even be scripted, but it represents a significant change in thinking required when it comes to handling permissions. As most permissions are handled by our Desktop group, this will require retraining on their part.
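The core of such a script is just walking the path between the share root and the deep grant, and emitting a read-this-folder-only grant for each intermediate directory. A sketch in Python using the directory names from the example above (the drive letter and share path are hypothetical), generating the icacls commands the script would run:

```python
from pathlib import PureWindowsPath

def traverse_grants(share_root: str, deep_dir: str, group: str) -> list[str]:
    """Emit one 'read, this folder only' grant per directory strictly
    between the share root and the directory with the real grant.

    icacls with no (OI)(CI) inheritance flags applies the ACE to the
    folder only, which is the fake-visibility-list grant described above.
    """
    root = PureWindowsPath(share_root)
    target = PureWindowsPath(deep_dir)
    cmds = []
    for parent in reversed(target.parents):  # walk top-down
        if parent != root and root in parent.parents:
            cmds.append(f'icacls "{parent}" /grant "{group}:(RX)"')
    return cmds

for cmd in traverse_grants(r"D:\Share",
                           r"D:\Share\DocTeam\StandardsOffice\Accounting\Procedures",
                           "StateAuditors"):
    print(cmd)
```

Run against that path, it emits grants for DocTeam, StandardsOffice, and Accounting, exactly the three directories named above.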

Second, AD permissions. AD, unlike eDirectory, does not allow the permissions short-cut of assigning a right to an OU. In eDirectory, this allowed anything in that OU access to the whatever. In AD, you can't grant the permission in the first place without a lot of trouble, and it won't work like you expect even if you do manage to assign it.

This is going to be a problem with printers. In the past, when creating new print objects for Faculty/Staff printers, I'd grant the users.wwu OU rights to use the printer. As students aren't in the access list, they can't print to it unless they're in a special printer-access group. All staff can print, but only special students can. As it should be. No biggie.

AD doesn't allow that. In order to allow "all staff but no students" to print to a printer, I'd have to come up with a group of some kind that contains all staff. That's going to be too unwieldy for words, so we have to go to the 'printer access group for everyone' model. Since I'm the one that sets up printer permissions, this is something *I* have to keep in mind.

Labels: , ,

Friday, July 24, 2009

Whoa! Another out-of-band patch from Microsoft!

TWO Updates!

One for Visual Studio, and a second for Internet Explorer. Due to the relative lack of IE use (okay, downright zero) on our Servers, we'll probably not hustle this one out the door. Our developers, on the other hand, should pay attention.

Labels: ,


We'll be getting our hardware early next week. Still no ETA on the storage we need to manage things.

Also, with Server 2008 R2 being released on the 14th, we get to pick which OS we want to run. Microsoft just made it under the bar for us to consider that rev for fall quarter. Server 2008 R2 has a lot of new stuff in it, such as even more PowerShell coverage (yes, it really is all that and a bag of chips) and PowerShell 2.0! "Printer Driver Isolation" should help minimize spooler faults, something I fear already even though I'm not running Windows printing in production yet. Built-in MPIO gets improvements as well, something that is very nice to have.

And yet... if we don't get it in time, we may not do it for the file-serving cluster. Especially if PCounter doesn't work with the rebuilt printing paths for some reason. Won't know until we get stuff.

But once we get a delivery date for the storage pieces, we can start talking actual timelines beyond, "we hope to get it all done by fall start." Like, which weekends we'll need to block out to babysit the data migrations.

Labels: , ,

Tuesday, July 21, 2009

Digesting Novell financials

It's a perennial question, "why would anyone use Novell any more?" Typically coming from people who only know Novell as "That NetWare company," or perhaps, "the company that we replaced with Exchange." These are the same people who are convinced Novell is a dying company who just doesn't know it yet.

Yeah, well. Wrong. Novell managed to turn the corner and wean themselves off of the NetWare cash-cow. Take the last quarterly statement, which you can read in full glory here. I'm going to excerpt some bits, but it'll get long. First off, their description of their market segments. I'll try to include relevant products where I know them.

We are organized into four business unit segments, which are Open Platform Solutions, Identity and Security Management, Systems and Resource Management, and Workgroup. Below is a brief update on the revenue results for the second quarter and first six months of fiscal 2009 for each of our business unit segments:

Within our Open Platform Solutions business unit segment, Linux and open source products remain an important growth business. We are using our Open Platform Solutions business segment as a platform for acquiring new customers to which we can sell our other complementary cross-platform identity and management products and services. Revenue from our Linux Platform Products category within our Open Platform Solutions business unit segment increased 25% in the second quarter of fiscal 2009 compared to the prior year period. This product revenue increase was partially offset by lower services revenue of 11%, such that total revenue from our Open Platform Solutions business unit segment increased 18% in the second quarter of fiscal 2009 compared to the prior year period.

Revenue from our Linux Platform Products category within our Open Platform Solutions business unit segment increased 24% in the first six months of fiscal 2009 compared to the prior year period. This product revenue increase was partially offset by lower services revenue of 17%, such that total revenue from our Open Platform Solutions business unit segment increased 15% in the first six months of fiscal 2009 compared to the prior year period.

[sysadmin1138: Products include: SLES/SLED]

Our Identity and Security Management business unit segment offers products that we believe deliver a complete, integrated solution in the areas of security, compliance, and governance issues. Within this segment, revenue from our Identity, Access and Compliance Management products increased 2% in the second quarter of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 45%, such that total revenue from our Identity and Security Management business unit segment decreased 16% in the second quarter of fiscal 2009 compared to the prior year period.

Revenue from our Identity, Access and Compliance Management products decreased 3% in the first six months of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 40%, such that total revenue from our Identity and Security Management business unit segment decreased 18% in the first six months of fiscal 2009 compared to the prior year period.

[sysadmin1138: Products include: IDM, Sentinel, ZenNAC, ZenEndPointSecurity]

Our Systems and Resource Management business unit segment strategy is to provide a complete “desktop to data center” offering, with virtualization for both Linux and mixed-source environments. Systems and Resource Management product revenue decreased 2% in the second quarter of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 10%, such that total revenue from our Systems and Resource Management business unit segment decreased 3% in the second quarter of fiscal 2009 compared to the prior year period. In the second quarter of fiscal 2009, total business unit segment revenue was higher by 8%, compared to the prior year period, as a result of our acquisitions of Managed Object Solutions, Inc. (“Managed Objects”) which we acquired on November 13, 2008 and PlateSpin Ltd. (“PlateSpin”) which we acquired on March 26, 2008.

Systems and Resource Management product revenue increased 3% in the first six months of fiscal 2009 compared to the prior year period. The total product revenue increase was partially offset by lower services revenue of 14% in the first six months of fiscal 2009 compared to the prior year period. Total revenue from our Systems and Resource Management business unit segment increased 1% in the first six months of fiscal 2009 compared to the prior year period. In the first six months of fiscal 2009 total business unit segment revenue was higher by 12% compared to the prior year period as a result of our Managed Objects and PlateSpin acquisitions.

[sysadmin1138: Products include: The rest of the ZEN suite, PlateSpin]

Our Workgroup business unit segment is an important source of cash flow and provides us with the potential opportunity to sell additional products and services. Our revenue from Workgroup products decreased 14% in the second quarter of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 39%, such that total revenue from our Workgroup business unit segment decreased 17% in the second quarter of fiscal 2009 compared to the prior year period.

Our revenue from Workgroup products decreased 12% in the first six months of fiscal 2009 compared to the prior year period. In addition, services revenue was lower by 39%, such that total revenue from our Workgroup business unit segment decreased 15% in the first six months of fiscal 2009 compared to the prior year period.

[sysadmin1138: Products include: Open Enterprise Server, GroupWise, Novell Teaming+Conferencing]

The reduction in 'services' revenue is, I believe, a reflection of a decreased willingness on the part of companies to pay Novell for consulting services. Also, Novell has changed how they advertise their consulting services, which seems to have had an impact. That's the economy for you. The raw numbers:

Three months ended April 30, 2009 and 2008 (in thousands; columns are net revenue / gross profit / income (loss)):

  Open Platform Solutions:             $44,112 / $34,756 / $21,451    vs.  $37,516 / $26,702 / $12,191
  Identity and Security Management:    (figures missing)
  Systems and Resource Management:     (figures missing)
  Workgroup:                           (figures missing)
  Common unallocated operating costs:  n/a / $(3,406) / $(113,832)    vs.  n/a / $(2,186) / $(131,796)
  Total per statements of operations:  $215,595 / $170,313 / $17,624  vs.  $235,666 / $175,199 / $1,667

Six months ended April 30, 2009 and 2008 (in thousands; same columns):

  Open Platform Solutions:             $85,574 / $68,525 / $40,921    vs.  $74,315 / $52,491 / $24,059
  Identity and Security Management:    (figures missing)
  Systems and Resource Management:     (figures missing)
  Workgroup:                           (figures missing)
  Common unallocated operating costs:  n/a / $(7,071) / $(228,940)    vs.  n/a / $(4,675) / $(257,058)
  Total per statements of operations:  $430,466 / $338,287 / $31,268  vs.  $466,592 / $348,184 / $10,148

So, yes. Novell is making money, even in this economy. Not lots, but at least they're in the black. Their biggest growth area is Linux, which is making up for deficits in other areas of the company. Especially the sinking 'Workgroup' area. Once upon a time, "Workgroup" constituted over 90% of Novell revenue.
Revenue from our Workgroup segment decreased in the first six months of fiscal 2009 compared to the prior year period primarily from lower combined OES and NetWare-related revenue of $13.7 million, lower services revenue of $10.5 million and lower Collaboration product revenue of $6.3 million. Invoicing for the combined OES and NetWare-related products decreased 25% in the first six months of fiscal 2009 compared to the prior year period. Product invoicing for the Workgroup segment decreased 21% in the first six months of fiscal 2009 compared to the prior year period.
Which is to say, companies dropping OES/NetWare constituted the large majority of the losses in the Workgroup segment. Yet that loss was almost wholly made up by gains in other areas. So yes, Novell has turned the corner.

Another thing to note in the section about Linux:
The invoicing decrease in the first six months of 2009 reflects the results of the first quarter of fiscal 2009 when we did not sign any large deals, many of which have historically been fulfilled by SUSE Linux Enterprise Server (“SLES”) certificates delivered through Microsoft.
Which is pretty clear evidence that Microsoft is driving a lot of Novell's Operating System sales these days. That's quite a reversal, and a sign that Microsoft is officially more comfortable with this Linux thing.

Labels: , , , , , , , ,
