Recently in netware Category

Worried about the IPv4 to IPv6 migration?

NetWare users had a similar migration when Novell finally got off of IPX and moved to native TCP/IP with the release of NetWare 5.0 in or around 1999. We've done it before. Like the IPv6 transition, it was reasons other than "because it's a good idea" that pushed for the retirement of IPX from the core network. Getting rid of old networking protocols is hard and involves a lot of legacy, so they stick around for a long, long time.

As it happens IPv6 is spookily familiar to old IPX hands, but better in pretty much every way. It's what Novell had in mind back in the 80's, but done right.

  • Dynamic network addressing that doesn't require DHCP.
  • A mechanism for whole-network announcements (SAP in IPX, various multicast methods for IPv6)
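
To make the first bullet concrete, here is a minimal Python sketch of one way an IPv6 host can address itself without DHCP: the classic "modified EUI-64" form of stateless autoconfiguration, where the host takes the /64 prefix announced on the wire and appends an identifier derived from its own MAC address. Old IPX hands will recognize the shape, since an IPX address was a network number plus the card's MAC. The function name, prefix, and MAC below are made up for illustration, and many modern operating systems use randomized or privacy identifiers instead of the MAC-derived one.

```python
import ipaddress


def slaac_address(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Build an IPv6 address from a /64 prefix and a MAC (modified EUI-64)."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")

    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02

    # Insert ff:fe in the middle to stretch 48 bits to a 64-bit interface ID.
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])

    network = ipaddress.IPv6Network(prefix)
    if network.prefixlen != 64:
        raise ValueError("this sketch expects a /64 prefix")

    return network.network_address + int.from_bytes(interface_id, "big")


# Documentation prefix and an invented MAC, purely for illustration.
print(slaac_address("2001:db8:0:1::/64", "00:1a:2b:3c:4d:5e"))
```

No server hands out the address; the host works it out from what the router advertises plus what it already knows about itself.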

Anyway, you have a network protocol you need to eventually retire, but pretty much everything uses it. What do you do? Like the stages of grief, there is a progression at work here:

  1. Ignore it. We're using the old system just fine, it's going to work for the foreseeable future, no reason to migrate.
  2. On by default, but disabled manually. The installer asks for the new stuff, but we just turn it off as soon as the system is up. We're not migrating yet.
  3. The WAN link doesn't support the old stuff. Um, crap. Tunnel the old stuff over the new stuff for that link and otherwise... continue to not migrate.
  4. Clients go on-by-default, but disabled manually. New clients are supporting the new stuff, but we disable it manually when we push out new clients. We're not migrating.
  5. Clients get trouble related to protocol negotiation. Thanks to the tunnel there is new stuff out there and clients are finding it, but can't talk to it. Which is creating network delays and causing support tickets. Find ways to disable protocol discovery, push that out to clients.
  6. Internal support says all the manual changes are harshing their workflow, and can we please migrate since everything supports it now anyway. Okay, maybe we can go dual stack now.
  7. Network team asks if they can turn off the old stuff since everything is also using the new stuff. Say no, and revise deploy guides to start disabling the old stuff on clients but keep it on servers just in case.
  8. Network team asks again since the networking vendor has issued a bulletin on this stuff. Audit servers to see if there is any old stuff usage. Find that the only usage is between the servers themselves and some really old, extremely broken stuff. Replace the broken stuff, turn off the old stuff stack on servers.
  9. Migration complete.

At WWU we finished our IPX to IP migration by following this road and it took us something like 7 years to do it.

Ask yourself where you are in your IPv6 implementation. At WWU when I left we'd gotten to step 5 (but didn't have a step 3).

I've done this before, and so have most old NetWare hands. Appeals to best practices and address-space exhaustion won't work as well as you'd hope; feeling the pain of the protocol transition does. Just like we're seeing right now. Migration will happen after operational pain is felt, because people are lazy. We're going to have RFC1918 IPv4 islands hiding behind corporate firewalls for years and years to come, with full migration only happening after devices stop supporting IPv4 at all.

The IPX transition was a private-network only transition since it was never transited over the public Internet. The IPv6 transition is Internet wide, but there are local mitigations that will allow local v4 islands to function for a long, long time. I know this, since I've done it before.

This is a bit of a rehash of a post I did back in 2005, but Novell did it right when it came to handling user credentials way back in the late 80's and early 90's. The original documents have pretty much fallen off the web, but Novell chose to use a one-way RSA method (or possibly a two-way RSA method but elected to not retain the decryption key, which is much the same thing) to encipher the passwords. The certificate used in this method was generated by the tree itself at creation time, so was unique per tree.

The authentication process looked something like this (from memory, see also: primary documentation is offline)

  1. Client connects to a server, says, "I want to log in a user, here is a temporary key."
  2. Server replies using the temporary key, "Sure. Here is my public key and a salt."
  3. Client says, "I want to log in bobjoe42.employees.corporate"
  4. Server replies, "Here is the public key for bobjoe42.employees.corporate"
  5. Client crypts the password with bobjoe42's certificate.
  6. Client crypts the cryptotext+salt with the server's signing key.
  7. Client says, "Here is the login for bobjoe42.employees.corporate"
  8. Server decrypts login request to get at the cryptotext+salt of bobjoe42.employees.corporate.
  9. Removes salt.
  10. Server compares the submitted cryptotext to the cryptotext on bobjoe42.employees.corporate's object. It matches.
  11. Server says, "You're good."

Unfortunately, the passwords were monocased before crypting computation.

Fortunately, they allowed really long passwords unlike many systems (ahem 1993 version of UNIX-crypt).

That said, this system does a lot of password-handling things right:

  1. Passwords are never passed in the clear over the network, only the enciphered values are transferred.
  2. Passwords are never stored in the clear.
  3. Passwords are never stored in a reversible way.
  4. Reversible keys are never transferred in the clear.
  5. The password submission process prevents replay attacks through the use of a random salt with each login transaction.
  6. The passwords themselves were stored encrypted with tree-unique crypto certificates, so the ciphertext of a password in one tree would look different than the same password in a different tree.

You can get a similar system using modern web technologies:

  1. Client connects to server over SSL. A secure channel is created.
  2. Client retrieves javascript or whatever from the server describing how to encode login credentials.
  3. Client says, "I'm about to log in, give me a salt."
  4. Server returns a salt to the client.
  5. Client computes a salted hash from the user-supplied password.
  6. Client submits, "I'm logging in bobjoe42@zmail.us with this hashtext."
  7. Server compares the hashtext to the password database, finds a match.
  8. Server replies, "You're good, use this token."
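
The steps above can be sketched in Python as a simple challenge-response exchange. This is a minimal illustration under stated assumptions, not a vetted protocol: the function names are hypothetical, the stored "hashtext" is assumed to be a PBKDF2 hash with a per-user salt, and the per-login salt is treated as a one-time nonce that the client mixes in with HMAC. Both sides run here as plain function calls instead of talking over SSL.

```python
import hashlib
import hmac
import secrets


def make_verifier(password: str, user_salt: bytes) -> bytes:
    """What the server stores: a slow, salted hash, never the password itself."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), user_salt, 100_000)


def client_response(password: str, user_salt: bytes, nonce: bytes) -> bytes:
    """Client side: rebuild the verifier, then bind it to this login's nonce."""
    verifier = make_verifier(password, user_salt)
    return hmac.new(verifier, nonce, hashlib.sha256).digest()


def server_check(stored_verifier: bytes, nonce: bytes, response: bytes) -> bool:
    """Server side: compute the same value from the stored verifier and compare."""
    expected = hmac.new(stored_verifier, nonce, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)


# Account setup (done once).
user_salt = secrets.token_bytes(16)
stored = make_verifier("correct horse battery", user_salt)

# One login: the server issues a fresh nonce, the client answers with it mixed in.
nonce = secrets.token_bytes(16)
response = client_response("correct horse battery", user_salt, nonce)
print(server_check(stored, nonce, response))

# A captured response is useless against the next login's nonce.
print(server_check(stored, secrets.token_bytes(16), response))
```

One caveat with this particular sketch: the stored verifier is effectively password-equivalent, so anyone who steals the database can answer challenges. That is one reason many sites settle for sending the password over the SSL channel and hashing it server-side.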

However, a lot of systems don't even bother going that complex, relying instead on the SSL channel to provide transaction security and allowing unhashed passwords to be passed over that crypted channel. That's "good enough" for a lot of things, and clearly Novell was rather paranoid back in the day.

As it happened, that method ended up being so secure they had to change their authentication system when it came time to handle systems that wanted to authenticate using non-NCP methods like, oh, CIFS, or Appletalk. Those other protocols don't have mechanisms to handle the sort of handshake that NCP allows so something else had to be created, and thus the Universal Password was born. But that's kind of beyond the scope of this article.

Yep, they did it right back then. A network sniffer on the network (a lot easier in the days of hubbed networks) was much less likely to yield tasty numnums. SMB wasn't so lucky.

Novell introduced NDS with NetWare 4.0 in 1993, and it is still being shipped 21 years later as part of Open Enterprise Server.

For those of you who've never run into it, NDS (Novell Directory Services, currently marketed as eDirectory) is currently a distributed LDAP database that also provides non-LDAP interfaces for interacting with the object store. It can scale up to very silly object counts and due to Novell's long experience with distributed database management does so with a minimum of object corruption. It just works (albeit on a proprietary system).

It didn't start off as an LDAP datastore, though. No, it began life in 1993 as the authentication database behind NetWare and had a few very revolutionary features versus what was available on the market at the time:

  • It allowed multiple servers to use the same authentication database, so you didn't have to have an account on each server if users needed to access more than one of them. This was the biggest selling point, and seems pretty basic right now.
    • NIS/NIS+ already did this and predates NDS, but was a UNIX-only system not useful for non-UNIX offices.
  • It ran the database on multiple nodes, which made it a replicated database.
  • It partitioned the database to provide improved database locality, which made it a sharded database.
  • It allowed write operations on more than one replica per shard, which made it a distributed database.
  • It had eventual convergence built into it.
  • It had robust authentication features, which I'll get into in a later post.

NDS was a replicated, distributed, sharded database with eventual convergence that was written in 1993. MongoDB can do three of those (replicated, sharded, and eventually consistent; writes go to a single primary per shard, though it can distribute reads if needed), Cassandra does all four. This is a solvable problem but it's a rather complex one, as Novell found out.
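
For readers who haven't worked with multi-writer replication, here is a minimal Python sketch of what "eventual convergence" means in practice. It uses a last-writer-wins rule keyed on a timestamp, with the replica name as a tie-breaker. This is an illustration of the general idea only; the class, the attribute names, and the merge rule are assumptions made for the example and are not a description of how NDS actually synchronized replicas.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    name: str
    # attribute name -> (timestamp, writing replica, value)
    data: dict = field(default_factory=dict)

    def write(self, key: str, value: str, timestamp: int) -> None:
        """Accept a local write; every replica is writable."""
        self._apply(key, (timestamp, self.name, value))

    def _apply(self, key: str, entry: tuple) -> None:
        current = self.data.get(key)
        # Newest timestamp wins; the replica name breaks exact ties.
        if current is None or entry[:2] > current[:2]:
            self.data[key] = entry

    def sync_from(self, other: "Replica") -> None:
        """Pull the other replica's entries and keep the winners."""
        for key, entry in other.data.items():
            self._apply(key, entry)

    def view(self) -> dict:
        return {key: entry[2] for key, entry in self.data.items()}


nyc = Replica("nyc")
tokyo = Replica("tokyo")

# Conflicting writes land on different replicas while the WAN link is slow.
nyc.write("bobjoe42.phone", "555-0100", timestamp=10)
tokyo.write("bobjoe42.phone", "555-0199", timestamp=12)
tokyo.write("bobjoe42.title", "Manager", timestamp=11)

# Once both sides exchange state, they agree.
nyc.sync_from(tokyo)
tokyo.sync_from(nyc)
print(nyc.view() == tokyo.view(), nyc.view())
```

Even this toy depends on the replicas having roughly comparable clocks, which hints at how many ways the real thing can go wrong.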

Consider the state of networking in 1993.

  • 10Mbps Ethernet was high-speed, and was probably hubbed even in the "datacenter".
  • Any enterprise of any size had very slow WAN links connecting small offices to central, so you had high latency links.
  • 16Mbps token-ring was still in frequent enough use NetWare had to support it.
    • Since TR was faster than Ethernet, it was frequently deployed in the datacenter, which necessitated TR to Ethernet bridges.
    • TR was often the edge network as well.
  • The tech industry hadn't yet converged on a single Ethernet Layer 2 framing protocol, so anything talking to Ethernet had to be able to handle up to 4 different framing standards (to the best of my limited knowledge, Cisco gear still can be configured to use any of the three losers of that contest, even though none of them has been in common usage for a long time).
  • TCP/IP was not the only data standard, NetWare used its very own IPX protocol which is not an IP protocol (more on that in a later post).

Can you imagine trying to run something like Cassandra on 10Mbps links with some nodes on the other side of links with pings approaching 1000ms? It can certainly be done, but it sure as heck magnifies any problems in the convergence protocol.

Novell learned that too. Early versions of NDS were prone to corruption, very prone. Real world networking conditions were so very unlike the assumed conditions the developers engineered in that it was only after NDS hit production that they truly appreciated the full array of situations it had to support. From memory, it was only after NDS version 6 released on or about NetWare 4.11 service-pack 3 that it really became stable. That took Novell over 4 years to get right.

Corruption bugs continued in NDS even into the modern era since that's a very hard problem to stomp. The edge cases surrounding a node disappearing and reappearing with old/new/changed data, and how convergence then happens, get very nuanced, very quickly. The open-source distributed database projects are dealing with that right now.

For all that it was a strong backing database for very large authentication and identity databases, NDS/eDirectory was never designed to be highly transactional. It's an LDAP database, and you use it where you'd use an LDAP database.

NetWare Retrospective


As I've recently been through a change of jobs I've had a lot of chance to look back on my career. That career is long enough to have included Novell NetWare in it quite prominently, though I no longer point that out on my resume unless I feel a specific employer would be impressed by that. Novell was doing a lot of familiar things 20-odd years ago, and this blog series will be a retrospective on some old-yet-new problems that were solved in the 90's, but we're still fighting today.

The evil genius of OSv


One of the talks here at LISA13 was one about a new Cloud-optimized operating system called OSv. This is a new thing, and I hadn't heard of it before. Why do we need yet another OS? And one that doesn't even run a Linux kernel? I was frowning through the talk until I got to this slide:

NotNetware.jpg

That's the point when I said:

Holy shit! They've built a 64-bit NetWare!

  • Cooperative multi-tasking? Check!
  • A shared memory space? Check!
  • Everything runs in Ring 0? Check!

There were a few other things that made the parallel even more clear to me, but this is a stunning display of evil genius. Even though Novell tried for ten years to promote NetWare as a perfectly legitimate general purpose server for application serving, it never really took off. There were several reasons for this (not exhaustive):

  • It was a pain to develop for. The NLM model never got anything approaching wide-spread adoption so you had to get everything just right.
  • The shared memory space meant that the OS allowed you to stomp all over other processes running on the system, something that other OSs (Windows, Linux) don't allow.
  • If something did manage to wiggle out of the app and into the kernel, it had free rein (though in practice all it did was abend the server; writing exploits is subject to the first bullet-point problem).
  • It didn't have any concept of forking, just threads. Which changed the multi-processing paradigm from what it was on most other platforms and made porting software to it a pain.
  • There were no significant user-space utilities (grep/sed/awk/bash), though they did get some of that well after they'd lost the battle.

All of these made NetWare a challenging platform to develop for, and challenging platforms don't get developed for. Novell tried to further encourage people to develop for it by getting the Java JVM ported to NetWare so people could run Java apps on it. Few did, though it was quite possible; search for "netstorage" on this blog to get one such application that saw a lot of use.

Have I mentioned that OSv's first release ships with a JVM on it?


The Evil Genius part is that they're not wrong, things really do run faster when you write a kernel like that and run things in the same memory space as the kernel. I got pretty nice scaling with Apache when I was running it on NetWare.

The Evil Genius part is that they're designing this system to be a single-app system, not a general purpose system like NetWare was supposed to be. It runs a JVM, and that's it. The JVM can only stomp on itself and the kernel, and apps can stomp on each other within the limits of the JVM.

The Evil Genius part is that if it does fall over, it's designed to be flushed and a fresh copy spun up in its place. Disposable servers! NetWare servers of old were bastion hosts that Shall Never Go Down. OSv? Not the same thing at all.

The Evil Genius part is that they're doing this in an era where a system like this can actually succeed.

The Evil Genius part is that everyone looks at what they're doing and goes, "...uh HUH. Riiiight. Like that's a good idea." And like evil geniuses of the past will go unrecognized and slink off to some dark corner somewhere to cackle and dream of world domination that will never happen.

Migrating off of NetWare

It has been around a year since we did the heavy lifting of migrating off of NetWare and retiring our eDirectory tree. By this point last year we had our procedures in place, we just needed to pull the trigger and start moving data around. I was asked to provide some hints about it, but the mail bounced with a 550-mailbox-not-found error *ahem*.

Because it's such a narrowly focused topic, and the WWU people who read me lived through it and therefore already know this stuff, I'm putting the meat of the post under the fold.

You're welcome.
There is a certain question that has shown up in pretty much every class about how to set up an X.500-compliant directory service (that's things like Active Directory, NDS, and eDirectory). It goes like this:
You have been hired as a consultant to set up $FakeCorpName's new $Directory. They have major offices in five places. New York, Los Angeles, London, Sydney, and Tokyo. They have five $OldTech. What is the directory layout you recommend?
I originally ran into this particular question when I was getting my Certified Novell Administrator certification back in 1996. In that case $Directory was NDS and $OldTech was actually other NDS trees. In 2000/2001 when I was getting my Active Directory training, $Directory was AD and $OldTech was NT4 domains. The names of the cities did not vary much between the two. NYC and LA are always there, as are London and Tokyo. Sometimes Paris is there instead of Sydney. Once in a great while you'll see Hong Kong instead of Tokyo. In a fit of continental inclusiveness, I think I saw "Johannesburg" in there once (in an Exchange class IIRC). I ran into this question again recently in relation to AD.

This is a good academic question, but you will never, ever get it that easy in real life. This question is good for considering how geographically diverse corporate structure impacts your network layout and the knock-on effects that can have on your directory structure. However, the network is only a small part of the overall decision making process when it comes to problems like these.

The major part? Politics.

It is now 2010. Multi-national companies have figured out this 'office networking' thingy and have a pre-existing infrastructure. They have some kind of directory tree, somewhere, even if it only exists in their ERP system (which they all have now). They have office IT people who have been doing that work for 15+ years. A company that size has probably bought out competitors, which introduces strange networking designs to their network. Figuring out how to glue together 5 geographically separate WinNT4.0 domains in 2010 is not useful. The problem is not technical, it's business.

1996
In 1996, WAN links were expensive and slow. NDS was the only directory of note on the market (NIS+ was a unix directory, therefore completely ignored in the normal business windows-only workplace). Access across WAN links was generally discouraged unless specifically needed. Because of this, your WAN links gave you the no-brainer divisions in your NDS tree where Replicas needed to be declared. All the replication traffic would stay within that site and only external reference resolution would cross the expensive WAN. Resources the entire company needed access to might go in a specific, smaller, replica that gets put on multiple sites.

This in turn meant that the top levels of your NDS tree had a kind of default structure. Many early NDS diagrams had a structure like this:

An early NDS diagram
Each of the top-level "C" containers was a replica. The US example was given to show how internal organization could happen. Snazzy! However, this flew in the face of real-world experience. Companies merge. Bits get sold off. By 2000 Novell was publishing diagrams similar to this one:

A later NDS diagram
This one was designed to show how company mergers work. Gone are the early "C" containers, in their place are "O". Merging companies? Just merge that NDS tree into a new O, and tada! Then you can re-arrange your OUs and replicas at your leisure.

This was a sign that Novell, the early pioneer in directories like this, had their theory run smack into reality with bad results. The original tree style with the top level C containers didn't handle mergers and acquisitions well. Gone was the network purity of the early 1996 diagrams, now the diagrams showed some signs of political influence.

2000
In 2000, Microsoft released Windows 2000 and Active Directory. The business world had been on the Internet for some time, and the .com boom was in full swing. WAN links were still expensive and slow, but not nearly as slow as they used to be. The network problem Microsoft was faced with was merging multiple NT4 domains into a single Active Directory structure.

In 2000, AD inter-DC replication was a lot noisier than eDirectory was doing at the time, so replication traffic was a major concern. This is why AD introduced the concept of Sites and inter-Site replication scheduling. Even so, the diagrams you saw then were reminiscent of the 1996 NDS diagrams:
An early AD diagram
As you can see, separate domains for NYC and LA are gone, which is recognition that in-country WAN links may be fast enough for replication, but transcontinental links were still slow. Microsoft handled the mergers-and-acquisitions problem with inter-domain trusts (which, thanks to politics, tend to be hard to get rid of once in place).

AD replication improved with both Server 2003 and Server 2008. The Microsoft ecosystem got used to M&A activity the same way Novell did a decade earlier and changes were made to best practices. Also, network speeds improved a lot.

2010
In 2010 WAN links are still slow relative to LAN links, but they're now fast enough that directory replication traffic is not a significant load for all but the slowest of such links. Even trans-continental WAN links are fat enough that directory replication traffic doesn't eat too much valuable resources.
An AD tree in the modern era
Note how simple this is. There is an empty root to act as nothing but the root of an entire tree. Northwinds is the major company and it recently bought DigitalRiver, but hasn't fully digested it yet. Note the lack of geographic separation in this chart. WAN speeds have improved (and AD replication has improved) enough that replicating even large domains over the WAN is no longer a major no-no.
  


And yet... you'll rarely see trees like that. That's because, as I said, network considerations are not the major driver behind organization these days, it's politics.

Take the original question at the top of this post. Consider it 5 one-domain AD trees, and each country/city is its own business unit that's large enough to have their own full IT stack (people dedicated to server, desktop, web support, and developers supporting it all), and has also been that way for a number of years. This is what you'll run into in real life. This is what will monkey-wrench the network purity of the above charts.

The biggest influence towards whether or not a one-domain solution can be reached will be the political power behind the centralizing push, and how uncowed they get when Very Important People start throwing their weight around. If the CEO is the one pushing this and brooks no argument, then, well, it's more likely to happen. If the COO is the one pushing it, but caves to pressure in order to not expend political capital with regards to unrelated projects, you may end up with a much more fragmented picture.

There will be at least one, and perhaps as many as five, business units that will insist, adamantly, that they absolutely have to keep doing things the way they've always been doing it, and they can't have other admins stomping around their walled garden in jack-boots. Whether or not they get their way is a business decision, not a technical one. Caving into these demands will give you an AD structure that includes multiple domains, or worse, multiple forests.
Fragmented AD environment
In my experience, the biggest bone of contention will be who gets to be in the Domain and Enterprise Admins groups. Those groups are the God Groups for AD, and everyone has to trust them. Demonstrating that only a few tasks require Domain Admin rights and that nearly all day-to-day administration can be done through effectively delegated rights will go a long way towards alleviating this pressure, but that may not be enough to convince business managers weighing in on the process.

The reason for this resistance is that this kind of structural change will require changes to operational procedures. You may think IT types are used to change, but you'd be wrong. Change can be resented just as fiercely in the ranks of IT-middle-managers as it is in rank-n-file clerks. Change for change's sake is doubly resented.

Overcoming this kind of political obstructionism is damned hard. It takes real people skills and political backing. This is not the kind of thing you can really teach in an MCSE/MCITP class track. Political backing has to already be in place before the project even gets off the ground.

I haven't been in an MCSE/MCITP class, so I don't know what Microsoft is teaching these days. I ran into this question in what looks like a University environment, which is a bit less up-to-date than getting it direct from Microsoft would be.  Perhaps MS is teaching this with the political caveats attached. I don't know. But they should be doing so.
On a Wednesday in August in 1996, the WWU NDS tree was born. There were other trees, but this is the one that everyone else merged into. The one tree to rule them all. That was NetWare 4. It brought the directory, and it was glorious (when it worked right).

And now, most of 14 years later, it is done. The last replica servers were powered off today after a two year effort to disentangle WWU from NetWare.

I have some blog-header text to change.

Password policies in AD

One of the more annoying problems with password and account-lockout policies in Active Directory has been that they apply to every account universally. If you want to force your users to change passwords every 90 days, with account lockout after a certain number of bad login attempts, then the same policies apply to your 'Administrator' user. Account lock-out was a really great way to DoS yourself in really critical ways.

In a way, that's what account-lockout is all about. It's to keep bad people from coming in, but it's also a way for bad people to prevent legitimate people from using their own accounts. You need to take the good with the bad.

Since we were a NetWare shop for y-e-a-r-s we're very used to Intruder Lockout (ILO), and losing it during the move to Windows was seen as a loss of a key security feature. We had accounts that had to be exempted from lockout, which was dead easy in eDirectory but very difficult in AD.

Happily, Server 2008 introduces a way to do this. It's called "Fine-Grained Password Policy", and is NOT group-policy based. This was somewhat surprising. Getting this requires raising the domain and forest functional levels to the 2008 level. What it allows is setting password policy based on group memberships, with conflict resolution handled by a priority setting on the policy itself. Interestingly, the actual policies are created through ADSI Edit, so they're not beginner-friendly.

For instance, we can set a 'lock out after 6 tries in 30 minutes' setting to the Domain Users group at a Priority of 30, and a second 'never lock out ever' policy to the Domain Admins group at a Priority of 20. That way 'Administrator' will have the never-lock policy apply to it, but Joe User will have the lock-after-6-in-30 policy apply. This works best if the password policy specifies that Domain Admins need to have very complex and long passwords, which makes a brute-force cracking attempt take unreasonably long amounts of time.
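
The conflict-resolution rule in that example can be modeled in a few lines of Python. This is a simplified sketch of the idea only: among the policies that apply to any of a user's groups, the one with the lowest priority number wins. The class, the field names, and the convention that a threshold of 0 means "never lock out" are assumptions for illustration, and nothing here touches a real directory.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class PasswordPolicy:
    name: str
    priority: int            # lower number wins a conflict
    applies_to: frozenset    # group names the policy is linked to
    lockout_threshold: int   # bad attempts before lockout; 0 = never (sketch convention)


def effective_policy(user_groups: Iterable[str],
                     policies: Iterable[PasswordPolicy]) -> Optional[PasswordPolicy]:
    """Return the winning policy for a user, or None if no policy applies."""
    groups = set(user_groups)
    candidates = [p for p in policies if p.applies_to & groups]
    return min(candidates, key=lambda p: p.priority, default=None)


policies = [
    PasswordPolicy("lock-after-6-in-30", 30, frozenset({"Domain Users"}), 6),
    PasswordPolicy("never-lock", 20, frozenset({"Domain Admins"}), 0),
]

# 'Administrator' is in both groups, so the priority-20 policy wins.
print(effective_policy(["Domain Users", "Domain Admins"], policies).name)
# Joe User only matches the Domain Users policy.
print(effective_policy(["Domain Users"], policies).name)
```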

We put this in place a few weeks ago, and it is working as we expected. SO GLAD to have this.

TCP problems

My testing for a cheap NAS solution has progressed to the option that costs the most money, Windows 2008 running KernSafe's iStorage. As it happens, it works really well when the iSCSI initiator is Windows but Linux clients don't really want to talk to it. Windows: 30-50 MB/s. Linux: 3-5 MB/s. Biiiig difference there.

Looking at packets I'm noticing a similar pattern on the wire to one I'd seen before. Back when I was troubleshooting exactly why NetWare backups to DataProtector were horrible I came across this problem. It seems that TCP Windowing is fundamentally broken between Server 2008 and NetWare which leads to really bad throughputs, which in turn is very bad for half TB backups. The receiving server seemed to feel the need to ACK after every two packets, which in turn really slowed things down. And that's what the Linux clients are doing for iSCSI to Server 2008.

It has to be something affecting basic TCP services but not complex protocols. Using smbclient to upload a 4GB DVD iso runs at 50MB/s but the iSCSI throughput on the same client is a piddly 3-5MB/s. I'm sure some kind of tuning on either side might be able to jar things loose, heaven knows Linux 2.6.31 is a heck of a lot more current on TCP settings than NetWare 6.5 SP8 is. I just haven't found it yet.
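
As a back-of-the-envelope check on why a stuck TCP window hurts this much, here is a small Python calculation. A sender can only have one window's worth of data in flight per round trip, so throughput is capped at roughly window size divided by round-trip time. The 1460-byte segment size and the 1 ms round trip are assumed values for illustration, not measurements from this network, and the function name is made up.

```python
def max_throughput_mb_s(segments_in_flight: int, mss_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on TCP throughput: bytes in flight per round trip, in MB/s."""
    return segments_in_flight * mss_bytes / rtt_seconds / 1_000_000


# If the window never opens past two segments (assumed MSS and RTT):
stalled = max_throughput_mb_s(2, 1460, 0.001)
# A window of roughly 64KB (44 segments) over the same path:
healthy = max_throughput_mb_s(44, 1460, 0.001)

print(f"stalled: {stalled:.2f} MB/s, healthy: {healthy:.2f} MB/s")
```

Under those assumptions the difference is more than an order of magnitude, which is at least the same shape as the gap between the two sets of clients. It doesn't prove that's what is happening here, but it's why the windowing behavior is the first thing to look at.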

Conversely, Server 2008 talking to a Linux iSCSI client works at line speed pretty much. I'm testing this for completeness's sake. We need something that can serve up to 30TB via both iSCSI and SMB. My findings aren't fully complete yet, but in general:
  • OpenFiler: GREAT iSCSI host, completely blows for SMB in our environment.
  • OpenSolaris: Great iSCSI host, just can't convince the kernel-mode CIFS to join our domain. Also, worst-of-breed random I/O performance.
  • OpenFiler + Windows: OpenFiler for iSCSI, Windows (mounting an iSCSI share) for SMB. Should work GREAT. Current best bet for the future.
  • OpenSolaris + Windows: As previous option, but I/O problems make it less attractive.
  • Windows + KernSafe: GREAT SMB performance, solid iSCSI for Windows hosts. Linux hosts will take lots of tuning (perhaps, or it could be intractable).
