NetWare Retrospective Part 1: Distributed databases

Novell introduced NDS with NetWare 4.0 in 1993, and is still being shipped 21years later as part of Open Enterprise Server.

For those of you who've never run into it, NDS (Novell Directory Services, currently marketed as eDirectory) is currently a distributed LDAP database that also provides non-LDAP interfaces for interacting with the object store. It can scale up to very silly object counts and due to Novell's long experience with distributed database management does so with a minimum of object corruption. It just works (albeit on a proprietary system).

It didn't start off as an LDAP datastore, though. No, it began life in 1993 as the authentication database behind NetWare and had a few very revolutionary features versus what was available on the market at the time:

  • It allowed multiple servers to use the same authentication database, so you didn't have to have an account on each server if users needed to access more than one of them. This was the biggest selling point, and seems pretty basic right now.
    • NIS/NIS+ already did this and predates NDS, but was a UNIX-only system not useful for non-UNIX offices.
  • It ran the database on multiple nodes, which made it a replicated database.
  • It partitioned the database to provide improved database locality, which made it a sharded database.
  • It allowed write operations on more than one replica per shard, which made it a distributed database.
  • It had eventual convergence built into it.
  • It had robust authentication features, which I'll get into in a later post.

NDS was a replicated, distributed, sharded database with eventual convergence that was written in 1993. MongoDB can do three of those (replicated, sharded, eventual consistency, but can distribute reads if needed), Cassandra does all four. This is a solvable problem but it's a rather complex one as Novell found out.

Consider the state of networking in 1993.

  • 10Mbps Ethernet was high-speed, and was probably hubbed even in the "datacenter".
  • Any enterprise of any size had very slow WAN links connecting small offices to central, so you had high latency links.
  • 16Mbps token-ring was still in frequent enough use NetWare had to support it.
    • Since TR was faster than Ethernet, it was frequently deployed in the datacenter, which necessitated TR to Ethernet bridges.
    • TR was often the edge network as well.
  • The tech industry hadn't yet converged on a single Ethernet Layer 2 framing protocol, so anything talking to Ethernet had to be able to handle up to 4 different framing standards (to the best of my limited knowledge, Cisco gear stillcan be configured to use any of the three losers of that contest, even though none of them has been in common usage for a long time).
  • TCP/IP was not the only data standard, NetWare used its very own IPX protocol which is not an IP protocol (more on that in a later post).

Can you imagine trying to run something like Cassandra on 10Mbps links with some nodes on the other side of links with pings approaching 1000ms? It can certainly be done, but it sure as heck magnifies any problems in the convergence protocol.

Novel learned that too. Early versions of NDS were prone to corruption, very prone. Real world networking conditions were so very unlike the assumed conditions the developers engineered in that it was only after NDS hit production that they truly appreciated the full array of situations it had to support. From memory, it was only after NDS version 6 released on or about NetWare 4.11 service-pack 3 that it really became stable. That took Novell over 4 years to get right.

Corruption bugs continued in NDS even into the modern era since that's a very hard problem to stomp. The edge cases surrounding a node disappearing, and reappearing with old/new/changed data and how convergence happens gets very nuanced, very quickly. The open-source distributed database projects are dealing with that right now.

For all that it was a strong backing database for very large authentication and identity databases, NDS/eDirectory was never designed to be highly transactional. It's an LDAP database, and you use it where you'd use an LDAP database.