Explaining LDAP.

| 1 Comment
The question was asked recently...

"How would you explain LDAP to a sysadmin who'd've heard of it, but not interacted with it before."

My first reaction illustrates my own biases rather well. "How could they NOT have heard of it before??" goes the rant in my head. Active Directory, choice of enterprise Windows deployments everywhere, includes both X500 and LDAP. Anyone doing unified authentication on Linux servers is using either LDAP or WinBind, which also uses LDAP. It seems that any PHP application doing authentication probably has an LDAP back end to it1. So it seems somewhat disingenuous to suppose a sysadmin who didn't know what LDAP was and could do.

But then, I remind myself, I've been playing with X500 directories since 1996 so LDAP was just another view on the same problem to me. Almost as easy as breathing. This proposed sysadmin probably has been working in a smaller shop. Perhaps with a bunch of Windows machines in a Workgroup, or a pack of Linux application servers that users don't generally log in to. It IS possible to be in IT and not run into LDAP. This is what makes this particular question an interesting challenge, since I've been doing it long enough that the definition is no longer ready to my head. Unfortunately for the person I'm about to info-dump upon, I get wordy.

LDAP.... Lightweight Directory Access Protocol. It came into existence as a way to standardize TCP access to X500 directories. X500 is a model of directory service that things like Active Directory and Novell eDirectory/NDS implemented. Since X500 was designed in the 1980's it is, perhaps, unwarrantedly complex (think ISDN), and LDAP was a way to simplify some of that complexity. Hence the 'lightweight".

LDAP, and X500 before it, are hierarchical in organization, but doesn't have to be. Objects in the database can be organized into a variety of containers, or just one big flat blob of objects. That's part of the flexibility of these systems. Container types vary between directory implementation, and can be an Organizational Unit (OU=), a DNS domain (DC=), or even a Cannonical Name (CN=), if not more. The name of an object is called the Distinguished Name (DN), and is composed of all the containers up to root. An example would be:

CN=Fred,OU=Users,DC=organization,DC=edu

This would be the object called Fred, in the 'Users' Organizational Unit, which is contained in the organization.edu domain.

Each directory has a list of classes and attributes allowable in the directory, and this is called a Schema. Objects have to belong to at least one class, and can belong to many. Belonging to a class grants the object the ability to define specific attributes, some of which are mandatory similar to Primary Keys in database tables.

Fred is a member of the User class, which itself inherits from the Top class. The Top class is the class that all other classes inherit from, as it defines the bare minimum attributes needed to define an object in the database. The User class can then define additional attributes that are distinct to the class, such as "first name", "password", and "groupMembership".

The LDAP protocol additionally defines a syntax for searching the directory for information. The return format is also defined. Lets look at the case of an authentication service such as a web-page or Linux login.

A user types in "fred" at the Login prompt of an SSH login to a linux server. The linux server then asks for a password, which the user provides. The Pam back-end then queries the LDAP server for objects of class User named "fred", and gets one, located at CN=Fred,OU=Users,DC=organization,DC=edu. It then queries the LDAP server for objects of class Group that are named CN=LinuxServerAccess,OU=Servers,DC=Organization,DC=EDU, and pulls the membership attributes. It finds that Fred is in this group, and therefore allowed to log in to that server. It then makes a third connection to the LDAP server and attempts to authenticate as Fred, with the password Fred supplied at the SSH login. Since Fred did not fat finger his password, the LDAP server allows the authenticated login. The Linux server detects the successfull login, and logs out of LDAP, finally permiting Fred to log in by way of SSH.

As I said before, these databases can be organized any which way. Some are organized based on the Organizational Chart of the organization, not with all the users in one big pile like the above example. In that case, Fred's distinguished-name could be as long as:

CN=Fred,ou=stuhelpdesk,ou=DesktopSupport,ou=InfoTechSvcs,ou=AcadAffairs,dc=organization,dc=edu

How to organize the directory is up to the implementers, and is not included in the standards.

The higher performing LDAP systems, such as the systems that can scale to 500,000 objects or higher, tend to index their databases in much the same way that relational databases do. This greatly speeds up common searches. Searching for an object's location is often one of the fastest searches an LDAP system can perform. Because of this LDAP very frequently is the back end for authentication systems.

LDAP is in many ways a speciallized identity database. If done right, on identical hardware it can easilly outperform even relational-databases in returning results.

Any questions?
Yeah, I get wordy.

1: Yes, this is wrong. MySQL contains a lot, if not most, of these PHP-application logins on the greater internet. I said I had my biases, right?

1 Comment

Great explanation! Trackback