An older problem

I deal with some large file-systems. Because of what we do, we get shipped archives with a lot of data in them. Hundreds of gigs sometimes. These are data provided by clients for processing, which we then do. Processing sometimes doubles, or even triples or more, the file-count in these filesystems depending on what our clients want done with their data.

One 10GB Outlook archive file can contain a huge number of emails. If a client desires these to be turned into .TIFF files for legal processes, that one 10GB .pst file can turn into hundreds of thousands of files, if not millions.

I've had cause to change some permissions at the top of some of these very large filesystems. By large, I mean larger than the big FacShare volume at WWU in terms of file-counts. As this is on a Windows NTFS volume, it has to walk the entire file-system to update permissions changes at the top.

This isn't the exact problem I'm fixing, but it's much like what happens in some companies where permissions are granted to specific users instead of to groups, and then that one user goes elsewhere and suddenly all the rights are broken and it takes a day and a half to get the rights update processed (and heaven help you if it stops half-way for some reason).

Big file-systems take a long time to update rights inheritance. This has been a fact of life on Windows since the NT days. Nothing new here.

But... it doesn't have to be this way. I explain under the cut.
The reason NTFS requires a filesystem walk to update inherited permissions is that rights are explicitly set on each and every directory entry. Inherited rights and actual explicit rights are stored in a specific order to ensure they're enforced correctly (you want the explicit rights to override the inherited rights). POSIX-style filesystems, such as those used by Linux, also work on the same explicit-rights model. Change something high up, and it'll require a filesystem walk to percolate the change everywhere it needs to go.
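
To make the cost concrete, here is a toy sketch in Python of the explicit-rights model. It is not how NTFS actually stores ACLs; the Node class, grant() and build_tree() are made-up names for illustration. The point is only that every object keeps its own copy of the inherited entry, so one change at the top means one write per object below it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical file or directory that stores its own copy of every entry."""
    name: str
    explicit: dict = field(default_factory=dict)   # trustee -> rights set here
    inherited: dict = field(default_factory=dict)  # trustee -> rights copied from above
    children: list = field(default_factory=list)

    def effective(self, trustee):
        # Explicit entries override inherited ones; no upward search is needed.
        if trustee in self.explicit:
            return self.explicit[trustee]
        return self.inherited.get(trustee, "")


def _propagate(node, trustee, rights):
    # Every descendant gets its stored inherited entry rewritten.
    node.inherited[trustee] = rights
    return 1 + sum(_propagate(child, trustee, rights) for child in node.children)


def grant(node, trustee, rights):
    """Set an explicit grant on node and push it down; returns entries written."""
    node.explicit[trustee] = rights
    return 1 + sum(_propagate(child, trustee, rights) for child in node.children)


def build_tree(depth, fanout):
    """Build a toy tree with `fanout` children per node, `depth` levels deep."""
    root = Node("root")
    level = [root]
    for d in range(depth):
        next_level = []
        for parent in level:
            for i in range(fanout):
                child = Node(f"{parent.name}/{d}-{i}")
                parent.children.append(child)
                next_level.append(child)
        level = next_level
    return root


root = build_tree(depth=4, fanout=10)   # 1 + 10 + 100 + 1000 + 10000 nodes
touched = grant(root, "alice", "RW")
leaf = root.children[0].children[0].children[0].children[0]
print(touched)                  # 11111: one write per node in the tree
print(leaf.effective("alice"))  # RW
```

Reading the rights on any one object is cheap, which is the upside of this model. The write count is the downside: it grows with the size of the tree, not with the number of grants.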

And yet, Novell created a filesystem on NetWare that allows implicit inheritance to work. How did they do that, and why can't Microsoft do the same?

One thing is for dead certain, and that is the Novell filesystems (NWFS, the original NetWare filesystem, and the later NSS that got ported to Linux) are meta-data heavy beasts. Certain operations, such as a file-create or file-delete, can spawn a huge amount of meta-data operations relative to NTFS. In fact, when Novell ported GroupWise over to Linux, they recommended deployments to use something other than NSS for the data volumes because GW is a poor fit for meta-data heavy filesystems. That said, having all of this meta-data means that they could use an indirect rights model.

In order to use an indirect rights model, where rights set on a parent directory implicitly flow to child objects, searching up the directory tree to the directory root has to be a very, very fast operation. Comparable with an explicit assignment, in fact. It has to work fast even when a file is buried 100 sub-directories deep.

For NTFS, and presumably the other POSIXy filesystems like ext3, reiser, xfs and the like, searching up a tree is much more expensive than searching down it. NTFS (and other filesystems) maintains a B+ Tree index to speed up file access. In order to get any kind of rights inheritance in these filesystems an explicit-grant method is used as a trade-off for fast performance.

For Novell's NSS, they made a more liberal use of B+ Trees (see also: meta-data heavy) overall. But they also did something else; they used a data-structure just for tracking rights assignments, and they made it doubly-linked. Once this data-structure is in memory, it is a very fast operation to compute an indirect inheritance. The worst-case is still that 100-directory deep file with some kind of rights specified on every directory, but that's an edge case; far more likely is 100-deep and rights set on three to five of them, which still makes the rights-computation very fast. Couple it with file-caching and you only have to do the computations when something in the rights-tree changes, or the file stales out of the file-cache.
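
As a rough illustration of the idea (and only that; this is not a description of NSS internals), here is a Python sketch of an implicit-inheritance lookup. The RightsTree class and its methods are hypothetical, and I've swapped the doubly-linked structure for a plain dictionary keyed by path, which is enough to show the shape of the trade-off: grants are stored sparsely, effective rights are computed by searching upward, and a cache is thrown away whenever a grant changes.

```python
import posixpath


class RightsTree:
    """Hypothetical sparse store: only paths with a grant get an entry."""

    def __init__(self):
        self.grants = {}   # path -> {trustee: rights}
        self.cache = {}    # (path, trustee) -> computed effective rights

    def grant(self, path, trustee, rights):
        # One small write, no matter how many files live below `path`.
        self.grants.setdefault(path, {})[trustee] = rights
        self.cache.clear()  # crude invalidation: recompute lazily on next access

    def effective(self, path, trustee):
        key = (path, trustee)
        if key not in self.cache:
            self.cache[key] = self._walk_up(path, trustee)
        return self.cache[key]

    def _walk_up(self, path, trustee):
        # The nearest assignment on the way up to the root wins.
        while True:
            entry = self.grants.get(path, {})
            if trustee in entry:
                return entry[trustee]
            if path == "/":
                return ""
            path = posixpath.dirname(path)


tree = RightsTree()
tree.grant("/", "staff", "R")
tree.grant("/clients/acme", "staff", "RW")

# A file buried 100 directories below the second grant.
deep = "/clients/acme/" + "/".join(f"d{i}" for i in range(100)) + "/mail.tif"
print(tree.effective(deep, "staff"))            # RW
print(tree.effective("/other/x.txt", "staff"))  # R

# Changing the grant at the top is a single write and applies everywhere at once.
tree.grant("/", "staff", "RWC")
print(tree.effective("/other/x.txt", "staff"))  # RWC
print(len(tree.grants))                         # 2
```

The read path is now the expensive part, since each uncached lookup walks toward the root, which is why the upward search has to be fast and why caching matters so much in this model.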

That rights tree? Neat stuff! On the FacShare volume before we moved it to Windows, we only had about 300 explicit rights-grants on the entire volume, which was already north of 4 million files. A data-structure containing 300 elements is a lot faster to traverse than one with 4 orders of magnitude more items in it.

Neato! Can Microsoft do that?

I'm sure if they bent their minds to the task Microsoft could make it work with NTFS. However, it wouldn't be a small project by any stretch. They would have to change the $MFT structure to remove the explicit-inheritance areas, and create another in-memory data-structure (and almost definitely a completely new filesystem meta-data structure to go with it) to accommodate the rights-tree. Nearly all files would be without any explicit rights on them, relying instead on the rights inherited from their directory, which would make their rights-structures empty in the $MFT. I'm sure there are more fiddly details I'm completely missing out on.

Would they do that? I doubt it. I personally only know of the one filesystem that has gone so far as to do that, and everyone else in the world seems just fine with dealing with the quirks of an explicit-inheritance system. Very large filesystems with complex rights (or rights only set at the very top) of the type I work with now and worked with at WWU are relatively rare in the Windows space, and probably don't warrant the dev-time it'd take to get the feature running.

Unlike most sysadmins out there, I've actually worked on a filesystem where a rights change at the top of a million-file filesystem will take immediate effect across the entire thing, so I know what I'm missing out on. Meta-data heavy filesystems can pack in a lot of features (see also, Salvage) but they pay for it in the price of updating all of that meta-data and in how long it takes to sanity-check everything should corruption somehow sneak past the journal.

Yes, I miss it. But I also understand why features like that are not likely to show up in the future.



Disclaimer: Details about the Novell stuff were gleaned over many years of discussions with other Novell people, Novell support while troubleshooting filesystem problems, and most importantly a chat with a Novell filesystem developer I had at BrainShare 2001's Meet the Experts night. The above describes things to the best of my understanding, which may not be accurate.

1 Comment

I too miss it whenever I have to work with non-NSS volumes. I still get caught out by making a rights change and it not taking effect until the user logs back in on those Windows servers. Those things just keep justifying the Novell approach to my clients.

Your understanding matches mine. Those Meet the Experts nights are always fun and educational, especially when you can show the Experts gaps in even their understanding, such as by VPNing into a client issue I was in the middle of at the time to show them and build a workaround with them (and a fresh ftf a few weeks later that fixed it right).

There is another practical issue that generally prevents most NSS volumes from ending up with a sub-directory depth of more than about two dozen. Windows clients (at least as recent as XP/2003) can't handle fully addressed file names of more than 255 characters. I keep finding users who use whole sentences for folder and file names, which can hit those limits quickly.