Why ext4 is very interesting

The new ext4 file-system is one I'm very interested in, especially since the state of reiserfs in the Linux community's gestalt is so in flux. Unlike ext3, it does a lot of things nicer:
  • Large directories. It can now handle very large directories. One source says unlimited directory sizes, and another less reliable source says sizes up to 64,000 files. This is primo for, say, a mailer. GroupWise would be able to run on this, if one was doing that.
  • Extent-mapped files. More of a feature for file-systems that contain a lot of big files rather than tens of millions of 4k small files. This may be an optional feature. Even so, it replicates an XFS feature which makes ext4 a top contender for the MythTV and other linux-based PVR products. Even SD video can be 2GB/hour, HD can be a lot larger.
  • Delayed allocation. Somewhat controversial, but if you have very high faith in your power environment, it can make for some nice performance gains. It allows the system to hold off on block allocation until a process either finishes writing, or triggers a dirty-cache limit, which further allows a decrease in potential file-frag and helps optimize disk I/O to some extent. Also, for highly transient files, such as mail queue files, it may allow some files to never actually hit disk between write and erase, which further increases performance. On the down side, in case of sudden power loss, delayed writes are simply lost completely.
  • Persistent pre-allocation. This is a feature XFS has that allows a process to block out a defined size of disk without actually writing to it. For a PVR application, the app could pre-allocate a 1GB hunk of disk at the start of recording and be assured that it would be as contiguous as possible. Certain bittorrent clients 'preallocate' space for downloads, though some fake it by writing 4.4GB of zeros and overwriting the zeros with the true data as it comes down. This would allow such clients to truly pre-allocate space in a much more performant way.
  • Online defrag. On traditional rotational media, and especially when there isn't a big chunk of storage virtualization between the server and the disk such as that provided by a modern storage array, this is much nicer. This is something else that XFS had, and ext is now picking up. Going back to the PVR again, if that recording pre-allocated 1GB of space and the recording is 2 hours long, at 2GB/hour it'll need a total of 4 GB. In theory each 1GB chunk could be on a completely diffrent part of the drive. An online defrag allows that file to be made contiguous without file-system downtime.
    • SSD: A bad idea for SSD media, where you want to conserve your writes, and doesn't suffer a performance penalty for frag. Hopefully, the efstools used to make the filesystem will be smart enough to detect SSDs and set features appropriately.
    • Storage Arrays: Not a good idea for extensive disk arrays like the HP EVA line. These arrays virtualize storage I/O to such an extent that file fragmentation doesn't cause the same performance problems as would a simple 6-disk RAID5 array would experience.
  • Higher resolution time-stamps. This doesn't yet have much support anywhere, but it allows timestamps with nanosecond resolution. This may sound like a strange thing to put into a 'nifty features' list, but this does make me go "ooo!".

Also, as I mentioned before, it looks like openSUSE 11.2 will make ext4 the default filesystem.