Mystery Reboots

Two of our three NDS servers rebooted themselves last night. From the logs, it looks like they ran out of cache-allocator memory and somehow managed to restart without generating an abend.log entry. Weird.



As you can see from the chart (thank you, Intermapper and a custom probe), available cache-allocator memory had been decreasing steadily over time before running out completely. The thing is, I'm not sure where the leak is. The eDir cache has been tuned; I looked at it yesterday afternoon and didn't see anything out of the ordinary. The filesystem cache had been getting large, but one of the fixes in SP3 introduced the ability to scavenge cache buffers out of the NSS cache if needed. It also wasn't a leak in TSAFS, since no backups were running at the time the messages started showing up.