Netware 6.5 and memory

We've had NW6.5 on every node in the cluster for a week now, and already we're getting a feel for our particular problems. One thing that rises above the rest is memory handling. It's just plain crankier than NW6.0 was. Novell knows this, and has published a frequently updated TID on the topic:

TID10091980 Memory Fragmentation Issue with NetWare 6.5

The old Netware hands among you may remember w-a-y back in the day when what's now called the Traditional Filesystem required a bunch of tweaking to make it work. Well, we're back there again, only for memory handling this time.

To give you an idea of what we're facing, here are some numbers from our servers.

Stat (memory in bytes)     StuSrv1               StuSrv2
Total System Memory        2,147,062,784         2,147,062,784
NLM Memory                 424,497,152           435,085,312
File System Cache (NSS)    1,715,937,280         369,316,840
Mounted Volumes            Stu1, Stu3, Class1    Stu2, Stupublic

Note the b-i-g difference in File System Cache. StuSrv1 has more mounted volumes, and they're bigger, but that does not account for the 4.6x difference in memory used by the file-system cache. The actual difference in terms of mounted, used disk space is closer to 2.5:1 than 4.6:1. Why is this happening? I don't know.
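For anyone who wants to check the arithmetic, here is a quick Python sketch using only the figures quoted above. The "share of total memory" calculation is my own illustration for comparison purposes; I'm not claiming it is how NSS itself evaluates the Cache Balance setting.

```python
# Figures copied from the server stats above (bytes).
TOTAL_MEMORY = 2_147_062_784  # identical on both servers

nss_cache = {
    "StuSrv1": 1_715_937_280,
    "StuSrv2": 369_316_840,
}


def cache_share(cache_bytes, total_bytes=TOTAL_MEMORY):
    """Return the cache size as a percentage of total system memory.

    Assumption: this is a plain ratio for illustration, not the formula
    NSS uses internally for Cache Balance.
    """
    return 100.0 * cache_bytes / total_bytes


# How much more cache StuSrv1 holds compared to StuSrv2.
ratio = nss_cache["StuSrv1"] / nss_cache["StuSrv2"]
print(f"StuSrv1 / StuSrv2 cache ratio: {ratio:.1f}x")

for server, cache in nss_cache.items():
    print(f"{server}: {cache_share(cache):.1f}% of total memory")
```

By this rough measure StuSrv1's cache sits at about 80% of total memory while StuSrv2's is around 17%, which is the gap the disk-usage difference alone doesn't explain.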

Our Cache Balance setting IS set to 85%. But why some servers are actually taking that much, and others aren't, is not clear at this time. We never came close to that number with NW6.0. We're also running the latest, bleeding-edge NSS modules thanks to the troubleshooting required to get NW6.5 into our cluster. Right now I'm setting things back to 60% to see what we get.

All of our cluster nodes list "Fragmented Kernel Space" at 10-25% of memory. So far that hasn't been a true problem. The TID above lists ways to handle that, but our servers haven't been up as NW6.5 long enough to get a true feel for 'normal load' yet. Plus, the reboots required for the settings to take effect still incur service outages on the cluster (an outage that lasts 15 seconds is still an outage), so it takes scheduling to get changes in.

Our NDS servers have also had memory-frag issues. I've heard rumors that this is more associated with eDir 8.7.3 than Netware 6.5, but the problem remains either way. There are some DS settings you can use to try to reduce frag numbers.