progressing

Right now I'm running the mass IOZONE test. 30 workstations are pounding the test NetWare server with IOZONE, running this command-line:

iozone -Rab \report-dump\IOZONE-std\%COMPUTERNAME%-iozone1.xls -g 1G -i 0 -i 1 -i 2 -i 3 -i 4 -i 5

Right now all the stations are chewing on the 1GB file, and are all at various record-size stages. But the fun thing is the "nss /cachestats" output:
BENCHTEST-NW:nss /cachestat
***** Buffer Cache Statistics *****
Min cache buffers: 512
Num hash buckets: 524288
Min OS free cache buffers: 256
Num cache pages allocated: 414103
Cache hit percentage: 63%
Cache hit: 3407435
Cache miss: 1978789
Cache hit percentage(user): 60%
Cache hit(user): 3031275
Cache miss(user): 1978789
Cache hit percentage(sys): 100%
Cache hit(sys): 376160
Cache miss(sys): 0
Percent of buckets used: 48%
Max entries in a bucket: 7
Total entries: 399112
Yep. All that I/O is only partially being satisfied by cache reads. As it should be at this stage of the game.
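As a quick sanity check on those percentages, here is a minimal Python sketch that recomputes them from the raw hit and miss counters in the output above. The function name is just for illustration, and the choice to truncate rather than round is an assumption on my part: it is what makes the user figure come out at 60% instead of 61%.

```python
def hit_percentage(hits, misses):
    """Return the cache hit ratio as a percentage of all lookups."""
    total = hits + misses
    if total == 0:
        return 0.0
    return 100.0 * hits / total

# Counters copied from the cache statistics output above.
overall = hit_percentage(3407435, 1978789)
user = hit_percentage(3031275, 1978789)
system = hit_percentage(376160, 0)

# Assumption: the server truncates to a whole percent rather than rounding.
print(int(overall), int(user), int(system))
```

The user and sys hit counters also add up to the overall hit counter (3031275 + 376160 = 3407435), and every miss is a user miss, so the three groups of numbers are consistent with each other.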

What surprised me yesterday when I kicked off this particular test was how badly hammered the server was at the very beginning. That early stage is the small file-size part of the test, and better approximates actual usage. CPU during the first 30 minutes of the test was in the 70-90% range and was asymmetric: CPU1 was nearly pegged. During that phase we also drove a network utilization of 79-83% on the GigE uplink between the switch serving the testing machines and the router core. And on the Fibre Channel switch serving the test server, the high-water mark for transfer speed was 101 MB/Second (~20% utilization).
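For a rough sense of scale, the network figure can be converted to the same units as the Fibre Channel figure. This is a back-of-the-envelope sketch only: it assumes a nominal 1000 Mbit/s for GigE, decimal megabytes, and no allowance for protocol overhead.

```python
GIGE_MBIT_PER_SEC = 1000   # nominal GigE line rate (assumption: no overhead)
BITS_PER_BYTE = 8

def utilization_to_mb_per_sec(percent):
    """Convert a GigE utilization percentage into MB/Second."""
    return GIGE_MBIT_PER_SEC / BITS_PER_BYTE * percent / 100.0

low = utilization_to_mb_per_sec(79)
high = utilization_to_mb_per_sec(83)
print(f"{low:.2f} to {high:.2f} MB/Second")
```

That works out to roughly 99 to 104 MB/Second on the uplink, which is in the same neighborhood as the 101 MB/Second high-water mark on the Fibre Channel side.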

The FC speed is notable. The fastest throughput I had previously been able to produce on the port linking the EVA was about 25 MB/Second, and that was done with TSATEST running against local volumes in parallel on three machines. Clearly our EVA is capable of much higher performance than we've been demanding of it. Nice to know.

Depending on how the numbers look once this test is done, I might change my testing procedure a bit: run a separate 'small file' pass in IOZONE to capture the big-load periods, and perhaps a separate 'big file' pass with 1G files to capture the 'cache exhaustion' performance.
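A minimal sketch of what that split might look like, written as a small Python helper that builds the two IOZONE command lines. It reuses the flags from the command above and adds -n (minimum file size for auto mode) alongside -g (maximum file size). The 64M cutoff for the small-file pass and the report file names are placeholders I made up for illustration, not settings I have tested.

```python
def iozone_args(report_path, min_size=None, max_size=None):
    """Build an IOZONE auto-mode command line as a list of arguments."""
    args = ["iozone", "-Rab", report_path]
    if min_size is not None:
        args += ["-n", min_size]   # smallest file size for auto mode
    if max_size is not None:
        args += ["-g", max_size]   # largest file size for auto mode
    for test in range(6):          # same tests as before: -i 0 through -i 5
        args += ["-i", str(test)]
    return args

# Hypothetical cutoff and report names, for illustration only.
small_run = iozone_args("small-files.xls", max_size="64M")
big_run = iozone_args("big-files.xls", min_size="1G", max_size="1G")

print(" ".join(small_run))
print(" ".join(big_run))
```

Setting both -n and -g to 1G pins the second pass to a single file size, so that pass only steps through record sizes.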

On a NetWare note, the 'Current MP Service Processes' counter hit the max of 750 pretty fast during the early stages of the test. Upping the max to 1000 showed how utilization of service processes progressed during the test. Right now it's steady at 530 used processes. Since I don't think Linux has a similar tunable parameter, this could be one factor making a difference between the platforms.
