[netperf-talk] Some question about CPU utilization

Rick Jones rick.jones2 at hp.com
Wed Feb 20 10:40:14 PST 2008


>> How far away from your desired interval (10%) is netperf reporting?
> 
> 
> A lot. Most of the time it is between 15% and 30%. Sometimes it is
> 50-60%, I even got a 365% once (but CPU util was only 0.19% that time).

I wonder if there is then some inconsistency in the path taken through 
the stack.

>> What sort of test is this with that low a CPU load?  IIRC the linux 
>> procstat CPU util method is "statistical" and at such a low load level 
>> I get a little "concerned" about the measurements.
> 
> 
> I run the following tests: TCP_STREAM, TCP_MAERTS, TCP_RR, UDP_STREAM,
> UDP_RR. netperf never reports a CPU load greater than 3% with these
> tests on my config. I'll attach one of my test results at the end of
> this message, if you want to have a look.
> 
> I also understand that with such low CPU load it is easy to have
> variation in tens of percent (eg. if another process wakes up on the
> system).

That is true.  I suppose you could try running netperf/netserver at 
real-time priority (rtprio).  One other option, to avoid the statistical 
nature of the procstat method, would be to override the configure script 
and ask netperf/netserver to use the looper method.  That launches some 
CPU-soaker processes and uses the rate at which they count to measure 
CPU util.  That method does require calibration, and it also has a 
settling time built into it.  Since it is a number of processes 
competing for CPU resources with netperf/netserver, it can have an 
effect on the throughput.

I've not played with the looper stuff in a while, so there may be a 
trifle of bitrot there :(
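
If you want to give either of those a try, here is a rough, untested 
sketch - the rtprio bit via chrt, and the looper method via a configure 
knob which, if memory serves, is --enable-cpuutil=looper (check 
./configure --help in your tree; the 2001::1 address is just lifted from 
your output below):

  # run netserver and netperf at SCHED_FIFO priority (needs root)
  chrt -f 50 netserver
  chrt -f 50 netperf -H 2001::1 -t TCP_STREAM -c -C -l 30

  # or rebuild with the looper CPU util method instead of procstat
  ./configure --enable-cpuutil=looper
  make && make install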

>> You may have - although depending on which manual you are reading, it 
>> may also discuss now how some CPU util methods don't require 
>> calibration, so the *_CPU tests will return "immediately."  The linux 
>> procstat method is such a method.
> 
> 
> I must have skipped this part of the manual :)

That or I might have skipped writing it :(  I've had some problems with 
the texi source for the current manual and have not had the time to 
track them down (texi mode in emacs doesn't like updating nodes now :( )

> ------------------------------------------------------------------------
> 
> ------------------------------------------------
> TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2001::1 (2001::1) port 0 AF_INET6 : +/-5.0% @ 95% conf.

Nice to see IPv6 testing :)

> !!! WARNING
> !!! Desired confidence was not achieved within the specified iterations.
> !!! This implies that there was variability in the test environment that
> !!! must be investigated before going further.
> !!! Confidence intervals: Throughput      :  0.1%
> !!!                       Local CPU util  : 64.8%
> !!!                       Remote CPU util :  0.0%
> 
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB
> 
>  87380  16384   1400    40.04       841.30   2.20     -1.00    0.859   -1.000 

One thing, not directly related, but I want to mention regardless - 
likely as not, 87380 was not the final size of the remote SO_RCVBUF, nor 
16384 the final size of the local SO_SNDBUF thanks to Linux's desire to 
autotune itself.  The classic netperf tests only snap the SO_*BUF 
settings once, right after the socket is created.  Everywhere else that 
is sufficient, but Linux, wanting to be different...  Anyhow, in the 
"omni" tests in the current top of trunk I've added code when compiled 
for linux to also snap the SO_*BUF settings at the end of the test 
iteration and make them available for output.  Output in the "omni" 
tests (./configure --enable-omni) is user-configurable via a file with 
lists of output names.  There is a brief text file in doc/ talking about 
that.

Now, related to this, in the omni tests, not hitting the confidence 
intervals no longer emits the warning - instead one can display the 
levels met for all tests.
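
As a hedged sketch of what that looks like - the output selector names 
here (THROUGHPUT, LSS_SIZE_END, RSR_SIZE_END and the like) are from a 
current tree and may differ in what you have, so treat this as 
illustrative rather than gospel and check the text file in doc/ first:

  # build the omni tests
  ./configure --enable-omni
  make

  # put the desired output names, comma-separated, into a file ...
  echo "THROUGHPUT,LSS_SIZE_END,RSR_SIZE_END" > my_outputs

  # ... and hand that file to the omni test's output-selection option
  src/netperf -t omni -H 2001::1 -- -o my_outputs

The LSS_SIZE_END/RSR_SIZE_END names are the "snapped at the end of the 
iteration" SO_SNDBUF/SO_RCVBUF sizes mentioned above.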

Since the send size was < MSS (add a -v 2 to the command line to get the 
MSS) and I don't see nodelay in the test banner, I suspect there was 
something of a "race" involving Nagle which might have affected the 
average segment size on the wire and perhaps the CPU util per KB 
transferred.  It _might_ stabilize if you add a test-specific -D option - 
although that may also reduce the throughput since it will preclude TCP 
coalescing sends into larger segments.  However, given a send size of 
1400 bytes, I'm guessing you wanted only 1400 bytes of data per segment 
anyway?  In which case you need to set the -D option anyhow :)
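
Something like this (untested, and just reusing the 1400-byte sends and 
the remote address from your output) would show the MSS and disable 
Nagle in one go:

  # -v 2 raises verbosity enough to report the MSS; the test-specific
  # -D sets TCP_NODELAY, -m 1400 keeps the 1400-byte sends
  netperf -H 2001::1 -t TCP_STREAM -c -C -l 40 -v 2 -- -m 1400 -D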


> ------------------------------------------------
>  
> ------------------------------------------------
> TCP MAERTS TEST from ::0 (::) port 0 AF_INET6 to 2001::1 (2001::1) port 0 AF_INET6 : +/-5.0% @ 95% conf.
> !!! WARNING
> !!! Desired confidence was not achieved within the specified iterations.
> !!! This implies that there was variability in the test environment that
> !!! must be investigated before going further.
> !!! Confidence intervals: Throughput      :  0.1%
> !!!                       Local CPU util  : 30.0%
> !!!                       Remote CPU util :  0.0%
> 
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB
> 
>  87380  16384  87380    40.03       773.06   1.88     -1.00    0.795   -1.000 
> ------------------------------------------------
>  
> ------------------------------------------------
> TCP REQUEST/RESPONSE TEST from ::0 (::) port 0 AF_INET6 to 2001::1 (2001::1) port 0 AF_INET6 : +/-5.0% @ 95% conf.
> !!! WARNING
> !!! Desired confidence was not achieved within the specified iterations.
> !!! This implies that there was variability in the test environment that
> !!! must be investigated before going further.
> !!! Confidence intervals: Throughput      :  1.0%
> !!!                       Local CPU util  : 28.9%
> !!!                       Remote CPU util :  0.0%
> 
> Local /Remote
> Socket Size   Request Resp.  Elapsed Trans.   CPU    CPU    S.dem   S.dem
> Send   Recv   Size    Size   Time    Rate     local  remote local   remote
> bytes  bytes  bytes   bytes  secs.   per sec  % S    % U    us/Tr   us/Tr
> 
> 16384  87380  1       1      40.00   11164.83  1.28   -1.00  4.578   -1.000 
> 16384  87380 
> ------------------------------------------------

Ah, the joys of driver interrupt coalescing settings targeted at 
reducing CPU util on bulk transfer.  That tends to leave the *_RR trans 
rate lower than it could be.

Since you have two processors, each with two cores, you could, I 
suppose, try disabling one of the processors (assuming one can do that 
from the BIOS).  I don't think setting maxcpus to anything other than 1 
will "work" here, since the core numbering would mean (IIRC) that 
maxcpus=2 gives you one core on each processor rather than both cores of 
one processor.  Halving the core count right there would "double" the 
reported CPU util and perhaps make it less susceptible to transients.
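
For poking at the coalescing and the core count without a trip to the 
BIOS, a rough sketch - the interface name and cpu numbers are guesses, 
so check ethtool and /proc/cpuinfo on your box first:

  # look at, and if you like reduce, the NIC's interrupt coalescing
  ethtool -c eth0
  ethtool -C eth0 rx-usecs 0 rx-frames 1

  # take the second package's cores offline instead of a BIOS change
  # (numbering varies - check 'physical id'/'core id' in /proc/cpuinfo)
  echo 0 > /sys/devices/system/cpu/cpu2/online
  echo 0 > /sys/devices/system/cpu/cpu3/online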

rick jones

BTW, one thing I do whenever running netperf on Linux is shoot the 
irqbalance daemon in the head - its penchant for moving interrupts 
around really messes with things - especially when you want to see the 
effect on service demand of running netperf/netserver on different 
cores relative to the interrupt CPU.
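
A hedged sketch of that sort of pinning - the IRQ number and masks below 
are made up, so check /proc/interrupts for the real ones:

  # stop irqbalance so the NIC's interrupt stays put
  /etc/init.d/irqbalance stop        # or: killall irqbalance

  # pin the NIC's IRQ to CPU 0 (the mask is hex)
  echo 1 > /proc/irq/58/smp_affinity

  # bind netperf and netserver to CPU 1 with the global -T option, so
  # they run on a different core than the one taking the interrupts
  netperf -H 2001::1 -T 1,1 -t TCP_RR -c -C -l 40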

