[netperf-talk] Some question about CPU utilization
Rick Jones
rick.jones2 at hp.com
Wed Feb 20 10:40:14 PST 2008
>> How far away from your desired interval (10%) is netperf reporting?
>
>
> A lot. Most of the time it is between 15% and 30%. Sometimes it is
> 50-60%, I even got a 365% once (but CPU util was only 0.19% this time).
I wonder if there is then some inconsistency in the path taken through
the stack.
>> What sort of test is this with that low a CPU load? IIRC the linux
>> procstat CPU util method is "statistical" and at such a low load level
>> I get a little "concerned" about the measurements.
>
>
> I run the following tests: TCP_STREAM, TCP_MAERTS, TCP_RR, UDP_STREAM,
> UDP_RR. netperf never reports a CPU load greater than 3% with these
> tests on my config. I'll attach one of my test results at the end of this
> message, if you want to have a look.
>
> I also understand that with such low CPU load it is easy to have
> variation in tens of percent (eg. if another process wakes up on the
> system).
That is true. I suppose you could try running netperf/netserver at
realtime priority (rtprio).
One other option, to avoid the statistical nature of the procstat
method, would be to override the configure script and ask
netperf/netserver to use the looper method. That should launch some CPU
soaker processes and use the rate at which they count to measure CPU
util. That method does require calibration, and it also has a settling
time built into it. Since it is a number of processes competing for CPU
resources with netperf/netserver, it can have an effect on the throughput.
I've not played with the looper stuff in a while, so there may be a
trifle of bitrot there :(
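If you want to try either, something along these lines should do it -
I'm going from memory on the configure option name, so check
./configure --help in your tree, and chrt is just one of several ways
to get rtprio:

  # build netperf/netserver with the looper CPU util method instead of
  # the default procstat method (option name from memory - verify
  # against ./configure --help on your version)
  ./configure --enable-cpuutil=looper
  make && make install

  # and/or run netperf at realtime priority so other processes are less
  # likely to perturb the measurement (needs root)
  chrt --fifo 50 netperf -H 2001::1 -t TCP_STREAM -l 40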
>> You may have - although depending on which manual you are reading, it
>> may also now discuss how some CPU util methods don't require
>> calibration, so the *_CPU tests will return "immediately." The linux
>> procstat method is such a method.
>
>
> I must have skipped this part of the manual :)
That or I might have skipped writing it :( I've had some problems with
the texi source for the current manual and have not had the time to
track them down (texi mode in emacs doesn't like updating nodes now :( )
> ------------------------------------------------------------------------
>
> ------------------------------------------------
> TCP STREAM TEST from ::0 (::) port 0 AF_INET6 to 2001::1 (2001::1) port 0 AF_INET6 : +/-5.0% @ 95% conf.
Nice to see IPv6 testing :)
> !!! WARNING
> !!! Desired confidence was not achieved within the specified iterations.
> !!! This implies that there was variability in the test environment that
> !!! must be investigated before going further.
> !!! Confidence intervals: Throughput : 0.1%
> !!! Local CPU util : 64.8%
> !!! Remote CPU util : 0.0%
>
> Recv Send Send Utilization Service Demand
> Socket Socket Message Elapsed Send Recv Send Recv
> Size Size Size Time Throughput local remote local remote
> bytes bytes bytes secs. 10^6bits/s % S % U us/KB us/KB
>
> 87380 16384 1400 40.04 841.30 2.20 -1.00 0.859 -1.000
One thing, not directly related, but I want to mention regardless -
likely as not, 87380 was not the final size of the remote SO_RCVBUF, nor
16384 the final size of the local SO_SNDBUF thanks to Linux's desire to
autotune itself. The classic netperf tests only snap the SO_*BUF
settings once, right after the socket is created. Everywhere else that
is sufficient, but Linux, wanting to be different... Anyhow, in the
"omni" tests in the current top of trunk I've added code, when compiled
for Linux, to also snap the SO_*BUF settings at the end of the test
iteration and make them available for output. Output in the "omni"
tests (./configure --enable-omni) is user-configurable via a file with
lists of output names. There is a brief text file in doc/ talking about
that.
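For example, something like this - the output selector names here are
my best recollection, so treat them as placeholders and check the doc/
file and the output list on your build:

  # build with the omni tests enabled
  ./configure --enable-omni && make

  # ask the omni test for specific output values, e.g. throughput plus
  # the socket buffer sizes as they stood at the end of the test
  # (selector names from memory - verify locally)
  netperf -H 2001::1 -t omni -- -O THROUGHPUT,LSS_SIZE_END,RSR_SIZE_END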
Now, related to this, in the omni tests, not hitting the confidence
intervals no longer emits the warning - instead one can display the
levels met for all tests.
Since the send size was < MSS (add a -v 2 to the command line to get the
MSS) and I don't see nodelay in the test banner, I suspect there was
something of a "race" involving Nagle which might have affected the
average segment size on the wire and perhaps the CPU util per KB
transferred. It _might_ stabilize if you add a test-specific -D option -
although that may also reduce the throughput since it will preclude TCP
coalescing sends into larger segments. However, given a send size of
1400 bytes, I'm guessing you wanted only 1400 bytes of data per segment
anyway? In which case you need to set the -D option anyhow :)
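Concretely, something like this ought to show the MSS and keep each
1400-byte send in its own segment - adjust the destination and test
length to taste:

  # global -v 2 for extra verbosity (includes the MSS), -l 40 for a 40
  # second run; test-specific -m 1400 sets the send size and -D sets
  # TCP_NODELAY so Nagle can't coalesce the sends
  netperf -H 2001::1 -v 2 -l 40 -t TCP_STREAM -- -m 1400 -D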
> ------------------------------------------------
>
> ------------------------------------------------
> TCP MAERTS TEST from ::0 (::) port 0 AF_INET6 to 2001::1 (2001::1) port 0 AF_INET6 : +/-5.0% @ 95% conf.
> !!! WARNING
> !!! Desired confidence was not achieved within the specified iterations.
> !!! This implies that there was variability in the test environment that
> !!! must be investigated before going further.
> !!! Confidence intervals: Throughput : 0.1%
> !!! Local CPU util : 30.0%
> !!! Remote CPU util : 0.0%
>
> Recv Send Send Utilization Service Demand
> Socket Socket Message Elapsed Send Recv Send Recv
> Size Size Size Time Throughput local remote local remote
> bytes bytes bytes secs. 10^6bits/s % S % U us/KB us/KB
>
> 87380 16384 87380 40.03 773.06 1.88 -1.00 0.795 -1.000
> ------------------------------------------------
>
> ------------------------------------------------
> TCP REQUEST/RESPONSE TEST from ::0 (::) port 0 AF_INET6 to 2001::1 (2001::1) port 0 AF_INET6 : +/-5.0% @ 95% conf.
> !!! WARNING
> !!! Desired confidence was not achieved within the specified iterations.
> !!! This implies that there was variability in the test environment that
> !!! must be investigated before going further.
> !!! Confidence intervals: Throughput : 1.0%
> !!! Local CPU util : 28.9%
> !!! Remote CPU util : 0.0%
>
> Local /Remote
> Socket Size Request Resp. Elapsed Trans. CPU CPU S.dem S.dem
> Send Recv Size Size Time Rate local remote local remote
> bytes bytes bytes bytes secs. per sec % S % U us/Tr us/Tr
>
> 16384 87380 1 1 40.00 11164.83 1.28 -1.00 4.578 -1.000
> 16384 87380
> ------------------------------------------------
Ah, the joys of driver interrupt coalescing settings targeted at
reducing CPU util on bulk transfer. That tends to leave the *_RR trans
rate lower than it could be.
Since you have two processors, each with two cores, you could, I
suppose, try disabling one of the processors (assuming one can do that
from the BIOS). I don't think setting maxcpus to anything other than 1
will "work" here, since the core numbering would mean (IIRC) maxcpus=2
giving you one core on each processor rather than both cores of one
processor. That right there would "double" the CPU util and perhaps
make it less susceptible to transients.
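If the BIOS doesn't let you do that, offlining cores from sysfs is
another way to get there - which cpuN sits on which package varies from
box to box, so check /proc/cpuinfo (or lscpu) before picking numbers:

  # as root: take the second package's cores offline (cpu2/cpu3 is just
  # an assumption about the numbering on this box)
  echo 0 > /sys/devices/system/cpu/cpu2/online
  echo 0 > /sys/devices/system/cpu/cpu3/online

  # bring them back afterwards
  echo 1 > /sys/devices/system/cpu/cpu2/online
  echo 1 > /sys/devices/system/cpu/cpu3/online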
rick jones
BTW, one thing I do whenever running netperf on Linux is shoot the
irqbalance daemon in the head - its penchant for moving interrupts
around really messes with things - especially when you want to see the
effect on service demand of running netperf/netserver on different
cores relative to the interrupt CPU.
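Roughly what I do is something like the following - the IRQ number is
whatever /proc/interrupts shows for your NIC, and the value written to
smp_affinity is a hex CPU bitmask:

  # stop irqbalance so it cannot move the NIC interrupt around
  killall irqbalance

  # pin the NIC's interrupt to CPU0 (replace NNN with the IRQ number
  # from /proc/interrupts)
  echo 1 > /proc/irq/NNN/smp_affinity

  # then bind netperf and netserver to chosen CPUs with the global
  # -T lcpu,rcpu option so you control their placement relative to the
  # interrupt CPU
  netperf -H 2001::1 -t TCP_RR -T 1,1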