[netperf-talk] negative CPU utilization
Andrew Gallatin
gallatin at cs.duke.edu
Wed May 27 10:44:09 PDT 2009
Rick Jones wrote:
> That it ended-up being applied to platforms/situations which did not
> require that method is probably simply a matter of cut-and-paste
> development.
Gotcha.
<...>
> I'll give it a look and likely apply it. If the tick rate varies on the
> CPUs, is even that correct though? (Matching mpstat notwithstanding :)
As long as we charge CPU utilization as a percentage of time spent
non-idle, I think we're OK. If the CPU stops ticking, and idle is not
updated, then I guess we'll over-charge for unused CPUs. We'll have
to watch for that.
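
A minimal sketch of that accounting, not the actual netperf code: the function name and the idea of passing the elapsed time already converted to ticks are assumptions made for illustration. It shows both the over-charge case (idle never advances, so the CPU looks 100% busy) and how a negative figure can appear if the idle counter advances by more than the measured interval.

```c
#include <stdint.h>

/* Hypothetical helper: percentage of an interval spent non-idle.
 * idle_start/idle_end are samples of a CPU's idle tick counter;
 * elapsed_ticks is the wall-clock interval expressed in the same
 * tick units (elapsed seconds times the assumed tick rate). */
double cpu_util_percent(uint64_t idle_start, uint64_t idle_end,
                        double elapsed_ticks)
{
    if (elapsed_ticks <= 0.0)
        return 0.0;                     /* nothing measured */

    double idle_ticks = (double)(idle_end - idle_start);
    double util = 100.0 * (1.0 - idle_ticks / elapsed_ticks);

    /* If idle advanced by more than the interval (tick rate differs
     * from the assumed one), the raw value is negative: clamp it. */
    if (util < 0.0)
        util = 0.0;
    if (util > 100.0)
        util = 100.0;
    return util;
}
```

With idle going from 1000 to 1750 over 1000 elapsed ticks this gives 25%; if idle stays at 1000 the same interval is charged as 100% busy, which is the over-charge mentioned above.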
> How does it compare say with using the looper method in netperf on your
> platform? For an example of why I am skeptical about correctness being
> defined as matching a system tool output just look at some of the
> commentary in the netcpu_kstat code.
It seems to give the same results.
>>
>> I have a question about the original code.. you handle wrapping in a
>> way which I think is incorrect:
>>
>> - actual_rate = (lib_end_count[i] > lib_start_count[i]) ?
>> - (float)(lib_end_count[i] - lib_start_count[i])/lib_elapsed :
>> - (float)(lib_end_count[i] - lib_start_count[i] +
>> - MAXLONG)/ lib_elapsed;
>>
>> 1) these times are uint64_t, so MAXLONG is going to be 2^31-1
>> while the max uint64_t size (wrap point) is 2^64-1.
>> 2) these things are unsigned, so wrapping doesn't even matter.
>>
>> I preserved this behavior in my "tick_subtract" function. If
>> you agree with me that it is unneeded, we can remove that
>> function and just do simple subtraction.
>
> Given that code likely got cut-and-pasted from places where counters
> were initially 32 bits, you are likely quite correct.
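
To illustrate point 2 for a counter that really is 64 bits wide, here is a small sketch. The name tick_delta64 is made up for this example and is not the tick_subtract function from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: elapsed ticks between two samples of a
 * free-running 64-bit counter. Unsigned arithmetic in C is modulo
 * 2^64, so a single wrap between the samples still produces the
 * correct delta without any fixup. */
uint64_t tick_delta64(uint64_t start, uint64_t end)
{
    return end - start;
}
```

For example, a counter sampled at UINT64_MAX - 5 and then at 10 has advanced 16 ticks, and plain subtraction returns exactly that.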
Thinking about it more, there is an ugly case you might want to
handle: the counters can be 32 bits wide in older Linux versions (looks
like they went from "unsigned long" to "unsigned long long" somewhere
between 2.6.0 and 2.6.6). So your code does the right thing to handle
this case (except the wrong constant is used). E.g., with the
subtraction done in uint64_t, you really need to add 2^32
(0x100000000ULL), or equivalently subtract 0xffffffff00000000ULL,
rather than add MAXLONG.
The problem is you don't have a 100% certain way to know whether the
system is giving you 32-bit counters or 64-bit counters. I guess we
should keep doing what you're doing now. We could add a heuristic
to only apply the fixup if the old value fits in 32 bits.
But I suppose this could still be wrong if you have a
long-running netperf and a really, really, really super fast
64-bit clock that rolls over quickly.
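
A rough sketch of that heuristic, under the assumption that both samples are stored in uint64_t regardless of the kernel's counter width. The function name is hypothetical, and this is a guess at the width, not a reliable detection.

```c
#include <stdint.h>

/* Hypothetical helper: delta between two tick samples held in
 * uint64_t when the underlying kernel counter may be only 32 bits. */
uint64_t tick_delta_guess_width(uint64_t start, uint64_t end)
{
    /* Correct as-is for a true 64-bit counter, wrap included. */
    uint64_t delta = end - start;

    /* Heuristic: the counter appears to have gone backwards and the
     * old value fits in 32 bits, so assume a 32-bit counter wrapped
     * once. Adding 2^32 (modulo 2^64) undoes that wrap. */
    if (end < start && start <= UINT32_MAX)
        delta += (uint64_t)1 << 32;

    return delta;
}
```

A 32-bit counter going from 0xFFFFFFF0 to 0x10 comes out as 32 ticks, while samples above the 32-bit range are left to plain modular subtraction. The fast 64-bit clock case described above would still fool it.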
> Go for it :)
Will do.
>> scan_sockets_args called with the following argument vector
>> ./src/netperf -Hdell2950b-m -tUDP_STREAM -T1 -C -c -d -- -m 8972 -S 1M
>> -s 1M
>
> That's odd - where is the ",0" for the -T option in the
> scan_sockets_args output?
I have no idea.. I didn't touch this code, I promise!
FWIW, using mpstat, I see the netserver moving around on the remote
system when I vary the -T0,$REM_CPU binding. So it must be just an
oddity in printing.
Drew