[netperf-talk] CPU utilization & tickless kernels?
Rick Jones
rick.jones2 at hp.com
Fri Dec 7 14:26:11 PST 2012
On 12/07/2012 02:13 PM, Andrew Gallatin wrote:
> On 12/07/12 16:54, Rick Jones wrote:
>> The initial reasons are lost in the mists of time. It may have been
>> simply because the initial HP-UX version did per-CPU utilization. Today
>> one reason for the per-CPU tracking is to enable reporting the ID and
>> utilization of the most utilized CPU on either side. That is included
>> for helping with things like a four CPU system being able to have 25%
>> overall CPU utilization with either 25% across the board, two CPUs at
>> 50% and two idle, or one at 100% and three idle, and all the other
>> permutations.
>
> I did not realize you could do this sort of thing with netperf2.
Yup - you can use the omni output selectors:

raj@tardy:~$ netperf -H raj-8510w.americas.hpqcorp.net -l 20 -t UDP_STREAM -c -- \
    -m 1 -o local_send_throughput,local_cpu_util,local_cpu_peak_util,local_cpu_peak_id
MIGRATED UDP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to raj-8510w.local () port 0 AF_INET : demo
Local Send Throughput,Local CPU Util %,Local Peak Per CPU Util %,Local Peak Per CPU ID
3.12,20.59,75.55,0
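
To illustrate why the peak per-CPU figure is worth reporting next to the overall one, here is a minimal Python sketch. It is not netperf code; the function name summarize_cpu_util and the sample lists are made up for illustration, and it assumes the overall figure is a plain average of the per-CPU figures. It shows the four-CPU cases from the quoted text all collapsing to the same 25% overall utilization while the peak differs.

```python
def summarize_cpu_util(per_cpu_util):
    """Return (overall %, peak %, peak CPU id) for a list of per-CPU utilizations.

    Hypothetical helper for illustration only: the overall figure is assumed
    to be the plain average across CPUs.
    """
    if not per_cpu_util:
        raise ValueError("need at least one CPU sample")
    overall = sum(per_cpu_util) / len(per_cpu_util)
    # On ties, the lowest-numbered CPU is reported as the peak.
    peak_id = max(range(len(per_cpu_util)), key=lambda i: per_cpu_util[i])
    return overall, per_cpu_util[peak_id], peak_id


# Three very different four-CPU systems with the same overall utilization.
samples = (
    [25.0, 25.0, 25.0, 25.0],
    [50.0, 50.0, 0.0, 0.0],
    [100.0, 0.0, 0.0, 0.0],
)
for sample in samples:
    overall, peak, peak_id = summarize_cpu_util(sample)
    print(f"overall={overall:.2f}% peak={peak:.2f}% on CPU {peak_id}")
```

The last case is the one the overall number hides: a single saturated CPU can limit throughput even though the system as a whole looks 75% idle.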
>> Does top's per-CPU output look sane or mistaken?
>
> No. It looks like the same thing as mpstat. The
> summary does not match the per-cpu output.
>
> Hmmm..
>
> I played with trying to scale things based on the
> max total_ticks I found, but I never got anything "reasonable"
> in the case where one CPU is not pegged. If one CPU is pegged,
> then we get an accurate estimate of the max ticks/sec, and
> I see CPU utilization similar to what I see for good old RHEL5 on
> the same box... (but ~10% higher than vmstat).
>
> I'm pretty much convinced the kernel is lying to us.
> I need to try a 3.7 kernel just to see if anything has
> improved.
Ostensibly, the 2.6.38 kernel used above is tickless:
raj@tardy:~$ uname -a
Linux tardy 2.6.38-16-generic #67-Ubuntu SMP Thu Sep 6 17:58:38 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux
raj@tardy:~$ grep HZ /boot/config-2.6.38-16-generic
CONFIG_RCU_FAST_NO_HZ=y
CONFIG_NO_HZ=y
CONFIG_HZ_100=y
# CONFIG_HZ_250 is not set
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=100
CONFIG_MACHZ_WDT=m
and while my investigation wasn't exhaustive, tests like the one above
seemed to show agreement between netperf and vmstat - vmstat was showing
about 79% idle overall, against the 20.59% utilization netperf reported.
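
For anyone wanting to cross-check the summary and per-CPU numbers independently of netperf, vmstat, and mpstat, the following Python sketch computes utilization from the tick counters in two /proc/stat snapshots. It is a rough sketch under stated assumptions, not what netperf itself does: the function names are hypothetical, only the "idle" column is treated as idle time (iowait counts as busy), and only the user through steal columns are summed, since guest time is already included in user.

```python
def parse_proc_stat(text):
    """Map 'cpu', 'cpu0', ... to (idle_ticks, total_ticks) from /proc/stat text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts or not parts[0].startswith("cpu"):
            continue
        # Columns: user nice system idle iowait irq softirq steal [guest ...]
        # Guest time is already folded into user, so stop after steal.
        ticks = [int(value) for value in parts[1:9]]
        counters[parts[0]] = (ticks[3], sum(ticks))
    return counters


def utilization(before, after):
    """Percent busy per entry between two parse_proc_stat() snapshots."""
    util = {}
    for cpu, (idle_after, total_after) in after.items():
        idle_before, total_before = before[cpu]
        elapsed = total_after - total_before
        if elapsed <= 0:
            continue  # no ticks were accounted to this CPU in the interval
        util[cpu] = 100.0 * (1.0 - (idle_after - idle_before) / elapsed)
    return util


def read_proc_stat(path="/proc/stat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as stat_file:
        return parse_proc_stat(stat_file.read())
```

Calling read_proc_stat() before and after a test run and passing both results to utilization() gives one figure for the "cpu" summary line and one per "cpuN" line. Comparing the elapsed tick counts per CPU against HZ times the wall-clock interval is one way to see whether ticks are going unaccounted on a given CPU.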
You mentioned a "slow" system with 10 GbE networking. In the past, at
least a couple of platforms have had issues with not counting interrupt
time accurately. That would be a bigger problem when the interrupt time
was a larger percentage of the total. Any chance something along those
lines is happening to you?
rick