[netperf-talk] negative CPU utilization
Andrew Gallatin
gallatin at cs.duke.edu
Wed May 27 08:26:09 PDT 2009
Rick Jones wrote:
> I wonder if the "tick rate" of what gets reported in /proc/cpuinfo is no
> longer tied to:
>
> return sysconf (_SC_CLK_TCK);
I think the tick rate may vary, depending on the CPU. If everything
is idle, and I look at the changes to the /proc/stat idle counts, it
appears that the rate should be 200. If I hardcode it to 200, then
run a netperf with the IRQ and netperf bound to CPU0, netperf (and my
subtraction of idle counts) shows the load at ~40%. E.g.:
proc_stat_buf 'cpu 85594 2455 186305 5051551404 327558 557841 701228
14897 0
cpu0 49773 632 93387 848141616 7376 298781 380909 4011 0
cpu1 16123 503 53120 1565456086 317817 37764 40600 5660 0
cpu2 8407 750 17645 1142567937 1048 158141 214179 2633 0
cpu3 11289 569 22151 1495385763 1316 63153 65538 2592 0
<...>
proc_stat_buf 'cpu 85618 2455 187131 5051556006 327559 557928 701292
14899 0
cpu0 49797 632 94212 848141616 7376 298868 380973 4012 0
cpu1 16123 503 53120 1565457084 317818 37764 40601 5660 0
cpu2 8407 750 17646 1142569647 1048 158141 214179 2633 0
cpu3 11289 569 22151 1495387657 1316 63153 65538 2592 0
<...>
lib_local_maxrate = 200.000000
calc_cpu_util: actual_rate on processor 0 is 0.099998 start 328d9d30 end
328d9d31 util 99.950005 cf 1.000000
calc_cpu_util: actual_rate on processor 1 is 99.798332 start 5d4ef6d6
end 5d4efabc util 50.100834 cf 1.000000
calc_cpu_util: actual_rate on processor 2 is 170.997147 start 441a3401
end 441a3aaf util 14.501427 cf 1.000000
calc_cpu_util: actual_rate on processor 3 is 189.396835 start 5921c6a3
end 5921ce09 util 5.301582 cf 1.000000
<...>
2097152 8972 10.00 1343019 0 9639.5 42.46 1.443
2097152 10.00 1343019 9639.5 0.71 0.024
That's still wrong. According to mpstat, CPU0 is the only CPU with any load
(and it's at 100%). So I guess the different CPUs are changing
their clock rates based on load, and that's causing chaos.
I've never really understood your CPU utilization calculation's
dependency on the tick rate. Can I suggest something simpler? Just
sum up all the CPU states, and express the per-CPU utilization as:
per_cpu_util = 100 - (100 * (idle_ticks / all_ticks))
I just made a patch which does that, and I'm seeing the correct
utilization (as confirmed with mpstat, and with what I "know" is
happening on the systems) on a whole smattering of formerly
problematic systems (ppc64, x86_64, kernels from 2.6.27 up to
net-next). I've attached my patch.
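A minimal sketch of that calculation in C (the field count and the
idle-field index follow the standard /proc/stat layout; the helper
name is mine, not netperf's):

```c
#include <stdint.h>

/* Per-CPU utilization from two /proc/stat samples: sum the deltas of
   all tick fields, then report the non-idle fraction as a percentage.
   Field order follows the standard /proc/stat layout:
   user nice system idle iowait irq softirq steal guest */
#define NFIELDS    9
#define IDLE_FIELD 3

static double per_cpu_util(const uint64_t start[NFIELDS],
                           const uint64_t end[NFIELDS])
{
    uint64_t all_ticks = 0, idle_ticks;
    int i;

    for (i = 0; i < NFIELDS; i++)
        all_ticks += end[i] - start[i]; /* unsigned: wrap is harmless */

    idle_ticks = end[IDLE_FIELD] - start[IDLE_FIELD];
    return 100.0 - (100.0 * (double)idle_ticks / (double)all_ticks);
}
```

Feeding it the cpu1 lines from the two /proc/stat dumps below
(diff = 21 0 818 64 0 1 1 138 0) gives ~93.86%, which matches the
calc_cpu_util line for processor 1.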
I have a question about the original code: you handle wrapping in a
way that I think is incorrect:
- actual_rate = (lib_end_count[i] > lib_start_count[i]) ?
- (float)(lib_end_count[i] - lib_start_count[i])/lib_elapsed :
- (float)(lib_end_count[i] - lib_start_count[i] +
- MAXLONG)/ lib_elapsed;
1) these times are uint64_t, so MAXLONG is going to be 2^31-1,
while the max uint64_t value (the wrap point) is 2^64-1.
2) these counters are unsigned, so wrapping doesn't even matter.
I preserved this behavior in my "tick_subtract" function. If
you agree with me that it is unneeded, we can remove that
function and just do simple subtraction.
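To illustrate point 2 (the helper name is mine, this is not netperf
code): with unsigned counters, plain subtraction already yields the
right delta across a wrap, because the arithmetic is modulo 2^64.

```c
#include <stdint.h>

/* Delta between two unsigned tick counters.  Even if the counter
   wrapped between the two samples, end - start is still correct,
   because uint64_t arithmetic is modulo 2^64.  No MAXLONG fixup is
   needed (and 2^31-1 would be the wrong modulus for a 64-bit
   counter anyway). */
static uint64_t tick_delta(uint64_t start, uint64_t end)
{
    return end - start;
}
```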
Please let me know if you think my approach is viable. If so,
I'll do wider testing (back to older systems with 8 field
proc stats, uniprocessor systems, etc).
Thanks,
Drew
PS: Here's full output from the same pair of systems as above, but
with my utilization method.
./src/netperf -Hdell2950b-m -tUDP_STREAM -T1,0 -C -c -d -- -m 8972 -S 1M
-s 1M
scan_sockets_args called with the following argument vector
./src/netperf -Hdell2950b-m -tUDP_STREAM -T1 -C -c -d -- -m 8972 -S 1M
-s 1M
Program name: ./src/netperf
Local send alignment: 8
Local recv alignment: 8
Remote send alignment: 8
Remote recv alignment: 8
Report local CPU 1
Report remote CPU 1
Verbosity: 1
Debug: 1
Port: 12865
Test name: UDP_STREAM
Test bytes: 0 Test time: 10 Test trans: 0
Host name: dell2950b-m
installing catcher for all signals
Could not install signal catcher for sig 32, errno 22
Could not install signal catcher for sig 33, errno 22
Could not install signal catcher for sig 65, errno 22
remotehost is dell2950b-m and port 12865
establish_control called with host 'dell2950b-m' port '12865' remfam
AF_UNSPEC
local '0.0.0.0' port '0' locfam AF_UNSPEC
getaddrinfo returned the following for host 'dell2950b-m' port '12865'
family AF_UNSPEC
cannonical name: 'dell2950b-m.sw.myri.com'
flags: 2 family: AF_INET: socktype: SOCK_STREAM protocol
IPPROTO_TCP addrlen 16
sa_family: AF_INET sadata: 50 65 10 0 130 51
getaddrinfo returned the following for host '0.0.0.0' port '0' family
AF_UNSPEC
cannonical name: '0.0.0.0'
flags: 3 family: AF_INET: socktype: SOCK_STREAM protocol
IPPROTO_TCP addrlen 16
sa_family: AF_INET sadata: 0 0 0 0 0 0
bound control socket to 0.0.0.0 and 0
successful connection to remote netserver at dell2950b-m and 12865
complete_addrinfo using hostname dell2950b-m port 0 family AF_UNSPEC
type SOCK_DGRAM prot IPPROTO_UDP flags 0x0
getaddrinfo returned the following for host 'dell2950b-m' port '0'
family AF_UNSPEC
cannonical name: 'dell2950b-m.sw.myri.com'
flags: 2 family: AF_INET: socktype: SOCK_DGRAM protocol
IPPROTO_UDP addrlen 16
sa_family: AF_INET sadata: 0 0 10 0 130 51
local_data_address not set, using local_host_name of '0.0.0.0'
complete_addrinfo using hostname 0.0.0.0 port 0 family AF_UNSPEC type
SOCK_DGRAM prot IPPROTO_UDP flags 0x1
getaddrinfo returned the following for host '0.0.0.0' port '0' family
AF_UNSPEC
cannonical name: '0.0.0.0'
flags: 3 family: AF_INET: socktype: SOCK_DGRAM protocol
IPPROTO_UDP addrlen 16
sa_family: AF_INET sadata: 0 0 0 0 0 0
UDP UNIDIRECTIONAL SEND TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to
dell2950b-m.sw.myri.com (10.0.130.51) port 0 AF_INET : cpu bind
create_data_socket: socket 4 obtained...
netperf: get_sock_buffer: send socket size determined to be 2097152
netperf: get_sock_buffer: receive socket size determined to be 2097152
calibrate: lib_num_loc_cpus 4 lines 5 CPU_LINE_LENGTH 108
proc_stat_buflen 540
recv_response: received a 256 byte response
send_udp_stream: remote data connection done.
About to start a timer for 10 seconds.
cpu_util_init enter, proc_stat_fd 5 proc_stat_buf 0x10096d60
proc_stat_buf 'cpu 85904 2456 192194 5056775225 327974 558430 701669
15179 0
cpu0 50002 632 97535 849439433 7436 299350 381332 4018 0
cpu1 16189 503 54819 1566754531 318165 37773 40609 5930 0
cpu2 8415 750 17664 1143884630 1048 158147 214183 2635 0
cpu3 11297 569 22175 1496696629 1323 63158 65543 2594 0
intr 70972758 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 119 0 2242184 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 24807236 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 17338275 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0'
res[50002] is 632 97535 849439433 7436 299350 381332 4018 0 269053280
res[16189] is 503 54819 1566754531 318165 37773 40609 5930 0 269053280
res[8415] is 750 17664 1143884630 1048 158147 214183 2635 0 269053280
res[11297] is 569 22175 1496696629 1323 63158 65543 2594 0 269053280
proc_stat_buf 'cpu 85926 2456 193013 5056780270 327976 558494 701707
15317 0
cpu0 50002 632 97535 849440841 7436 299414 381370 4018 0
cpu1 16210 503 55637 1566754595 318165 37774 40610 6068 0
cpu2 8415 750 17664 1143886370 1048 158147 214183 2635 0
cpu3 11297 569 22175 1496698462 1325 63158 65543 2594 0
intr 71105574 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 119 0 2242532 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 24939676 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 17338275 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0'
res[50002] is 632 97535 849440841 7436 299414 381370 4018 0 269053280
res[16210] is 503 55637 1566754595 318165 37774 40610 6068 0 269053280
res[8415] is 750 17664 1143886370 1048 158147 214183 2635 0 269053280
res[11297] is 569 22175 1496698462 1325 63158 65543 2594 0 269053280
recv_response: received a 256 byte response
send_udp_stream: remote results obtained
lib_local_maxrate = 100.000000
calc_cpu_util: util on processor 0, diff = 0 0 0 1408 0 64 38 0 0 util
6.754967 cf 1.000000
calc_cpu_util: util on processor 1, diff = 21 0 818 64 0 1 1 138 0 util
93.863853 cf 1.000000
calc_cpu_util: util on processor 2, diff = 0 0 0 1740 0 0 0 0 0 util
0.000000 cf 1.000000
calc_cpu_util: util on processor 3, diff = 0 0 0 1833 2 0 0 0 0 util
0.108994 cf 1.000000
calc_service_demand called: units_sent = 12406409824.000000
elapsed_time = 0.000000
cpu_util = 0.000000
num cpu = 0
calc_service_demand using: units_sent = 12406409824.000000
elapsed_time = 10.000217
cpu_util = 25.181953
num cpu = 4
calc_service_demand got: thruput = 1211537.115985
servdem = 0.831405
calc_service_demand called: units_sent = 12406409824.000000
elapsed_time = 0.000000
cpu_util = 13.033341
num cpu = 4
calc_service_demand using: units_sent = 12406409824.000000
elapsed_time = 10.000217
cpu_util = 13.033341
num cpu = 4
calc_service_demand got: thruput = 1211537.115985
servdem = 0.430308
calculate_confidence: itr 1; time 10.000217; res 9924.912054
lcpu 25.181953; rcpu 13.033341
lsdm 0.831405; rsdm 0.430308
Socket Message Elapsed Messages CPU Service
Size Size Time Okay Errors Throughput Util Demand
bytes bytes secs # # 10^6bits/sec % SS us/KB
2097152 8972 10.00 1382792 0 9924.9 25.18 0.831
2097152 10.00 1382792 9924.9 13.03 0.430
shutdown_control: shutdown of control connection requested.
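As an aside, the service-demand figures in the debug output can be
reproduced with the sketch below. The formula is inferred from the
logged values (it reproduces servdem = 0.831405 from the local
numbers), not lifted from the netperf sources:

```c
/* Service demand in usec of CPU per KB transferred, reconstructed
   from the calc_service_demand debug lines above.  units_sent is in
   bytes, elapsed in seconds, cpu_util in percent. */
static double calc_service_demand(double units_sent, double elapsed,
                                  double cpu_util, int num_cpu)
{
    double thruput = units_sent / 1024.0 / elapsed;        /* KB/s */
    return (cpu_util / 100.0) * num_cpu * 1.0e6 / thruput; /* us/KB */
}
```

With units_sent = 12406409824, elapsed = 10.000217,
cpu_util = 25.181953 and 4 CPUs this yields ~0.8314 us/KB, matching
the reported local service demand.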
-------------- next part --------------
A non-text attachment was scrubbed...
Name: procstat.diff
Type: text/x-diff
Size: 6015 bytes
Desc: not available
Url : http://www.netperf.org/pipermail/netperf-talk/attachments/20090527/7e3d2e83/attachment.diff