[netperf-talk] NetPerf issue with intervals approaching line-rate.
Rick Jones
rick.jones2 at hp.com
Thu Jul 14 13:24:14 PDT 2011
On 07/14/2011 11:42 AM, Moosa Baransi wrote:
> Hi Rick,
>
> Thanks for your prompt response.
>
> As for your question, this happens after the requested test length,
> something like seconds after the test starts.
> I do not use --enable-spin
>
> I was doing the test in open air to check the throughput in the field.
> Once I took the board and put it in a chamber to shield it from any
> wireless interference, the problem went away.
>
> It seems it was my mistake to try doing it in open air.
OK, so this leads to hypothesis #2 :)
Presumably the chamber was isolating your wireless communications from
interference - does your wireless network retransmit at the data-link layer?
One other thing that may be happening is that the remote netserver's
test timer may have expired before netperf finished its test loop. If
there are link-layer retransmissions and they delayed things by
PAD_TIME or more (4 seconds), then netserver "wins" the timer race with
netperf. However, as I type this, I recall that is usually only a
problem with TCP tests.
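(Concretely: with -l 20, netperf expects to finish its loop at about 20
seconds, while netserver arms its timer for roughly 20 + PAD_TIME = 24
seconds; if retransmission stalls push netperf's loop past that point,
netserver's timer fires first.)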
So, hypothesis #3
With all the interference, it took long enough to transmit the last 64k
(128k on Linux, since it doubles the requested socket buffer) that the
channel (if that is the right term) stayed busy long enough for the
select() call in recv_response() to time out. But even that hypothesis
isn't really satisfying, because the error message shows errno 4 and a
return of -1, which I think means EINTR.
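For reference, a select() that simply times out returns zero, while a
return of -1 with errno 4 (EINTR) means a signal interrupted the call.
A minimal sketch of the distinction, assuming POSIX select(); this is
illustrative, not netperf's actual recv_response():

    #include <errno.h>
    #include <stdio.h>
    #include <sys/select.h>

    /* wait up to 'secs' seconds for 'fd' to become readable */
    int wait_for_response(int fd, int secs)
    {
        fd_set rfds;
        struct timeval tv = { secs, 0 };
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc == 0)
            fprintf(stderr, "timed out\n");              /* a real timeout */
        else if (rc == -1 && errno == EINTR)
            fprintf(stderr, "interrupted by signal\n");  /* errno 4 */
        return rc;
    }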
So, hypothesis #4, which tweaks hypothesis #3 slightly
The interval timer didn't get turned off, and the interval was shorter
than the time it took for the first bytes of netserver's response to get
to netperf. The problem with that is there is a stop_itimer() call in
the signal handler in the netperf code. Of course, if that didn't
"work"...
Although... while I see a stop_itimer() call in catcher(), catcher can
set times_up on an interval-timer SIGALRM, yet the send_udp_stream()
code, when it sees times_up set to one, does not call stop_timer(). So
the "race" may not have been with the interval timer, but with the test
timer.
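In code terms, the pattern in question looks roughly like this (a
sketch with netperf-like names, not netperf's source):

    #include <signal.h>

    static volatile sig_atomic_t times_up = 0;

    /* SIGALRM handler: sets the flag the send loop polls. If the
       SIGALRM came from a still-armed interval timer rather than the
       test timer, and the send loop exits on times_up without
       disarming that timer, a later SIGALRM can interrupt the
       select() in recv_response(), yielding the errno 4 (EINTR) */
    static void catcher(int sig)
    {
        (void)sig;
        times_up = 1;
    }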
So, if you happen to go back outside, you might try taking a system
call trace of netperf's execution.
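For example, on Linux, something along the lines of

    strace -f -tt -o /tmp/netperf.trace netperf -H 192.168.1.106 ...

(the output path and options are just a suggestion) should show exactly
which system call comes back with errno 4.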
happy benchmarking,
rick jones
>
> Thanks,
>
> Moosa
>
>
>
> On 07/14/2011 08:16 PM, Rick Jones wrote:
>> On 07/13/2011 08:34 AM, Moosa Baransi wrote:
>>> Hi Raj,
>>>
>>> My name is Moosa and I am working with NetPerf 2.4.5. I encountered a
>>> problem for which I cannot find a solution in the various forums.
>>
>> Then we should include one or more forums to improve life for the next
>> guy. So, I will cc netperf-talk and include a more informative subject
>> header.
>>
>>> _The problem description_:
>>>
>>> I'm working on a board which has a max UDP throughput of about 10
>>> Mbps. I verified that with NetPerf using the command:
>>>
>>>
>>> *netperf -H 192.168.1.106 -D 2 -l 20 -t UDP_STREAM -fm -- -m1472
>>> -s64k -S64k &*
>>>
>>>
>>> I would like to use the burst option to measure how it impacts the
>>> CPU utilization. I am running the following command:
>>>
>>>
>>> *netperf -H 192.168.1.106 -D 2 -l 20 -t UDP_STREAM -b5 -w0 -fm --
>>> -m1472 -s64k -S64k &*
>>>
>>>
>>> The issue is that if I change the burst to anything close to 10 Mbps
>>> (*-b9* or above), I get the following error:
>>>
>>>
>>> *netperf: receive_response: no response received. errno 4 counter -1.*
>>>
>>>
>>> The host and the server are running NetPerf 2.4.5.
>>>
>>> I do not care if I have packet loss or retries; I just want the whole
>>> thing to keep going and not hang in the middle.
>>>
>>> I would appreciate it if you can help me with this.
>>
>> I'm guessing that when you get the error, it happens well before the
>> requested test length has passed?
>>
>> Unless one uses --enable-spin to cause netperf to sit and spin for the
>> interval, which I suspect you do not wish to do, the interval is
>> implemented with the interval timer: every interval, a signal is
>> generated. If you look at the code for INTERVALS_WAIT you will see it
>> makes a sigsuspend() call. The idea is that netperf is in
>> sigsuspend() before the signal fires.
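>> (A minimal sketch of that mechanism, assuming POSIX setitimer() and
>> sigsuspend(); illustrative only, not netperf's actual code:
>>
>>     #include <signal.h>
>>     #include <sys/time.h>
>>
>>     static volatile sig_atomic_t done = 0;   /* set by a test timer */
>>     static void on_alarm(int sig) { (void)sig; }
>>
>>     void paced_send_loop(void)
>>     {
>>         struct sigaction sa;
>>         sa.sa_handler = on_alarm;
>>         sigemptyset(&sa.sa_mask);
>>         sa.sa_flags = 0;              /* no SA_RESTART */
>>         sigaction(SIGALRM, &sa, NULL);
>>
>>         /* fire SIGALRM after 10 ms, and every 10 ms thereafter */
>>         struct itimerval it = { {0, 10000}, {0, 10000} };
>>         setitimer(ITIMER_REAL, &it, NULL);
>>
>>         sigset_t waitmask;
>>         sigemptyset(&waitmask);       /* SIGALRM deliverable here */
>>         while (!done) {
>>             /* send the burst of -b sendto() calls here; if the
>>                burst takes longer than the interval, the SIGALRM
>>                fires during sendto() instead of in sigsuspend() */
>>             sigsuspend(&waitmask);    /* park until the SIGALRM */
>>         }
>>     }
>> )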
>>
>> Now, on some platforms, netperf knows how to check what system call
>> was interrupted by the signal, and when it is in the middle of an
>> interval it will ask that the system call be restarted if it isn't
>> sigsuspend().
>>
>> However, on other platforms netperf does not know how to do that
>> (enhancement patches would be most welcome :). So, if it takes longer
>> than the interval to get through the burst, there is a very real
>> chance the signal will fire during the send()/sendto() call, which
>> netperf will misinterpret as the end-of-test indication. On those
>> platforms, it is necessary to ensure that the burst interval is longer
>> than the length of time it takes to send the burst.
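>> (As a rough worked example, assuming the interval timer happens to
>> tick at a 10 millisecond granularity: -b9 with -m1472 is
>> 9 * 1472 * 8 = 105,984 bits per tick, or about 10.6 Mbit/s, which is
>> right at the board's ~10 Mbit/s limit, so the burst can easily take
>> longer than the interval.)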
>>
>> happy benchmarking,
>>
>> rick jones