[netperf-talk] netperf port numbers
Rick Jones
rick.jones2 at hp.com
Wed Aug 24 16:39:51 PDT 2011
On 08/24/2011 04:12 PM, Vishal Ahuja wrote:
> Hi Rick,
> I am running some netperf experiments using UDP and TCP over a 10 Gbps
> link - the machines are connected back to back. I am only running a
> single netperf client, and on the sender side there are multiple cores
> enabled. A single TCP flow manages up to 6.5 Gbps, which is fine. When
> using UDP, the problem is that the throughput on the sender side is
> around 4.1 Gbps, but the throughput on the receive side is 0 Gbps. The
> same experiment with iperf achieves around 2.35 Gbps on the receive
> side. Using top, I observed that while the experiment was running,
> netserver was never scheduled on any of the CPUs. Even if I run it with
> a nice value of -20, it does not get scheduled. Can you please help me
> understand why this could be happening? My guess is that all the
> traffic is being directed to a single core, which gets overwhelmed by
> the interrupts, and so the netserver application never gets a chance.
> Is that correct? If yes, then why does it not happen for TCP,
> considering that the RTT in my setup is negligible?
TCP has end-to-end flow control. The TCP window advertised by the
receiver and honored by the sender, and the congestion window calculated
by the sender, both work to minimize the times when the sender
overwhelms the receiver.
UDP has no end-to-end flow control, and netperf at least makes no
attempt to provide any. It does, though, offer a way to throttle the
sender if you include --enable-intervals when you run ./configure
prior to compiling netperf.
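For example - a sketch, assuming netperf was built as above; the host
name 10.0.0.2 and the burst/wait values are placeholders, and the
exact units of -w are described in the netperf manual:

  # build netperf with pacing support
  ./configure --enable-intervals && make
  # send bursts of 16 datagrams with a wait between bursts;
  # 1472-byte messages keep each datagram in one 1500-byte MTU packet
  netperf -H 10.0.0.2 -t UDP_STREAM -b 16 -w 1 -- -m 1472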
Also, depending on the model of NIC, there can be more offloads for TCP
than for UDP - such as Large Receive Offload (LRO) or Generic Receive
Offload (GRO) and TSO/GSO - which take advantage of TCP being a
byte-stream protocol that does not need to preserve message boundaries.
Some NICs support UDP Fragmentation Offload, but I do not know of a
corresponding UDP reassembly offload. If "UFO" is supported at the
sender but there is no UDP fragment reassembly offload at the receiver,
then sending a maximum-size UDP datagram that becomes 45 or so
fragments/packets on the wire/fibre is not much more expensive than
sending a 1024-byte one that is only one packet on the wire/fibre, but
it will have significantly greater overhead on the receiver - what
takes one trip down the protocol stack at the sender is, in effect, 45
trips up the protocol stack at the receiver.
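Back of the envelope, assuming a standard 1500-byte Ethernet MTU: a
maximum-size UDP datagram is 65507 bytes of payload plus an 8-byte UDP
header, or 65515 bytes of IP payload, and each fragment can carry at
most 1480 bytes of that, so

  ceil(65515 / 1480) = 45 fragments on the wire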
Even without UDP Fragmentation Offload, I believe that sending a
fragmented UDP datagram is cheaper than reassembling it, so there is
still a disparity.
As for iperf vs netperf, they probably default to different send sizes -
on my Linux system, for example, where the default UDP socket buffer is
something like 128KB, netperf will send 65507-byte messages. I don't
know if iperf sends messages that size by default. And, since you
didn't provide any of the command lines :) we don't know if you told
them to use the same send size.
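To compare apples to apples, something like the following should make
both tools use the same message size - a sketch, where 10.0.0.2 is a
placeholder host, 1472 bytes keeps each datagram in a single 1500-byte
MTU packet, and iperf's UDP mode needs its target bandwidth (-b)
raised from its low default (check your iperf version's syntax):

  netperf -H 10.0.0.2 -t UDP_STREAM -- -m 1472
  iperf -u -c 10.0.0.2 -l 1472 -b 10G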
I doubt that the traffic of a single stream (iperf or netperf, UDP or
TCP or whatever) would ever be directed to more than one core - things
like Receive Side Scaling (RSS) or Receive Packet Steering (RPS) hash
the headers, and if there is just the one flow, there will be just the
one queue/core used. At times I have had some success explicitly
affinitising netperf/netserver to a core other than the one taking
interrupts from the NIC. By default in Linux, at least, an attempt is
made to run the application on the same CPU as where the wakeup
happened.
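A sketch of one way to do that, using netperf's global -T option to
bind the local and remote processes to specific CPUs - eth0, the IRQ
number, and the CPU numbers below are placeholders for whatever your
system shows:

  # find which CPU is fielding the NIC's interrupts
  grep eth0 /proc/interrupts
  cat /proc/irq/<irq-number>/smp_affinity
  # bind local netperf to CPU 1 and remote netserver to CPU 2
  netperf -H 10.0.0.2 -t UDP_STREAM -T 1,2 -- -m 1472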
happy benchmarking,
rick jones