[netperf-talk] Testing on Freescale MPC8313ERDB
Rick Jones
rick.jones2 at hp.com
Wed May 5 15:51:29 PDT 2010
Dominic Lemire wrote:
> Thanks a lot Rick and Andrew.
>
> The CPU seems to be the bottleneck (single core 333MHz). I get better
> results when I connect 2 Freescale boards together (see results below).
>
> I tried the sendfile test with a 10MB file of random data, but I still
> see the cpu saturated and lower throughput (see last test below). Is
> this 10MB big enough?
> ...
> ---------- Two Freescale boards with cross-over cable (1000Mbit, full-duplex) ----------
> PHY: e0024520:04 - Link is Up - 1000/Full
> ~ # ./netperf -H 10.42.43.2 -c -C -- -s 128K -S 128K
> TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.42.43.2 (10.42.43.2) port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
> 217088 217088 217088    10.01       191.72   99.90    92.01    42.686  39.313
>
> ---------- Sendfile test ----------
> ~ # ./netperf -H 10.42.43.2 -c -C -tTCP_SENDFILE -F /dev/shm/10meg.bin
> TCP SENDFILE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.42.43.2 (10.42.43.2) port 0 AF_INET
> Recv   Send    Send                          Utilization       Service Demand
> Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
> Size   Size    Size     Time     Throughput  local    remote   local   remote
> bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
>
>  87380  16384  16384    10.00       150.21   99.90    96.50    54.481  52.628
Perhaps the Linux sendfile mechanism isn't really zero-copy on your platform.
IIRC I have code in there where netperf will create a temp file of sufficient
size if one is not specified via -F, so I guess you could try that. If the
mechanism is indeed not zero-copy, ranging over 10MB of data will probably trash
your processor caches, which may be why the sendfile test was so much worse than
the regular TCP_STREAM test.
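For example (same hosts as above, and assuming your netperf build has that
temp-file fallback), simply omitting -F should exercise it:

~ # ./netperf -H 10.42.43.2 -c -C -t TCP_SENDFILE

If that run shows noticeably better throughput or service demand than the run
against the 10MB file in /dev/shm, cache effects from ranging over the larger
file would be the likely explanation.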
A regular TCP_STREAM test will allocate one more "send size" sized buffer than
the initial SO_SNDBUF size divided by the send size - this can be overridden
with the global -W option. It may be possible to see higher throughput (or at
least lower service demand) by constraining the send ring to just one entry.
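Taking the numbers above at face value, a 217088-byte send into a 217088-byte
socket buffer works out to 217088/217088 + 1 = 2 buffers in the send ring. A
sketch of constraining that to a single entry (the exact -W argument syntax may
vary with your netperf version) would be something like:

~ # ./netperf -H 10.42.43.2 -c -C -W 1 -- -s 128K -S 128K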
Anyway, it does seem as though you have found your bottleneck - the processor(s)
in your system(s) are too wimpy :) That does bring up a point about the various
speeds of Ethernet. These days it generally comes up in the context of 10
Gigabit Ethernet, which is why the attached is written the way it is.
happy benchmarking,
rick jones
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 10g_perf
URL: <http://www.netperf.org/pipermail/netperf-talk/attachments/20100505/a73db094/attachment-0001.ksh>