[netperf-talk] Problem with Bidirectional Test
Rick Jones
rick.jones2 at hp.com
Thu Jul 16 10:14:27 PDT 2009
Ankit Goyal wrote:
> # ./netperf -t LOC_CPU
>
> it gives: 0
>
> and the 'top' command is there on my board, but vmstat is not. With
> 'top' I saw the CPU usage, but it showed 100% idle, which is correct
> since I was not giving it anything to do. When I send the two TCP
> streams directed at the two ports on my board, the CPU utilisation
> varies as follows:
>
> CPU: 0.0% usr 60.0% sys 0.0% nic 0.0% idle 0.0% io 0.0% irq 40.0% sirq
> Load average: 0.38 0.14 0.04 3/28 924
>
> then it changes to
>
> CPU: 0.1% usr 52.4% sys 0.0% nic 0.0% idle 0.0% io 4.5% irq 42.7% sirq
> Load average: 0.51 0.17 0.05 3/28 924
>
> CPU: 0.1% usr 35.7% sys 0.0% nic 30.7% idle 0.0% io 2.3% irq 30.9% sirq
> Load average: 0.47 0.17 0.05 1/26 924
>
> then it becomes 100% idle
>
> so these are the changes when data is sent unidirectionally to the two
> ports together. Do they say anything about the bottleneck, i.e. why the
> throughput is halved? And how can I see the CPU utilisation when data is
> sent in and out of both ports?
Yes. If sending in just one direction consumes all the CPU available on the
board, then trying to send and receive at the same time will leave each of the
individual tests with lower throughput. In other words, with the current
software, your board does not have enough CPU performance to drive two GbE
ports bidirectionally at link rate.
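If you want hard numbers rather than eyeballing top, one minimal sketch (an
assumption on my part that your BusyBox build includes head, cat and sleep)
is to sample the aggregate counters in /proc/stat around the test window:

  # snapshot the cumulative CPU counters, let the streams run, snapshot again
  head -n 1 /proc/stat > /tmp/stat.before
  sleep 60                # the bidirectional streams run during this window
  head -n 1 /proc/stat > /tmp/stat.after
  # the 4th number after the "cpu" label is cumulative idle time in clock
  # ticks; if it barely changes between the snapshots, the CPU was saturated

The differences in the other columns (user, system, softirq) will also show
where the cycles are going.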
>
> query 2: when I assign these ports their IP addresses in the same subnet,
> they start behaving oddly, as in:
>
> eth0 port:xxx.xx.xx.162
> eth1 port:xxx.xx.xx.163
> network: xxx.xx.xx.1 to xxx.xx.xx.255
> netmask 255.255.255.0
>
> these two ports are connected to a network, and the other end of the
> network is connected to my PC, whose IP is xxx.xx.xx.138
>
> I can ping my PC from my board, and from the PC I can ping either of the
> ports. But when I unplug my eth0 port from the network, my PC still gets
> ping replies from xxx.xx.xx.162. I can't understand why it still pings
> that port if it has been unplugged from the network.
I suspect that goes back to what we discussed concerning the "arp_ignore"
setting. If you were to check on your PC, I suspect you would see that in its
ARP cache it had the MAC address of eth1 associated with the IP address you
assigned to eth0.
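A quick way to confirm that (a sketch; the address is the placeholder from
your mail, and the commands assume a Windows PC, though Linux has equivalents)
is to inspect and then flush the PC's ARP cache while eth0 is unplugged:

  arp -a                   # list cached IP-to-MAC mappings; note the MAC
                           # currently associated with xxx.xx.xx.162
  arp -d xxx.xx.xx.162     # delete that entry (may need an admin prompt)
  ping xxx.xx.xx.162       # if eth1 answers the fresh ARP request, the ping
                           # still "works" until arp_ignore is set on the board

If the MAC shown by arp -a matches eth1 rather than eth0, that confirms the
explanation above.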
> query 3: when I set verbosity level 2 for the TCP stream directed from
> my PC to a port on the board, bytes per send (avg) comes out to be 65536
> and MSS (bytes) comes out to be -1, which seems absurd since I have
> jumbo frame support on both my PC and the board.
The MSS being reported as -1 suggests that the TCP_MAXSEG socket option was
not detected as present when netperf was compiled, or perhaps the call to
retrieve it failed at run time.
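One way to narrow that down, as a sketch (the paths below are assumptions for
a typical cross-compile setup, so adjust them for your toolchain and source
tree):

  # does the toolchain's tcp.h expose TCP_MAXSEG at all?
  grep TCP_MAXSEG /path/to/sysroot/usr/include/netinet/tcp.h
  # what did the netperf 2.4.5 configure run conclude?  look for a
  # MAXSEG-related line; the exact wording depends on the configure script
  grep -i maxseg /path/to/netperf-2.4.5/config.log

If TCP_MAXSEG is missing from the cross-compile headers, netperf would have
been built without MSS reporting and -1 is the expected result.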
>
>
> so many queries..ahh!
> thanks for helping me out
>
>
>
>
> On Wed, Jul 15, 2009 at 10:17 PM, Rick Jones <rick.jones2 at hp.com> wrote:
>
> Ankit Goyal wrote:
>
> Hii,
>
> The board has two PCI-Express interfaces, one with four lanes and one
> with one lane; 2.5 Gbit/s full duplex per lane; compliant with the
> PCI-Express base specification 1.1; configurable as root or endpoint.
> It also has two 10/100/1000-Mbit/s full-duplex Ethernet MACs with
> TCP/IP acceleration hardware, QoS, and jumbo frame support, supporting
> the GMII/MII, TBI, RTBI, RGMII, SGMII and SMII interfaces. A memory
> access layer (MAL) provides DMA capability to both Ethernet channels.
>
>
> And to which PCIe interface are your ethernet MACs connected?
>
> when I give this command to see the sender and receiver CPU
> utilisation:
>
> this is for eth0 connected to comp1
> ./netperf -H computer1 -t TCP_STREAM -c -C -- -s 32K -S 64K
> ...
>
> Why is the sender's local CPU utilisation negative, and why is the send
> service demand zero?
>
>
> It suggests that netperf does not know how to perform CPU
> utilization measurements on your platform. What does:
>
> netperf -t LOC_CPU
>
> say when run on your board?
>
>
> I am using different subnets on the two ports, and they work well when I
> send data through one port only, but when I send data through both ports
> bidirectionally, instead of doubling, the throughput is halved. I have
> BusyBox v1.2.1 and Linux kernel 2.6.30 on my board. I don't have an
> /etc/sysctl.conf file on my board; there are only vsftpd.conf,
> hosts.conf, nsswitch.conf and xinetd.conf configuration files.
>
>
> Do you have "top" or vmstat or something like that on your board?
> That would be a way to get CPU utilization outside of netperf.
>
> rick jones
>
>
> Please let me know why the throughput is halved.
>
> Thanks
>
> On Tue, Jul 14, 2009 at 10:32 PM, Rick Jones <rick.jones2 at hp.com> wrote:
>
> PLEASE keep this in netperf-talk.
>
>
> Ankit Goyal wrote:
>
> Hii,
> I have two Ethernet ports of 1 Gbps each on my board, two PCI-Express
> connectors,
>
>
> What number of lanes? PCIe 1.1 or PCIe 2.0 etc etc etc.
>
> a PCI connector, and TCP/IP acceleration too. So I connected one port to
> a PC and the other to a laptop, so I can see the throughput
> unidirectionally. I have BusyBox on my board and Windows on the others
> (PC & laptop). I have a compiled version of netperf for Windows and so
> can run netperf and netserver. So how can I see the results of a
> bidirectional transfer, i.e. data going out from both ports to the
> laptop and PC, and data coming in from the laptop and PC to these ports?
> Please tell me some commands which do not use enable-burst or
> tcp_maerts (if possible), because I don't have enable-burst configured
> and tcp_maerts does not work (I don't know why; maybe I have a
> differently compiled version of netperf on Windows).
>
>
> Shame on you :)
>
> Well, there is no way to run without non-trivial concerns about skew
> error if you have neither maerts nor burst mode. You are left with:
>
> On your board:
> 1) start a TCP_STREAM test directed towards the laptop
> 2) start a TCP_STREAM test directed towards the PC
>
> On your laptop:
> 3) start a TCP_STREAM test directed towards your board
>
> On your PC:
> 4) start a TCP_STREAM test directed towards your board
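As a concrete sketch of steps 1-4 (host names are placeholders, netserver must
already be running on each target, and the -l 60 test length is just an
illustration; the runs are only approximately simultaneous, so some skew error
remains):

  # on the board, launch both outbound streams in the background
  ./netperf -H <laptop_ip> -t TCP_STREAM -l 60 &
  ./netperf -H <pc_ip>     -t TCP_STREAM -l 60 &

  # on the laptop and on the PC, at roughly the same moment
  netperf -H <board_ip> -t TCP_STREAM -l 60

Then sum the two inbound and two outbound results to estimate the aggregate
bidirectional throughput.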
>
>
>
> And when I do this test, with data coming in to one port and data going
> out from the second port, my throughput becomes half of the original
> unidirectional transfer from one port. Why is this so? Shouldn't I have
> got double the throughput? Does it mean there is no advantage to using
> two Ethernet ports? Please help me out.
>
>
> I am trying...
>
> What is the CPU utilization on your board during these tests? Perhaps
> you have maxed out the CPU(s).
>
> What are all the IP addresses involved here?
>
> Are both ports of the board configured into the same IP subnet? If so,
> and you are running Linux on the board, you need to use sysctl to set:
>
> net.ipv4.conf.default.arp_ignore = 1
>
> in something that will persist across reboots (e.g. /etc/sysctl.conf)
> and reboot, and/or set:
>
> net.ipv4.conf.ethN.arp_ignore = 1
> net.ipv4.conf.ethM.arp_ignore = 1
>
> otherwise, ARP will treat those interfaces as rather interchangeable and
> you may not have traffic flow the way you think it will.
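Since there is no /etc/sysctl.conf on the board (as mentioned elsewhere in
this thread), a sketch of setting these directly, assuming the two ports are
eth0 and eth1:

  sysctl -w net.ipv4.conf.eth0.arp_ignore=1
  sysctl -w net.ipv4.conf.eth1.arp_ignore=1
  # or, if this BusyBox build lacks the sysctl applet:
  echo 1 > /proc/sys/net/ipv4/conf/eth0/arp_ignore
  echo 1 > /proc/sys/net/ipv4/conf/eth1/arp_ignore

These do not survive a reboot, so add them to whatever boot script the board
runs if you want them to persist.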
>
> It would also be best (IMO) to configure each of the two ports on
> the board into a separate IP subnet (and adjust the laptop and PC
> accordingly) so you have a better idea of what sort of routing
> decisions the stack is going to make.
>
> rick jones
>
> Thanks
>
>
> On Mon, Jul 13, 2009 at 9:59 PM, Rick Jones <rick.jones2 at hp.com> wrote:
>
> Ankit Goyal wrote:
>
> hii,
>
> I am working with a dual-Ethernet-port board. I have cross-compiled
> netperf 2.4.5 for my board. So when I give the ./netserver command:
>
> Starting netserver at port no 12865
> Starting netserver at hostname 0.0.0.0 port 12865 and family AF_UNSPEC
>
> But this netserver runs on the eth0 port, and I want netserver to run
> on both Ethernet ports.
>
>
> That netserver will run over any port. The port over which tests will
> run will be influenced by the netperf command lines.
>
>
> So how do I make netserver run on both ports simultaneously?
>
>
> Assuming we have eth0 at 1.2.3.4 and eth1 at 2.3.4.5 on the netserver
> system, the first version would be
>
> netperf -H 1.2.3.4 ...
> netperf -H 2.3.4.5 ...
>
> If your board is running linux, you may need/want to set the
> "arp_ignore" (sysctl -a | grep ignore) sysctl to "1" to get linux out of
> its *very* literal interpretation of the weak end system model. By
> default, any interface in a linux system will respond to ARP requests
> for any system-local IP address.
>
>
> And if possible, can you tell me some ways to increase the throughput?
> You told me previously that you can get 1800 Mbps bidirectionally on
> 1 Gig, but I am only able to get:
>
> outbound: 560 Mbps
> inbound: 450 Mbps
>
> What can be done to get it to 1800 Mbps? I will be very thankful if you
> help me out.
>
>
> You will need to see first what the bottleneck happens to be. Attached
> to this message is some boilerplate I have worked up that may help.
>
> Also, what sort of I/O connection does your dual-port chip have on this
> board? PCIe? PCI-X? Speeds and feeds?
>
> rick jones
> let's keep this discussion on netperf-talk for the benefit of all.
>
>
> Thanks
>
>
>
>
>
> On Thu, Jul 9, 2009 at 10:30 PM, Rick Jones <rick.jones2 at hp.com> wrote:
>
> Ankit Goyal wrote:
>
> Thanks a ton, guys!!
> If possible, could you tell me how much maximum bidirectional throughput
> I can get on a 1 Gbps Ethernet connection? Can I get more than 1 Gig by
> changing the drivers and kernel?
> I know it's a very relative question, but I will be glad if you help me
> out!
>
>
> In theory you should be able to see O(1800) megabits/s. Certainly that
> would be my expectation in this day and age:
>
>
> s7:/home/raj/netperf2_trunk# netperf -H sbs133b1.west. -t TCP_RR -f m -- -r 64K -s 256K -S 256K -m 32K -b 8
> TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to sbs133b1.west (10.208.1.20) port 0 AF_INET : first burst 8
> Local /Remote
> Socket Size   Request  Resp.   Elapsed
> Send   Recv   Size     Size    Time     Throughput
> bytes  Bytes  bytes    bytes   secs.    10^6bits/sec
>
> 262142 262142 65536    65536   10.00    1837.52
> 524288 524288
>
> whether one will always get that with the paired
> TCP_STREAM/TCP_MAERTS test is an open question:
>
> s7:/home/raj/netperf2_trunk# for i in 1; do netperf -t TCP_STREAM -l 60 -H sbs133b1.west -P 0 & netperf -t TCP_MAERTS -l 60 -H sbs133b1.west -P 0 & done
> [1] 14619
> [2] 14620
> s7:/home/raj/netperf2_trunk#
> 87380 16384 16384    60.02     713.28
> 87380 16384 16384    60.01     874.86
>
> s7:/home/raj/netperf2_trunk# for i in 1; do netperf -t TCP_STREAM -l 120 -H sbs133b1.west -P 0 & netperf -t TCP_MAERTS -l 120 -H sbs133b1.west -P 0 & done
> [1] 14621
> [2] 14622
> s7:/home/raj/netperf2_trunk#
> 87380 16384 16384    120.03    626.89
> 87380 16384 16384    120.01    895.40
>
> (FWIW, one of the systems involved there is several (4 or 5?) years old
> now)
>
>
>
>
> Some of my checklist items when presented with assertions of poor
> network performance, in no particular order:
>
> *) Is *any one* CPU on either end of the transfer at or close to 100%
>    utilization? A given TCP connection cannot really take advantage of
>    more than the services of a single core in the system, so average CPU
>    utilization being low does not a priori mean things are OK.
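On a multi-core system, a sketch of checking the per-core (rather than
averaged) numbers without any extra tools:

  # one "cpuN" line per core; compare successive samples a few seconds apart
  grep '^cpu' /proc/stat

If mpstat from the sysstat package happens to be installed, "mpstat -P ALL 1"
shows the same per-core figures already rate-converted.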
>
> *) Are there TCP retransmissions being registered in netstat statistics
>    on the sending system? Take a snapshot of netstat -s -t from just
>    before the transfer, and one from just after and run it through
>    beforeafter from ftp://ftp.cup.hp.com/dist/networking/tools:
>
>    netstat -s -t > before
>    transfer or wait 60 or so seconds if the transfer was already going
>    netstat -s -t > after
>    beforeafter before after > delta
>
> *) Are there packet drops registered in ethtool -S statistics on either
>    side of the transfer? Take snapshots in a manner similar to that with
>    netstat.
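Concretely, a sketch of the same before/after technique with ethtool (eth0 is
a placeholder for whichever interface carries the test traffic):

  ethtool -S eth0 > ethtool.before
  # run, or let run, the transfer for a while
  ethtool -S eth0 > ethtool.after
  beforeafter ethtool.before ethtool.after > ethtool.delta

Counters with names like rx_dropped or rx_missed (the exact names are
driver-specific) growing in the delta point at the receiving NIC or driver.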
>
> *) Are there packet drops registered in the stats for the switch(es)
>    being traversed by the transfer? These would be retrieved via
>    switch-specific means.
>
> *) What is the latency between the two end points? Install netperf on
>    both sides, start netserver on one side, and on the other side run:
>
>    netperf -t TCP_RR -l 30 -H <remote>
>
>    and invert the transaction/s rate to get the RTT latency. There are
>    caveats involving NIC interrupt coalescing settings defaulting in
>    favor of throughput/CPU util over latency:
>
>    ftp://ftp.cup.hp.com/dist/networking/briefs/nic_latency_vs_tput.txt
>
>    but when the connections are over a WAN, latency is important and may
>    not be clouded as much by NIC settings.
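As a worked example of the inversion (numbers purely illustrative): a
single-transaction TCP_RR result of 5000 transactions per second implies a
round-trip time of roughly 1/5000 s, i.e. about 200 microseconds.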
>
> This all leads into:
>
> *) What is the *effective* TCP (or other) window size for the
>    connection? One limit to the performance of a TCP bulk transfer is:
>
>    Tput <= W(eff)/RTT
>
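Putting illustrative numbers into that limit: with an effective window of
65536 bytes and a round-trip time of 5 ms, Tput <= W(eff)/RTT works out to
(65536 * 8) bits / 0.005 s, or roughly 105 Mbit/s, no matter how fast the
links themselves are.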
> The effective window size will be the lesser of:
>
> a) the classic TCP window advertised by the receiver (the value in the
>    TCP header's window field shifted by the window scaling factor
>    exchanged during connection establishment - why one wants to get
>    traces including the connection establishment...)
>
>    this will depend on whether/what the receiving application has
>    requested via a setsockopt(SO_RCVBUF) call and the sysctl limits set
>    in the OS. If the application does not call setsockopt(SO_RCVBUF)
>    then the Linux stack will "autotune" the advertised window based on
>    other sysctl limits in the OS.
>
> b) the computed congestion window on the sender - this will be affected
>    by the packet loss rate over the connection, hence the interest in
>    the netstat and ethtool stats.
>
> c) the quantity of data to which the sending TCP can maintain a
>    reference while waiting for it to be ACKnowledged by the receiver -
>    this will be akin to the classic TCP window case above, but on the
>    sending side, and concerning setsockopt(SO_SNDBUF) and sysctl
>    settings.
>
> d) the quantity of data the sending application is willing/able to send
>    at any one time before waiting for some sort of application-level
>    acknowledgement. FTP and rcp will just blast all the data of the file
>    into the socket as fast as the socket will take it. scp has some
>    application-layer "windowing" which may cause it to put less data out
>    onto the connection than TCP might otherwise have permitted. NFS has
>    the maximum number of outstanding requests it will allow at one time
>    acting as a de facto "window", etc etc etc
>
>
>
>
>
>