netdev.vger.kernel.org archive mirror
From: Rick Jones <rick.jones2@hp.com>
To: Breno Leitao <leitao@linux.vnet.ibm.com>
Cc: Eric Dumazet <dada1@cosmosbay.com>,
	"Brandeburg, Jesse" <jesse.brandeburg@intel.com>,
	netdev@vger.kernel.org
Subject: Re: e1000 performance issue in 4 simultaneous links
Date: Fri, 11 Jan 2008 10:48:07 -0800
Message-ID: <4787B9E7.6040001@hp.com>
In-Reply-To: <1200075581.9349.33.camel@cafe>

Breno Leitao wrote:
> On Fri, 2008-01-11 at 17:48 +0100, Eric Dumazet wrote:
> 
>>Breno Leitao wrote:
>>
>>>Take a look at the interrupt table this time: 
>>>
>>>io-dolphins:~/leitao # cat /proc/interrupts  | grep eth[1]*[67]
>>>277:         15    1362450         13         14         13         14         15         18   XICS      Level     eth6
>>>278:         12         13    1348681         19         13         15         10         11   XICS      Level     eth7
>>>323:         11         18         17    1348426         18         11         11         13   XICS      Level     eth16
>>>324:         12         16         11         19    1402709         13         14         11   XICS      Level     eth17
>>>
>>>
>>>  
>>
>>If your machine has 8 CPUs, then your vmstat output shows a bottleneck :)
>>
>>(100/8 = 12.5), so I guess one of your CPUs is full
> 
> 
> Well, if I run top while running the test, I see this load distributed
> among the CPUs, mainly those that have a NIC IRQ bound. Take a look:
> 
> Tasks: 133 total,   2 running, 130 sleeping,   0 stopped,   1 zombie
> Cpu0  :  0.3%us, 19.5%sy,  0.0%ni, 73.5%id,  0.0%wa,  0.0%hi,  0.0%si,  6.6%st
> Cpu1  :  0.0%us,  0.0%sy,  0.0%ni, 75.1%id,  0.0%wa,  0.7%hi, 24.3%si,  0.0%st
> Cpu2  :  0.0%us,  0.0%sy,  0.0%ni, 73.1%id,  0.0%wa,  0.7%hi, 26.2%si,  0.0%st
> Cpu3  :  0.0%us,  0.0%sy,  0.0%ni, 76.1%id,  0.0%wa,  0.7%hi, 23.3%si,  0.0%st
> Cpu4  :  0.0%us,  0.3%sy,  0.0%ni, 70.4%id,  0.7%wa,  0.3%hi, 28.2%si,  0.0%st
> Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Cpu6  :  0.0%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
> Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
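The 100/8 = 12.5 arithmetic can be checked against the top output
above.  What follows is only a minimal Python sketch: it assumes the
"CpuN : ... %id" line format shown in this message, and the names
TOP_SAMPLE, busy_per_cpu and aggregate_busy are made up for
illustration.  It computes the non-idle share of each CPU and the
machine-wide average that a tool such as vmstat would report.

```python
import re

# Per-CPU lines copied from the top output quoted in this message.
TOP_SAMPLE = """\
Cpu0  :  0.3%us, 19.5%sy,  0.0%ni, 73.5%id,  0.0%wa,  0.0%hi,  0.0%si,  6.6%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni, 75.1%id,  0.0%wa,  0.7%hi, 24.3%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni, 73.1%id,  0.0%wa,  0.7%hi, 26.2%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni, 76.1%id,  0.0%wa,  0.7%hi, 23.3%si,  0.0%st
Cpu4  :  0.0%us,  0.3%sy,  0.0%ni, 70.4%id,  0.7%wa,  0.3%hi, 28.2%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni, 99.7%id,  0.0%wa,  0.0%hi,  0.3%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
"""


def busy_per_cpu(top_text):
    """Map CPU number -> percent not idle, taken from top's per-CPU lines."""
    busy = {}
    for line in top_text.splitlines():
        match = re.match(r"Cpu(\d+)\s*:.*?([\d.]+)%id", line)
        if match:
            busy[int(match.group(1))] = 100.0 - float(match.group(2))
    return busy


def aggregate_busy(busy):
    """Average busy percentage across all CPUs (the machine-wide view)."""
    return sum(busy.values()) / len(busy)


busy = busy_per_cpu(TOP_SAMPLE)
# One fully saturated CPU out of N contributes only 100/N to the average.
one_cpu_share = 100.0 / len(busy)
print(busy)
print(aggregate_busy(busy), one_cpu_share)
```

The point of the comparison is that a low machine-wide average can
hide a single busy CPU, so the per-CPU numbers are the ones to look at.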

If you have IRQs bound to CPUs 1-4, and have four netperfs running, 
given that the stack ostensibly tries to have applications run on the 
same CPUs that service their interrupts, what is running on CPU0?

Is it related to:

>   The two-interface test that I showed in my first email was run on two
> different NICs. Also, I am running netperf with the following command
> "netperf -H <hostname> -T 0,8" while netserver is running without any
> argument at all. Also, running vmstat in parallel shows that there is no
> bottleneck in the CPU. Take a look: 

Unless you have a morbid curiosity :) there isn't much point in binding 
all the netperfs to CPU 0 when the interrupts for the NICs servicing 
their connections are on CPUs 1-4.  I also assume then that the 
system(s) on which netserver is running have > 8 CPUs in them?  (There 
are multiple destination systems, yes?)

Does anything change if you explicitly bind each netperf to the CPU on 
which the interrupts for its connection are processed?  Or, for that 
matter, if you remove the -T option entirely?
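To make the binding suggestion concrete, here is a rough Python sketch,
not a tested recipe.  It takes the /proc/interrupts lines quoted
earlier in this message, picks the CPU with the highest count for each
NIC, and builds one netperf command line per NIC.  The host names
(host-a through host-d) and the function names are hypothetical, and
the "-T N," form (trailing comma, local side only) is my reading of the
netperf manual, so check it against your netperf version.

```python
# /proc/interrupts lines copied from the quoted output in this message.
INTERRUPTS_SAMPLE = """\
277:         15    1362450         13         14         13         14         15         18   XICS      Level     eth6
278:         12         13    1348681         19         13         15         10         11   XICS      Level     eth7
323:         11         18         17    1348426         18         11         11         13   XICS      Level     eth16
324:         12         16         11         19    1402709         13         14         11   XICS      Level     eth17
"""


def busiest_cpu_per_nic(interrupts_text, ncpus):
    """Map device name -> index of the CPU with the highest interrupt count."""
    result = {}
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Expect "IRQ:" + one count per CPU + at least a device name.
        if len(fields) < ncpus + 2 or not fields[0].endswith(":"):
            continue
        counts = [int(f) for f in fields[1:1 + ncpus]]
        device = fields[-1]
        result[device] = max(range(ncpus), key=counts.__getitem__)
    return result


def netperf_commands(nic_to_host, nic_to_cpu):
    """Build one netperf command per NIC, bound locally to that NIC's IRQ CPU."""
    return [
        f"netperf -H {nic_to_host[nic]} -T {nic_to_cpu[nic]},"
        for nic in nic_to_host
    ]


# Hypothetical mapping of local NIC -> remote netserver host.
NIC_TO_HOST = {"eth6": "host-a", "eth7": "host-b",
               "eth16": "host-c", "eth17": "host-d"}

nic_to_cpu = busiest_cpu_per_nic(INTERRUPTS_SAMPLE, ncpus=8)
for command in netperf_commands(NIC_TO_HOST, nic_to_cpu):
    print(command)
```

The sketch only prints the command lines; it does not run netperf or
touch the remote side's binding.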

Does UDP_STREAM show different performance than TCP_STREAM?  (I'm 
ass-u-me-ing, based on the above, that we are looking at the netperf 
side of a TCP_STREAM test; please correct me if otherwise.)

Are the CPUs above single-core CPUs or multi-core CPUs, and if 
multi-core are caches shared?  How are CPUs numbered if multi-core on 
that system?  Is there any hardware threading involved?  I'm wondering 
if there may be some wrinkles in the system that might lead to reported 
CPU utilization being low even if a chip is otherwise saturated.  Might 
need some HW counters to check that...
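One cheap way to start on the numbering and threading questions,
before reaching for HW counters, is to look at which logical CPUs the
kernel reports as thread siblings.  The Python sketch below assumes a
kernel that exposes
/sys/devices/system/cpu/cpuN/topology/thread_siblings_list (not every
kernel does); the function names are made up, and the file reader is
passed in so the logic can be exercised without a real sysfs.

```python
def parse_cpu_list(text):
    """Expand a kernel cpulist string such as '0-1' or '0,4-5' into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_groups(read_file, ncpus):
    """Return the distinct groups of logical CPUs that share a core."""
    groups = set()
    for cpu in range(ncpus):
        path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
        groups.add(tuple(parse_cpu_list(read_file(path))))
    return sorted(groups)


def read_sysfs(path):
    """Read one sysfs file; use this as read_file on a real system."""
    with open(path) as handle:
        return handle.read()


# Hypothetical layout for illustration only: 4 logical CPUs, 2 threads per core.
FAKE_SYSFS = {
    "/sys/devices/system/cpu/cpu0/topology/thread_siblings_list": "0-1\n",
    "/sys/devices/system/cpu/cpu1/topology/thread_siblings_list": "0-1\n",
    "/sys/devices/system/cpu/cpu2/topology/thread_siblings_list": "2-3\n",
    "/sys/devices/system/cpu/cpu3/topology/thread_siblings_list": "2-3\n",
}

print(sibling_groups(FAKE_SYSFS.__getitem__, ncpus=4))
```

On the real machine one would call sibling_groups(read_sysfs, 8).  If
two of the CPUs carrying NIC interrupts turn out to be siblings on one
core, their reported utilization deserves a second look.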

Can you describe the I/O subsystem more completely?  I understand that 
you are using at most two ports of a pair of quad-port cards at any one 
time, but am still curious to know if those two cards are on separate 
busses, or if they share any bus/link on the way to memory.

rick jones
