From: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	jesse.brandeburg@intel.com,
	Stephen Hemminger <shemminger@vyatta.com>
Subject: Re: [RFC v1] hand off skb list to other cpu to submit to upper layer
Date: Wed, 25 Feb 2009 15:20:23 +0800	[thread overview]
Message-ID: <1235546423.2604.556.camel@ymzhang> (raw)
In-Reply-To: <20090225063656.GA32635@gondor.apana.org.au>

On Wed, 2009-02-25 at 14:36 +0800, Herbert Xu wrote:
> Zhang, Yanmin <yanmin_zhang@linux.intel.com> wrote:
> > Subject: hand off skb list to other cpu to submit to upper layer
> > From: Zhang Yanmin <yanmin.zhang@linux.intel.com>
> > 
> > Recently, I have been investigating an ip_forward performance issue with a 10G IXGBE NIC.
> > I run the test on 2 machines. Every machine has 2 10G NICs. The 1st machine sends
> > packets by pktgen. The 2nd receives the packets on one NIC and forwards them out
> > through the 2nd NIC. As the NICs support multi-queue, I bind the queues to different
> > logical cpus of different physical cpus while considering cache sharing carefully.
> > 
> > Compared with the sending speed on the 1st machine, the forwarding speed is not
> > good, only about 60% of the sending speed. The IXGBE driver starts NAPI when an
> > interrupt arrives. When ip_forward=1, the receiver collects a packet and forwards
> > it out immediately. So although the IXGBE collects packets with NAPI, the forwarding
> > has a big impact on collection. As the IXGBE runs very fast, it drops packets
> > quickly. The better way for the receiving cpu is to do nothing but collect packets.
> 
Thanks for your comments.

> This doesn't make sense.  With multiqueue RX, every core should be
> working to receive its fraction of the traffic and forwarding them
> out.
I never said the core can't receive and forward packets at the same time.
I mean the performance isn't good.

>   So you shouldn't have any idle cores to begin with.  The fact
> that you do means that multiqueue RX hasn't maximised its utility,
> so you should tackle that instead of trying to redirect traffic away
> from the cores that are receiving.
From Stephen's explanation, the packets are being sent with different SRC/DST address
pairs, by which the hardware delivers packets to different queues. We can't expect
the NIC to always distribute packets across the queues evenly.
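
To illustrate the point (a toy userspace sketch, not the real RSS hash the
hardware uses): when a hash of the address pair picks the queue, nothing
stops several flows from landing on the same queue while others sit idle.

/* Toy model of hash-based RX queue selection (not the IXGBE RSS hash).
 * It only shows that hashing the SRC/DST pair gives no guarantee of an
 * even spread across queues. */
#include <stdio.h>
#include <stdint.h>

#define NUM_QUEUES 8

static unsigned pick_queue(uint32_t saddr, uint32_t daddr)
{
    /* Made-up multiplicative hash standing in for the NIC's hash. */
    uint32_t h = saddr * 2654435761u ^ daddr * 2246822519u;
    return h % NUM_QUEUES;
}

int main(void)
{
    unsigned hits[NUM_QUEUES] = { 0 };
    /* Four sample flows with made-up addresses. */
    uint32_t flows[][2] = {
        { 0x0a000001, 0x0a000002 },
        { 0x0a000003, 0x0a000004 },
        { 0x0a000005, 0x0a000006 },
        { 0x0a000007, 0x0a000008 },
    };

    for (unsigned i = 0; i < 4; i++)
        hits[pick_queue(flows[i][0], flows[i][1])]++;

    /* Nothing forces these counts to be equal per queue. */
    for (unsigned q = 0; q < NUM_QUEUES; q++)
        printf("queue %u: %u flow(s)\n", q, hits[q]);
    return 0;
}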

The behavior is that the IXGBE is very fast, and the cpu can't collect packets in
time if it collects packets and forwards them at the same time. That causes the
IXGBE to drop packets.
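
That is the motivation for the handoff in my patch: let the receiving cpu do
nothing but append skbs to a list, and let another cpu drain the list and do
the forwarding. Roughly (a userspace pthread analogue of the idea, not the
kernel patch itself; the list here is LIFO for brevity):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct pkt {
    int id;                  /* stands in for an skb */
    struct pkt *next;
};

static struct pkt *head;     /* handoff list, guarded by lock */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t avail = PTHREAD_COND_INITIALIZER;
static int done;

/* "Receiving cpu": collect packets and hand them off, nothing else. */
static void *rx_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < 16; i++) {
        struct pkt *p = malloc(sizeof(*p));
        p->id = i;
        pthread_mutex_lock(&lock);
        p->next = head;
        head = p;
        pthread_cond_signal(&avail);
        pthread_mutex_unlock(&lock);
    }
    pthread_mutex_lock(&lock);
    done = 1;
    pthread_cond_signal(&avail);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* "Forwarding cpu": grab the whole list in one shot, then process it
 * without holding the lock, so the receiver is never stalled. */
static void *fwd_thread(void *arg)
{
    (void)arg;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (!head && !done)
            pthread_cond_wait(&avail, &lock);
        struct pkt *batch = head;
        head = NULL;
        int stop = done;
        pthread_mutex_unlock(&lock);

        while (batch) {
            struct pkt *p = batch;
            batch = p->next;
            printf("forwarding packet %d\n", p->id);
            free(p);
        }
        if (stop)
            return NULL;
    }
}

int main(void)
{
    pthread_t rx, fwd;
    pthread_create(&rx, NULL, rx_thread, NULL);
    pthread_create(&fwd, NULL, fwd_thread, NULL);
    pthread_join(rx, NULL);
    pthread_join(fwd, NULL);
    return 0;
}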

> 
> Of course for NICs that don't support multiqueue RX, or where the
> number of RX queues is less than the number of cores, then a scheme
> like yours may be useful.
The IXGBE NIC does support a large number of RX queues. By default, the driver
creates CPU_NUM queues. But the performance is not good when we bind the queues
to all cpus evenly. One reason is cache misses/ping-pong. The forwarder machine
has 2 physical cpus and every cpu has 8 logical threads. All 8 logical cpus share
the last-level cache. In my ip_forward testing with pktgen, binding the queues to
the 8 logical cpus of one physical cpu gives about a 40% improvement over binding
them to all 16 logical cpus. So the optimized scenario only needs the IXGBE driver
to create 8 queues.
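
For reference, such a binding can be done by writing cpu masks to
/proc/irq/<N>/smp_affinity. A hypothetical helper (the IRQ numbers 256..263
are made up; the real ones come from /proc/interrupts):

#include <stdio.h>

int main(void)
{
    /* Assume RX queues 0..7 use IRQs 256..263 and the first physical
     * package holds logical cpus 0..7, i.e. masks 0x01 .. 0x80. */
    for (int q = 0; q < 8; q++) {
        char path[64];
        snprintf(path, sizeof(path), "/proc/irq/%d/smp_affinity", 256 + q);

        FILE *f = fopen(path, "w");
        if (!f) {
            perror(path);   /* needs root; IRQ numbers are assumptions */
            continue;
        }
        fprintf(f, "%x\n", 1u << q);  /* one queue per logical cpu */
        fclose(f);
    }
    return 0;
}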

If a machine has a couple of NICs and every NIC has CPU_NUM queues, binding them
evenly might cause even more cache misses/ping-pong. I didn't test the
multiple-receiving-NIC scenario as I couldn't get enough hardware.

Yanmin


