From: Ben Hutchings <bhutchings@solarflare.com>
To: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>,
	netdev@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	herbert@gondor.apana.org.au, jesse.brandeburg@intel.com,
	shemminger@vyatta.com, David Miller <davem@davemloft.net>
Subject: Re: [RFC v2: Patch 1/3] net: hand off skb list to other cpu to submit to upper layer
Date: Thu, 12 Mar 2009 14:08:26 +0000
Message-ID: <1236866906.3221.11.camel@achroite>
In-Reply-To: <1236845792.2567.484.camel@ymzhang>

On Thu, 2009-03-12 at 16:16 +0800, Zhang, Yanmin wrote:
> On Wed, 2009-03-11 at 12:13 +0100, Andi Kleen wrote:
[...]
> >  and just use the hash function on the
> > NIC.
> Sorry. I can't understand what the hash function of NIC is. Perhaps NIC hardware has something
> like hash function to decide the RX queue number based on SRC/DST?

Yes, that's exactly what they do.  This feature is sometimes called
Receive-Side Scaling (RSS), which is Microsoft's name for it.  Microsoft
requires Windows drivers performing RSS to provide the hash value to the
networking stack, so Linux drivers for the same hardware should be able
to do so too.
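
As a rough illustration of the idea (not any vendor's implementation),
here is a minimal C sketch: a Toeplitz hash, the function the RSS
specification uses, computed over the IPv4 addresses and ports and then
reduced to an RX queue index.  The struct flow_tuple, the 16-byte
rss_key and all helper names are hypothetical; real hardware is
programmed with a longer key and usually maps the hash through an
indirection table rather than a plain modulo.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key: IPv4 addresses and TCP/UDP ports in host order. */
struct flow_tuple {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
};

/* Arbitrary example key (assumption); it must be at least 4 bytes longer
 * than the hashed input, which is 12 bytes here. */
static const uint8_t rss_key[16] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
};

/* Toeplitz hash: for every set input bit (MSB first), XOR in the 32-bit
 * window of the key that starts at the same bit offset.  key_len >= 4. */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
			      const uint8_t *data, size_t data_len)
{
	uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
			  ((uint32_t)key[2] << 8) | (uint32_t)key[3];
	uint32_t hash = 0;
	size_t next_bit = 32;	/* index of the next key bit to shift in */

	for (size_t i = 0; i < data_len; i++) {
		for (int b = 7; b >= 0; b--) {
			if (data[i] & (1u << b))
				hash ^= window;
			window <<= 1;
			if (next_bit / 8 < key_len &&
			    (key[next_bit / 8] & (0x80u >> (next_bit % 8))))
				window |= 1;
			next_bit++;
		}
	}
	return hash;
}

/* Serialise the tuple in network byte order and hash it. */
static uint32_t flow_hash(const struct flow_tuple *ft)
{
	uint8_t in[12] = {
		(uint8_t)(ft->saddr >> 24), (uint8_t)(ft->saddr >> 16),
		(uint8_t)(ft->saddr >> 8),  (uint8_t)ft->saddr,
		(uint8_t)(ft->daddr >> 24), (uint8_t)(ft->daddr >> 16),
		(uint8_t)(ft->daddr >> 8),  (uint8_t)ft->daddr,
		(uint8_t)(ft->sport >> 8),  (uint8_t)ft->sport,
		(uint8_t)(ft->dport >> 8),  (uint8_t)ft->dport,
	};

	return toeplitz_hash(rss_key, sizeof(rss_key), in, sizeof(in));
}

/* Simplification: plain modulo instead of an indirection table. */
static unsigned int rx_queue_for_flow(const struct flow_tuple *ft,
				      unsigned int num_rx_queues)
{
	return flow_hash(ft) % num_rx_queues;
}
```

All packets of one flow produce the same hash, so they land on the same
RX queue, and the driver can pass that hash up the stack instead of
having software recompute it.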

> >  Have you considered this for forwarding too?
> Yes. originally, I plan to add a tx_num under the same sysfs directory, so admin could
> define that all packets received from a RX queue should be sent out from a specific TX queue.

The choice of TX queue can be based on the RX hash so that configuration
is usually unnecessary.
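
One possible way to do that, sketched in C under the assumption that
the driver has stored the 32-bit RX hash with the packet: scale the
hash into the range of TX queues with a multiply and shift.  The
function name is hypothetical and this is only an illustration, not
code from any existing driver.

```c
#include <stdint.h>

/* Map a 32-bit RX hash onto [0, num_tx_queues) without a division.
 * Equal hashes always yield the same TX queue, so packets of one flow
 * are not spread across queues. */
static unsigned int tx_queue_from_rx_hash(uint32_t rx_hash,
					  unsigned int num_tx_queues)
{
	return (unsigned int)(((uint64_t)rx_hash * num_tx_queues) >> 32);
}
```

Since the mapping depends only on the hash and the queue count, a
forwarded stream stays on one TX queue and is not reordered by the
queue selection itself.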

> So struct sk_buff->queue_mapping would be a union of 2 sub-members, rx_num and tx_num. But
> sk_buff->queue_mapping is just a u16 which is a small type. We might use the most-significant
> bit of sk_buff->queue_mapping as a flag as rx_num and tx_num wouldn't exist at the
> same time.
> 
> >  The trick here would
> > be to try to avoid reordering inside streams as far as possible,
> It's not to solve reorder issue. The start point is 10G NIC is very fast. We need some cpu
> work on packet receiving dedicately. If they work on other things, NIC might drop packets
> quickly.

Aggressive power-saving causes far greater latency than context
switching under Linux.  I believe most 10G NICs have large RX FIFOs to
mitigate this.  Ethernet flow control also helps to prevent
packet loss.

> The sysfs interface is just to facilitate NIC drivers. If there is no the sysfs interface,
> driver developers need implement it with parameters which are painful.
[...]

Or through the ethtool API, which already has some multiqueue control
operations.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.