From: Ben Hutchings <bhutchings@solarflare.com>
To: Tom Herbert <therbert@google.com>
Cc: David Miller <davem@davemloft.net>,
yanmin_zhang@linux.intel.com, andi@firstfloor.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
herbert@gondor.apana.org.au, jesse.brandeburg@intel.com,
shemminger@vyatta.com
Subject: Re: [RFC v2: Patch 1/3] net: hand off skb list to other cpu to submit to upper layer
Date: Fri, 13 Mar 2009 22:10:59 +0000
Message-ID: <1236982259.3300.13.camel@achroite>
In-Reply-To: <65634d660903131401v24d0b5aarec36ad95220ba201@mail.gmail.com>
On Fri, 2009-03-13 at 14:01 -0700, Tom Herbert wrote:
> On Fri, Mar 13, 2009 at 11:51 AM, David Miller <davem@davemloft.net> wrote:
> >
> > From: Tom Herbert <therbert@google.com>
> > Date: Fri, 13 Mar 2009 10:06:56 -0700
> >
> > > You'll definitely want to look at the hardware provided hash. We've
> > > been using a 10G NIC which provides a Toeplitz hash (the one defined
> > > by Microsoft) and a software RSS-like capability to move packets from
> > > an interrupting CPU to another for processing. The hash could be used
> > > to index to a set of CPUs, but we also use the hash as a connection
> > > identifier to key into a lookup table to steer packets to the CPU
> > > where the application is running based on the running CPU of the last
> > > recvmsg. Using the device provided hash in this manner is a HUGE win,
> > > as opposed to taking cache misses to get 4-tuple from packet itself to
> > > compute a hash. I posted some patches a while back on our work if
> > > you're interested.
> >
> > I never understood this.
> >
> > If you don't let the APIC move the interrupt around, the individual
> > MSI-X interrupts will steer packets to individual specific CPUS and as
> > a result the scheduler will migrate tasks over to those cpus since the
> > wakeup events keep occurring there.
>
> We are trying to follow the scheduler's decisions as opposed to leading
> them. This works on very loaded systems, with applications binding to
> cpusets, with threads that are receiving on multiple sockets. I
> suppose it might be compelling if a NIC could steer packets per flow,
> instead of by a hash...
Depending on the NIC, RX queue selection may be done using a large
number of bits of the hash value and an indirection table or by matching
against specific values in the headers. The SFC4000 supports both of
these, though limited to TCP/IPv4 and UDP/IPv4. I think Neptune may be
more flexible. Of course, both indirection table entries and filter
table entries will be limited resources in any NIC, so allocating these
wholly automatically is an interesting challenge.
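To make the two mechanisms concrete, here is a rough sketch in Python of how
a NIC might pick an RX queue. This is not the SFC4000's or Neptune's actual
layout: the table size, the number of hash bits used, the names (Flow,
FilterTable, select_rx_queue) and the rule that an exact-match filter takes
precedence over the hash are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Optional

# Assumed sizes, chosen only for illustration.
INDIR_BITS = 7                  # low bits of the hash used as table index
INDIR_SIZE = 1 << INDIR_BITS    # 128 indirection entries


class Flow(NamedTuple):
    """Header fields a filter can match on (TCP/IPv4 or UDP/IPv4 only)."""
    proto: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class FilterTable:
    """Exact-match filters; capacity models the limited hardware table."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: Dict[Flow, int] = {}

    def add(self, flow: Flow, queue: int) -> bool:
        # Refuse new flows once the table is full; updates still succeed.
        if flow not in self.entries and len(self.entries) >= self.capacity:
            return False
        self.entries[flow] = queue
        return True

    def lookup(self, flow: Flow) -> Optional[int]:
        return self.entries.get(flow)


def make_indirection_table(num_queues: int) -> List[int]:
    """Spread the indirection entries round-robin over the RX queues."""
    return [i % num_queues for i in range(INDIR_SIZE)]


def select_rx_queue(flow: Flow, rx_hash: int,
                    filters: FilterTable, indir: List[int]) -> int:
    """Prefer an exact header match, else fall back to hash + indirection."""
    queue = filters.lookup(flow)
    if queue is not None:
        return queue
    return indir[rx_hash & (INDIR_SIZE - 1)]
```

Steering one flow to a chosen CPU's queue costs a filter entry, while
rewriting an indirection entry moves every flow that hashes to it, which is
why allocating either automatically is awkward.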
Ben.
--
Ben Hutchings, Senior Software Engineer, Solarflare Communications
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.