From: Alexander Duyck <alexander.h.duyck@intel.com>
To: Ben Hutchings <bhutchings@solarflare.com>
Cc: David Miller <davem@davemloft.net>,
	"therbert@google.com" <therbert@google.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Subject: Re: [RFC PATCH 0/3] Simplified 16 bit Toeplitz hash algorithm
Date: Mon, 03 Jan 2011 13:45:25 -0800
Message-ID: <4D224375.2040208@intel.com>
In-Reply-To: <1294085724.3167.202.camel@localhost>

On 1/3/2011 12:15 PM, Ben Hutchings wrote:
> On Mon, 2011-01-03 at 11:52 -0800, Alexander Duyck wrote:
>> On 1/3/2011 11:30 AM, Ben Hutchings wrote:
>>> On Mon, 2011-01-03 at 11:02 -0800, David Miller wrote:
>>>> From: Tom Herbert<therbert@google.com>
>>>> Date: Mon, 3 Jan 2011 10:47:20 -0800
>>>>
>>>>> I'm not sure why this would be needed.  What is the advantage in
>>>>> making the TX and RX queues match?
>>>>
>>>> That's how their hardware based RFS essentially works.
>>>>
>>>> Instead of watching for "I/O system calls" like we do in software, the
>>>> chip watches for which TX queue a flow ends up on and matches things
>>>> up on the receive side with the same numbered RX queue to match.
>>>
>>> ixgbe also implements IRQ affinity setting (or rather hinting) and TX
>>> queue selection by CPU, the inverse of IRQ affinity setting.  Together
>>> with the hardware/firmware Flow Director feature, this should indeed
>>> result in hardware RFS.  (However, irqbalanced does not yet follow the
>>> affinity hints AFAIK, so this requires some manual intervention.  Maybe
>>> the OOT driver is different?)
>>>
>>> The proposed change to make TX queue selection hash-based seems to be a
>>> step backwards.
>>>
>>> Ben.
>>>
>>
>> Actually this code would only be applied in the case where Flow Director
>> didn't apply such as non-TCP frames.  It would essentially guarantee
>> that we end up with TX/RX on the same CPU for all cases instead of just
>> when Flow Director matches a given flow.
>
> The code you posted doesn't seem to implement that, though.

Actually it does; it only takes effect when Flow Director isn't
enabled.  I implemented it as an ndo_select_queue.  In the igb example
I applied it directly, and in the ixgbe example I added it to the end
of the ndo_select_queue function that it already had.
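
As a rough user-space illustration of that fallback ordering (not the
driver code from the patches), the selection logic could be sketched as
below.  The struct, its fields, and the function name are all
hypothetical, and the way the 16 bit hash is scaled to a queue index is
only a placeholder; a real driver would have to mirror whatever mapping
the NIC uses on the receive side.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-adapter driver state. */
struct mock_adapter {
	bool fdir_enabled;          /* assumption: Flow Director is active */
	unsigned int num_tx_queues; /* number of TX queues, must be > 0 */
};

/*
 * Pick a TX queue.  With Flow Director enabled, keep the existing
 * CPU-based choice; otherwise fall back to the flow hash so that TX
 * and RX for a flow can land on the same numbered queue.
 */
static unsigned int mock_select_queue(const struct mock_adapter *a,
				      unsigned int cpu, uint16_t flow_hash)
{
	if (a->fdir_enabled)
		return cpu % a->num_tx_queues;

	/* Scale the 16 bit hash into [0, num_tx_queues). */
	return (unsigned int)(((uint32_t)flow_hash * a->num_tx_queues) >> 16);
}
```
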

>
>> The general idea is to at least keep the traffic local to one TX/RX
>> queue pair so that if we cannot match the queue pair to the application,
>> perhaps the application can be affinitized to match up with the queue
>> pair.  Otherwise we end up with traffic getting routed to one TX queue
>> on one CPU, and the RX being routed to another queue on perhaps a
>> different CPU and it becomes quite difficult to match up the queues and
>> the applications.
>
> Right.  That certainly seems like a Good Thing, though I believe it can
> be implemented generically by recording the RX queue number on the
> socket:
>
> http://article.gmane.org/gmane.linux.network/158477

That was one of the reasons I put this chunk of code out there as an
RFC: I didn't see anywhere it really fit in.  I wasn't sure whether
anyone had a use for it, but I didn't see much point in keeping it to
myself, so I submitted it to see if there was any interest.

>> Since the approach is based on Toeplitz it can be applied to all
>> hardware capable of generating a Toeplitz based hash and as a result it
>> would likely also work in a much more vendor neutral kind of way than
>> Flow Director currently does.
>
> Which I appreciate, but I'm not convinced that weakening Toeplitz is a
> good way to do it.
>
> I understand that Robert Watson (FreeBSD hacker) has been doing some
> research on the security and performance implications of flow hashing
> algorithms, though I haven't seen any results of that yet.
>
> Ben.
>

I wasn't really sure about it either, but from what I can tell Toeplitz
is pretty weak in the first place, especially with a static key, and it
is really hard to do efficiently in software with a full 40 byte key.

The advantages of the 16 bit key were that I could do the hash
computation with little CPU overhead, and that the hash result is
symmetric, so I didn't have to mess with source and destination field
ordering to generate the TX hash.  Since most of the hardware I am
familiar with doesn't support more than 128 queues anyway, the 16 bit
hash input and result generated via this approach should be more than
enough to handle the queue selection and distribution needs of the
hardware, which was my only real concern.
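
To illustrate why a repeating 16 bit key makes the hash both cheap and
symmetric, here is a minimal user-space sketch.  It is not the code from
the RFC patches; the function names are made up for illustration.  It
assumes the Toeplitz key is a single 16 bit value repeated, so the key
window at input bit i depends only on i mod 16.  Because the hash is
linear over XOR, the whole input can first be XOR-folded into one 16 bit
word, and swapping source and destination fields (each a whole number of
16 bit words) cannot change the result.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate a 16 bit value left by n bits (n taken modulo 16). */
static uint16_t rol16(uint16_t v, unsigned int n)
{
	n &= 15;
	return (uint16_t)((v << n) | (v >> ((16 - n) & 15)));
}

/*
 * Reference version: bit-by-bit Toeplitz where the key is one 16 bit
 * value repeated.  Input bits are consumed MSB first; the key window
 * for bit i is the key rotated left by (i mod 16).
 */
static uint16_t toeplitz16_ref(uint16_t key, const uint8_t *data, size_t len)
{
	uint16_t result = 0;
	size_t i;

	for (i = 0; i < len * 8; i++) {
		if (data[i / 8] & (0x80 >> (i % 8)))
			result ^= rol16(key, (unsigned int)(i % 16));
	}
	return result;
}

/*
 * Folded version: XOR all big-endian 16 bit words of the input
 * together, then hash only the 16 folded bits.
 */
static uint16_t toeplitz16_fold(uint16_t key, const uint8_t *data, size_t len)
{
	uint16_t folded = 0, result = 0;
	unsigned int bit;
	size_t i;

	for (i = 0; i + 1 < len; i += 2)
		folded ^= (uint16_t)((data[i] << 8) | data[i + 1]);
	if (len & 1)	/* odd trailing byte sits in the high half */
		folded ^= (uint16_t)(data[len - 1] << 8);

	for (bit = 0; bit < 16; bit++) {
		if (folded & (0x8000 >> bit))
			result ^= rol16(key, bit);
	}
	return result;
}
```

Feeding the folded version a tuple laid out as source address,
destination address, source port, destination port gives the same value
as the tuple with the source and destination fields swapped, which is
the property that lets the TX side skip any field reordering.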

Thanks for the input,

Alex


Thread overview:
2010-12-18  1:00 [RFC PATCH 0/3] Simplified 16 bit Toeplitz hash algorithm Alexander Duyck
2010-12-18  1:00 ` [RFC PATCH 1/3] net: add simplified 16 bit Toeplitz hash function for transmit side hashing Alexander Duyck
2010-12-18  1:00 ` [RFC PATCH 2/3] ixgbe: example of how to update ixgbe to make use of in-kernel Toeplitz hash Alexander Duyck
2010-12-18  1:00 ` [RFC PATCH 3/3] igb: example of how to update igb to make use of in-kernel Toeplitz hashing Alexander Duyck
2010-12-18  5:09   ` David Miller
2010-12-18  6:53     ` Alexander Duyck
2010-12-18  6:59       ` David Miller
2011-01-03 18:47 ` [RFC PATCH 0/3] Simplified 16 bit Toeplitz hash algorithm Tom Herbert
2011-01-03 19:00   ` Alexander Duyck
2011-01-03 19:02   ` David Miller
2011-01-03 19:30     ` Ben Hutchings
2011-01-03 19:52       ` Alexander Duyck
2011-01-03 19:54         ` David Miller
2011-01-03 20:15         ` Ben Hutchings
2011-01-03 21:45           ` Alexander Duyck [this message]
2011-01-04  3:25           ` Tom Herbert
2011-01-04 15:43             ` Ben Hutchings