From: Ben Hutchings <bhutchings@solarflare.com>
To: Rusty Russell <rusty@rustcorp.com.au>
Cc: krkumar2@in.ibm.com, kvm@vger.kernel.org, mst@redhat.com,
    netdev@vger.kernel.org,
    virtualization@lists.linux-foundation.org,
    levinsasha928@gmail.com
Subject: Re: [net-next RFC PATCH 0/5] Series short description
Date: Thu, 15 Dec 2011 01:36:44 +0000
Message-ID: <1323913005.2753.47.camel@bwh-desktop>
In-Reply-To: <87r50efgza.fsf@rustcorp.com.au>

On Fri, 2011-12-09 at 16:01 +1030, Rusty Russell wrote:
> On Wed, 7 Dec 2011 17:02:04 +0000, Ben Hutchings <bhutchings@solarflare.com> wrote:
> > Solarflare controllers (sfc driver) have 8192 perfect filters for
> > TCP/IPv4 and UDP/IPv4 which can be used for flow steering. (The filters
> > are organised as a hash table, but matched based on 5-tuples.) I
> > implemented the 'accelerated RFS' interface in this driver.
> >
> > I believe the Intel 82599 controllers (ixgbe driver) have both
> > hash-based and perfect filter modes and the driver can be configured to
> > use one or the other. The driver has its own independent mechanism for
> > steering RX and TX flows which predates RFS; I don't know whether it
> > uses hash-based or perfect filters.
>
> Thanks for this summary (and Jason, too). I've fallen a long way behind
> NIC state-of-the-art.
>
> > Most multi-queue controllers could support a kind of hash-based
> > filtering for TCP/IP by adjusting the RSS indirection table. However,
> > this table is usually quite small (64-256 entries). This means that
> > hash collisions will be quite common and this can result in reordering.
> > The same applies to the small table Jason has proposed for virtio-net.
>
> But this happens on real hardware today. Better that real hardware is
> nice, but is it overkill?

What do you mean, it happens on real hardware today? So far as I know,
the only cases where we have dynamic adjustment of flow steering are in
ixgbe (big table of hash filters, I think) and sfc (perfect filters).
I don't think that anyone's currently doing flow steering with the RSS
indirection table. (At least, not on Linux. I think that Microsoft was
intending to do so on Windows, but I don't know whether they ever did.)

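To illustrate why steering through the RSS indirection table is coarse, here is a minimal sketch of such a table. It is a simplified model, not code from any driver: the table size of 128 is an assumption within the 64-256 range quoted above, and the names rss_table, rss_init, rss_lookup and rss_steer are hypothetical. Every flow whose hash lands on the same entry shares a queue, so rewriting one entry to move one flow moves all of its collision partners too.

```c
#include <assert.h>
#include <stdint.h>

#define RSS_TABLE_SIZE 128u  /* assumed size, within the 64-256 range */

/* Simplified model of an RSS indirection table: entry -> RX queue. */
struct rss_table {
    uint8_t queue[RSS_TABLE_SIZE];
};

/* Spread the entries round-robin over the available RX queues. */
static void rss_init(struct rss_table *t, uint8_t num_queues)
{
    assert(num_queues > 0);
    for (uint32_t i = 0; i < RSS_TABLE_SIZE; i++)
        t->queue[i] = (uint8_t)(i % num_queues);
}

/* The flow hash selects a table entry; the entry names the RX queue. */
static uint8_t rss_lookup(const struct rss_table *t, uint32_t flow_hash)
{
    return t->queue[flow_hash % RSS_TABLE_SIZE];
}

/*
 * "Steer" a flow by rewriting its table entry. Every other flow whose
 * hash maps to the same entry is moved as well.
 */
static void rss_steer(struct rss_table *t, uint32_t flow_hash, uint8_t queue)
{
    t->queue[flow_hash % RSS_TABLE_SIZE] = queue;
}
```

With only RSS_TABLE_SIZE entries, two flows whose hashes differ by a multiple of the table size always collide, and calling rss_steer() for one of them re-steers the other as a side effect.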
> And can't you reorder even with perfect matching, since prior packets
> will be on the old queue and more recent ones on the new queue? Does it
> discard or requeue old ones? Or am I missing a trick?

Yes, that is possible. RFS is careful to avoid such reordering by only
changing the steering of a flow when none of its packets can be in a
software receive queue. It is not generally possible to do the same for
hardware receive queues. However, when the first condition is met it is
likely that there won't be a whole lot of packets for that flow in the
hardware receive queue either. (But if there are, then I think as a
side-effect of commit 09994d1 RFS will repeatedly ask the driver to
steer the flow. Which isn't ideal.)

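The software-queue condition described above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the kernel's code: the struct and function names (backlog, flow, flow_is_drained, steer_packet) and the NO_CPU marker are hypothetical. The idea is that each per-CPU backlog keeps running enqueue and dequeue counters, each flow remembers the tail counter from its last enqueue, and the flow may only change CPU once the head counter has caught up with that value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NO_CPU UINT16_MAX  /* hypothetical marker: flow not yet steered */

/* Per-CPU software backlog counters (simplified model). */
struct backlog {
    uint32_t head;  /* packets dequeued (processed) so far */
    uint32_t tail;  /* packets enqueued so far */
};

/* Per-flow steering state (simplified model). */
struct flow {
    uint16_t cpu;        /* CPU the flow is currently steered to */
    uint32_t last_tail;  /* backlog tail right after the flow's last enqueue */
};

/* True if no packet of this flow can still sit in its current backlog. */
static bool flow_is_drained(const struct flow *f, const struct backlog *queues)
{
    if (f->cpu == NO_CPU)
        return true;
    /* Signed difference stays correct when the counters wrap around. */
    return (int32_t)(queues[f->cpu].head - f->last_tail) >= 0;
}

/*
 * Enqueue one packet of the flow. The flow moves to wanted_cpu only when
 * its old backlog has drained; otherwise it stays put to preserve order.
 * Returns the CPU actually used.
 */
static uint16_t steer_packet(struct flow *f, struct backlog *queues,
                             uint16_t wanted_cpu)
{
    assert(wanted_cpu != NO_CPU);
    if (f->cpu != wanted_cpu && flow_is_drained(f, queues))
        f->cpu = wanted_cpu;
    queues[f->cpu].tail++;
    f->last_tail = queues[f->cpu].tail;
    return f->cpu;
}
```

Note that this model only covers the software backlog. Nothing in it can see how many of the flow's packets are still in a hardware receive queue, which is the gap described above.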
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.