netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: David Lamparter <equinox@diac24.net>
To: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: David Lamparter <equinox@diac24.net>,
	netdev@vger.kernel.org, amine.kherbouche@6wind.com,
	roopa@cumulusnetworks.com, stephen@networkplumber.org,
	"bridge@lists.linux-foundation.org"
	<bridge@lists.linux-foundation.org>
Subject: Re: [PATCH 1/6] bridge: learn dst metadata in FDB
Date: Thu, 17 Aug 2017 14:45:30 +0200	[thread overview]
Message-ID: <20170817124530.GN773745@eidolon> (raw)
In-Reply-To: <4d4c8d17-1e0b-b78f-983b-e419e153b2bf@cumulusnetworks.com>

On Thu, Aug 17, 2017 at 02:39:43PM +0300, Nikolay Aleksandrov wrote:
> On 17/08/17 14:03, David Lamparter wrote:
> > On Wed, Aug 16, 2017 at 11:38:06PM +0300, Nikolay Aleksandrov wrote:
> >> On 16/08/17 20:01, David Lamparter wrote:
> >> and hitting the fast path for everyone in a few different places for a
> >> feature that the majority will not use does not sound acceptable to
> >> me. We've been trying hard to optimize it, trying to avoid additional
> >> cache lines, removing tests and keeping special cases to a minimum. 
> > 
> > skb->dst is on the same cacheline as skb->len.
> > fdb->md_dst is on the same cacheline as fdb->dst.
> > Both will be 0 in a lot of cases, so this should be two null checks on
> > data that is hot in the cache.  Are you sure this is an actual problem?
> > 
> 
> Sure - no, I haven't benchmarked it, but I don't see skb->len being on
> the same cache line as _skb_refdst assuming 64 byte cache lines.
> But again any special cases, in my opinion, should be handled on their own,
> it is both about the fast path and the code complexity that they bring in.

(separate thread)

[cut]
> > I really hope you're not suggesting the entire MDB with IPv4 & IPv6
> > snooping be duplicated into both VPLS and mac80211?
> 
> Code can always be shared if there are more users, no need to stuff
> everything in the bridge,

The MDB code is far from trivial, has several configuration knobs, and
even sends its own queries if configured to do so.  It can also use
quite a bit of memory of there's a nontrivial number of multicast
groups.  I *really* think it shouldn't be duplicated.

> but I'm not that familiar with this case, once patches are out I can
> comment further.

I've pushed my hacks to:
https://github.com/eqvinox/vpls-linux-kernel/commits/mdb-hack
(top two commits)

THIS IS ABSOLUTELY A PROOF OF CONCEPT.  It doesn't un-learn dst
metadata, it probably leaks buckets, and it may kill your cat.
(I haven't pushed my attempts at mac80211, because I haven't gotten
anywhere useful there just yet.)

> >> As you've noted this is only an RFC so I will not point out every issue, but there seems
> >> to be a major problem with br_fdb_update(), note that it runs without any locks except RCU.
> > 
> > Right, Thanks! ... I only thought about concurrent access, forgetting
> > about concurrent modification...  I'll replace it with an xchg I think.
> > (No need for a lock that way)
> 
> I think you can still lose references to a dst that way, what if someone changes the
> dst you read before the xchg and you xchg it ?

The dst to be released is the return from (atomic) xchg, not the value
read earlier for comparison.  This can happen in parallel, but apart
from a little extra churn in the update case it has no ill effects.

If someone changes it in the meantime, they have new dst information for
the fdb entry, and so do we.  With xchg'ing it, either one of the
updates will stick and the other will be properly released.  Considering
that there is no correct ordering here (either packet could be processed
a nanosecond later or earlier), this is perfectly fine as an outcome.


-David

  parent reply	other threads:[~2017-08-17 12:45 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-08-16 17:01 [RFC net-next] VPLS support David Lamparter
2017-08-16 17:01 ` [PATCH 1/6] bridge: learn dst metadata in FDB David Lamparter
2017-08-16 20:38   ` Nikolay Aleksandrov
2017-08-17 11:03     ` David Lamparter
2017-08-17 11:39       ` Nikolay Aleksandrov
2017-08-17 11:51         ` Nikolay Aleksandrov
2017-08-17 12:10           ` David Lamparter
2017-08-17 12:19             ` Nikolay Aleksandrov
2017-08-17 12:20             ` David Lamparter
2017-08-17 12:45         ` David Lamparter [this message]
2017-08-17 13:04           ` Nikolay Aleksandrov
2017-08-17 16:16     ` David Lamparter
2017-08-16 17:01 ` [PATCH 2/6] mpls: split forwarding path on rx/tx boundary David Lamparter
2017-08-19 17:10   ` kbuild test robot
2017-08-19 17:42   ` kbuild test robot
2017-08-16 17:01 ` [PATCH 3/6] mpls: add VPLS entry points David Lamparter
2017-08-19 18:27   ` kbuild test robot
2017-08-21 14:01   ` Amine Kherbouche
2017-08-21 15:55     ` David Lamparter
2017-08-21 16:13       ` Amine Kherbouche
2017-08-16 17:02 ` [PATCH 4/6] mpls: VPLS support David Lamparter
2017-08-21 15:14   ` Amine Kherbouche
2017-08-21 16:18     ` David Lamparter
2017-08-21 16:11   ` Amine Kherbouche
2017-08-16 17:02 ` [PATCH 5/6] bridge: add VPLS pseudowire info in fdb dump David Lamparter
2017-08-16 17:02 ` [PATCH 6/6] mpls: pseudowire control word support David Lamparter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170817124530.GN773745@eidolon \
    --to=equinox@diac24.net \
    --cc=amine.kherbouche@6wind.com \
    --cc=bridge@lists.linux-foundation.org \
    --cc=netdev@vger.kernel.org \
    --cc=nikolay@cumulusnetworks.com \
    --cc=roopa@cumulusnetworks.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).