From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "David S. Miller" <davem@davemloft.net>,
netdev@vger.kernel.org, Stephen Hemminger <shemminger@vyatta.com>
Subject: Re: [PATCH 6/13] bridge: Add core IGMP snooping support
Date: Fri, 5 Mar 2010 21:06:56 -0800 [thread overview]
Message-ID: <20100306050656.GA6812@linux.vnet.ibm.com> (raw)
In-Reply-To: <20100306011718.GA12812@gondor.apana.org.au>
On Sat, Mar 06, 2010 at 09:17:18AM +0800, Herbert Xu wrote:
> On Fri, Mar 05, 2010 at 03:43:27PM -0800, Paul E. McKenney wrote:
> >
> > Cool!!! You use a pair of list_head structures, so that a given
> > element can be in both the old and the new hash table simultaneously.
> > Of course, an RCU grace period must elapse between consecutive resizings.
> > Which appears to be addressed.
>
> Thanks :) I will try to extend this to other existing hash tables
> where the number of updates can be limited like it is here.
>
> > The teardown needs an rcu_barrier_bh() rather than the current
> > synchronize_rcu_bh(), please see below.
>
> All the call_rcu_bh's are done under multicast_lock. The first
> check taken after taking the multicast_lock is whether we've
> started the tear-down. So where it currently calls synchronize()
> it should already be the case that no call_rcu_bh's are still
> running.
Agreed, but the callbacks registered by the call_rcu_bh() might run
at any time, possibly quite some time after the synchronize_rcu_bh()
completes. For example, the last call_rcu_bh() might register on
one CPU, and the synchronize_rcu_bh() on another CPU. Then there
is no guarantee that the call_rcu_bh()'s callback will execute before
the synchronize_rcu_bh() returns.
In contrast, rcu_barrier_bh() is guaranteed not to return until all
pending RCU-bh callbacks have executed.
> > Also, I don't see how the teardown code is preventing new readers from
> > finding the data structures before they are being passed to call_rcu_bh().
> > You can't safely start the RCU grace period until -after- all new readers
> > have been excluded. (But I could easily be missing something here.)
>
> I understand. However, AFAICS whatever it is that we are destroying
> is taken off the reader's visible data structure before call_rcu_bh.
> Do you have a particular case in mind where this is not the case?
I might simply have missed the operation that removed reader
visibility, looking again...
Ah, I see it. The "br->mdb = NULL" in br_multicast_stop() makes
it impossible for the readers to get to any of the data. Right?
If so, my confusion, you are right, this one is OK.
> > The br_multicast_del_pg() looks to need rcu_read_lock_bh() and
> > rcu_read_unlock_bh() around its loop, if I understand the pointer-walking
> > scheme correctly.
>
> Any function that modifies the data structure is done under the
> multicast_lock, including br_multicast_del_pg.
But spin_lock() does not take the place of rcu_read_lock_bh().
And so, in theory, the RCU-bh grace period could complete between
the time that br_multicast_del_pg() does its call_rcu_bh() and the
"*pp = p->next;" at the top of the next loop iteration. If so,
then br_multicast_free_pg()'s kfree() will possibly have clobbered
"p->next". Low probability, yes, but a long-running interrupt
could do the trick.
Or is there something I am missing that is preventing an RCU-bh
grace period from completing near the bottom of br_multicast_del_pg()'s
"for" loop?
> > Hmmm... Where is the read-side code? Wherever it is, it cannot safely
> > dereference the ->old pointer.
>
> Right, the old pointer is merely there to limit rehashings to one
> per window. So it isn't used by the read-side.
Good!
> The read-side is the data path (non-IGMP multicast packets). The
> sole entry point is br_mdb_get().
Hmmm... So the caller is responsible for rcu_read_lock_bh()?
Shouldn't the br_mdb_get() code path be using hlist_for_each_entry_rcu()
in __br_mdb_ip_get(), then? Or is something else going on here?
Thanx, Paul
> Cheers,
> --
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
next prev parent reply other threads:[~2010-03-06 5:06 UTC|newest]
Thread overview: 81+ messages / expand[flat|nested] mbox.gz Atom feed top
2010-02-26 15:34 [RFC] [1/13] bridge: Add IGMP snooping support Herbert Xu
2010-02-26 15:35 ` [PATCH 1/13] bridge: Do br_pass_frame_up after other ports Herbert Xu
2010-02-26 15:35 ` [PATCH 2/13] bridge: Allow tail-call on br_pass_frame_up Herbert Xu
2010-02-27 11:14 ` David Miller
2010-02-27 15:36 ` Herbert Xu
2010-02-26 15:35 ` [PATCH 3/13] bridge: Avoid unnecessary clone on forward path Herbert Xu
2010-02-26 15:35 ` [PATCH 4/13] bridge: Use BR_INPUT_SKB_CB on xmit path Herbert Xu
2010-02-26 15:35 ` [PATCH 5/13] bridge: Split may_deliver/deliver_clone out of br_flood Herbert Xu
2010-02-26 15:35 ` [PATCH 6/13] bridge: Add core IGMP snooping support Herbert Xu
2010-02-26 15:35 ` [PATCH 7/13] bridge: Add multicast forwarding functions Herbert Xu
2010-02-26 15:35 ` [PATCH 8/13] bridge: Add multicast start/stop hooks Herbert Xu
2010-02-26 15:35 ` [PATCH 9/13] bridge: Add multicast data-path hooks Herbert Xu
2010-02-26 15:35 ` [PATCH 10/13] bridge: Add multicast_router sysfs entries Herbert Xu
2010-02-27 0:42 ` Stephen Hemminger
2010-02-27 11:29 ` David Miller
2010-02-27 15:53 ` Herbert Xu
2010-03-09 12:25 ` Herbert Xu
2010-03-09 12:26 ` Herbert Xu
2010-02-26 15:35 ` [PATCH 11/13] bridge: Add multicast_snooping sysfs toggle Herbert Xu
2010-02-26 15:35 ` [PATCH 12/13] bridge: Add hash elasticity/max sysfs entries Herbert Xu
2010-02-26 15:35 ` [PATCH 13/13] bridge: Add multicast count/interval " Herbert Xu
2010-02-28 5:40 ` [1/13] bridge: Add IGMP snooping support Herbert Xu
2010-02-28 5:41 ` [PATCH 1/13] bridge: Do br_pass_frame_up after other ports Herbert Xu
2010-02-28 5:41 ` [PATCH 2/13] bridge: Allow tail-call on br_pass_frame_up Herbert Xu
2010-02-28 5:41 ` [PATCH 3/13] bridge: Avoid unnecessary clone on forward path Herbert Xu
2010-02-28 5:41 ` [PATCH 4/13] bridge: Use BR_INPUT_SKB_CB on xmit path Herbert Xu
2010-02-28 5:41 ` [PATCH 5/13] bridge: Split may_deliver/deliver_clone out of br_flood Herbert Xu
2010-02-28 5:41 ` [PATCH 6/13] bridge: Add core IGMP snooping support Herbert Xu
2010-03-05 23:43 ` Paul E. McKenney
2010-03-06 1:17 ` Herbert Xu
2010-03-06 5:06 ` Paul E. McKenney [this message]
2010-03-06 6:56 ` Herbert Xu
2010-03-06 7:03 ` Herbert Xu
2010-03-07 23:31 ` David Miller
2010-03-06 7:07 ` Herbert Xu
2010-03-07 23:31 ` David Miller
2010-03-06 15:00 ` Paul E. McKenney
2010-03-06 15:19 ` Paul E. McKenney
2010-03-06 19:00 ` Paul E. McKenney
2010-03-07 2:45 ` Herbert Xu
2010-03-07 3:11 ` Paul E. McKenney
2010-03-08 18:50 ` Arnd Bergmann
2010-03-09 3:15 ` Paul E. McKenney
2010-03-11 18:49 ` Arnd Bergmann
2010-03-14 23:01 ` Paul E. McKenney
2010-03-09 21:12 ` Arnd Bergmann
2010-03-10 2:14 ` Paul E. McKenney
2010-03-10 9:41 ` Arnd Bergmann
2010-03-10 10:39 ` Eric Dumazet
2010-03-10 10:49 ` Herbert Xu
2010-03-10 13:13 ` Paul E. McKenney
2010-03-10 14:07 ` Herbert Xu
2010-03-10 16:26 ` Paul E. McKenney
2010-03-10 16:35 ` David Miller
2010-03-10 17:56 ` Arnd Bergmann
2010-03-10 21:25 ` Paul E. McKenney
2010-03-10 13:27 ` Arnd Bergmann
2010-03-10 13:39 ` Arnd Bergmann
2010-03-10 13:19 ` Paul E. McKenney
2010-03-10 13:30 ` Arnd Bergmann
2010-03-10 13:57 ` Paul E. McKenney
2010-02-28 5:41 ` [PATCH 7/13] bridge: Add multicast forwarding functions Herbert Xu
2010-02-28 5:41 ` [PATCH 8/13] bridge: Add multicast start/stop hooks Herbert Xu
2010-02-28 5:41 ` [PATCH 9/13] bridge: Add multicast data-path hooks Herbert Xu
2010-04-27 17:13 ` [PATCH net-next] bridge: use is_multicast_ether_addr Stephen Hemminger
2010-04-27 19:53 ` David Miller
2010-02-28 5:41 ` [PATCH 10/13] bridge: Add multicast_router sysfs entries Herbert Xu
2010-04-27 17:13 ` [PATCH net-next] bridge: multicast router list manipulation Stephen Hemminger
2010-04-27 19:54 ` David Miller
2010-04-27 23:11 ` Michał Mirosław
2010-04-27 23:25 ` Stephen Hemminger
2010-04-27 23:28 ` David Miller
2010-04-27 23:44 ` Stephen Hemminger
2010-04-27 23:51 ` David Miller
2010-04-27 23:27 ` David Miller
2010-04-28 1:51 ` Herbert Xu
2010-02-28 5:41 ` [PATCH 11/13] bridge: Add multicast_snooping sysfs toggle Herbert Xu
2010-02-28 5:41 ` [PATCH 12/13] bridge: Add hash elasticity/max sysfs entries Herbert Xu
2010-02-28 5:41 ` [PATCH 13/13] bridge: Add multicast count/interval " Herbert Xu
2010-02-28 8:52 ` [1/13] bridge: Add IGMP snooping support David Miller
2010-03-01 2:08 ` Herbert Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20100306050656.GA6812@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=davem@davemloft.net \
--cc=herbert@gondor.apana.org.au \
--cc=netdev@vger.kernel.org \
--cc=shemminger@vyatta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.