From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "Jiri Pirko" <jpirko@redhat.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Andy Gospodarek" <andy@greyhouse.net>,
"David S. Miller" <davem@davemloft.net>,
LKML <linux-kernel@vger.kernel.org>,
netdev <netdev@vger.kernel.org>,
"Nicolas de Pesloüan" <nicolas.2p.debian@gmail.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Guy Streeter" <streeter@redhat.com>,
stephen@networkplumber.org
Subject: Re: [PATCH] net: add a synchronize_net() in netdev_rx_handler_unregister()
Date: Fri, 29 Mar 2013 12:20:11 -0700 [thread overview]
Message-ID: <20130329192011.GO3320@linux.vnet.ibm.com> (raw)
In-Reply-To: <1364562082.5113.16.camel@edumazet-glaptop>
On Fri, Mar 29, 2013 at 06:01:22AM -0700, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> On Fri, 2013-03-29 at 10:48 +0100, Jiri Pirko wrote:
>
> > Hmm. I think that this might be issue introduced by:
> > commit a9b3cd7f323b2e57593e7215362a7b02fc933e3a
> > Author: Stephen Hemminger <shemminger@vyatta.com>
> > Date: Mon Aug 1 16:19:00 2011 +0000
> >
> > rcu: convert uses of rcu_assign_pointer(x, NULL) to RCU_INIT_POINTER
> >
> >
> > Because, if rcu_dereference(dev->rx_handler) is null,
> > rcu_dereference(dev->rx_handler_data) is never done. Therefore I believe
> > we are hitting following scenario:
> >
> >
> > CPU0 CPU1
> > ---- ----
> > dev->rx_handler_data = NULL
> > rcu_read_lock()
> > dev->rx_handler = NULL
> >
> >
> > CPU0 will see rx_handler set and yet, rx_handler_data nulled. Write
> > barrier in rcu_assign_pointer() might prevent this reorder from happening.
> > Therefore I suggest:
> >
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index 0caa38e..c16b829 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -3332,8 +3332,8 @@ void netdev_rx_handler_unregister(struct net_device *dev)
> > {
> >
> > ASSERT_RTNL();
> > - RCU_INIT_POINTER(dev->rx_handler, NULL);
> > - RCU_INIT_POINTER(dev->rx_handler_data, NULL);
> > + rcu_assign_pointer(dev->rx_handler, NULL);
> > + rcu_assign_pointer(dev->rx_handler_data, NULL);
> > }
> > EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);
> >
> >
>
> Nope this changes nothing at all.
>
> However, we can fix the bug in a different way, if we want to avoid a
> test in fast path.
>
> With following patch, we can make sure that a reader seeing a non NULL
> rx_handler has a guarantee to see a non NULL rx_handler_data
>
> Thanks
>
> [PATCH] net: add a synchronize_net() in netdev_rx_handler_unregister()
>
> commit 35d48903e97819 (bonding: fix rx_handler locking) added a race
> in bonding driver, reported by Steven Rostedt who did a very good
> diagnosis :
>
> <quoting Steven>
>
> I'm currently debugging a crash in an old 3.0-rt kernel that one of our
> customers is seeing. The bug happens with a stress test that loads and
> unloads the bonding module in a loop (I don't know all the details as
> I'm not the one that is directly interacting with the customer). But the
> bug looks to be something that may still be present and possibly present
> in mainline too. It will just be much harder to trigger it in mainline.
>
> In -rt, interrupts are threads, and can schedule in and out just like
> any other thread. Note, mainline now supports interrupt threads so this
> may be easily reproducible in mainline as well. I don't have the ability
> to tell the customer to try mainline or other kernels, so my hands are
> somewhat tied to what I can do.
>
> But according to a core dump, I tracked down that the eth irq thread
> crashed in bond_handle_frame() here:
>
> slave = bond_slave_get_rcu(skb->dev);
> bond = slave->bond; <--- BUG
>
>
> the slave returned was NULL and accessing slave->bond caused a NULL
> pointer dereference.
>
> Looking at the code that unregisters the handler:
>
> void netdev_rx_handler_unregister(struct net_device *dev)
> {
>
> ASSERT_RTNL();
> RCU_INIT_POINTER(dev->rx_handler, NULL);
> RCU_INIT_POINTER(dev->rx_handler_data, NULL);
> }
>
> Which is basically:
> dev->rx_handler = NULL;
> dev->rx_handler_data = NULL;
>
> And looking at __netif_receive_skb() we have:
>
> rx_handler = rcu_dereference(skb->dev->rx_handler);
> if (rx_handler) {
> if (pt_prev) {
> ret = deliver_skb(skb, pt_prev, orig_dev);
> pt_prev = NULL;
> }
> switch (rx_handler(&skb)) {
>
> My question to all of you is, what stops this interrupt from happening
> while the bonding module is unloading? What happens if the interrupt
> triggers and we have this:
>
>
> CPU0 CPU1
> ---- ----
> rx_handler = skb->dev->rx_handler
>
> netdev_rx_handler_unregister() {
> dev->rx_handler = NULL;
> dev->rx_handler_data = NULL;
>
> rx_handler()
> bond_handle_frame() {
> slave = skb->dev->rx_handler;
> bond = slave->bond; <-- NULL pointer dereference!!!
>
>
> What protection am I missing in the bond release handler that would
> prevent the above from happening?
>
> </quoting Steven>
>
> We can fix bug this in two ways. First is adding a test in
> bond_handle_frame() and others to check if rx_handler_data is NULL.
>
> A second way is adding a synchronize_net() in
> netdev_rx_handler_unregister() to make sure that a rcu protected reader
> has the guarantee to see a non NULL rx_handler_data.
>
> The second way is better as it avoids an extra test in fast path.
>
> Reported-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Jiri Pirko <jpirko@redhat.com>
> Cc: Paul E. McKenney <paulmck@us.ibm.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
With kudos to Steven Rostedt for his analogy between RCU and
Schrödinger's cat. ;-)
> ---
> net/core/dev.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index b13e5c7..56932a4 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -3314,6 +3314,7 @@ int netdev_rx_handler_register(struct net_device *dev,
> if (dev->rx_handler)
> return -EBUSY;
>
> + /* Note: rx_handler_data must be set before rx_handler */
> rcu_assign_pointer(dev->rx_handler_data, rx_handler_data);
> rcu_assign_pointer(dev->rx_handler, rx_handler);
>
> @@ -3334,6 +3335,11 @@ void netdev_rx_handler_unregister(struct net_device *dev)
>
> ASSERT_RTNL();
> RCU_INIT_POINTER(dev->rx_handler, NULL);
> + /* a reader seeing a non NULL rx_handler in a rcu_read_lock()
> + * section has a guarantee to see a non NULL rx_handler_data
> + * as well.
> + */
> + synchronize_net();
> RCU_INIT_POINTER(dev->rx_handler_data, NULL);
> }
> EXPORT_SYMBOL_GPL(netdev_rx_handler_unregister);
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2013-03-29 19:20 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-03-28 17:16 [BUG] Crash with NULL pointer dereference in bond_handle_frame in -rt (possibly mainline) Steven Rostedt
2013-03-28 17:29 ` Eric Dumazet
2013-03-28 17:44 ` Steven Rostedt
2013-03-29 9:48 ` Jiri Pirko
2013-03-29 13:01 ` [PATCH] net: add a synchronize_net() in netdev_rx_handler_unregister() Eric Dumazet
2013-03-29 13:17 ` Steven Rostedt
2013-03-29 13:38 ` Eric Dumazet
2013-03-29 15:11 ` Ivan Vecera
2013-03-29 15:38 ` Eric Dumazet
2013-03-29 16:12 ` Jiri Pirko
2013-03-29 16:46 ` Eric Dumazet
2013-03-29 19:20 ` Paul E. McKenney [this message]
2013-03-29 19:26 ` Eric Dumazet
2013-03-29 19:39 ` David Miller
2013-03-29 15:46 ` [BUG] Crash with NULL pointer dereference in bond_handle_frame in -rt (possibly mainline) Stephen Hemminger
2013-03-29 18:36 ` Steven Rostedt
2013-03-29 19:55 ` Steven Rostedt
2013-03-30 9:19 ` Jiri Pirko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20130329192011.GO3320@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=andy@greyhouse.net \
--cc=davem@davemloft.net \
--cc=eric.dumazet@gmail.com \
--cc=jpirko@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=netdev@vger.kernel.org \
--cc=nicolas.2p.debian@gmail.com \
--cc=rostedt@goodmis.org \
--cc=stephen@networkplumber.org \
--cc=streeter@redhat.com \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.