netdev.vger.kernel.org archive mirror
* [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
@ 2013-01-15  9:34 Cong Wang
  2013-01-16 20:27 ` David Miller
  2013-01-17  1:24 ` Eric Dumazet
  0 siblings, 2 replies; 9+ messages in thread
From: Cong Wang @ 2013-01-15  9:34 UTC (permalink / raw)
  To: netdev; +Cc: Eric Dumazet, Jiri Pirko, David S. Miller, Cong Wang

From: Cong Wang <amwang@redhat.com>

v4: hold rtnl lock for the whole netpoll_setup()
v3: remove the comment
v2: use RCU read lock

This patch fixes the following warning:

[   72.013864] RTNL: assertion failed at net/core/dev.c (4955)
[   72.017758] Pid: 668, comm: netpoll-prep-v6 Not tainted 3.8.0-rc1+ #474
[   72.019582] Call Trace:
[   72.020295]  [<ffffffff8176653d>] netdev_master_upper_dev_get+0x35/0x58
[   72.022545]  [<ffffffff81784edd>] netpoll_setup+0x61/0x340
[   72.024846]  [<ffffffff815d837e>] store_enabled+0x82/0xc3
[   72.027466]  [<ffffffff815d7e51>] netconsole_target_attr_store+0x35/0x37
[   72.029348]  [<ffffffff811c3479>] configfs_write_file+0xe2/0x10c
[   72.030959]  [<ffffffff8115d239>] vfs_write+0xaf/0xf6
[   72.032359]  [<ffffffff81978a05>] ? sysret_check+0x22/0x5d
[   72.033824]  [<ffffffff8115d453>] sys_write+0x5c/0x84
[   72.035328]  [<ffffffff819789d9>] system_call_fastpath+0x16/0x1b

In case of other races, hold rtnl lock for the entire netpoll_setup() function.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
---
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 9f05067..a5ad1c1 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -1048,11 +1048,13 @@ int netpoll_setup(struct netpoll *np)
 	struct in_device *in_dev;
 	int err;
 
+	rtnl_lock();
 	if (np->dev_name)
-		ndev = dev_get_by_name(&init_net, np->dev_name);
+		ndev = __dev_get_by_name(&init_net, np->dev_name);
 	if (!ndev) {
 		np_err(np, "%s doesn't exist, aborting\n", np->dev_name);
-		return -ENODEV;
+		err = -ENODEV;
+		goto unlock;
 	}
 
 	if (netdev_master_upper_dev_get(ndev)) {
@@ -1066,15 +1068,14 @@ int netpoll_setup(struct netpoll *np)
 
 		np_info(np, "device %s not up yet, forcing it\n", np->dev_name);
 
-		rtnl_lock();
 		err = dev_open(ndev);
-		rtnl_unlock();
 
 		if (err) {
 			np_err(np, "failed to open %s\n", ndev->name);
 			goto put;
 		}
 
+		rtnl_unlock();
 		atleast = jiffies + HZ/10;
 		atmost = jiffies + carrier_timeout * HZ;
 		while (!netif_carrier_ok(ndev)) {
@@ -1094,16 +1095,14 @@ int netpoll_setup(struct netpoll *np)
 			np_notice(np, "carrier detect appears untrustworthy, waiting 4 seconds\n");
 			msleep(4000);
 		}
+		rtnl_lock();
 	}
 
 	if (!np->local_ip.ip) {
 		if (!np->ipv6) {
-			rcu_read_lock();
-			in_dev = __in_dev_get_rcu(ndev);
-
+			in_dev = __in_dev_get_rtnl(ndev);
 
 			if (!in_dev || !in_dev->ifa_list) {
-				rcu_read_unlock();
 				np_err(np, "no IP address for %s, aborting\n",
 				       np->dev_name);
 				err = -EDESTADDRREQ;
@@ -1111,14 +1110,12 @@ int netpoll_setup(struct netpoll *np)
 			}
 
 			np->local_ip.ip = in_dev->ifa_list->ifa_local;
-			rcu_read_unlock();
 			np_info(np, "local IP %pI4\n", &np->local_ip.ip);
 		} else {
 #if IS_ENABLED(CONFIG_IPV6)
 			struct inet6_dev *idev;
 
 			err = -EDESTADDRREQ;
-			rcu_read_lock();
 			idev = __in6_dev_get(ndev);
 			if (idev) {
 				struct inet6_ifaddr *ifp;
@@ -1133,7 +1130,6 @@ int netpoll_setup(struct netpoll *np)
 				}
 				read_unlock_bh(&idev->lock);
 			}
-			rcu_read_unlock();
 			if (err) {
 				np_err(np, "no IPv6 address for %s, aborting\n",
 				       np->dev_name);
@@ -1151,17 +1147,17 @@ int netpoll_setup(struct netpoll *np)
 	/* fill up the skb queue */
 	refill_skbs();
 
-	rtnl_lock();
 	err = __netpoll_setup(np, ndev, GFP_KERNEL);
-	rtnl_unlock();
-
 	if (err)
 		goto put;
 
+	rtnl_unlock();
 	return 0;
 
 put:
 	dev_put(ndev);
+unlock:
+	rtnl_unlock();
 	return err;
 }
 EXPORT_SYMBOL(netpoll_setup);
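
The exit path this patch introduces (a "put:" label that falls through to an "unlock:" label) can be sketched in plain userspace C. This is only a toy model under stated assumptions: struct exit_state, exit_setup() and the fail_at parameter are hypothetical names made up for illustration, and the counters merely stand in for rtnl_lock()/rtnl_unlock() and dev_hold()/dev_put().

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical bookkeeping: how many times the "lock" is held and how
 * many device references are outstanding. */
struct exit_state {
	int lock_depth;
	int refs;
};

/*
 * fail_at selects where the toy setup fails:
 *   0 = success, 1 = before a reference is taken, 2 = after.
 */
static int exit_setup(struct exit_state *st, int fail_at)
{
	int err;

	st->lock_depth++;		/* stands in for rtnl_lock() */
	if (fail_at == 1) {
		err = -ENODEV;
		goto unlock;		/* only the lock needs undoing */
	}

	st->refs++;			/* stands in for taking a device reference */
	if (fail_at == 2) {
		err = -EDESTADDRREQ;
		goto put;		/* undo the reference, then the lock */
	}

	st->lock_depth--;		/* success: keep the reference, drop the lock */
	return 0;

put:
	st->refs--;			/* stands in for dev_put() */
unlock:
	st->lock_depth--;		/* stands in for rtnl_unlock() */
	return err;
}
```

The labels unwind in the reverse order of acquisition, so a jump to "put" is only balanced if a reference was actually taken before the jump.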


* Re: [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
  2013-01-15  9:34 [Patch net-next v4] netpoll: fix a rtnl lock assertion failure Cong Wang
@ 2013-01-16 20:27 ` David Miller
  2013-01-17  1:24 ` Eric Dumazet
  1 sibling, 0 replies; 9+ messages in thread
From: David Miller @ 2013-01-16 20:27 UTC (permalink / raw)
  To: amwang; +Cc: netdev, eric.dumazet, jiri

From: Cong Wang <amwang@redhat.com>
Date: Tue, 15 Jan 2013 17:34:06 +0800

> From: Cong Wang <amwang@redhat.com>
> 
> v4: hold rtnl lock for the whole netpoll_setup()
> v3: remove the comment
> v2: use RCU read lock
> 
> This patch fixes the following warning:
> 
> [   72.013864] RTNL: assertion failed at net/core/dev.c (4955)
> [   72.017758] Pid: 668, comm: netpoll-prep-v6 Not tainted 3.8.0-rc1+ #474
> [   72.019582] Call Trace:
> [   72.020295]  [<ffffffff8176653d>] netdev_master_upper_dev_get+0x35/0x58
> [   72.022545]  [<ffffffff81784edd>] netpoll_setup+0x61/0x340
> [   72.024846]  [<ffffffff815d837e>] store_enabled+0x82/0xc3
> [   72.027466]  [<ffffffff815d7e51>] netconsole_target_attr_store+0x35/0x37
> [   72.029348]  [<ffffffff811c3479>] configfs_write_file+0xe2/0x10c
> [   72.030959]  [<ffffffff8115d239>] vfs_write+0xaf/0xf6
> [   72.032359]  [<ffffffff81978a05>] ? sysret_check+0x22/0x5d
> [   72.033824]  [<ffffffff8115d453>] sys_write+0x5c/0x84
> [   72.035328]  [<ffffffff819789d9>] system_call_fastpath+0x16/0x1b
> 
> In case of other races, hold rtnl lock for the entire netpoll_setup() function.
> 
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Jiri Pirko <jiri@resnulli.us>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>

Applied.


* Re: [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
  2013-01-15  9:34 [Patch net-next v4] netpoll: fix a rtnl lock assertion failure Cong Wang
  2013-01-16 20:27 ` David Miller
@ 2013-01-17  1:24 ` Eric Dumazet
  2013-01-17  3:30   ` Cong Wang
  2013-01-17  3:52   ` David Miller
  1 sibling, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2013-01-17  1:24 UTC (permalink / raw)
  To: Cong Wang; +Cc: netdev, Jiri Pirko, David S. Miller

On Tue, 2013-01-15 at 17:34 +0800, Cong Wang wrote:
> From: Cong Wang <amwang@redhat.com>
> 
> v4: hold rtnl lock for the whole netpoll_setup()
> v3: remove the comment
> v2: use RCU read lock
> 
> This patch fixes the following warning:
> 
> [   72.013864] RTNL: assertion failed at net/core/dev.c (4955)
> [   72.017758] Pid: 668, comm: netpoll-prep-v6 Not tainted 3.8.0-rc1+ #474
> [   72.019582] Call Trace:
> [   72.020295]  [<ffffffff8176653d>] netdev_master_upper_dev_get+0x35/0x58
> [   72.022545]  [<ffffffff81784edd>] netpoll_setup+0x61/0x340
> [   72.024846]  [<ffffffff815d837e>] store_enabled+0x82/0xc3
> [   72.027466]  [<ffffffff815d7e51>] netconsole_target_attr_store+0x35/0x37
> [   72.029348]  [<ffffffff811c3479>] configfs_write_file+0xe2/0x10c
> [   72.030959]  [<ffffffff8115d239>] vfs_write+0xaf/0xf6
> [   72.032359]  [<ffffffff81978a05>] ? sysret_check+0x22/0x5d
> [   72.033824]  [<ffffffff8115d453>] sys_write+0x5c/0x84
> [   72.035328]  [<ffffffff819789d9>] system_call_fastpath+0x16/0x1b
> 
> In case of other races, hold rtnl lock for the entire netpoll_setup() function.
> 
> Cc: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Jiri Pirko <jiri@resnulli.us>
> Cc: David S. Miller <davem@davemloft.net>
> Signed-off-by: Cong Wang <amwang@redhat.com>
> ---
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c

...

>  	if (np->dev_name)
> -		ndev = dev_get_by_name(&init_net, np->dev_name);
> +		ndev = __dev_get_by_name(&init_net, np->dev_name);

This change brings interesting bugs.

All the "goto put;" are basically wrong, and the section waiting for the
carrier and releasing/getting rtnl is buggy.

Please revert this part.
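
The reference-count imbalance behind the "goto put" remark can be illustrated with a small userspace sketch. It is a toy model, not kernel code: struct ref_dev, ref_lookup(), ref_get(), ref_hold(), ref_put() and ref_setup_fails() are hypothetical names, chosen to mirror the roles of __dev_get_by_name(), dev_get_by_name(), dev_hold() and dev_put().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a network device: just a name and a reference count. */
struct ref_dev {
	const char *name;
	int refcnt;
};

static struct ref_dev ref_devs[] = {
	{ "eth0", 0 },
	{ "eth1", 0 },
};

/* Plays the role of __dev_get_by_name(): lookup only, no reference. */
static struct ref_dev *ref_lookup(const char *name)
{
	for (size_t i = 0; i < sizeof(ref_devs) / sizeof(ref_devs[0]); i++)
		if (strcmp(ref_devs[i].name, name) == 0)
			return &ref_devs[i];
	return NULL;
}

/* Play the roles of dev_hold() and dev_put(). */
static void ref_hold(struct ref_dev *dev) { dev->refcnt++; }
static void ref_put(struct ref_dev *dev)  { dev->refcnt--; }

/* Plays the role of dev_get_by_name(): lookup plus reference. */
static struct ref_dev *ref_get(const char *name)
{
	struct ref_dev *dev = ref_lookup(name);

	if (dev)
		ref_hold(dev);
	return dev;
}

/*
 * A setup that always fails and leaves through a "put"-style exit.
 * Returns the device's reference count afterwards.
 */
static int ref_setup_fails(struct ref_dev *(*lookup)(const char *),
			   const char *name)
{
	struct ref_dev *dev = lookup(name);

	if (!dev)
		return 0;
	ref_put(dev);			/* what the "put:" label does */
	return dev->refcnt;
}
```

In this model, ref_setup_fails(ref_get, "eth0") leaves the count at 0, while ref_setup_fails(ref_lookup, "eth1") leaves it at -1, because the exit path drops a reference that was never taken.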


* Re: [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
  2013-01-17  1:24 ` Eric Dumazet
@ 2013-01-17  3:30   ` Cong Wang
  2013-01-17  3:54     ` David Miller
  2013-01-17  3:52   ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Cong Wang @ 2013-01-17  3:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, Jiri Pirko, David S. Miller

On Wed, 2013-01-16 at 17:24 -0800, Eric Dumazet wrote:
> On Tue, 2013-01-15 at 17:34 +0800, Cong Wang wrote:
> > From: Cong Wang <amwang@redhat.com>
> > 
> > v4: hold rtnl lock for the whole netpoll_setup()
> > v3: remove the comment
> > v2: use RCU read lock
> > 
> > This patch fixes the following warning:
> > 
> > [   72.013864] RTNL: assertion failed at net/core/dev.c (4955)
> > [   72.017758] Pid: 668, comm: netpoll-prep-v6 Not tainted 3.8.0-rc1+ #474
> > [   72.019582] Call Trace:
> > [   72.020295]  [<ffffffff8176653d>] netdev_master_upper_dev_get+0x35/0x58
> > [   72.022545]  [<ffffffff81784edd>] netpoll_setup+0x61/0x340
> > [   72.024846]  [<ffffffff815d837e>] store_enabled+0x82/0xc3
> > [   72.027466]  [<ffffffff815d7e51>] netconsole_target_attr_store+0x35/0x37
> > [   72.029348]  [<ffffffff811c3479>] configfs_write_file+0xe2/0x10c
> > [   72.030959]  [<ffffffff8115d239>] vfs_write+0xaf/0xf6
> > [   72.032359]  [<ffffffff81978a05>] ? sysret_check+0x22/0x5d
> > [   72.033824]  [<ffffffff8115d453>] sys_write+0x5c/0x84
> > [   72.035328]  [<ffffffff819789d9>] system_call_fastpath+0x16/0x1b
> > 
> > In case of other races, hold rtnl lock for the entire netpoll_setup() function.
> > 
> > Cc: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Jiri Pirko <jiri@resnulli.us>
> > Cc: David S. Miller <davem@davemloft.net>
> > Signed-off-by: Cong Wang <amwang@redhat.com>
> > ---
> > diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> 
> ...
> 
> >  	if (np->dev_name)
> > -		ndev = dev_get_by_name(&init_net, np->dev_name);
> > +		ndev = __dev_get_by_name(&init_net, np->dev_name);
> 
> This change brings interesting bugs.

Hmm, I didn't realize __dev_get_by_name() doesn't hold the device, so
just call dev_hold() after this?

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index a5ad1c1..a9b1004 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -1056,6 +1056,7 @@ int netpoll_setup(struct netpoll *np)
                err = -ENODEV;
                goto unlock;
        }
+       dev_hold(ndev);
 
        if (netdev_master_upper_dev_get(ndev)) {
                np_err(np, "%s is a slave device, aborting\n", np->dev_name);


> 
> All the "goto put;" are basically wrong, and the section waiting for the
> carrier and releasing/getting rtnl is buggy.

Either we have to sleep for a few seconds with rtnl lock held, or leave it
as it is. The original code doesn't hold rtnl lock either.

Thanks!
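
The trade-off around the carrier wait can be sketched as a single-threaded userspace simulation. Everything here is an assumption made for illustration: struct wait_dev, wait_lock(), wait_unlock(), wait_unregister() and wait_for_carrier() are hypothetical names, and the model simplifies device teardown to "a device with no outstanding references is freed as soon as it is unregistered".

```c
#include <assert.h>
#include <stdbool.h>

/* Toy device: a reference count and a flag recording whether it was freed. */
struct wait_dev {
	int refcnt;
	bool freed;
};

/* Toy stand-in for the rtnl lock: a flag with sanity checks. */
static bool wait_lock_held;

static void wait_lock(void)   { assert(!wait_lock_held); wait_lock_held = true; }
static void wait_unlock(void) { assert(wait_lock_held);  wait_lock_held = false; }

/*
 * Models another task unregistering the device. It can only run while
 * the lock is free, and it frees the device only if nobody holds a
 * reference (a simplification of the real teardown).
 */
static bool wait_unregister(struct wait_dev *dev)
{
	if (wait_lock_held)
		return false;
	if (dev->refcnt == 0)
		dev->freed = true;
	return true;
}

/*
 * Drops the lock for the "sleep" and retakes it afterwards. Returns
 * true if the device is still safe to use once the lock is held again.
 */
static bool wait_for_carrier(struct wait_dev *dev, bool hold_ref)
{
	bool usable;

	wait_lock();
	if (hold_ref)
		dev->refcnt++;		/* pin the device across the window */
	wait_unlock();

	wait_unregister(dev);		/* unregister races in during the sleep */

	wait_lock();
	usable = !dev->freed;
	if (hold_ref)
		dev->refcnt--;
	wait_unlock();
	return usable;
}
```

In this sketch the device survives the unlocked window only when a reference is held across it, which is why the choice between a counted and an uncounted lookup matters once the lock is released for the sleep.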


* Re: [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
  2013-01-17  1:24 ` Eric Dumazet
  2013-01-17  3:30   ` Cong Wang
@ 2013-01-17  3:52   ` David Miller
  2013-01-17  4:00     ` Cong Wang
  1 sibling, 1 reply; 9+ messages in thread
From: David Miller @ 2013-01-17  3:52 UTC (permalink / raw)
  To: eric.dumazet; +Cc: amwang, netdev, jiri

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 16 Jan 2013 17:24:45 -0800

>>  	if (np->dev_name)
>> -		ndev = dev_get_by_name(&init_net, np->dev_name);
>> +		ndev = __dev_get_by_name(&init_net, np->dev_name);
 ...
> Please revert this part.

You mean just revert that hunk above that made it use the
non-refcounting version of dev_get_by_name()?


* Re: [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
  2013-01-17  3:30   ` Cong Wang
@ 2013-01-17  3:54     ` David Miller
  2013-01-17  4:18       ` Cong Wang
  0 siblings, 1 reply; 9+ messages in thread
From: David Miller @ 2013-01-17  3:54 UTC (permalink / raw)
  To: amwang; +Cc: eric.dumazet, netdev, jiri

From: Cong Wang <amwang@redhat.com>
Date: Thu, 17 Jan 2013 11:30:18 +0800

> On Wed, 2013-01-16 at 17:24 -0800, Eric Dumazet wrote:
>> On Tue, 2013-01-15 at 17:34 +0800, Cong Wang wrote:
>> >  	if (np->dev_name)
>> > -		ndev = dev_get_by_name(&init_net, np->dev_name);
>> > +		ndev = __dev_get_by_name(&init_net, np->dev_name);
>> 
>> This change brings interesting bugs.
> 
> Hmm, I didn't realize __dev_get_by_name() doesn't hold the device, so
> just call dev_hold() after this?

Why not just... call dev_get_by_name()?  It doesn't hurt to over-RCU
lock.


* Re: [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
  2013-01-17  3:52   ` David Miller
@ 2013-01-17  4:00     ` Cong Wang
  0 siblings, 0 replies; 9+ messages in thread
From: Cong Wang @ 2013-01-17  4:00 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, jiri

On Wed, 2013-01-16 at 22:52 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 16 Jan 2013 17:24:45 -0800
> 
> >>  	if (np->dev_name)
> >> -		ndev = dev_get_by_name(&init_net, np->dev_name);
> >> +		ndev = __dev_get_by_name(&init_net, np->dev_name);
>  ...
> > Please revert this part.
> 
> You mean just revert that hunk above that made it use the
> non-refcounting version of dev_get_by_name()?

But there is no reason to take both rtnl lock and RCU read lock,
although that is fine.

I think just adding dev_hold() is enough.


* Re: [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
  2013-01-17  3:54     ` David Miller
@ 2013-01-17  4:18       ` Cong Wang
  2013-01-17  4:53         ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Cong Wang @ 2013-01-17  4:18 UTC (permalink / raw)
  To: David Miller; +Cc: eric.dumazet, netdev, jiri

On Wed, 2013-01-16 at 22:54 -0500, David Miller wrote:
> From: Cong Wang <amwang@redhat.com>
> Date: Thu, 17 Jan 2013 11:30:18 +0800
> 
> > On Wed, 2013-01-16 at 17:24 -0800, Eric Dumazet wrote:
> >> On Tue, 2013-01-15 at 17:34 +0800, Cong Wang wrote:
> >> >  	if (np->dev_name)
> >> > -		ndev = dev_get_by_name(&init_net, np->dev_name);
> >> > +		ndev = __dev_get_by_name(&init_net, np->dev_name);
> >> 
> >> This change brings interesting bugs.
> > 
> > Hmm, I didn't realize __dev_get_by_name() doesn't hold the device, so
> > just call dev_hold() after this?
> 
> Why not just... call dev_get_by_name()?  It doesn't hurt to over-RCU
> lock.
> 

Just that taking RCU read lock while having rtnl lock is unnecessary, no
other reason.


* Re: [Patch net-next v4] netpoll: fix a rtnl lock assertion failure
  2013-01-17  4:18       ` Cong Wang
@ 2013-01-17  4:53         ` Eric Dumazet
  0 siblings, 0 replies; 9+ messages in thread
From: Eric Dumazet @ 2013-01-17  4:53 UTC (permalink / raw)
  To: Cong Wang; +Cc: David Miller, netdev, jiri

On Thu, 2013-01-17 at 12:18 +0800, Cong Wang wrote:

> > Why not just... call dev_get_by_name()?  It doesn't hurt to over-RCU
> > lock.
> > 
> 
> Just that taking RCU read lock while having rtnl lock is unnecessary, no
> other reason.

Calling dev_get_by_name() would be just fine, and would generate less
code.

It's not a fast path...

Anyway David already applied your patch.
