* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 [not found] <20040613121514.6b3c1c8a.davem@redhat.com> @ 2004-06-13 23:35 ` Herbert Xu 2004-06-13 23:41 ` Herbert Xu 0 siblings, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-13 23:35 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, linux-net, netdev, yoshfuji David S. Miller <davem@redhat.com> wrote: > > diff -Nru a/net/core/dst.c b/net/core/dst.c > --- a/net/core/dst.c 2004-06-13 06:04:49 -07:00 > +++ b/net/core/dst.c 2004-06-13 06:04:49 -07:00 > @@ -230,8 +230,8 @@ > if (event!=NETDEV_DOWN && > dst->output == dst_discard_out) { This is a historical question. What's the dst->output check for? > dst->dev = &loopback_dev; > - dev_put(dev); > dev_hold(&loopback_dev); > + dev_put(dev); > dst->output = dst_discard_out; > if (dst->neighbour && dst->neighbour->dev == dev) { > dst->neighbour->dev = &loopback_dev; > @@ -242,6 +242,8 @@ > dst->input = dst_discard_in; > dst->output = dst_discard_out; > } The loopback_dev stuff takes care of the case when someone is still holding onto the dst. How come we don't have similar code in ifdown? > + if (dst->ops->ifdown) > + dst->ops->ifdown(dst, event != NETDEV_DOWN); What's the rationale for doing this for both UNREGISTER and DOWN? Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-13 23:35 ` 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 Herbert Xu @ 2004-06-13 23:41 ` Herbert Xu 2004-06-14 1:36 ` David S. Miller 0 siblings, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-13 23:41 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, netdev, yoshfuji BTW, it looks like I missed the original IPv4 idev patch to route.c as well. So could someone please tell me why it was needed in the first place? Thanks, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-13 23:41 ` Herbert Xu @ 2004-06-14 1:36 ` David S. Miller 2004-06-14 1:50 ` Herbert Xu 0 siblings, 1 reply; 25+ messages in thread From: David S. Miller @ 2004-06-14 1:36 UTC (permalink / raw) To: Herbert Xu; +Cc: schwab, netdev, yoshfuji On Mon, 14 Jun 2004 09:41:42 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > BTW, it looks like I missed the original IPv4 idev patch to route.c > as well. So could someone please tell me why it was needed in the > first place? Because per-device ipv4/ipv6 statistics support is coming. It's necessary infrastructure. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 1:36 ` David S. Miller @ 2004-06-14 1:50 ` Herbert Xu 2004-06-14 4:07 ` David S. Miller 0 siblings, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-14 1:50 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, netdev, yoshfuji On Sun, Jun 13, 2004 at 06:36:22PM -0700, David S. Miller wrote: > On Mon, 14 Jun 2004 09:41:42 +1000 > Herbert Xu <herbert@gondor.apana.org.au> wrote: > > > BTW, it looks like I missed the original IPv4 idev patch to route.c > > as well. So could someone please tell me why it was needed in the > > first place? > > Because per-device ipv4/ipv6 statistics support is coming. > It's necessary infrastructure. Thanks. Is there somewhere where I can look at the code that's going to use this infrastructure? -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 1:50 ` Herbert Xu @ 2004-06-14 4:07 ` David S. Miller 2004-06-14 4:22 ` Herbert Xu 0 siblings, 1 reply; 25+ messages in thread From: David S. Miller @ 2004-06-14 4:07 UTC (permalink / raw) To: Herbert Xu; +Cc: schwab, netdev, yoshfuji On Mon, 14 Jun 2004 11:50:13 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > On Sun, Jun 13, 2004 at 06:36:22PM -0700, David S. Miller wrote: > > On Mon, 14 Jun 2004 09:41:42 +1000 > > Herbert Xu <herbert@gondor.apana.org.au> wrote: > > > > > BTW, it looks like I missed the original IPv4 idev patch to route.c > > > as well. So could someone please tell me why it was needed in the > > > first place? > > > > Because per-device ipv4/ipv6 statistics support is coming. > > It's necessary infrastructure. > > Thanks. Is there somewhere where I can look at the code that's > going to use this infrastructure? Not written yet, but it will be of the form: idev = rt->idev; SNMP_BUMP_BLAH_BLAH_STAT(idev); ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 4:07 ` David S. Miller @ 2004-06-14 4:22 ` Herbert Xu 2004-06-14 4:27 ` David S. Miller 2004-06-14 10:28 ` Herbert Xu 0 siblings, 2 replies; 25+ messages in thread From: Herbert Xu @ 2004-06-14 4:22 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, netdev, yoshfuji On Sun, Jun 13, 2004 at 09:07:25PM -0700, David S. Miller wrote: > > Not written yet, but it will be of the form: > > idev = rt->idev; > SNMP_BUMP_BLAH_BLAH_STAT(idev); OK, in that case either they all need to check whether idev is NULL before doing this or we'll need to fix ->ifdown to set the dev to something other than NULL. BTW in looking at this I've found two bugs in the dst code wrt cleaning up net devices. 1. The code in dst_dev_event relies on the entries with dev in it getting onto the garbage list before it is called. Unfortunately for routing entries, the flushing is done asynchronusly so it is entirely possible for the entries to get onto the garbage list after dst_dev_event has been called twice (DOWN and then UNREGISTER). This is not fatal though since another GC run will pick them up in at most two minutes. But the user may well be alarmed by those dreaded "waiting for..." errors. 2. dst_dev_event doesn't clean up dst->child at all. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 4:22 ` Herbert Xu @ 2004-06-14 4:27 ` David S. Miller 2004-06-14 4:42 ` Herbert Xu 2004-06-14 10:28 ` Herbert Xu 1 sibling, 1 reply; 25+ messages in thread From: David S. Miller @ 2004-06-14 4:27 UTC (permalink / raw) To: Herbert Xu; +Cc: schwab, netdev, yoshfuji On Mon, 14 Jun 2004 14:22:16 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > OK, in that case either they all need to check whether idev is > NULL before doing this or we'll need to fix ->ifdown to set the > dev to something other than NULL. That's correct, I don't exactly how to deal with that race just yet. > BTW in looking at this I've found two bugs in the dst code wrt > cleaning up net devices. > > 1. The code in dst_dev_event relies on the entries with dev in it > getting onto the garbage list before it is called. Unfortunately > for routing entries, the flushing is done asynchronusly so it is > entirely possible for the entries to get onto the garbage list after > dst_dev_event has been called twice (DOWN and then UNREGISTER). > > This is not fatal though since another GC run will pick them up in > at most two minutes. But the user may well be alarmed by those > dreaded "waiting for..." errors. Right. > 2. dst_dev_event doesn't clean up dst->child at all. DST object is disassembled at GC time, so I classift this just like case #1 above. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 4:27 ` David S. Miller @ 2004-06-14 4:42 ` Herbert Xu 2004-06-14 4:47 ` Herbert Xu 0 siblings, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-14 4:42 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, netdev, yoshfuji On Sun, Jun 13, 2004 at 09:27:08PM -0700, David S. Miller wrote: > > > 1. The code in dst_dev_event relies on the entries with dev in it > > getting onto the garbage list before it is called. Unfortunately > > for routing entries, the flushing is done asynchronusly so it is > > entirely possible for the entries to get onto the garbage list after > > dst_dev_event has been called twice (DOWN and then UNREGISTER). > > > > This is not fatal though since another GC run will pick them up in > > at most two minutes. But the user may well be alarmed by those > > dreaded "waiting for..." errors. > > Right. > > > 2. dst_dev_event doesn't clean up dst->child at all. > > DST object is disassembled at GC time, so I classift this just > like case #1 above. Yes. In fact the loopback_dev code in dst_dev_event is also like this. Removing that code will simply defer the dropping of the dev reference to the GC. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 4:42 ` Herbert Xu @ 2004-06-14 4:47 ` Herbert Xu 2004-06-14 4:43 ` David S. Miller 0 siblings, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-14 4:47 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, netdev, yoshfuji On Mon, Jun 14, 2004 at 02:42:40PM +1000, herbert wrote: > On Sun, Jun 13, 2004 at 09:27:08PM -0700, David S. Miller wrote: > > > > > 1. The code in dst_dev_event relies on the entries with dev in it > > > getting onto the garbage list before it is called. Unfortunately > > > for routing entries, the flushing is done asynchronusly so it is > > > entirely possible for the entries to get onto the garbage list after > > > dst_dev_event has been called twice (DOWN and then UNREGISTER). > > > > > > This is not fatal though since another GC run will pick them up in > > > at most two minutes. But the user may well be alarmed by those > > > dreaded "waiting for..." errors. > > > > Right. > > > > > 2. dst_dev_event doesn't clean up dst->child at all. > > > > DST object is disassembled at GC time, so I classift this just > > like case #1 above. > > Yes. In fact the loopback_dev code in dst_dev_event is also like > this. Removing that code will simply defer the dropping of the > dev reference to the GC. And there's more :) The work done in ifdown also falls in the same category. Its work will be carried out in the GC anyway. Actually in all these cases it could take a lot more than two minutes if there is a long-living entity (maybe a TCP connection) holding onto the dst. So I guess #1 and #2 should be addressed at some point. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 4:47 ` Herbert Xu @ 2004-06-14 4:43 ` David S. Miller 2004-06-14 4:53 ` Herbert Xu 0 siblings, 1 reply; 25+ messages in thread From: David S. Miller @ 2004-06-14 4:43 UTC (permalink / raw) To: Herbert Xu; +Cc: schwab, netdev, yoshfuji On Mon, 14 Jun 2004 14:47:17 +1000 Herbert Xu <herbert@gondor.apana.org.au> wrote: > And there's more :) The work done in ifdown also falls in the same > category. Its work will be carried out in the GC anyway. > > Actually in all these cases it could take a lot more than two minutes > if there is a long-living entity (maybe a TCP connection) holding onto > the dst. So I guess #1 and #2 should be addressed at some point. Maybe the solution is to initiate a GC run as soon as possible instead of however long into the future it would have ended up running. Wouldn't that help? ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 4:43 ` David S. Miller @ 2004-06-14 4:53 ` Herbert Xu 0 siblings, 0 replies; 25+ messages in thread From: Herbert Xu @ 2004-06-14 4:53 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, netdev, yoshfuji On Sun, Jun 13, 2004 at 09:43:55PM -0700, David S. Miller wrote: > On Mon, 14 Jun 2004 14:47:17 +1000 > Herbert Xu <herbert@gondor.apana.org.au> wrote: > > > And there's more :) The work done in ifdown also falls in the same > > category. Its work will be carried out in the GC anyway. > > > > Actually in all these cases it could take a lot more than two minutes > > if there is a long-living entity (maybe a TCP connection) holding onto > > the dst. So I guess #1 and #2 should be addressed at some point. > > Maybe the solution is to initiate a GC run as soon as possible > instead of however long into the future it would have ended > up running. > > Wouldn't that help? Not really. The GC is powerless until all those entities drop their references to the dst. That's why we've got this hack in dst_dev_event that sets ->dev to loopback_dev in order to drop the dev reference early. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 4:22 ` Herbert Xu 2004-06-14 4:27 ` David S. Miller @ 2004-06-14 10:28 ` Herbert Xu 2004-06-14 12:44 ` Herbert Xu 1 sibling, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-14 10:28 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, netdev, yoshfuji On Mon, Jun 14, 2004 at 02:22:16PM +1000, herbert wrote: > > 1. The code in dst_dev_event relies on the entries with dev in it > getting onto the garbage list before it is called. Unfortunately > for routing entries, the flushing is done asynchronusly so it is > entirely possible for the entries to get onto the garbage list after > dst_dev_event has been called twice (DOWN and then UNREGISTER). Actually this can't happen since fib_netdev_event will do an immediate flush. So it's only #2 or IPsec that's a problem. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 10:28 ` Herbert Xu @ 2004-06-14 12:44 ` Herbert Xu 2004-06-16 19:37 ` Alexey Kuznetsov 0 siblings, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-14 12:44 UTC (permalink / raw) To: David S. Miller; +Cc: schwab, netdev, yoshfuji, kuznet [-- Attachment #1: Type: text/plain, Size: 1312 bytes --] On Mon, Jun 14, 2004 at 08:28:58PM +1000, herbert wrote: > On Mon, Jun 14, 2004 at 02:22:16PM +1000, herbert wrote: > > > > 1. The code in dst_dev_event relies on the entries with dev in it > > getting onto the garbage list before it is called. Unfortunately > > for routing entries, the flushing is done asynchronusly so it is > > entirely possible for the entries to get onto the garbage list after > > dst_dev_event has been called twice (DOWN and then UNREGISTER). > > Actually this can't happen since fib_netdev_event will do an > immediate flush. So it's only #2 or IPsec that's a problem. And even there it's not too bad as it only happens if there are more than two IPsec dst's in a row, e.g., ESP/IPCOMP. Here is a patch to do the same loopback_dev hack for child dst's. I've also taken the liberty to remove the bogus dst->output check. I've tested it with ESP/IPCOMP and I can confirm that it allows me to remove my NIC module if when I've got a TCP connection over it. Without it the removal hangs. Alexey, if I've missed any subtle piece of logic there, please scream :) Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt [-- Attachment #2: p --] [-- Type: text/plain, Size: 2115 bytes --] ===== net/core/dst.c 1.17 vs edited ===== --- 1.17/net/core/dst.c 2004-06-13 03:51:53 +10:00 +++ edited/net/core/dst.c 2004-06-14 22:39:51 +10:00 @@ -210,6 +210,36 @@ return NULL; } +/* Dirty hack. We did it in 2.2 (in __dst_free), + * we have _very_ good reasons not to repeat + * this mistake in 2.3, but we have no choice + * now. _It_ _is_ _explicit_ _deliberate_ + * _race_ _condition_. + * + * Commented and originally written by Alexey. + */ +static void dst_ifdown(struct dst_entry *dst, int unregister) +{ + struct net_device *dev = dst->dev; + + do { + if (unregister) { + dst->dev = &loopback_dev; + dev_hold(&loopback_dev); + dev_put(dev); + if (dst->neighbour && dst->neighbour->dev == dev) { + dst->neighbour->dev = &loopback_dev; + dev_put(dev); + dev_hold(&loopback_dev); + } + } + + if (dst->ops->ifdown) + dst->ops->ifdown(dst, unregister); + } while ((dst = dst->child) && dst->flags & DST_NOHASH && + dst->dev == dev); +} + static int dst_dev_event(struct notifier_block *this, unsigned long event, void *ptr) { struct net_device *dev = ptr; @@ -221,29 +251,11 @@ spin_lock_bh(&dst_lock); for (dst = dst_garbage_list; dst; dst = dst->next) { if (dst->dev == dev) { - /* Dirty hack. We did it in 2.2 (in __dst_free), - we have _very_ good reasons not to repeat - this mistake in 2.3, but we have no choice - now. _It_ _is_ _explicit_ _deliberate_ - _race_ _condition_. - */ - if (event!=NETDEV_DOWN && - dst->output == dst_discard_out) { - dst->dev = &loopback_dev; - dev_hold(&loopback_dev); - dev_put(dev); - dst->output = dst_discard_out; - if (dst->neighbour && dst->neighbour->dev == dev) { - dst->neighbour->dev = &loopback_dev; - dev_put(dev); - dev_hold(&loopback_dev); - } - } else { + if (event == NETDEV_DOWN) { dst->input = dst_discard_in; dst->output = dst_discard_out; } - if (dst->ops->ifdown) - dst->ops->ifdown(dst, event != NETDEV_DOWN); + dst_ifdown(dst, event != NETDEV_DOWN); } } spin_unlock_bh(&dst_lock); ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-14 12:44 ` Herbert Xu @ 2004-06-16 19:37 ` Alexey Kuznetsov 2004-06-16 20:09 ` David S. Miller 0 siblings, 1 reply; 25+ messages in thread From: Alexey Kuznetsov @ 2004-06-16 19:37 UTC (permalink / raw) To: Herbert Xu; +Cc: David S. Miller, schwab, netdev, yoshfuji Hello! > Alexey, if I've missed any subtle piece of logic there, please > scream :) The "logic" behind the check ("logic" is double quoted because of that comment) with dst_discard_out was to make release only on the second attempt to unregister. In the past unregister event used to be rebroadcast periodically until the unregister finally succeeds. Now logic is different, so this check really does not make sense. Anyway, the comment remains damn true and the place is still bug. It makes sense to think about waiting for at least one quiescent state after detaching the stale device from dst and real dev_put() it. Reference to RCU is not occasional, the job looks natural for it. Alexey ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-16 19:37 ` Alexey Kuznetsov @ 2004-06-16 20:09 ` David S. Miller 2004-06-16 20:37 ` Alexey Kuznetsov 0 siblings, 1 reply; 25+ messages in thread From: David S. Miller @ 2004-06-16 20:09 UTC (permalink / raw) To: Alexey Kuznetsov; +Cc: herbert, schwab, netdev, yoshfuji On Wed, 16 Jun 2004 23:37:31 +0400 Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote: > Anyway, the comment remains damn true and the place is still > bug. It makes sense to think about waiting for at least one quiescent > state after detaching the stale device from dst and real dev_put() it. > Reference to RCU is not occasional, the job looks natural for it. It seems a simple synchronize_kernel() would clear all the crap no? ===== net/core/dst.c 1.18 vs edited ===== --- 1.18/net/core/dst.c 2004-06-16 10:26:56 -07:00 +++ edited/net/core/dst.c 2004-06-16 13:10:37 -07:00 @@ -243,6 +243,8 @@ dst->ops->ifdown(dst, unregister); } while ((dst = dst->child) && dst->flags & DST_NOHASH && dst->dev == dev); + + synchronize_kernel(); } static int dst_dev_event(struct notifier_block *this, unsigned long event, void *ptr) ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-16 20:09 ` David S. Miller @ 2004-06-16 20:37 ` Alexey Kuznetsov 2004-06-16 20:47 ` David S. Miller 0 siblings, 1 reply; 25+ messages in thread From: Alexey Kuznetsov @ 2004-06-16 20:37 UTC (permalink / raw) To: David S. Miller; +Cc: herbert, schwab, netdev, yoshfuji Hello! > > bug. It makes sense to think about waiting for at least one quiescent > > state after detaching the stale device from dst and real dev_put() it. > > Reference to RCU is not occasional, the job looks natural for it. > > It seems a simple synchronize_kernel() would clear all the > crap no? Sort of. I think it should happen after killing reference to dst->dev: + dst->dev = &loopback_dev; + dev_hold(&loopback_dev); but before dropping reference to dev: + dev_put(dev); So, I think it should look like: dst->dev = &loopback_dev; dev_hold(&loopback_dev); + synchronize_kernel(); dev_put(dev); (sorry, I still do not have Herbert's patch pulled) Alexey ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-16 20:37 ` Alexey Kuznetsov @ 2004-06-16 20:47 ` David S. Miller 2004-06-17 8:17 ` Herbert Xu 0 siblings, 1 reply; 25+ messages in thread From: David S. Miller @ 2004-06-16 20:47 UTC (permalink / raw) To: Alexey Kuznetsov; +Cc: herbert, schwab, netdev, yoshfuji On Thu, 17 Jun 2004 00:37:48 +0400 Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote: > Sort of. I think it should happen after killing reference to dst->dev: ... > but before dropping reference to dev: ... > So, I think it should look like: > > dst->dev = &loopback_dev; > dev_hold(&loopback_dev); > + synchronize_kernel(); > dev_put(dev); Perfect, that's what I put into my tree then, plus some comment. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-16 20:47 ` David S. Miller @ 2004-06-17 8:17 ` Herbert Xu 2004-06-17 8:33 ` Herbert Xu 0 siblings, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-17 8:17 UTC (permalink / raw) To: David S. Miller; +Cc: Alexey Kuznetsov, schwab, netdev, yoshfuji On Wed, Jun 16, 2004 at 01:47:11PM -0700, David S. Miller wrote: > On Thu, 17 Jun 2004 00:37:48 +0400 > Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote: > > > Sort of. I think it should happen after killing reference to dst->dev: > ... > > but before dropping reference to dev: > ... > > So, I think it should look like: > > > > dst->dev = &loopback_dev; > > dev_hold(&loopback_dev); > > + synchronize_kernel(); > > dev_put(dev); > > Perfect, that's what I put into my tree then, plus some comment. Yes we definitely want this call. But wouldn't it be better to do one call at the top level, say at the end of netdev_wait_allrefs? Otherwise we may end up with lots of synchronize_kernel() calls scattered in each individual event handler. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-17 8:17 ` Herbert Xu @ 2004-06-17 8:33 ` Herbert Xu 2004-06-17 17:10 ` David S. Miller 0 siblings, 1 reply; 25+ messages in thread From: Herbert Xu @ 2004-06-17 8:33 UTC (permalink / raw) To: David S. Miller; +Cc: Alexey Kuznetsov, schwab, netdev, yoshfuji [-- Attachment #1: Type: text/plain, Size: 514 bytes --] On Thu, Jun 17, 2004 at 06:17:22PM +1000, herbert wrote: > > Yes we definitely want this call. But wouldn't it be better to do one > call at the top level, say at the end of netdev_wait_allrefs? Something like should be good enough. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt [-- Attachment #2: p --] [-- Type: text/plain, Size: 961 bytes --] ===== net/core/dev.c 1.145 vs edited ===== --- 1.145/net/core/dev.c 2004-06-04 15:02:47 +10:00 +++ edited/net/core/dev.c 2004-06-17 18:31:12 +10:00 @@ -3033,6 +3033,12 @@ netdev_wait_allrefs(dev); + /* Wait for at least one quiescent state before + * destroying dev. This ensures users like + * dst->dev are completely finished. + */ + synchronize_net(); + /* paranoia */ BUG_ON(atomic_read(&dev->refcnt)); BUG_TRAP(!dev->ip_ptr); ===== net/core/dst.c 1.19 vs edited ===== --- 1.19/net/core/dst.c 2004-06-17 06:42:22 +10:00 +++ edited/net/core/dst.c 2004-06-17 18:31:46 +10:00 @@ -231,12 +231,6 @@ if (unregister) { dst->dev = &loopback_dev; dev_hold(&loopback_dev); - - /* Wait for at least one quiescent state after - * detaching the stale device from dst. - */ - synchronize_kernel(); - dev_put(dev); if (dst->neighbour && dst->neighbour->dev == dev) { dst->neighbour->dev = &loopback_dev; ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-17 8:33 ` Herbert Xu @ 2004-06-17 17:10 ` David S. Miller 2004-06-17 17:24 ` Alexey Kuznetsov 0 siblings, 1 reply; 25+ messages in thread From: David S. Miller @ 2004-06-17 17:10 UTC (permalink / raw) To: Herbert Xu; +Cc: kuznet, schwab, netdev, yoshfuji Herbert, you still don't understand what the synchronize_kernel() is for. It is to make sure that anyone (other cpus using the dst) who "saw" the old dst->dev is done doing whatever they were doing _BEFORE_ we put the device reference. The best we could do is accumulate devices to be put onto some kind of list, then at the end of dst_ifdown() do one synchronize_kernel() then walk the list putting all the devices. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-17 17:10 ` David S. Miller @ 2004-06-17 17:24 ` Alexey Kuznetsov 2004-06-17 17:53 ` David S. Miller 0 siblings, 1 reply; 25+ messages in thread From: Alexey Kuznetsov @ 2004-06-17 17:24 UTC (permalink / raw) To: David S. Miller; +Cc: Herbert Xu, schwab, netdev, yoshfuji Hello! > It is to make sure that anyone (other cpus using the dst) who "saw" > the old dst->dev is done doing whatever they were doing _BEFORE_ > we put the device reference. Actually, he noticed right thing. I forgot that this function works in context when dev, which is argument is of the function, is held, so dev_put() here can be safely replaced with __dev_put(). So, synchonize_kernel() can be moved somewhere to unregister_netdevice(), (maybe, it is already present there under surface) Alexey ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-17 17:24 ` Alexey Kuznetsov @ 2004-06-17 17:53 ` David S. Miller 2004-06-17 19:07 ` Alexey Kuznetsov 0 siblings, 1 reply; 25+ messages in thread From: David S. Miller @ 2004-06-17 17:53 UTC (permalink / raw) To: Alexey Kuznetsov; +Cc: herbert, schwab, netdev, yoshfuji On Thu, 17 Jun 2004 21:24:49 +0400 Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote: > Actually, he noticed right thing. I forgot that this function > works in context when dev, which is argument is of the function, > is held, so dev_put() here can be safely replaced with __dev_put(). > > So, synchonize_kernel() can be moved somewhere to unregister_netdevice(), > (maybe, it is already present there under surface) It occurs, via synchronize_net(), but this happens before NETDEV_UNREGISTER event is sent out. Is it sufficient? I think not, and that we need to add another call in unregister_netdevice() right before final dev_put(). ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-17 17:53 ` David S. Miller @ 2004-06-17 19:07 ` Alexey Kuznetsov 2004-06-17 19:16 ` David S. Miller 2004-06-17 21:29 ` Herbert Xu 0 siblings, 2 replies; 25+ messages in thread From: Alexey Kuznetsov @ 2004-06-17 19:07 UTC (permalink / raw) To: David S. Miller; +Cc: herbert, schwab, netdev, yoshfuji Hello! > It occurs, via synchronize_net(), but this happens before NETDEV_UNREGISTER > event is sent out. Is it sufficient? I think not, Agreed. > and that we need to > add another call in unregister_netdevice() right before final dev_put(). This should be enough. Alexey ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-17 19:07 ` Alexey Kuznetsov @ 2004-06-17 19:16 ` David S. Miller 2004-06-17 21:29 ` Herbert Xu 1 sibling, 0 replies; 25+ messages in thread From: David S. Miller @ 2004-06-17 19:16 UTC (permalink / raw) To: Alexey Kuznetsov; +Cc: herbert, schwab, netdev, yoshfuji On Thu, 17 Jun 2004 23:07:39 +0400 Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> wrote: > > and that we need to > > add another call in unregister_netdevice() right before final dev_put(). > > This should be enough. Perfect, that's what I'll do in my tree. ^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 2004-06-17 19:07 ` Alexey Kuznetsov 2004-06-17 19:16 ` David S. Miller @ 2004-06-17 21:29 ` Herbert Xu 1 sibling, 0 replies; 25+ messages in thread From: Herbert Xu @ 2004-06-17 21:29 UTC (permalink / raw) To: Alexey Kuznetsov; +Cc: David S. Miller, schwab, netdev, yoshfuji On Thu, Jun 17, 2004 at 11:07:39PM +0400, Alexey Kuznetsov wrote: > > > and that we need to > > add another call in unregister_netdevice() right before final dev_put(). > > This should be enough. It is enough for this dst case, but it may not be enough in general. Other event handlers may need a rebroadcast before they drop the dev reference. So we really should do the synchronize_net() before freeing the device in net_run_todo(). Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 25+ messages in thread
end of thread, other threads:[~2004-06-17 21:29 UTC | newest]
Thread overview: 25+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20040613121514.6b3c1c8a.davem@redhat.com>
2004-06-13 23:35 ` 2.6.7-rc3: unregister_netdevice: waiting for tun0 to become free. Usage count = 1 Herbert Xu
2004-06-13 23:41 ` Herbert Xu
2004-06-14 1:36 ` David S. Miller
2004-06-14 1:50 ` Herbert Xu
2004-06-14 4:07 ` David S. Miller
2004-06-14 4:22 ` Herbert Xu
2004-06-14 4:27 ` David S. Miller
2004-06-14 4:42 ` Herbert Xu
2004-06-14 4:47 ` Herbert Xu
2004-06-14 4:43 ` David S. Miller
2004-06-14 4:53 ` Herbert Xu
2004-06-14 10:28 ` Herbert Xu
2004-06-14 12:44 ` Herbert Xu
2004-06-16 19:37 ` Alexey Kuznetsov
2004-06-16 20:09 ` David S. Miller
2004-06-16 20:37 ` Alexey Kuznetsov
2004-06-16 20:47 ` David S. Miller
2004-06-17 8:17 ` Herbert Xu
2004-06-17 8:33 ` Herbert Xu
2004-06-17 17:10 ` David S. Miller
2004-06-17 17:24 ` Alexey Kuznetsov
2004-06-17 17:53 ` David S. Miller
2004-06-17 19:07 ` Alexey Kuznetsov
2004-06-17 19:16 ` David S. Miller
2004-06-17 21:29 ` Herbert Xu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).