* [RFC] dev->refcnt long term holder @ 2009-11-16 18:01 Eric Dumazet 2009-11-16 19:02 ` Stephen Hemminger 2009-11-16 21:50 ` PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle Eric Dumazet 0 siblings, 2 replies; 13+ messages in thread From: Eric Dumazet @ 2009-11-16 18:01 UTC (permalink / raw) To: David S. Miller, Herbert Xu, Stephen Hemminger; +Cc: Linux Netdev List time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.266s user 0m0.000s sys 0m0.001s real 0m0.770s user 0m0.000s sys 0m0.000s real 0m1.022s user 0m0.000s sys 0m0.000s One problem of current schem in vlan dismantle phase is the holding of device done by following chain : vlan_dev_stop() -> netif_carrier_off(dev) -> linkwatch_fire_event(dev) -> dev_hold() ... And __linkwatch_run_queue() run up to one second later... Is following patch one way to avoid the problem, or should we add a new linkwatch_forgetpro_device(dev) method to immediately release the device reference (and unlink device from the list) ? (This would probably need a doubly linked list instead of single link list) Thanks diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index 4198ec5..17216f9 100644 --- a/net/8021q/vlan_dev.c +++ b/net/8021q/vlan_dev.c @@ -512,7 +512,6 @@ static int vlan_dev_stop(struct net_device *dev) if (compare_ether_addr(dev->dev_addr, real_dev->dev_addr)) dev_unicast_delete(real_dev, dev->dev_addr); - netif_carrier_off(dev); return 0; } ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [RFC] dev->refcnt long term holder 2009-11-16 18:01 [RFC] dev->refcnt long term holder Eric Dumazet @ 2009-11-16 19:02 ` Stephen Hemminger 2009-11-16 19:54 ` Eric Dumazet 2009-11-16 21:50 ` PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle Eric Dumazet 1 sibling, 1 reply; 13+ messages in thread From: Stephen Hemminger @ 2009-11-16 19:02 UTC (permalink / raw) To: Eric Dumazet; +Cc: David S. Miller, Herbert Xu, Linux Netdev List On Mon, 16 Nov 2009 19:01:37 +0100 Eric Dumazet <eric.dumazet@gmail.com> wrote: > time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 > > real 0m0.266s > user 0m0.000s > sys 0m0.001s > > real 0m0.770s > user 0m0.000s > sys 0m0.000s > > real 0m1.022s > user 0m0.000s > sys 0m0.000s > > > One problem of current schem in vlan dismantle phase is the > holding of device done by following chain : > > vlan_dev_stop() -> > netif_carrier_off(dev) -> > linkwatch_fire_event(dev) -> > dev_hold() ... > > And __linkwatch_run_queue() run up to one second later... > > Is following patch one way to avoid the problem, or should > we add a new linkwatch_forgetpro_device(dev) method to immediately > release the device reference (and unlink device from the list) ? > (This would probably need a doubly linked list instead of single link list) Or both.. -- ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC] dev->refcnt long term holder 2009-11-16 19:02 ` Stephen Hemminger @ 2009-11-16 19:54 ` Eric Dumazet 2009-11-17 8:30 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2009-11-16 19:54 UTC (permalink / raw) To: Stephen Hemminger; +Cc: David S. Miller, Herbert Xu, Linux Netdev List Stephen Hemminger a écrit : > > Or both.. > Well well well :) I was hoping a fast path, but anyway, a linkwatch_forget_dev(dev) is probably better, (in case "ip link del" closely follows an "ip link add / up") I'll post something when tested. Thanks ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC] dev->refcnt long term holder 2009-11-16 19:54 ` Eric Dumazet @ 2009-11-17 8:30 ` David Miller 2009-11-17 8:37 ` Herbert Xu 2009-11-17 17:58 ` Stephen Hemminger 0 siblings, 2 replies; 13+ messages in thread From: David Miller @ 2009-11-17 8:30 UTC (permalink / raw) To: eric.dumazet; +Cc: shemminger, herbert, netdev From: Eric Dumazet <eric.dumazet@gmail.com> Date: Mon, 16 Nov 2009 20:54:29 +0100 > I was hoping a fast path, but anyway, a linkwatch_forget_dev(dev) is > probably better, (in case "ip link del" closely follows an "ip link > add / up") > > I'll post something when tested. Let's face it, linkwatch has been a thorn in our sides for a long time. A non-stop source of problems. I suspect that here in this VLAN case you care about, the work can even be done synchronously. I'm trying to remember why we added this asynchronous link state event processing monster. It probably has something to do with needing a sleepable context. What's amusing is that linkwatch has repeatably caused RTNL deadlock issues over the years. :-) If it is purely an issue of doing link state processing outside of HW irq context: 1) PHY events in most drivers are handled in softirq (NAPI ->poll()) or a workqueue of some sort these days. If we can make all of the link state management softirq safe (it probably is, except for perhaps RTNL), these case can be done synchronously always. 2) The remaining cases are device probe, open, and close. None of which execute in HW irq context, they can sleep, and they hold RTNL already. I'm sure there are a few drivers which still can invoke netif_carrier_*() in HW interrupt context, but we can create a tiny workqueue helper or something like that for them. It won't be the main way to invoke carrier state changes, just a compatability item. Anyways, just some food for thought. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC] dev->refcnt long term holder 2009-11-17 8:30 ` David Miller @ 2009-11-17 8:37 ` Herbert Xu 2009-11-17 17:58 ` Stephen Hemminger 1 sibling, 0 replies; 13+ messages in thread From: Herbert Xu @ 2009-11-17 8:37 UTC (permalink / raw) To: David Miller; +Cc: eric.dumazet, shemminger, netdev On Tue, Nov 17, 2009 at 12:30:19AM -0800, David Miller wrote: > > 1) PHY events in most drivers are handled in softirq (NAPI ->poll()) > or a workqueue of some sort these days. If we can make all of > the link state management softirq safe (it probably is, except > for perhaps RTNL), these case can be done synchronously always. I think one of the reasons it's done this way is to call the function netdev_state_change which needs to be in process context to walk through the notifier list and do all the calls. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC] dev->refcnt long term holder 2009-11-17 8:30 ` David Miller 2009-11-17 8:37 ` Herbert Xu @ 2009-11-17 17:58 ` Stephen Hemminger 2009-11-17 19:18 ` David Miller 1 sibling, 1 reply; 13+ messages in thread From: Stephen Hemminger @ 2009-11-17 17:58 UTC (permalink / raw) To: David Miller; +Cc: eric.dumazet, herbert, netdev On Tue, 17 Nov 2009 00:30:19 -0800 (PST) David Miller <davem@davemloft.net> wrote: > I'm trying to remember why we added this asynchronous link state event > processing monster. It probably has something to do with needing a > sleepable context. What's amusing is that linkwatch has repeatably > caused RTNL deadlock issues over the years. :-) I thought it was to handle: 1) carrier on old devices would bounce, so it provides ratelimiting of state changes. Modern hardware and CPU's probably makes this a non-issue. 2) wasn't there some code path with device changes, hotplug, uevent and udev that meant that we couldn't do notifiers immediately. -- ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [RFC] dev->refcnt long term holder 2009-11-17 17:58 ` Stephen Hemminger @ 2009-11-17 19:18 ` David Miller 0 siblings, 0 replies; 13+ messages in thread From: David Miller @ 2009-11-17 19:18 UTC (permalink / raw) To: shemminger; +Cc: eric.dumazet, herbert, netdev From: Stephen Hemminger <shemminger@vyatta.com> Date: Tue, 17 Nov 2009 09:58:46 -0800 > I thought it was to handle: > 1) carrier on old devices would bounce, so it provides ratelimiting > of state changes. Modern hardware and CPU's probably makes this a non-issue. > 2) wasn't there some code path with device changes, hotplug, uevent and > udev that meant that we couldn't do notifiers immediately. I did a lot of code review in this area last night and it seems to be #1 above and simply being able to do sleeping things to send the events even though the carrier state changes happen in interrupt context. Even for what it's designed to do, it's overengineered. So what I propose is that we simplify the design and also allow direct invocation for cases where we're already in a sleepable context and/or holding RTNL. Similar to how Eric is doing in his latest linkwatch patch for VLANs. Note also that linkwatch's current implementation is the sole reason we do the real work of netdevice destruction after dropping RTNL :-) Linkwatch and unregister_netdevice() used to deadlock on RTNL. >From history-2.6 GIT: commit ff936f4e8148e75b20595eda5de6d3a4bb55b631 Author: David S. Miller <davem@nuts.ninka.net> Date: Mon May 19 04:30:48 2003 -0700 [NET]: Fix netdevice unregister races. We had two major issues when unregistering networking devices. 1) Even trying to run hotplug asynchronously could deadlock if keventd was currently trying to get the RTNL semaphore in order to process linkwatch events. 2) Unregister needs to wait for the last reference to go away before the finalization of the unregister can execute. This cannot occur under the RTNL semaphore as this is deadlock prone as well. The solution is to do all of this stuff after dropping the RTNL semaphore. rtnl_lock, if it is about to protect a region of code that could unregister network devices, registers a list to which unregistered netdevs are attached. At rtnl_unlock time this list is processed to wait for refcounts to drop to zero and then finalize the unregister. ^ permalink raw reply [flat|nested] 13+ messages in thread
* PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle 2009-11-16 18:01 [RFC] dev->refcnt long term holder Eric Dumazet 2009-11-16 19:02 ` Stephen Hemminger @ 2009-11-16 21:50 ` Eric Dumazet 2009-11-16 22:39 ` Stephen Hemminger 1 sibling, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2009-11-16 21:50 UTC (permalink / raw) To: David S. Miller, Herbert Xu, Stephen Hemminger; +Cc: Linux Netdev List time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.266s user 0m0.000s sys 0m0.001s real 0m0.770s user 0m0.000s sys 0m0.000s real 0m1.022s user 0m0.000s sys 0m0.000s One problem of current schem in vlan dismantle phase is the holding of device done by following chain : vlan_dev_stop() -> netif_carrier_off(dev) -> linkwatch_fire_event(dev) -> dev_hold() ... And __linkwatch_run_queue() runs up to one second later... A generic fix to this problem is to add a linkwatch_forget_dev() method to unlink the device from the list of watched devices. dev->link_watch_next becomes dev->link_watch_list (and use a bit more memory), to be able to unlink device in O(1). After patch : time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.024s user 0m0.000s sys 0m0.000s real 0m0.032s user 0m0.000s sys 0m0.001s real 0m0.033s user 0m0.000s sys 0m0.000s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> --- include/linux/netdevice.h | 3 +- net/core/dev.c | 3 ++ net/core/link_watch.c | 42 ++++++++++++++++++++++++------------ 3 files changed, 33 insertions(+), 15 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 7043f85..4e25730 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -896,7 +896,7 @@ struct net_device { /* device index hash chain */ struct hlist_node index_hlist; - struct net_device *link_watch_next; + struct list_head link_watch_list; /* register/unregister state machine */ enum { NETREG_UNINITIALIZED=0, @@ -1600,6 +1600,7 @@ static inline void dev_hold(struct net_device *dev) */ extern void linkwatch_fire_event(struct net_device *dev); +extern void linkwatch_forget_dev(struct net_device *dev); /** * netif_carrier_ok - test if carrier present diff --git a/net/core/dev.c b/net/core/dev.c index 4b24d79..649de02 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5066,6 +5066,8 @@ static void netdev_wait_allrefs(struct net_device *dev) { unsigned long rebroadcast_time, warning_time; + linkwatch_forget_dev(dev); + rebroadcast_time = warning_time = jiffies; while (atomic_read(&dev->refcnt) != 0) { if (time_after(jiffies, rebroadcast_time + 1 * HZ)) { @@ -5280,6 +5282,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name, INIT_LIST_HEAD(&dev->napi_list); INIT_LIST_HEAD(&dev->unreg_list); + INIT_LIST_HEAD(&dev->link_watch_list); dev->priv_flags = IFF_XMIT_DST_RELEASE; setup(dev); strcpy(dev->name, name); diff --git a/net/core/link_watch.c b/net/core/link_watch.c index bf8f7af..05fe273 100644 --- a/net/core/link_watch.c +++ b/net/core/link_watch.c @@ -35,7 +35,7 @@ static unsigned long linkwatch_nextevent; static void linkwatch_event(struct work_struct *dummy); static DECLARE_DELAYED_WORK(linkwatch_work, linkwatch_event); -static struct net_device *lweventlist; +static LIST_HEAD(lweventlist); static DEFINE_SPINLOCK(lweventlist_lock); static unsigned char default_operstate(const struct net_device *dev) @@ -89,8 +89,10 @@ static void linkwatch_add_event(struct net_device *dev) unsigned long flags; spin_lock_irqsave(&lweventlist_lock, flags); - dev->link_watch_next = lweventlist; - lweventlist = dev; + if (list_empty(&dev->link_watch_list)) { + list_add_tail(&dev->link_watch_list, &lweventlist); + dev_hold(dev); + } spin_unlock_irqrestore(&lweventlist_lock, flags); } @@ -135,7 +137,8 @@ static void linkwatch_schedule_work(int urgent) static void __linkwatch_run_queue(int urgent_only) { - struct net_device *next; + struct net_device *dev; + LIST_HEAD(wrk); /* * Limit the number of linkwatch events to one @@ -153,19 +156,18 @@ static void __linkwatch_run_queue(int urgent_only) clear_bit(LW_URGENT, &linkwatch_flags); spin_lock_irq(&lweventlist_lock); - next = lweventlist; - lweventlist = NULL; - spin_unlock_irq(&lweventlist_lock); + list_splice_init(&lweventlist, &wrk); - while (next) { - struct net_device *dev = next; + while (!list_empty(&wrk)) { - next = dev->link_watch_next; + dev = list_first_entry(&wrk, struct net_device, link_watch_list); + list_del_init(&dev->link_watch_list); if (urgent_only && !linkwatch_urgent_event(dev)) { - linkwatch_add_event(dev); + list_add_tail(&dev->link_watch_list, &lweventlist); continue; } + spin_unlock_irq(&lweventlist_lock); /* * Make sure the above read is complete since it can be @@ -189,10 +191,24 @@ static void __linkwatch_run_queue(int urgent_only) } dev_put(dev); + spin_lock_irq(&lweventlist_lock); } - if (lweventlist) + if (!list_empty(&lweventlist)) linkwatch_schedule_work(0); + spin_unlock_irq(&lweventlist_lock); +} + +void linkwatch_forget_dev(struct net_device *dev) +{ + unsigned long flags; + + spin_lock_irqsave(&lweventlist_lock, flags); + if (!list_empty(&dev->link_watch_list)) { + list_del_init(&dev->link_watch_list); + dev_put(dev); + } + spin_unlock_irqrestore(&lweventlist_lock, flags); } @@ -216,8 +232,6 @@ void linkwatch_fire_event(struct net_device *dev) bool urgent = linkwatch_urgent_event(dev); if (!test_and_set_bit(__LINK_STATE_LINKWATCH_PENDING, &dev->state)) { - dev_hold(dev); - linkwatch_add_event(dev); } else if (!urgent) return; ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle 2009-11-16 21:50 ` PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle Eric Dumazet @ 2009-11-16 22:39 ` Stephen Hemminger 2009-11-17 12:26 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Stephen Hemminger @ 2009-11-16 22:39 UTC (permalink / raw) To: Eric Dumazet; +Cc: David S. Miller, Herbert Xu, Linux Netdev List On Mon, 16 Nov 2009 22:50:48 +0100 Eric Dumazet <eric.dumazet@gmail.com> wrote: > vlan_dev_stop() -> > netif_carrier_off(dev) -> > linkwatch_fire_event(dev) -> > dev_hold() ... > > And __linkwatch_run_queue() runs up to one second later... > > A generic fix to this problem is to add a linkwatch_forget_dev() method > to unlink the device from the list of watched devices. > > dev->link_watch_next becomes dev->link_watch_list (and use a bit more memory), > to be able to unlink device in O(1). > > After patch : > time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 > > real 0m0.024s > user 0m0.000s > sys 0m0.000s > > real 0m0.032s > user 0m0.000s > sys 0m0.001s > > real 0m0.033s > user 0m0.000s > sys 0m0.000s > > > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> Acked-by: Stephen Hemminger <shemminger@vyatta.com> ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle 2009-11-16 22:39 ` Stephen Hemminger @ 2009-11-17 12:26 ` David Miller 2009-11-17 12:31 ` Herbert Xu 0 siblings, 1 reply; 13+ messages in thread From: David Miller @ 2009-11-17 12:26 UTC (permalink / raw) To: shemminger; +Cc: eric.dumazet, herbert, netdev From: Stephen Hemminger <shemminger@vyatta.com> Date: Mon, 16 Nov 2009 14:39:17 -0800 > On Mon, 16 Nov 2009 22:50:48 +0100 > Eric Dumazet <eric.dumazet@gmail.com> wrote: > >> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> > > Acked-by: Stephen Hemminger <shemminger@vyatta.com> Is this really valid? The whole point in emitting the netif_carrier_off() is so that applications see the event in userspace and therefore can clean things up. Sure, the kernel will no longer make the device visible, and therefore the application can't operate on it any longer. But the application is deserved of receiving the event anyways so that it can clean up internal state and datastructures. It seem to me that in this ->stop() case we'll now elide the event emission, and I don't see how that can be right. Really, the link watch stuff is just due for a redesign. I don't think a simple hack is going to cut it this time, sorry Eric :-) ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle 2009-11-17 12:26 ` David Miller @ 2009-11-17 12:31 ` Herbert Xu 2009-11-17 15:59 ` Eric Dumazet 0 siblings, 1 reply; 13+ messages in thread From: Herbert Xu @ 2009-11-17 12:31 UTC (permalink / raw) To: David Miller; +Cc: shemminger, eric.dumazet, netdev On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote: > > Really, the link watch stuff is just due for a redesign. I don't > think a simple hack is going to cut it this time, sorry Eric :-) I have no objections against any redesigns, but since the only caller of linkwatch_forget_dev runs in process context with the RTNL, it could also legally emit those events. Cheers, -- Visit Openswan at http://www.openswan.org/ Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au> Home Page: http://gondor.apana.org.au/~herbert/ PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle 2009-11-17 12:31 ` Herbert Xu @ 2009-11-17 15:59 ` Eric Dumazet 2009-11-18 13:05 ` David Miller 0 siblings, 1 reply; 13+ messages in thread From: Eric Dumazet @ 2009-11-17 15:59 UTC (permalink / raw) To: Herbert Xu, David Miller; +Cc: shemminger, netdev Herbert Xu a écrit : > On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote: >> Really, the link watch stuff is just due for a redesign. I don't >> think a simple hack is going to cut it this time, sorry Eric :-) > > I have no objections against any redesigns, but since the only > caller of linkwatch_forget_dev runs in process context with the > RTNL, it could also legally emit those events. Thanks guys, here an updated version then, before linkwatch surgery ? In this version, I force the event to be sent synchronously. [PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.266s user 0m0.000s sys 0m0.001s real 0m0.770s user 0m0.000s sys 0m0.000s real 0m1.022s user 0m0.000s sys 0m0.000s One problem of current schem in vlan dismantle phase is the holding of device done by following chain : vlan_dev_stop() -> netif_carrier_off(dev) -> linkwatch_fire_event(dev) -> dev_hold() ... And __linkwatch_run_queue() runs up to one second later... A generic fix to this problem is to add a linkwatch_forget_dev() method to unlink the device from the list of watched devices. dev->link_watch_next becomes dev->link_watch_list (and use a bit more memory), to be able to unlink device in O(1). After patch : time ip link del eth3.103 ; time ip link del eth3.104 ; time ip link del eth3.105 real 0m0.024s user 0m0.000s sys 0m0.000s real 0m0.032s user 0m0.000s sys 0m0.001s real 0m0.033s user 0m0.000s sys 0m0.000s Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com> --- include/linux/netdevice.h | 3 - net/core/dev.c | 3 + net/core/link_watch.c | 94 +++++++++++++++++++++--------------- 3 files changed, 62 insertions(+), 38 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 7043f85..4e25730 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -896,7 +896,7 @@ struct net_device { /* device index hash chain */ struct hlist_node index_hlist; - struct net_device *link_watch_next; + struct list_head link_watch_list; /* register/unregister state machine */ enum { NETREG_UNINITIALIZED=0, @@ -1600,6 +1600,7 @@ static inline void dev_hold(struct net_device *dev) */ extern void linkwatch_fire_event(struct net_device *dev); +extern void linkwatch_forget_dev(struct net_device *dev); /** * netif_carrier_ok - test if carrier present diff --git a/net/core/dev.c b/net/core/dev.c index d867522..9b58c04 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5067,6 +5067,8 @@ static void netdev_wait_allrefs(struct net_device *dev) { unsigned long rebroadcast_time, warning_time; + linkwatch_forget_dev(dev); + rebroadcast_time = warning_time = jiffies; while (atomic_read(&dev->refcnt) != 0) { if (time_after(jiffies, rebroadcast_time + 1 * HZ)) { @@ -5281,6 +5283,7 @@ struct net_device *alloc_netdev_mq(int sizeof_priv, const char *name, INIT_LIST_HEAD(&dev->napi_list); INIT_LIST_HEAD(&dev->unreg_list); + INIT_LIST_HEAD(&dev->link_watch_list); dev->priv_flags = IFF_XMIT_DST_RELEASE; setup(dev); strcpy(dev->name, name); diff --git a/net/core/link_watch.c b/net/core/link_watch.c index bf8f7af..5910b55 100644 --- a/net/core/link_watch.c +++ b/net/core/link_watch.c @@ -35,7 +35,7 @@ static unsigned long linkwatch_nextevent; static void linkwatch_event(struct work_struct *dummy); static DECLARE_DELAYED_WORK(linkwatch_work, linkwatch_event); -static struct net_device *lweventlist; +static LIST_HEAD(lweventlist); static DEFINE_SPINLOCK(lweventlist_lock); static unsigned char default_operstate(const struct net_device *dev) @@ -89,8 +89,10 @@ static void linkwatch_add_event(struct net_device *dev) unsigned long flags; spin_lock_irqsave(&lweventlist_lock, flags); - dev->link_watch_next = lweventlist; - lweventlist = dev; + if (list_empty(&dev->link_watch_list)) { + list_add_tail(&dev->link_watch_list, &lweventlist); + dev_hold(dev); + } spin_unlock_irqrestore(&lweventlist_lock, flags); } @@ -133,9 +135,35 @@ static void linkwatch_schedule_work(int urgent) } +static void linkwatch_do_dev(struct net_device *dev) +{ + /* + * Make sure the above read is complete since it can be + * rewritten as soon as we clear the bit below. + */ + smp_mb__before_clear_bit(); + + /* We are about to handle this device, + * so new events can be accepted + */ + clear_bit(__LINK_STATE_LINKWATCH_PENDING, &dev->state); + + rfc2863_policy(dev); + if (dev->flags & IFF_UP) { + if (netif_carrier_ok(dev)) + dev_activate(dev); + else + dev_deactivate(dev); + + netdev_state_change(dev); + } + dev_put(dev); +} + static void __linkwatch_run_queue(int urgent_only) { - struct net_device *next; + struct net_device *dev; + LIST_HEAD(wrk); /* * Limit the number of linkwatch events to one @@ -153,46 +181,40 @@ static void __linkwatch_run_queue(int urgent_only) clear_bit(LW_URGENT, &linkwatch_flags); spin_lock_irq(&lweventlist_lock); - next = lweventlist; - lweventlist = NULL; - spin_unlock_irq(&lweventlist_lock); + list_splice_init(&lweventlist, &wrk); - while (next) { - struct net_device *dev = next; + while (!list_empty(&wrk)) { - next = dev->link_watch_next; + dev = list_first_entry(&wrk, struct net_device, link_watch_list); + list_del_init(&dev->link_watch_list); if (urgent_only && !linkwatch_urgent_event(dev)) { - linkwatch_add_event(dev); + list_add_tail(&dev->link_watch_list, &lweventlist); continue; } - - /* - * Make sure the above read is complete since it can be - * rewritten as soon as we clear the bit below. - */ - smp_mb__before_clear_bit(); - - /* We are about to handle this device, - * so new events can be accepted - */ - clear_bit(__LINK_STATE_LINKWATCH_PENDING, &dev->state); - - rfc2863_policy(dev); - if (dev->flags & IFF_UP) { - if (netif_carrier_ok(dev)) - dev_activate(dev); - else - dev_deactivate(dev); - - netdev_state_change(dev); - } - - dev_put(dev); + spin_unlock_irq(&lweventlist_lock); + linkwatch_do_dev(dev); + spin_lock_irq(&lweventlist_lock); } - if (lweventlist) + if (!list_empty(&lweventlist)) linkwatch_schedule_work(0); + spin_unlock_irq(&lweventlist_lock); +} + +void linkwatch_forget_dev(struct net_device *dev) +{ + unsigned long flags; + int clean = 0; + + spin_lock_irqsave(&lweventlist_lock, flags); + if (!list_empty(&dev->link_watch_list)) { + list_del_init(&dev->link_watch_list); + clean = 1; + } + spin_unlock_irqrestore(&lweventlist_lock, flags); + if (clean) + linkwatch_do_dev(dev); } @@ -216,8 +238,6 @@ void linkwatch_fire_event(struct net_device *dev) bool urgent = linkwatch_urgent_event(dev); if (!test_and_set_bit(__LINK_STATE_LINKWATCH_PENDING, &dev->state)) { - dev_hold(dev); - linkwatch_add_event(dev); } else if (!urgent) return; ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle 2009-11-17 15:59 ` Eric Dumazet @ 2009-11-18 13:05 ` David Miller 0 siblings, 0 replies; 13+ messages in thread From: David Miller @ 2009-11-18 13:05 UTC (permalink / raw) To: eric.dumazet; +Cc: herbert, shemminger, netdev From: Eric Dumazet <eric.dumazet@gmail.com> Date: Tue, 17 Nov 2009 16:59:21 +0100 > Herbert Xu a écrit : >> On Tue, Nov 17, 2009 at 04:26:04AM -0800, David Miller wrote: >>> Really, the link watch stuff is just due for a redesign. I don't >>> think a simple hack is going to cut it this time, sorry Eric :-) >> >> I have no objections against any redesigns, but since the only >> caller of linkwatch_forget_dev runs in process context with the >> RTNL, it could also legally emit those events. > > Thanks guys, here an updated version then, before linkwatch surgery ? Sure, applied to net-next-2.6, thanks. ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2009-11-18 13:05 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-11-16 18:01 [RFC] dev->refcnt long term holder Eric Dumazet 2009-11-16 19:02 ` Stephen Hemminger 2009-11-16 19:54 ` Eric Dumazet 2009-11-17 8:30 ` David Miller 2009-11-17 8:37 ` Herbert Xu 2009-11-17 17:58 ` Stephen Hemminger 2009-11-17 19:18 ` David Miller 2009-11-16 21:50 ` PATCH net-next-2.6] linkwatch: linkwatch_forget_dev() to speedup device dismantle Eric Dumazet 2009-11-16 22:39 ` Stephen Hemminger 2009-11-17 12:26 ` David Miller 2009-11-17 12:31 ` Herbert Xu 2009-11-17 15:59 ` Eric Dumazet 2009-11-18 13:05 ` David Miller
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).