* [RFC] NETDEV_UNREGISTER_BATCH seems unused nowaday ? @ 2012-08-10 9:27 Eric Dumazet 2012-08-10 10:42 ` David Miller 0 siblings, 1 reply; 14+ messages in thread From: Eric Dumazet @ 2012-08-10 9:27 UTC (permalink / raw) To: David Miller; +Cc: netdev, Eric W. Biederman NETDEV_UNREGISTER_BATCH seems unused we can probably remove it. I am tracking a device refcount issue, delaying net device dismantle by 1 second in netdev_wait_allrefs() I guess we need to add a notifier called _after_ the final synchronize_net() in rollback_registered_many() [ 52.022261] calling rtnetlink_dev_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022266] calling fib_rules_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022270] calling arp_netdev_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022274] calling ip_netdev_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022277] calling fib_netdev_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022280] calling ip_mr_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022284] calling ndisc_netdev_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022287] calling ip6_route_dev_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022290] calling ipv6_dev_notf msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022293] calling packet_netdev_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022297] calling dst_dev_notifier msg/val 9 data ffff880610bea000 refcnt 15->15 [ 52.022331] calling rtnetlink_dev_notifier msg/val 2 data ffff880610bea000 refcnt 15->15 [ 52.022335] calling fib_rules_notifier msg/val 2 data ffff880610bea000 refcnt 15->15 [ 52.022340] calling arp_netdev_notifier msg/val 2 data ffff880610bea000 refcnt 15->15 [ 52.022350] calling ip_netdev_notifier msg/val 2 data ffff880610bea000 refcnt 15->15 [ 52.022374] arp: arp_ifdown dev ffff880610bea000 refcnt 15 [ 52.022382] calling fib_netdev_notifier msg/val 2 data ffff880610bea000 refcnt 15->14 [ 52.022931] IPv4: rt_free(ffff8802fee7e180) [ 52.023394] calling ip_mr_notifier msg/val 2 data ffff880610bea000 refcnt 14->14 [ 52.023402] IPv6: dst_free(ffff88060b680480) refcnt 0 [ 52.023404] IPv6: dst_free(ffff880307f7cf00) refcnt 0 [ 52.023407] IPv6: dst_free(ffff880307f7cd80) refcnt 0 [ 52.023409] IPv6: dst_free(ffff880307f7cc00) refcnt 0 [ 52.023410] IPv6: dst_free(ffff88060b680180) refcnt 0 [ 52.023412] IPv6: dst_free(ffff88060810c780) refcnt 0 [ 52.023414] IPv6: dst_free(ffff88060810c600) refcnt 0 [ 52.023415] IPv6: dst_free(ffff88060810c480) refcnt 0 [ 52.023421] calling ndisc_netdev_notifier msg/val 2 data ffff880610bea000 refcnt 14->14 [ 52.023424] calling ip6_route_dev_notifier msg/val 2 data ffff880610bea000 refcnt 14->14 [ 52.023432] calling ipv6_dev_notf msg/val 2 data ffff880610bea000 refcnt 14->14 [ 52.023436] calling packet_netdev_notifier msg/val 2 data ffff880610bea000 refcnt 14->14 [ 52.023522] dst_ifdown dst ffff8802fee7e000 dev ffff880610bea000 dst->dev ffff880311b33000 unreg 0 dev->refcnt 14 [ 52.023525] calling dst_dev_notifier msg/val 2 data ffff880610bea000 refcnt 14->14 [ 52.024267] calling rtnetlink_dev_notifier msg/val 6 data ffff880610bea000 refcnt 13->13 [ 52.024272] calling fib_rules_notifier msg/val 6 data ffff880610bea000 refcnt 13->13 [ 52.024277] calling arp_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 13->13 [ 52.024279] inetdev_destroy() [ 52.024315] arp: arp_ifdown dev ffff880610bea000 refcnt 13 [ 52.024421] arp: arp_ifdown dev ffff880610bea000 refcnt 12 [ 52.024425] calling ip_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 13->12 [ 52.024430] arp: arp_ifdown dev ffff880610bea000 refcnt 12 [ 52.024434] calling fib_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 12->12 [ 52.024437] calling ip_mr_notifier msg/val 6 data ffff880610bea000 refcnt 12->12 [ 52.024440] calling ndisc_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 12->12 [ 52.024443] calling ip6_route_dev_notifier msg/val 6 data ffff880610bea000 refcnt 12->12 [ 52.024458] calling ipv6_dev_notf msg/val 6 data ffff880610bea000 refcnt 12->10 [ 52.024462] calling packet_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 10->9 [ 52.024465] dst_ifdown dst ffff8802fee7e000 dev ffff880610bea000 dst->dev ffff880311b33000 unreg 1 dev->refcnt 9 [ 52.024468] calling dst_dev_notifier msg/val 6 data ffff880610bea000 refcnt 9->9 [ 52.024564] calling rtnetlink_dev_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024569] calling fib_rules_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024574] calling arp_netdev_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024579] calling ip_netdev_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024583] calling fib_netdev_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024586] calling ip_mr_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024589] calling ndisc_netdev_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024592] calling ip6_route_dev_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024595] calling ipv6_dev_notf msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024598] calling packet_netdev_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.024601] calling dst_dev_notifier msg/val 17 data ffff880610bea000 refcnt 7->7 [ 52.028214] IPv4: dst_free(ffff8802fee7e180) refcnt 0 [ 52.033207] dst_free(ffff88060816e0c0) refcnt 1 [ 52.033212] adding dst ffff88060816e0c0 to garbage [ 52.033214] ------------[ cut here ]------------ [ 52.033221] WARNING: at net/core/dst.c:215 __dst_free+0x57/0xf0() [ 52.033224] Hardware name: HP Z600 Workstation [ 52.033225] Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables xt_state ipt_REJECT fuse iptable_mangle iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bridge stp nfsd auth_rpcgss exportfs nfs lockd sunrpc asix usbnet rt61pci crc_itu_t rt2x00pci rt2x00lib eeprom_93cx6 tg3 ixgbe mdio igb [ 52.033254] Pid: 0, comm: swapper/6 Tainted: G W 3.5.0+ #764 [ 52.033255] Call Trace: [ 52.033257] <IRQ> [<ffffffff8103de0f>] warn_slowpath_common+0x7f/0xc0 [ 52.033266] [<ffffffff8103de6a>] warn_slowpath_null+0x1a/0x20 [ 52.033270] [<ffffffff8155d247>] __dst_free+0x57/0xf0 [ 52.033277] [<ffffffff815c96ba>] free_fib_info_rcu+0x18a/0x220 [ 52.033282] [<ffffffff810c4897>] rcu_process_callbacks+0x1f7/0x580 [ 52.033287] [<ffffffff81066869>] ? enqueue_hrtimer+0x39/0xe0 [ 52.033292] [<ffffffff81046d60>] __do_softirq+0xc0/0x260 [ 52.033296] [<ffffffff81067939>] ? hrtimer_interrupt+0x169/0x260 [ 52.033301] [<ffffffff816d006c>] call_softirq+0x1c/0x30 [ 52.033306] [<ffffffff81004225>] do_softirq+0x55/0x90 [ 52.033309] [<ffffffff8104720e>] irq_exit+0x9e/0xc0 [ 52.033313] [<ffffffff816d07ee>] smp_apic_timer_interrupt+0x6e/0x99 [ 52.033316] [<ffffffff816cf987>] apic_timer_interrupt+0x67/0x70 [ 52.033318] <EOI> [<ffffffff81300230>] ? intel_idle+0xf0/0x150 [ 52.033326] [<ffffffff8130020f>] ? intel_idle+0xcf/0x150 [ 52.033331] [<ffffffff814ac779>] cpuidle_enter+0x19/0x20 [ 52.033335] [<ffffffff814ace0c>] cpuidle_idle_call+0xac/0x310 [ 52.033340] [<ffffffff8100b6f5>] cpu_idle+0x85/0xd0 [ 52.033345] [<ffffffff816b65de>] start_secondary+0x21b/0x21d [ 52.033347] ---[ end trace 555e476df6e4a6f3 ]--- [ 52.033357] dst_free(ffff88060816e000) refcnt 1 [ 52.033359] adding dst ffff88060816e000 to garbage [ 52.033360] ------------[ cut here ]------------ [ 52.033363] WARNING: at net/core/dst.c:215 __dst_free+0x57/0xf0() [ 52.033365] Hardware name: HP Z600 Workstation [ 52.033366] Modules linked in: tun ip6table_filter ip6_tables ebtable_nat ebtables xt_state ipt_REJECT fuse iptable_mangle iptable_filter ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 bridge stp nfsd auth_rpcgss exportfs nfs lockd sunrpc asix usbnet rt61pci crc_itu_t rt2x00pci rt2x00lib eeprom_93cx6 tg3 ixgbe mdio igb [ 52.033390] Pid: 0, comm: swapper/6 Tainted: G W 3.5.0+ #764 [ 52.033392] Call Trace: [ 52.033393] <IRQ> [<ffffffff8103de0f>] warn_slowpath_common+0x7f/0xc0 [ 52.033399] [<ffffffff8103de6a>] warn_slowpath_null+0x1a/0x20 [ 52.033402] [<ffffffff8155d247>] __dst_free+0x57/0xf0 [ 52.033406] [<ffffffff815c9740>] free_fib_info_rcu+0x210/0x220 [ 52.033410] [<ffffffff810c4897>] rcu_process_callbacks+0x1f7/0x580 [ 52.033414] [<ffffffff81066869>] ? enqueue_hrtimer+0x39/0xe0 [ 52.033418] [<ffffffff81046d60>] __do_softirq+0xc0/0x260 [ 52.033421] [<ffffffff81067939>] ? hrtimer_interrupt+0x169/0x260 [ 52.033425] [<ffffffff816d006c>] call_softirq+0x1c/0x30 [ 52.033429] [<ffffffff81004225>] do_softirq+0x55/0x90 [ 52.033432] [<ffffffff8104720e>] irq_exit+0x9e/0xc0 [ 52.033436] [<ffffffff816d07ee>] smp_apic_timer_interrupt+0x6e/0x99 [ 52.033439] [<ffffffff816cf987>] apic_timer_interrupt+0x67/0x70 [ 52.033440] <EOI> [<ffffffff81300230>] ? intel_idle+0xf0/0x150 [ 52.033447] [<ffffffff8130020f>] ? intel_idle+0xcf/0x150 [ 52.033451] [<ffffffff814ac779>] cpuidle_enter+0x19/0x20 [ 52.033455] [<ffffffff814ace0c>] cpuidle_idle_call+0xac/0x310 [ 52.033458] [<ffffffff8100b6f5>] cpu_idle+0x85/0xd0 [ 52.033462] [<ffffffff816b65de>] start_secondary+0x21b/0x21d [ 52.033464] ---[ end trace 555e476df6e4a6f4 ]--- [ 52.132982] dst_gc_task() [ 52.132986] delayed 3 performed 0 [ 52.633325] IPv4: rt_free(ffff8802fee7e3c0) [ 52.633827] IPv4: rt_free(ffff8802fee7e240) [ 52.641632] IPv4: dst_free(ffff8802fee7e3c0) refcnt 0 [ 52.641634] IPv4: dst_free(ffff8802fee7e240) refcnt 0 [ 52.732498] dst_gc_task() [ 52.732502] delayed 3 performed 0 [ 53.039739] calling rtnetlink_dev_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039744] calling fib_rules_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039748] calling arp_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039751] calling ip_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039758] arp: arp_ifdown dev ffff880610bea000 refcnt 1 [ 53.039762] calling fib_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039766] calling ip_mr_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039769] calling ndisc_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039772] calling ip6_route_dev_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039778] calling ipv6_dev_notf msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039781] calling packet_netdev_notifier msg/val 6 data ffff880610bea000 refcnt 1->1 [ 53.039785] dst_ifdown dst ffff8802fee7e000 dev ffff880610bea000 dst->dev ffff880311b33000 unreg 1 dev->refcnt 1 [ 53.039788] dst_ifdown dst ffff88060816e000 dev ffff880610bea000 dst->dev ffff880311b33000 unreg 1 dev->refcnt 1 [ 53.039791] dst_ifdown dst ffff88060816e0c0 dev ffff880610bea000 dst->dev ffff880610bea000 unreg 1 dev->refcnt 1 [ 53.039794] calling dst_dev_notifier msg/val 6 data ffff880610bea000 refcnt 1->0 ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] NETDEV_UNREGISTER_BATCH seems unused nowaday ? 2012-08-10 9:27 [RFC] NETDEV_UNREGISTER_BATCH seems unused nowaday ? Eric Dumazet @ 2012-08-10 10:42 ` David Miller 2012-08-10 11:06 ` Eric Dumazet 0 siblings, 1 reply; 14+ messages in thread From: David Miller @ 2012-08-10 10:42 UTC (permalink / raw) To: eric.dumazet; +Cc: netdev, ebiederm From: Eric Dumazet <eric.dumazet@gmail.com> Date: Fri, 10 Aug 2012 11:27:04 +0200 > NETDEV_UNREGISTER_BATCH seems unused we can probably remove it. Indeed, the routing cache was the final real user. > I am tracking a device refcount issue, delaying net device dismantle by > 1 second in netdev_wait_allrefs() > > I guess we need to add a notifier called _after_ the final > synchronize_net() in rollback_registered_many() It's essentially caused by DST_GC_INC, right? ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] NETDEV_UNREGISTER_BATCH seems unused nowaday ? 2012-08-10 10:42 ` David Miller @ 2012-08-10 11:06 ` Eric Dumazet 2012-08-10 14:45 ` Eric W. Biederman 2012-08-10 14:46 ` [PATCH] net: remove delay at device dismantle Eric Dumazet 0 siblings, 2 replies; 14+ messages in thread From: Eric Dumazet @ 2012-08-10 11:06 UTC (permalink / raw) To: David Miller; +Cc: netdev, ebiederm On Fri, 2012-08-10 at 03:42 -0700, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Fri, 10 Aug 2012 11:27:04 +0200 > > > NETDEV_UNREGISTER_BATCH seems unused we can probably remove it. > > Indeed, the routing cache was the final real user. > > > I am tracking a device refcount issue, delaying net device dismantle by > > 1 second in netdev_wait_allrefs() > > > > I guess we need to add a notifier called _after_ the final > > synchronize_net() in rollback_registered_many() > > It's essentially caused by DST_GC_INC, right? No, we in fact need a rcu_barrier(), then another call to dst_dev_event(). rcu_barrier() is needed so that in-flight call_rcu() of routes (from rt_free()) are completed. Or else we miss these dst in the dst_dev_event(). I have a working patch, adding the rcu_barrier() and one additional NETDEV_UNREGISTER_FINAL event. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] NETDEV_UNREGISTER_BATCH seems unused nowaday ? 2012-08-10 11:06 ` Eric Dumazet @ 2012-08-10 14:45 ` Eric W. Biederman 2012-08-10 15:01 ` Eric Dumazet 2012-08-10 17:55 ` Paul E. McKenney 2012-08-10 14:46 ` [PATCH] net: remove delay at device dismantle Eric Dumazet 1 sibling, 2 replies; 14+ messages in thread From: Eric W. Biederman @ 2012-08-10 14:45 UTC (permalink / raw) To: Eric Dumazet; +Cc: David Miller, netdev Eric Dumazet <eric.dumazet@gmail.com> writes: > On Fri, 2012-08-10 at 03:42 -0700, David Miller wrote: >> From: Eric Dumazet <eric.dumazet@gmail.com> >> Date: Fri, 10 Aug 2012 11:27:04 +0200 >> >> > NETDEV_UNREGISTER_BATCH seems unused we can probably remove it. >> >> Indeed, the routing cache was the final real user. >> >> > I am tracking a device refcount issue, delaying net device dismantle by >> > 1 second in netdev_wait_allrefs() >> > >> > I guess we need to add a notifier called _after_ the final >> > synchronize_net() in rollback_registered_many() >> >> It's essentially caused by DST_GC_INC, right? > > No, we in fact need a rcu_barrier(), then another call to > dst_dev_event(). > > rcu_barrier() is needed so that in-flight call_rcu() of routes (from > rt_free()) are completed. Or else we miss these dst in the > dst_dev_event(). > I have a working patch, adding the rcu_barrier() and one additional > NETDEV_UNREGISTER_FINAL event. Can someone help bring me up to speed. What has changed in the dst ref counting that has invalidated our previous solutions? As for the idea of putting an rcu_barrier inside of the rtnl_lock. I really don't like it. You are trading off a 1000ms singled threaded wait without locks for extending the hold times of rtnl lock by 12ms or so. We already have an rcu_barrier on that path in netdev_run_todo, so we can reorganize things to use that barrier I would be much happer. Furthermore I talked to Paul McKenney a while ago about creating an rcu_barrier expedited and he really did not like the idea. Reading through the code we really should get dst_rcu_free out of the header and make it non-line. dst_rcu_free can't possibly be called from a location where it can be inlined. Trying to understand your analysis I have stared at the code for a while and I am definitely not seeing any rcu callbacks that result in calling rt_free. So one of us is missing something. All I am seeing from your trace is one call of rtnetlink_dev_notifier the refcount is at 7 and the next call of rtnetlink_dev_notifier the refcnt has dropped to 1. Eric ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] NETDEV_UNREGISTER_BATCH seems unused nowaday ? 2012-08-10 14:45 ` Eric W. Biederman @ 2012-08-10 15:01 ` Eric Dumazet 2012-08-10 17:55 ` Paul E. McKenney 1 sibling, 0 replies; 14+ messages in thread From: Eric Dumazet @ 2012-08-10 15:01 UTC (permalink / raw) To: Eric W. Biederman; +Cc: David Miller, netdev On Fri, 2012-08-10 at 07:45 -0700, Eric W. Biederman wrote: > Can someone help bring me up to speed. What has changed in the > dst ref counting that has invalidated our previous solutions? > In fact your patch (850a545bd8a416484 net: Move rcu_barrier from rollback_registered_many to netdev_run_todo.) reinstated the problem again. You didnt notice, but other people can see the problem. > As for the idea of putting an rcu_barrier inside of the rtnl_lock. I > really don't like it. You are trading off a 1000ms singled threaded wait > without locks for extending the hold times of rtnl lock by 12ms or so. > We probably can keep the rcu_barrier() in netdev_run_todo() and kick the UNREGISTER_FINAL in netdev_run_todo() too. I'll try that. > We already have an rcu_barrier on that path in netdev_run_todo, > so we can reorganize things to use that barrier I would be much > happer. Furthermore I talked to Paul McKenney a while ago > about creating an rcu_barrier expedited and he really did not > like the idea. > > Reading through the code we really should get dst_rcu_free > out of the header and make it non-line. dst_rcu_free can't > possibly be called from a location where it can be inlined. > > Trying to understand your analysis I have stared at the code for > a while and I am definitely not seeing any rcu callbacks that > result in calling rt_free. So one of us is missing something. Recent commits add the rt_free() calls. Of course, if you unregister a dummy device you wont see the problem. If you unregister a real device, you definitely hit the problem. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC] NETDEV_UNREGISTER_BATCH seems unused nowaday ? 2012-08-10 14:45 ` Eric W. Biederman 2012-08-10 15:01 ` Eric Dumazet @ 2012-08-10 17:55 ` Paul E. McKenney 1 sibling, 0 replies; 14+ messages in thread From: Paul E. McKenney @ 2012-08-10 17:55 UTC (permalink / raw) To: Eric W. Biederman; +Cc: Eric Dumazet, David Miller, netdev On Fri, Aug 10, 2012 at 07:45:48AM -0700, Eric W. Biederman wrote: > Eric Dumazet <eric.dumazet@gmail.com> writes: > > > On Fri, 2012-08-10 at 03:42 -0700, David Miller wrote: > >> From: Eric Dumazet <eric.dumazet@gmail.com> > >> Date: Fri, 10 Aug 2012 11:27:04 +0200 > >> > >> > NETDEV_UNREGISTER_BATCH seems unused we can probably remove it. > >> > >> Indeed, the routing cache was the final real user. > >> > >> > I am tracking a device refcount issue, delaying net device dismantle by > >> > 1 second in netdev_wait_allrefs() > >> > > >> > I guess we need to add a notifier called _after_ the final > >> > synchronize_net() in rollback_registered_many() > >> > >> It's essentially caused by DST_GC_INC, right? > > > > No, we in fact need a rcu_barrier(), then another call to > > dst_dev_event(). > > > > rcu_barrier() is needed so that in-flight call_rcu() of routes (from > > rt_free()) are completed. Or else we miss these dst in the > > dst_dev_event(). > > > I have a working patch, adding the rcu_barrier() and one additional > > NETDEV_UNREGISTER_FINAL event. > > Can someone help bring me up to speed. What has changed in the > dst ref counting that has invalidated our previous solutions? > > As for the idea of putting an rcu_barrier inside of the rtnl_lock. I > really don't like it. You are trading off a 1000ms singled threaded wait > without locks for extending the hold times of rtnl lock by 12ms or so. > > We already have an rcu_barrier on that path in netdev_run_todo, > so we can reorganize things to use that barrier I would be much > happer. Furthermore I talked to Paul McKenney a while ago > about creating an rcu_barrier expedited and he really did not > like the idea. For whatever it is worth, I do have rcu_barrier_expedited() on my list of things that at least one person has expressed interest in, but that I do not yet have a good solution for. Obstacles include the following: 1. If a given CPU has lots of callbacks, but is running a real-time process, what do you do? (a) Hammer the real-time process? (b) Make rcu_barrier_expedited() wait (current likely choice)? (c) Handle via a set priority for callback processing (which might be the case for BOOST_RCU builds)? (d) Force migration of the callbacks (mmmaybe...)? (e) Force migration of the real-time process (ouch!)? 2. Ditto, but huge numbers of callbacks and non-realtime processes. Similar solution space. 3. There will be some real-time disruption from any reasonable implementation of rcu_barrier_expedited(). Maybe the RT guys choose to map it to rcu_barrier()? On the other hand, in the same email thread back in May you were also looking for a kmem_cache_free_rcu(). At the time I didn't have a good solution for this, but I do believe that I have one now. ;-) Thanx, Paul > Reading through the code we really should get dst_rcu_free > out of the header and make it non-line. dst_rcu_free can't > possibly be called from a location where it can be inlined. > > Trying to understand your analysis I have stared at the code for > a while and I am definitely not seeing any rcu callbacks that > result in calling rt_free. So one of us is missing something. > > All I am seeing from your trace is one call of rtnetlink_dev_notifier > the refcount is at 7 and the next call of rtnetlink_dev_notifier the > refcnt has dropped to 1. > > Eric > > > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH] net: remove delay at device dismantle 2012-08-10 11:06 ` Eric Dumazet 2012-08-10 14:45 ` Eric W. Biederman @ 2012-08-10 14:46 ` Eric Dumazet 2012-08-10 15:42 ` [PATCH net-next v2] " Eric Dumazet 1 sibling, 1 reply; 14+ messages in thread From: Eric Dumazet @ 2012-08-10 14:46 UTC (permalink / raw) To: David Miller; +Cc: netdev, Tom Herbert, Mahesh Bandewar From: Eric Dumazet <edumazet@google.com> I noticed extra one second delay in device dismantle, tracked down to a call to dst_dev_event() while some call_rcu() are still in RCU queues. These call_rcu() were posted by rt_free(struct rtable *rt) calls. We then wait a little (but one second) in netdev_wait_allrefs() before kicking again NETDEV_UNREGISTER. As the call_rcu() are now completed, dst_dev_event() can do the needed device swap on busy dst. To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called in rollback_registered_many() after a rcu_barrier(). Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after IP cache removal. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> --- include/linux/netdevice.h | 2 +- net/core/dev.c | 23 +++++++---------------- net/core/dst.c | 2 +- net/core/rtnetlink.c | 2 +- net/ipv4/fib_frontend.c | 2 -- 5 files changed, 10 insertions(+), 21 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index a9db4f3..e3a43fd 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1550,7 +1550,7 @@ struct packet_type { #define NETDEV_PRE_TYPE_CHANGE 0x000E #define NETDEV_POST_TYPE_CHANGE 0x000F #define NETDEV_POST_INIT 0x0010 -#define NETDEV_UNREGISTER_BATCH 0x0011 +#define NETDEV_UNREGISTER_FINAL 0x0011 #define NETDEV_RELEASE 0x0012 #define NETDEV_NOTIFY_PEERS 0x0013 #define NETDEV_JOIN 0x0014 diff --git a/net/core/dev.c b/net/core/dev.c index a39354e..be3a5af 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1394,7 +1394,6 @@ rollback: nb->notifier_call(nb, NETDEV_DOWN, dev); } nb->notifier_call(nb, NETDEV_UNREGISTER, dev); - nb->notifier_call(nb, NETDEV_UNREGISTER_BATCH, dev); } } @@ -1436,7 +1435,6 @@ int unregister_netdevice_notifier(struct notifier_block *nb) nb->notifier_call(nb, NETDEV_DOWN, dev); } nb->notifier_call(nb, NETDEV_UNREGISTER, dev); - nb->notifier_call(nb, NETDEV_UNREGISTER_BATCH, dev); } } unlock: @@ -5307,14 +5305,14 @@ static void rollback_registered_many(struct list_head *head) netdev_unregister_kobject(dev); } - /* Process any work delayed until the end of the batch */ - dev = list_first_entry(head, struct net_device, unreg_list); - call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH, dev); - synchronize_net(); + /* Wait for rcu callbacks to finish before next phase */ + rcu_barrier(); - list_for_each_entry(dev, head, unreg_list) + list_for_each_entry(dev, head, unreg_list) { + call_netdevice_notifiers(NETDEV_UNREGISTER_FINAL, dev); dev_put(dev); + } } static void rollback_registered(struct net_device *dev) @@ -5757,9 +5755,8 @@ static void netdev_wait_allrefs(struct net_device *dev) /* Rebroadcast unregister notification */ call_netdevice_notifiers(NETDEV_UNREGISTER, dev); - /* don't resend NETDEV_UNREGISTER_BATCH, _BATCH users - * should have already handle it the first time */ - + rcu_barrier(); + call_netdevice_notifiers(NETDEV_UNREGISTER_FINAL, dev); if (test_bit(__LINK_STATE_LINKWATCH_PENDING, &dev->state)) { /* We must not have linkwatch events @@ -5821,11 +5818,6 @@ void netdev_run_todo(void) __rtnl_unlock(); - /* Wait for rcu callbacks to finish before attempting to drain - * the device list. This usually avoids a 250ms wait. - */ - if (!list_empty(&list)) - rcu_barrier(); while (!list_empty(&list)) { struct net_device *dev @@ -6226,7 +6218,6 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char the device is just moving and can keep their slaves up. */ call_netdevice_notifiers(NETDEV_UNREGISTER, dev); - call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH, dev); rtmsg_ifinfo(RTM_DELLINK, dev, ~0U); /* diff --git a/net/core/dst.c b/net/core/dst.c index 56d6361..f6593d2 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -374,7 +374,7 @@ static int dst_dev_event(struct notifier_block *this, unsigned long event, struct dst_entry *dst, *last = NULL; switch (event) { - case NETDEV_UNREGISTER: + case NETDEV_UNREGISTER_FINAL: case NETDEV_DOWN: mutex_lock(&dst_gc_mutex); for (dst = dst_busy_list; dst; dst = dst->next) { diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 2c5a0a0..30f5a06 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -2356,7 +2356,7 @@ static int rtnetlink_event(struct notifier_block *this, unsigned long event, voi case NETDEV_PRE_TYPE_CHANGE: case NETDEV_GOING_DOWN: case NETDEV_UNREGISTER: - case NETDEV_UNREGISTER_BATCH: + case NETDEV_UNREGISTER_FINAL: case NETDEV_RELEASE: case NETDEV_JOIN: break; diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index c43ae3f..4fe50184 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -1071,8 +1071,6 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo case NETDEV_CHANGE: rt_cache_flush(dev_net(dev), 0); break; - case NETDEV_UNREGISTER_BATCH: - break; } return NOTIFY_DONE; } ^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH net-next v2] net: remove delay at device dismantle 2012-08-10 14:46 ` [PATCH] net: remove delay at device dismantle Eric Dumazet @ 2012-08-10 15:42 ` Eric Dumazet 2012-08-11 5:54 ` Eric Dumazet 0 siblings, 1 reply; 14+ messages in thread From: Eric Dumazet @ 2012-08-10 15:42 UTC (permalink / raw) To: David Miller; +Cc: netdev, Tom Herbert, Mahesh Bandewar, Eric W. Biederman From: Eric Dumazet <edumazet@google.com> I noticed extra one second delay in device dismantle, tracked down to a call to dst_dev_event() while some call_rcu() are still in RCU queues. These call_rcu() were posted by rt_free(struct rtable *rt) calls. We then wait a little (but one second) in netdev_wait_allrefs() before kicking again NETDEV_UNREGISTER. As the call_rcu() are now completed, dst_dev_event() can do the needed device swap on busy dst. To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called after a rcu_barrier(), but outside of RTNL lock. Use NETDEV_UNREGISTER_FINAL with care ! Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after IP cache removal. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <therbert@google.com> Cc: Mahesh Bandewar <maheshb@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> --- v2: NETDEV_UNREGISTER_FINAL called outside of rtnl lock as its more risky, base this patch on net-next include/linux/netdevice.h | 2 +- net/core/dev.c | 22 ++++++++-------------- net/core/dst.c | 2 +- net/core/fib_rules.c | 3 ++- net/core/rtnetlink.c | 2 +- net/ipv4/devinet.c | 6 +++++- net/ipv4/fib_frontend.c | 2 -- 7 files changed, 18 insertions(+), 21 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index a9db4f3..e3a43fd 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1550,7 +1550,7 @@ struct packet_type { #define NETDEV_PRE_TYPE_CHANGE 0x000E #define NETDEV_POST_TYPE_CHANGE 0x000F #define NETDEV_POST_INIT 0x0010 -#define NETDEV_UNREGISTER_BATCH 0x0011 +#define NETDEV_UNREGISTER_FINAL 0x0011 #define NETDEV_RELEASE 0x0012 #define NETDEV_NOTIFY_PEERS 0x0013 #define NETDEV_JOIN 0x0014 diff --git a/net/core/dev.c b/net/core/dev.c index 1f06df8..4c0f837 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -1391,7 +1391,6 @@ rollback: nb->notifier_call(nb, NETDEV_DOWN, dev); } nb->notifier_call(nb, NETDEV_UNREGISTER, dev); - nb->notifier_call(nb, NETDEV_UNREGISTER_BATCH, dev); } } @@ -1433,7 +1432,6 @@ int unregister_netdevice_notifier(struct notifier_block *nb) nb->notifier_call(nb, NETDEV_DOWN, dev); } nb->notifier_call(nb, NETDEV_UNREGISTER, dev); - nb->notifier_call(nb, NETDEV_UNREGISTER_BATCH, dev); } } unlock: @@ -1453,7 +1451,8 @@ EXPORT_SYMBOL(unregister_netdevice_notifier); int call_netdevice_notifiers(unsigned long val, struct net_device *dev) { - ASSERT_RTNL(); + if (val != NETDEV_UNREGISTER_FINAL) + ASSERT_RTNL(); return raw_notifier_call_chain(&netdev_chain, val, dev); } EXPORT_SYMBOL(call_netdevice_notifiers); @@ -5304,10 +5303,6 @@ static void rollback_registered_many(struct list_head *head) netdev_unregister_kobject(dev); } - /* Process any work delayed until the end of the batch */ - dev = list_first_entry(head, struct net_device, unreg_list); - call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH, dev); - synchronize_net(); list_for_each_entry(dev, head, unreg_list) @@ -5759,9 +5754,8 @@ static void netdev_wait_allrefs(struct net_device *dev) /* Rebroadcast unregister notification */ call_netdevice_notifiers(NETDEV_UNREGISTER, dev); - /* don't resend NETDEV_UNREGISTER_BATCH, _BATCH users - * should have already handle it the first time */ - + rcu_barrier(); + call_netdevice_notifiers(NETDEV_UNREGISTER_FINAL, dev); if (test_bit(__LINK_STATE_LINKWATCH_PENDING, &dev->state)) { /* We must not have linkwatch events @@ -5823,9 +5817,8 @@ void netdev_run_todo(void) __rtnl_unlock(); - /* Wait for rcu callbacks to finish before attempting to drain - * the device list. This usually avoids a 250ms wait. - */ + + /* Wait for rcu callbacks to finish before next phase */ if (!list_empty(&list)) rcu_barrier(); @@ -5834,6 +5827,8 @@ void netdev_run_todo(void) = list_first_entry(&list, struct net_device, todo_list); list_del(&dev->todo_list); + call_netdevice_notifiers(NETDEV_UNREGISTER_FINAL, dev); + if (unlikely(dev->reg_state != NETREG_UNREGISTERING)) { pr_err("network todo '%s' but state %d\n", dev->name, dev->reg_state); @@ -6228,7 +6223,6 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char the device is just moving and can keep their slaves up. */ call_netdevice_notifiers(NETDEV_UNREGISTER, dev); - call_netdevice_notifiers(NETDEV_UNREGISTER_BATCH, dev); rtmsg_ifinfo(RTM_DELLINK, dev, ~0U); /* diff --git a/net/core/dst.c b/net/core/dst.c index 069d51d..a2d3624 100644 --- a/net/core/dst.c +++ b/net/core/dst.c @@ -366,7 +366,7 @@ static int dst_dev_event(struct notifier_block *this, unsigned long event, struct dst_entry *dst, *last = NULL; switch (event) { - case NETDEV_UNREGISTER: + case NETDEV_UNREGISTER_FINAL: case NETDEV_DOWN: mutex_lock(&dst_gc_mutex); for (dst = dst_busy_list; dst; dst = dst->next) { diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c index ab7db83..5850937 100644 --- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -711,15 +711,16 @@ static int fib_rules_event(struct notifier_block *this, unsigned long event, struct net *net = dev_net(dev); struct fib_rules_ops *ops; - ASSERT_RTNL(); switch (event) { case NETDEV_REGISTER: + ASSERT_RTNL(); list_for_each_entry(ops, &net->rules_ops, list) attach_rules(&ops->rules_list, dev); break; case NETDEV_UNREGISTER: + ASSERT_RTNL(); list_for_each_entry(ops, &net->rules_ops, list) detach_rules(&ops->rules_list, dev); break; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 34d975b..c64efcf 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -2358,7 +2358,7 @@ static int rtnetlink_event(struct notifier_block *this, unsigned long event, voi case NETDEV_PRE_TYPE_CHANGE: case NETDEV_GOING_DOWN: case NETDEV_UNREGISTER: - case NETDEV_UNREGISTER_BATCH: + case NETDEV_UNREGISTER_FINAL: case NETDEV_RELEASE: case NETDEV_JOIN: break; diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c index adf273f..6a5e6e4 100644 --- a/net/ipv4/devinet.c +++ b/net/ipv4/devinet.c @@ -1147,8 +1147,12 @@ static int inetdev_event(struct notifier_block *this, unsigned long event, void *ptr) { struct net_device *dev = ptr; - struct in_device *in_dev = __in_dev_get_rtnl(dev); + struct in_device *in_dev; + if (event == NETDEV_UNREGISTER_FINAL) + goto out; + + in_dev = __in_dev_get_rtnl(dev); ASSERT_RTNL(); if (!in_dev) { diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c index 7f073a3..43d6fea 100644 --- a/net/ipv4/fib_frontend.c +++ b/net/ipv4/fib_frontend.c @@ -1071,8 +1071,6 @@ static int fib_netdev_event(struct notifier_block *this, unsigned long event, vo case NETDEV_CHANGE: rt_cache_flush(dev_net(dev), 0); break; - case NETDEV_UNREGISTER_BATCH: - break; } return NOTIFY_DONE; } ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2] net: remove delay at device dismantle 2012-08-10 15:42 ` [PATCH net-next v2] " Eric Dumazet @ 2012-08-11 5:54 ` Eric Dumazet 2012-08-11 5:57 ` David Miller ` (2 more replies) 0 siblings, 3 replies; 14+ messages in thread From: Eric Dumazet @ 2012-08-11 5:54 UTC (permalink / raw) To: David Miller; +Cc: netdev, Tom Herbert, Mahesh Bandewar, Eric W. Biederman On Fri, 2012-08-10 at 17:42 +0200, Eric Dumazet wrote: > From: Eric Dumazet <edumazet@google.com> > > I noticed extra one second delay in device dismantle, tracked down to > a call to dst_dev_event() while some call_rcu() are still in RCU queues. > ... > Signed-off-by: Eric Dumazet <edumazet@google.com> > Cc: Tom Herbert <therbert@google.com> > Cc: Mahesh Bandewar <maheshb@google.com> > Cc: "Eric W. Biederman" <ebiederm@xmission.com> > --- > v2: NETDEV_UNREGISTER_FINAL called outside of rtnl lock > as its more risky, base this patch on net-next Also I am leaving for a one week vacation with no access to the Internet, so better hold this patch until my return ;) Thanks ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2] net: remove delay at device dismantle 2012-08-11 5:54 ` Eric Dumazet @ 2012-08-11 5:57 ` David Miller 2012-08-23 2:18 ` David Miller 2012-08-23 2:25 ` Gao feng 2 siblings, 0 replies; 14+ messages in thread From: David Miller @ 2012-08-11 5:57 UTC (permalink / raw) To: eric.dumazet; +Cc: netdev, therbert, maheshb, ebiederm From: Eric Dumazet <eric.dumazet@gmail.com> Date: Sat, 11 Aug 2012 07:54:47 +0200 > On Fri, 2012-08-10 at 17:42 +0200, Eric Dumazet wrote: >> From: Eric Dumazet <edumazet@google.com> >> >> I noticed extra one second delay in device dismantle, tracked down to >> a call to dst_dev_event() while some call_rcu() are still in RCU queues. >> > ... >> Signed-off-by: Eric Dumazet <edumazet@google.com> >> Cc: Tom Herbert <therbert@google.com> >> Cc: Mahesh Bandewar <maheshb@google.com> >> Cc: "Eric W. Biederman" <ebiederm@xmission.com> >> --- >> v2: NETDEV_UNREGISTER_FINAL called outside of rtnl lock >> as its more risky, base this patch on net-next > > Also I am leaving for a one week vacation with no access to the > Internet, so better hold this patch until my return ;) Ok :) ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2] net: remove delay at device dismantle 2012-08-11 5:54 ` Eric Dumazet 2012-08-11 5:57 ` David Miller @ 2012-08-23 2:18 ` David Miller 2012-08-23 2:25 ` Gao feng 2 siblings, 0 replies; 14+ messages in thread From: David Miller @ 2012-08-23 2:18 UTC (permalink / raw) To: eric.dumazet; +Cc: netdev, therbert, maheshb, ebiederm From: Eric Dumazet <eric.dumazet@gmail.com> Date: Sat, 11 Aug 2012 07:54:47 +0200 > On Fri, 2012-08-10 at 17:42 +0200, Eric Dumazet wrote: >> From: Eric Dumazet <edumazet@google.com> >> >> I noticed extra one second delay in device dismantle, tracked down to >> a call to dst_dev_event() while some call_rcu() are still in RCU queues. >> > ... >> Signed-off-by: Eric Dumazet <edumazet@google.com> >> Cc: Tom Herbert <therbert@google.com> >> Cc: Mahesh Bandewar <maheshb@google.com> >> Cc: "Eric W. Biederman" <ebiederm@xmission.com> >> --- >> v2: NETDEV_UNREGISTER_FINAL called outside of rtnl lock >> as its more risky, base this patch on net-next > > Also I am leaving for a one week vacation with no access to the > Internet, so better hold this patch until my return ;) Since you're back, I've applied this now. Thanks. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2] net: remove delay at device dismantle 2012-08-11 5:54 ` Eric Dumazet 2012-08-11 5:57 ` David Miller 2012-08-23 2:18 ` David Miller @ 2012-08-23 2:25 ` Gao feng 2012-08-23 2:51 ` David Miller 2 siblings, 1 reply; 14+ messages in thread From: Gao feng @ 2012-08-23 2:25 UTC (permalink / raw) To: Eric Dumazet Cc: David Miller, netdev, Tom Herbert, Mahesh Bandewar, Eric W. Biederman 于 2012年08月11日 13:54, Eric Dumazet 写道: > On Fri, 2012-08-10 at 17:42 +0200, Eric Dumazet wrote: >> From: Eric Dumazet <edumazet@google.com> >> >> I noticed extra one second delay in device dismantle, tracked down to >> a call to dst_dev_event() while some call_rcu() are still in RCU queues. >> > ... >> Signed-off-by: Eric Dumazet <edumazet@google.com> >> Cc: Tom Herbert <therbert@google.com> >> Cc: Mahesh Bandewar <maheshb@google.com> >> Cc: "Eric W. Biederman" <ebiederm@xmission.com> >> --- >> v2: NETDEV_UNREGISTER_FINAL called outside of rtnl lock >> as its more risky, base this patch on net-next > > Also I am leaving for a one week vacation with no access to the > Internet, so better hold this patch until my return ;) > Hi Eric I got this warning message with your patch applied. Aug 22 18:32:56 Donkey kernel: [ 124.733048] =============================== Aug 22 18:32:56 Donkey kernel: [ 124.733051] [ INFO: suspicious RCU usage. ] Aug 22 18:32:56 Donkey kernel: [ 124.733054] 3.6.0-rc1+ #14 Not tainted Aug 22 18:32:56 Donkey kernel: [ 124.733057] ------------------------------- Aug 22 18:32:56 Donkey kernel: [ 124.733060] include/linux/inetdevice.h:229 suspicious rcu_dereference_protected() usage! Aug 22 18:32:56 Donkey kernel: [ 124.733062] Aug 22 18:32:56 Donkey kernel: [ 124.733062] other info that might help us debug this: Aug 22 18:32:56 Donkey kernel: [ 124.733062] Aug 22 18:32:56 Donkey kernel: [ 124.733066] Aug 22 18:32:56 Donkey kernel: [ 124.733066] rcu_scheduler_active = 1, debug_locks = 1 Aug 22 18:32:56 Donkey kernel: [ 124.733069] 3 locks held by kworker/u:1/39: Aug 22 18:32:56 Donkey kernel: [ 124.733072] #0: (netns){.+.+.+}, at: [<ffffffff8107e117>] process_one_work+0x137/0x5d0 Aug 22 18:32:56 Donkey kernel: [ 124.733086] #1: (net_cleanup_work){+.+.+.}, at: [<ffffffff8107e117>] process_one_work+0x137/0x5d0 Aug 22 18:32:56 Donkey kernel: [ 124.733095] #2: (net_mutex){+.+.+.}, at: [<ffffffff81542fd5>] cleanup_net+0x85/0x1a0 Aug 22 18:32:56 Donkey kernel: [ 124.733106] Aug 22 18:32:56 Donkey kernel: [ 124.733106] stack backtrace: Aug 22 18:32:56 Donkey kernel: [ 124.733110] Pid: 39, comm: kworker/u:1 Not tainted 3.6.0-rc1+ #14 Aug 22 18:32:56 Donkey kernel: [ 124.733112] Call Trace: Aug 22 18:32:56 Donkey kernel: [ 124.733120] [<ffffffff810c2312>] lockdep_rcu_suspicious+0xe2/0x130 Aug 22 18:32:56 Donkey kernel: [ 124.733127] [<ffffffff815d35d4>] fib_netdev_event+0x114/0x180 Aug 22 18:32:56 Donkey kernel: [ 124.733133] [<ffffffff816914cc>] notifier_call_chain+0x5c/0x120 Aug 22 18:32:56 Donkey kernel: [ 124.733139] [<ffffffff8108d6b6>] raw_notifier_call_chain+0x16/0x20 Aug 22 18:32:56 Donkey kernel: [ 124.733144] [<ffffffff8154585c>] call_netdevice_notifiers+0x3c/0x70 Aug 22 18:32:56 Donkey kernel: [ 124.733149] [<ffffffff8154f88c>] netdev_run_todo+0x8c/0x280 Aug 22 18:32:56 Donkey kernel: [ 124.733154] [<ffffffff81544c9d>] ? synchronize_net+0x2d/0x40 Aug 22 18:32:56 Donkey kernel: [ 124.733159] [<ffffffff8155b90e>] rtnl_unlock+0xe/0x10 Aug 22 18:32:56 Donkey kernel: [ 124.733164] [<ffffffff81546c1a>] default_device_exit_batch+0xba/0xd0 Aug 22 18:32:56 Donkey kernel: [ 124.733170] [<ffffffff81542275>] ops_exit_list+0x55/0x60 Aug 22 18:32:56 Donkey kernel: [ 124.733175] [<ffffffff8154304b>] cleanup_net+0xfb/0x1a0 Aug 22 18:32:56 Donkey kernel: [ 124.733180] [<ffffffff8107e178>] process_one_work+0x198/0x5d0 Aug 22 18:32:56 Donkey kernel: [ 124.733185] [<ffffffff8107e117>] ? process_one_work+0x137/0x5d0 Aug 22 18:32:56 Donkey kernel: [ 124.733191] [<ffffffff8168b2b8>] ? __schedule+0x428/0x910 Aug 22 18:32:56 Donkey kernel: [ 124.733196] [<ffffffff81080951>] ? worker_thread+0x2a1/0x4b0 Aug 22 18:32:56 Donkey kernel: [ 124.733201] [<ffffffff81542f50>] ? net_drop_ns+0x50/0x50 Aug 22 18:32:56 Donkey kernel: [ 124.733207] [<ffffffff8108083a>] worker_thread+0x18a/0x4b0 Aug 22 18:32:56 Donkey kernel: [ 124.733212] [<ffffffff810806b0>] ? manage_workers+0x2a0/0x2a0 Aug 22 18:32:56 Donkey kernel: [ 124.733217] [<ffffffff810867de>] kthread+0xae/0xc0 Aug 22 18:32:56 Donkey kernel: [ 124.733222] [<ffffffff810c39cd>] ? trace_hardirqs_on+0xd/0x10 Aug 22 18:32:56 Donkey kernel: [ 124.733227] [<ffffffff81697044>] kernel_thread_helper+0x4/0x10 Aug 22 18:32:56 Donkey kernel: [ 124.733233] [<ffffffff8168d570>] ? retint_restore_args+0x13/0x13 Aug 22 18:32:56 Donkey kernel: [ 124.733237] [<ffffffff81086730>] ? flush_kthread_work+0x1a0/0x1a0 Aug 22 18:32:56 Donkey kernel: [ 124.733242] [<ffffffff81697040>] ? gs_change+0x13/0x13 Aug 22 18:32:56 Donkey kernel: [ 124.733248] Aug 22 18:32:56 Donkey kernel: [ 124.733250] =============================== ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2] net: remove delay at device dismantle 2012-08-23 2:25 ` Gao feng @ 2012-08-23 2:51 ` David Miller 2012-08-23 2:58 ` Eric Dumazet 0 siblings, 1 reply; 14+ messages in thread From: David Miller @ 2012-08-23 2:51 UTC (permalink / raw) To: gaofeng; +Cc: eric.dumazet, netdev, therbert, maheshb, ebiederm From: Gao feng <gaofeng@cn.fujitsu.com> Date: Thu, 23 Aug 2012 10:25:00 +0800 > 于 2012年08月11日 13:54, Eric Dumazet 写道: >> On Fri, 2012-08-10 at 17:42 +0200, Eric Dumazet wrote: >>> From: Eric Dumazet <edumazet@google.com> >>> >>> I noticed extra one second delay in device dismantle, tracked down to >>> a call to dst_dev_event() while some call_rcu() are still in RCU queues. >>> >> ... >>> Signed-off-by: Eric Dumazet <edumazet@google.com> >>> Cc: Tom Herbert <therbert@google.com> >>> Cc: Mahesh Bandewar <maheshb@google.com> >>> Cc: "Eric W. Biederman" <ebiederm@xmission.com> >>> --- >>> v2: NETDEV_UNREGISTER_FINAL called outside of rtnl lock >>> as its more risky, base this patch on net-next >> >> Also I am leaving for a one week vacation with no access to the >> Internet, so better hold this patch until my return ;) >> > > Hi Eric > > I got this warning message with your patch applied. Ok I won't push this out until this is resolved. ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH net-next v2] net: remove delay at device dismantle 2012-08-23 2:51 ` David Miller @ 2012-08-23 2:58 ` Eric Dumazet 0 siblings, 0 replies; 14+ messages in thread From: Eric Dumazet @ 2012-08-23 2:58 UTC (permalink / raw) To: David Miller; +Cc: gaofeng, netdev, therbert, maheshb, ebiederm On Wed, 2012-08-22 at 19:51 -0700, David Miller wrote: > From: Gao feng <gaofeng@cn.fujitsu.com> > > Hi Eric > > > > I got this warning message with your patch applied. > > Ok I won't push this out until this is resolved. Thanks, I'll send a v2 ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2012-08-23 2:58 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2012-08-10 9:27 [RFC] NETDEV_UNREGISTER_BATCH seems unused nowaday ? Eric Dumazet 2012-08-10 10:42 ` David Miller 2012-08-10 11:06 ` Eric Dumazet 2012-08-10 14:45 ` Eric W. Biederman 2012-08-10 15:01 ` Eric Dumazet 2012-08-10 17:55 ` Paul E. McKenney 2012-08-10 14:46 ` [PATCH] net: remove delay at device dismantle Eric Dumazet 2012-08-10 15:42 ` [PATCH net-next v2] " Eric Dumazet 2012-08-11 5:54 ` Eric Dumazet 2012-08-11 5:57 ` David Miller 2012-08-23 2:18 ` David Miller 2012-08-23 2:25 ` Gao feng 2012-08-23 2:51 ` David Miller 2012-08-23 2:58 ` Eric Dumazet
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).