* Re: [Bugme-new] [Bug 8638] New: unregister_netdevice: waiting for ppp0 to become free. pppoe + multihome + htb qos?
From: Andrew Morton @ 2007-06-16 15:34 UTC
To: netdev
Cc: bugme-daemon@kernel-bugs.osdl.org, Paul Mackerras, kernelbugs

On Sat, 16 Jun 2007 03:11:30 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8638
>
>            Summary: unregister_netdevice: waiting for ppp0 to become free.
>                     pppoe + multihome + htb qos?
>            Product: Networking
>            Version: 2.5
>      KernelVersion: 2.6.20-1.2316.fc5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Netfilter/Iptables
>         AssignedTo: networking_netfilter-iptables@kernel-bugs.osdl.org
>         ReportedBy: kernelbugs@tecnopolis.ca
>
>
> Most recent kernel where this bug did not occur: has occurred since at least
> 2.6.18-1.2200.fc5 (Sep 2005) but could have been in earlier versions as I
> wasn't then using the technology I believe triggers the bug
> Distribution: FC5
> Hardware Environment: x86 P4 UP 512MB
> Software Environment: lots of cutting-edge (but stock kernel) networking
> technology
> Problem Description:
>
> Every few months on 1 box I administer:
> kernel: unregister_netdevice: waiting for ppp0 to become free. Usage count = 1
> system gets very locked up (but often not completely, no panics) and won't
> reboot: requires onsite hard reset. In fact, most reboot attempts will fail
> even before the bug hits as a reboot will trigger the bug. I always reboot the
> box with reboot -f now when I'm remote.
>
> I have a dozen extremely similar boxes to this buggy one out there and they
> don't show this bug. Unique to this box and I think relevant to the bug:
>
> 1) 2 PPPoE DSL connections (multihomed, 2 IP addresses, traffic split by port,
> used to achieve higher aggregate upload bandwidth)
> 2) multi-table ip route rules ("ip rule add ... table 2") to achieve traffic
> splitting in #1.
>
> Other technologies combined on this box but not on any others (though others
> use them separately without the bug hitting):
>
> 3) QoS, HTB qdiscs (used on non-PPPoE boxes without the bug)
> 4) 2.6sec IPSEC VPN (used on many other PPPoE and non-PPPoE boxes without
> problems)
> 5) PPPoE (used on many other boxes without this bug)
>
> I'm not even sure where to begin on what info to provide. I can provide my
> config for any of the above technologies if it will help. The box is an
> important production box and unless I can find a way to reliably make it barf
> while onsite it may be hard to test things, like "turn off QoS", because all
> the technologies are essential for day to day operations.
>
> I'll attach a useful log excerpt from the last 4 times the bug hit if I can.
>
> If this is a bad bug entry, please tell me what I need to add. It's my first
> entry on this bugzilla and I'm not sure what's required. I'm sorry this bug
> report is on the FC5 stock kernels, but I'm not sure I can use a "vanilla"
> kernel instead of FC5 and not screw something up. However, there are NO binary
> modules or any weird stuff on the box. It's all stock FC5 rpms.
>
> This box is a production box and the only one I have with 2 PPPoE connections
> to test. I'm nearly positive it's either a 2-PPPoE+advanced-routing problem or
> a 2-PPPoE+HTB problem. Since I've seen no other hits on google or elsewhere
> that are exactly like this bug, I must assume it's something fairly unique to
> this box: but what combination?!
>
> I've had a Redhat bugzilla open on this since Sep 2005 with zero replies! It
> shows more detail and my thought process over the years.
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=169502
>
> Steps to reproduce:
> Haven't figured out a way to reliably hit this bug. Any hints to allow easier
> testing (which must be done onsite) are welcome.
>

I have a vague feeling that we fixed this in a later kernel. Does anyone
recall?

Thanks.

* Re: [Bugme-new] [Bug 8638] New: unregister_netdevice: waiting for ppp0 to become free. pppoe + multihome + htb qos?
From: Chuck Ebbert @ 2007-06-18 14:56 UTC
To: Andrew Morton
Cc: netdev, bugme-daemon@kernel-bugs.osdl.org, Paul Mackerras, kernelbugs

Is there any way to print the addresses the notifier is calling
to try and release net device references? I see:

net/core/dev.c::netdev_wait_allrefs():

        while (atomic_read(&dev->refcnt) != 0) {
                if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
                        rtnl_lock();

                        /* Rebroadcast unregister notification */
                        raw_notifier_call_chain(&netdev_chain,
                                          NETDEV_UNREGISTER, dev);

but don't see any way to print the functions that get called.

* Re: [Bugme-new] [Bug 8638] New: unregister_netdevice: waiting for ppp0 to become free. pppoe + multihome + htb qos?
From: Stephen Hemminger @ 2007-06-18 15:18 UTC
To: Chuck Ebbert
Cc: Andrew Morton, netdev, bugme-daemon@kernel-bugs.osdl.org, Paul Mackerras, kernelbugs

On Mon, 18 Jun 2007 10:56:06 -0400 Chuck Ebbert <cebbert@redhat.com> wrote:

>
> Is there any way to print the addresses the notifier is calling
> to try and release net device references? I see:
>
> net/core/dev.c::netdev_wait_allrefs():
>
>         while (atomic_read(&dev->refcnt) != 0) {
>                 if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
>                         rtnl_lock();
>
>                         /* Rebroadcast unregister notification */
>                         raw_notifier_call_chain(&netdev_chain,
>                                           NETDEV_UNREGISTER, dev);
>
> but don't see any way to print the functions that get called.

You could walk the chain and print the functions out, but it wouldn't
really help identify the problem. The problem is when a protocol forgets
to call dev_put() after calling dev_hold().

The notifier there is just a last effort at beating a dead horse. It really
should be removed since it never helps. The notifier in unregister does work,
and calling the notification repeatedly doesn't change anything.

-- 
Stephen Hemminger <shemminger@linux-foundation.org>
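
The failure mode described above (a dev_hold() that is never matched by a
dev_put()) can be illustrated with a small user-space sketch. This is not
kernel code: struct fake_netdev and every fake_* function are hypothetical
stand-ins invented for this illustration, and the leaking error path is an
assumed example of how such an imbalance might arise, not the actual bug in
this report.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for struct net_device; only the
 * reference count matters for this illustration. */
struct fake_netdev {
    const char *name;
    int refcnt;
};

/* Analogue of dev_hold(): take a reference. */
void fake_dev_hold(struct fake_netdev *dev)
{
    dev->refcnt++;
}

/* Analogue of dev_put(): drop a reference. */
void fake_dev_put(struct fake_netdev *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt--;
}

/* A well-behaved user: every hold is paired with a put. */
void balanced_user(struct fake_netdev *dev)
{
    fake_dev_hold(dev);
    /* ... use the device ... */
    fake_dev_put(dev);
}

/* A buggy user (assumed example): the error path returns early and
 * skips the put, so one reference is leaked. */
int leaky_user(struct fake_netdev *dev, int fail)
{
    fake_dev_hold(dev);
    if (fail)
        return -1;              /* BUG: reference is never dropped */
    fake_dev_put(dev);
    return 0;
}

/* Analogue of the wait loop, bounded so the sketch terminates.
 * Returns 0 if the count is zero, -1 if it is still held after
 * max_tries checks. Retrying cannot help once a reference has been
 * leaked, because no code path remains that would drop it. */
int fake_wait_allrefs(struct fake_netdev *dev, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (dev->refcnt == 0)
            return 0;
        printf("unregister_netdevice: waiting for %s to become free. "
               "Usage count = %d\n", dev->name, dev->refcnt);
    }
    return dev->refcnt == 0 ? 0 : -1;
}
```

Calling leaky_user() with a nonzero fail argument leaves the count at 1, and
fake_wait_allrefs() then reports failure no matter how many times it retries,
which is the sense in which rebroadcasting the notification changes nothing.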

* Re: [Bugme-new] [Bug 8638] New: unregister_netdevice: waiting for ppp0 to become free. pppoe + multihome + htb qos?
From: Andrew Morton @ 2007-06-18 15:23 UTC
To: Chuck Ebbert
Cc: netdev, bugme-daemon@kernel-bugs.osdl.org, Paul Mackerras, kernelbugs

On Mon, 18 Jun 2007 10:56:06 -0400 Chuck Ebbert <cebbert@redhat.com> wrote:

>
> Is there any way to print the addresses the notifier is calling
> to try and release net device references? I see:
>
> net/core/dev.c::netdev_wait_allrefs():
>
>         while (atomic_read(&dev->refcnt) != 0) {
>                 if (time_after(jiffies, rebroadcast_time + 1 * HZ)) {
>                         rtnl_lock();
>
>                         /* Rebroadcast unregister notification */
>                         raw_notifier_call_chain(&netdev_chain,
>                                           NETDEV_UNREGISTER, dev);
>
> but don't see any way to print the functions that get called.

Nope. I guess we could add some print_notifier_call_chain() thing, but
then we'd need one flavour per locking scheme and it would get ridiculous.

I guess just an unlocked version would be OK - it's just a debug thing.
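
As a rough idea of what an unlocked debug walk over a notifier chain might
look like, here is a user-space sketch. Nothing here is an existing kernel
interface: the fake_* types only approximate the shape of a notifier block
(callback, next pointer, priority), the name field is an addition for
printing, and the descending-priority insertion order is an assumption of
this sketch. Inside the kernel the callback address would presumably be
resolved to a symbol instead of relying on a stored name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space approximation of a notifier block. The
 * "name" field does not exist in the kernel structure; it is added
 * here so the walk has something readable to print. */
struct fake_notifier_block {
    int (*notifier_call)(struct fake_notifier_block *nb,
                         unsigned long event, void *data);
    struct fake_notifier_block *next;
    int priority;
    const char *name;
};

struct fake_notifier_head {
    struct fake_notifier_block *head;
};

/* Sample callback that does nothing, used only to populate a chain. */
int fake_noop_cb(struct fake_notifier_block *nb,
                 unsigned long event, void *data)
{
    (void)nb;
    (void)event;
    (void)data;
    return 0;
}

/* Insert a block so the chain stays sorted by descending priority
 * (assumed ordering); equal priorities keep registration order. */
void fake_chain_register(struct fake_notifier_head *nh,
                         struct fake_notifier_block *nb)
{
    struct fake_notifier_block **p;

    assert(nh != NULL && nb != NULL);
    p = &nh->head;
    while (*p != NULL && (*p)->priority >= nb->priority)
        p = &(*p)->next;
    nb->next = *p;
    *p = nb;
}

/* Unlocked debug walk: print one line per registered callback and
 * return the number of entries seen. No locking is taken, so this is
 * only safe when nothing else modifies the chain concurrently. */
int fake_print_notifier_call_chain(const struct fake_notifier_head *nh)
{
    int n = 0;

    for (const struct fake_notifier_block *nb = nh->head;
         nb != NULL; nb = nb->next) {
        printf("notifier %d: %s (priority %d)\n",
               n, nb->name, nb->priority);
        n++;
    }
    return n;
}
```

As noted earlier in the thread, such a listing shows who is registered on the
chain, not who is still holding the leaked device reference.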