* [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
@ 2005-09-01  2:04 Robert Scott
  2005-09-01  4:26 ` Ben Greear
  2005-09-01  4:37 ` Stephen Hemminger
  0 siblings, 2 replies; 9+ messages in thread
From: Robert Scott @ 2005-09-01  2:04 UTC (permalink / raw)
  To: bridge

Hello,

I know that this bug has been discussed at length on this mailing
list before, but previous posts seemed to indicate that it was fixed
before kernel 2.6.12. I am still seeing it occasionally in kernel
2.6.12.3. The system is running Knoppix, and IPv6 is not compiled
into the kernel (other posts mentioned numerous problems with the
IPv6 code). But every so often when bringing down the bridge (it
doesn't happen every time), the process hangs and the following
message appears in dmesg repeatedly:

'unregister_netdevice: waiting for br0 to become free. Usage count = 1'

None of the processes involved can be killed, and an attempt to run
ifconfig results in a process that also waits forever. At this point
the box must be rebooted forcefully.

Two questions.
1. In a previous post, someone mentioned that one solution was to
comment out the check that is hanging in the kernel. Does this check
prevent something terrible from happening (I assume that it does), or
is it safe to remove it?
2. Any ideas of something to try in order to make this repeatable?

thanks,
--robert scott




* Re: [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
  2005-09-01  2:04 Robert Scott
@ 2005-09-01  4:26 ` Ben Greear
  2005-09-01  4:37 ` Stephen Hemminger
  1 sibling, 0 replies; 9+ messages in thread
From: Ben Greear @ 2005-09-01  4:26 UTC (permalink / raw)
  To: Robert Scott; +Cc: bridge

Robert Scott wrote:
> Hello,
> 
> I know that this bug has been discussed before at length on this  
> mailing list, but previous post seemed to indicate that it was fixed  
> before kernel 2.6.12.  I am still seeing this occasionally in kernel  
> 2.6.12.3.  The system is running knoppix, and IPV6 is not compiled  into 
> the kernel(other posts mentioned numerous problems with the IPV6  
> code).  But every so often, when bringing down the bridge (it doesn't  
> happen every time), the process hangs, and the following message  
> appears in dmesg repeatedly:
> 
> 'unregister_netdevice: waiting for br0 to become free. Usage count = 1'

I have found an apparent leak of a route object, which holds a reference
to a device.  I reproduced it in both 2.6.11 and 2.6.13 using 802.1Q VLANs.
I have a patch against 2.6.13 that will print out the place of the
leaked reference.

http://www.candelatech.com/oss/rfcnt.patch

Enable the feature in the Networking section of Kconfig.

If you can reproduce with this patch in place, you will get a file and line number
for the leak. Please CC me.  I'm going to try to debug the leak, but I
could definitely use some help...

> None of the processes involved can be killed, and an attempt to run  an 
> ifconfig results in a process that is also waiting forever.  At  this 
> point the box must be rebooted forcefully.
> 
> Two questions.
> 1. In a previous post, someone mentioned one solution was to  commenting 
> out the check that is hanging in the kernel.   Does this  check 
> preventing something terrible from happening(i assumed that it  does), 
> or is it safe to remove it.

This would be bad: it could lead to memory corruption.

> 2. Any ideas of something to try in order to make this repeatable?

I have a complex application with a complex script to drive it that reproduces
the problem within an hour...I haven't found a simpler way....

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
  2005-09-01  2:04 Robert Scott
  2005-09-01  4:26 ` Ben Greear
@ 2005-09-01  4:37 ` Stephen Hemminger
  2005-09-01  5:12   ` Ben Greear
                     ` (2 more replies)
  1 sibling, 3 replies; 9+ messages in thread
From: Stephen Hemminger @ 2005-09-01  4:37 UTC (permalink / raw)
  To: Robert Scott; +Cc: bridge

On Wed, 31 Aug 2005 19:04:01 -0700
Robert Scott <rbscott@axentra.net> wrote:

> Hello,
> 
> I know that this bug has been discussed before at length on this  
> mailing list, but previous post seemed to indicate that it was fixed  
> before kernel 2.6.12.  I am still seeing this occasionally in kernel  
> 2.6.12.3.  The system is running knoppix, and IPV6 is not compiled  
> into the kernel(other posts mentioned numerous problems with the IPV6  
> code).  But every so often, when bringing down the bridge (it doesn't  
> happen every time), the process hangs, and the following message  
> appears in dmesg repeatedly:
> 
> 'unregister_netdevice: waiting for br0 to become free. Usage count = 1'
> 
> None of the processes involved can be killed, and an attempt to run  
> an ifconfig results in a process that is also waiting forever.  At  
> this point the box must be rebooted forcefully.
> 
> Two questions.
> 1. In a previous post, someone mentioned one solution was to  
> commenting out the check that is hanging in the kernel.   Does this  
> check preventing something terrible from happening(i assumed that it  
> does), or is it safe to remove it

Really bad idea: if whatever is holding the reference (for example,
packets stuck in some dead queue) ever gets processed, the kernel
will die.
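
To illustrate why the check matters, here is a minimal userspace sketch
in C. It is not kernel code. It assumes a toy device with a plain integer
reference count, and model_dev, model_hold, model_put and
model_try_unregister are hypothetical names invented for this
illustration. The unregister step refuses to free the device while any
reference is outstanding, which is the situation the dmesg message
reports.

```c
/* Userspace model only: NOT kernel code. All names are hypothetical. */
#include <stdbool.h>
#include <stdio.h>

struct model_dev {
    const char *name;
    int refcnt;   /* number of outstanding references */
    bool freed;   /* set once the device memory would be released */
};

static void model_hold(struct model_dev *d) { d->refcnt++; }
static void model_put(struct model_dev *d)  { d->refcnt--; }

/* One polling step of the unregister path: free only when nobody
 * holds a reference, otherwise report and let the caller retry. */
static bool model_try_unregister(struct model_dev *d)
{
    if (d->refcnt > 0) {
        printf("waiting for %s to become free. Usage count = %d\n",
               d->name, d->refcnt);
        return false;
    }
    d->freed = true;
    return true;
}
```

In this model, skipping the refcnt test would mark the device freed while
a holder still points at it, which is the kind of use-after-free the
replies warn about.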

> 2. Any ideas of something to try in order to make this repeatable?

Two other recent reports are:
1. Buggy applications that hold packets in their input queue forever,
   and/or netfilters.  The socket buffers contain a reference for
   packets in flight.

2. The VLAN code had a number of reference bugs; if you look through
   recent netdev mailing list posts you will see the discussion.


* Re: [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
  2005-09-01  4:37 ` Stephen Hemminger
@ 2005-09-01  5:12   ` Ben Greear
  2005-09-01  6:33   ` Robert Scott
  2005-09-01 19:04   ` Patrick McHardy
  2 siblings, 0 replies; 9+ messages in thread
From: Ben Greear @ 2005-09-01  5:12 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Robert Scott, bridge

Stephen Hemminger wrote:

> Two other recent reports are:
> 1. Buggy applications that hold packets in their input queue forever,
>    and/or netfilters.  The socket buffer's contain a reference for
>    packets in flight.

Surely the sockets listen for the net-unregister event and clean
themselves up???  An app shouldn't be able to hang the kernel
like this...

> 2. The VLAN code had a number of reference bugs, if you look through
>    recent netdev mailing list you will see the discussion.

Could be VLAN code, but my debug patch seems to implicate the ref
leaks in neighbor.c....  I also reproduced with another type of VLANs
other than 802.1q...

Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com



* Re: [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
  2005-09-01  4:37 ` Stephen Hemminger
  2005-09-01  5:12   ` Ben Greear
@ 2005-09-01  6:33   ` Robert Scott
  2005-09-01 16:24     ` Stephen Hemminger
  2005-09-01 19:04   ` Patrick McHardy
  2 siblings, 1 reply; 9+ messages in thread
From: Robert Scott @ 2005-09-01  6:33 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: bridge

> Two other recent reports are:
> 1. Buggy applications that hold packets in their input queue forever,
>    and/or netfilters.  The socket buffer's contain a reference for
>    packets in flight.

That may be it, but I am not sure which queue you are talking about.
There is an application that uses the netfilter ip_queue to queue
packets to user space. In this application, packets can be held in
user space for extended periods of time (up to 30/60 seconds) before
they are either dropped or released. Could this possibly be creating
a problem?

I don't believe that the system is using any of the VLAN code.

> I have found an appearant leak of a route object, which holds a  
> reference
> to a device.  I reproduced in both 2.6.11 and 2.6.13 using 802.1Q  
> VLANs.
> I have a patch that will print out the place of the leaked reference
> against 2.6.13.
>
> http://www.candelatech.com/oss/rfcnt.patch
>
> Enable the feature in the Networking section of Kconfig.
Ben, I will incorporate this patch and let you know if I turn up any
results.

thanks,
--robert

On Aug 31, 2005, at 9:37 PM, Stephen Hemminger wrote:

> On Wed, 31 Aug 2005 19:04:01 -0700
> Robert Scott <rbscott@axentra.net> wrote:
>
>
>> Hello,
>>
>> I know that this bug has been discussed before at length on this
>> mailing list, but previous post seemed to indicate that it was fixed
>> before kernel 2.6.12.  I am still seeing this occasionally in kernel
>> 2.6.12.3.  The system is running knoppix, and IPV6 is not compiled
>> into the kernel(other posts mentioned numerous problems with the IPV6
>> code).  But every so often, when bringing down the bridge (it doesn't
>> happen every time), the process hangs, and the following message
>> appears in dmesg repeatedly:
>>
>> 'unregister_netdevice: waiting for br0 to become free. Usage count  
>> = 1'
>>
>> None of the processes involved can be killed, and an attempt to run
>> an ifconfig results in a process that is also waiting forever.  At
>> this point the box must be rebooted forcefully.
>>
>> Two questions.
>> 1. In a previous post, someone mentioned one solution was to
>> commenting out the check that is hanging in the kernel.   Does this
>> check preventing something terrible from happening(i assumed that it
>> does), or is it safe to remove it
>>
>
> Really bad idea, because if the thing that is holding the reference
> like packets stuck in some dead queue, ever get processed the kernel
> will die.
>
>
>> 2. Any ideas of something to try in order to make this repeatable?
>>
>
> Two other recent reports are:
> 1. Buggy applications that hold packets in their input queue forever,
>    and/or netfilters.  The socket buffer's contain a reference for
>    packets in flight.
>
> 2. The VLAN code had a number of reference bugs, if you look through
>    recent netdev mailing list you will see the discussion.
>



* Re: [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
  2005-09-01  6:33   ` Robert Scott
@ 2005-09-01 16:24     ` Stephen Hemminger
  0 siblings, 0 replies; 9+ messages in thread
From: Stephen Hemminger @ 2005-09-01 16:24 UTC (permalink / raw)
  To: Robert Scott; +Cc: bridge

On Wed, 31 Aug 2005 23:33:22 -0700
Robert Scott <rbscott@axentra.net> wrote:

> > Two other recent reports are:
> > 1. Buggy applications that hold packets in their input queue forever,
> >    and/or netfilters.  The socket buffer's contain a reference for
> >    packets in flight.
> 
> that may be it, but I am not sure which queue you are talking about,  
> but there is an application that is using the netfiler ip_queue to  
> queue packets to user space.  in this application, these packets can  
> be held in user space for extended periods of time (up to 30/60  
> seconds), and then they are either dropped or released.  Could this  
> possibly be creating a problem?

That could be it. Ask Patrick McHardy, Harald, or netfilter-devel.


* Re: [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
  2005-09-01  4:37 ` Stephen Hemminger
  2005-09-01  5:12   ` Ben Greear
  2005-09-01  6:33   ` Robert Scott
@ 2005-09-01 19:04   ` Patrick McHardy
  2 siblings, 0 replies; 9+ messages in thread
From: Patrick McHardy @ 2005-09-01 19:04 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Robert Scott, bridge

Stephen Hemminger wrote:
> Two other recent reports are:
> 1. Buggy applications that hold packets in their input queue forever,
>    and/or netfilters.  The socket buffer's contain a reference for
>    packets in flight.

skb->dev doesn't take a reference and is reset before packets are queued
to sockets. dst->dev, however, does hold a reference and is not reset;
perhaps we should change that.
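
The distinction can be sketched with a small userspace model in C. It is
an illustration under assumptions, not the kernel's sk_buff or dst_entry:
toy_dev, toy_dst, toy_skb and the helper functions are hypothetical
names. The borrowed pointer is cleared when the packet is queued, while
the route keeps its counted reference until it is released.

```c
/* Userspace model only: NOT the kernel's sk_buff or dst_entry. */
#include <stddef.h>

struct toy_dev { int refcnt; };

struct toy_dst { struct toy_dev *dev; };  /* counted reference */

struct toy_skb {
    struct toy_dev *dev;   /* borrowed pointer, no count taken */
    struct toy_dst *dst;   /* route attached to the packet */
};

/* Attaching a device to a route takes a reference on the device. */
static void toy_dst_init(struct toy_dst *dst, struct toy_dev *dev)
{
    dst->dev = dev;
    dev->refcnt++;
}

/* Releasing the route drops the reference it took. */
static void toy_dst_release(struct toy_dst *dst)
{
    dst->dev->refcnt--;
    dst->dev = NULL;
}

/* Queueing to a socket clears the borrowed pointer but leaves the
 * route, and therefore its device reference, in place. */
static void toy_queue_to_socket(struct toy_skb *skb)
{
    skb->dev = NULL;
}
```

In this model a packet that sits in a queue indefinitely keeps the
device count above zero through its route, even though its own device
pointer is already gone.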


* [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
@ 2005-09-02  7:37 Louis Croisez
  0 siblings, 0 replies; 9+ messages in thread
From: Louis Croisez @ 2005-09-02  7:37 UTC (permalink / raw)
  To: bridge, rbscott

Hi Robert,

> 2. Any ideas of something to try in order to make this repeatable?

What I would do to reproduce it is write a script that establishes the
bridge, floods it with an external ping -f, and then shuts the bridge
down. If there is a refcount release problem, it should appear under
heavy load (some buffer overflow, or something like that).
Also, are you playing with ebtables? How do you use this bridge?
#Louis.


> Date: Wed, 31 Aug 2005 19:04:01 -0700
> From: Robert Scott <rbscott@axentra.net>
> Subject: [Bridge] unregister_netdevice: waiting for br0 to become
> free. Usage count = 1 (2.6.12.3)
> To: bridge@lists.osdl.org
> Message-ID: <8189B9F5-DF6E-4032-8D2D-D5301A02A081@axentra.net>
> Content-Type: text/plain; charset=US-ASCII; delsp=yes; format=flowed
> 
> Hello,
> 
> I know that this bug has been discussed before at length on this
> mailing list, but previous post seemed to indicate that it was fixed
> before kernel 2.6.12. I am still seeing this occasionally in kernel
> 2.6.12.3. The system is running knoppix, and IPV6 is not compiled
> into the kernel(other posts mentioned numerous problems with the IPV6
> code). But every so often, when bringing down the bridge (it doesn't
> happen every time), the process hangs, and the following message
> appears in dmesg repeatedly:
> 
> 'unregister_netdevice: waiting for br0 to become free. Usage count = 1'
> 
> None of the processes involved can be killed, and an attempt to run
> an ifconfig results in a process that is also waiting forever. At
> this point the box must be rebooted forcefully.
> 
> Two questions.
> 1. In a previous post, someone mentioned one solution was to
> commenting out the check that is hanging in the kernel. Does this
> check preventing something terrible from happening(i assumed that it
> does), or is it safe to remove it.
> 2. Any ideas of something to try in order to make this repeatable?
> 
> thanks,
> --robert scott


* [Bridge] unregister_netdevice: waiting for br0 to become free. Usage count = 1 (2.6.12.3)
@ 2015-06-26 12:04 Jorge Dominguez
  0 siblings, 0 replies; 9+ messages in thread
From: Jorge Dominguez @ 2015-06-26 12:04 UTC (permalink / raw)
  To: bridge

Hi,

This mailing list gave me a lot of information about this problem, and I
want to share the solution to the bad refcount on the bridge.

The fix has been applied to the kernel:

https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=f3abc9b963e004b8c96cd7fbee6fd905f2bfd620

commit f216f082b2b37c4943f1e7c393e2786648d48f6f
([NETFILTER]: bridge netfilter: deal with martians correctly)
added a refcount leak on in_dev.

Instead of using in_dev_get(), we can use __in_dev_get_rcu(),
as netfilter hooks are running under rcu_read_lock(), as pointed
by Patrick.


diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index 4fde742..907a82e 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -359,7 +359,7 @@ static int br_nf_pre_routing_finish(struct sk_buff *skb)
 			},
 			.proto = 0,
 		};
-		struct in_device *in_dev = in_dev_get(dev);
+		struct in_device *in_dev = __in_dev_get_rcu(dev);
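
The pattern behind the fix can be modelled in userspace C. This is a
sketch under assumptions, not the kernel implementation: every toy_ name
is hypothetical, and the borrowed lookup only stands in for an accessor
that is safe because the caller already runs inside a read-side critical
section. A counted lookup that is never paired with a put leaks one
reference per call, while the borrowed lookup takes none.

```c
/* Userspace model only; names are hypothetical and merely echo the patch. */
#include <stddef.h>

struct toy_in_dev { int refcnt; };

/* Counted lookup: the caller must pair it with toy_in_dev_put(). */
static struct toy_in_dev *toy_in_dev_get(struct toy_in_dev *d)
{
    d->refcnt++;
    return d;
}

static void toy_in_dev_put(struct toy_in_dev *d) { d->refcnt--; }

/* Borrowed lookup: takes no count; assumed valid only while the
 * caller stays inside its read-side section. */
static struct toy_in_dev *toy_in_dev_borrow(struct toy_in_dev *d)
{
    return d;
}

/* Hook that takes a reference and forgets to drop it. */
static int toy_hook_leaky(struct toy_in_dev *d)
{
    struct toy_in_dev *in = toy_in_dev_get(d);
    return in != NULL;
}

/* Hook that pairs the get with a put. */
static int toy_hook_balanced(struct toy_in_dev *d)
{
    struct toy_in_dev *in = toy_in_dev_get(d);
    int ok = in != NULL;
    toy_in_dev_put(in);
    return ok;
}

/* Hook that borrows the pointer, so there is nothing to drop. */
static int toy_hook_borrowed(struct toy_in_dev *d)
{
    struct toy_in_dev *in = toy_in_dev_borrow(d);
    return in != NULL;
}
```

With the leaky variant the count grows by one on every packet that
passes through the hook, so the owning device can never reach a usage
count of zero.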

Best Regards,
Jorge.



> Two other recent reports are:
> 1. Buggy applications that hold packets in their input queue forever,
>    and/or netfilters.  The socket buffer's contain a reference for
>    packets in flight.

that may be it, but I am not sure which queue you are talking about,
but there is an application that is using the netfiler ip_queue to
queue packets to user space.  in this application, these packets can
be held in user space for extended periods of time (up to 30/60
seconds), and then they are either dropped or released.  Could this
possibly be creating a problem?

I don't believe that the system is using any of the VLAN code.

> I have found an appearant leak of a route object, which holds a
> reference
> to a device.  I reproduced in both 2.6.11 and 2.6.13 using 802.1Q
> VLANs.
> I have a patch that will print out the place of the leaked reference
> against 2.6.13.
>
> http://www.candelatech.com/oss/rfcnt.patch
>
> Enable the feature in the Networking section of Kconfig.
Ben, i will incorporate this patch and let you know if i turn up any
results.

thanks,
--robert

On Aug 31, 2005, at 9:37 PM, Stephen Hemminger wrote:

> On Wed, 31 Aug 2005 19:04:01 -0700
> Robert Scott <rbscott@axentra.net> wrote:
>
>
>> Hello,
>>
>> I know that this bug has been discussed before at length on this
>> mailing list, but previous post seemed to indicate that it was fixed
>> before kernel 2.6.12.  I am still seeing this occasionally in kernel
>> 2.6.12.3.  The system is running knoppix, and IPV6 is not compiled
>> into the kernel(other posts mentioned numerous problems with the IPV6
>> code).  But every so often, when bringing down the bridge (it doesn't
>> happen every time), the process hangs, and the following message
>> appears in dmesg repeatedly:
>>
>> 'unregister_netdevice: waiting for br0 to become free. Usage count
>> = 1'
>>
>> None of the processes involved can be killed, and an attempt to run
>> an ifconfig results in a process that is also waiting forever.  At
>> this point the box must be rebooted forcefully.
>>
>> Two questions.
>> 1. In a previous post, someone mentioned one solution was to
>> commenting out the check that is hanging in the kernel.   Does this
>> check preventing something terrible from happening(i assumed that it
>> does), or is it safe to remove it
>>
>
> Really bad idea, because if the thing that is holding the reference
> like packets stuck in some dead queue, ever get processed the kernel
> will die.
>
>
>> 2. Any ideas of something to try in order to make this repeatable?
>>
>
> Two other recent reports are:
> 1. Buggy applications that hold packets in their input queue forever,
>    and/or netfilters.  The socket buffer's contain a reference for
>    packets in flight.
>
> 2. The VLAN code had a number of reference bugs, if you look through
>    recent netdev mailing list you will see the discussion.
>

