public inbox for linux-kernel@vger.kernel.org
* [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
@ 2001-02-08  0:46 Thomas Hood
  2001-02-08 19:31 ` kuznet
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Hood @ 2001-02-08  0:46 UTC (permalink / raw)
  To: Andrew Morton, linux-kernel; +Cc: Bryan K. Walton, Russell Coker

Earlier I reported error messages when I tried to eject
a Xircom CEM56 network card under Linux 2.4.x.  (See below.
I also submitted this patch as a followup to that thread.)

Here is a patch which may not solve the underlying
problem but which does prevent the kernel from generating
an infinite number of 
    "unregister_netdevice: waiting for eth0 to become free.
     Usage count = -1"
messages on "cardctl eject" and from hanging up at shutdown.

-----------------------------------------------------
root@thanatos:/usr/src/kernel-source-2.4.1-ac3/net/core# diff -Naur dev.c_ORIG dev.c
--- dev.c_ORIG	Mon Feb  5 17:39:31 2001
+++ dev.c	Wed Feb  7 18:35:45 2001
@@ -2555,7 +2555,7 @@
 	 */
 
 	now = warning_time = jiffies;
-	while (atomic_read(&dev->refcnt) != 1) {
+	while (atomic_read(&dev->refcnt) > 1) {
 		if ((jiffies - now) > 1*HZ) {
 			/* Rebroadcast unregister notification */
 			notifier_call_chain(&netdev_chain, NETDEV_UNREGISTER, dev);
---------------------------------------------------

The underlying problem seems to be that refcnt is zero or
less at this point.  This is erroneous.  The error in 
maintaining the refcnt appears to occur only when 
I configure the eth0 interface (using pump or dhclient).
The more times I "ifup eth0" and "ifdown eth0" before
ejecting the card, the lower the "usage count" is 
reported to be (i.e., the larger the negative number!).

Be that as it may, because of the erroneous refcnt,
the above while loop within unregister_netdevice()
loops forever in the original code.  As modified it
falls through; and this makes the kernel usable for me.
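
The difference between the two loop tests can be sketched in a few
lines of userspace C.  This is only a model, not the kernel loop: the
function names are hypothetical, the count is assumed never to change
while polling, and hitting max_polls stands in for "loops forever".

```c
#include <assert.h>
#include <stdbool.h>

/* Original loop test: keep waiting until the count is exactly 1. */
bool waits_original(int refcnt)
{
	return refcnt != 1;
}

/* Patched loop test: keep waiting only while extra references remain. */
bool waits_patched(int refcnt)
{
	return refcnt > 1;
}

/*
 * Poll a count that never changes, at most max_polls times, and
 * return how many polls were spent waiting.
 */
int polls_spent(int refcnt, bool patched, int max_polls)
{
	int polls = 0;

	assert(max_polls >= 0);
	while (polls < max_polls &&
	       (patched ? waits_patched(refcnt) : waits_original(refcnt)))
		polls++;
	return polls;
}
```

With a count of -1 the original test uses up every poll, while the
patched test does not wait at all.  For a count of 1 or more the two
tests behave the same.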

In order to avoid the 
   "KERNEL: assertion(dev->ip_ptr==NULL)failed at
    dev.c(2422):netdev_finish_unregister"
message and the occasional
   "Freeing alive device"
message it seems to suffice that I kill the dhclient
process before running "ifdown eth0".  Am I right in
assuming that the latter messages aren't serious?

I hope the networking gurus can find the real bugs here.

Thomas Hood

> I have a bit more information about this bug now.
> The message "assertion(yadda) failed ..." occurs only
> if the eth0 interface has been configured using pump
> or dhclient.  If the card isn't connected to the network
> the message never occurs.  If eth0 is merely brought up
> and down using ifconfig the message doesn't occur.  Only
> if pump or dhclient has configured eth0 does the message
> occur.  Sometimes it occurs on "ifdown eth0", sometimes
> on "cardctl eject" and sometimes during the shutdown
> sequence.
> 
> Thomas
> 
> > 
> > Dear l-k.
> > 
> > I'm still having this problem with kernel 2.4.0:
> > 
> > Conditions:
> > Linux 2.4.0 compiled on an IBM ThinkPad 600 51U (Pentium II)
> > laptop with PCMCIA support.  Same behavior with integral kernel
> > PCMCIA, modular kernel PCMCIA and modular Hinds PCMCIA.  System
> > is Progeny Debian beta II.
> > 
> > I have a Xircom modem/ethernet card which works correctly using
> > the serial_cs, xirc2ps_cs, ds, i82365 and pcmcia_core modules;
> > however when I try to "cardctl eject" or "reboot" I get first,
> >    "KERNEL: assertion(dev->ip_ptr==NULL)failed at
> >     dev.c(2422):netdev_finish_unregister"
> > (not exact since I had to copy it down on paper ... doesn't
> > show up in the logs) then a perpetual series of:
> >    "unregister_netdevice: waiting for eth0 to become free.
> >     Usage count = -1"
> > messages every five seconds or so.  "ps -A" reveals that
> > modprobe is running; it can't be killed even with "kill -9".
> > The "ifconfig" command locks up.  Shutdown won't complete
> > so I end up having to use SysRq-S-U-B to reboot.
> > 
> > This problem only occurs if the Xircom card is connected to
> > the Ethernet (in which case it is configured using DHCP).
> > If the card is left unconnected to the network, the problem
> > does not occur---the card can be ejected.
> >
>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
  2001-02-08  0:46 Thomas Hood
@ 2001-02-08 19:31 ` kuznet
  0 siblings, 0 replies; 17+ messages in thread
From: kuznet @ 2001-02-08 19:31 UTC (permalink / raw)
  To: Thomas Hood; +Cc: linux-kernel

Hello!

> Here is a patch which may not solve the underlying

This does not. refcnt cannot be <1 at this point.


> assuming that the latter messages aren't serious?

They are fatal. Machine must be rebooted after them.


> I hope the networking gurus can find the real bugs here.

Well, someone forgets to grab refcnt or makes redundant dev_put.
Try to catch this, f.e. adding BUG() to the places where fatal
messages are generated to get backtraces.

Alexey

* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
@ 2001-02-09  5:44 Thomas Hood
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Hood @ 2001-02-09  5:44 UTC (permalink / raw)
  To: linux-kernel, jschlst, Andrew Morton, Bryan K. Walton,
	Russell Coker, dahinds

Okay, I now know the cause of this problem.  It's a bug in
the ipx driver.  (This also closes pcmcia-cs bug #126563.)

Through copious use of printk()s I was able to track down
the point at which the dev->refcnt for my card was being
erroneously decremented.  It was being erroneously decremented
by "notifier_call_chain(&netdev_chain, NETDEV_DOWN, dev);"
near the end of the "dev_close()" function in net/core/dev.c.
This indicated that one of the registered notifiers
was doing an improper "dev_put()".  I rmmod-ed the ipx module
from my system and the problem magically disappeared.  The
refcnt is no longer inappropriately decremented and there
are no more inappropriate calls to netdev_finish_unregister()
in the absence of a prior call to unregister_netdevice()
(which is what resulted in the "Freeing alive device" messages).

Where is the bug?  It is in the ipx driver.  When I configure
eth0 for ipx, the device gets added to the ipx driver's linked
list of devices headed at "ipx_interfaces".  The ipx driver
registers the following function to be notified of net events.
It's clear that the ipx driver will do a __ipxitf_put (which
decrements dev->refcnt and does a dev_put() on dev) every time
eth0 is taken down with "ifconfig eth0 down"!  That would be
okay, I guess, if the opposite were done on an "ifconfig eth0 up"
but it isn't.  This needs to be fixed somehow.  I'll leave it
up to the maintainers and other gurus to figure out how.  In
the meantime I'll just avoid using ipx.

---------------- net/ipx/af_ipx.c ---------------------------
static int ipxitf_device_event(struct notifier_block *notifier,
                                unsigned long event, void *ptr)
{
        struct net_device *dev = ptr;
        ipx_interface *i, *tmp;

        if (event != NETDEV_DOWN)
                return NOTIFY_DONE;

        spin_lock_bh(&ipx_interfaces_lock);
        for (i = ipx_interfaces; i;) {
                tmp = i->if_next;
                if (i->if_dev == dev)
                        __ipxitf_put(i);
                i = tmp;

        }
        spin_unlock_bh(&ipx_interfaces_lock);
        return NOTIFY_DONE;
}
--------------------------------------------------------------
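
A rough userspace model of the imbalance described above, assuming (as
this message describes it) that each NETDEV_DOWN event costs the device
one reference and that nothing is taken back on NETDEV_UP.  The struct,
enum and function names are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's event codes and net_device. */
enum model_event { MODEL_NETDEV_UP, MODEL_NETDEV_DOWN };

struct model_dev {
	int refcnt;
};

/*
 * Same shape as the handler above: every event except DOWN is ignored,
 * and each DOWN drops one reference.
 */
void model_ipx_event(struct model_dev *dev, enum model_event event)
{
	assert(dev != NULL);
	if (event != MODEL_NETDEV_DOWN)
		return;
	dev->refcnt--;	/* stands in for the dev_put() done via __ipxitf_put() */
}

/* Run a number of up/down cycles and return the resulting count. */
int model_cycle(int start_refcnt, int cycles)
{
	struct model_dev dev = { start_refcnt };
	int i;

	for (i = 0; i < cycles; i++) {
		model_ipx_event(&dev, MODEL_NETDEV_UP);
		model_ipx_event(&dev, MODEL_NETDEV_DOWN);
	}
	return dev.refcnt;
}
```

Each extra up/down cycle lowers the modelled count by one, which matches
the observation that more ifup/ifdown cycles give a more negative
"usage count".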

--
Thomas Hood
Please reply to me at:  jdthood_AT_yahoo.co.uk

I wrote:
> Earlier I reported error messages when I tried to eject 
> a Xircom CEM56 network card under Linux 2.4.x. (See below. 
> I also submitted this patch as a followup to that thread.) 
> 
> Here is a patch which may not solve the underlying 
> problem but which does prevent the kernel from generating 
> an infinite number of 
>     "unregister_netdevice: waiting for eth0 to become free. 
>      Usage count = -1" 
> messages on "cardctl eject" and from hanging up at shutdown. 
> 
> ----------------------------------------------------- 
> root@thanatos:/usr/src/kernel-source-2.4.1-ac3/net/core# diff -Naur dev.c_ORIG dev.c 
> --- dev.c_ORIG Mon Feb 5 17:39:31 2001 
> +++ dev.c Wed Feb 7 18:35:45 2001 
> @@ -2555,7 +2555,7 @@ 
>           */ 
>   
>          now = warning_time = jiffies; 
> - while (atomic_read(&dev->refcnt) != 1) { 
> + while (atomic_read(&dev->refcnt) > 1) { 
>                  if ((jiffies - now) > 1*HZ) { 
>                          /* Rebroadcast unregister notification */ 
>                          notifier_call_chain(&netdev_chain, NETDEV_UNREGISTER, dev); 
> --------------------------------------------------- 
> 
> The underlying problem seems to be that refcnt is zero or
> less at this point. This is erroneous. The error in 
> maintaining the refcnt appears to occur only when 
> I configure the eth0 interface (using pump or dhclient). 
> The more times I "ifup eth0" and "ifdown eth0" before 
> ejecting the card, the lower the "usage count" is 
> reported to be (i.e., the larger the negative number!). 
> 
> Be that as it may, because of the erroneous refcnt, 
> the above while loop within unregister_netdevice() 
> loops forever in the original code. As modified it 
> falls through; and this makes the kernel usable for me. 
> 
> In order to avoid the 
>    "KERNEL: assertion(dev->ip_ptr==NULL)failed at 
>     dev.c(2422):netdev_finish_unregister" 
> message and the occasional 
>    "Freeing alive device" 
> message it seems to suffice that I kill the dhclient 
> process before running "ifdown eth0". Am I right in 
> assuming that the latter messages aren't serious? 
> 
> I hope the networking gurus can find the real bugs here. 
> 
> Thomas Hood

* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
@ 2001-02-10  6:01 Thomas Hood
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Hood @ 2001-02-10  6:01 UTC (permalink / raw)
  To: linux-kernel; +Cc: kuznet

> > Here is a patch which may not solve the underlying 
> 
> This does not. refcnt cannot be <1 at this point. 

The refcnt shouldn't be less than 1, but it is in fact
less than 1.  (As I'm sure you understand.)

> > assuming that the latter messages aren't serious? 
> 
> They are fatal. Machine must be rebooted after them. 

True.  I found that with testing---lots of ifups and ifdowns,
etc.---the kernel becomes unstable.

> > I hope the networking gurus can find the real bugs here. 
> 
> Well, someone forgets to grab refcnt or makes redundant dev_put. 
> Try to catch this, f.e. adding BUG() to the places where fatal 
> messages are generated to get backtraces. 
> Alexey

I think that the ipx driver makes an inappropriate dev_put
in its notifier callback.  However that is for people better
acquainted with the code than I to judge.  Removing the ipx
driver does work around the problem though.

Thomas

* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
@ 2001-02-12 12:53 Thomas Hood
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Hood @ 2001-02-12 12:53 UTC (permalink / raw)
  To: linux-kernel
  Cc: jschlst, Andrew Morton, Bryan K. Walton, Russell Coker, dahinds,
	kuznet, jgarzik

This bug was fixed by "acme" in 2.4.1-ac10.  :)
The ipx driver now increments refcnt on NETDEV_UP to
match the decrement it does on NETDEV_DOWN.

Thanks all.
Thomas


> Okay, I now know the cause of this problem.  It's a bug in
> the ipx driver.  (This also closes pcmcia-cs bug #126563.)
> 
> Through copious use of printk()s I was able to track down
> the point at which the dev->refcnt for my card was being
> erroneously decremented.  It was being erroneously decremented
> by "notifier_call_chain(&netdev_chain, NETDEV_DOWN, dev);"
> near the end of the "dev_close()" function in net/core/dev.c.
> This indicated that one of the registered notifiers
> was doing an improper "dev_put()".  I rmmod-ed the ipx module
> from my system and the problem magically disappeared.  The
> refcnt is no longer inappropriately decremented and there
> are no more inappropriate calls to netdev_finish_unregister()
> in the absence of a prior call to unregister_netdevice()
> (which is what resulted in the "Freeing alive device" messages).
> 
> Where is the bug?  It is in the ipx driver.  When I configure
> eth0 for ipx, the device gets added to the ipx driver's linked
> list of devices headed at "ipx_interfaces".  The ipx driver
> registers the following function to be notified of net events.
> It's clear that the ipx driver will do a __ipxitf_put (which
> decrements dev->refcnt and does a dev_put() on dev) every time
> eth0 is taken down with "ifconfig eth0 down"!  That would be
> okay, I guess, if the opposite were done on an "ifconfig eth0 up"
> but it isn't.  This needs to be fixed somehow.  I'll leave it
> up to the maintainers and other gurus to figure out how.  In
> the meantime I'll just avoid using ipx.
> 
> ---------------- net/ipx/af_ipx.c ---------------------------
> static int ipxitf_device_event(struct notifier_block *notifier,
>                                 unsigned long event, void *ptr)
> {
>         struct net_device *dev = ptr;
>         ipx_interface *i, *tmp;
> 
>         if (event != NETDEV_DOWN)
>                 return NOTIFY_DONE;
> 
>         spin_lock_bh(&ipx_interfaces_lock);
>         for (i = ipx_interfaces; i;) {
>                 tmp = i->if_next;
>                 if (i->if_dev == dev)
>                         __ipxitf_put(i);
>                 i = tmp;
> 
>         }
>         spin_unlock_bh(&ipx_interfaces_lock);
>         return NOTIFY_DONE;
> }
> --------------------------------------------------------------

* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
  2001-02-12 18:56 Thomas Hood
@ 2001-02-12 17:27 ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2001-02-12 17:27 UTC (permalink / raw)
  To: Thomas Hood
  Cc: linux-kernel, jschlst, Andrew Morton, Bryan K. Walton,
	Russell Coker, dahinds, kuznet, jgarzik

On Mon, Feb 12, 2001 at 01:56:18PM -0500, Thomas Hood wrote:
> Sorry, but it turns out that the bug is not completely
> fixed by the change that acme made.  With the change,
> ifup-ing and if-downing eth0 with the ipx module loaded
> no longer reduces eth0's refcnt to an indefinitely low
> (larger and larger negative) number.  However if the ipx
> module is loaded first and ipx configured on eth0, and
> then the network card inserted and "ifconfig eth0 up" done,
> and then "ifconfig eth0 down" done, then once again the
> refcnt is too low, so that when I try to "cardctl eject"
> my ethernet card, "modprobe -r xirc2ps_cs" hangs up.
> This whole business of refcnts needs to be thought 
> through more carefully.

As I said in the message I sent, I'm unfortunately damn busy these days
and wrote this patch in a hurry.  I was waiting for your feedback to see
whether it fixed the problem, but it was applied because it seemed
obviously correct.  I'll try to work on this as soon as possible, but it
may take some time.

- Arnaldo
 
> > This bug was fixed by "acme" in 2.4.1-ac10.  :)
> > The ipx driver now increments refcnt on NETDEV_UP to
> > match downing the interface on NETDEV_DOWN.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://vger.kernel.org/lkml/


* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
@ 2001-02-12 18:56 Thomas Hood
  2001-02-12 17:27 ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 17+ messages in thread
From: Thomas Hood @ 2001-02-12 18:56 UTC (permalink / raw)
  To: linux-kernel
  Cc: jschlst, Andrew Morton, Bryan K. Walton, Russell Coker, dahinds,
	kuznet, jgarzik, acme

Sorry, but it turns out that the bug is not completely
fixed by the change that acme made.  With the change,
ifup-ing and if-downing eth0 with the ipx module loaded
no longer reduces eth0's refcnt to an indefinitely low
(larger and larger negative) number.  However if the ipx
module is loaded first and ipx configured on eth0, and
then the network card inserted and "ifconfig eth0 up" done,
and then "ifconfig eth0 down" done, then once again the
refcnt is too low, so that when I try to "cardctl eject"
my ethernet card, "modprobe -r xirc2ps_cs" hangs up.
This whole business of refcnts needs to be thought 
through more carefully.

> This bug was fixed by "acme" in 2.4.1-ac10.  :)
> The ipx driver now increments refcnt on NETDEV_UP to
> match downing the interface on NETDEV_DOWN.
> 
> Thanks all.
> Thomas
> 
> 
> > Okay, I now know the cause of this problem.  It's a bug in
> > the ipx driver.  (This also closes pcmcia-cs bug #126563.)
> > 
> > Through copious use of printk()s I was able to track down
> > the point at which the dev->refcnt for my card was being
> > erroneously decremented.  It was being erroneously decremented
> > by "notifier_call_chain(&netdev_chain, NETDEV_DOWN, dev);"
> > near the end of the "dev_close()" function in net/core/dev.c.
> > This indicated that one of the registered notifiers
> > was doing an improper "dev_put()".  I rmmod-ed the ipx module
> > from my system and the problem magically disappeared.  The
> > refcnt is no longer inappropriately decremented and there
> > are no more inappropriate calls to netdev_finish_unregister()
> > in the absence of a prior call to unregister_netdevice()
> > (which is what resulted in the "Freeing alive device" messages).
> > 
> > Where is the bug?  It is in the ipx driver.  When I configure
> > eth0 for ipx, the device gets added to the ipx driver's linked
> > list of devices headed at "ipx_interfaces".  The ipx driver
> > registers the following function to be notified of net events.
> > It's clear that the ipx driver will do a __ipxitf_put (which
> > decrements dev->refcnt and does a dev_put() on dev) every time
> > eth0 is taken down with "ifconfig eth0 down"!  That would be
> > okay, I guess, if the opposite were done on an "ifconfig eth0 up"
> > but it isn't.  This needs to be fixed somehow.  I'll leave it
> > up to the maintainers and other gurus to figure out how.  In
> > the meantime I'll just avoid using ipx.
> > 
> > ---------------- net/ipx/af_ipx.c ---------------------------
> > static int ipxitf_device_event(struct notifier_block *notifier,
> >                                 unsigned long event, void *ptr)
> > {
> >         struct net_device *dev = ptr;
> >         ipx_interface *i, *tmp;
> > 
> >         if (event != NETDEV_DOWN)
> >                 return NOTIFY_DONE;
> > 
> >         spin_lock_bh(&ipx_interfaces_lock);
> >         for (i = ipx_interfaces; i;) {
> >                 tmp = i->if_next;
> >                 if (i->if_dev == dev)
> >                         __ipxitf_put(i);
> >                 i = tmp;
> > 
> >         }
> >         spin_unlock_bh(&ipx_interfaces_lock);
> >         return NOTIFY_DONE;
> > }
> > --------------------------------------------------------------
>

* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
       [not found]     ` <20010214122244.H7859@conectiva.com.br>
@ 2001-02-15 21:13       ` Thomas Hood
  2001-02-19 15:27         ` [PATCH] fix bad dev->refcnt in unregister_netdevice was " Arnaldo Carvalho de Melo
  2001-02-21 16:22       ` Thomas Hood
                         ` (4 subsequent siblings)
  5 siblings, 1 reply; 17+ messages in thread
From: Thomas Hood @ 2001-02-15 21:13 UTC (permalink / raw)
  To: linux-kernel

Update on the "unregister_netdevice" bug ...

Arnaldo Carvalho de Melo has been valiantly trying in his
scarce free time to find the cause.  I haven't been able to
hunt effectively because I don't really understand the networking
code; however I have been experimenting to see what are the
exact conditions under which the failure occurs.  I modified
my kernel to print dev->refcnt in /proc/net/dev so that I
could see what the refcnt of eth0 is at any given moment.
One of the more interesting experiment logs is appended 
below.

Experimentation seems to show
1) It happens when ipx is used, specifically when 
   auto_interface=on and auto_primary=on
2) It happens only or especially when using DHCP
3) It happens only to PCMCIA ethernet cards

Thomas Hood
jdthood_AT_yahoo.co.uk

Linux 2.4.1-ac10
/etc/pcmcia/network disabled with an 'exit 0'

command                                               refcnt  message
-------                                               ------  -------
(boot)                                                     0
(I inserted Xircom card)                                   1
ifconfig eth0 up                                           2
ipx_configure --auto_interface=on --auto_primary=on        2
ifconfig eth0 down                                         0  "Freeing alive device c127ac8c, eth0"
cardctl eject                                              ?  "unregister_netdevice: waiting for
   eth0 to become free. Usage count = 0
   Message from syslogd@thanatos at Wed Feb 14 12:51:26 2001 ...
   thanatos kernel: unregister_netdevice: waiting for eth0 to become free.
   Usage count = 0"


* [PATCH] fix bad dev->refcnt in unregister_netdevice was Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
  2001-02-15 21:13       ` [PATCH] to deal with bad dev->refcnt in unregister_netdevice() Thomas Hood
@ 2001-02-19 15:27         ` Arnaldo Carvalho de Melo
  0 siblings, 0 replies; 17+ messages in thread
From: Arnaldo Carvalho de Melo @ 2001-02-19 15:27 UTC (permalink / raw)
  To: davem; +Cc: linux-kernel, jdthood

Hi,

	Found it; here's the patch, please apply. When not using auto
creation of interfaces we use dev_get_by_name, which does a dev_hold; in
ipx_auto_create we're not doing it, duh 8) Tested with several
combinations of unplugging the PCMCIA card, just ifconfig eth0 down, etc.

- Arnaldo

--- linux-2.4.2-pre4/net/ipx/af_ipx.c	Mon Feb 19 06:00:21 2001
+++ linux-2.4.2-pre4.acme/net/ipx/af_ipx.c	Mon Feb 19 12:15:27 2001
@@ -1194,6 +1194,7 @@
 		atomic_set(&intrfc->refcnt, 1);
 		MOD_INC_USE_COUNT;
 		ipxitf_insert(intrfc);
+		dev_hold(dev);
 	}
 
 	return intrfc;
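
Why the missing dev_hold() matters can be shown with a small userspace
model.  It assumes, as described above, that the explicit path takes a
reference through the lookup and that teardown always drops one.  All
model_* names are hypothetical and none of this is the af_ipx.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	int refcnt;
};

void model_dev_hold(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->refcnt++;
}

void model_dev_put(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->refcnt--;
}

/* Explicit creation: the lookup by name is what takes the reference. */
void model_create_by_name(struct model_dev *dev)
{
	model_dev_hold(dev);
}

/* Auto creation: no lookup happens, so a reference is taken only if patched. */
void model_auto_create(struct model_dev *dev, bool patched)
{
	if (patched)
		model_dev_hold(dev);	/* the one-line fix */
}

/* Net change in refcnt after one create/teardown round trip. */
int model_net_change(bool auto_create, bool patched)
{
	struct model_dev dev = { 1 };

	if (auto_create)
		model_auto_create(&dev, patched);
	else
		model_create_by_name(&dev);
	model_dev_put(&dev);	/* teardown drops one reference on either path */
	return dev.refcnt - 1;
}
```

In this model only the unpatched auto-create path ends one reference
short; the explicit path and the patched auto-create path both balance.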


On Thu, Feb 15, 2001 at 04:13:01PM -0500, Thomas Hood wrote:
> Update on the "unregister_netdevice" bug ...
> 
> Arnaldo Carvalho de Melo has been valiantly trying in his
> scarce free time to find the cause.  I haven't been able to
> hunt effectively because I don't really understand the networking
> code; however I have been experimenting to see what are the
> exact conditions under which the failure occurs.  I modified
> my kernel to print dev->refcnt in /proc/net/dev so that I
> could see what the refcnt of eth0 is at any given moment.
> One of the more interesting experiment logs is appended 
> below.
> 
> Experimentation seems to show
> 1) It happens when ipx is used, specifically when 
>    auto_interface=on and auto_primary=on
> 2) It happens only or especially when using DHCP
> 3) It happens only to PCMCIA ethernet cards
> 
> Thomas Hood
> jdthood_AT_yahoo.co.uk
> 
> Linux 2.4.1-ac10
> /etc/pcmcia/network disabled with an 'exit 0'
> 
> command                         refcnt  message
> -------                         ------  -------
> (boot)                               0
> (I inserted Xircom card)             1
> ifconfig eth0 up                     2
> ipx_configure --auto_interface=on --auto_primary=on    2
> ifconfig eth0 down                   0  "Freeing alive device c127ac8c, eth0"
> cardctl eject                        ?  "unregister_netdevice: waiting for
>    eth0 to become free. Usage count = 0
>    Message from syslogd@thanatos at Wed Feb 14 12:51:26 2001 ...
>    thanatos kernel: unregister_netdevice: waiting for eth0 to become free.
>    Usage count = 0"

* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
       [not found]     ` <20010214122244.H7859@conectiva.com.br>
  2001-02-15 21:13       ` [PATCH] to deal with bad dev->refcnt in unregister_netdevice() Thomas Hood
@ 2001-02-21 16:22       ` Thomas Hood
  2001-02-25 23:42       ` Should isa-pnp utilize the PnP BIOS? Thomas Hood
                         ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Thomas Hood @ 2001-02-21 16:22 UTC (permalink / raw)
  To: linux-kernel

Update on the "unregister_netdevice" bug ...

Arnaldo Carvalho de Melo found one bug but there
remains another one that makes the dev->refcnt too
high instead of too low.

To be continued ...

Thomas


* Should isa-pnp utilize the PnP BIOS?
       [not found]     ` <20010214122244.H7859@conectiva.com.br>
  2001-02-15 21:13       ` [PATCH] to deal with bad dev->refcnt in unregister_netdevice() Thomas Hood
  2001-02-21 16:22       ` Thomas Hood
@ 2001-02-25 23:42       ` Thomas Hood
  2001-02-25 23:49         ` Jeremy Jackson
  2001-02-26  0:24         ` Jonathan Morton
  2001-03-03  0:33       ` Thomas Hood
                         ` (2 subsequent siblings)
  5 siblings, 2 replies; 17+ messages in thread
From: Thomas Hood @ 2001-02-25 23:42 UTC (permalink / raw)
  To: linux-kernel

Hello, l-k.

On my ThinkPad 600, the ThinkPad PnP BIOS configures
all PnP devices at boot time.

If I load the isa-pnp.o driver it never detects any ISA PnP
devices: it says "isapnp: No Plug & Play device found".  This
is unfortunate, because it means that device drivers can't
find out from isa-pnp where the devices are.

David Hinds's pcmcia-cs package contains driver code that
interfaces with the PnP BIOS.  With it, one can list the resource
usage of ISA PnP devices (serial and parallel ports, sound chip,
etc.) and set them, using the "lspnp" and "setpnp" commands.

Would it not be useful if the isa-pnp driver would fall back
to utilizing the PnP BIOS (if possible) in order to read and
change ISA PnP device configurations when it can't do this
itself?  If so, could this perhaps be done by bringing the Hinds
PnP BIOS driver into the kernel and interfacing isa-pnp to it?

Thomas Hood
jdthood_AT_yahoo.co.uk   <- Change '_AT_' to '@'


* Re: Should isa-pnp utilize the PnP BIOS?
  2001-02-25 23:42       ` Should isa-pnp utilize the PnP BIOS? Thomas Hood
@ 2001-02-25 23:49         ` Jeremy Jackson
  2001-02-26  0:24         ` Jonathan Morton
  1 sibling, 0 replies; 17+ messages in thread
From: Jeremy Jackson @ 2001-02-25 23:49 UTC (permalink / raw)
  To: jdthoodREMOVETHIS; +Cc: linux-kernel

Thomas Hood wrote:

> On my ThinkPad 600, the ThinkPad PnP BIOS configures
> all PnP devices at boot time.
>
> If I load the isa-pnp.o driver it never detects any ISA PnP
> devices: it says "isapnp: No Plug & Play device found".  This
> is unfortunate, because it means that device drivers can't
> find out from isa-pnp where the devices are.
>
> David Hinds's pcmcia-cs package contains driver code that
> interfaces with the PnP BIOS.  With it, one can list the resource
> usage of ISA PnP devices (serial and parallel ports, sound chip,
> etc.) and set them, using the "lspnp" and "setpnp" commands.
>
> Would it not be useful if the isa-pnp driver would fall back
> to utilizing the PnP BIOS (if possible) in order to read and

I would find this EXTREMELY useful... my Compaq laptop's
hot-dock with power eject will only work if Linux uses
the PnP BIOS's insert/eject methods.

I saw some code in early 2.3 that would talk to the BIOS; I still have
a tarball, but it seems 2.4 only does hardware banging (best in
*most* cases...)

>
> change ISA PnP device configurations when it can't do this
> itself?  If so, could this perhaps be done by bringing the Hinds
> PnP BIOS driver into the kernel and interfacing isa-pnp to it?



* Re: Should isa-pnp utilize the PnP BIOS?
  2001-02-25 23:42       ` Should isa-pnp utilize the PnP BIOS? Thomas Hood
  2001-02-25 23:49         ` Jeremy Jackson
@ 2001-02-26  0:24         ` Jonathan Morton
  1 sibling, 0 replies; 17+ messages in thread
From: Jonathan Morton @ 2001-02-26  0:24 UTC (permalink / raw)
  To: Jeremy Jackson, linux-kernel

>> Would it not be useful if the isa-pnp driver would fall back
>> to utilizing the PnP BIOS (if possible) in order to read and
>
>I would find this EXTREMELY useful... my Compaq laptop's
>hot-dock with power eject will only work if Linux uses
>the PnP BIOS's insert/eject methods.
>
>I saw some code in early 2.3 that would talk to the BIOS; I still have
>a tarball, but it seems 2.4 only does hardware banging (best in
>*most* cases...)

There are some desktop m/boards that don't seem to respond to the
kernel-mode ISA-PnP at the moment, too.  Particularly my Abit KT7 - I have
to use user-mode ISA-PnP for it to pick up my nice SB AWE-64.  This needs
fixing somehow, and maybe looking at the PnP BIOS stuff is the right way.

--------------------------------------------------------------
from:     Jonathan "Chromatix" Morton
mail:     chromi@cyberspace.org  (not for attachments)
big-mail: chromatix@penguinpowered.com
uni-mail: j.d.morton@lancaster.ac.uk

The key to knowledge is not to rely on people to teach you it.

Get VNC Server for Macintosh from http://www.chromatix.uklinux.net/vnc/

-----BEGIN GEEK CODE BLOCK-----
Version 3.12
GCS$/E/S dpu(!) s:- a20 C+++ UL++ P L+++ E W+ N- o? K? w--- O-- M++$ V? PS
PE- Y+ PGP++ t- 5- X- R !tv b++ DI+++ D G e+ h+ r- y+
-----END GEEK CODE BLOCK-----




* Re: Should isa-pnp utilize the PnP BIOS?
       [not found]     ` <20010214122244.H7859@conectiva.com.br>
                         ` (2 preceding siblings ...)
  2001-02-25 23:42       ` Should isa-pnp utilize the PnP BIOS? Thomas Hood
@ 2001-03-03  0:33       ` Thomas Hood
  2001-07-20  2:48       ` [BUG] "unregister_netdevice: waiting for eth0 to become free. Usage count = 2" Thomas Hood
  2001-07-28  5:44       ` Multiple apm resume events Thomas Hood
  5 siblings, 0 replies; 17+ messages in thread
From: Thomas Hood @ 2001-03-03  0:33 UTC (permalink / raw)
  To: linux-kernel

Okay, a couple of people have responded positively to this 
suggestion.  The next question is, how should it be implemented?

How 'bout:

$ cd pcmcia-cs/modules
$ cp pnp_bios.c pnp_proc.c pnp_rsrc.c /usr/src/linux/2.4.2a/drivers/pnp
$ cd ../include/linux
$ cp pnp_bios.h pnp_resource.h /usr/src/linux/2.4.2a/include/linux
Edit the makefiles.
Edit isapnp.c to include a new global flag, "isapnp_usepnpbios",
  a MODULE_PARM, which each isapnp function checks at entry.
  If the flag is set then: in the case of "low-level" functions,
  return immediately; in the case of "high-level" functions, call
  appropriate pnp_bios functions to perform the task; in the case
  of isapnp_init(), just check isapnp_disabled and exit.  isapnp's
  /proc interface would not be supported.  Presumably
  inter_module_get_request() would be used to call the isapnp-bios
  routines.
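
A minimal sketch of the flag check described above, written as a user-space
model rather than as a patch to isapnp.c.  Every name prefixed with sketch_,
the backend enum, and last_backend are hypothetical and exist only for
illustration; the real pnp_bios code in pcmcia-cs has its own interface, and
the inter_module_get_request() lookup is left out.

```c
#include <assert.h>

/* Stand-in for the proposed MODULE_PARM flag. */
static int isapnp_usepnpbios = 0;

/* Records which backend served the last request (illustration only). */
enum backend { BACKEND_NONE, BACKEND_ISAPNP, BACKEND_PNPBIOS };
static enum backend last_backend = BACKEND_NONE;

/* Hypothetical PnP BIOS helper; only reached when the flag is set. */
static int sketch_pnpbios_read_config(int devnum)
{
    assert(isapnp_usepnpbios);
    (void)devnum;
    last_backend = BACKEND_PNPBIOS;
    return 0;
}

/* "Low-level" function: it would touch the ISA PnP ports directly,
 * so it returns immediately when the BIOS is in charge. */
static void sketch_isapnp_wake(int csn)
{
    if (isapnp_usepnpbios)
        return;
    (void)csn;
    last_backend = BACKEND_ISAPNP;
}

/* "High-level" function: hands the task to the PnP BIOS backend when
 * the flag is set, otherwise uses the low-level path. */
static int sketch_isapnp_read_config(int devnum)
{
    if (isapnp_usepnpbios)
        return sketch_pnpbios_read_config(devnum);
    sketch_isapnp_wake(devnum);
    return 0;
}
```

With the flag clear, sketch_isapnp_read_config() goes through the low-level
path; with it set, the low-level function becomes a no-op and the high-level
one delegates.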

Comments?  (Go easy on me; I'm a newbie at kernel hacking.)

Thomas

> Hello, l-k.
> 
> On my ThinkPad 600, The ThinkPad PnP BIOS configures
> all PnP devices at boot time.
> 
> If I load the isa-pnp.o driver it never detects any ISA PnP
> devices: it says "isapnp: No Plug & Play device found".  This
> is unfortunate, because it means that device drivers can't
> find out from isa-pnp where the devices are.
> 
> David Hinds's pcmcia-cs package contains driver code that
> interfaces with the PnP BIOS.  With it, one can list the resource
> usage of ISA PnP devices (serial and parallel ports, sound chip,
> etc.) and set them, using the "lspnp" and "setpnp" commands.
> 
> Would it not be useful if the isa-pnp driver would fall back
> to utilizing the PnP BIOS (if possible) in order to read and
> change ISA PnP device configurations when it can't do this
> itself?  If so, could this perhaps be done by bringing the Hinds
> PnP BIOS driver into the kernel and interfacing isa-pnp to it?
> 
> Thomas Hood
> jdthood_AT_yahoo.co.uk   <- Change '_AT_' to '@'


* Re: [PATCH] to deal with bad dev->refcnt in unregister_netdevice()
@ 2001-03-10  2:46 Thomas Hood
  0 siblings, 0 replies; 17+ messages in thread
From: Thomas Hood @ 2001-03-10  2:46 UTC (permalink / raw)
  To: linux-kernel

This bug seems to be fixed in 2.4.2-ac16.

Thanks again to Arnaldo Carvalho de Melo.

Thomas

On 21 Feb 2001 I wrote:
> Update on the "unregister_netdevice" bug ... 
> 
> Arnaldo Carvalho de Melo found one bug but there 
> remains another one that makes the dev->refcnt too 
> high instead of too low. 
> 
> To be continued ... 
> 
> Thomas


* [BUG] "unregister_netdevice: waiting for eth0 to become free. Usage count = 2"
       [not found]     ` <20010214122244.H7859@conectiva.com.br>
                         ` (3 preceding siblings ...)
  2001-03-03  0:33       ` Thomas Hood
@ 2001-07-20  2:48       ` Thomas Hood
  2001-07-28  5:44       ` Multiple apm resume events Thomas Hood
  5 siblings, 0 replies; 17+ messages in thread
From: Thomas Hood @ 2001-07-20  2:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: acme

>Groan<   The "unregister_netdevice" bug is back.

I haven't been able to do extensive testing, but I have
just encountered the message
   unregister_netdevice: waiting for eth0 to become free. Usage count = 2
again.  Once it starts, it repeats ad infinitum, once per second.
The message starts spewing when I do a "cardctl eject"
on a Xircom CEM56 modem/ethernet card (driven by xirc2ps_cs.o, serial.o)
which was previously configured using DHCP with IPX enabled.  The cardctl
eject never completes and the OS will not shut down completely; it hangs
at the point where it tries to de-configure network interfaces. Disabling
IPX cured the problem for me the next time I tried.

I am running 2.4.6-ac2, but the bug could have been reintroduced
a while back.  I haven't been using Ethernet for a couple of
months.

Well, I'm no expert on the networking code, so I'll just suggest
some things that look odd to me.  I'm looking in net/ipx/af_ipx.c,
tracing through ipxitf_create().  This function exits with dev->refcnt
incremented ... unless something goes wrong, in which case the function
exits via a goto to "out_dev", which decrements the refcnt again.
Likewise, ipxitf_auto_create() increments the dev refcnt (by doing a
dev_hold(dev)) if all goes well.  However when I look at ipxitf_delete(),
which I presume ought to undo what the *_create() functions do, I see
nothing that decrements the refcnt.
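
To make the suspected imbalance concrete, here is a small user-space model.
It is not the af_ipx.c code: struct sketch_dev and all sketch_ functions are
hypothetical, and the drop_ref argument exists only so that both the balanced
and the unbalanced delete can be compared.  The wait function mirrors the
"refcnt != 1" test in unregister_netdevice(), but bounded so that it
terminates.

```c
#include <assert.h>

/* User-space model of a device with a reference count. */
struct sketch_dev { int refcnt; };

static void sketch_dev_hold(struct sketch_dev *dev) { dev->refcnt++; }

static void sketch_dev_put(struct sketch_dev *dev)
{
    assert(dev->refcnt > 0);    /* a put without a matching hold is a bug */
    dev->refcnt--;
}

/* Models a *_create() function: on success it returns 0 with one extra
 * reference held; on failure it drops that reference via out_dev. */
static int sketch_itf_create(struct sketch_dev *dev, int fail)
{
    sketch_dev_hold(dev);
    if (fail)
        goto out_dev;
    return 0;
out_dev:
    sketch_dev_put(dev);
    return -1;
}

/* Models the delete path.  drop_ref == 0 is the suspected behaviour
 * (no put); drop_ref == 1 is the balanced version. */
static void sketch_itf_delete(struct sketch_dev *dev, int drop_ref)
{
    if (drop_ref)
        sketch_dev_put(dev);
}

/* Models the wait loop: poll until only the caller's reference is left.
 * Returns the number of polls needed, or -1 after max_polls attempts. */
static int sketch_wait_until_free(const struct sketch_dev *dev, int max_polls)
{
    int polls;

    for (polls = 0; polls < max_polls; polls++)
        if (dev->refcnt == 1)
            return polls;
    return -1;
}
```

Starting from refcnt = 1, a successful create followed by an unbalanced
delete leaves the count at 2, and the wait gives up; with the balanced delete
the count returns to 1 and the wait succeeds on the first poll.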

If this is where the bug lies then I would suggest that the functions
be documented to say that "this function exits with the refcnt incremented
if blah blah blah", etc.  

As an aside, I notice that __dev_get_by_name() is called from ipxitf_delete().
A comment preceding __dev_get_by_name() in net/core/dev.c says that this
function should be called "under RTNL semaphore or @dev_base_lock", but
it is actually called under the ipx_interfaces_lock.  Is this okay?
__dev_get_by_name() is also called from within ipxitf_ioctl(), seemingly
under no locks at all.  Also okay?
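
For reference, the usual shape of a "lookup that must be called under a lock"
paired with a wrapper that takes the lock and pins the result can be modelled
as below.  This is a user-space toy: the lock is a plain flag, the device
table is made up, and all sketch_ names are hypothetical.  It does not answer
whether ipx_interfaces_lock is sufficient; it only shows the calling rule
that the comment in net/core/dev.c seems to be asking for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_dev { const char *name; int refcnt; };

static struct sketch_dev sketch_devs[] = { { "eth0", 1 }, { "lo", 1 } };
static int sketch_list_lock_held;   /* toy stand-in for the list lock */

/* Caller must hold the list lock.  No reference is taken, so the
 * result is only safe to use while the lock is still held. */
static struct sketch_dev *sketch_lookup_locked(const char *name)
{
    size_t i;

    assert(sketch_list_lock_held);
    for (i = 0; i < sizeof(sketch_devs) / sizeof(sketch_devs[0]); i++)
        if (strcmp(sketch_devs[i].name, name) == 0)
            return &sketch_devs[i];
    return NULL;
}

/* Takes the lock, looks the device up, pins it with a reference, and
 * drops the lock.  The caller owns the reference and must release it. */
static struct sketch_dev *sketch_lookup_get(const char *name)
{
    struct sketch_dev *dev;

    sketch_list_lock_held = 1;
    dev = sketch_lookup_locked(name);
    if (dev)
        dev->refcnt++;
    sketch_list_lock_held = 0;
    return dev;
}
```

Calling sketch_lookup_locked() without the flag set trips the assert, which
is the toy equivalent of calling the double-underscore lookup under no lock.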

Thomas Hood


* Multiple apm resume events
       [not found]     ` <20010214122244.H7859@conectiva.com.br>
                         ` (4 preceding siblings ...)
  2001-07-20  2:48       ` [BUG] "unregister_netdevice: waiting for eth0 to become free. Usage count = 2" Thomas Hood
@ 2001-07-28  5:44       ` Thomas Hood
  5 siblings, 0 replies; 17+ messages in thread
From: Thomas Hood @ 2001-07-28  5:44 UTC (permalink / raw)
  To: linux-kernel

Machine:   ThinkPad 600
Kernel:    2.4.7-ac1

When I resume the machine the apmd_proxy script handles
*two* "resume suspend" events.  The apm driver ought to
filter multiple resume events.
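
A minimal sketch of the kind of filtering meant here, as a user-space model
and not the apm driver's code.  The event enum, struct sketch_filter, and
sketch_filter_event() are hypothetical names.  The assumption is that a
resume should be passed on only once per suspend; whether that is right for
every kind of resume event the BIOS can report is not addressed.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_event { SKETCH_SUSPEND, SKETCH_RESUME };

/* suspended is nonzero between a suspend and the first resume after it. */
struct sketch_filter { int suspended; };

/* Returns 1 if the event should be passed on to user space, 0 if it
 * should be dropped as a duplicate. */
static int sketch_filter_event(struct sketch_filter *f, enum sketch_event ev)
{
    assert(f != NULL);

    switch (ev) {
    case SKETCH_SUSPEND:
        f->suspended = 1;
        return 1;
    case SKETCH_RESUME:
        if (!f->suspended)
            return 0;       /* resume with no pending suspend: drop it */
        f->suspended = 0;
        return 1;
    }
    return 0;
}
```

With this filter, the sequence suspend, resume, resume delivers the suspend
and the first resume and drops the second resume.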

About a year ago I had this and a couple other problems
with the apm driver.  I submitted patches to the maintainer,
Stephen Rothwell, but he was MIA.  "Too busy", he said.
I see he is still listed as the maintainer.  Is there 
someone else who is acting as the de facto maintainer or
should I just post patches to this list?

Thomas Hood
Please cc: jdthood_AT_yahoo.co.uk  with "_AT_" -> "@"


Thread overview: 17+ messages
     [not found] <20010214092251.D1144@e-trend.de>
     [not found] ` <3A8AA725.7446DEA0@ubishops.ca>
     [not found]   ` <20010214165758.L28359@e-trend.de>
     [not found]     ` <20010214122244.H7859@conectiva.com.br>
2001-02-15 21:13       ` [PATCH] to deal with bad dev->refcnt in unregister_netdevice() Thomas Hood
2001-02-19 15:27         ` [PATCH] fix bad dev->refcnt in unregister_netdevice was " Arnaldo Carvalho de Melo
2001-02-21 16:22       ` Thomas Hood
2001-02-25 23:42       ` Should isa-pnp utilize the PnP BIOS? Thomas Hood
2001-02-25 23:49         ` Jeremy Jackson
2001-02-26  0:24         ` Jonathan Morton
2001-03-03  0:33       ` Thomas Hood
2001-07-20  2:48       ` [BUG] "unregister_netdevice: waiting for eth0 to become free. Usage count = 2" Thomas Hood
2001-07-28  5:44       ` Multiple apm resume events Thomas Hood
2001-03-10  2:46 [PATCH] to deal with bad dev->refcnt in unregister_netdevice() Thomas Hood
  -- strict thread matches above, loose matches on Subject: below --
2001-02-12 18:56 Thomas Hood
2001-02-12 17:27 ` Arnaldo Carvalho de Melo
2001-02-12 12:53 Thomas Hood
2001-02-10  6:01 Thomas Hood
2001-02-09  5:44 Thomas Hood
2001-02-08  0:46 Thomas Hood
2001-02-08 19:31 ` kuznet
