netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be  killed, during ppp shutdown
       [not found]   ` <40B744DC.9956BF50@kepier.clara.net>
@ 2004-05-29  5:17     ` Herbert Xu
  2004-05-29 12:00       ` Herbert Xu
  2004-05-29 19:48       ` David S. Miller
  0 siblings, 2 replies; 7+ messages in thread
From: Herbert Xu @ 2004-05-29  5:17 UTC (permalink / raw)
  To: Neil Pilgrim, 251215
  Cc: Stephen Hemminger, David S. Miller, Jeff Garzik, netdev

On Fri, May 28, 2004 at 02:55:40PM +0100, Neil Pilgrim wrote:
>
> May 26 16:15:53 localhost kernel: kernel BUG at net/core/dev.c:3038!
> May 26 16:15:53 localhost kernel: invalid operand: 0000 [#1]
> May 26 16:15:53 localhost kernel: PREEMPT 
> May 26 16:15:53 localhost kernel: CPU:    0
> May 26 16:15:53 localhost kernel: EIP:    0060:[free_netdev+43/80]    Not tainted
> May 26 16:15:53 localhost kernel: EFLAGS: 00010297   (2.6.6-1-k7) 
> May 26 16:15:53 localhost kernel: EIP is at free_netdev+0x2b/0x50
> May 26 16:15:53 localhost kernel: eax: df236400   ebx: da638000   ecx: c02eada0   edx: 00000003
> May 26 16:15:53 localhost kernel: esi: df236400   edi: ddf3b400   ebp: e0b542c0   esp: da639f48
> May 26 16:15:53 localhost kernel: ds: 007b   es: 007b   ss: 0068
> May 26 16:15:53 localhost kernel: Process pppd (pid: 3852, threadinfo=da638000 task=da5f06b0)
> May 26 16:15:53 localhost kernel: Stack: e0b5094f df236400 00000282 da5ed780 ddf3b400 e0b4d020 dffe4840 d921309c 
> May 26 16:15:53 localhost kernel:        e0b4d07f ddf3b400 da5ed9c0 c0152d24 d921309c da5ed9c0 d81c3c10 da5ed9c0 
> May 26 16:15:53 localhost kernel:        00000000 df689900 da638000 c01513e9 da5ed9c0 df689900 df689900 da5ed9c0 
> May 26 16:15:53 localhost kernel: Call Trace:
> May 26 16:15:53 localhost kernel:  [__crc___netdev_watchdog_up+1116592/2857393] ppp_shutdown_interface+0x7f/0xf0 [ppp_generic]
> May 26 16:15:53 localhost kernel:  [__crc___netdev_watchdog_up+1101953/2857393] ppp_release+0x0/0x70 [ppp_generic]
> May 26 16:15:53 localhost kernel:  [__crc___netdev_watchdog_up+1102048/2857393] ppp_release+0x5f/0x70 [ppp_generic]
> May 26 16:15:53 localhost kernel:  [__fput+276/304] __fput+0x114/0x130
> May 26 16:15:53 localhost kernel:  [filp_close+89/144] filp_close+0x59/0x90
> May 26 16:15:53 localhost kernel:  [sys_close+97/160] sys_close+0x61/0xa0
> May 26 16:15:53 localhost kernel:  [syscall_call+7/11] syscall_call+0x7/0xb
> May 26 16:15:53 localhost kernel: 
> May 26 16:15:53 localhost kernel: Code: 0f 0b de 0b e0 75 29 c0 eb de 2b 80 04 02 00 00 89 44 24 04 

OK, I've had a look and it looks like the free_netdev call in
ppp_shutdown_inteface is the problem.  What's happening is that
the todo list is being processed either on another CPU or by
preemption in another context.  As a result when the subsequent
free_netdev is called the device hasn't yet been processed and
is still in state UNREGISTERING.

Why do we need to call free_netdev after unregistering the netdev
from the drivers at all? What's wrong with calling it from run_todo
itself?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email:  Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be  killed, during ppp shutdown
  2004-05-29  5:17     ` Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be killed, during ppp shutdown Herbert Xu
@ 2004-05-29 12:00       ` Herbert Xu
  2004-05-29 19:51         ` David S. Miller
  2004-05-29 19:48       ` David S. Miller
  1 sibling, 1 reply; 7+ messages in thread
From: Herbert Xu @ 2004-05-29 12:00 UTC (permalink / raw)
  To: Herbert Xu; +Cc: debian.bugs, 251215, shemminger, davem, jgarzik, netdev

Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> OK, I've had a look and it looks like the free_netdev call in
> ppp_shutdown_inteface is the problem.  What's happening is that
> the todo list is being processed either on another CPU or by
> preemption in another context.  As a result when the subsequent
> free_netdev is called the device hasn't yet been processed and
> is still in state UNREGISTERING.
> 
> Why do we need to call free_netdev after unregistering the netdev
> from the drivers at all? What's wrong with calling it from run_todo
> itself?

Well I guess someone might want to reregister the net device :)

Here is a minimal fix for this problem.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email:  Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
===== net/core/dev.c 1.141 vs edited =====
--- 1.141/net/core/dev.c	2004-05-23 06:45:55 +10:00
+++ edited/net/core/dev.c	2004-05-29 21:56:18 +10:00
@@ -2999,7 +2999,6 @@
 
 		case NETREG_UNREGISTERING:
 			netdev_unregister_sysfs(dev);
-			dev->reg_state = NETREG_UNREGISTERED;
 
 			netdev_wait_allrefs(dev);
 
@@ -3015,6 +3014,7 @@
 			 */
 			if (dev->destructor)
 				dev->destructor(dev);
+			class_device_put(&dev->class_dev);
 			break;
 
 		default:
@@ -3140,6 +3140,8 @@
 	free_divert_blk(dev);
 
 	/* Finish processing unregister after unlock */
+	dev->reg_state = NETREG_UNREGISTERED;
+	class_device_get(&dev->class_dev);
 	net_set_todo(dev);
 
 	dev_put(dev);

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be killed, during ppp shutdown
  2004-05-29  5:17     ` Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be killed, during ppp shutdown Herbert Xu
  2004-05-29 12:00       ` Herbert Xu
@ 2004-05-29 19:48       ` David S. Miller
  2004-06-01 16:13         ` Stephen Hemminger
  1 sibling, 1 reply; 7+ messages in thread
From: David S. Miller @ 2004-05-29 19:48 UTC (permalink / raw)
  To: Herbert Xu; +Cc: debian.bugs, 251215, shemminger, jgarzik, netdev

On Sat, 29 May 2004 15:17:36 +1000
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> Why do we need to call free_netdev after unregistering the netdev
> from the drivers at all? What's wrong with calling it from run_todo
> itself?

Because the driver is the only agent which knows when it is safe
to free up the structure.  It may still have some attached memory
to free, for example, ala:

	unregister_netdev(dev);
	kfree(dev->priv);
	free_netdev(dev);

This is common, for example in drivers/net/tg3.c:tg3_remove_one() we
have:

		struct tg3 *tp = netdev_priv(dev);

		unregister_netdev(dev);
		iounmap((void *)tp->regs);
		free_netdev(dev);

See?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be killed, during ppp shutdown
  2004-05-29 12:00       ` Herbert Xu
@ 2004-05-29 19:51         ` David S. Miller
  2004-05-29 20:04           ` Herbert Xu
  0 siblings, 1 reply; 7+ messages in thread
From: David S. Miller @ 2004-05-29 19:51 UTC (permalink / raw)
  To: Herbert Xu; +Cc: herbert, debian.bugs, 251215, shemminger, jgarzik, netdev

On Sat, 29 May 2004 22:00:52 +1000
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> Here is a minimal fix for this problem.

Wouldn't it be easier just to move the "list_empty(&net_todo_list)"
check to be inside the net_todo_run_mutex semaphore?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be killed, during ppp shutdown
  2004-05-29 19:51         ` David S. Miller
@ 2004-05-29 20:04           ` Herbert Xu
  2004-05-29 20:11             ` David S. Miller
  0 siblings, 1 reply; 7+ messages in thread
From: Herbert Xu @ 2004-05-29 20:04 UTC (permalink / raw)
  To: David S. Miller; +Cc: debian.bugs, 251215, shemminger, jgarzik, netdev

On Sat, May 29, 2004 at 12:51:53PM -0700, David S. Miller wrote:
> On Sat, 29 May 2004 22:00:52 +1000
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> > Here is a minimal fix for this problem.
> 
> Wouldn't it be easier just to move the "list_empty(&net_todo_list)"
> check to be inside the net_todo_run_mutex semaphore?

Yes that should work much better.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email:  Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be killed, during ppp shutdown
  2004-05-29 20:04           ` Herbert Xu
@ 2004-05-29 20:11             ` David S. Miller
  0 siblings, 0 replies; 7+ messages in thread
From: David S. Miller @ 2004-05-29 20:11 UTC (permalink / raw)
  To: Herbert Xu; +Cc: debian.bugs, 251215, shemminger, jgarzik, netdev

On Sun, 30 May 2004 06:04:16 +1000
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Sat, May 29, 2004 at 12:51:53PM -0700, David S. Miller wrote:
> > On Sat, 29 May 2004 22:00:52 +1000
> > Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > 
> > > Here is a minimal fix for this problem.
> > 
> > Wouldn't it be easier just to move the "list_empty(&net_todo_list)"
> > check to be inside the net_todo_run_mutex semaphore?
> 
> Yes that should work much better.

Great, that's how I'll fix this bug.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be killed, during ppp shutdown
  2004-05-29 19:48       ` David S. Miller
@ 2004-06-01 16:13         ` Stephen Hemminger
  0 siblings, 0 replies; 7+ messages in thread
From: Stephen Hemminger @ 2004-06-01 16:13 UTC (permalink / raw)
  To: David S. Miller; +Cc: Herbert Xu, debian.bugs, 251215, jgarzik, netdev

On Sat, 29 May 2004 12:48:33 -0700
"David S. Miller" <davem@redhat.com> wrote:

> On Sat, 29 May 2004 15:17:36 +1000
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> > Why do we need to call free_netdev after unregistering the netdev
> > from the drivers at all? What's wrong with calling it from run_todo
> > itself?
> 
> Because the driver is the only agent which knows when it is safe
> to free up the structure.  It may still have some attached memory
> to free, for example, ala:
> 
> 	unregister_netdev(dev);
> 	kfree(dev->priv);
> 	free_netdev(dev);
> 
> This is common, for example in drivers/net/tg3.c:tg3_remove_one() we
> have:
> 
> 		struct tg3 *tp = netdev_priv(dev);
> 
> 		unregister_netdev(dev);
> 		iounmap((void *)tp->regs);
> 		free_netdev(dev);
> 
Also, for those device that don't want to do anything in between:
	dev->destructor = free_netdev;

Will end up calling free_netdev in the run_todo processing. This
can be very handy when unregister needs to happen in some context
already called with RTNL.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2004-06-01 16:13 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <E1BTLVa-000Ezb-FC@johnnybravo.uk.clara.net>
     [not found] ` <20040528124355.GA2391@gondor.apana.org.au>
     [not found]   ` <40B744DC.9956BF50@kepier.clara.net>
2004-05-29  5:17     ` Bug#251215: kernel-image-2.6.6-1-k7: pppd locks up, cannot be killed, during ppp shutdown Herbert Xu
2004-05-29 12:00       ` Herbert Xu
2004-05-29 19:51         ` David S. Miller
2004-05-29 20:04           ` Herbert Xu
2004-05-29 20:11             ` David S. Miller
2004-05-29 19:48       ` David S. Miller
2004-06-01 16:13         ` Stephen Hemminger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).