netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Fix a lock problem in generic phy code
@ 2007-08-31 12:30 Hans-Jürgen Koch
  2007-08-31 16:29 ` Hans-Jürgen Koch
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Hans-Jürgen Koch @ 2007-08-31 12:30 UTC (permalink / raw)
  To: linux-kernel; +Cc: netdev, Jeff Garzik, afleming

Lock debugging finds a problem in phy.c and phy_device.c,
this patch fixes it. Tested on an AT91SAM9263-EK board, 
kernel 2.6.23-rc4.

Signed-off-by: Hans J. Koch <hjk@linutronix.de>
---

Index: linux-2.6.23-rc/drivers/net/phy/phy_device.c
===================================================================
--- linux-2.6.23-rc.orig/drivers/net/phy/phy_device.c	2007-08-31 
14:07:47.000000000 +0200
+++ linux-2.6.23-rc/drivers/net/phy/phy_device.c	2007-08-31 14:08:22.000000000 
+0200
@@ -644,7 +644,7 @@
 	if (!(phydrv->flags & PHY_HAS_INTERRUPT))
 		phydev->irq = PHY_POLL;
 
-	spin_lock(&phydev->lock);
+	spin_lock_bh(&phydev->lock);
 
 	/* Start out supporting everything. Eventually,
 	 * a controller will attach, and may modify one
@@ -658,7 +658,7 @@
 	if (phydev->drv->probe)
 		err = phydev->drv->probe(phydev);
 
-	spin_unlock(&phydev->lock);
+	spin_unlock_bh(&phydev->lock);
 
 	return err;
 
Index: linux-2.6.23-rc/drivers/net/phy/phy.c
===================================================================
--- linux-2.6.23-rc.orig/drivers/net/phy/phy.c	2007-08-31 14:15:20.000000000 
+0200
+++ linux-2.6.23-rc/drivers/net/phy/phy.c	2007-08-31 14:15:43.000000000 +0200
@@ -755,7 +755,7 @@
  */
 void phy_start(struct phy_device *phydev)
 {
-	spin_lock(&phydev->lock);
+	spin_lock_bh(&phydev->lock);
 
 	switch (phydev->state) {
 		case PHY_STARTING:
@@ -769,7 +769,7 @@
 		default:
 			break;
 	}
-	spin_unlock(&phydev->lock);
+	spin_unlock_bh(&phydev->lock);
 }
 EXPORT_SYMBOL(phy_stop);
 EXPORT_SYMBOL(phy_start);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix a lock problem in generic phy code
  2007-08-31 12:30 [PATCH] Fix a lock problem in generic phy code Hans-Jürgen Koch
@ 2007-08-31 16:29 ` Hans-Jürgen Koch
  2007-09-09 21:36 ` [PATCH resend] " Hans-Jürgen Koch
  2007-09-13  4:12 ` [PATCH] " Jeff Garzik
  2 siblings, 0 replies; 8+ messages in thread
From: Hans-Jürgen Koch @ 2007-08-31 16:29 UTC (permalink / raw)
  To: linux-kernel; +Cc: netdev, Jeff Garzik, afleming

Am Freitag 31 August 2007 schrieb Hans-Jürgen Koch:
> Lock debugging finds a problem in phy.c and phy_device.c,
> this patch fixes it. Tested on an AT91SAM9263-EK board,
> kernel 2.6.23-rc4.

FYI, here's the log message without that patch:

[    3.420000] =================================
[    3.420000] [ INFO: inconsistent lock state ]
[    3.420000] 2.6.23-rc3-mm1 #21
[    3.420000] ---------------------------------
[    3.420000] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
[    3.420000] swapper/1 [HC0[0]:SC1[1]:HE1:SE0] takes:
[    3.420000]  (&dev->lock){-+..}, at: [<c017da1c>] phy_timer+0x1c/0x4c8
[    3.420000] {softirq-on-W} state was registered at:
[    3.420000]   [<c006021c>] lock_acquire+0x94/0xac
[    3.420000]   [<c026423c>] _spin_lock+0x40/0x50
[    3.420000]   [<c017e2b4>] phy_probe+0x40/0x88
[    3.420000]   [<c017e60c>] phy_attach+0xc8/0xfc
[    3.420000]   [<c017e6c4>] phy_connect+0x1c/0x58
[    3.420000]   [<c017fc44>] macb_probe+0x4fc/0x5bc
[    3.420000]   [<c0177b44>] platform_drv_probe+0x20/0x24
[    3.420000]   [<c0176210>] driver_probe_device+0xb0/0x1bc
[    3.420000]   [<c01764dc>] __driver_attach+0xe4/0xe8
[    3.420000]   [<c017550c>] bus_for_each_dev+0x54/0x80
[    3.420000]   [<c0176074>] driver_attach+0x20/0x28
[    3.420000]   [<c01758d4>] bus_add_driver+0x84/0x1e0
[    3.420000]   [<c0176708>] driver_register+0x54/0x90
[    3.420000]   [<c0177de0>] platform_driver_register+0x6c/0x88
[    3.420000]   [<c00167ec>] macb_init+0x14/0x1c
[    3.420000]   [<c00087e4>] kernel_init+0x9c/0x2b4
[    3.420000]   [<c0040338>] do_exit+0x0/0x8e8
[    3.420000] irq event stamp: 115025
[    3.420000] hardirqs last  enabled at (115025): [<c0264958>] _spin_unlock_irq+0x30/0x60
[    3.420000] hardirqs last disabled at (115024): [<c02642c8>] _spin_lock_irq+0x28/0x60
[    3.420000] softirqs last  enabled at (114999): [<c0042910>] __do_softirq+0xfc/0x12c
[    3.420000] softirqs last disabled at (115022): [<c0042ee8>] irq_exit+0x68/0x7c
[    3.420000]
[    3.420000] other info that might help us debug this:
[    3.420000] no locks held by swapper/1.
[    3.420000]
[    3.420000] stack backtrace:
[    3.420000] [<c0028af4>] (dump_stack+0x0/0x14) from [<c005d8e4>] (print_usage_bug+0x120/0x150)
[    3.420000] [<c005d7c4>] (print_usage_bug+0x0/0x150) from [<c005e5e8>] (mark_lock+0x574/0x6bc)
[    3.420000] [<c005e074>] (mark_lock+0x0/0x6bc) from [<c005f4c0>] (__lock_acquire+0x4d0/0x1198)
[    3.420000] [<c005eff0>] (__lock_acquire+0x0/0x1198) from [<c006021c>] (lock_acquire+0x94/0xac)
[    3.420000] [<c0060188>] (lock_acquire+0x0/0xac) from [<c026423c>] (_spin_lock+0x40/0x50)
[    3.420000] [<c02641fc>] (_spin_lock+0x0/0x50) from [<c017da1c>] (phy_timer+0x1c/0x4c8)
[    3.420000]  r5:c098a800 r4:00000104
[    3.420000] [<c017da00>] (phy_timer+0x0/0x4c8) from [<c0046ee4>] (run_timer_softirq+0x1a0/0x218)
[    3.420000]  r7:c081ddd0 r6:c0331e80 r5:c098aa50 r4:00000104
[    3.420000] [<c0046d44>] (run_timer_softirq+0x0/0x218) from [<c00428a8>] (__do_softirq+0x94/0x12c)
[    3.420000] [<c0042814>] (__do_softirq+0x0/0x12c) from [<c0042ee8>] (irq_exit+0x68/0x7c)
[    3.420000] [<c0042e80>] (irq_exit+0x0/0x7c) from [<c0023060>] (__exception_text_start+0x60/0x74)
[    3.420000]  r4:00000001
[    3.420000] [<c0023000>] (__exception_text_start+0x0/0x74) from [<c0023b28>] (__irq_svc+0x48/0x74)
[    3.420000] Exception stack(0xc081de70 to 0xc081deb8)
[    3.420000] de60:                                     c030b2a8 c081a800 60000013 c081c000
[    3.420000] de80: c032db40 c081dee0 c032d758 c081dece 00000012 c030b398 00000034 c081df2c
[    3.420000] dea0: c081deb8 c081deb8 c003d6f0 c003d6f8 60000013 ffffffff
[    3.420000]  r7:00000003 r6:00000001 r5:fefff000 r4:ffffffff
[    3.420000] [<c003d494>] (vprintk+0x0/0x474) from [<c003d930>] (printk+0x28/0x30)
[    3.420000] [<c003d908>] (printk+0x0/0x30) from [<c01d3558>] (sock_register+0x64/0x84)
[    3.420000]  r3:00000001 r2:00000002 r1:00000001 r0:c02e4c88
[    3.420000] [<c01d34f4>] (sock_register+0x0/0x84) from [<c001b7bc>] (af_unix_init+0x40/0x88)
[    3.420000]  r5:00000000 r4:00000000
[    3.420000] [<c001b77c>] (af_unix_init+0x0/0x88) from [<c00087e4>] (kernel_init+0x9c/0x2b4)
[    3.420000]  r4:00000000
[    3.420000] [<c0008748>] (kernel_init+0x0/0x2b4) from [<c0040338>] (do_exit+0x0/0x8e8)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH resend] Fix a lock problem in generic phy code
  2007-08-31 12:30 [PATCH] Fix a lock problem in generic phy code Hans-Jürgen Koch
  2007-08-31 16:29 ` Hans-Jürgen Koch
@ 2007-09-09 21:36 ` Hans-Jürgen Koch
  2007-09-10  8:58   ` Herbert Xu
  2007-09-13  4:12 ` [PATCH] " Jeff Garzik
  2 siblings, 1 reply; 8+ messages in thread
From: Hans-Jürgen Koch @ 2007-09-09 21:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: netdev, Jeff Garzik, afleming

I already sent this patch on August, 31. I never got an answer, so here it
is again. 

Lock debugging finds a problem in phy.c and phy_device.c:

[    3.420000] =================================
[    3.420000] [ INFO: inconsistent lock state ]
[    3.420000] 2.6.23-rc3-mm1 #21
[    3.420000] ---------------------------------
[    3.420000] inconsistent {softirq-on-W} -> {in-softirq-W} usage.
[    3.420000] swapper/1 [HC0[0]:SC1[1]:HE1:SE0] takes:
[    3.420000]  (&dev->lock){-+..}, at: [<c017da1c>] phy_timer+0x1c/0x4c8
[    3.420000] {softirq-on-W} state was registered at:
[    3.420000]   [<c006021c>] lock_acquire+0x94/0xac
[    3.420000]   [<c026423c>] _spin_lock+0x40/0x50
[    3.420000]   [<c017e2b4>] phy_probe+0x40/0x88
[    3.420000]   [<c017e60c>] phy_attach+0xc8/0xfc
[    3.420000]   [<c017e6c4>] phy_connect+0x1c/0x58
[    3.420000]   [<c017fc44>] macb_probe+0x4fc/0x5bc
[    3.420000]   [<c0177b44>] platform_drv_probe+0x20/0x24
[    3.420000]   [<c0176210>] driver_probe_device+0xb0/0x1bc
[    3.420000]   [<c01764dc>] __driver_attach+0xe4/0xe8
[    3.420000]   [<c017550c>] bus_for_each_dev+0x54/0x80
[    3.420000]   [<c0176074>] driver_attach+0x20/0x28
[    3.420000]   [<c01758d4>] bus_add_driver+0x84/0x1e0
[    3.420000]   [<c0176708>] driver_register+0x54/0x90
[    3.420000]   [<c0177de0>] platform_driver_register+0x6c/0x88
[    3.420000]   [<c00167ec>] macb_init+0x14/0x1c
[    3.420000]   [<c00087e4>] kernel_init+0x9c/0x2b4
[    3.420000]   [<c0040338>] do_exit+0x0/0x8e8
[    3.420000] irq event stamp: 115025
[    3.420000] hardirqs last  enabled at (115025): [<c0264958>] _spin_unlock_irq+0x30/0x60
[    3.420000] hardirqs last disabled at (115024): [<c02642c8>] _spin_lock_irq+0x28/0x60
[    3.420000] softirqs last  enabled at (114999): [<c0042910>] __do_softirq+0xfc/0x12c
[    3.420000] softirqs last disabled at (115022): [<c0042ee8>] irq_exit+0x68/0x7c
[    3.420000]
[    3.420000] other info that might help us debug this:
[    3.420000] no locks held by swapper/1.
[    3.420000]
[    3.420000] stack backtrace:
[    3.420000] [<c0028af4>] (dump_stack+0x0/0x14) from [<c005d8e4>] (print_usage_bug+0x120/0x150)
[    3.420000] [<c005d7c4>] (print_usage_bug+0x0/0x150) from [<c005e5e8>] (mark_lock+0x574/0x6bc)
[    3.420000] [<c005e074>] (mark_lock+0x0/0x6bc) from [<c005f4c0>] (__lock_acquire+0x4d0/0x1198)
[    3.420000] [<c005eff0>] (__lock_acquire+0x0/0x1198) from [<c006021c>] (lock_acquire+0x94/0xac)
[    3.420000] [<c0060188>] (lock_acquire+0x0/0xac) from [<c026423c>] (_spin_lock+0x40/0x50)
[    3.420000] [<c02641fc>] (_spin_lock+0x0/0x50) from [<c017da1c>] (phy_timer+0x1c/0x4c8)
[    3.420000]  r5:c098a800 r4:00000104
[    3.420000] [<c017da00>] (phy_timer+0x0/0x4c8) from [<c0046ee4>] (run_timer_softirq+0x1a0/0x218)
[    3.420000]  r7:c081ddd0 r6:c0331e80 r5:c098aa50 r4:00000104
[    3.420000] [<c0046d44>] (run_timer_softirq+0x0/0x218) from [<c00428a8>] (__do_softirq+0x94/0x12c)
[    3.420000] [<c0042814>] (__do_softirq+0x0/0x12c) from [<c0042ee8>] (irq_exit+0x68/0x7c)
[    3.420000] [<c0042e80>] (irq_exit+0x0/0x7c) from [<c0023060>] (__exception_text_start+0x60/0x74)
[    3.420000]  r4:00000001
[    3.420000] [<c0023000>] (__exception_text_start+0x0/0x74) from [<c0023b28>] (__irq_svc+0x48/0x74)
[    3.420000] Exception stack(0xc081de70 to 0xc081deb8)
[    3.420000] de60:                                     c030b2a8 c081a800 60000013 c081c000
[    3.420000] de80: c032db40 c081dee0 c032d758 c081dece 00000012 c030b398 00000034 c081df2c
[    3.420000] dea0: c081deb8 c081deb8 c003d6f0 c003d6f8 60000013 ffffffff
[    3.420000]  r7:00000003 r6:00000001 r5:fefff000 r4:ffffffff
[    3.420000] [<c003d494>] (vprintk+0x0/0x474) from [<c003d930>] (printk+0x28/0x30)
[    3.420000] [<c003d908>] (printk+0x0/0x30) from [<c01d3558>] (sock_register+0x64/0x84)
[    3.420000]  r3:00000001 r2:00000002 r1:00000001 r0:c02e4c88
[    3.420000] [<c01d34f4>] (sock_register+0x0/0x84) from [<c001b7bc>] (af_unix_init+0x40/0x88)
[    3.420000]  r5:00000000 r4:00000000
[    3.420000] [<c001b77c>] (af_unix_init+0x0/0x88) from [<c00087e4>] (kernel_init+0x9c/0x2b4)
[    3.420000]  r4:00000000
[    3.420000] [<c0008748>] (kernel_init+0x0/0x2b4) from [<c0040338>] (do_exit+0x0/0x8e8)

The following patch fixes it. Tested on an AT91SAM9263-EK board, kernel 2.6.23-rc4 and -rc3-mm1.

Signed-off-by: Hans J. Koch <hjk@linutronix.de>
---
Index: linux-2.6.23-rc/drivers/net/phy/phy_device.c
===================================================================
--- linux-2.6.23-rc.orig/drivers/net/phy/phy_device.c	2007-08-31 14:07:47.000000000 +0200
+++ linux-2.6.23-rc/drivers/net/phy/phy_device.c	2007-08-31 14:08:22.000000000 +0200
@@ -644,7 +644,7 @@
 	if (!(phydrv->flags & PHY_HAS_INTERRUPT))
 		phydev->irq = PHY_POLL;
 
-	spin_lock(&phydev->lock);
+	spin_lock_bh(&phydev->lock);
 
 	/* Start out supporting everything. Eventually,
 	 * a controller will attach, and may modify one
@@ -658,7 +658,7 @@
 	if (phydev->drv->probe)
 		err = phydev->drv->probe(phydev);
 
-	spin_unlock(&phydev->lock);
+	spin_unlock_bh(&phydev->lock);
 
 	return err;
 
Index: linux-2.6.23-rc/drivers/net/phy/phy.c
===================================================================
--- linux-2.6.23-rc.orig/drivers/net/phy/phy.c	2007-08-31 14:15:20.000000000 +0200
+++ linux-2.6.23-rc/drivers/net/phy/phy.c	2007-08-31 14:15:43.000000000 +0200
@@ -755,7 +755,7 @@
  */
 void phy_start(struct phy_device *phydev)
 {
-	spin_lock(&phydev->lock);
+	spin_lock_bh(&phydev->lock);
 
 	switch (phydev->state) {
 		case PHY_STARTING:
@@ -769,7 +769,7 @@
 		default:
 			break;
 	}
-	spin_unlock(&phydev->lock);
+	spin_unlock_bh(&phydev->lock);
 }
 EXPORT_SYMBOL(phy_stop);
 EXPORT_SYMBOL(phy_start);

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH resend] Fix a lock problem in generic phy code
  2007-09-09 21:36 ` [PATCH resend] " Hans-Jürgen Koch
@ 2007-09-10  8:58   ` Herbert Xu
  2007-09-10 18:45     ` Hans-Jürgen Koch
  0 siblings, 1 reply; 8+ messages in thread
From: Herbert Xu @ 2007-09-10  8:58 UTC (permalink / raw)
  To: Hans-J??rgen Koch; +Cc: linux-kernel, netdev, jeff, afleming

Hans-J??rgen Koch <hjk@linutronix.de> wrote:
>
> The following patch fixes it. Tested on an AT91SAM9263-EK board, kernel 2.6.23-rc4 and -rc3-mm1.

Could you please audit all instances of physdev->lock and add
_bh where necessary?  I can see that at least phys_stop also
needs the _bh.

We should also consider whether it makes sense to move the
timer into a work queue.

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH resend] Fix a lock problem in generic phy code
  2007-09-10  8:58   ` Herbert Xu
@ 2007-09-10 18:45     ` Hans-Jürgen Koch
  2007-09-11  2:21       ` Herbert Xu
  0 siblings, 1 reply; 8+ messages in thread
From: Hans-Jürgen Koch @ 2007-09-10 18:45 UTC (permalink / raw)
  To: Herbert Xu; +Cc: linux-kernel, netdev, jeff, afleming

Am Montag 10 September 2007 schrieb Herbert Xu:
> Hans-J??rgen Koch <hjk@linutronix.de> wrote:
> > The following patch fixes it. Tested on an AT91SAM9263-EK board, kernel
> > 2.6.23-rc4 and -rc3-mm1.
>
> Could you please audit all instances of physdev->lock and add
> _bh where necessary?  I can see that at least phys_stop also
> needs the _bh.

I think the patch does all that's necessary. At least, there're no error
messages in the logs anymore. I didn't check if there's an error on
unload, though.

>
> We should also consider whether it makes sense to move the
> timer into a work queue.

Somebody who's more involved in that PHY stuff should probably do that.
I'm on holidays, sitting in a hotel room, and will probably not have the
time to have a deeper look into that.

Thanks,
Hans




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH resend] Fix a lock problem in generic phy code
  2007-09-10 18:45     ` Hans-Jürgen Koch
@ 2007-09-11  2:21       ` Herbert Xu
  0 siblings, 0 replies; 8+ messages in thread
From: Herbert Xu @ 2007-09-11  2:21 UTC (permalink / raw)
  To: Hans-Jürgen Koch; +Cc: linux-kernel, netdev, jeff, afleming

On Mon, Sep 10, 2007 at 08:45:50PM +0200, Hans-Jürgen Koch wrote:
>
> > Could you please audit all instances of physdev->lock and add
> > _bh where necessary?  I can see that at least phys_stop also
> > needs the _bh.
> 
> I think the patch does all that's necessary. At least, there're no error
> messages in the logs anymore. I didn't check if there's an error on
> unload, though.

Sorry, but you can't rely on the non-existence of lockdep
messages as a proof of correctness :)

If we're going to fix the obvious bugs here, we should fix
the subtle ones too as otherwise they'll be much harder to
notice with this patch merged.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix a lock problem in generic phy code
  2007-08-31 12:30 [PATCH] Fix a lock problem in generic phy code Hans-Jürgen Koch
  2007-08-31 16:29 ` Hans-Jürgen Koch
  2007-09-09 21:36 ` [PATCH resend] " Hans-Jürgen Koch
@ 2007-09-13  4:12 ` Jeff Garzik
  2007-09-13  9:50   ` Hans-Jürgen Koch
  2 siblings, 1 reply; 8+ messages in thread
From: Jeff Garzik @ 2007-09-13  4:12 UTC (permalink / raw)
  To: Hans-Jürgen Koch; +Cc: linux-kernel, netdev, afleming

Hans-Jürgen Koch wrote:
> Lock debugging finds a problem in phy.c and phy_device.c,
> this patch fixes it. Tested on an AT91SAM9263-EK board, 
> kernel 2.6.23-rc4.
> 
> Signed-off-by: Hans J. Koch <hjk@linutronix.de>

applied



^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] Fix a lock problem in generic phy code
  2007-09-13  4:12 ` [PATCH] " Jeff Garzik
@ 2007-09-13  9:50   ` Hans-Jürgen Koch
  0 siblings, 0 replies; 8+ messages in thread
From: Hans-Jürgen Koch @ 2007-09-13  9:50 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: linux-kernel, netdev, afleming

Am Donnerstag 13 September 2007 schrieb Jeff Garzik:
> Hans-Jürgen Koch wrote:
> > Lock debugging finds a problem in phy.c and phy_device.c,
> > this patch fixes it. Tested on an AT91SAM9263-EK board,
> > kernel 2.6.23-rc4.
> >
> > Signed-off-by: Hans J. Koch <hjk@linutronix.de>
>
> applied

Thanks! Andrew applied it to -mm, too.

Hans

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2007-09-13  9:50 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-08-31 12:30 [PATCH] Fix a lock problem in generic phy code Hans-Jürgen Koch
2007-08-31 16:29 ` Hans-Jürgen Koch
2007-09-09 21:36 ` [PATCH resend] " Hans-Jürgen Koch
2007-09-10  8:58   ` Herbert Xu
2007-09-10 18:45     ` Hans-Jürgen Koch
2007-09-11  2:21       ` Herbert Xu
2007-09-13  4:12 ` [PATCH] " Jeff Garzik
2007-09-13  9:50   ` Hans-Jürgen Koch

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).