* cassini: possible recursive locking detected
From: Meelis Roos @ 2014-05-06 9:39 UTC
To: netdev
While installing Linux on a Sun Fire V480, any traffic on the builtin cassini
NIC caused a hang. I worked around this by using a Broadcom NIC and tried a
kernel with most debugging options enabled. This resulted in the following
warning. Maybe this is the deadlock I was seeing?
[ 88.316595] =============================================
[ 88.316597] [ INFO: possible recursive locking detected ]
[ 88.316603] 3.15.0-rc4-00202-g30321c7-dirty #11 Not tainted
[ 88.316605] ---------------------------------------------
[ 88.316608] swapper/3/1 is trying to acquire lock:
[ 88.316644] (&(&cp->tx_lock[i])->rlock){..-...}, at: [<0000000000745da0>] cas_link_timer+0xa0/0x460
[ 88.316646]
[ 88.316646] but task is already holding lock:
[ 88.316657] (&(&cp->tx_lock[i])->rlock){..-...}, at: [<0000000000745da0>] cas_link_timer+0xa0/0x460
[ 88.316659]
[ 88.316659] other info that might help us debug this:
[ 88.316661] Possible unsafe locking scenario:
[ 88.316661]
[ 88.316662] CPU0
[ 88.316664] ----
[ 88.316668] lock(&(&cp->tx_lock[i])->rlock);
[ 88.316671] lock(&(&cp->tx_lock[i])->rlock);
[ 88.316672]
[ 88.316672] *** DEADLOCK ***
[ 88.316672]
[ 88.316674] May be due to missing lock nesting notation
[ 88.316674]
[ 88.316677] 3 locks held by swapper/3/1:
[ 88.316694] #0: ((&cp->link_timer)){+.-...}, at: [<0000000000465f80>] call_timer_fn+0x0/0xe0
[ 88.316706] #1: (&(&cp->lock)->rlock){..-...}, at: [<0000000000745d80>] cas_link_timer+0x80/0x460
[ 88.316716] #2: (&(&cp->tx_lock[i])->rlock){..-...}, at: [<0000000000745da0>] cas_link_timer+0xa0/0x460
[ 88.316718]
[ 88.316718] stack backtrace:
[ 88.316724] CPU: 2 PID: 1 Comm: swapper/3 Not tainted 3.15.0-rc4-00202-g30321c7-dirty #11
[ 88.316727] Call Trace:
[ 88.316743] [00000000004a2c5c] __lock_acquire+0x10fc/0x1fa0
[ 88.316749] [00000000004a406c] lock_acquire+0x4c/0x80
[ 88.316760] [000000000083e07c] _raw_spin_lock+0x1c/0x40
[ 88.316765] [0000000000745da0] cas_link_timer+0xa0/0x460
[ 88.316769] [0000000000465fc8] call_timer_fn+0x48/0xe0
[ 88.316775] [00000000004665d4] run_timer_softirq+0x214/0x280
[ 88.316788] [000000000045f650] __do_softirq+0xf0/0x240
[ 88.316800] [000000000042bd0c] do_softirq_own_stack+0x2c/0x40
[ 88.316804] [000000000045fb44] irq_exit+0xc4/0xe0
[ 88.316814] [000000000042fcc8] timer_interrupt+0x88/0xc0
[ 88.316819] [0000000000426b84] valid_addr_bitmap_patch+0xbc/0x238
[ 88.316826] [00000000004ab2f8] vprintk_emit+0x1d8/0x540
[ 88.316842] [0000000000835fb8] printk+0x34/0x48
[ 88.316847] [00000000004ac3e0] register_console+0x340/0x3e0
[ 88.316862] [0000000000a74f2c] init_netconsole+0x180/0x20c
[ 88.316867] [0000000000426eb0] do_one_initcall+0x110/0x1a0
--
Meelis Roos (mroos@linux.ee)
* Re: cassini: possible recursive locking detected
From: Emil Goode @ 2014-05-08 12:53 UTC
To: Meelis Roos; +Cc: netdev

Hello Meelis,

I think this warning happens because we acquire multiple locks
in a loop in cas_lock_tx() and I believe we should use nested
lock annotation here.

Perhaps you would like to try the attached patch?
It won't fix the deadlock that you mentioned though.

Best regards,

Emil Goode

[-- Attachment #2: 0001-net-cassini-use-nested-lock-annotation.patch --]
[-- Type: text/x-diff, Size: 936 bytes --]

From 1f3fcb0cb141e167c5389861eb0a6cb935f6a3d5 Mon Sep 17 00:00:00 2001
From: Emil Goode <emilgoode@gmail.com>
Date: Thu, 8 May 2014 12:49:24 +0200
Subject: [PATCH] net: cassini: use nested lock annotation

In the cas_lock_tx function we acquire multiple locks in a loop and
need to use nested lock annotation to prevent lockdep warnings.

Signed-off-by: Emil Goode <emilgoode@gmail.com>
---
 drivers/net/ethernet/sun/cassini.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/sun/cassini.c b/drivers/net/ethernet/sun/cassini.c
index df8d383..b9ac20f 100644
--- a/drivers/net/ethernet/sun/cassini.c
+++ b/drivers/net/ethernet/sun/cassini.c
@@ -246,7 +246,7 @@ static inline void cas_lock_tx(struct cas *cp)
 	int i;
 
 	for (i = 0; i < N_TX_RINGS; i++)
-		spin_lock(&cp->tx_lock[i]);
+		spin_lock_nested(&cp->tx_lock[i], i);
 }
 
 static inline void cas_lock_all(struct cas *cp)
-- 
1.7.10.4
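The reasoning behind the patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code and not how lockdep is actually implemented: it assumes that the validator reasons about lock classes rather than individual lock instances, that all entries of tx_lock[] share one class, and that the subclass passed to spin_lock_nested() distinguishes acquisitions within that class. The names toy_lock, toy_task, toy_acquire, lock_tx_warns and the value of N_RINGS are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_RINGS  4	/* hypothetical ring count for this sketch */
#define MAX_HELD 8	/* toy limit on simultaneously held locks */

/* Toy stand-in for a lock: only its class matters to the checker. */
struct toy_lock {
	int class_id;
};

/* One record per lock currently held by the task. */
struct held_entry {
	int class_id;
	int subclass;
};

struct toy_task {
	struct held_entry held[MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition. Returns false when the task already holds a
 * lock with the same (class, subclass) pair, which is the situation the
 * toy checker treats as "possible recursive locking".
 */
static bool toy_acquire(struct toy_task *t, const struct toy_lock *l,
			int subclass)
{
	size_t i;

	for (i = 0; i < t->depth; i++) {
		if (t->held[i].class_id == l->class_id &&
		    t->held[i].subclass == subclass)
			return false;
	}

	assert(t->depth < MAX_HELD);
	t->held[t->depth].class_id = l->class_id;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return true;
}

/*
 * Take every ring lock in ascending order, as the loop in cas_lock_tx()
 * does. With use_subclass == false every acquisition uses subclass 0
 * (the unpatched loop); with use_subclass == true the ring index is
 * used as the subclass (the patched loop).
 * Returns true if the toy checker would have complained.
 */
bool lock_tx_warns(bool use_subclass)
{
	struct toy_lock tx_lock[N_RINGS];
	struct toy_task task = { .depth = 0 };
	bool warned = false;
	int i;

	for (i = 0; i < N_RINGS; i++)
		tx_lock[i].class_id = 1;	/* same class for all rings */

	for (i = 0; i < N_RINGS; i++) {
		if (!toy_acquire(&task, &tx_lock[i], use_subclass ? i : 0))
			warned = true;
	}
	return warned;
}
```

In this model lock_tx_warns(false) reports a problem as soon as the second ring lock is taken, while lock_tx_warns(true) does not, which matches the observation later in the thread that the patch silences the warning without changing what the driver does at runtime. Real lockdep limits how many subclasses a class may have, so using the loop index as the subclass only works while the number of rings stays below that limit.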
* Re: cassini: possible recursive locking detected
From: Meelis Roos @ 2014-05-08 20:00 UTC
To: Emil Goode; +Cc: netdev

> Hello Meelis,
>
> I think this warning happens because we acquire multiple locks
> in a loop in cas_lock_tx() and I believe we should use nested
> lock annotation here.
>
> Perhaps you would like to try the attached patch?

Yes, it silences the warning.

> It won't fix the deadlock that you mentioned though.

Yes, the hang still happens, following a
ERROR: System Hardware FATAL RESET from CPU0 CPU2

-- 
Meelis Roos (mroos@linux.ee)
* Re: cassini: possible recursive locking detected
From: Emil Goode @ 2014-05-08 22:38 UTC
To: Meelis Roos; +Cc: netdev

Hello,

On Thu, May 08, 2014 at 11:00:03PM +0300, Meelis Roos wrote:
> > Hello Meelis,
> >
> > I think this warning happens because we acquire multiple locks
> > in a loop in cas_lock_tx() and I believe we should use nested
> > lock annotation here.
> >
> > Perhaps you would like to try the attached patch?
>
> Yes, it silences the warning.
>

OK, thanks for testing, I'll send that patch.

> > It won't fix the deadlock that you mentioned though.
>
> Yes, the hang still happens, following a
> ERROR: System Hardware FATAL RESET from CPU0 CPU2
>

Are you able to get the full dmesg output?
I think this could be hard to solve since I don't have
the hardware, but I could take a look.
* Re: cassini: possible recursive locking detected
From: Meelis Roos @ 2014-05-09 5:37 UTC
To: Emil Goode; +Cc: netdev

> > > It won't fix the deadlock that you mentioned though.
> >
> > Yes, the hang still happens, following a
> > ERROR: System Hardware FATAL RESET from CPU0 CPU2
> >
>
> Are you able to get the full dmesg output?
> I think this could be hard to solve since I don't have
> the hardware, but could take a look.

There is only the sparc64 firmware-specific FATAL RESET, including pages of
state dump for all 4 CPUs and their MMUs etc. Nothing from the Linux side.

So it looks like a recursive fault or something similar.

-- 
Meelis Roos (mroos@linux.ee)
* Re: cassini: possible recursive locking detected
From: Emil Goode @ 2014-05-09 9:06 UTC
To: Meelis Roos; +Cc: netdev

On Fri, May 09, 2014 at 08:37:30AM +0300, Meelis Roos wrote:
> > > > It won't fix the deadlock that you mentioned though.
> > >
> > > Yes, the hang still happens, following a
> > > ERROR: System Hardware FATAL RESET from CPU0 CPU2
> > >
> >
> > Are you able to get the full dmesg output?
> > I think this could be hard to solve since I don't have
> > the hardware, but could take a look.
>
> There is sparc64 firmware-specific FATAL RESET only, including pages of
> state dump for all 4 CPU-s and their MMUs etc. Nothing from Linux side.
>
> So it's like recursive fault or something similar.
>

I searched the net a bit and found these old threads:

http://lists.freebsd.org/pipermail/freebsd-sparc64/2010-January/006935.html

"FreeBSD currently crashes on older models of V480 when attempting
to use an on-board NIC due to what appears to be a CPU bug which
needs to be worked around."

http://marc.info/?l=linux-sparc&m=122220796209509&w=2

"I noticed the cassini network driver for the builtin gigabit network
is unstable and brings the kernel down on a dualprocessor sparc
SunFire 480R with a Hardware FATAL RESET"

I think you should ask about this on the sparclinux mailing list.

http://vger.kernel.org/vger-lists.html#sparclinux

I would say it's very unlikely that the problem is related to that
lockdep warning.
* Re: cassini: possible recursive locking detected
From: David Miller @ 2014-05-09 20:33 UTC
To: emilgoode; +Cc: mroos, netdev

From: Emil Goode <emilgoode@gmail.com>
Date: Fri, 9 May 2014 11:06:42 +0200

> I searched the net a bit and found these old threads:
>
> http://lists.freebsd.org/pipermail/freebsd-sparc64/2010-January/006935.html
>
> "FreeBSD currently crashes on older models of V480 when attempting
> to use an on-board NIC due to what appears to be a CPU bug which
> needs to be worked around."

I wish I still had a copy of that Schizo chip errata document referenced
at the end of that posting, it's not accessible any longer.
* Re: cassini: possible recursive locking detected
From: Bjørn Mork @ 2014-05-16 20:12 UTC
To: David Miller; +Cc: emilgoode, mroos, netdev

David Miller <davem@davemloft.net> writes:

> From: Emil Goode <emilgoode@gmail.com>
> Date: Fri, 9 May 2014 11:06:42 +0200
>
>> I searched the net a bit and found these old threads:
>>
>> http://lists.freebsd.org/pipermail/freebsd-sparc64/2010-January/006935.html
>>
>> "FreeBSD currently crashes on older models of V480 when attempting
>> to use an on-board NIC due to what appears to be a CPU bug which
>> needs to be worked around."
>
> I wish I still had a copy of that Schizo chip errata document referenced
> at the end of that posting, it's not accessible any longer.

The wayback machine saved it for you:
https://web.archive.org/web/20090701005954/http://www.sun.com/processors/manuals/External_Schizo_Errata.pdf

Bjørn
* Re: cassini: possible recursive locking detected
From: David Miller @ 2014-05-16 20:48 UTC
To: bjorn; +Cc: emilgoode, mroos, netdev

From: Bjørn Mork <bjorn@mork.no>
Date: Fri, 16 May 2014 22:12:15 +0200

> David Miller <davem@davemloft.net> writes:
>> From: Emil Goode <emilgoode@gmail.com>
>> Date: Fri, 9 May 2014 11:06:42 +0200
>>
>>> I searched the net a bit and found these old threads:
>>>
>>> http://lists.freebsd.org/pipermail/freebsd-sparc64/2010-January/006935.html
>>>
>>> "FreeBSD currently crashes on older models of V480 when attempting
>>> to use an on-board NIC due to what appears to be a CPU bug which
>>> needs to be worked around."
>>
>> I wish I still had a copy of that Schizo chip errata document referenced
>> at the end of that posting, it's not accessible any longer.
>
> The wayback machine saved it for you:
> https://web.archive.org/web/20090701005954/http://www.sun.com/processors/manuals/External_Schizo_Errata.pdf

Thanks, Meelis pointed out something similar to me in private
correspondence :)