shdma: (or tmio_mmc?) inconsistent lock state detected on ag5evm

All of lore.kernel.org
 help / color / mirror / Atom feed

* shdma: (or tmio_mmc?) inconsistent lock state detected on ag5evm
@ 2011-07-25  9:28 ` Yoshii Takashi
  0 siblings, 0 replies; 6+ messages in thread
From: Yoshii Takashi @ 2011-07-25  9:28 UTC (permalink / raw)
  To: g.liakhovetski, linux-sh; +Cc: linux-mmc

Hello, Guennadi.
This was started as an evaluation of your patch
 [PATCH v2] mmc: tmio: fix recursive spinlock, don't schedule with interrupts disabled
But now I don't know where exactly the issue is. So, let me begin new thread.

On ag5evm(the board under arch/arm/mach-shmobile, that is SMP), both
 tag v3.0 + patch above, and
 tag v3.0 + cjb/mmc-next
with CONFIG_PROVE_LOCKING=y, inconsistent lock state is detected as text attached at
 the bottom.
Function itself(mount/read/write/umount) seems working without problem, so far.

As it happens on the spinlock at sh_dmae_interrupt(), it seems to have been there since
2dc66667 dmaengine: shdma: fix locking
which introduced the lock there.

I found deleting the spin_lock/unlock there makes it quiet. But that lock must be the
important part of the patch. Actually, deleting it brings another locking issue on
tmio_mmc, and it confuses me. I wonder what is the correct solution. I'm afraid I can't
tell what is the critical object to be protected in shdma and tmio_mmc source code...

Any suggestions are appreciated.
/yoshii

=================================
[ INFO: inconsistent lock state ]
3.0.0-00100-g82258ef-dirty #3643
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
 (&(&new_sh_chan->desc_lock)->rlock){?.-...}, at: [<c0795fc4>] sh_dmae_interrupt+0x14/0x78
{HARDIRQ-ON-W} state was registered at:
  [<c0689b30>] __lock_acquire+0x5c4/0xbb4
  [<c068a6f8>] lock_acquire+0x60/0x74
  [<c088da7c>] _raw_spin_lock_bh+0x3c/0x4c
  [<c0796edc>] sh_dmae_alloc_chan_resources+0x1b0/0x298
  [<c0794ca8>] dma_chan_get+0xd8/0x17c
  [<c0795560>] __dma_request_channel+0x140/0x1e4
  [<c07e5850>] tmio_mmc_request_dma+0x74/0x10c
  [<c0886b84>] tmio_mmc_host_probe+0x208/0x284
  [<c0886d68>] sh_mobile_sdhi_probe+0x168/0x28c
  [<c07bca4c>] platform_drv_probe+0x18/0x1c
  [<c07bb5b8>] driver_probe_device+0x7c/0x178
  [<c07bb748>] __driver_attach+0x94/0x98
  [<c07badbc>] bus_for_each_dev+0x60/0x8c
  [<c07ba5a4>] bus_add_driver+0xa8/0x2a4
  [<c07bbd24>] driver_register+0x78/0x18c
  [<c0633530>] do_one_initcall+0x34/0x184
  [<c00083d8>] kernel_init+0xa8/0x134
  [<c063a5a8>] kernel_thread_exit+0x0/0x8
irq event stamp: 17556
hardirqs last  enabled at (17553): [<c063a64c>] default_idle+0x24/0x2c
hardirqs last disabled at (17554): [<c0639674>] __irq_svc+0x34/0xa0
softirqs last  enabled at (17556): [<c065c878>] irq_enter+0x78/0x7c
softirqs last disabled at (17555): [<c065c86c>] irq_enter+0x6c/0x7c

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&new_sh_chan->desc_lock)->rlock);
  <Interrupt>
    lock(&(&new_sh_chan->desc_lock)->rlock);

 *** DEADLOCK ***

no locks held by swapper/0.

stack backtrace:
[<c063f730>] (unwind_backtrace+0x0/0xfc) from [<c0686358>] (print_usage_bug+0x21c/0x2bc)
[<c0686358>] (print_usage_bug+0x21c/0x2bc) from [<c0686908>] (mark_lock+0x510/0x740)
[<c0686908>] (mark_lock+0x510/0x740) from [<c0689ba4>] (__lock_acquire+0x638/0xbb4)
[<c0689ba4>] (__lock_acquire+0x638/0xbb4) from [<c068a6f8>] (lock_acquire+0x60/0x74)
[<c068a6f8>] (lock_acquire+0x60/0x74) from [<c088d868>] (_raw_spin_lock+0x34/0x44)
[<c088d868>] (_raw_spin_lock+0x34/0x44) from [<c0795fc4>] (sh_dmae_interrupt+0x14/0x78)
[<c0795fc4>] (sh_dmae_interrupt+0x14/0x78) from [<c0698f50>] (handle_irq_event_percpu+0x54/0x184)
[<c0698f50>] (handle_irq_event_percpu+0x54/0x184) from [<c06990bc>] (handle_irq_event+0x3c/0x5c)
[<c06990bc>] (handle_irq_event+0x3c/0x5c) from [<c069b6e8>] (handle_fasteoi_irq+0x9c/0x104)
[<c069b6e8>] (handle_fasteoi_irq+0x9c/0x104) from [<c0698eec>] (generic_handle_irq+0x28/0x30)
[<c0698eec>] (generic_handle_irq+0x28/0x30) from [<c0633058>] (asm_do_IRQ+0x58/0xb8)
[<c0633058>] (asm_do_IRQ+0x58/0xb8) from [<c0644acc>] (shmobile_handle_irq_gic+0xc/0x80)
Exception stack(0xc0951f70 to 0xc0951fb8)
1f60:                                     00000001 00004491 00000000 c0951fa8
1f80: c0950000 c0977604 c088f9c4 c0955b40 4000406a 412fc098 00000000 00000000
1fa0: 00000001 c0951fb8 00004490 c063a650 20000013 ffffffff
[<c0644acc>] (shmobile_handle_irq_gic+0xc/0x80) from [<00004491>] (0x4491)
mmc0: new high speed SD card at address b368
mmcblk0: mmc0:b368 SDC   489 MiB 
 mmcblk0: p1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* shdma: (or tmio_mmc?) inconsistent lock state detected on ag5evm
@ 2011-07-25  9:28 ` Yoshii Takashi
  0 siblings, 0 replies; 6+ messages in thread
From: Yoshii Takashi @ 2011-07-25  9:28 UTC (permalink / raw)
  To: g.liakhovetski, linux-sh; +Cc: linux-mmc

Hello, Guennadi.
This was started as an evaluation of your patch
 [PATCH v2] mmc: tmio: fix recursive spinlock, don't schedule with interrupts disabled
But now I don't know where exactly the issue is. So, let me begin new thread.

On ag5evm(the board under arch/arm/mach-shmobile, that is SMP), both
 tag v3.0 + patch above, and
 tag v3.0 + cjb/mmc-next
with CONFIG_PROVE_LOCKING=y, inconsistent lock state is detected as text attached at
 the bottom.
Function itself(mount/read/write/umount) seems working without problem, so far.

As it happens on the spinlock at sh_dmae_interrupt(), it seems to have been there since
2dc66667 dmaengine: shdma: fix locking
which introduced the lock there.

I found deleting the spin_lock/unlock there makes it quiet. But that lock must be the
important part of the patch. Actually, deleting it brings another locking issue on
tmio_mmc, and it confuses me. I wonder what is the correct solution. I'm afraid I can't
tell what is the critical object to be protected in shdma and tmio_mmc source code...

Any suggestions are appreciated.
/yoshii

================[ INFO: inconsistent lock state ]
3.0.0-00100-g82258ef-dirty #3643
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
 (&(&new_sh_chan->desc_lock)->rlock){?.-...}, at: [<c0795fc4>] sh_dmae_interrupt+0x14/0x78
{HARDIRQ-ON-W} state was registered at:
  [<c0689b30>] __lock_acquire+0x5c4/0xbb4
  [<c068a6f8>] lock_acquire+0x60/0x74
  [<c088da7c>] _raw_spin_lock_bh+0x3c/0x4c
  [<c0796edc>] sh_dmae_alloc_chan_resources+0x1b0/0x298
  [<c0794ca8>] dma_chan_get+0xd8/0x17c
  [<c0795560>] __dma_request_channel+0x140/0x1e4
  [<c07e5850>] tmio_mmc_request_dma+0x74/0x10c
  [<c0886b84>] tmio_mmc_host_probe+0x208/0x284
  [<c0886d68>] sh_mobile_sdhi_probe+0x168/0x28c
  [<c07bca4c>] platform_drv_probe+0x18/0x1c
  [<c07bb5b8>] driver_probe_device+0x7c/0x178
  [<c07bb748>] __driver_attach+0x94/0x98
  [<c07badbc>] bus_for_each_dev+0x60/0x8c
  [<c07ba5a4>] bus_add_driver+0xa8/0x2a4
  [<c07bbd24>] driver_register+0x78/0x18c
  [<c0633530>] do_one_initcall+0x34/0x184
  [<c00083d8>] kernel_init+0xa8/0x134
  [<c063a5a8>] kernel_thread_exit+0x0/0x8
irq event stamp: 17556
hardirqs last  enabled at (17553): [<c063a64c>] default_idle+0x24/0x2c
hardirqs last disabled at (17554): [<c0639674>] __irq_svc+0x34/0xa0
softirqs last  enabled at (17556): [<c065c878>] irq_enter+0x78/0x7c
softirqs last disabled at (17555): [<c065c86c>] irq_enter+0x6c/0x7c

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&new_sh_chan->desc_lock)->rlock);
  <Interrupt>
    lock(&(&new_sh_chan->desc_lock)->rlock);

 *** DEADLOCK ***

no locks held by swapper/0.

stack backtrace:
[<c063f730>] (unwind_backtrace+0x0/0xfc) from [<c0686358>] (print_usage_bug+0x21c/0x2bc)
[<c0686358>] (print_usage_bug+0x21c/0x2bc) from [<c0686908>] (mark_lock+0x510/0x740)
[<c0686908>] (mark_lock+0x510/0x740) from [<c0689ba4>] (__lock_acquire+0x638/0xbb4)
[<c0689ba4>] (__lock_acquire+0x638/0xbb4) from [<c068a6f8>] (lock_acquire+0x60/0x74)
[<c068a6f8>] (lock_acquire+0x60/0x74) from [<c088d868>] (_raw_spin_lock+0x34/0x44)
[<c088d868>] (_raw_spin_lock+0x34/0x44) from [<c0795fc4>] (sh_dmae_interrupt+0x14/0x78)
[<c0795fc4>] (sh_dmae_interrupt+0x14/0x78) from [<c0698f50>] (handle_irq_event_percpu+0x54/0x184)
[<c0698f50>] (handle_irq_event_percpu+0x54/0x184) from [<c06990bc>] (handle_irq_event+0x3c/0x5c)
[<c06990bc>] (handle_irq_event+0x3c/0x5c) from [<c069b6e8>] (handle_fasteoi_irq+0x9c/0x104)
[<c069b6e8>] (handle_fasteoi_irq+0x9c/0x104) from [<c0698eec>] (generic_handle_irq+0x28/0x30)
[<c0698eec>] (generic_handle_irq+0x28/0x30) from [<c0633058>] (asm_do_IRQ+0x58/0xb8)
[<c0633058>] (asm_do_IRQ+0x58/0xb8) from [<c0644acc>] (shmobile_handle_irq_gic+0xc/0x80)
Exception stack(0xc0951f70 to 0xc0951fb8)
1f60:                                     00000001 00004491 00000000 c0951fa8
1f80: c0950000 c0977604 c088f9c4 c0955b40 4000406a 412fc098 00000000 00000000
1fa0: 00000001 c0951fb8 00004490 c063a650 20000013 ffffffff
[<c0644acc>] (shmobile_handle_irq_gic+0xc/0x80) from [<00004491>] (0x4491)
mmc0: new high speed SD card at address b368
mmcblk0: mmc0:b368 SDC   489 MiB 
 mmcblk0: p1

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: shdma: (or tmio_mmc?) inconsistent lock state detected on ag5evm
  2011-07-25  9:28 ` Yoshii Takashi
@ 2011-07-27  8:23   ` Guennadi Liakhovetski
  -1 siblings, 0 replies; 6+ messages in thread
From: Guennadi Liakhovetski @ 2011-07-27  8:23 UTC (permalink / raw)
  To: Yoshii Takashi; +Cc: linux-sh, linux-mmc

Hello Yoshii-san

On Mon, 25 Jul 2011, Yoshii Takashi wrote:

> Hello, Guennadi.
> This was started as an evaluation of your patch
>  [PATCH v2] mmc: tmio: fix recursive spinlock, don't schedule with interrupts disabled
> But now I don't know where exactly the issue is. So, let me begin new thread.
> 
> On ag5evm(the board under arch/arm/mach-shmobile, that is SMP), both
>  tag v3.0 + patch above, and
>  tag v3.0 + cjb/mmc-next
> with CONFIG_PROVE_LOCKING=y, inconsistent lock state is detected as text attached at
>  the bottom.
> Function itself(mount/read/write/umount) seems working without problem, so far.
> 
> As it happens on the spinlock at sh_dmae_interrupt(), it seems to have been there since
> 2dc66667 dmaengine: shdma: fix locking
> which introduced the lock there.

Thanks for the report! It is interesting. From your backtrace it looks 
like the spin_lock_bh() in sh_dmae_alloc_chan_resources() gets interrupted 
by an interrupt on the same channel! I do not understand this, because, 
normally no interrupts on DMAC channels occur until data transfers begin 
over them.

So, a simple fix would be to replace calls to spin_(un)lock_bh() with 
spin_(un)lock_irq() in sh_dmae_alloc_chan_resources(), but I would like to 
understand, what kind of interrupts are occurring at that time and why. 
Could you please apply the attached below debugging patch and send me the 
resulting complete dmesg output? Alternatively, you can choose to wait 
until a detailed SMP testing of the SDHI driver takes place, hopefully, in 
approximately a week or a bit more.

Thanks
Guennadi

diff --git a/drivers/dma/shdma.c b/drivers/dma/shdma.c
index 0283300..2c6dde6 100644
--- a/drivers/dma/shdma.c
+++ b/drivers/dma/shdma.c
@@ -848,6 +848,10 @@ static irqreturn_t sh_dmae_interrupt(int irq, void *data)
 
 	chcr = sh_dmae_readl(sh_chan, CHCR);
 
+	if (list_empty(&sh_chan->ld_queue))
+		dev_warn(sh_chan->dev, "Spurious interrupt 0x%x on %s\n",
+			 chcr, dma_chan_name(&sh_chan->common));
+
 	if (chcr & CHCR_TE) {
 		/* DMA stop */
 		dmae_halt(sh_chan);

> 
> I found deleting the spin_lock/unlock there makes it quiet. But that lock must be the
> important part of the patch. Actually, deleting it brings another locking issue on
> tmio_mmc, and it confuses me. I wonder what is the correct solution. I'm afraid I can't
> tell what is the critical object to be protected in shdma and tmio_mmc source code...
> 
> Any suggestions are appreciated.
> /yoshii
> 
> =================================
> [ INFO: inconsistent lock state ]
> 3.0.0-00100-g82258ef-dirty #3643
> ---------------------------------
> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>  (&(&new_sh_chan->desc_lock)->rlock){?.-...}, at: [<c0795fc4>] sh_dmae_interrupt+0x14/0x78
> {HARDIRQ-ON-W} state was registered at:
>   [<c0689b30>] __lock_acquire+0x5c4/0xbb4
>   [<c068a6f8>] lock_acquire+0x60/0x74
>   [<c088da7c>] _raw_spin_lock_bh+0x3c/0x4c
>   [<c0796edc>] sh_dmae_alloc_chan_resources+0x1b0/0x298
>   [<c0794ca8>] dma_chan_get+0xd8/0x17c
>   [<c0795560>] __dma_request_channel+0x140/0x1e4
>   [<c07e5850>] tmio_mmc_request_dma+0x74/0x10c
>   [<c0886b84>] tmio_mmc_host_probe+0x208/0x284
>   [<c0886d68>] sh_mobile_sdhi_probe+0x168/0x28c
>   [<c07bca4c>] platform_drv_probe+0x18/0x1c
>   [<c07bb5b8>] driver_probe_device+0x7c/0x178
>   [<c07bb748>] __driver_attach+0x94/0x98
>   [<c07badbc>] bus_for_each_dev+0x60/0x8c
>   [<c07ba5a4>] bus_add_driver+0xa8/0x2a4
>   [<c07bbd24>] driver_register+0x78/0x18c
>   [<c0633530>] do_one_initcall+0x34/0x184
>   [<c00083d8>] kernel_init+0xa8/0x134
>   [<c063a5a8>] kernel_thread_exit+0x0/0x8
> irq event stamp: 17556
> hardirqs last  enabled at (17553): [<c063a64c>] default_idle+0x24/0x2c
> hardirqs last disabled at (17554): [<c0639674>] __irq_svc+0x34/0xa0
> softirqs last  enabled at (17556): [<c065c878>] irq_enter+0x78/0x7c
> softirqs last disabled at (17555): [<c065c86c>] irq_enter+0x6c/0x7c
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(&(&new_sh_chan->desc_lock)->rlock);
>   <Interrupt>
>     lock(&(&new_sh_chan->desc_lock)->rlock);
> 
>  *** DEADLOCK ***
> 
> no locks held by swapper/0.
> 
> stack backtrace:
> [<c063f730>] (unwind_backtrace+0x0/0xfc) from [<c0686358>] (print_usage_bug+0x21c/0x2bc)
> [<c0686358>] (print_usage_bug+0x21c/0x2bc) from [<c0686908>] (mark_lock+0x510/0x740)
> [<c0686908>] (mark_lock+0x510/0x740) from [<c0689ba4>] (__lock_acquire+0x638/0xbb4)
> [<c0689ba4>] (__lock_acquire+0x638/0xbb4) from [<c068a6f8>] (lock_acquire+0x60/0x74)
> [<c068a6f8>] (lock_acquire+0x60/0x74) from [<c088d868>] (_raw_spin_lock+0x34/0x44)
> [<c088d868>] (_raw_spin_lock+0x34/0x44) from [<c0795fc4>] (sh_dmae_interrupt+0x14/0x78)
> [<c0795fc4>] (sh_dmae_interrupt+0x14/0x78) from [<c0698f50>] (handle_irq_event_percpu+0x54/0x184)
> [<c0698f50>] (handle_irq_event_percpu+0x54/0x184) from [<c06990bc>] (handle_irq_event+0x3c/0x5c)
> [<c06990bc>] (handle_irq_event+0x3c/0x5c) from [<c069b6e8>] (handle_fasteoi_irq+0x9c/0x104)
> [<c069b6e8>] (handle_fasteoi_irq+0x9c/0x104) from [<c0698eec>] (generic_handle_irq+0x28/0x30)
> [<c0698eec>] (generic_handle_irq+0x28/0x30) from [<c0633058>] (asm_do_IRQ+0x58/0xb8)
> [<c0633058>] (asm_do_IRQ+0x58/0xb8) from [<c0644acc>] (shmobile_handle_irq_gic+0xc/0x80)
> Exception stack(0xc0951f70 to 0xc0951fb8)
> 1f60:                                     00000001 00004491 00000000 c0951fa8
> 1f80: c0950000 c0977604 c088f9c4 c0955b40 4000406a 412fc098 00000000 00000000
> 1fa0: 00000001 c0951fb8 00004490 c063a650 20000013 ffffffff
> [<c0644acc>] (shmobile_handle_irq_gic+0xc/0x80) from [<00004491>] (0x4491)
> mmc0: new high speed SD card at address b368
> mmcblk0: mmc0:b368 SDC   489 MiB 
>  mmcblk0: p1
> 

---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: shdma: (or tmio_mmc?) inconsistent lock state detected on ag5evm
@ 2011-07-27  8:23   ` Guennadi Liakhovetski
  0 siblings, 0 replies; 6+ messages in thread
From: Guennadi Liakhovetski @ 2011-07-27  8:23 UTC (permalink / raw)
  To: Yoshii Takashi; +Cc: linux-sh, linux-mmc

Hello Yoshii-san

On Mon, 25 Jul 2011, Yoshii Takashi wrote:

> Hello, Guennadi.
> This was started as an evaluation of your patch
>  [PATCH v2] mmc: tmio: fix recursive spinlock, don't schedule with interrupts disabled
> But now I don't know where exactly the issue is. So, let me begin new thread.
> 
> On ag5evm(the board under arch/arm/mach-shmobile, that is SMP), both
>  tag v3.0 + patch above, and
>  tag v3.0 + cjb/mmc-next
> with CONFIG_PROVE_LOCKING=y, inconsistent lock state is detected as text attached at
>  the bottom.
> Function itself(mount/read/write/umount) seems working without problem, so far.
> 
> As it happens on the spinlock at sh_dmae_interrupt(), it seems to have been there since
> 2dc66667 dmaengine: shdma: fix locking
> which introduced the lock there.

Thanks for the report! It is interesting. From your backtrace it looks 
like the spin_lock_bh() in sh_dmae_alloc_chan_resources() gets interrupted 
by an interrupt on the same channel! I do not understand this, because, 
normally no interrupts on DMAC channels occur until data transfers begin 
over them.

So, a simple fix would be to replace calls to spin_(un)lock_bh() with 
spin_(un)lock_irq() in sh_dmae_alloc_chan_resources(), but I would like to 
understand, what kind of interrupts are occurring at that time and why. 
Could you please apply the attached below debugging patch and send me the 
resulting complete dmesg output? Alternatively, you can choose to wait 
until a detailed SMP testing of the SDHI driver takes place, hopefully, in 
approximately a week or a bit more.

Thanks
Guennadi

diff --git a/drivers/dma/shdma.c b/drivers/dma/shdma.c
index 0283300..2c6dde6 100644
--- a/drivers/dma/shdma.c
+++ b/drivers/dma/shdma.c
@@ -848,6 +848,10 @@ static irqreturn_t sh_dmae_interrupt(int irq, void *data)
 
 	chcr = sh_dmae_readl(sh_chan, CHCR);
 
+	if (list_empty(&sh_chan->ld_queue))
+		dev_warn(sh_chan->dev, "Spurious interrupt 0x%x on %s\n",
+			 chcr, dma_chan_name(&sh_chan->common));
+
 	if (chcr & CHCR_TE) {
 		/* DMA stop */
 		dmae_halt(sh_chan);

> 
> I found deleting the spin_lock/unlock there makes it quiet. But that lock must be the
> important part of the patch. Actually, deleting it brings another locking issue on
> tmio_mmc, and it confuses me. I wonder what is the correct solution. I'm afraid I can't
> tell what is the critical object to be protected in shdma and tmio_mmc source code...
> 
> Any suggestions are appreciated.
> /yoshii
> 
> ================> [ INFO: inconsistent lock state ]
> 3.0.0-00100-g82258ef-dirty #3643
> ---------------------------------
> inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
> swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
>  (&(&new_sh_chan->desc_lock)->rlock){?.-...}, at: [<c0795fc4>] sh_dmae_interrupt+0x14/0x78
> {HARDIRQ-ON-W} state was registered at:
>   [<c0689b30>] __lock_acquire+0x5c4/0xbb4
>   [<c068a6f8>] lock_acquire+0x60/0x74
>   [<c088da7c>] _raw_spin_lock_bh+0x3c/0x4c
>   [<c0796edc>] sh_dmae_alloc_chan_resources+0x1b0/0x298
>   [<c0794ca8>] dma_chan_get+0xd8/0x17c
>   [<c0795560>] __dma_request_channel+0x140/0x1e4
>   [<c07e5850>] tmio_mmc_request_dma+0x74/0x10c
>   [<c0886b84>] tmio_mmc_host_probe+0x208/0x284
>   [<c0886d68>] sh_mobile_sdhi_probe+0x168/0x28c
>   [<c07bca4c>] platform_drv_probe+0x18/0x1c
>   [<c07bb5b8>] driver_probe_device+0x7c/0x178
>   [<c07bb748>] __driver_attach+0x94/0x98
>   [<c07badbc>] bus_for_each_dev+0x60/0x8c
>   [<c07ba5a4>] bus_add_driver+0xa8/0x2a4
>   [<c07bbd24>] driver_register+0x78/0x18c
>   [<c0633530>] do_one_initcall+0x34/0x184
>   [<c00083d8>] kernel_init+0xa8/0x134
>   [<c063a5a8>] kernel_thread_exit+0x0/0x8
> irq event stamp: 17556
> hardirqs last  enabled at (17553): [<c063a64c>] default_idle+0x24/0x2c
> hardirqs last disabled at (17554): [<c0639674>] __irq_svc+0x34/0xa0
> softirqs last  enabled at (17556): [<c065c878>] irq_enter+0x78/0x7c
> softirqs last disabled at (17555): [<c065c86c>] irq_enter+0x6c/0x7c
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(&(&new_sh_chan->desc_lock)->rlock);
>   <Interrupt>
>     lock(&(&new_sh_chan->desc_lock)->rlock);
> 
>  *** DEADLOCK ***
> 
> no locks held by swapper/0.
> 
> stack backtrace:
> [<c063f730>] (unwind_backtrace+0x0/0xfc) from [<c0686358>] (print_usage_bug+0x21c/0x2bc)
> [<c0686358>] (print_usage_bug+0x21c/0x2bc) from [<c0686908>] (mark_lock+0x510/0x740)
> [<c0686908>] (mark_lock+0x510/0x740) from [<c0689ba4>] (__lock_acquire+0x638/0xbb4)
> [<c0689ba4>] (__lock_acquire+0x638/0xbb4) from [<c068a6f8>] (lock_acquire+0x60/0x74)
> [<c068a6f8>] (lock_acquire+0x60/0x74) from [<c088d868>] (_raw_spin_lock+0x34/0x44)
> [<c088d868>] (_raw_spin_lock+0x34/0x44) from [<c0795fc4>] (sh_dmae_interrupt+0x14/0x78)
> [<c0795fc4>] (sh_dmae_interrupt+0x14/0x78) from [<c0698f50>] (handle_irq_event_percpu+0x54/0x184)
> [<c0698f50>] (handle_irq_event_percpu+0x54/0x184) from [<c06990bc>] (handle_irq_event+0x3c/0x5c)
> [<c06990bc>] (handle_irq_event+0x3c/0x5c) from [<c069b6e8>] (handle_fasteoi_irq+0x9c/0x104)
> [<c069b6e8>] (handle_fasteoi_irq+0x9c/0x104) from [<c0698eec>] (generic_handle_irq+0x28/0x30)
> [<c0698eec>] (generic_handle_irq+0x28/0x30) from [<c0633058>] (asm_do_IRQ+0x58/0xb8)
> [<c0633058>] (asm_do_IRQ+0x58/0xb8) from [<c0644acc>] (shmobile_handle_irq_gic+0xc/0x80)
> Exception stack(0xc0951f70 to 0xc0951fb8)
> 1f60:                                     00000001 00004491 00000000 c0951fa8
> 1f80: c0950000 c0977604 c088f9c4 c0955b40 4000406a 412fc098 00000000 00000000
> 1fa0: 00000001 c0951fb8 00004490 c063a650 20000013 ffffffff
> [<c0644acc>] (shmobile_handle_irq_gic+0xc/0x80) from [<00004491>] (0x4491)
> mmc0: new high speed SD card at address b368
> mmcblk0: mmc0:b368 SDC   489 MiB 
>  mmcblk0: p1
> 

---
Guennadi Liakhovetski, Ph.D.
Freelance Open-Source Software Developer
http://www.open-technology.de/

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: shdma: (or tmio_mmc?) inconsistent lock state detected on ag5evm
  2011-07-27  8:23   ` Guennadi Liakhovetski
@ 2011-07-28  5:47     ` takasi-y
  -1 siblings, 0 replies; 6+ messages in thread
From: takasi-y @ 2011-07-28  5:47 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: linux-sh, linux-mmc

[-- Attachment #1: Type: text/plain, Size: 2572 bytes --]

Thank you for your reply.

> Thanks for the report! It is interesting. From your backtrace it looks 
> like the spin_lock_bh() in sh_dmae_alloc_chan_resources() gets interrupted 
> by an interrupt on the same channel! I do not understand this, because, 
> normally no interrupts on DMAC channels occur until data transfers begin 
> over them.
Right. The message/trace looks as if things happen in unexpected order.
But after tracing, I found the order is not really bad.
1. sh_dmae_alloc_chan_resources() called and returned
2. sh_dmae_memcpy_issue_pending() -> sh_chan_xfer_ld_queue() -> dma_start()
3. sh_dmae_interrupt() invoked
4. the INFO message printed.

Perhaps lockdep issue, or more possibly I've miss-understood the meaning of the
 checking system and the message?
 
> So, a simple fix would be to replace calls to spin_(un)lock_bh() with 
> spin_(un)lock_irq() in sh_dmae_alloc_chan_resources(), 
Yeah! Now it moved to sh_dmae_prep_sg()
  [<c06b6b50>] __lock_acquire+0x5c4/0xbb4
  [<c06b7718>] lock_acquire+0x60/0x74
  [<c08bce9c>] _raw_spin_lock_bh+0x3c/0x4c
  [<c07c3a5c>] sh_dmae_prep_sg+0x110/0x29c
  [<c07c3c34>] sh_dmae_prep_slave_sg+0x4c/0x88
  [<c0815184>] tmio_mmc_start_dma+0x30c/0x3bc
  [<c0813fd4>] tmio_mmc_request+0xe0/0x2d8
  [<c0807bbc>] mmc_wait_for_req+0x12c/0x184
  [<c080ca0c>] mmc_app_send_scr+0x15c/0x1e0
  [<c080b284>] mmc_sd_setup_card+0x80/0x4a8
  [<c080bb20>] mmc_sd_init_card+0xcc/0x664
  [<c080c190>] mmc_attach_sd+0xd8/0x228
  [<c08078f8>] mmc_rescan+0x24c/0x3e4
  [<c06992ec>] process_one_work+0x170/0x4c4
  [<c06999d4>] worker_thread+0x19c/0x470
  [<c06a041c>] kthread+0x8c/0x90
  [<c06675a8>] kernel_thread_exit+0x0/0x8

And then s/_bh/_irq/ in sh_dmae_prep_sg(), it moved to sh_dmae_tx_submit()...
  [<c06b6b50>] __lock_acquire+0x5c4/0xbb4
  [<c06b7718>] lock_acquire+0x60/0x74
  [<c08bce9c>] _raw_spin_lock_bh+0x3c/0x4c
  [<c07c318c>] sh_dmae_tx_submit+0x30/0x138
  [<c0815194>] tmio_mmc_start_dma+0x31c/0x3bc
  ...

> .. but I would like to 
> understand, what kind of interrupts are occurring at that time and why. 
> Could you please apply the attached below debugging patch and send me the 
> resulting complete dmesg output?
Sure. Attached. Nothing seems to have happened, though.

> .. Alternatively, you can choose to wait 
> until a detailed SMP testing of the SDHI driver takes place, hopefully, in 
> approximately a week or a bit more.
That sounds promising! I would be happy if you spend more time on it than this issue.
I think I will study this some more on dma side.
Thank you. And sorry for bothering.
/yoshii

[-- Attachment #2: dmesg_test_ld_queue.txt --]
[-- Type: text/plain, Size: 10090 bytes --]

Linux version 3.0.0-00004-g65ae30f-dirty (yoshii@genny) (gcc version 4.4.1 (SG++Re-Built 4.4-93) ) #3701 SMP Wed Jul 27 21:56:24 JST 2011
CPU: ARMv7 Processor [412fc098] revision 8 (ARMv7), cr=10c5347f
CPU: VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine: ag5evm
Memory policy: ECC disabled, Data cache writealloc
bootconsole [early_ttySC2] enabled
On node 0 totalpages: 131072
free_area_init_node: node 0, pgdat c09a7cc0, node_mem_map c0ec7000
  Normal zone: 960 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 121920 pages, LIFO batch:31
  HighMem zone: 64 pages used for memmap
  HighMem zone: 8128 pages, LIFO batch:0
PERCPU: Embedded 7 pages/cpu @c12cf000 s4608 r8192 d15872 u32768
pcpu-alloc: s4608 r8192 d15872 u32768 alloc=8*4096
pcpu-alloc: [0] 0 [0] 1 
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 130048
Kernel command line: console=ttySC2,115200 earlyprintk=sh-sci.2 memchunk.ceu0=4M debug
PID hash table entries: 2048 (order: 1, 8192 bytes)
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 512MB = 512MB total
Memory: 504572k/504572k available, 19716k reserved, 32768K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    DMA     : 0xf6000000 - 0xffe00000   ( 158 MB)
    vmalloc : 0xde800000 - 0xe6000000   ( 120 MB)
    lowmem  : 0xc0000000 - 0xde000000   ( 480 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .init : 0xc0008000 - 0xc0660000   (6496 kB)
      .text : 0xc0660000 - 0xc097e99c   (3195 kB)
      .data : 0xc0980000 - 0xc09a8860   ( 163 kB)
       .bss : 0xc09a8884 - 0xc0ec6490   (5240 kB)
Hierarchical RCU implementation.
NR_IRQS:1024 nr_irqs:1024 1024
intc: Registered controller 'sh73a0-intcs' with 77 IRQs
Console: colour dummy device 80x30
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:  8
... MAX_LOCK_DEPTH:          48
... MAX_LOCKDEP_KEYS:        8191
... CLASSHASH_SIZE:          4096
... MAX_LOCKDEP_ENTRIES:     16384
... MAX_LOCKDEP_CHAINS:      32768
... CHAINHASH_SIZE:          16384
 memory used by lock dependency info: 3695 kB
 per task-struct memory footprint: 1152 bytes
 sh_tmu.0: used for clock events
 sh_tmu.0: used for periodic clock events
 sh_tmu.1: used as clock source
Calibrating delay loop... 72.46 BogoMIPS (lpj=282624)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
Calibrating local timer... 98.28MHz.
CPU1: failed to come online
Brought up 1 CPUs
SMP: Total of 1 processors activated (72.46 BogoMIPS).
NET: Registered protocol family 16
pfc: sh73a0_pfc handling gpio 0 -> 908
L310 cache controller enabled
l2x0: 8 ways, CACHE_ID 0x410000c7, AUX_CTRL 0x02460000, Cache size: 524288 B
bio: create slab <bio-0> at 0
i2c-sh_mobile i2c-sh_mobile.0: I2C adapter 0 with bus speed 100000 Hz
i2c-sh_mobile i2c-sh_mobile.1: I2C adapter 1 with bus speed 100000 Hz
i2c-sh_mobile i2c-sh_mobile.2: I2C adapter 2 with bus speed 100000 Hz
i2c-sh_mobile i2c-sh_mobile.3: I2C adapter 3 with bus speed 100000 Hz
i2c-sh_mobile i2c-sh_mobile.4: I2C adapter 4 with bus speed 100000 Hz
Switching to clocksource sh_tmu.1
platform sh_tmu.0: used for oneshot clock events
Switched to NOHz mode on CPU #0
NET: Registered protocol family 2
IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 16384 (order: 7, 589824 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
TCP reno registered
UDP hash table entries: 256 (order: 2, 20480 bytes)
UDP-Lite hash table entries: 256 (order: 2, 20480 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
highmem bounce pool size: 64 pages
msgmni has been set to 921
io scheduler noop registered (default)
start plist test
end plist test
Console: switching to colour frame buffer device 68x60
graphics fb0: registered sh_mobile_lcdc_fb/mainlcd as 544x961 16bpp.
SuperH SCI(F) driver initialized
sh-sci.0: ttySC0 at MMIO 0xe6c40000 (irq = 104) is a scifa
console [ttySC2] enabled, bootconsole disabled
sh-sci.1: ttySC1 at MMIO 0xe6c50000 (irq = 105) is a scifa
sh-sci.2: ttySC2 at MMIO 0xe6c60000 (irq = 106) is a scifa
sh-sci.3: ttySC3 at MMIO 0xe6c70000 (irq = 107) is a scifa
sh-sci.4: ttySC4 at MMIO 0xe6c80000 (irq = 110) is a scifa
sh-sci.5: ttySC5 at MMIO 0xe6cb0000 (irq = 111) is a scifa
sh-sci.6: ttySC6 at MMIO 0xe6cc0000 (irq = 188) is a scifa
sh-sci.7: ttySC7 at MMIO 0xe6cd0000 (irq = 175) is a scifa
sh-sci.8: ttySC8 at MMIO 0xe6c30000 (irq = 112) is a scifb
smsc911x: Driver version 2008-10-21
smsc911x-mdio: probed
smsc911x smsc911x.0: eth0: attached PHY driver [Generic PHY] (mii_bus:phy_addr=0:01, irq=-1)
smsc911x smsc911x.0: eth0: MAC Address: 02:00:00:73:a0:02
mousedev: PS/2 mouse device common for all mice
sh_mobile_sdhi sh_mobile_sdhi.0: mmc0 base at 0xee100000 clock rate 69 MHz
sh_mobile_sdhi sh_mobile_sdhi.1: mmc1 base at 0xee120000 clock rate 69 MHz
sh_cmt sh_cmt.10: used for clock events
sh_cmt sh_cmt.10: used as clock source
sh_tmu sh_tmu.0: kept as earlytimer
sh_tmu sh_tmu.1: kept as earlytimer
TCP cubic registered
NET: Registered protocol family 17
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 2
Freeing init memory: 6496K

=================================
[ INFO: inconsistent lock state ]
3.0.0-00004-g65ae30f-dirty #3701
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
mdev/284 [HC1[1]:SC0[0]:HE0:SE1] takes:
 (&(&new_sh_chan->desc_lock)->rlock){?.-...}, at: [<c07c3014>] sh_dmae_interrupt+0x14/0xb0
{HARDIRQ-ON-W} state was registered at:
  [<c06b6b50>] __lock_acquire+0x5c4/0xbb4
  [<c06b7718>] lock_acquire+0x60/0x74
  [<c08bce9c>] _raw_spin_lock_bh+0x3c/0x4c
  [<c07c4014>] sh_dmae_alloc_chan_resources+0x1d0/0x2bc
  [<c07c1d20>] dma_chan_get+0xd8/0x17c
  [<c07c25d8>] __dma_request_channel+0x140/0x1e4
  [<c0814d00>] tmio_mmc_request_dma+0x74/0x10c
  [<c08b5f98>] tmio_mmc_host_probe+0x228/0x2cc
  [<c08b618c>] sh_mobile_sdhi_probe+0x150/0x274
  [<c07eaa24>] platform_drv_probe+0x18/0x1c
  [<c07e957c>] driver_probe_device+0x9c/0x1a4
  [<c07e9718>] __driver_attach+0x94/0x98
  [<c07e8d48>] bus_for_each_dev+0x60/0x8c
  [<c07e8530>] bus_add_driver+0xa8/0x2a4
  [<c07e9d14>] driver_register+0x78/0x18c
  [<c0660530>] do_one_initcall+0x34/0x184
  [<c00083d8>] kernel_init+0xa8/0x134
  [<c06675a8>] kernel_thread_exit+0x0/0x8
irq event stamp: 11477
hardirqs last  enabled at (11476): [<c08bbecc>] mutex_lock_nested+0x298/0x350
hardirqs last disabled at (11477): [<c0666674>] __irq_svc+0x34/0xa0
softirqs last  enabled at (11395): [<c06897a4>] irq_exit+0xa0/0xa4
softirqs last disabled at (11380): [<c06897a4>] irq_exit+0xa0/0xa4

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&new_sh_chan->desc_lock)->rlock);
  <Interrupt>
    lock(&(&new_sh_chan->desc_lock)->rlock);

 *** DEADLOCK ***

1 lock held by mdev/284:
 #0:  (sysfs_mutex){+.+.+.}, at: [<c075175c>] sysfs_permission+0x34/0x64

stack backtrace:
[<c066c730>] (unwind_backtrace+0x0/0xfc) from [<c06b3378>] (print_usage_bug+0x21c/0x2bc)
[<c06b3378>] (print_usage_bug+0x21c/0x2bc) from [<c06b3928>] (mark_lock+0x510/0x740)
[<c06b3928>] (mark_lock+0x510/0x740) from [<c06b6bc4>] (__lock_acquire+0x638/0xbb4)
[<c06b6bc4>] (__lock_acquire+0x638/0xbb4) from [<c06b7718>] (lock_acquire+0x60/0x74)
[<c06b7718>] (lock_acquire+0x60/0x74) from [<c08bcc88>] (_raw_spin_lock+0x34/0x44)
[<c08bcc88>] (_raw_spin_lock+0x34/0x44) from [<c07c3014>] (sh_dmae_interrupt+0x14/0xb0)
[<c07c3014>] (sh_dmae_interrupt+0x14/0xb0) from [<c06c5f70>] (handle_irq_event_percpu+0x54/0x184)
[<c06c5f70>] (handle_irq_event_percpu+0x54/0x184) from [<c06c60dc>] (handle_irq_event+0x3c/0x5c)
[<c06c60dc>] (handle_irq_event+0x3c/0x5c) from [<c06c8708>] (handle_fasteoi_irq+0x9c/0x104)
[<c06c8708>] (handle_fasteoi_irq+0x9c/0x104) from [<c06c5f0c>] (generic_handle_irq+0x28/0x30)
[<c06c5f0c>] (generic_handle_irq+0x28/0x30) from [<c0660058>] (asm_do_IRQ+0x58/0xb8)
[<c0660058>] (asm_do_IRQ+0x58/0xb8) from [<c0671aec>] (shmobile_handle_irq_gic+0xc/0x80)
Exception stack(0xddb21d78 to 0xddb21dc0)
1d60:                                                       00000000 00000001
1d80: dd98d108 dd4b46e8 00000000 00000000 dd4b46e8 00000001 dd85c808 ddb20000
1da0: 00000015 ddb21e14 00000008 ddb21dc0 c0751768 c0751704 a0000013 ffffffff
[<c0671aec>] (shmobile_handle_irq_gic+0xc/0x80) from [<00000001>] (0x1)
mmc0: new high speed SD card at address b368
mmcblk0: mmc0:b368 SDC   489 MiB 
 mmcblk0: p1
smsc911x smsc911x.0: eth0: SMSC911x/921x identified at 0xde860000, IRQ: 65
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD5)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD5)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD1)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD5)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD1)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: shdma: (or tmio_mmc?) inconsistent lock state detected on
@ 2011-07-28  5:47     ` takasi-y
  0 siblings, 0 replies; 6+ messages in thread
From: takasi-y @ 2011-07-28  5:47 UTC (permalink / raw)
  To: Guennadi Liakhovetski; +Cc: linux-sh, linux-mmc

[-- Attachment #1: Type: text/plain, Size: 2572 bytes --]

Thank you for your reply.

> Thanks for the report! It is interesting. From your backtrace it looks 
> like the spin_lock_bh() in sh_dmae_alloc_chan_resources() gets interrupted 
> by an interrupt on the same channel! I do not understand this, because, 
> normally no interrupts on DMAC channels occur until data transfers begin 
> over them.
Right. The message/trace looks as if things happen in unexpected order.
But after tracing, I found the order is not really bad.
1. sh_dmae_alloc_chan_resources() called and returned
2. sh_dmae_memcpy_issue_pending() -> sh_chan_xfer_ld_queue() -> dma_start()
3. sh_dmae_interrupt() invoked
4. the INFO message printed.

Perhaps lockdep issue, or more possibly I've miss-understood the meaning of the
 checking system and the message?
 
> So, a simple fix would be to replace calls to spin_(un)lock_bh() with 
> spin_(un)lock_irq() in sh_dmae_alloc_chan_resources(), 
Yeah! Now it moved to sh_dmae_prep_sg()
  [<c06b6b50>] __lock_acquire+0x5c4/0xbb4
  [<c06b7718>] lock_acquire+0x60/0x74
  [<c08bce9c>] _raw_spin_lock_bh+0x3c/0x4c
  [<c07c3a5c>] sh_dmae_prep_sg+0x110/0x29c
  [<c07c3c34>] sh_dmae_prep_slave_sg+0x4c/0x88
  [<c0815184>] tmio_mmc_start_dma+0x30c/0x3bc
  [<c0813fd4>] tmio_mmc_request+0xe0/0x2d8
  [<c0807bbc>] mmc_wait_for_req+0x12c/0x184
  [<c080ca0c>] mmc_app_send_scr+0x15c/0x1e0
  [<c080b284>] mmc_sd_setup_card+0x80/0x4a8
  [<c080bb20>] mmc_sd_init_card+0xcc/0x664
  [<c080c190>] mmc_attach_sd+0xd8/0x228
  [<c08078f8>] mmc_rescan+0x24c/0x3e4
  [<c06992ec>] process_one_work+0x170/0x4c4
  [<c06999d4>] worker_thread+0x19c/0x470
  [<c06a041c>] kthread+0x8c/0x90
  [<c06675a8>] kernel_thread_exit+0x0/0x8

And then s/_bh/_irq/ in sh_dmae_prep_sg(), it moved to sh_dmae_tx_submit()...
  [<c06b6b50>] __lock_acquire+0x5c4/0xbb4
  [<c06b7718>] lock_acquire+0x60/0x74
  [<c08bce9c>] _raw_spin_lock_bh+0x3c/0x4c
  [<c07c318c>] sh_dmae_tx_submit+0x30/0x138
  [<c0815194>] tmio_mmc_start_dma+0x31c/0x3bc
  ...

> .. but I would like to 
> understand, what kind of interrupts are occurring at that time and why. 
> Could you please apply the attached below debugging patch and send me the 
> resulting complete dmesg output?
Sure. Attached. Nothing seems to have happened, though.

> .. Alternatively, you can choose to wait 
> until a detailed SMP testing of the SDHI driver takes place, hopefully, in 
> approximately a week or a bit more.
That sounds promising! I would be happy if you spend more time on it than this issue.
I think I will study this some more on dma side.
Thank you. And sorry for bothering.
/yoshii

[-- Attachment #2: dmesg_test_ld_queue.txt --]
[-- Type: text/plain, Size: 10090 bytes --]

Linux version 3.0.0-00004-g65ae30f-dirty (yoshii@genny) (gcc version 4.4.1 (SG++Re-Built 4.4-93) ) #3701 SMP Wed Jul 27 21:56:24 JST 2011
CPU: ARMv7 Processor [412fc098] revision 8 (ARMv7), cr=10c5347f
CPU: VIPT nonaliasing data cache, VIPT aliasing instruction cache
Machine: ag5evm
Memory policy: ECC disabled, Data cache writealloc
bootconsole [early_ttySC2] enabled
On node 0 totalpages: 131072
free_area_init_node: node 0, pgdat c09a7cc0, node_mem_map c0ec7000
  Normal zone: 960 pages used for memmap
  Normal zone: 0 pages reserved
  Normal zone: 121920 pages, LIFO batch:31
  HighMem zone: 64 pages used for memmap
  HighMem zone: 8128 pages, LIFO batch:0
PERCPU: Embedded 7 pages/cpu @c12cf000 s4608 r8192 d15872 u32768
pcpu-alloc: s4608 r8192 d15872 u32768 alloc=8*4096
pcpu-alloc: [0] 0 [0] 1 
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 130048
Kernel command line: console=ttySC2,115200 earlyprintk=sh-sci.2 memchunk.ceu0=4M debug
PID hash table entries: 2048 (order: 1, 8192 bytes)
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 512MB = 512MB total
Memory: 504572k/504572k available, 19716k reserved, 32768K highmem
Virtual kernel memory layout:
    vector  : 0xffff0000 - 0xffff1000   (   4 kB)
    fixmap  : 0xfff00000 - 0xfffe0000   ( 896 kB)
    DMA     : 0xf6000000 - 0xffe00000   ( 158 MB)
    vmalloc : 0xde800000 - 0xe6000000   ( 120 MB)
    lowmem  : 0xc0000000 - 0xde000000   ( 480 MB)
    pkmap   : 0xbfe00000 - 0xc0000000   (   2 MB)
    modules : 0xbf000000 - 0xbfe00000   (  14 MB)
      .init : 0xc0008000 - 0xc0660000   (6496 kB)
      .text : 0xc0660000 - 0xc097e99c   (3195 kB)
      .data : 0xc0980000 - 0xc09a8860   ( 163 kB)
       .bss : 0xc09a8884 - 0xc0ec6490   (5240 kB)
Hierarchical RCU implementation.
NR_IRQS:1024 nr_irqs:1024 1024
intc: Registered controller 'sh73a0-intcs' with 77 IRQs
Console: colour dummy device 80x30
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:  8
... MAX_LOCK_DEPTH:          48
... MAX_LOCKDEP_KEYS:        8191
... CLASSHASH_SIZE:          4096
... MAX_LOCKDEP_ENTRIES:     16384
... MAX_LOCKDEP_CHAINS:      32768
... CHAINHASH_SIZE:          16384
 memory used by lock dependency info: 3695 kB
 per task-struct memory footprint: 1152 bytes
 sh_tmu.0: used for clock events
 sh_tmu.0: used for periodic clock events
 sh_tmu.1: used as clock source
Calibrating delay loop... 72.46 BogoMIPS (lpj=282624)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 512
CPU: Testing write buffer coherency: ok
Calibrating local timer... 98.28MHz.
CPU1: failed to come online
Brought up 1 CPUs
SMP: Total of 1 processors activated (72.46 BogoMIPS).
NET: Registered protocol family 16
pfc: sh73a0_pfc handling gpio 0 -> 908
L310 cache controller enabled
l2x0: 8 ways, CACHE_ID 0x410000c7, AUX_CTRL 0x02460000, Cache size: 524288 B
bio: create slab <bio-0> at 0
i2c-sh_mobile i2c-sh_mobile.0: I2C adapter 0 with bus speed 100000 Hz
i2c-sh_mobile i2c-sh_mobile.1: I2C adapter 1 with bus speed 100000 Hz
i2c-sh_mobile i2c-sh_mobile.2: I2C adapter 2 with bus speed 100000 Hz
i2c-sh_mobile i2c-sh_mobile.3: I2C adapter 3 with bus speed 100000 Hz
i2c-sh_mobile i2c-sh_mobile.4: I2C adapter 4 with bus speed 100000 Hz
Switching to clocksource sh_tmu.1
platform sh_tmu.0: used for oneshot clock events
Switched to NOHz mode on CPU #0
NET: Registered protocol family 2
IP route cache hash table entries: 4096 (order: 2, 16384 bytes)
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 16384 (order: 7, 589824 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
TCP reno registered
UDP hash table entries: 256 (order: 2, 20480 bytes)
UDP-Lite hash table entries: 256 (order: 2, 20480 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
highmem bounce pool size: 64 pages
msgmni has been set to 921
io scheduler noop registered (default)
start plist test
end plist test
Console: switching to colour frame buffer device 68x60
graphics fb0: registered sh_mobile_lcdc_fb/mainlcd as 544x961 16bpp.
SuperH SCI(F) driver initialized
sh-sci.0: ttySC0 at MMIO 0xe6c40000 (irq = 104) is a scifa
console [ttySC2] enabled, bootconsole disabled
sh-sci.1: ttySC1 at MMIO 0xe6c50000 (irq = 105) is a scifa
sh-sci.2: ttySC2 at MMIO 0xe6c60000 (irq = 106) is a scifa
sh-sci.3: ttySC3 at MMIO 0xe6c70000 (irq = 107) is a scifa
sh-sci.4: ttySC4 at MMIO 0xe6c80000 (irq = 110) is a scifa
sh-sci.5: ttySC5 at MMIO 0xe6cb0000 (irq = 111) is a scifa
sh-sci.6: ttySC6 at MMIO 0xe6cc0000 (irq = 188) is a scifa
sh-sci.7: ttySC7 at MMIO 0xe6cd0000 (irq = 175) is a scifa
sh-sci.8: ttySC8 at MMIO 0xe6c30000 (irq = 112) is a scifb
smsc911x: Driver version 2008-10-21
smsc911x-mdio: probed
smsc911x smsc911x.0: eth0: attached PHY driver [Generic PHY] (mii_bus:phy_addr=0:01, irq=-1)
smsc911x smsc911x.0: eth0: MAC Address: 02:00:00:73:a0:02
mousedev: PS/2 mouse device common for all mice
sh_mobile_sdhi sh_mobile_sdhi.0: mmc0 base at 0xee100000 clock rate 69 MHz
sh_mobile_sdhi sh_mobile_sdhi.1: mmc1 base at 0xee120000 clock rate 69 MHz
sh_cmt sh_cmt.10: used for clock events
sh_cmt sh_cmt.10: used as clock source
sh_tmu sh_tmu.0: kept as earlytimer
sh_tmu sh_tmu.1: kept as earlytimer
TCP cubic registered
NET: Registered protocol family 17
VFP support v0.3: implementor 41 architecture 3 part 30 variant 9 rev 2
Freeing init memory: 6496K

=================================
[ INFO: inconsistent lock state ]
3.0.0-00004-g65ae30f-dirty #3701
---------------------------------
inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
mdev/284 [HC1[1]:SC0[0]:HE0:SE1] takes:
 (&(&new_sh_chan->desc_lock)->rlock){?.-...}, at: [<c07c3014>] sh_dmae_interrupt+0x14/0xb0
{HARDIRQ-ON-W} state was registered at:
  [<c06b6b50>] __lock_acquire+0x5c4/0xbb4
  [<c06b7718>] lock_acquire+0x60/0x74
  [<c08bce9c>] _raw_spin_lock_bh+0x3c/0x4c
  [<c07c4014>] sh_dmae_alloc_chan_resources+0x1d0/0x2bc
  [<c07c1d20>] dma_chan_get+0xd8/0x17c
  [<c07c25d8>] __dma_request_channel+0x140/0x1e4
  [<c0814d00>] tmio_mmc_request_dma+0x74/0x10c
  [<c08b5f98>] tmio_mmc_host_probe+0x228/0x2cc
  [<c08b618c>] sh_mobile_sdhi_probe+0x150/0x274
  [<c07eaa24>] platform_drv_probe+0x18/0x1c
  [<c07e957c>] driver_probe_device+0x9c/0x1a4
  [<c07e9718>] __driver_attach+0x94/0x98
  [<c07e8d48>] bus_for_each_dev+0x60/0x8c
  [<c07e8530>] bus_add_driver+0xa8/0x2a4
  [<c07e9d14>] driver_register+0x78/0x18c
  [<c0660530>] do_one_initcall+0x34/0x184
  [<c00083d8>] kernel_init+0xa8/0x134
  [<c06675a8>] kernel_thread_exit+0x0/0x8
irq event stamp: 11477
hardirqs last  enabled at (11476): [<c08bbecc>] mutex_lock_nested+0x298/0x350
hardirqs last disabled at (11477): [<c0666674>] __irq_svc+0x34/0xa0
softirqs last  enabled at (11395): [<c06897a4>] irq_exit+0xa0/0xa4
softirqs last disabled at (11380): [<c06897a4>] irq_exit+0xa0/0xa4

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&new_sh_chan->desc_lock)->rlock);
  <Interrupt>
    lock(&(&new_sh_chan->desc_lock)->rlock);

 *** DEADLOCK ***

1 lock held by mdev/284:
 #0:  (sysfs_mutex){+.+.+.}, at: [<c075175c>] sysfs_permission+0x34/0x64

stack backtrace:
[<c066c730>] (unwind_backtrace+0x0/0xfc) from [<c06b3378>] (print_usage_bug+0x21c/0x2bc)
[<c06b3378>] (print_usage_bug+0x21c/0x2bc) from [<c06b3928>] (mark_lock+0x510/0x740)
[<c06b3928>] (mark_lock+0x510/0x740) from [<c06b6bc4>] (__lock_acquire+0x638/0xbb4)
[<c06b6bc4>] (__lock_acquire+0x638/0xbb4) from [<c06b7718>] (lock_acquire+0x60/0x74)
[<c06b7718>] (lock_acquire+0x60/0x74) from [<c08bcc88>] (_raw_spin_lock+0x34/0x44)
[<c08bcc88>] (_raw_spin_lock+0x34/0x44) from [<c07c3014>] (sh_dmae_interrupt+0x14/0xb0)
[<c07c3014>] (sh_dmae_interrupt+0x14/0xb0) from [<c06c5f70>] (handle_irq_event_percpu+0x54/0x184)
[<c06c5f70>] (handle_irq_event_percpu+0x54/0x184) from [<c06c60dc>] (handle_irq_event+0x3c/0x5c)
[<c06c60dc>] (handle_irq_event+0x3c/0x5c) from [<c06c8708>] (handle_fasteoi_irq+0x9c/0x104)
[<c06c8708>] (handle_fasteoi_irq+0x9c/0x104) from [<c06c5f0c>] (generic_handle_irq+0x28/0x30)
[<c06c5f0c>] (generic_handle_irq+0x28/0x30) from [<c0660058>] (asm_do_IRQ+0x58/0xb8)
[<c0660058>] (asm_do_IRQ+0x58/0xb8) from [<c0671aec>] (shmobile_handle_irq_gic+0xc/0x80)
Exception stack(0xddb21d78 to 0xddb21dc0)
1d60:                                                       00000000 00000001
1d80: dd98d108 dd4b46e8 00000000 00000000 dd4b46e8 00000001 dd85c808 ddb20000
1da0: 00000015 ddb21e14 00000008 ddb21dc0 c0751768 c0751704 a0000013 ffffffff
[<c0671aec>] (shmobile_handle_irq_gic+0xc/0x80) from [<00000001>] (0x1)
mmc0: new high speed SD card at address b368
mmcblk0: mmc0:b368 SDC   489 MiB 
 mmcblk0: p1
smsc911x smsc911x.0: eth0: SMSC911x/921x identified at 0xde860000, IRQ: 65
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD5)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD5)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD1)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD5)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD55)
sh_mobile_sdhi sh_mobile_sdhi.1: timeout waiting for hardware interrupt (CMD1)

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-07-28  6:07 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-07-25  9:28 shdma: (or tmio_mmc?) inconsistent lock state detected on ag5evm Yoshii Takashi
2011-07-25  9:28 ` Yoshii Takashi
2011-07-27  8:23 ` Guennadi Liakhovetski
2011-07-27  8:23   ` Guennadi Liakhovetski
2011-07-28  5:47   ` takasi-y
2011-07-28  5:47     ` shdma: (or tmio_mmc?) inconsistent lock state detected on takasi-y

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.