linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* BUG: imx6: scheduling-while-atomic bug detected with commit: 8ba9580
@ 2013-10-09  6:10 Wang, Jiada (ESD)
  2013-10-15  6:29 ` Dirk Behme
  0 siblings, 1 reply; 3+ messages in thread
From: Wang, Jiada (ESD) @ 2013-10-09  6:10 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Shawn and Lucas

  The following scheduling-while-atomic bug was detected in our kernel as well as v3.11.0-rc5 vanilla

[    3.421817] BUG: scheduling while atomic: kworker/u:0/6/0x00000002
[    3.421825] 4 locks held by kworker/u:0/6:
[    3.421857]  #0:  (events_unbound){.+.+.+}, at: [<80042028>] process_one_work+0x20c/0x668
[    3.421878]  #1:  ((&entry->work)){+.+.+.}, at: [<80042028>] process_one_work+0x20c/0x668
[    3.421901]  #2:  (&(&host->lock)->rlock){......}, at: [<802fe084>] sdhci_do_set_ios+0x24/0x6a8
[    3.421923]  #3:  (prepare_lock){+.+...}, at: [<803116f8>] clk_get_rate+0x1c/0x5c
[    3.421932] Modules linked in:
[    3.421936] irq event stamp: 640
[    3.421948] hardirqs last  enabled at (639): [<800412a4>] __cancel_work_timer+0x94/0xd0
[    3.421964] hardirqs last disabled at (640): [<8041579c>] _raw_spin_lock_irqsave+0x24/0x60
[    3.421980] softirqs last  enabled at (378): [<8002abc0>] __do_softirq+0x368/0x3f8
[    3.421992] softirqs last disabled at (373): [<8002b00c>] irq_exit+0x5c/0xb0
[    3.421998] Backtrace: 
[    3.422025] [<80011aa0>] (dump_backtrace+0x0/0x10c) from [<8040dbd4>] (dump_stack+0x18/0x1c)
[    3.422044]  r6:b40bfbf8 r5:81392b80 r4:b40a2e40 r3:b40a2e40
[    3.422058] [<8040dbbc>] (dump_stack+0x0/0x1c) from [<8040e49c>] (__schedule_bug+0x64/0x78)
[    3.422072] [<8040e438>] (__schedule_bug+0x0/0x78) from [<8041424c>] (__schedule+0x68/0x7f8)
[    3.422083]  r4:b40a2e40 r3:00000002
[    3.422094] [<804141e4>] (__schedule+0x0/0x7f8) from [<80414b6c>] (schedule+0x84/0x88)
[    3.422106] [<80414ae8>] (schedule+0x0/0x88) from [<80414e38>] (schedule_preempt_disabled+0x18/0x24)
[    3.422120] [<80414e20>] (schedule_preempt_disabled+0x0/0x24) from [<80413478>] (mutex_lock_nested+0x220/0x3c0)
[    3.422134] [<80413258>] (mutex_lock_nested+0x0/0x3c0) from [<803116f8>] (clk_get_rate+0x1c/0x5c)
[    3.422148] [<803116dc>] (clk_get_rate+0x0/0x5c) from [<80302a74>] (esdhc_pltfm_set_clock+0x1c/0x200)
[    3.422158]  r4:b405e400 r3:80302a58
[    3.422171] [<80302a58>] (esdhc_pltfm_set_clock+0x0/0x200) from [<802fd3d4>] (sdhci_set_clock+0x4c/0x31c)
[    3.422183] [<802fd388>] (sdhci_set_clock+0x0/0x31c) from [<802fe174>] (sdhci_do_set_ios+0x114/0x6a8)
[    3.422194] [<802fe060>] (sdhci_do_set_ios+0x0/0x6a8) from [<802fe730>] (sdhci_set_ios+0x28/0x34)
[    3.422213] [<802fe708>] (sdhci_set_ios+0x0/0x34) from [<802efa6c>] (mmc_power_up+0xb8/0xc4)
[    3.422224]  r5:b405e2ac r4:b405e000
[    3.422239] [<802ef9b4>] (mmc_power_up+0x0/0xc4) from [<802f06b4>] (mmc_start_host+0x30/0x44)
[    3.422255]  r7:b405e4b4 r6:00000000 r5:00000000 r4:b405e000
[    3.422267] [<802f0684>] (mmc_start_host+0x0/0x44) from [<802f0fd8>] (mmc_add_host+0x58/0x6c)
[    3.422277]  r5:00000000 r4:b405e000
[    3.422289] [<802f0f80>] (mmc_add_host+0x0/0x6c) from [<802ff188>] (sdhci_add_host+0x9f8/0xb88)
[    3.422299]  r5:b405e000 r4:b405e400
[    3.422313] [<802fe790>] (sdhci_add_host+0x0/0xb88) from [<80303dd8>] (sdhci_esdhc_imx_probe_wrapper+0x5c4/0x6a4)
[    3.422334]  r8:b4128400 r7:b4128410 r6:00002801 r5:b4394110 r4:b405e400
[    3.422356] [<80303814>] (sdhci_esdhc_imx_probe_wrapper+0x0/0x6a4) from [<8004fe34>] (async_run_entry_fn+0xf0/0x20c)
[    3.422370] [<8004fd44>] (async_run_entry_fn+0x0/0x20c) from [<8004219c>] (process_one_work+0x380/0x668)
[    3.422393]  r9:00000089 r8:00000000 r7:b4004600 r6:8064ca40 r5:b4391e08
[    3.422393] r4:b4081200
[    3.422406] [<80041e1c>] (process_one_work+0x0/0x668) from [<800428e0>] (worker_thread+0x290/0x3a8)
[    3.422423] [<80042650>] (worker_thread+0x0/0x3a8) from [<800476b0>] (kthread+0xb4/0xc0)
[    3.422439] [<800475fc>] (kthread+0x0/0xc0) from [<8000d970>] (ret_from_fork+0x14/0x24)
[    3.422455]  r7:00000000 r6:00000000 r5:800475fc r4:b409de58

The scheduling-while-atomic bug is due to clk_get_rate() trying to lock a mutex while a spinlock is being held by sdhci_do_set_ios().
The sequence is:
sdhci_do_set_ios()
  spin_lock_irqsave(&host->lock, flags);
  sdhci_set_clock(host, ios->clock);
    esdhc_pltfm_set_clock(host, clock); /* called via host->ops->set_clock */
      esdhc_set_clock(host, clock, clk_get_rate(pltfm_host->clk));
The locking violation occurs when clk_get_rate tries to lock a mutex.

The esdhc_pltfm_set_clock() routine was introduced in commit: 
8ba9580 mmc: sdhci-esdhc: calculate sdclk divider from controller clock

Would you please have a look at this issue?

Thanks,
Jiada

^ permalink raw reply	[flat|nested] 3+ messages in thread

* BUG: imx6: scheduling-while-atomic bug detected with commit: 8ba9580
  2013-10-09  6:10 BUG: imx6: scheduling-while-atomic bug detected with commit: 8ba9580 Wang, Jiada (ESD)
@ 2013-10-15  6:29 ` Dirk Behme
  2013-10-15  6:38   ` Lucas Stach
  0 siblings, 1 reply; 3+ messages in thread
From: Dirk Behme @ 2013-10-15  6:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Shawn and Lucas,

On 09.10.2013 08:10, Wang, Jiada (ESD) wrote:
> Hi Shawn and Lucas
>
>    The following scheduling-while-atomic bug was detected in our kernel as well as v3.11.0-rc5 vanilla
>
> [    3.421817] BUG: scheduling while atomic: kworker/u:0/6/0x00000002
> [    3.421825] 4 locks held by kworker/u:0/6:
> [    3.421857]  #0:  (events_unbound){.+.+.+}, at: [<80042028>] process_one_work+0x20c/0x668
> [    3.421878]  #1:  ((&entry->work)){+.+.+.}, at: [<80042028>] process_one_work+0x20c/0x668
> [    3.421901]  #2:  (&(&host->lock)->rlock){......}, at: [<802fe084>] sdhci_do_set_ios+0x24/0x6a8
> [    3.421923]  #3:  (prepare_lock){+.+...}, at: [<803116f8>] clk_get_rate+0x1c/0x5c
> [    3.421932] Modules linked in:
> [    3.421936] irq event stamp: 640
> [    3.421948] hardirqs last  enabled at (639): [<800412a4>] __cancel_work_timer+0x94/0xd0
> [    3.421964] hardirqs last disabled at (640): [<8041579c>] _raw_spin_lock_irqsave+0x24/0x60
> [    3.421980] softirqs last  enabled at (378): [<8002abc0>] __do_softirq+0x368/0x3f8
> [    3.421992] softirqs last disabled at (373): [<8002b00c>] irq_exit+0x5c/0xb0
> [    3.421998] Backtrace:
> [    3.422025] [<80011aa0>] (dump_backtrace+0x0/0x10c) from [<8040dbd4>] (dump_stack+0x18/0x1c)
> [    3.422044]  r6:b40bfbf8 r5:81392b80 r4:b40a2e40 r3:b40a2e40
> [    3.422058] [<8040dbbc>] (dump_stack+0x0/0x1c) from [<8040e49c>] (__schedule_bug+0x64/0x78)
> [    3.422072] [<8040e438>] (__schedule_bug+0x0/0x78) from [<8041424c>] (__schedule+0x68/0x7f8)
> [    3.422083]  r4:b40a2e40 r3:00000002
> [    3.422094] [<804141e4>] (__schedule+0x0/0x7f8) from [<80414b6c>] (schedule+0x84/0x88)
> [    3.422106] [<80414ae8>] (schedule+0x0/0x88) from [<80414e38>] (schedule_preempt_disabled+0x18/0x24)
> [    3.422120] [<80414e20>] (schedule_preempt_disabled+0x0/0x24) from [<80413478>] (mutex_lock_nested+0x220/0x3c0)
> [    3.422134] [<80413258>] (mutex_lock_nested+0x0/0x3c0) from [<803116f8>] (clk_get_rate+0x1c/0x5c)
> [    3.422148] [<803116dc>] (clk_get_rate+0x0/0x5c) from [<80302a74>] (esdhc_pltfm_set_clock+0x1c/0x200)
> [    3.422158]  r4:b405e400 r3:80302a58
> [    3.422171] [<80302a58>] (esdhc_pltfm_set_clock+0x0/0x200) from [<802fd3d4>] (sdhci_set_clock+0x4c/0x31c)
> [    3.422183] [<802fd388>] (sdhci_set_clock+0x0/0x31c) from [<802fe174>] (sdhci_do_set_ios+0x114/0x6a8)
> [    3.422194] [<802fe060>] (sdhci_do_set_ios+0x0/0x6a8) from [<802fe730>] (sdhci_set_ios+0x28/0x34)
> [    3.422213] [<802fe708>] (sdhci_set_ios+0x0/0x34) from [<802efa6c>] (mmc_power_up+0xb8/0xc4)
> [    3.422224]  r5:b405e2ac r4:b405e000
> [    3.422239] [<802ef9b4>] (mmc_power_up+0x0/0xc4) from [<802f06b4>] (mmc_start_host+0x30/0x44)
> [    3.422255]  r7:b405e4b4 r6:00000000 r5:00000000 r4:b405e000
> [    3.422267] [<802f0684>] (mmc_start_host+0x0/0x44) from [<802f0fd8>] (mmc_add_host+0x58/0x6c)
> [    3.422277]  r5:00000000 r4:b405e000
> [    3.422289] [<802f0f80>] (mmc_add_host+0x0/0x6c) from [<802ff188>] (sdhci_add_host+0x9f8/0xb88)
> [    3.422299]  r5:b405e000 r4:b405e400
> [    3.422313] [<802fe790>] (sdhci_add_host+0x0/0xb88) from [<80303dd8>] (sdhci_esdhc_imx_probe_wrapper+0x5c4/0x6a4)
> [    3.422334]  r8:b4128400 r7:b4128410 r6:00002801 r5:b4394110 r4:b405e400
> [    3.422356] [<80303814>] (sdhci_esdhc_imx_probe_wrapper+0x0/0x6a4) from [<8004fe34>] (async_run_entry_fn+0xf0/0x20c)
> [    3.422370] [<8004fd44>] (async_run_entry_fn+0x0/0x20c) from [<8004219c>] (process_one_work+0x380/0x668)
> [    3.422393]  r9:00000089 r8:00000000 r7:b4004600 r6:8064ca40 r5:b4391e08
> [    3.422393] r4:b4081200
> [    3.422406] [<80041e1c>] (process_one_work+0x0/0x668) from [<800428e0>] (worker_thread+0x290/0x3a8)
> [    3.422423] [<80042650>] (worker_thread+0x0/0x3a8) from [<800476b0>] (kthread+0xb4/0xc0)
> [    3.422439] [<800475fc>] (kthread+0x0/0xc0) from [<8000d970>] (ret_from_fork+0x14/0x24)
> [    3.422455]  r7:00000000 r6:00000000 r5:800475fc r4:b409de58
>
> The scheduling-while-atomic bug is due to clk_get_rate() trying to lock a mutex while a spinlock is being held by sdhci_do_set_ios().
> The sequence is:
> sdhci_do_set_ios()
>    spin_lock_irqsave(&host->lock, flags);
>    sdhci_set_clock(host, ios->clock);
>      esdhc_pltfm_set_clock(host, clock); /* called via host->ops->set_clock */
>        esdhc_set_clock(host, clock, clk_get_rate(pltfm_host->clk));
> The locking violation occurs when clk_get_rate tries to lock a mutex.
>
> The esdhc_pltfm_set_clock() routine was introduced in commit:
> 8ba9580 mmc: sdhci-esdhc: calculate sdclk divider from controller clock
>
> Would you please have a look at this issue?

Any opinions on this?

Many thanks

Dirk

^ permalink raw reply	[flat|nested] 3+ messages in thread

* BUG: imx6: scheduling-while-atomic bug detected with commit: 8ba9580
  2013-10-15  6:29 ` Dirk Behme
@ 2013-10-15  6:38   ` Lucas Stach
  0 siblings, 0 replies; 3+ messages in thread
From: Lucas Stach @ 2013-10-15  6:38 UTC (permalink / raw)
  To: linux-arm-kernel

Am Dienstag, den 15.10.2013, 08:29 +0200 schrieb Dirk Behme:
> Hi Shawn and Lucas,
> 
> On 09.10.2013 08:10, Wang, Jiada (ESD) wrote:
> > Hi Shawn and Lucas
> >
> >    The following scheduling-while-atomic bug was detected in our kernel as well as v3.11.0-rc5 vanilla
> >
> > [    3.421817] BUG: scheduling while atomic: kworker/u:0/6/0x00000002
> > [    3.421825] 4 locks held by kworker/u:0/6:
> > [    3.421857]  #0:  (events_unbound){.+.+.+}, at: [<80042028>] process_one_work+0x20c/0x668
> > [    3.421878]  #1:  ((&entry->work)){+.+.+.}, at: [<80042028>] process_one_work+0x20c/0x668
> > [    3.421901]  #2:  (&(&host->lock)->rlock){......}, at: [<802fe084>] sdhci_do_set_ios+0x24/0x6a8
> > [    3.421923]  #3:  (prepare_lock){+.+...}, at: [<803116f8>] clk_get_rate+0x1c/0x5c
> > [    3.421932] Modules linked in:
> > [    3.421936] irq event stamp: 640
> > [    3.421948] hardirqs last  enabled at (639): [<800412a4>] __cancel_work_timer+0x94/0xd0
> > [    3.421964] hardirqs last disabled at (640): [<8041579c>] _raw_spin_lock_irqsave+0x24/0x60
> > [    3.421980] softirqs last  enabled at (378): [<8002abc0>] __do_softirq+0x368/0x3f8
> > [    3.421992] softirqs last disabled at (373): [<8002b00c>] irq_exit+0x5c/0xb0
> > [    3.421998] Backtrace:
> > [    3.422025] [<80011aa0>] (dump_backtrace+0x0/0x10c) from [<8040dbd4>] (dump_stack+0x18/0x1c)
> > [    3.422044]  r6:b40bfbf8 r5:81392b80 r4:b40a2e40 r3:b40a2e40
> > [    3.422058] [<8040dbbc>] (dump_stack+0x0/0x1c) from [<8040e49c>] (__schedule_bug+0x64/0x78)
> > [    3.422072] [<8040e438>] (__schedule_bug+0x0/0x78) from [<8041424c>] (__schedule+0x68/0x7f8)
> > [    3.422083]  r4:b40a2e40 r3:00000002
> > [    3.422094] [<804141e4>] (__schedule+0x0/0x7f8) from [<80414b6c>] (schedule+0x84/0x88)
> > [    3.422106] [<80414ae8>] (schedule+0x0/0x88) from [<80414e38>] (schedule_preempt_disabled+0x18/0x24)
> > [    3.422120] [<80414e20>] (schedule_preempt_disabled+0x0/0x24) from [<80413478>] (mutex_lock_nested+0x220/0x3c0)
> > [    3.422134] [<80413258>] (mutex_lock_nested+0x0/0x3c0) from [<803116f8>] (clk_get_rate+0x1c/0x5c)
> > [    3.422148] [<803116dc>] (clk_get_rate+0x0/0x5c) from [<80302a74>] (esdhc_pltfm_set_clock+0x1c/0x200)
> > [    3.422158]  r4:b405e400 r3:80302a58
> > [    3.422171] [<80302a58>] (esdhc_pltfm_set_clock+0x0/0x200) from [<802fd3d4>] (sdhci_set_clock+0x4c/0x31c)
> > [    3.422183] [<802fd388>] (sdhci_set_clock+0x0/0x31c) from [<802fe174>] (sdhci_do_set_ios+0x114/0x6a8)
> > [    3.422194] [<802fe060>] (sdhci_do_set_ios+0x0/0x6a8) from [<802fe730>] (sdhci_set_ios+0x28/0x34)
> > [    3.422213] [<802fe708>] (sdhci_set_ios+0x0/0x34) from [<802efa6c>] (mmc_power_up+0xb8/0xc4)
> > [    3.422224]  r5:b405e2ac r4:b405e000
> > [    3.422239] [<802ef9b4>] (mmc_power_up+0x0/0xc4) from [<802f06b4>] (mmc_start_host+0x30/0x44)
> > [    3.422255]  r7:b405e4b4 r6:00000000 r5:00000000 r4:b405e000
> > [    3.422267] [<802f0684>] (mmc_start_host+0x0/0x44) from [<802f0fd8>] (mmc_add_host+0x58/0x6c)
> > [    3.422277]  r5:00000000 r4:b405e000
> > [    3.422289] [<802f0f80>] (mmc_add_host+0x0/0x6c) from [<802ff188>] (sdhci_add_host+0x9f8/0xb88)
> > [    3.422299]  r5:b405e000 r4:b405e400
> > [    3.422313] [<802fe790>] (sdhci_add_host+0x0/0xb88) from [<80303dd8>] (sdhci_esdhc_imx_probe_wrapper+0x5c4/0x6a4)
> > [    3.422334]  r8:b4128400 r7:b4128410 r6:00002801 r5:b4394110 r4:b405e400
> > [    3.422356] [<80303814>] (sdhci_esdhc_imx_probe_wrapper+0x0/0x6a4) from [<8004fe34>] (async_run_entry_fn+0xf0/0x20c)
> > [    3.422370] [<8004fd44>] (async_run_entry_fn+0x0/0x20c) from [<8004219c>] (process_one_work+0x380/0x668)
> > [    3.422393]  r9:00000089 r8:00000000 r7:b4004600 r6:8064ca40 r5:b4391e08
> > [    3.422393] r4:b4081200
> > [    3.422406] [<80041e1c>] (process_one_work+0x0/0x668) from [<800428e0>] (worker_thread+0x290/0x3a8)
> > [    3.422423] [<80042650>] (worker_thread+0x0/0x3a8) from [<800476b0>] (kthread+0xb4/0xc0)
> > [    3.422439] [<800475fc>] (kthread+0x0/0xc0) from [<8000d970>] (ret_from_fork+0x14/0x24)
> > [    3.422455]  r7:00000000 r6:00000000 r5:800475fc r4:b409de58
> >
> > The scheduling-while-atomic bug is due to clk_get_rate() trying to lock a mutex while a spinlock is being held by sdhci_do_set_ios().
> > The sequence is:
> > sdhci_do_set_ios()
> >    spin_lock_irqsave(&host->lock, flags);
> >    sdhci_set_clock(host, ios->clock);
> >      esdhc_pltfm_set_clock(host, clock); /* called via host->ops->set_clock */
> >        esdhc_set_clock(host, clock, clk_get_rate(pltfm_host->clk));
> > The locking violation occurs when clk_get_rate tries to lock a mutex.
> >
> > The esdhc_pltfm_set_clock() routine was introduced in commit:
> > 8ba9580 mmc: sdhci-esdhc: calculate sdclk divider from controller clock
> >
> > Would you please have a look at this issue?
> 
> Any opinions on this?
> 
I've had a brief look into this and IMHO it's totally bogus that the
SDHCI framework is protecting such potentially long running sections
with a spinlock.

I'll spin a patch to work around the issue for now, but honestly in the
end the spinlock should go away.

Regards,
Lucas

-- 
Pengutronix e.K.                           | Lucas Stach                 |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-5076 |
Amtsgericht Hildesheim, HRA 2686           | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2013-10-15  6:38 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-10-09  6:10 BUG: imx6: scheduling-while-atomic bug detected with commit: 8ba9580 Wang, Jiada (ESD)
2013-10-15  6:29 ` Dirk Behme
2013-10-15  6:38   ` Lucas Stach

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).