Kernel panic with 3.18.7-rt2

linux-rt-users.vger.kernel.org archive mirror

* Kernel panic with 3.18.7-rt2 - rootfs at MMC
@ 2015-02-24  9:46 Michal Šmucr
  2015-02-26 11:29 ` [RT PATCH] mmc: sdhci: don't provide hard irq handler Sebastian Andrzej Siewior
  2015-03-02  1:25 ` Kernel panic with 3.18.7-rt2 - rootfs at MMC Paul Gortmaker
  0 siblings, 2 replies; 8+ messages in thread
From: Michal Šmucr @ 2015-02-24  9:46 UTC (permalink / raw)
  To: linux-rt-users

[-- Attachment #1: Type: text/plain, Size: 550 bytes --]

Hello to all,

I'm getting kernel panics when using kernel 3.18.7 with the recently
released rt1 and rt2 patches. The platform is x86_64.
It happens only when the root filesystem is on an SDHC card.
With 3.14.31-rt28 or recent plain vanilla kernels I haven't seen this
issue.

The boot panic can be worked around by adding the kernel parameter
"scsi_mod.scan=sync", but the panic still happens during an initrd
rebuild using update-initramfs on Debian Jessie.

Log from panic is attached.

Thanks for any comments on the issue or on a better way to report bugs,

Michal

[-- Attachment #2: panic-mmc-3.18.7-rt2.log --]
[-- Type: text/plain, Size: 3591 bytes --]

[    3.184053] task: ffff880036e17170 ti: ffff880036f4c000 task.ti: ffff880036f4c000
[    3.184076] RIP: 0010:[<ffffffff816d2168>]  [<ffffffff816d2168>] rt_spin_lock_slowlock+0x238/0x250
[    3.184082] RSP: 0018:ffff880079403dc8  EFLAGS: 00010046
[    3.184086] RAX: ffff880036e17170 RBX: ffff880075394790 RCX: 0000000000000001
[    3.184090] RDX: 0000000000000000 RSI: ffff880036e17170 RDI: ffff880075394790
[    3.184095] RBP: ffff880079403de0 R08: ffff880036e17170 R09: ffff8800753947a8
[    3.184099] R10: ffff880036e17171 R11: 0000000000000000 R12: ffff880036e17170
[    3.184103] R13: 0000000000000000 R14: ffff88007a9a2e00 R15: ffff880075394600
[    3.184110] FS:  0000000000000000(0000) GS:ffff880079400000(0000) knlGS:0000000000000000
[    3.184115] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    3.184119] CR2: 00007f2224bd7000 CR3: 0000000036da1000 CR4: 00000000001007f0
[    3.184121] Stack:
[    3.184132]  ffff8800751ce720 ffffffff81c575c0 ffffffff81d13480 ffff880079403de0
[    3.184140]  0000000000000000 0000000000000086 ffff880079403df8 ffff88007940dac0
[    3.184148]  ffffffff81c575c0 0000000000000000 0000000000000086 0000000000000001
[    3.184150] Call Trace:
[    3.184160]  <IRQ>
[    3.184172]  [<ffffffff8158e79f>] ? sdhci_irq+0x2f/0x930
[    3.184186]  [<ffffffff810a7d1f>] ? dequeue_rt_stack+0x4f/0x230
[    3.184198]  [<ffffffff810c13ad>] ? handle_irq_event_percpu+0x4d/0x200
[    3.184207]  [<ffffffff816d075d>] ? preempt_schedule_irq+0x3d/0x70
[    3.184217]  [<ffffffff810c15b4>] ? handle_irq_event+0x54/0x80
[    3.184227]  [<ffffffff810c45f7>] ? handle_fasteoi_irq+0xd7/0x1a0
[    3.184240]  [<ffffffff810176cd>] ? handle_irq+0x1d/0x30
[    3.184250]  [<ffffffff816d6bc4>] ? do_IRQ+0x54/0xf0
[    3.184260]  [<ffffffff816d4a6d>] ? common_interrupt+0x6d/0x6d
[    3.184265]  <EOI>
[    3.184278]  [<ffffffff812f4fe9>] ? check_preemption_disabled+0xa9/0x160
[    3.184288]  [<ffffffff816d075d>] ? preempt_schedule_irq+0x3d/0x70
[    3.184299]  [<ffffffff816d4c50>] ? retint_kernel+0x20/0x30
[    3.184313]  [<ffffffff8158d138>] ? sdhci_send_command+0x348/0xd10
[    3.184324]  [<ffffffff8158e098>] ? sdhci_request+0xc8/0x1f0
[    3.184337]  [<ffffffff8157be85>] ? mmc_start_req+0x305/0x400
[    3.184353]  [<ffffffffa001d9a6>] ? mmc_blk_rw_rq_prep+0x1f6/0x4a0 [mmc_block]
[    3.184366]  [<ffffffffa001fca0>] ? mmc_blk_issue_rw_rq+0xe0/0xb80 [mmc_block]
[    3.184379]  [<ffffffff81099400>] ? wake_up_state+0x20/0x20
[    3.184392]  [<ffffffffa002094c>] ? mmc_blk_issue_rq+0x20c/0x550 [mmc_block]
[    3.184404]  [<ffffffff8106d792>] ? unpin_current_cpu+0x12/0x70
[    3.184416]  [<ffffffffa0020d70>] ? mmc_queue_thread+0xe0/0x1f0 [mmc_block]
[    3.184429]  [<ffffffffa0020c90>] ? mmc_blk_issue_rq+0x550/0x550 [mmc_block]
[    3.184439]  [<ffffffff8108c165>] ? kthread+0xc5/0xe0
[    3.184449]  [<ffffffff8108c0a0>] ? kthread_worker_fn+0x1d0/0x1d0
[    3.184459]  [<ffffffff816d3d7c>] ? ret_from_fork+0x7c/0xb0
[    3.184469]  [<ffffffff8108c0a0>] ? kthread_worker_fn+0x1d0/0x1d0
[    3.184558] Code: 00 00 00 e8 db 63 9f ff e9 b3 fe ff ff 66 0f 1f 44 00 00 48 8b 43 10 48 3b 58 38 75 21 48 39 e8 75 a7 0f 0b 0f 1f 80 00 00 00 00 <0f> 0b 66 0f 1f 44 00 00 0f 0b 0f 0b e8 b9 67 c1 ff eb ac e8 10
[    3.184567] RIP  [<ffffffff816d2168>] rt_spin_lock_slowlock+0x238/0x250
[    3.184570]  RSP <ffff880079403dc8>
[    3.559610] ---[ end trace 0000000000000002 ]---
[    3.559615] Kernel panic - not syncing: Fatal exception in interrupt
[    3.559936] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)


* [RT PATCH] mmc: sdhci: don't provide hard irq handler
  2015-02-24  9:46 Kernel panic with 3.18.7-rt2 - rootfs at MMC Michal Šmucr
@ 2015-02-26 11:29 ` Sebastian Andrzej Siewior
  2015-02-27  9:33   ` Michal Šmucr
  2015-03-02  1:25 ` Kernel panic with 3.18.7-rt2 - rootfs at MMC Paul Gortmaker
  1 sibling, 1 reply; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-02-26 11:29 UTC (permalink / raw)
  To: Michal Šmucr; +Cc: linux-rt-users, linux-mmc, Chris Ball

The sdhci code provides both irq handlers: the primary and the threaded
handler. Initially the primary handler was meant to be very
short.
The result now is that on -RT the primary handler grabs locks in hard
irq context, and this isn't really working. As a hack for now I just push
both handlers into threaded mode.

Reported-By: Michal Šmucr <msmucr@gmail.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
The "same thing" was reported against the iwlwifi driver
(request_threaded_irq(…, iwl_pcie_isr, iwl_pcie_irq_handler, …)) and they
managed to rework it so that the primary handler does nothing that would
break -RT. Besides sdhci there are a few other drivers in the
same tree doing similar things.
I'm not sure what to do here in general. Motivate upstream maintainers
to rework their code, or introduce IRQF_RT_SAFE and do the conversion
for the others as in the patch below.

Michal: This is untested but should fix the reported issue.

 drivers/mmc/host/sdhci.c | 32 +++++++++++++++++++++++++++-----
 1 file changed, 27 insertions(+), 5 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index 023c2010cd75..bcde53774bc9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -2565,6 +2565,31 @@ static irqreturn_t sdhci_thread_irq(int irq, void *dev_id)
 	return isr ? IRQ_HANDLED : IRQ_NONE;
 }
 
+#ifdef CONFIG_PREEMPT_RT_BASE
+static irqreturn_t sdhci_rt_irq(int irq, void *dev_id)
+{
+	irqreturn_t ret;
+
+	local_bh_disable();
+	ret = sdhci_irq(irq, dev_id);
+	local_bh_enable();
+	if (ret == IRQ_WAKE_THREAD)
+		ret = sdhci_thread_irq(irq, dev_id);
+	return ret;
+}
+#endif
+
+static int sdhci_req_irq(struct sdhci_host *host)
+{
+#ifdef CONFIG_PREEMPT_RT_BASE
+	return request_threaded_irq(host->irq, NULL, sdhci_rt_irq,
+				    IRQF_SHARED, mmc_hostname(host->mmc), host);
+#else
+	return request_threaded_irq(host->irq, sdhci_irq, sdhci_thread_irq,
+				    IRQF_SHARED, mmc_hostname(host->mmc), host);
+#endif
+}
+
 /*****************************************************************************\
  *                                                                           *
  * Suspend/resume                                                            *
@@ -2632,9 +2657,7 @@ int sdhci_resume_host(struct sdhci_host *host)
 	}
 
 	if (!device_may_wakeup(mmc_dev(host->mmc))) {
-		ret = request_threaded_irq(host->irq, sdhci_irq,
-					   sdhci_thread_irq, IRQF_SHARED,
-					   mmc_hostname(host->mmc), host);
+		ret = sdhci_req_irq(host);
 		if (ret)
 			return ret;
 	} else {
@@ -3253,8 +3276,7 @@ int sdhci_add_host(struct sdhci_host *host)
 
 	sdhci_init(host, 0);
 
-	ret = request_threaded_irq(host->irq, sdhci_irq, sdhci_thread_irq,
-				   IRQF_SHARED,	mmc_hostname(mmc), host);
+	ret = sdhci_req_irq(host);
 	if (ret) {
 		pr_err("%s: Failed to request IRQ %d: %d\n",
 		       mmc_hostname(mmc), host->irq, ret);
-- 
2.1.4
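
Editorial sketch, not part of the patch: the control flow of sdhci_rt_irq()
above can be tried out in userspace with a toy model. Everything prefixed
model_ or MODEL_ is hypothetical and stands in for the real driver; only the
irqreturn_t values mirror the kernel's. The model assumes a primary part that
either finishes the work or asks for the threaded part, and it leaves out the
local_bh_disable()/local_bh_enable() pair that the patch wraps around
sdhci_irq().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's irqreturn_t. */
typedef enum {
	IRQ_NONE = 0,        /* interrupt was not from this device */
	IRQ_HANDLED = 1,     /* interrupt was handled */
	IRQ_WAKE_THREAD = 2, /* primary part asks for the threaded part */
} irqreturn_t;

/* Hypothetical interrupt sources of the toy device. */
#define MODEL_INT_CMD  0x1u /* finished entirely in the primary part */
#define MODEL_INT_CARD 0x2u /* deferred to the threaded part */

/* Hypothetical miniature host state; not the real struct sdhci_host. */
struct model_host {
	unsigned int pending;    /* status bits raised by the "hardware" */
	unsigned int thread_isr; /* bits handed over to the threaded part */
	int thread_runs;         /* how often the threaded part ran */
};

/* Plays the role of sdhci_irq(): acknowledge, maybe defer. */
static irqreturn_t model_primary(struct model_host *host)
{
	irqreturn_t ret = IRQ_NONE;

	if (host->pending & MODEL_INT_CMD) {
		host->pending &= ~MODEL_INT_CMD;
		ret = IRQ_HANDLED;
	}
	if (host->pending & MODEL_INT_CARD) {
		host->pending &= ~MODEL_INT_CARD;
		host->thread_isr |= MODEL_INT_CARD;
		ret = IRQ_WAKE_THREAD;
	}
	return ret;
}

/* Plays the role of sdhci_thread_irq(): consume the deferred bits. */
static irqreturn_t model_thread(struct model_host *host)
{
	unsigned int isr = host->thread_isr;

	host->thread_isr = 0;
	host->thread_runs++;
	return isr ? IRQ_HANDLED : IRQ_NONE;
}

/*
 * Plays the role of sdhci_rt_irq(): both parts run back to back in one
 * function, which the patch registers as the threaded handler.
 */
static irqreturn_t model_rt_irq(struct model_host *host)
{
	irqreturn_t ret;

	assert(host != NULL);
	ret = model_primary(host);
	if (ret == IRQ_WAKE_THREAD)
		ret = model_thread(host);
	return ret;
}
```

The point of the shape is that the caller only ever sees IRQ_NONE or
IRQ_HANDLED: the IRQ_WAKE_THREAD hand-over is resolved inside the wrapper,
since the wrapper already runs as the threaded handler.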



* Re: [RT PATCH] mmc: sdhci: don't provide hard irq handler
  2015-02-26 11:29 ` [RT PATCH] mmc: sdhci: don't provide hard irq handler Sebastian Andrzej Siewior
@ 2015-02-27  9:33   ` Michal Šmucr
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Šmucr @ 2015-02-27  9:33 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior; +Cc: linux-rt-users, linux-mmc, Chris Ball

On 26.2.2015 12:29, Sebastian Andrzej Siewior wrote:
> The sdhci code provides both irq handlers: the primary and the threaded
> handler. Initially the primary handler was meant to be very
> short.
> The result now is that on -RT the primary handler grabs locks in hard
> irq context, and this isn't really working. As a hack for now I just push
> both handlers into threaded mode.
>
> Reported-By: Michal Šmucr <msmucr@gmail.com>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> The "same thing" was reported against the iwlwifi driver
> (request_threaded_irq(…, iwl_pcie_isr, iwl_pcie_irq_handler, …)) and they
> managed to rework it so that the primary handler does nothing that would
> break -RT. Besides sdhci there are a few other drivers in the
> same tree doing similar things.
> I'm not sure what to do here in general. Motivate upstream maintainers
> to rework their code, or introduce IRQF_RT_SAFE and do the conversion
> for the others as in the patch below.
>
> Michal: This is untested but should fix the reported issue.
>

Sebastian,

Thank you very much for the patch and the explanation of what's going on.
It is very handy to know for possible similar issues with other modules.
The trick with forced threaded handlers seems to be working well, and I
haven't been able to crash it again in any of my tests.

Michal




* Re: Kernel panic with 3.18.7-rt2 - rootfs at MMC
  2015-02-24  9:46 Kernel panic with 3.18.7-rt2 - rootfs at MMC Michal Šmucr
  2015-02-26 11:29 ` [RT PATCH] mmc: sdhci: don't provide hard irq handler Sebastian Andrzej Siewior
@ 2015-03-02  1:25 ` Paul Gortmaker
  2015-03-02  1:31   ` Paul Gortmaker
  2015-03-02  3:34   ` Ralf Mardorf
  1 sibling, 2 replies; 8+ messages in thread
From: Paul Gortmaker @ 2015-03-02  1:25 UTC (permalink / raw)
  To: Michal Šmucr; +Cc: linux-rt-users

On Tue, Feb 24, 2015 at 4:46 AM, Michal Šmucr <msmucr@gmail.com> wrote:
> Hello to all,
>
> I'm getting kernel panics when using kernel 3.18.7 with the recently
> released rt1 and rt2 patches. The platform is x86_64.
> It happens only when the root filesystem is on an SDHC card.
> With 3.14.31-rt28 or recent plain vanilla kernels I haven't seen this
> issue.
>
> The boot panic can be worked around by adding the kernel parameter
> "scsi_mod.scan=sync", but the panic still happens during an initrd
> rebuild using update-initramfs on Debian Jessie.
>
> Log from panic is attached.

It seems sdhci_irq is operating on a (non-raw) host->lock and you
are running that code in hard IRQ context, which will trigger the
alarm for operating on a sleeping lock while in atomic context
(i.e. the code is not currently -rt friendly as-is).
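
Editorial sketch: the rule described above can be written down as a tiny
userspace model. On -RT a plain spinlock_t is backed by a sleeping lock while
raw_spinlock_t keeps spinning, so only the latter may be taken in hard IRQ
context. All names below are hypothetical and exist only for illustration;
this is not how the kernel implements its checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are kernel APIs. */
enum model_ctx {
	MODEL_CTX_THREAD,  /* preemptible, may sleep (e.g. a threaded handler) */
	MODEL_CTX_HARDIRQ, /* atomic, must not sleep (a primary handler) */
};

enum model_lock_kind {
	MODEL_LOCK_RAW,      /* always spins, like raw_spinlock_t */
	MODEL_LOCK_SLEEPING, /* may sleep on contention, like spinlock_t on -RT */
};

struct model_lock {
	enum model_lock_kind kind;
};

/* Returns true when taking `lock` from context `ctx` is fine in this model. */
static bool model_lock_allowed(enum model_ctx ctx, const struct model_lock *lock)
{
	assert(lock != NULL);

	/* A lock that may sleep must not be taken where sleeping is forbidden. */
	if (ctx == MODEL_CTX_HARDIRQ && lock->kind == MODEL_LOCK_SLEEPING)
		return false;
	return true;
}
```

In this model the reported case is the MODEL_CTX_HARDIRQ plus
MODEL_LOCK_SLEEPING combination. Moving the handler into a thread changes the
context; converting the lock to a raw one would change the lock kind instead.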

>
> Thanks for any comments on the issue or on a better way to report bugs,

In the future, putting the log lines in line (vs attachment) makes
it easier to reply and comment - meaning more people will look
at the report.

P.
--

>
> Michal
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: Kernel panic with 3.18.7-rt2 - rootfs at MMC
  2015-03-02  1:25 ` Kernel panic with 3.18.7-rt2 - rootfs at MMC Paul Gortmaker
@ 2015-03-02  1:31   ` Paul Gortmaker
  2015-03-02 11:43     ` Michal Šmucr
  2015-03-02  3:34   ` Ralf Mardorf
  1 sibling, 1 reply; 8+ messages in thread
From: Paul Gortmaker @ 2015-03-02  1:31 UTC (permalink / raw)
  To: Michal Šmucr; +Cc: linux-rt-users

On Sun, Mar 1, 2015 at 8:25 PM, Paul Gortmaker
<paul.gortmaker@windriver.com> wrote:
> On Tue, Feb 24, 2015 at 4:46 AM, Michal Šmucr <msmucr@gmail.com> wrote:
>> Hello to all,
>>
>> I'm getting kernel panics when using kernel 3.18.7 with the recently
>> released rt1 and rt2 patches. The platform is x86_64.
>> It happens only when the root filesystem is on an SDHC card.
>> With 3.14.31-rt28 or recent plain vanilla kernels I haven't seen this
>> issue.
>>
>> The boot panic can be worked around by adding the kernel parameter
>> "scsi_mod.scan=sync", but the panic still happens during an initrd
>> rebuild using update-initramfs on Debian Jessie.
>>
>> Log from panic is attached.
>
> It seems sdhci_irq is operating on a (non-raw) host->lock and you
> are running that code in hard IRQ context, which will trigger the
> alarm for operating on a sleeping lock while in atomic context
> (i.e. the code is not currently -rt friendly as-is).

Aha, I see Sebastian already sent out a patch to move it out
of hard IRQ context; I didn't immediately see it since it wasn't
threaded to this discussion and so it looked unreplied to.

P.
--

>
>>
>> Thanks for any comments on the issue or on a better way to report bugs,
>
> In the future, putting the log lines in line (vs attachment) makes
> it easier to reply and comment - meaning more people will look
> at the report.
>
> P.
> --
>
>>
>> Michal


* Re: Kernel panic with 3.18.7-rt2 - rootfs at MMC
  2015-03-02  1:31   ` Paul Gortmaker
@ 2015-03-02 11:43     ` Michal Šmucr
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Šmucr @ 2015-03-02 11:43 UTC (permalink / raw)
  To: Paul Gortmaker; +Cc: linux-rt-users

On 2.3.2015 2:31, Paul Gortmaker wrote:
>> It seems sdhci_irq is operating on a (non-raw) host->lock and you
>> are running that code in hard IRQ context, which will trigger the
>> alarm for operating on a sleeping lock while in atomic context
>> (i.e. the code is not currently -rt friendly as-is).

>> In the future, putting the log lines in line (vs attachment) makes
>> it easier to reply and comment - meaning more people will look
>> at the report.

Paul,

Thank you for the comment and advice; I will post logs the recommended
way in the future.

Best regards,

Michal


* Re: Kernel panic with 3.18.7-rt2 - rootfs at MMC
  2015-03-02  1:25 ` Kernel panic with 3.18.7-rt2 - rootfs at MMC Paul Gortmaker
  2015-03-02  1:31   ` Paul Gortmaker
@ 2015-03-02  3:34   ` Ralf Mardorf
  2015-03-06 12:24     ` Sebastian Andrzej Siewior
  1 sibling, 1 reply; 8+ messages in thread
From: Ralf Mardorf @ 2015-03-02  3:34 UTC (permalink / raw)
  To: linux-rt-users

On Sun, 1 Mar 2015 20:25:24 -0500, Paul Gortmaker wrote:
>In the future, putting the log lines in line (vs attachment) makes
>it easier to reply and comment - meaning more people will look
>at the report.

I get a kernel panic with 3.18.7-rt2 on an AMD Athlon dual-core x86_64
machine, but I don't know which log files to post.

When first booting

$ pacman -Q linux-rt
linux-rt 3.18.7_rt2-3

the startup messages were as usual at first, but after a while
a large number of messages scrolled by too fast to read, ending with
something similar to "recursive fixed, restart is needed".

When booting for the second time the startup messages were as usual
at first, but then several kernel panic messages were displayed.

$ journalctl -b -1
$ journalctl -b -2
etc. would take far too long to display anything; I only hear the HDD
working.

$ journalctl --since "15 min ago"
$ journalctl -k
$ dmesg
don't show any useful output; at least they don't show the strange
startup messages.

Regards,
Ralf


* Re: Kernel panic with 3.18.7-rt2 - rootfs at MMC
  2015-03-02  3:34   ` Ralf Mardorf
@ 2015-03-06 12:24     ` Sebastian Andrzej Siewior
  0 siblings, 0 replies; 8+ messages in thread
From: Sebastian Andrzej Siewior @ 2015-03-06 12:24 UTC (permalink / raw)
  To: Ralf Mardorf; +Cc: linux-rt-users

* Ralf Mardorf | 2015-03-02 04:34:04 [+0100]:

>On Sun, 1 Mar 2015 20:25:24 -0500, Paul Gortmaker wrote:
>>In the future, putting the log lines in line (vs attachment) makes
>>it easier to reply and comment - meaning more people will look
>>at the report.

It would be useful not to reply to this thread, since from what I read
this does not seem to be MMC related.

>I get a kernel panic with 3.18.7-rt2 on an AMD Athlon dual-core x86_64
>machine, but I don't know which log files to post.

The dmesg output / kernel log would be useful. If you can get it out via
serial or logged to disk, that would be good.

>a large number of messages scrolled by too fast to read, ending with
>something similar to "recursive fixed, restart is needed".

If the system is still responsive you could "dmesg > file", and that
file would be interesting as long as there is no systemd crap in it.

>Regards,
>Ralf

Sebastian

