From: Thomas Huth <thuth@redhat.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: kvm-ppc@vger.kernel.org, anton@samba.org
Subject: BUG: sleeping function called from ras_epow_interrupt context
Date: Tue, 14 Jul 2015 18:43:18 +0000
Message-ID: <55A55846.5080904@redhat.com>
Hi all!
A colleague recently ran into some kernel BUG messages that appear when
hot-plugging a virtio disk into a KVM guest on powerpc (with "virsh
attach-disk") and, IIRC, with CONFIG_DEBUG_ATOMIC_SLEEP enabled. I tried
to reproduce the problem with an up-to-date kernel (4.2.0-rc2), and it
still seems to be there:
The hotplug action triggers ras_epow_interrupt() in
arch/powerpc/platforms/pseries/ras.c, which in turn calls
rtas_get_sensor(). That function then uses rtas_busy_delay() to wait in
case the RTAS call did not succeed immediately. But rtas_busy_delay()
uses msleep() for sleeping, which is forbidden in atomic interrupt
context!
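To illustrate the call chain described above, here is a minimal userspace
sketch in C. It is not kernel code: every model_* name is a hypothetical
stand-in invented for this illustration, and the check only loosely
mirrors what might_sleep() does when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.

```c
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins only; none of these names exist in the kernel. */
static bool model_in_atomic;   /* true while "inside" the interrupt handler */
static int  model_violations;  /* sleeps attempted while atomic */

/* Rough analogue of might_sleep(): complain if sleeping is not allowed. */
static void model_might_sleep(const char *caller)
{
    if (model_in_atomic) {
        model_violations++;
        fprintf(stderr,
                "BUG (model): sleeping function called from invalid context in %s\n",
                caller);
    }
}

/* Analogue of rtas_busy_delay(): it may sleep, so it checks first. */
static void model_busy_delay(void)
{
    model_might_sleep("model_busy_delay");
    /* a real implementation would sleep here */
}

/* Analogue of rtas_get_sensor(): waits via the sleeping delay helper. */
static void model_get_sensor(void)
{
    model_busy_delay();
}

/* Analogue of ras_epow_interrupt(): runs with the atomic flag set. */
static void model_epow_interrupt(void)
{
    model_in_atomic = true;
    model_get_sensor();
    model_in_atomic = false;
}
```

Calling model_get_sensor() directly records no violation, while reaching
it through model_epow_interrupt() records one. That is the same pattern
as in the report: the sensor read is fine from process context and only
becomes a problem once it is reached from the interrupt handler.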
The kernel prints the following backtrace:
[ 33.920528] BUG: sleeping function called from invalid context at /home/thuth/devel/linux-up/arch/powerpc/kernel/rtas.c:496
[ 33.920590] in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
[ 33.920624] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #1
[ 33.920657] Call Trace:
[ 33.920677] [c00000007ffe79b0] [c0000000007e43f4] .dump_stack+0x98/0xd4 (unreliable)
[ 33.920729] [c00000007ffe7a30] [c0000000000dcc78] .___might_sleep+0x128/0x170
[ 33.920769] [c00000007ffe7aa0] [c000000000029f38] .rtas_busy_delay+0x28/0xe0
[ 33.920809] [c00000007ffe7b20] [c00000000002adb4] .rtas_get_sensor+0x74/0xe0
[ 33.920850] [c00000007ffe7bc0] [c00000000007ff58] .ras_epow_interrupt+0x48/0x450
[ 33.920896] [c00000007ffe7c80] [c000000000119d94] .handle_irq_event_percpu+0xa4/0x310
[ 33.920942] [c00000007ffe7d70] [c00000000011a05c] .handle_irq_event+0x5c/0xa0
[ 33.920982] [c00000007ffe7e00] [c00000000011e7a8] .handle_fasteoi_irq+0xe8/0x270
[ 33.921028] [c00000007ffe7e90] [c0000000001190bc] .generic_handle_irq+0x4c/0x80
[ 33.921074] [c00000007ffe7f10] [c000000000010a48] .__do_irq+0x88/0x1f0
[ 33.921115] [c00000007ffe7f90] [c000000000022a0c] .call_do_irq+0x14/0x24
[ 33.921155] [c00000007e6f37e0] [c000000000010c3c] .do_IRQ+0x8c/0x100
[ 33.921195] [c00000007e6f3880] [c000000000002594] hardware_interrupt_common+0x114/0x180
[ 33.921243] --- interrupt: 501 at .plpar_hcall_norets+0x14/0x20
[ 33.921243] LR = .check_and_cede_processor+0x24/0x40
[ 33.921300] [c00000007e6f3b70] [0000000000000000] (null) (unreliable)
[ 33.921347] [c00000007e6f3be0] [c000000000628068] .shared_cede_loop+0x58/0x160
[ 33.921393] [c00000007e6f3c70] [c0000000006259ac] .cpuidle_enter_state+0xbc/0x3b0
[ 33.921439] [c00000007e6f3d30] [c0000000000fe32c] .call_cpuidle+0x4c/0xa0
[ 33.921479] [c00000007e6f3db0] [c0000000000fe700] .cpu_startup_entry+0x380/0x4a0
[ 33.921526] [c00000007e6f3ed0] [c000000000043110] .start_secondary+0x320/0x350
[ 33.921571] [c00000007e6f3f90] [c000000000008b6c] start_secondary_prolog+0x10/0x14
I think this bug might have been introduced by commit
587f83e8dd50d22bc0c62 ("Use rtas_get_sensor in RAS code"), since
rtas_busy_delay() was not called before that commit, as far as I can see.
Any suggestions on how to fix this? Simply revert 587f83e8dd50d? Use
mdelay() instead of msleep() in rtas_busy_delay()? Something fancier?
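As a rough way to weigh the mdelay() option, the sketch below models a
retry loop in plain userspace C. It is an illustration under stated
assumptions, not a proposed patch: pick_delay(), retry_with_delay() and
struct delay_stats are hypothetical names, and the two branches merely
stand in for what msleep() and mdelay() would do in the kernel.

```c
#include <stdbool.h>

enum delay_kind { DELAY_SLEEP, DELAY_BUSY_WAIT };

struct delay_stats {
    unsigned int slept_ms;  /* time handed back to the scheduler */
    unsigned int spun_ms;   /* time burned while holding the CPU */
};

/* Hypothetical helper: choose the delay primitive from the calling
 * context. Sleeping is only allowed outside atomic context. */
static enum delay_kind pick_delay(bool atomic_context)
{
    return atomic_context ? DELAY_BUSY_WAIT : DELAY_SLEEP;
}

/* Model of a retry loop: the fake call reports "busy" busy_rounds times
 * before it succeeds, and each busy round costs delay_ms of waiting. */
static void retry_with_delay(bool atomic_context, unsigned int busy_rounds,
                             unsigned int delay_ms, struct delay_stats *st)
{
    for (unsigned int i = 0; i < busy_rounds; i++) {
        if (pick_delay(atomic_context) == DELAY_BUSY_WAIT)
            st->spun_ms += delay_ms;
        else
            st->slept_ms += delay_ms;
    }
}
```

With three busy rounds of 10 ms each, the atomic caller accumulates
30 ms of spinning and no sleeping. Busy-waiting avoids the illegal
sleep, but every millisecond of waiting is then spent inside the
interrupt handler.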
Thanks,
Thomas