All of lore.kernel.org
 help / color / mirror / Atom feed
From: Thomas Huth <thuth@redhat.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: kvm-ppc@vger.kernel.org, anton@samba.org
Subject: BUG: sleeping function called from ras_epow_interrupt context
Date: Tue, 14 Jul 2015 18:43:18 +0000	[thread overview]
Message-ID: <55A55846.5080904@redhat.com> (raw)


 Hi all!

A colleague recently ran into some kernel BUG messages that happen when
hot-plugging a virtio disk to a KVM guest on powerpc (with "virsh
attach-disk"), and IIRC CONFIG_DEBUG_ATOMIC_SLEEP enabled. I've tried to
re-create the problem with an up-to-date kernel (4.2.0-rc2) and the
problem still seems to be there:

The hotplug action triggers the ras_epow_interrupt() in
arch/powerpc/platforms/pseries/ras.c, which again calls
rtas_get_sensor(). That function then uses rtas_busy_delay() to wait in
case the RTAS call did not succeed immediately. But rtas_busy_delay()
uses msleep() for sleeping - which is forbidden during an atomic
interrupt context!

Following backtrace is printed out by the kernel:

[   33.920528] BUG: sleeping function called from invalid context at
/home/thuth/devel/linux-up/arch/powerpc/kernel/rtas.c:496
[   33.920590] in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
[   33.920624] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #1
[   33.920657] Call Trace:
[   33.920677] [c00000007ffe79b0] [c0000000007e43f4]
.dump_stack+0x98/0xd4 (unreliable)
[   33.920729] [c00000007ffe7a30] [c0000000000dcc78]
.___might_sleep+0x128/0x170
[   33.920769] [c00000007ffe7aa0] [c000000000029f38]
.rtas_busy_delay+0x28/0xe0
[   33.920809] [c00000007ffe7b20] [c00000000002adb4]
.rtas_get_sensor+0x74/0xe0
[   33.920850] [c00000007ffe7bc0] [c00000000007ff58]
.ras_epow_interrupt+0x48/0x450
[   33.920896] [c00000007ffe7c80] [c000000000119d94]
.handle_irq_event_percpu+0xa4/0x310
[   33.920942] [c00000007ffe7d70] [c00000000011a05c]
.handle_irq_event+0x5c/0xa0
[   33.920982] [c00000007ffe7e00] [c00000000011e7a8]
.handle_fasteoi_irq+0xe8/0x270
[   33.921028] [c00000007ffe7e90] [c0000000001190bc]
.generic_handle_irq+0x4c/0x80
[   33.921074] [c00000007ffe7f10] [c000000000010a48] .__do_irq+0x88/0x1f0
[   33.921115] [c00000007ffe7f90] [c000000000022a0c] .call_do_irq+0x14/0x24
[   33.921155] [c00000007e6f37e0] [c000000000010c3c] .do_IRQ+0x8c/0x100
[   33.921195] [c00000007e6f3880] [c000000000002594]
hardware_interrupt_common+0x114/0x180
[   33.921243] --- interrupt: 501 at .plpar_hcall_norets+0x14/0x20
[   33.921243]     LR = .check_and_cede_processor+0x24/0x40
[   33.921300] [c00000007e6f3b70] [0000000000000000]           (null)
(unreliable)
[   33.921347] [c00000007e6f3be0] [c000000000628068]
.shared_cede_loop+0x58/0x160
[   33.921393] [c00000007e6f3c70] [c0000000006259ac]
.cpuidle_enter_state+0xbc/0x3b0
[   33.921439] [c00000007e6f3d30] [c0000000000fe32c] .call_cpuidle+0x4c/0xa0
[   33.921479] [c00000007e6f3db0] [c0000000000fe700]
.cpu_startup_entry+0x380/0x4a0
[   33.921526] [c00000007e6f3ed0] [c000000000043110]
.start_secondary+0x320/0x350
[   33.921571] [c00000007e6f3f90] [c000000000008b6c]
start_secondary_prolog+0x10/0x14

I think that bug might have been introduced by commit
587f83e8dd50d22bc0c62 ("Use rtas_get_sensor in RAS code") since the
rtas_busy_delay() was not called before that commit, as far as I can see.

Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use
mdelay() instead of msleep() in rtas_busy_delay()? Something more fancy?

 Thanks,
  Thomas

WARNING: multiple messages have this Message-ID (diff)
From: Thomas Huth <thuth@redhat.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: kvm-ppc@vger.kernel.org, anton@samba.org
Subject: BUG: sleeping function called from ras_epow_interrupt context
Date: Tue, 14 Jul 2015 20:43:18 +0200	[thread overview]
Message-ID: <55A55846.5080904@redhat.com> (raw)


 Hi all!

A colleague recently ran into some kernel BUG messages that happen when
hot-plugging a virtio disk to a KVM guest on powerpc (with "virsh
attach-disk"), and IIRC CONFIG_DEBUG_ATOMIC_SLEEP enabled. I've tried to
re-create the problem with an up-to-date kernel (4.2.0-rc2) and the
problem still seems to be there:

The hotplug action triggers the ras_epow_interrupt() in
arch/powerpc/platforms/pseries/ras.c, which again calls
rtas_get_sensor(). That function then uses rtas_busy_delay() to wait in
case the RTAS call did not succeed immediately. But rtas_busy_delay()
uses msleep() for sleeping - which is forbidden during an atomic
interrupt context!

Following backtrace is printed out by the kernel:

[   33.920528] BUG: sleeping function called from invalid context at
/home/thuth/devel/linux-up/arch/powerpc/kernel/rtas.c:496
[   33.920590] in_atomic(): 1, irqs_disabled(): 1, pid: 0, name: swapper/1
[   33.920624] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.2.0-rc2-thuth #1
[   33.920657] Call Trace:
[   33.920677] [c00000007ffe79b0] [c0000000007e43f4]
.dump_stack+0x98/0xd4 (unreliable)
[   33.920729] [c00000007ffe7a30] [c0000000000dcc78]
.___might_sleep+0x128/0x170
[   33.920769] [c00000007ffe7aa0] [c000000000029f38]
.rtas_busy_delay+0x28/0xe0
[   33.920809] [c00000007ffe7b20] [c00000000002adb4]
.rtas_get_sensor+0x74/0xe0
[   33.920850] [c00000007ffe7bc0] [c00000000007ff58]
.ras_epow_interrupt+0x48/0x450
[   33.920896] [c00000007ffe7c80] [c000000000119d94]
.handle_irq_event_percpu+0xa4/0x310
[   33.920942] [c00000007ffe7d70] [c00000000011a05c]
.handle_irq_event+0x5c/0xa0
[   33.920982] [c00000007ffe7e00] [c00000000011e7a8]
.handle_fasteoi_irq+0xe8/0x270
[   33.921028] [c00000007ffe7e90] [c0000000001190bc]
.generic_handle_irq+0x4c/0x80
[   33.921074] [c00000007ffe7f10] [c000000000010a48] .__do_irq+0x88/0x1f0
[   33.921115] [c00000007ffe7f90] [c000000000022a0c] .call_do_irq+0x14/0x24
[   33.921155] [c00000007e6f37e0] [c000000000010c3c] .do_IRQ+0x8c/0x100
[   33.921195] [c00000007e6f3880] [c000000000002594]
hardware_interrupt_common+0x114/0x180
[   33.921243] --- interrupt: 501 at .plpar_hcall_norets+0x14/0x20
[   33.921243]     LR = .check_and_cede_processor+0x24/0x40
[   33.921300] [c00000007e6f3b70] [0000000000000000]           (null)
(unreliable)
[   33.921347] [c00000007e6f3be0] [c000000000628068]
.shared_cede_loop+0x58/0x160
[   33.921393] [c00000007e6f3c70] [c0000000006259ac]
.cpuidle_enter_state+0xbc/0x3b0
[   33.921439] [c00000007e6f3d30] [c0000000000fe32c] .call_cpuidle+0x4c/0xa0
[   33.921479] [c00000007e6f3db0] [c0000000000fe700]
.cpu_startup_entry+0x380/0x4a0
[   33.921526] [c00000007e6f3ed0] [c000000000043110]
.start_secondary+0x320/0x350
[   33.921571] [c00000007e6f3f90] [c000000000008b6c]
start_secondary_prolog+0x10/0x14

I think that bug might have been introduced by commit
587f83e8dd50d22bc0c62 ("Use rtas_get_sensor in RAS code") since the
rtas_busy_delay() was not called before that commit, as far as I can see.

Any suggestions how to fix this? Simply revert 587f83e8dd50d? Use
mdelay() instead of msleep() in rtas_busy_delay()? Something more fancy?

 Thanks,
  Thomas

             reply	other threads:[~2015-07-14 18:43 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-07-14 18:43 Thomas Huth [this message]
2015-07-14 18:43 ` BUG: sleeping function called from ras_epow_interrupt context Thomas Huth
2015-07-14 21:22 ` Benjamin Herrenschmidt
2015-07-14 21:22   ` Benjamin Herrenschmidt
2015-07-15 14:35   ` Thomas Huth
2015-07-15 14:35     ` Thomas Huth
2015-07-15 19:58     ` Nathan Fontenot
2015-07-15 19:58       ` Nathan Fontenot
2015-07-16  6:23       ` Thomas Huth
2015-07-16  6:23         ` Thomas Huth
2015-07-16 17:39         ` Nathan Fontenot
2015-07-16 17:39           ` Nathan Fontenot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55A55846.5080904@redhat.com \
    --to=thuth@redhat.com \
    --cc=anton@samba.org \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.