From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Michael Ellerman <mpe@ellerman.id.au>,
Anton Blanchard <anton@samba.org>
Subject: [PATCH 3.10 46/55] powerpc/smp: Wait until secondaries are active & online
Date: Tue, 24 Mar 2015 16:43:26 +0100 [thread overview]
Message-ID: <20150324154200.706449259@linuxfoundation.org> (raw)
In-Reply-To: <20150324154158.748418668@linuxfoundation.org>
3.10-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman <mpe@ellerman.id.au>
commit 875ebe940d77a41682c367ad799b4f39f128d3fa upstream.
Anton has a busy ppc64le KVM box where guests sometimes hit the infamous
"kernel BUG at kernel/smpboot.c:134!" issue during boot:
BUG_ON(td->cpu != smp_processor_id());
Basically a per CPU hotplug thread scheduled on the wrong CPU. The oops
output confirms it:
CPU: 0
Comm: watchdog/130
The problem is that we aren't ensuring the CPU active bit is set for the
secondary before allowing the master to continue on. The master unparks
the secondary CPU's kthreads and the scheduler looks for a CPU to run
on. It calls select_task_rq() and realises the suggested CPU is not in
the cpus_allowed mask. It then ends up in select_fallback_rq(), and
since the active bit isnt't set we choose some other CPU to run on.
This seems to have been introduced by 6acbfb96976f "sched: Fix hotplug
vs. set_cpus_allowed_ptr()", which changed from setting active before
online to setting active after online. However that was in turn fixing a
bug where other code assumed an active CPU was also online, so we can't
just revert that fix.
The simplest fix is just to spin waiting for both active & online to be
set. We already have a barrier prior to set_cpu_online() (which also
sets active), to ensure all other setup is completed before online &
active are set.
Fixes: 6acbfb96976f ("sched: Fix hotplug vs. set_cpus_allowed_ptr()")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/powerpc/kernel/smp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -544,8 +544,8 @@ int __cpuinit __cpu_up(unsigned int cpu,
if (smp_ops->give_timebase)
smp_ops->give_timebase();
- /* Wait until cpu puts itself in the online map */
- while (!cpu_online(cpu))
+ /* Wait until cpu puts itself in the online & active maps */
+ while (!cpu_online(cpu) || !cpu_active(cpu))
cpu_relax();
return 0;
next prev parent reply other threads:[~2015-03-24 15:48 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-24 15:42 [PATCH 3.10 00/55] 3.10.73-stable review Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 01/55] sparc32: destroy_context() and switch_mm() needs to disable interrupts Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 02/55] sparc: semtimedop() unreachable due to comparison error Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 03/55] sparc: perf: Remove redundant perf_pmu_{en|dis}able calls Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 04/55] sparc: perf: Make counting mode actually work Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 05/55] sparc: Touch NMI watchdog when walking cpus and calling printk Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 06/55] sparc64: Fix several bugs in memmove() Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 07/55] net: sysctl_net_core: check SNDBUF and RCVBUF for min length Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 08/55] rds: avoid potential stack overflow Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 09/55] inet_diag: fix possible overflow in inet_diag_dump_one_icsk() Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 10/55] caif: fix MSG_OOB test in caif_seqpkt_recvmsg() Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 11/55] rxrpc: bogus MSG_PEEK test in rxrpc_recvmsg() Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 12/55] Revert "net: cx82310_eth: use common match macro" Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 13/55] tcp: fix tcp fin memory accounting Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 14/55] net: compat: Update get_compat_msghdr() to match copy_msghdr_from_user() behaviour Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 15/55] tcp: make connect() mem charging friendly Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 17/55] drm/radeon: do a posting read in evergreen_set_irq Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 18/55] drm/radeon: do a posting read in r100_set_irq Greg Kroah-Hartman
2015-03-24 15:42 ` [PATCH 3.10 19/55] drm/radeon: do a posting read in r600_set_irq Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 20/55] drm/radeon: do a posting read in si_set_irq Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 21/55] drm/radeon: do a posting read in rs600_set_irq Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 23/55] fuse: set stolen page uptodate Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 24/55] fuse: notify: dont move pages Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 25/55] virtio_console: init work unconditionally Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 26/55] Change email address for 8250_pci Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 27/55] can: add missing initialisations in CAN related skbuffs Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 28/55] workqueue: fix hang involving racing cancel[_delayed]_work_sync()s for PREEMPT_NONE Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 29/55] tpm/ibmvtpm: Additional LE support for tpm_ibmvtpm_send Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 30/55] spi: pl022: Fix race in giveback() leading to driver lock-up Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 31/55] ALSA: control: Add sanity checks for user ctl id name string Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 32/55] ALSA: hda - Fix built-in mic on Compaq Presario CQ60 Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 33/55] ALSA: hda - Dont access stereo amps for mono channel widgets Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 34/55] ALSA: hda - Set single_adc_amp flag for CS420x codecs Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 35/55] ALSA: hda - Add workaround for MacBook Air 5,2 built-in mic Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 36/55] ALSA: hda - Treat stereo-to-mono mix properly Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 37/55] regulator: Only enable disabled regulators on resume Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 38/55] regulator: core: Fix enable GPIO reference counting Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 39/55] nilfs2: fix deadlock of segment constructor during recovery Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 40/55] xen-pciback: limit guest control of command register Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 41/55] libsas: Fix Kernel Crash in smp_execute_task Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 42/55] crypto: aesni - fix memory usage in GCM decryption Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 43/55] x86/fpu: Avoid math_state_restore() without used_math() in __restore_xstate_sig() Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 44/55] x86/fpu: Drop_fpu() should not assume that tsk equals current Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 45/55] x86/vdso: Fix the build on GCC5 Greg Kroah-Hartman
2015-03-24 15:43 ` Greg Kroah-Hartman [this message]
2015-03-24 15:43 ` [PATCH 3.10 47/55] ipvs: add missing ip_vs_pe_put in sync code Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 48/55] ipvs: rerouting to local clients is not needed anymore Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 49/55] ARM: at91: pm: fix at91rm9200 standby Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 50/55] target: Fix reference leak in target_get_sess_cmd() error path Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 51/55] iscsi-target: Avoid early conn_logout_comp for iser connections Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 52/55] target/pscsi: Fix NULL pointer dereference in get_device_type Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 53/55] target: Fix R_HOLDER bit usage for AllRegistrants Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 54/55] target: Allow AllRegistrants to re-RESERVE existing reservation Greg Kroah-Hartman
2015-03-24 15:43 ` [PATCH 3.10 55/55] target: Allow Write Exclusive non-reservation holders to READ Greg Kroah-Hartman
2015-03-25 2:34 ` [PATCH 3.10 00/55] 3.10.73-stable review Guenter Roeck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150324154200.706449259@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=anton@samba.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mpe@ellerman.id.au \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox