From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Matt Redfearn <matt.redfearn@imgtec.com>,
Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>,
James Hogan <jhogan@kernel.org>
Subject: [PATCH 4.13 29/36] MIPS: SMP: Fix deadlock & online race
Date: Mon, 6 Nov 2017 10:12:42 +0100 [thread overview]
Message-ID: <20171106085048.346245389@linuxfoundation.org> (raw)
In-Reply-To: <20171106085047.005824077@linuxfoundation.org>
4.13-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matt Redfearn <matt.redfearn@imgtec.com>
commit 9e8c399a88f0b87e41a894911475ed2a8f8dff9e upstream.
Commit 6f542ebeaee0 ("MIPS: Fix race on setting and getting
cpu_online_mask") effectively reverted commit 8f46cca1e6c06 ("MIPS: SMP:
Fix possibility of deadlock when bringing CPUs online") and thus has
reinstated the possibility of deadlock.
The commit was based on testing of kernel v4.4, where the CPU hotplug
core code issued a BUG() if the starting CPU is not marked online when
the boot CPU returns from __cpu_up. The commit fixes this race (in
v4.4), but re-introduces the deadlock situation.
As noted in the commit message, upstream differs in this area. Commit
8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu bring itself fully up")
adds a completion event in the CPU hotplug core code, making this race
impossible. However, people were unhappy with relying on the core code
to do the right thing.
To address the issues both commits were trying to fix, add a second
completion event in the MIPS smp hotplug path. It removes the
possibility of a race, since the MIPS smp hotplug code now synchronises
both the boot and secondary CPUs before they return to the hotplug core
code. It also addresses the deadlock by ensuring that the secondary CPU
is not marked online before it's counters are synchronised.
This fix should also be backported to fix the race condition introduced
by the backport of commit 8f46cca1e6c06 ("MIPS: SMP: Fix possibility of
deadlock when bringing CPUs online"), through really that race only
existed before commit 8df3e07e7f21f ("cpu/hotplug: Let upcoming cpu
bring itself fully up").
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Fixes: 6f542ebeaee0 ("MIPS: Fix race on setting and getting cpu_online_mask")
CC: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com>
Patchwork: https://patchwork.linux-mips.org/patch/17376/
Signed-off-by: James Hogan <jhogan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
arch/mips/kernel/smp.c | 22 ++++++++++++++++------
1 file changed, 16 insertions(+), 6 deletions(-)
--- a/arch/mips/kernel/smp.c
+++ b/arch/mips/kernel/smp.c
@@ -66,6 +66,7 @@ EXPORT_SYMBOL(cpu_sibling_map);
cpumask_t cpu_core_map[NR_CPUS] __read_mostly;
EXPORT_SYMBOL(cpu_core_map);
+static DECLARE_COMPLETION(cpu_starting);
static DECLARE_COMPLETION(cpu_running);
/*
@@ -376,6 +377,12 @@ asmlinkage void start_secondary(void)
cpumask_set_cpu(cpu, &cpu_coherent_mask);
notify_cpu_starting(cpu);
+ /* Notify boot CPU that we're starting & ready to sync counters */
+ complete(&cpu_starting);
+
+ synchronise_count_slave(cpu);
+
+ /* The CPU is running and counters synchronised, now mark it online */
set_cpu_online(cpu, true);
set_cpu_sibling_map(cpu);
@@ -383,8 +390,11 @@ asmlinkage void start_secondary(void)
calculate_cpu_foreign_map();
+ /*
+ * Notify boot CPU that we're up & online and it can safely return
+ * from __cpu_up
+ */
complete(&cpu_running);
- synchronise_count_slave(cpu);
/*
* irq will be enabled in ->smp_finish(), enabling it too early
@@ -443,17 +453,17 @@ int __cpu_up(unsigned int cpu, struct ta
{
mp_ops->boot_secondary(cpu, tidle);
- /*
- * We must check for timeout here, as the CPU will not be marked
- * online until the counters are synchronised.
- */
- if (!wait_for_completion_timeout(&cpu_running,
+ /* Wait for CPU to start and be ready to sync counters */
+ if (!wait_for_completion_timeout(&cpu_starting,
msecs_to_jiffies(1000))) {
pr_crit("CPU%u: failed to start\n", cpu);
return -EIO;
}
synchronise_count_master(cpu);
+
+ /* Wait for CPU to finish startup & mark itself online before return */
+ wait_for_completion(&cpu_running);
return 0;
}
next prev parent reply other threads:[~2017-11-06 9:14 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-06 9:12 [PATCH 4.13 00/36] 4.13.12-stable review Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 01/36] ALSA: timer: Add missing mutex lock for compat ioctls Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 02/36] ALSA: seq: Fix nested rwsem annotation for lockdep splat Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 03/36] cifs: check MaxPathNameComponentLength != 0 before using it Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 04/36] KEYS: return full count in keyring_read() if buffer is too small Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 05/36] KEYS: trusted: fix writing past end of buffer in trusted_read() Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 06/36] KEYS: fix out-of-bounds read during ASN.1 parsing Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 07/36] ASoC: adau17x1: Workaround for noise bug in ADC Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 08/36] virtio_blk: Fix an SG_IO regression Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 09/36] PM / QoS: Fix device resume latency PM QoS Greg Kroah-Hartman
2017-11-07 0:51 ` Rafael J. Wysocki
2017-11-07 10:32 ` Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 10/36] PM / QoS: Fix default runtime_pm device resume latency Greg Kroah-Hartman
2017-11-07 0:51 ` Rafael J. Wysocki
2017-11-06 9:12 ` [PATCH 4.13 11/36] arm64: ensure __dump_instr() checks addr_limit Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 12/36] KVM: arm64: its: Fix missing dynamic allocation check in scan_its_table Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 13/36] arm/arm64: KVM: set right LR register value for 32 bit guest when inject abort Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 14/36] arm/arm64: kvm: Disable branch profiling in HYP code Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 15/36] ARM: dts: mvebu: pl310-cache disable double-linefill Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 16/36] ARM: 8715/1: add a private asm/unaligned.h Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 17/36] drm/amdgpu: return -ENOENT from uvd 6.0 early init for harvesting Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 18/36] drm/amdgpu: allow harvesting check for Polaris VCE Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 19/36] userfaultfd: hugetlbfs: prevent UFFDIO_COPY to fill beyond the end of i_size Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 20/36] ocfs2: fstrim: Fix start offset of first cluster group during fstrim Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 21/36] fs/hugetlbfs/inode.c: fix hwpoison reserve accounting Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 22/36] mm, swap: fix race between swap count continuation operations Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 25/36] Revert "powerpc64/elfv1: Only dereference function descriptor for non-text symbols" Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 26/36] MIPS: bpf: Fix a typo in build_one_insn() Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 28/36] MIPS: microMIPS: Fix incorrect mask in insn_table_MM Greg Kroah-Hartman
2017-11-06 9:12 ` Greg Kroah-Hartman [this message]
2017-11-06 9:12 ` [PATCH 4.13 30/36] Revert "x86: do not use cpufreq_quick_get() for /proc/cpuinfo "cpu MHz"" Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 31/36] x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 32/36] powerpc/kprobes: Dereference function pointers only if the address does not belong to kernel text Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 33/36] futex: Fix more put_pi_state() vs. exit_pi_state_list() races Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 34/36] perf/cgroup: Fix perf cgroup hierarchy support Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 35/36] x86/mcelog: Get rid of RCU remnants Greg Kroah-Hartman
2017-11-06 9:12 ` [PATCH 4.13 36/36] irqchip/irq-mvebu-gicp: Add missing spin_lock init Greg Kroah-Hartman
2017-11-06 21:18 ` [PATCH 4.13 00/36] 4.13.12-stable review Guenter Roeck
2017-11-06 23:27 ` Shuah Khan
2017-11-07 10:33 ` Greg Kroah-Hartman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171106085048.346245389@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=jhogan@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=matija.glavinic-pecotic.ext@nokia.com \
--cc=matt.redfearn@imgtec.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).