From: Sasha Levin <sashal@kernel.org>
To: linux-kernel@vger.kernel.org, stable@vger.kernel.org
Cc: Qian Cai <cai@lca.pw>, Borislav Petkov <bp@suse.de>,
    "Rafael J . Wysocki" <rafael.j.wysocki@intel.com>,
    Sasha Levin <sashal@kernel.org>,
    linux-pm@vger.kernel.org, linux-acpi@vger.kernel.org,
    devel@acpica.org
Subject: [PATCH AUTOSEL 4.19 26/40] x86: ACPI: fix CPU hotplug deadlock
Date: Wed, 15 Apr 2020 07:46:09 -0400
Message-ID: <20200415114623.14972-26-sashal@kernel.org>
In-Reply-To: <20200415114623.14972-1-sashal@kernel.org>

From: Qian Cai <cai@lca.pw>

[ Upstream commit 696ac2e3bf267f5a2b2ed7d34e64131f2287d0ad ]

Similar to commit 0266d81e9bf5 ("acpi/processor: Prevent cpu hotplug
deadlock") except this is for acpi_processor_ffh_cstate_probe():

"The problem is that the work is scheduled on the current CPU from the
hotplug thread associated with that CPU.

It's not required to invoke these functions via the workqueue because
the hotplug thread runs on the target CPU already.

Check whether current is a per cpu thread pinned on the target CPU and
invoke the function directly to avoid the workqueue."
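
The dispatch rule described in the quoted text can be sketched as a small userspace model. This is only an illustration, not kernel code: current_cpu, current_is_percpu, queued_calls, probe(), model_work_on_cpu() and model_call_on_cpu() are hypothetical stand-ins for smp_processor_id(), is_percpu_thread(), work_on_cpu() and call_on_cpu(), and the model returns long throughout for simplicity.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for kernel state; none of these are kernel APIs. */
static int  current_cpu = 1;           /* models smp_processor_id() */
static bool current_is_percpu = true;  /* models is_percpu_thread() */
static int  queued_calls;              /* calls routed through the modelled workqueue */

/* Example callback: returns the int that arg points to. */
static long probe(void *arg)
{
	return *(int *)arg;
}

/*
 * Models work_on_cpu(). The real function queues the work on the target
 * CPU and waits for it to finish; that wait is what lockdep flags when
 * the caller is the hotplug thread. Here we only count the call.
 */
static long model_work_on_cpu(int cpu, long (*fn)(void *), void *arg)
{
	(void)cpu;
	queued_calls++;
	return fn(arg);
}

/*
 * Models the check from the patch: call fn directly when asked to, or
 * when the caller is a per-CPU thread already on the target CPU.
 * Otherwise fall back to the workqueue path.
 */
static long model_call_on_cpu(int cpu, long (*fn)(void *), void *arg,
			      bool direct)
{
	if (direct || (current_is_percpu && cpu == current_cpu))
		return fn(arg);
	return model_work_on_cpu(cpu, fn, arg);
}
```

In this model, a caller that is a per-CPU thread on CPU 1 and targets CPU 1 never reaches model_work_on_cpu(), which corresponds to the cpuhp thread skipping the workqueue flush in the trace below.
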
WARNING: possible circular locking dependency detected
------------------------------------------------------
cpuhp/1/15 is trying to acquire lock:
ffffc90003447a28 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: __flush_work+0x4c6/0x630
but task is already holding lock:
ffffffffafa1c0e8 (cpuidle_lock){+.+.}-{3:3}, at: cpuidle_pause_and_lock+0x17/0x20
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (cpu_hotplug_lock){++++}-{0:0}:
cpus_read_lock+0x3e/0xc0
irq_calc_affinity_vectors+0x5f/0x91
__pci_enable_msix_range+0x10f/0x9a0
pci_alloc_irq_vectors_affinity+0x13e/0x1f0
pci_alloc_irq_vectors_affinity at drivers/pci/msi.c:1208
pqi_ctrl_init+0x72f/0x1618 [smartpqi]
pqi_pci_probe.cold.63+0x882/0x892 [smartpqi]
local_pci_probe+0x7a/0xc0
work_for_cpu_fn+0x2e/0x50
process_one_work+0x57e/0xb90
worker_thread+0x363/0x5b0
kthread+0x1f4/0x220
ret_from_fork+0x27/0x50
-> #0 ((work_completion)(&wfc.work)){+.+.}-{0:0}:
__lock_acquire+0x2244/0x32a0
lock_acquire+0x1a2/0x680
__flush_work+0x4e6/0x630
work_on_cpu+0x114/0x160
acpi_processor_ffh_cstate_probe+0x129/0x250
acpi_processor_evaluate_cst+0x4c8/0x580
acpi_processor_get_power_info+0x86/0x740
acpi_processor_hotplug+0xc3/0x140
acpi_soft_cpu_online+0x102/0x1d0
cpuhp_invoke_callback+0x197/0x1120
cpuhp_thread_fun+0x252/0x2f0
smpboot_thread_fn+0x255/0x440
kthread+0x1f4/0x220
ret_from_fork+0x27/0x50
other info that might help us debug this:
Chain exists of:
(work_completion)(&wfc.work) --> cpuhp_state-up --> cpuidle_lock
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(cpuidle_lock);
                               lock(cpuhp_state-up);
                               lock(cpuidle_lock);
  lock((work_completion)(&wfc.work));
*** DEADLOCK ***
3 locks held by cpuhp/1/15:
#0: ffffffffaf51ab10 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x69/0x2f0
#1: ffffffffaf51ad40 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x69/0x2f0
#2: ffffffffafa1c0e8 (cpuidle_lock){+.+.}-{3:3}, at: cpuidle_pause_and_lock+0x17/0x20
Call Trace:
dump_stack+0xa0/0xea
print_circular_bug.cold.52+0x147/0x14c
check_noncircular+0x295/0x2d0
__lock_acquire+0x2244/0x32a0
lock_acquire+0x1a2/0x680
__flush_work+0x4e6/0x630
work_on_cpu+0x114/0x160
acpi_processor_ffh_cstate_probe+0x129/0x250
acpi_processor_evaluate_cst+0x4c8/0x580
acpi_processor_get_power_info+0x86/0x740
acpi_processor_hotplug+0xc3/0x140
acpi_soft_cpu_online+0x102/0x1d0
cpuhp_invoke_callback+0x197/0x1120
cpuhp_thread_fun+0x252/0x2f0
smpboot_thread_fn+0x255/0x440
kthread+0x1f4/0x220
ret_from_fork+0x27/0x50
Signed-off-by: Qian Cai <cai@lca.pw>
Tested-by: Borislav Petkov <bp@suse.de>
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/x86/kernel/acpi/cstate.c | 3 ++-
drivers/acpi/processor_throttling.c | 7 -------
include/acpi/processor.h | 8 ++++++++
3 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/arch/x86/kernel/acpi/cstate.c b/arch/x86/kernel/acpi/cstate.c
index 158ad1483c435..92539a1c3e317 100644
--- a/arch/x86/kernel/acpi/cstate.c
+++ b/arch/x86/kernel/acpi/cstate.c
@@ -133,7 +133,8 @@ int acpi_processor_ffh_cstate_probe(unsigned int cpu,
/* Make sure we are running on right CPU */
- retval = work_on_cpu(cpu, acpi_processor_ffh_cstate_probe_cpu, cx);
+ retval = call_on_cpu(cpu, acpi_processor_ffh_cstate_probe_cpu, cx,
+ false);
if (retval == 0) {
/* Use the hint in CST */
percpu_entry->states[cx->index].eax = cx->address;
diff --git a/drivers/acpi/processor_throttling.c b/drivers/acpi/processor_throttling.c
index fbc936cf2025c..62c0fe9ef4124 100644
--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -910,13 +910,6 @@ static long __acpi_processor_get_throttling(void *data)
return pr->throttling.acpi_processor_get_throttling(pr);
}
-static int call_on_cpu(int cpu, long (*fn)(void *), void *arg, bool direct)
-{
- if (direct || (is_percpu_thread() && cpu == smp_processor_id()))
- return fn(arg);
- return work_on_cpu(cpu, fn, arg);
-}
-
static int acpi_processor_get_throttling(struct acpi_processor *pr)
{
if (!pr)
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index 1194a4c78d557..5b9eab15a1e6c 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -293,6 +293,14 @@ static inline void acpi_processor_ffh_cstate_enter(struct acpi_processor_cx
}
#endif
+static inline int call_on_cpu(int cpu, long (*fn)(void *), void *arg,
+ bool direct)
+{
+ if (direct || (is_percpu_thread() && cpu == smp_processor_id()))
+ return fn(arg);
+ return work_on_cpu(cpu, fn, arg);
+}
+
/* in processor_perflib.c */
#ifdef CONFIG_CPU_FREQ
--
2.20.1