From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, Yu Liao <liaoyu15@huawei.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Liu Tie <liutie4@huawei.com>, Sasha Levin <sashal@kernel.org>
Subject: [PATCH 5.4 01/67] hrtimers: Push pending hrtimers away from outgoing CPU earlier
Date: Mon, 11 Dec 2023 19:21:45 +0100
Message-ID: <20231211182015.116511505@linuxfoundation.org>
In-Reply-To: <20231211182015.049134368@linuxfoundation.org>

5.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Thomas Gleixner <tglx@linutronix.de>

[ Upstream commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 ]

2b8272ff4a70 ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug")
solved the straightforward CPU hotplug deadlock vs. the scheduler
bandwidth timer. Yu discovered a more involved variant where a task which
has a bandwidth timer started on the outgoing CPU holds a lock and then
gets throttled. If the lock is required by one of the CPU hotplug
callbacks, the hotplug operation deadlocks because the unthrottling timer
event is not handled on the dying CPU and can only be recovered once the
control CPU reaches the hotplug state which pulls the pending hrtimers
from the dead CPU.

Solve this by pushing the hrtimers away from the dying CPU in the dying
callbacks. Nothing can queue a hrtimer on the dying CPU at that point because
all other CPUs spin in stop_machine() with interrupts disabled and once the
operation is finished the CPU is marked offline.
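
For illustration only, the idea can be modelled in plain userspace C. This
is a rough sketch and not the kernel implementation: struct timer_base,
bases[], enqueue_timer(), first_active_cpu() and push_timers_away() are
hypothetical names, a sorted array stands in for the per-CPU timer queues,
and the locking, softirq handling and cross-CPU reprogramming done by the
real code are left out. Masking the dying CPU out of the active mask is an
assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS    4
#define MAX_TIMERS 16

/* Hypothetical stand-in for a per-CPU hrtimer base: a sorted expiry queue. */
struct timer_base {
	uint64_t expires[MAX_TIMERS];	/* kept in ascending order */
	size_t   nr;			/* number of pending timers */
};

static struct timer_base bases[NR_CPUS];

/* Insert one expiry time while keeping the queue sorted. */
static void enqueue_timer(struct timer_base *base, uint64_t expires)
{
	size_t i;

	assert(base->nr < MAX_TIMERS);
	for (i = base->nr; i > 0 && base->expires[i - 1] > expires; i--)
		base->expires[i] = base->expires[i - 1];
	base->expires[i] = expires;
	base->nr++;
}

/* Lowest set bit of the mask, loosely modelled on cpumask_first(); -1 if empty. */
static int first_active_cpu(unsigned int active_mask)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (active_mask & (1u << cpu))
			return cpu;
	return -1;
}

/*
 * Runs "on" the dying CPU: move every pending timer to the first CPU that
 * is still active, so that no expiry is stranded on a CPU which no longer
 * handles timer events. Returns the CPU that took the timers, or -1 when
 * no other CPU is active.
 */
static int push_timers_away(int dying_cpu, unsigned int active_mask)
{
	struct timer_base *old_base = &bases[dying_cpu];
	/* Sketch assumption: never pick the dying CPU itself as the target. */
	int ncpu = first_active_cpu(active_mask & ~(1u << dying_cpu));
	size_t i;

	if (ncpu < 0)
		return -1;

	for (i = 0; i < old_base->nr; i++)
		enqueue_timer(&bases[ncpu], old_base->expires[i]);
	old_base->nr = 0;
	return ncpu;
}
```

With timers queued on CPU 2 and CPUs 0-2 set in the mask, calling
push_timers_away(2, 0x7) merges them into CPU 0's queue in expiry order and
leaves CPU 2's queue empty.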

Reported-by: Yu Liao <liaoyu15@huawei.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Liu Tie <liutie4@huawei.com>
Link: https://lore.kernel.org/r/87a5rphara.ffs@tglx
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/cpuhotplug.h |  1 +
 include/linux/hrtimer.h    |  4 ++--
 kernel/cpu.c               |  8 +++++++-
 kernel/time/hrtimer.c      | 33 ++++++++++++---------------------
 4 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 8134cc3b99cdc..206d7ac411b88 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -141,6 +141,7 @@ enum cpuhp_state {
 	CPUHP_AP_ARM_CORESIGHT_STARTING,
 	CPUHP_AP_ARM64_ISNDEP_STARTING,
 	CPUHP_AP_SMPCFD_DYING,
+	CPUHP_AP_HRTIMERS_DYING,
 	CPUHP_AP_X86_TBOOT_DYING,
 	CPUHP_AP_ARM_CACHE_B15_RAC_DYING,
 	CPUHP_AP_ONLINE,
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 48be92aded5ee..16c68a7287bc4 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -526,9 +526,9 @@ extern void sysrq_timer_list_show(void);
 
 int hrtimers_prepare_cpu(unsigned int cpu);
 #ifdef CONFIG_HOTPLUG_CPU
-int hrtimers_dead_cpu(unsigned int cpu);
+int hrtimers_cpu_dying(unsigned int cpu);
 #else
-#define hrtimers_dead_cpu	NULL
+#define hrtimers_cpu_dying	NULL
 #endif
 
 #endif
diff --git a/kernel/cpu.c b/kernel/cpu.c
index c08456af0c7fe..ba579bb6b8978 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1473,7 +1473,7 @@ static struct cpuhp_step cpuhp_hp_states[] = {
 	[CPUHP_HRTIMERS_PREPARE] = {
 		.name			= "hrtimers:prepare",
 		.startup.single		= hrtimers_prepare_cpu,
-		.teardown.single	= hrtimers_dead_cpu,
+		.teardown.single	= NULL,
 	},
 	[CPUHP_SMPCFD_PREPARE] = {
 		.name			= "smpcfd:prepare",
@@ -1540,6 +1540,12 @@ static struct cpuhp_step cpuhp_hp_states[] = {
 		.startup.single		= NULL,
 		.teardown.single	= smpcfd_dying_cpu,
 	},
+	[CPUHP_AP_HRTIMERS_DYING] = {
+		.name			= "hrtimers:dying",
+		.startup.single		= NULL,
+		.teardown.single	= hrtimers_cpu_dying,
+	},
+
 	/* Entry state on starting. Interrupts enabled from here on. Transient
 	 * state for synchronsization */
 	[CPUHP_AP_ONLINE] = {
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 8e3c9228aec97..e2a055e462551 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -2105,29 +2105,22 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
 	}
 }
 
-int hrtimers_dead_cpu(unsigned int scpu)
+int hrtimers_cpu_dying(unsigned int dying_cpu)
 {
 	struct hrtimer_cpu_base *old_base, *new_base;
-	int i;
+	int i, ncpu = cpumask_first(cpu_active_mask);
 
-	BUG_ON(cpu_online(scpu));
-	tick_cancel_sched_timer(scpu);
+	tick_cancel_sched_timer(dying_cpu);
+
+	old_base = this_cpu_ptr(&hrtimer_bases);
+	new_base = &per_cpu(hrtimer_bases, ncpu);
 
-	/*
-	 * this BH disable ensures that raise_softirq_irqoff() does
-	 * not wakeup ksoftirqd (and acquire the pi-lock) while
-	 * holding the cpu_base lock
-	 */
-	local_bh_disable();
-	local_irq_disable();
-	old_base = &per_cpu(hrtimer_bases, scpu);
-	new_base = this_cpu_ptr(&hrtimer_bases);
 	/*
 	 * The caller is globally serialized and nobody else
 	 * takes two locks at once, deadlock is not possible.
 	 */
-	raw_spin_lock(&new_base->lock);
-	raw_spin_lock_nested(&old_base->lock, SINGLE_DEPTH_NESTING);
+	raw_spin_lock(&old_base->lock);
+	raw_spin_lock_nested(&new_base->lock, SINGLE_DEPTH_NESTING);
 
 	for (i = 0; i < HRTIMER_MAX_CLOCK_BASES; i++) {
 		migrate_hrtimer_list(&old_base->clock_base[i],
@@ -2138,15 +2131,13 @@ int hrtimers_dead_cpu(unsigned int scpu)
 	 * The migration might have changed the first expiring softirq
 	 * timer on this CPU. Update it.
 	 */
-	hrtimer_update_softirq_timer(new_base, false);
+	__hrtimer_get_next_event(new_base, HRTIMER_ACTIVE_SOFT);
+	/* Tell the other CPU to retrigger the next event */
+	smp_call_function_single(ncpu, retrigger_next_event, NULL, 0);
 
-	raw_spin_unlock(&old_base->lock);
 	raw_spin_unlock(&new_base->lock);
+	raw_spin_unlock(&old_base->lock);
 
-	/* Check, if we got expired work to do */
-	__hrtimer_peek_ahead_timers();
-	local_irq_enable();
-	local_bh_enable();
 	return 0;
 }
 
-- 
2.42.0



