From: Frederic Weisbecker <frederic@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: "Frederic Weisbecker" <frederic@kernel.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Bjorn Helgaas" <bhelgaas@google.com>,
"Catalin Marinas" <catalin.marinas@arm.com>,
"Danilo Krummrich" <dakr@kernel.org>,
"David S . Miller" <davem@davemloft.net>,
"Eric Dumazet" <edumazet@google.com>,
"Gabriele Monaco" <gmonaco@redhat.com>,
"Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
"Ingo Molnar" <mingo@redhat.com>,
"Jakub Kicinski" <kuba@kernel.org>,
"Jens Axboe" <axboe@kernel.dk>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Lai Jiangshan" <jiangshanlai@gmail.com>,
"Marco Crivellari" <marco.crivellari@suse.com>,
"Michal Hocko" <mhocko@suse.com>,
"Muchun Song" <muchun.song@linux.dev>,
"Paolo Abeni" <pabeni@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Phil Auld" <pauld@redhat.com>,
"Rafael J . Wysocki" <rafael@kernel.org>,
"Roman Gushchin" <roman.gushchin@linux.dev>,
"Shakeel Butt" <shakeel.butt@linux.dev>,
"Simon Horman" <horms@kernel.org>, "Tejun Heo" <tj@kernel.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Waiman Long" <longman@redhat.com>,
"Will Deacon" <will@kernel.org>,
cgroups@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-block@vger.kernel.org, linux-mm@kvack.org,
linux-pci@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 02/31] cpu: Revert "cpu/hotplug: Prevent self deadlock on CPU hot-unplug"
Date: Wed, 5 Nov 2025 22:03:18 +0100
Message-ID: <20251105210348.35256-3-frederic@kernel.org>
In-Reply-To: <20251105210348.35256-1-frederic@kernel.org>
1) The commit:
2b8272ff4a70 ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug")
was added to fix an issue where the hotplug control task (BP) was
throttled between CPUHP_AP_IDLE_DEAD and CPUHP_HRTIMERS_PREPARE, waiting
in the hrtimer blind spot for the bandwidth callback queued on the dead
CPU.
2) Later on, the commit:
38685e2a0476 ("cpu/hotplug: Don't offline the last non-isolated CPU")
built on the target selection of the workqueue-offloaded CPU down
process to prevent the last CPU domain from being destroyed.
3) Finally:
5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
entirely removed the conditions for the race exposed and partially fixed
in 1). Offloading the CPU down process to a workqueue on another CPU is
therefore no longer necessary. However, the last CPU belonging to
scheduler domains must still remain online.
Therefore revert the now obsolete commit
2b8272ff4a70b866106ae13c36be7ecbef5d5da2 and move the housekeeping check
under the write-held cpu_hotplug_lock. Since HK_TYPE_DOMAIN will include
both isolcpus and cpuset isolated partitions, the hotplug lock will
synchronize against concurrent cpuset partition updates.
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
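For illustration only, here is a minimal userspace sketch of the
housekeeping test that the first hunk below adds to _cpu_down(). It models
the online and housekeeping masks as plain 64-bit words instead of
struct cpumask, and it does not model cpus_write_lock() or any other part
of the hotplug state machine. NR_CPUS_MODEL, first_common_cpu() and
model_cpu_down() are hypothetical names made up for this sketch; they are
not kernel APIs.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for nr_cpu_ids: one bit per CPU in a 64-bit word. */
#define NR_CPUS_MODEL 64u

/*
 * Return the lowest CPU set in both masks, or NR_CPUS_MODEL when the
 * intersection is empty. This imitates the "no CPU found" convention
 * that the hunk tests with ">= nr_cpu_ids".
 */
static unsigned int first_common_cpu(uint64_t a, uint64_t b)
{
	uint64_t both = a & b;

	for (unsigned int cpu = 0; cpu < NR_CPUS_MODEL; cpu++) {
		if (both & (UINT64_C(1) << cpu))
			return cpu;
	}
	return NR_CPUS_MODEL;
}

/*
 * Model of the check only: refuse with -EBUSY when no online CPU is in
 * the housekeeping mask, otherwise clear the target CPU's online bit.
 * In the kernel the test runs with the hotplug lock write-held; locking
 * is deliberately left out of this sketch.
 */
static int model_cpu_down(uint64_t *online, uint64_t housekeeping,
			  unsigned int cpu)
{
	if (first_common_cpu(*online, housekeeping) >= NR_CPUS_MODEL)
		return -EBUSY;

	*online &= ~(UINT64_C(1) << cpu);
	return 0;
}
```

With CPUs 0-1 as housekeeping, calling model_cpu_down() on CPU 2 fails with
-EBUSY when only CPUs 2-3 are online, and succeeds when CPUs 1-2 are
online.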
kernel/cpu.c | 37 +++++++++++--------------------------
1 file changed, 11 insertions(+), 26 deletions(-)
diff --git a/kernel/cpu.c b/kernel/cpu.c
index db9f6c539b28..453a806af2ee 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1410,6 +1410,16 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
 	cpus_write_lock();
+	/*
+	 * Keep at least one housekeeping cpu onlined to avoid generating
+	 * an empty sched_domain span.
+	 */
+	if (cpumask_any_and(cpu_online_mask,
+			    housekeeping_cpumask(HK_TYPE_DOMAIN)) >= nr_cpu_ids) {
+		ret = -EBUSY;
+		goto out;
+	}
+
 	cpuhp_tasks_frozen = tasks_frozen;
 	prev_state = cpuhp_set_state(cpu, st, target);
@@ -1456,22 +1466,8 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen,
 	return ret;
 }
-struct cpu_down_work {
-	unsigned int cpu;
-	enum cpuhp_state target;
-};
-
-static long __cpu_down_maps_locked(void *arg)
-{
-	struct cpu_down_work *work = arg;
-
-	return _cpu_down(work->cpu, 0, work->target);
-}
-
 static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
 {
-	struct cpu_down_work work = { .cpu = cpu, .target = target, };
-
 	/*
 	 * If the platform does not support hotplug, report it explicitly to
 	 * differentiate it from a transient offlining failure.
@@ -1480,18 +1476,7 @@ static int cpu_down_maps_locked(unsigned int cpu, enum cpuhp_state target)
 		return -EOPNOTSUPP;
 	if (cpu_hotplug_disabled)
 		return -EBUSY;
-
-	/*
-	 * Ensure that the control task does not run on the to be offlined
-	 * CPU to prevent a deadlock against cfs_b->period_timer.
-	 * Also keep at least one housekeeping cpu onlined to avoid generating
-	 * an empty sched_domain span.
-	 */
-	for_each_cpu_and(cpu, cpu_online_mask, housekeeping_cpumask(HK_TYPE_DOMAIN)) {
-		if (cpu != work.cpu)
-			return work_on_cpu(cpu, __cpu_down_maps_locked, &work);
-	}
-	return -EBUSY;
+	return _cpu_down(cpu, 0, target);
 }
 static int cpu_down(unsigned int cpu, enum cpuhp_state target)
--
2.51.0