From: Frederic Weisbecker <frederic@kernel.org>
To: Gabriele Monaco <gmonaco@redhat.com>
Cc: linux-kernel@vger.kernel.org,
"Anna-Maria Behnsen" <anna-maria@linutronix.de>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Waiman Long" <longman@redhat.com>, "Tejun Heo" <tj@kernel.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
cgroups@vger.kernel.org, "John B. Wyatt IV" <jwyatt@redhat.com>,
"John B. Wyatt IV" <sageofredondo@gmail.com>
Subject: Re: [PATCH v12 9/9] timers: Exclude isolated cpus from timer migration
Date: Tue, 16 Sep 2025 15:41:16 +0200 [thread overview]
Message-ID: <aMlo_K8koZ6HH2kh@localhost.localdomain> (raw)
In-Reply-To: <20250915145920.140180-20-gmonaco@redhat.com>
Le Mon, Sep 15, 2025 at 04:59:30PM +0200, Gabriele Monaco a écrit :
> Tested-by: John B. Wyatt IV <jwyatt@redhat.com>
> Tested-by: John B. Wyatt IV <sageofredondo@gmail.com>
Two people, one tester? :-)
> +/**
> + * tmigr_isolated_exclude_cpumask - Exclude given CPUs from hierarchy
> + * @exclude_cpumask: the cpumask to be excluded from timer migration hierarchy
> + *
> + * This function can be called from cpuset code to provide the new set of
> + * isolated CPUs that should be excluded from the hierarchy.
> + * Online CPUs not present in exclude_cpumask but already excluded are brought
> + * back to the hierarchy.
> + * Functions to isolate/unisolate need to be called locally and can sleep.
> + */
> +int tmigr_isolated_exclude_cpumask(struct cpumask *exclude_cpumask)
> +{
> + struct work_struct __percpu *works __free(free_percpu) =
> + alloc_percpu(struct work_struct);
> + cpumask_var_t cpumask_unisol __free(free_cpumask_var) = CPUMASK_NULL;
> + cpumask_var_t cpumask_isol __free(free_cpumask_var) = CPUMASK_NULL;
> + int cpu;
> +
> + lockdep_assert_cpus_held();
> +
> + if (!alloc_cpumask_var(&cpumask_isol, GFP_KERNEL))
> + return -ENOMEM;
> + if (!alloc_cpumask_var(&cpumask_unisol, GFP_KERNEL))
> + return -ENOMEM;
> + if (!works)
> + return -ENOMEM;
> +
> + cpumask_andnot(cpumask_unisol, cpu_online_mask, exclude_cpumask);
> + cpumask_andnot(cpumask_unisol, cpumask_unisol, tmigr_available_cpumask);
> + /* Set up the mask earlier to avoid races with the migrator CPU */
> + cpumask_or(tmigr_available_cpumask, tmigr_available_cpumask, cpumask_unisol);
> + for_each_cpu(cpu, cpumask_unisol) {
> + struct work_struct *work = per_cpu_ptr(works, cpu);
> +
> + INIT_WORK(work, tmigr_cpu_unisolate);
> + schedule_work_on(cpu, work);
> + }
> +
> + cpumask_and(cpumask_isol, exclude_cpumask, tmigr_available_cpumask);
> + cpumask_and(cpumask_isol, cpumask_isol, housekeeping_cpumask(HK_TYPE_KERNEL_NOISE));
> + /*
> + * Handle this here and not in the cpuset code because exclude_cpumask
> + * might include also the tick CPU if included in isolcpus.
> + */
> + for_each_cpu(cpu, cpumask_isol) {
> + if (!tick_nohz_cpu_hotpluggable(cpu)) {
> + cpumask_clear_cpu(cpu, cpumask_isol);
> + break;
> + }
> + }
> + /* Set up the mask earlier to avoid races with the migrator CPU */
> + cpumask_andnot(tmigr_available_cpumask, tmigr_available_cpumask, cpumask_isol);
> + for_each_cpu(cpu, cpumask_isol) {
> + struct work_struct *work = per_cpu_ptr(works, cpu);
> +
> + INIT_WORK(work, tmigr_cpu_isolate);
> + schedule_work_on(cpu, work);
> + }
This is racy at various levels:
* The tmigr_available_cpumask clear can race with the cpumask_set_cpu() in
tmigr_cpu_unisolate(), risking overwrites when CPUs are on the same bitmap
chunk (bitmap operations aren't atomic).
* tmigr_cpu_isolate() and tmigr_cpu_unisolate() can now run concurrently and
then cpumask_set_cpu() can race with cpumask_clear_cpu() on
tmigr_available_cpumask, risking overwrites, though the real problem is
on the precedent point.
* tmigr_cpu_isolate() can race with tmigr_cpu_isolate() on other CPUs so
the calls to cpumask_clear_cpu() can race. That's fine because
tmigr_available_cpumask is already set to those CPUs but that still
leaves un uncomfortable taste. That would leave an excuse for KSCAN to warn
for example.
* Similar with tmigr_cpu_unisolate() racing together.
So, something like that should be added?
diff --git a/kernel/time/timer_migration.c b/kernel/time/timer_migration.c
index 08e29fc01479..6615e56c8b0d 100644
--- a/kernel/time/timer_migration.c
+++ b/kernel/time/timer_migration.c
@@ -1473,7 +1473,6 @@ static int tmigr_clear_cpu_available(unsigned int cpu)
int migrator;
u64 firstexp;
- cpumask_clear_cpu(cpu, tmigr_available_cpumask);
scoped_guard(raw_spinlock_irq, &tmc->lock) {
if (!tmc->available)
return 0;
@@ -1489,11 +1488,11 @@ static int tmigr_clear_cpu_available(unsigned int cpu)
}
if (firstexp != KTIME_MAX) {
- migrator = cpumask_any(tmigr_available_cpumask);
+ migrator = cpumask_any_but(tmigr_available_cpumask, cpu);
work_on_cpu(migrator, tmigr_trigger_active, NULL);
}
- return 0;
+ return 1;
}
static int tmigr_set_cpu_available(unsigned int cpu)
@@ -1506,7 +1505,7 @@ static int tmigr_set_cpu_available(unsigned int cpu)
if (tmigr_is_isolated(cpu))
return 0;
- cpumask_set_cpu(cpu, tmigr_available_cpumask);
+
scoped_guard(raw_spinlock_irq, &tmc->lock) {
if (tmc->available)
return 0;
@@ -1516,7 +1515,19 @@ static int tmigr_set_cpu_available(unsigned int cpu)
__tmigr_cpu_activate(tmc);
tmc->available = true;
}
- return 0;
+ return 1;
+}
+
+static int tmigr_online_cpu(unsigned int cpu)
+{
+ if (tmigr_set_cpu_available(cpu) > 0)
+ cpumask_set_cpu(cpu, tmigr_available_cpumask);
+}
+
+static int tmigr_offline_cpu(unsigned int cpu)
+{
+ if (tmigr_clear_cpu_available(cpu) > 0)
+ cpumask_clear_cpu(cpu, tmigr_available_cpumask);
}
static void tmigr_cpu_isolate(struct work_struct *ignored)
@@ -1601,8 +1612,7 @@ int tmigr_isolated_exclude_cpumask(struct cpumask *exclude_cpumask)
static int __init tmigr_late_init(void)
{
return cpuhp_setup_state(CPUHP_AP_TMIGR_ONLINE, "tmigr:online",
- tmigr_set_cpu_available,
- tmigr_clear_cpu_available);
+ tmigr_online_cpu, tmigr_offline_cpu);
}
static void tmigr_init_group(struct tmigr_group *group, unsigned int lvl,
--
Frederic Weisbecker
SUSE Labs
prev parent reply other threads:[~2025-09-16 13:41 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-09-15 14:59 [PATCH v12 0/9] timers: Exclude isolated cpus from timer migration Gabriele Monaco
2025-09-15 14:59 ` [PATCH v12 1/9] timers/migration: Postpone online/offline callbacks registration to late initcall Gabriele Monaco
2025-09-15 14:59 ` [PATCH v12 2/9] timers: Rename tmigr 'online' bit to 'available' Gabriele Monaco
2025-09-15 14:59 ` [PATCH v12 3/9] timers: Add the available mask in timer migration Gabriele Monaco
2025-09-15 14:59 ` [PATCH v12 4/9] timers: Use scoped_guard when setting/clearing the tmigr available flag Gabriele Monaco
2025-09-15 14:59 ` [PATCH v12 5/9] cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_exclusion_cpumasks() Gabriele Monaco
2025-09-15 14:59 ` [PATCH v12 6/9] sched/isolation: Force housekeeping if isolcpus and nohz_full don't leave any Gabriele Monaco
2025-09-15 14:59 ` [PATCH v12 7/9] cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping Gabriele Monaco
2025-09-15 20:43 ` Waiman Long
2025-09-16 8:31 ` Chen Ridong
2025-09-15 14:59 ` [PATCH v12 8/9] cpumask: Add initialiser CPUMASK_NULL to use cleanup helpers Gabriele Monaco
2025-09-15 16:04 ` Yury Norov
2025-09-15 17:02 ` Gabriele Monaco
2025-09-15 18:35 ` Yury Norov
2025-09-16 11:27 ` Frederic Weisbecker
2025-09-17 7:51 ` Gabriele Monaco
2025-09-17 11:38 ` Yury Norov
2025-09-17 12:08 ` Gabriele Monaco
2025-09-17 12:34 ` Yury Norov
2025-09-19 14:58 ` Yury Norov
2025-09-22 10:07 ` Gabriele Monaco
2025-09-15 14:59 ` [PATCH v12 9/9] timers: Exclude isolated cpus from timer migration Gabriele Monaco
2025-09-15 20:51 ` John B. Wyatt IV
2025-09-16 5:29 ` Gabriele Monaco
2025-09-16 13:41 ` Frederic Weisbecker [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aMlo_K8koZ6HH2kh@localhost.localdomain \
--to=frederic@kernel.org \
--cc=anna-maria@linutronix.de \
--cc=cgroups@vger.kernel.org \
--cc=gmonaco@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=jwyatt@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mkoutny@suse.com \
--cc=sageofredondo@gmail.com \
--cc=tglx@linutronix.de \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.