From: Thomas Gleixner <tglx@linutronix.de>
To: Gabriele Monaco <gmonaco@redhat.com>,
linux-kernel@vger.kernel.org,
Anna-Maria Behnsen <anna-maria@linutronix.de>,
Frederic Weisbecker <frederic@kernel.org>,
Waiman Long <longman@redhat.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>,
"John B. Wyatt IV" <jwyatt@redhat.com>,
"John B. Wyatt IV" <sageofredondo@gmail.com>
Subject: Re: [PATCH v15 7/7] timers: Exclude isolated cpus from timer migration
Date: Wed, 19 Nov 2025 17:50:15 +0100 [thread overview]
Message-ID: <87pl9eklvc.ffs@tglx> (raw)
In-Reply-To: <20251113083324.33490-8-gmonaco@redhat.com>
On Thu, Nov 13 2025 at 09:33, Gabriele Monaco wrote:
> +/* Enabled during late initcall */
> +static DEFINE_STATIC_KEY_FALSE(tmigr_exclude_isolated);
> +
> #define TMIGR_NONE 0xFF
> #define BIT_CNT 8
>
> @@ -438,6 +442,33 @@ static inline bool tmigr_is_not_available(struct tmigr_cpu *tmc)
> return !(tmc->tmgroup && tmc->available);
> }
>
> +/*
> + * Returns true if @cpu should be excluded from the hierarchy as isolated.
> + * Domain isolated CPUs don't participate in timer migration, nohz_full CPUs
> + * are still part of the hierarchy but become idle (from a tick and timer
> + * migration perspective) when they stop their tick. This lets the timekeeping
> + * CPU handle their global timers. Marking also isolated CPUs as idle would be
> + * too costly, hence they are completely excluded from the hierarchy.
> + * This check is necessary, for instance, to prevent offline isolated CPUs from
> + * being incorrectly marked as available once getting back online.
> + *
> + * This function returns false during early boot and the isolation logic is
> + * enabled only after isolated CPUs are marked as unavailable at late boot.
> + * The tick CPU can be isolated at boot, however we cannot mark it as
> + * unavailable to avoid having no global migrator for the nohz_full CPUs. This
> + * should be ensured by the callers of this function: implicitly from hotplug
> + * callbacs and explicitly in tmigr_init_isolation and
callbacks tmigr_init_isolation()
> + * tmigr_isolated_exclude_cpumask.
tmigr_isolated_exclude_cpumask() It's documented how functions should be
denoted in comments and change logs, no?
> + */
> +static inline bool tmigr_is_isolated(int cpu)
> +{
> + if (static_branch_unlikely(&tmigr_exclude_isolated))
> + return (!housekeeping_cpu(cpu, HK_TYPE_DOMAIN) ||
> + cpuset_cpu_is_isolated(cpu)) &&
> + housekeeping_cpu(cpu, HK_TYPE_KERNEL_NOISE);
Lacks brackets on the if ()
https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#bracket-rules
Also you can make this way more readable by inverting the condition:
if (!static_branch_unlikely(&tmigr_exclude_isolated))
return false;
return .....;
No?
> + return false;
> +}
> +
> /*
> * Returns true, when @childmask corresponds to the group migrator or when the
> * group is not active - so no migrator is set.
> @@ -1439,8 +1470,9 @@ static int tmigr_clear_cpu_available(unsigned int cpu)
> int migrator;
> u64 firstexp;
>
> - cpumask_clear_cpu(cpu, tmigr_available_cpumask);
By removing this the function name does not make any sense any
more. Splitting the cpumask_clear_set() out, renaming the function
> scoped_guard(raw_spinlock_irq, &tmc->lock) {
> + if (!tmc->available)
> + return 0;
and adding this
> tmc->available = false;
> WRITE_ONCE(tmc->wakeup, KTIME_MAX);
>
> @@ -1453,11 +1485,11 @@ static int tmigr_clear_cpu_available(unsigned int cpu)
> }
>
> if (firstexp != KTIME_MAX) {
> - migrator = cpumask_any(tmigr_available_cpumask);
> + migrator = cpumask_any_but(tmigr_available_cpumask, cpu);
and this should be done in a preparatory patch along with a
reasonable explanation in the change log.
> work_on_cpu(migrator, tmigr_trigger_active, NULL);
> }
>
> - return 0;
> + return 1;
But thinking more about it. What's the actual point of moving this 'clear'
out instead of just moving it further down?
It does not matter at all whether the isol/unisol muck clears an already
cleared bit or not. But it would keep the function name comprehensible
and avoid all this online/offline wrapper nonsense.
> }
>
> static int tmigr_set_cpu_available(unsigned int cpu)
> @@ -1468,17 +1500,130 @@ static int tmigr_set_cpu_available(unsigned int cpu)
> if (WARN_ON_ONCE(!tmc->tmgroup))
> return -EINVAL;
>
> - cpumask_set_cpu(cpu, tmigr_available_cpumask);
Ditto.
> +static int __tmigr_isolated_exclude_cpumask(struct cpumask *exclude_cpumask)
> +{
> + struct work_struct __percpu *works __free(free_percpu) =
> + alloc_percpu(struct work_struct);
> + cpumask_var_t cpumask_unisol __free(free_cpumask_var) = CPUMASK_VAR_NULL;
> + cpumask_var_t cpumask_isol __free(free_cpumask_var) = CPUMASK_VAR_NULL;
> + int cpu;
> +
> + if (!alloc_cpumask_var(&cpumask_isol, GFP_KERNEL))
> + return -ENOMEM;
> + if (!alloc_cpumask_var(&cpumask_unisol, GFP_KERNEL))
> + return -ENOMEM;
> + if (!works)
> + return -ENOMEM;
Checking the first allocation after trying to allocate other stuff makes
a lot of sense - NOT.
> + cpumask_andnot(cpumask_unisol, cpu_online_mask, exclude_cpumask);
> + cpumask_andnot(cpumask_unisol, cpumask_unisol, tmigr_available_cpumask);
> + /* Set up the mask earlier to avoid races with the migrator CPU */
> + cpumask_or(tmigr_available_cpumask, tmigr_available_cpumask, cpumask_unisol);
Your new line key is broken. This comment is barely noticeable. What's
worse is that it completely fails to explain what the actual race is.
> + for_each_cpu(cpu, cpumask_unisol) {
> + struct work_struct *work = per_cpu_ptr(works, cpu);
> +
> + INIT_WORK(work, tmigr_cpu_unisolate);
> + schedule_work_on(cpu, work);
> + }
> +
> + cpumask_and(cpumask_isol, exclude_cpumask, tmigr_available_cpumask);
> + cpumask_and(cpumask_isol, cpumask_isol, housekeeping_cpumask(HK_TYPE_KERNEL_NOISE));
> + /*
> + * Handle this here and not in the cpuset code because exclude_cpumask
> + * might include also the tick CPU if included in isolcpus.
> + */
> + for_each_cpu(cpu, cpumask_isol) {
> + if (!tick_nohz_cpu_hotpluggable(cpu)) {
> + cpumask_clear_cpu(cpu, cpumask_isol);
> + break;
> + }
> + }
> + /* Set up the mask earlier to avoid races with the migrator CPU */
> + cpumask_andnot(tmigr_available_cpumask, tmigr_available_cpumask, cpumask_isol);
> + for_each_cpu(cpu, cpumask_isol) {
> + struct work_struct *work = per_cpu_ptr(works, cpu);
This lacks a comment explaining that cpumask_unisol and _isol are not
overlapping. I had to stare at this five times to convince myself that
it's correct.
> +
> + INIT_WORK(work, tmigr_cpu_isolate);
> + schedule_work_on(cpu, work);
> + }
> +
> + for_each_cpu_or(cpu, cpumask_isol, cpumask_unisol)
> + flush_work(per_cpu_ptr(works, cpu));
> +
> return 0;
> }
>
> +/**
> + * tmigr_isolated_exclude_cpumask - Exclude given CPUs from hierarchy
> + * @exclude_cpumask: the cpumask to be excluded from timer migration hierarchy
> + *
> + * This function can be called from cpuset code to provide the new set of
> + * isolated CPUs that should be excluded from the hierarchy.
> + * Online CPUs not present in exclude_cpumask but already excluded are brought
> + * back to the hierarchy.
> + * Functions to isolate/unisolate need to be called locally and can sleep.
> + */
> +int tmigr_isolated_exclude_cpumask(struct cpumask *exclude_cpumask)
> +{
> + lockdep_assert_cpus_held();
> + return __tmigr_isolated_exclude_cpumask(exclude_cpumask);
This wrapper is required because...
> +}
> +
> +static int __init tmigr_init_isolation(void)
> +{
> + cpumask_var_t cpumask __free(free_cpumask_var) = CPUMASK_VAR_NULL;
> +
> + static_branch_enable(&tmigr_exclude_isolated);
> +
> + if (!housekeeping_enabled(HK_TYPE_DOMAIN))
> + return 0;
> + if (!alloc_cpumask_var(&cpumask, GFP_KERNEL))
> + return -ENOMEM;
> +
> + cpumask_andnot(cpumask, cpu_possible_mask, housekeeping_cpumask(HK_TYPE_DOMAIN));
... it would be too sensible to guard this with:
guard(cpus_read_lock)();
for consistency sake _AND_ what's more important to protect it against
the RCU torture test code which plays with CPU hotplug starting in
device_initcall(), which runs before
> +late_initcall(tmigr_init_isolation);
Thanks,
tglx
next prev parent reply other threads:[~2025-11-19 16:50 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-11-13 8:33 [PATCH v15 0/7] timers: Exclude isolated cpus from timer migration Gabriele Monaco
2025-11-13 8:33 ` [PATCH v15 1/7] timers: Rename tmigr 'online' bit to 'available' Gabriele Monaco
2025-11-19 15:49 ` Thomas Gleixner
2025-11-13 8:33 ` [PATCH v15 2/7] timers: Add the available mask in timer migration Gabriele Monaco
2025-11-19 15:56 ` Thomas Gleixner
2025-11-13 8:33 ` [PATCH v15 3/7] timers: Use scoped_guard when setting/clearing the tmigr available flag Gabriele Monaco
2025-11-19 15:57 ` Thomas Gleixner
2025-11-13 8:33 ` [PATCH v15 4/7] cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() Gabriele Monaco
2025-11-13 8:33 ` [PATCH v15 5/7] sched/isolation: Force housekeeping if isolcpus and nohz_full don't leave any Gabriele Monaco
2025-11-13 8:33 ` [PATCH v15 6/7] cpumask: Add initialiser to use cleanup helpers Gabriele Monaco
2025-11-13 8:33 ` [PATCH v15 7/7] timers: Exclude isolated cpus from timer migration Gabriele Monaco
2025-11-19 16:50 ` Thomas Gleixner [this message]
2025-11-19 17:14 ` Frederic Weisbecker
2025-11-19 18:15 ` Thomas Gleixner
2025-11-19 20:13 ` Frederic Weisbecker
2025-11-19 21:23 ` Thomas Gleixner
2025-11-19 22:02 ` Frederic Weisbecker
2025-11-19 22:10 ` Thomas Gleixner
2025-11-19 22:31 ` Frederic Weisbecker
2025-11-19 20:43 ` Waiman Long
2025-11-20 10:48 ` Gabriele Monaco
2025-11-20 21:04 ` Waiman Long
2025-11-13 13:12 ` [PATCH v15 0/7] " Frederic Weisbecker
2025-11-18 10:01 ` Gabriele Monaco
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87pl9eklvc.ffs@tglx \
--to=tglx@linutronix.de \
--cc=anna-maria@linutronix.de \
--cc=frederic@kernel.org \
--cc=gmonaco@redhat.com \
--cc=jwyatt@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=sageofredondo@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.