From: Frederic Weisbecker <frederic@kernel.org>
To: Gabriele Monaco <gmonaco@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org, Waiman Long <longman@redhat.com>
Subject: Re: [PATCH] timers: Exclude isolated cpus from timer migation
Date: Thu, 10 Apr 2025 15:03:39 +0200
Message-ID: <Z_fBq2AQjzyg8m5w@localhost.localdomain>
In-Reply-To: <2c9d71fd79d7d1cec66e48bcb87b39a874858f01.camel@redhat.com>
On Thu, Apr 10, 2025 at 12:38:25PM +0200, Gabriele Monaco wrote:
> On Thu, 2025-04-10 at 10:26 +0200, Thomas Gleixner wrote:
> > How can that happen? There is always at least _ONE_ housekeeping,
> > non-isolated, CPU online, no?
> >
>
> In my understanding it shouldn't, but I'm not sure there's anything
> preventing the user from isolating everything via cpuset.
> Anyway that's something no one in their right mind should do, so I guess
> I'd just opt for the cpumask_first (or actually cpumask_any, like before
> the change).
With "nohz_full=..." or "isolcpus=nohz,..." there is always at least one
housekeeping CPU. But with isolcpus=[domain] or the cpuset equivalents
(v1 cpuset.sched_load_balance, v2 isolated partition) there is nothing that
prevents all CPUs from being isolated.
Speaking of which, there are two different issues here:
* nohz_full CPUs are handled just like idle CPUs. Once the tick is stopped,
  the global timers are handled by other CPUs (housekeeping). There is always
  one housekeeping CPU that never goes idle.

  One subtle thing though: if the nohz_full CPU fires a tick, because there
  is a local timer to be handled for example, it will also possibly handle
  some global timers along the way. If it happens to be a problem, it should
  be easy to resolve.

* Domain isolated CPUs are treated just like other CPUs. But there is not
  always a housekeeping CPU around. And no guarantee that there is always
  a non-idle CPU to take care of global timers.
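
To make the difference between the two cases concrete, here is a rough
userspace model. It is not kernel code: the 64-bit mask and the names
cpuset_t, cpu_bit(), first_cpu(), nohz_full_config_ok() and
timer_housekeeping() are all made up for illustration. The point is only
that a nohz_full configuration is assumed to leave at least one housekeeping
CPU, while the set left over after domain isolation can legitimately be
empty, so callers have to cope with "no CPU found".

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: one bit per CPU, at most 64 CPUs. Hypothetical type. */
typedef uint64_t cpuset_t;

static cpuset_t cpu_bit(int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (cpuset_t)1 << cpu;
}

/* Lowest CPU in the set, or -1 if the set is empty. */
static int first_cpu(cpuset_t set)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (set & cpu_bit(cpu))
			return cpu;
	return -1;
}

/*
 * Assumption taken from the discussion: a nohz_full setup always keeps
 * at least one housekeeping CPU, so a mask covering every possible CPU
 * is treated as invalid in this model.
 */
static int nohz_full_config_ok(cpuset_t possible, cpuset_t nohz_full)
{
	return (possible & ~nohz_full) != 0;
}

/*
 * Domain isolation has no such guarantee: the CPUs left to take care
 * of global timers may be an empty set.
 */
static cpuset_t timer_housekeeping(cpuset_t online, cpuset_t domain_isolated)
{
	return online & ~domain_isolated;
}
```

With four CPUs online (mask 0xF) and all four domain isolated,
first_cpu(timer_housekeeping(0xF, 0xF)) returns -1, which is the case that
needs a fallback.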
> > That brings me to the general design decision here. Your changelog
> > explains at great length WHAT the change is doing, but completely fails
> > to explain the consequences and the rationale why this is the right
> > thing to do.
> >
> > By excluding the isolated CPUs from migration completely, any 'global'
> > timer, which is armed on such a CPU, has to be expired on that isolated
> > CPU. That's fundamentally different from e.g. RCU isolation.
> >
> > It might be the right thing to do and harmless, but without a proper
> > explanation it's a silent side effect of your changes, which leaves
> > people scratching their heads.
>
> Mmh, essentially the idea is that global timers should not migrate from
> housekeeping to isolated cores. I assumed the opposite never occurs (as
> global timers /should/ not even start on isolated cores on a properly
> isolated system), but you're right, that's not quite true.
Indeed, they can definitely start there.
I'm tempted to propose offlining and re-onlining isolated CPUs in order to
migrate those timers away. But that only works for timers that are currently
queued.
>
> Thinking about it now, since global timers /can/ start on isolated
> cores, that makes them quite different from offline ones and probably
> considering them the same is just not the right thing to do.
>
> I'm going to have a deeper thought about this whole approach, perhaps
> something simpler just preventing migration in that one direction would
> suffice.
I think we can use your solution, which involves isolating the CPU from the
tmigr hierarchy, and in addition always queue global timers to non-isolated
targets.
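
As a rough illustration of the "always queue to a non-isolated target" part,
here is a userspace sketch under my own assumptions. It is not the actual
timer code: cpuset_t, cpu_bit(), first_cpu() and global_timer_target() are
hypothetical names, and picking the lowest non-isolated CPU is just the
simplest policy to write down. The last branch covers the case where every
CPU is isolated, where the only option left in this model is to keep the
timer local.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: one bit per CPU, at most 64 CPUs. Hypothetical type. */
typedef uint64_t cpuset_t;

static cpuset_t cpu_bit(int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (cpuset_t)1 << cpu;
}

/* Lowest CPU in the set, or -1 if the set is empty. */
static int first_cpu(cpuset_t set)
{
	for (int cpu = 0; cpu < 64; cpu++)
		if (set & cpu_bit(cpu))
			return cpu;
	return -1;
}

/*
 * Pick the CPU on which a global timer armed from this_cpu gets queued:
 * - this_cpu itself if it is online and not isolated,
 * - otherwise the lowest online, non-isolated CPU,
 * - otherwise (everything isolated) fall back to this_cpu.
 */
static int global_timer_target(int this_cpu, cpuset_t online, cpuset_t isolated)
{
	cpuset_t candidates = online & ~isolated;

	if (candidates & cpu_bit(this_cpu))
		return this_cpu;
	if (candidates)
		return first_cpu(candidates);
	return this_cpu;
}
```

For example, with CPUs 0-3 online and CPUs 2-3 isolated, a global timer armed
on CPU 3 is queued on CPU 0, while one armed on CPU 1 stays on CPU 1.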
--
Frederic Weisbecker
SUSE Labs