From: Frederic Weisbecker <frederic@kernel.org>
To: Costa Shulyupin <costa.shul@redhat.com>
Cc: "Waiman Long" <longman@redhat.com>, "Tejun Heo" <tj@kernel.org>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Dan Carpenter" <dan.carpenter@linaro.org>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Andrew Morton" <akpm@linux-foundation.org>,
"Chen Yu" <yu.c.chen@intel.com>, "Kees Cook" <kees@kernel.org>,
"Randy Dunlap" <rdunlap@infradead.org>,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: Re: [RFC PATCH v1] Add kthreads_update_affinity()
Date: Wed, 22 Jan 2025 18:11:16 +0100 [thread overview]
Message-ID: <Z5EmtNh_lryVj0S3@localhost.localdomain> (raw)
In-Reply-To: <20250113190911.800623-1-costa.shul@redhat.com>
Hi Costa,
Le Mon, Jan 13, 2025 at 09:08:54PM +0200, Costa Shulyupin a écrit :
> Changing and using housekeeping and isolated CPUs requires reboot.
>
> The goal is to change CPU isolation dynamically without reboot.
>
> This patch is based on the parent patch
> cgroup/cpuset: Exclude isolated CPUs from housekeeping CPU masks
> https://lore.kernel.org/lkml/20240821142312.236970-3-longman@redhat.com/
> Its purpose is to update isolation cpumasks.
>
> However, some subsystems may use outdated housekeeping CPU masks. To
> prevent the use of these isolated CPUs, it is essential to explicitly
> propagate updates to the housekeeping masks across all subsystems that
> depend on them.
>
> This patch is not intended to be merged and disrupt the kernel.
> It is still a proof-of-concept for research purposes.
>
> The questions are:
> - Is this the right direction, or should I explore an alternative approach?
> - What factors need to be considered?
> - Any suggestions or advice?
> - Have similar attempts been made before?
Since the kthreads preferred affinity patchset just got merged,
I don't think anything needs to be done anymore in the kthreads front to toggle
nohz_full through cpusets safely. Let's see the details below:
>
> Update the affinity of kthreadd and trigger the recalculation of kthread
> affinities using kthreads_online_cpu().
>
> The argument passed to kthreads_online_cpu() is irrelevant, as the
> function reassigns affinities of kthreads based on their
> preferred_affinity and the updated housekeeping_cpumask(HK_TYPE_KTHREAD).
>
> Currently only RCU uses kthread_affine_preferred().
>
> I dare to try calling kthread_affine_preferred() from kthread_run() to
> set preferred_affinity as cpu_possible_mask for kthreads without a
> specific affinity, enabling their management through
> kthreads_online_cpu().
>
> Any objections?
>
> For details about kthreads affinity patterns please see:
> https://lore.kernel.org/lkml/20241211154035.75565-16-frederic@kernel.org/
>
> Signed-off-by: Costa Shulyupin <costa.shul@redhat.com>
> ---
> include/linux/kthread.h | 5 ++++-
> kernel/cgroup/cpuset.c | 1 +
> kernel/kthread.c | 6 ++++++
> 3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/kthread.h b/include/linux/kthread.h
> index 8d27403888ce9..b43c5aeb2cfd7 100644
> --- a/include/linux/kthread.h
> +++ b/include/linux/kthread.h
> @@ -52,8 +52,10 @@ bool kthread_is_per_cpu(struct task_struct *k);
> ({ \
> struct task_struct *__k \
> = kthread_create(threadfn, data, namefmt, ## __VA_ARGS__); \
> - if (!IS_ERR(__k)) \
> + if (!IS_ERR(__k)) { \
> + kthread_affine_preferred(__k, cpu_possible_mask); \
> wake_up_process(__k); \
> + } \
> __k; \
> })
>
> @@ -270,4 +272,5 @@ struct cgroup_subsys_state *kthread_blkcg(void);
> #else
> static inline void kthread_associate_blkcg(struct cgroup_subsys_state *css) { }
> #endif
> +void kthreads_update_affinity(void);
> #endif /* _LINUX_KTHREAD_H */
> diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
> index 65658a5c2ac81..7d71acc7f46b6 100644
> --- a/kernel/cgroup/cpuset.c
> +++ b/kernel/cgroup/cpuset.c
> @@ -1355,6 +1355,7 @@ static void update_isolation_cpumasks(bool isolcpus_updated)
> trl();
> ret = housekeeping_exlude_isolcpus(isolated_cpus, HOUSEKEEPING_FLAGS);
> WARN_ON_ONCE((ret < 0) && (ret != -EOPNOTSUPP));
> + kthreads_update_affinity();
A few things to consider:
1) update_isolation_cpumasks() will be called with cpus_read_lock()
(cf: sched_partition_write() and cpuset_write_resmask()), therefore
kthreads_online_cpu() can't run concurrently.
2) The constraint to turn on/off a CPU as nohz_full will be that the
target CPU is offline.
So when the target CPU will later become offline, kthreads_online_cpu()
will take care of the newly modified housekeeping_mask() (which will be visible
because cpus_write_lock() is held) and apply accordingly the appropriate
effective affinity.
The only thing that might need to be done by update_isolation_cpumasks() is
to hold kthreads_hotplug_lock so that a subsequent call to
kthread_affine_preferred() sees the "freshest" update to housekeeping_mask().
But even that may not be mandatory because the concerned CPUs are offline
and kthreads_online_cpu() will fix the race when they boot.
Oh and the special case kthreadd might need a direct affinity update.
So the good news is that we have a lot of things sorted out to prepare
for that cpuset interface:
_ RCU NOCB is ready
_ kthreads are ready
_ timers are fine since we toggle only offline CPUs and timer_migration.c
should work without changes.
Still some details need to be taken care of:
* scheduler (see the housekeeping_mask() references, especially the ilb which is
my biggest worry, get_nohz_timer_target() shouldn't be an issue)
* posix cpu timers (make tick_dep unconditional ?)
* kernel/watchdog.c (make proc_watchdog_cpumask() and
lockup_detector_online_cpu() safe against update_isolation_cpumasks()
* workqueue unbound mask (just apply the new one?)
* some RCU tick_dep to handle (make them unconditional since they
apply on slow path anyway?)
* other things? (grep for tick_nohz_full_cpu(), housekeeping_* and tick_dep_* )
But we are getting closer!
Thanks!
next prev parent reply other threads:[~2025-01-22 17:11 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-13 19:08 [RFC PATCH v1] Add kthreads_update_affinity() Costa Shulyupin
2025-01-22 17:11 ` Frederic Weisbecker [this message]
2025-02-20 13:57 ` Costa Shulyupin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z5EmtNh_lryVj0S3@localhost.localdomain \
--to=frederic@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=cgroups@vger.kernel.org \
--cc=costa.shul@redhat.com \
--cc=dan.carpenter@linaro.org \
--cc=hannes@cmpxchg.org \
--cc=kees@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=longman@redhat.com \
--cc=mkoutny@suse.com \
--cc=rdunlap@infradead.org \
--cc=tj@kernel.org \
--cc=vbabka@suse.cz \
--cc=yu.c.chen@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox