From: Waiman Long <llong@redhat.com>
To: Xi Wang <xii@google.com>, Frederic Weisbecker <frederic@kernel.org>
Cc: "Tejun Heo" <tj@kernel.org>,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
"Ingo Molnar" <mingo@redhat.com>,
"Peter Zijlstra" <peterz@infradead.org>,
"Juri Lelli" <juri.lelli@redhat.com>,
"Vincent Guittot" <vincent.guittot@linaro.org>,
"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
"Steven Rostedt" <rostedt@goodmis.org>,
"Ben Segall" <bsegall@google.com>,
"David Rientjes" <rientjes@google.com>,
"Mel Gorman" <mgorman@suse.de>,
"Valentin Schneider" <vschneid@redhat.com>,
"Johannes Weiner" <hannes@cmpxchg.org>,
"Michal Koutný" <mkoutny@suse.com>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Dan Carpenter" <dan.carpenter@linaro.org>,
"Chen Yu" <yu.c.chen@intel.com>, "Kees Cook" <kees@kernel.org>,
"Yu-Chun Lin" <eleanor15x@gmail.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Mickaël Salaün" <mic@digikod.net>,
jiangshanlai@gmail.com
Subject: Re: [RFC/PATCH] sched: Support moving kthreads into cpuset cgroups
Date: Thu, 8 May 2025 15:34:56 -0400
Message-ID: <b7aa4b10-1afb-476f-ac5d-d8db7151d866@redhat.com>
In-Reply-To: <CAOBoifhWNi-j6jbP6B9CofTrT+Kr6TCSYYPMv7SQdbY5s930og@mail.gmail.com>
On 5/8/25 1:51 PM, Xi Wang wrote:
> I think our problem spaces are different. Perhaps your problems are closer to
> hard real-time systems but our problems are about improving latency of existing
> systems while maintaining efficiency (max supported cpu util).
>
> For hard real-time systems we sometimes throw cores at the problem and run no
> more than one thread per cpu. But if we want efficiency we have to share cpus
> with scheduling policies. Disconnecting the cpu scheduler with isolcpus results
> in losing too much of the machine capacity. CPU scheduling is needed for both
> kernel and userspace threads.
>
> For our use case we need to move kernel threads away from certain vcpu threads,
> but other vcpu threads can share cpus with kernel threads. The ratio changes
> from time to time. Permanently putting aside a few cpus results in a reduction
> in machine capacity.
>
> The PF_NO_SETAFFINITY case is already handled by the patch. These threads will
> run in root cgroup with affinities just like before.
>
> The original justifications for the cpuset feature is here and the reasons are
> still applicable:
>
> "The management of large computer systems, with many processors (CPUs), complex
> memory cache hierarchies and multiple Memory Nodes having non-uniform access
> times (NUMA) presents additional challenges for the efficient scheduling and
> memory placement of processes."
>
> "But larger systems, which benefit more from careful processor and memory
> placement to reduce memory access times and contention.."
>
> "These subsets, or “soft partitions” must be able to be dynamically adjusted, as
> the job mix changes, without impacting other concurrently executing jobs."
>
> https://docs.kernel.org/admin-guide/cgroup-v1/cpusets.html
>
> -Xi
>
If you create a cpuset root partition, we push some kthreads away from
the CPUs dedicated to the newly created partition, which has its own
scheduling domain separate from the cgroup root. I do realize that the
current way of excluding only per-cpu kthreads isn't quite right, so I
have sent out a new patch to extend that to all the PF_NO_SETAFFINITY
kthreads. So instead of putting kthreads into a dedicated cpuset, we
still keep them in the root cgroup, and you can create a separate cpuset
partition to run the workload without interference from the background
kthreads. Will that functionality suit your current need?
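For illustration, the partition setup described above might look roughly
like the following with the cgroup v2 cpuset interface files
(cgroup.subtree_control, cpuset.cpus, cpuset.cpus.partition). This is
only a minimal sketch: the helper name create_root_partition, the
partition name "workload" and the CPU list "8-15" are hypothetical, and
it assumes cgroup v2 is mounted at the given root and the caller has
permission to write there.

```python
import os


def create_root_partition(cgroup_root, name, cpus):
    """Sketch: create a cpuset root partition `name` under `cgroup_root`.

    `cgroup_root` is assumed to be a cgroup v2 mount point such as
    /sys/fs/cgroup; `name` and `cpus` are illustrative values.
    """
    # Enable the cpuset controller for children of the parent cgroup.
    with open(os.path.join(cgroup_root, "cgroup.subtree_control"), "w") as f:
        f.write("+cpuset")

    # Create the child cgroup that will become the partition.
    part = os.path.join(cgroup_root, name)
    os.makedirs(part, exist_ok=True)

    # Give the child its CPUs.
    with open(os.path.join(part, "cpuset.cpus"), "w") as f:
        f.write(cpus)

    # Request a partition root, which gets its own scheduling domain.
    with open(os.path.join(part, "cpuset.cpus.partition"), "w") as f:
        f.write("root")

    return part
```

On a real system this would be called as
create_root_partition("/sys/fs/cgroup", "workload", "8-15"), after which
the workload's PIDs can be written to the partition's cgroup.procs file.
Reading cpuset.cpus.partition back shows whether the kernel accepted the
request or marked the partition invalid.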
Cheers,
Longman