public inbox for linux-kernel@vger.kernel.org
From: Waiman Long <llong@redhat.com>
To: Xi Wang <xii@google.com>, Frederic Weisbecker <frederic@kernel.org>
Cc: "Tejun Heo" <tj@kernel.org>,
	linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
	"Ingo Molnar" <mingo@redhat.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Juri Lelli" <juri.lelli@redhat.com>,
	"Vincent Guittot" <vincent.guittot@linaro.org>,
	"Dietmar Eggemann" <dietmar.eggemann@arm.com>,
	"Steven Rostedt" <rostedt@goodmis.org>,
	"Ben Segall" <bsegall@google.com>,
	"David Rientjes" <rientjes@google.com>,
	"Mel Gorman" <mgorman@suse.de>,
	"Valentin Schneider" <vschneid@redhat.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Michal Koutný" <mkoutny@suse.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Dan Carpenter" <dan.carpenter@linaro.org>,
	"Chen Yu" <yu.c.chen@intel.com>, "Kees Cook" <kees@kernel.org>,
	"Yu-Chun Lin" <eleanor15x@gmail.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Mickaël Salaün" <mic@digikod.net>,
	jiangshanlai@gmail.com
Subject: Re: [RFC/PATCH] sched: Support moving kthreads into cpuset cgroups
Date: Thu, 8 May 2025 15:34:56 -0400
Message-ID: <b7aa4b10-1afb-476f-ac5d-d8db7151d866@redhat.com>
In-Reply-To: <CAOBoifhWNi-j6jbP6B9CofTrT+Kr6TCSYYPMv7SQdbY5s930og@mail.gmail.com>

On 5/8/25 1:51 PM, Xi Wang wrote:
> I think our problem spaces are different. Perhaps your problems are closer to
> hard real-time systems but our problems are about improving latency of existing
> systems while maintaining efficiency (max supported cpu util).
>
> For hard real-time systems we sometimes throw cores at the problem and run no
> more than one thread per cpu. But if we want efficiency we have to share cpus
> with scheduling policies. Disconnecting the cpu scheduler with isolcpus results
> in losing too much of the machine capacity. CPU scheduling is needed for both
> kernel and userspace threads.
>
> For our use case we need to move kernel threads away from certain vcpu threads,
> but other vcpu threads can share cpus with kernel threads. The ratio changes
> from time to time. Permanently putting aside a few cpus results in a reduction
> in machine capacity.
>
> The PF_NO_SETAFFINITY case is already handled by the patch. These threads will
> run in root cgroup with affinities just like before.
>
> The original justification for the cpuset feature is here, and the reasons
> still apply:
>
> "The management of large computer systems, with many processors (CPUs), complex
> memory cache hierarchies and multiple Memory Nodes having non-uniform access
> times (NUMA) presents additional challenges for the efficient scheduling and
> memory placement of processes."
>
> "But larger systems, which benefit more from careful processor and memory
> placement to reduce memory access times and contention.."
>
> "These subsets, or “soft partitions” must be able to be dynamically adjusted, as
> the job mix changes, without impacting other concurrently executing jobs."
>
> https://docs.kernel.org/admin-guide/cgroup-v1/cpusets.html
>
> -Xi
>
If you create a cpuset root partition, we push some kthreads away from
the CPUs dedicated to the newly created partition, which has its own
scheduling domain separate from the cgroup root. I do realize that the
current way of excluding only per-CPU kthreads isn't quite right, so I
have sent out a new patch to extend it to all PF_NO_SETAFFINITY kthreads.

So rather than putting kthreads into the dedicated cpuset, we keep them
in the root cgroup and create a separate cpuset partition to run the
workload without interference from the background kthreads. Would that
functionality suit your current needs?
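
To make the suggestion concrete, here is a minimal sketch of setting up such a partition through the cgroup v2 cpuset interface files (cgroup.subtree_control, cpuset.cpus, cpuset.cpus.partition). The helper name create_root_partition, the child name "workload" and the CPU list "2-5" are assumptions for illustration only, and the sketch assumes cgroup v2 is mounted at the path passed in and that the caller has the privileges to write there.

```python
import os


def create_root_partition(cgroup_root, name, cpus):
    """Create a child cpuset under cgroup_root and request a root partition.

    Hypothetical helper: only the interface file names come from the
    cgroup v2 documentation; everything else is illustrative.
    """
    # Enable the cpuset controller for children of the cgroup root.
    with open(os.path.join(cgroup_root, "cgroup.subtree_control"), "w") as f:
        f.write("+cpuset")

    child = os.path.join(cgroup_root, name)
    os.makedirs(child, exist_ok=True)

    # CPUs to dedicate to the partition, in cpulist format (e.g. "2-5").
    with open(os.path.join(child, "cpuset.cpus"), "w") as f:
        f.write(cpus)

    # "root" asks for a partition with its own scheduling domain.
    with open(os.path.join(child, "cpuset.cpus.partition"), "w") as f:
        f.write("root")

    return child
```

On a real system this would be called as create_root_partition("/sys/fs/cgroup", "workload", "2-5"), after which the workload tasks are moved in by writing their PIDs to the child's cgroup.procs. The kthreads are not moved; they stay in the root cgroup. Reading cpuset.cpus.partition back afterwards shows whether the kernel accepted the request, since it reports an invalid state when the partition cannot be formed.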

Cheers,
Longman

