Re: [Patch v4 17/22] sched/cache: Avoid cache-aware scheduling for memory-heavy processes

From: Peter Zijlstra <peterz@infradead.org>
To: "Chen, Yu C" <yu.c.chen@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>,
	Ingo Molnar <mingo@redhat.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	"Gautham R . Shenoy" <gautham.shenoy@amd.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Madadi Vineeth Reddy <vineethr@linux.ibm.com>,
	Hillf Danton <hdanton@sina.com>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	Jianyong Wu <jianyong.wu@outlook.com>,
	Yangyu Chen <cyy@cyyself.name>,
	Tingyin Duan <tingyin.duan@gmail.com>,
	Vern Hao <vernhao@tencent.com>, Vern Hao <haoxing990@gmail.com>,
	Len Brown <len.brown@intel.com>, Aubrey Li <aubrey.li@intel.com>,
	Zhao Liu <zhao1.liu@intel.com>, Chen Yu <yu.chen.surf@gmail.com>,
	Adam Li <adamli@os.amperecomputing.com>,
	Aaron Lu <ziqianlu@bytedance.com>,
	Tim Chen <tim.c.chen@intel.com>, Josh Don <joshdon@google.com>,
	Gavin Guo <gavinguo@igalia.com>,
	Qais Yousef <qyousef@layalina.io>,
	Libo Chen <libchen@purestorage.com>,
	linux-kernel@vger.kernel.org, "Luck, Tony" <tony.luck@intel.com>,
	Reinette Chatre <reinette.chatre@intel.com>
Subject: Re: [Patch v4 17/22] sched/cache: Avoid cache-aware scheduling for memory-heavy processes
Date: Fri, 10 Apr 2026 11:20:57 +0200
Message-ID: <20260410092057.GG3126523@noisy.programming.kicks-ass.net>
In-Reply-To: <a2d7e7b2-5a4c-48bc-b567-61e3cbd17786@intel.com>

On Fri, Apr 10, 2026 at 04:59:19PM +0800, Chen, Yu C wrote:

> > This is pretty terrible. If you want LLC size, add it to the topology
> > information (and ideally integrate with RDT) and make proportional to
> > cpumask size, such that if someone cuts the domain in pieces, they get
> > proportional size etc.
> > 
> 
> If I understand correctly, do you mean the following:
> 
> 1. Introduce a generic arch_get_llc_size() as a wrapper
>   around the existing get_cpu_cacheinfo_level(), which
>   returns the llc_size. Both the scheduler and RDT can
>   use arch_get_llc_size().

The tie-in with RDT was more about affecting the return value of
arch_get_llc_size(). E.g., when RDT takes away some ways for specific
tasks, the total effective size for generic use is reduced.
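
To illustrate the point about ways, here is a minimal sketch, assuming
the effective size scales linearly with the number of ways left for
generic use. effective_llc_size() and its parameters are hypothetical
names for illustration only; they are not part of the patch series or
of the resctrl code.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch series): scale the raw LLC
 * size by the fraction of cache ways still available for generic use.
 * Assumes every way has the same capacity.
 */
uint64_t effective_llc_size(uint64_t llc_size,
			    unsigned int total_ways,
			    unsigned int reserved_ways)
{
	/* Nothing left for generic use, or nonsensical input. */
	if (total_ways == 0 || reserved_ways >= total_ways)
		return 0;

	return llc_size * (total_ways - reserved_ways) / total_ways;
}
```

With a 32 MiB, 16-way LLC and 4 ways reserved, this yields 24 MiB.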


> 2. The sched domain stores llc_size in
>    sd->res_size = llc_size * sd_span / arch_llc_span,
>    and the cache_aware_scheduler uses sd->res_size for
>    the comparison.

Just so.
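
The proportional formula quoted above can be sketched as follows. This
is only an illustration of the arithmetic: sd_res_size() is a
hypothetical standalone function, and the spans are taken as plain CPU
counts rather than cpumask weights.

```c
#include <stdint.h>

/*
 * Hypothetical illustration of
 *   sd->res_size = llc_size * sd_span / arch_llc_span
 * where sd_span is the number of CPUs in the sched domain and
 * arch_llc_span is the number of CPUs sharing the hardware LLC.
 */
uint64_t sd_res_size(uint64_t llc_size,
		     unsigned int sd_span,
		     unsigned int arch_llc_span)
{
	/* Avoid dividing by zero if the LLC span is unknown. */
	if (arch_llc_span == 0)
		return 0;

	/* Multiply first so the integer division loses less precision. */
	return llc_size * sd_span / arch_llc_span;
}
```

A domain covering 8 of the 16 CPUs that share a 32 MiB LLC gets
16 MiB, so cutting the domain in pieces yields proportional sizes.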

> We will adjust the code accordingly.

Thanks.

> > Also, if we have NUMA_BALANCING on, that can provide a much better
> > estimate for the actual size.
> > 
> > Just using RSS seems like a very bad metric here.
> > 
> 
> Got it. Currently we lack accurate memory footprint metrics in
> the kernel. If we support user-provided hints in the future, we
> can leverage RDT llc_occupancy metrics (is it legal to use
> RDT's metrics directly in the kernel? It would switch from
> MSR reads to MMIO reads, thus less overhead). For now, let me
> try to leverage NUMA fault-in stats. If NUMA balancing
> is off, I need to think more about how to avoid over-aggregation
> for memory-intensive workloads.

There are also things like this:

  https://lkml.kernel.org/r/20260323095104.238982-1-bharata@amd.com

But yeah, in an ideal world we could be looking at LLC cache hit/miss
information... streaming workloads would have very low hit rate.
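
As a rough sketch of that idea, assuming hit and miss counts were
available from some counter source: the function names and the
threshold parameter below are hypothetical, chosen only to show how a
low hit rate could flag a streaming workload.

```c
#include <stdbool.h>
#include <stdint.h>

/* LLC hit rate as an integer percentage; 0 when there are no samples. */
unsigned int llc_hit_pct(uint64_t hits, uint64_t misses)
{
	uint64_t total = hits + misses;

	if (total == 0)
		return 0;

	return (unsigned int)(hits * 100 / total);
}

/*
 * Hypothetical classifier: treat a workload as streaming when its hit
 * rate falls below min_hit_pct. The threshold is an assumption, not a
 * value taken from the thread.
 */
bool looks_streaming(uint64_t hits, uint64_t misses,
		     unsigned int min_hit_pct)
{
	/* No samples: do not classify. */
	if (hits + misses == 0)
		return false;

	return llc_hit_pct(hits, misses) < min_hit_pct;
}
```

For example, 50 hits against 950 misses is a 5% hit rate, which a 20%
threshold would classify as streaming.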

But yes, prctl() controls could help, e.g. to create tools that
disable things per program.
