[Patch v4 00/16] Cache aware scheduling enhancements

From: Tim Chen <tim.c.chen@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Madadi Vineeth Reddy <vineethr@linux.ibm.com>,
	Hillf Danton <hdanton@sina.com>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	Jianyong Wu <jianyong.wu@outlook.com>,
	Yangyu Chen <cyy@cyyself.name>,
	Tingyin Duan <tingyin.duan@gmail.com>,
	Vern Hao <vernhao@tencent.com>, Vern Hao <haoxing990@gmail.com>,
	Len Brown <len.brown@intel.com>, Aubrey Li <aubrey.li@intel.com>,
	Zhao Liu <zhao1.liu@intel.com>, Chen Yu <yu.chen.surf@gmail.com>,
	Chen Yu <yu.c.chen@intel.com>,
	Adam Li <adamli@os.amperecomputing.com>,
	Aaron Lu <ziqianlu@bytedance.com>,
	Tim Chen <tim.c.chen@intel.com>, Josh Don <joshdon@google.com>,
	Gavin Guo <gavinguo@igalia.com>,
	Qais Yousef <qyousef@layalina.io>,
	Libo Chen <libchen@purestorage.com>,
	Luo Gengkun <luogengkun2@huawei.com>,
	linux-kernel@vger.kernel.org
Subject: [Patch v4 00/16] Cache aware scheduling enhancements
Date: Wed, 13 May 2026 13:39:11 -0700
Message-ID: <cover.1778703694.git.tim.c.chen@linux.intel.com>

This patch set contains cache-aware scheduling enhancements
and bug fixes on top of Peter's sched/cache branch:
https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=sched/cache

Patches 1 to 6 resolve the over-aggregation issue, which is the remaining
part of v4 that has not yet been merged into sched/cache. Patches 7 to 16
fix bugs reported by Sashiko (online and local).

Compared with cache-aware v4, the major change in the first part is
storing the effective LLC size in the per-CPU bottom sched_domain. This
allows checking whether a task's memory footprint exceeds the threshold
by fetching the value directly from the corresponding sched_domain,
instead of recalculating it every time. In addition, NUMA balancing
page-fault statistics are used instead of RSS to estimate the working
set. However, if NUMA balancing is not enabled, this working set
estimate is not available; using RSS may be appropriate for such
scenarios. We also picked up Jianyong's optimization patch to reduce
CPU scan overhead.
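
For illustration only (this is not code from the series), the
eligibility filtering described above can be sketched in plain
userspace C. All names here (demo_sched_domain, demo_proc_stat,
demo_cache_aware_eligible, aggr_pct) are hypothetical, and the exact
thresholds used by the patches may differ; in particular, comparing the
thread count against the number of CPUs in the LLC is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the bottom-level sched_domain of one CPU. */
struct demo_sched_domain {
	uint64_t llc_bytes;	/* effective LLC size, cached at domain build time */
	unsigned int llc_cpus;	/* number of CPUs sharing this LLC */
};

/* Hypothetical per-process statistics. */
struct demo_proc_stat {
	unsigned int nr_threads;	/* threads in the process */
	uint64_t footprint_bytes;	/* working-set estimate, e.g. from fault stats */
};

/*
 * Return true if the process looks like a candidate for LLC aggregation.
 * aggr_pct is a hypothetical tunable: the percentage of the LLC size a
 * working set may occupy before aggregation is skipped.
 */
static bool demo_cache_aware_eligible(const struct demo_sched_domain *sd,
				      const struct demo_proc_stat *ps,
				      unsigned int aggr_pct)
{
	assert(sd != NULL && ps != NULL);

	/* A single thread has no sibling to share cache with. */
	if (ps->nr_threads <= 1)
		return false;

	/* Assumption: too many threads would overload one LLC's CPUs. */
	if (ps->nr_threads > sd->llc_cpus)
		return false;

	/* Use the cached LLC size; nothing is recalculated on this path. */
	return ps->footprint_bytes * 100 <= sd->llc_bytes * aggr_pct;
}
```

The point of the sketch is that the size comparison reads a value
stored once in the domain structure, so the per-task check stays cheap.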

Gengkun's CPU scan optimization is not included for now and will be
revisited after further tuning.

Most patches in the second part address race conditions. Each patch fixes
one independent issue to facilitate easier review.
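
As a rough illustration of the kind of annotation involved (not code
from the series), a field that is read without holding a lock is
normally accessed through READ_ONCE()/WRITE_ONCE() in the kernel. The
userspace sketch below uses C11 relaxed atomics as an analogue. The
names demo_sc_stat, demo_set_pref_cpu, demo_get_pref_cpu and
demo_pick_cpu are hypothetical, as is the convention that -1 means
"no preferred CPU".

```c
#include <stdatomic.h>

/* Hypothetical stand-in for per-mm scheduling statistics. */
struct demo_sc_stat {
	atomic_int cpu;	/* preferred CPU; assumed to be -1 when unset */
};

/* Writer side: publish a new preferred CPU without taking a lock. */
static void demo_set_pref_cpu(struct demo_sc_stat *st, int cpu)
{
	atomic_store_explicit(&st->cpu, cpu, memory_order_relaxed);
}

/* Reader side: a single, untorn load of the current value. */
static int demo_get_pref_cpu(struct demo_sc_stat *st)
{
	return atomic_load_explicit(&st->cpu, memory_order_relaxed);
}

/*
 * Take one snapshot and validate that snapshot. Loading the field a
 * second time could observe a different value written concurrently,
 * so the range check and the return use the same local copy.
 */
static int demo_pick_cpu(struct demo_sc_stat *st, int nr_cpus, int fallback)
{
	int cpu = demo_get_pref_cpu(st);

	if (cpu < 0 || cpu >= nr_cpus)
		return fallback;
	return cpu;
}
```

Relaxed ordering is enough in this sketch because the value is only a
hint: a stale read leads to a suboptimal choice, not to incorrect
behavior.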

Test results show that the current version keeps the same performance
as v4 for the workloads and platforms we tested.

Future plans are to introduce fine-grained control over which tasks use
cache-aware scheduling, after the load-balance-based cache-aware
scheduling is merged:

- Look into task tagging (e.g. with the schedqos framework or cgroups) to
  group tasks to an LLC on a non-process basis.
- Evaluate fast cache-aware aggregation in the wakeup path.

I will be on sabbatical from mid-May to mid-June. Chen Yu will continue
to follow up on these patches.

Thanks.

Tim

Chen Yu (15):
  sched/cache: Disable cache aware scheduling for processes with high
    thread counts
  sched/cache: Skip cache-aware scheduling for single-threaded processes
  sched/cache: Calculate the LLC size and store it in sched_domain
  sched/cache: Avoid cache-aware scheduling for memory-heavy processes
  sched/cache: Add user control to adjust the aggressiveness of
    cache-aware scheduling
  sched/cache: Fix rcu warning when accessing sd_llc domain
  sched/cache: Fix potential NULL mm pointer access
  sched/cache: Annotate lockless accesses to mm->sc_stat.cpu
  sched/cache: Fix unpaired account_llc_enqueue/dequeue
  sched/cache: Fix checking active load balance by only considering the
    CFS task
  sched/cache: Fix race condition during sched domain rebuild
  sched/cache: Fix cache aware scheduling enabling for multi LLCs system
  sched/cache: Fix has_multi_llcs iff at least one partition has
    multiple LLCs
  sched/cache: Fix possible overflow when invalidating the preferred CPU
  sched/cache: Fix stale preferred_llc for a new task

Jianyong Wu (1):
  sched/cache: Allow only 1 thread of the process to calculate the LLC
    occupancy

 drivers/base/cacheinfo.c       |  23 +++
 include/linux/cacheinfo.h      |   1 +
 include/linux/sched.h          |   5 +
 include/linux/sched/topology.h |   7 +
 init/init_task.c               |   1 +
 kernel/exit.c                  |  29 ++++
 kernel/sched/debug.c           |  14 +-
 kernel/sched/fair.c            | 256 +++++++++++++++++++++++++++++----
 kernel/sched/sched.h           |   7 +-
 kernel/sched/topology.c        | 240 +++++++++++++++++++++++++------
 10 files changed, 509 insertions(+), 74 deletions(-)

-- 
2.32.0
