[PATCH v2 00/23] Cache aware scheduling

From: Tim Chen <tim.c.chen@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	K Prateek Nayak <kprateek.nayak@amd.com>,
	"Gautham R . Shenoy" <gautham.shenoy@amd.com>,
	Vincent Guittot <vincent.guittot@linaro.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>,
	Juri Lelli <juri.lelli@redhat.com>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Madadi Vineeth Reddy <vineethr@linux.ibm.com>,
	Hillf Danton <hdanton@sina.com>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	Jianyong Wu <jianyong.wu@outlook.com>,
	Yangyu Chen <cyy@cyyself.name>,
	Tingyin Duan <tingyin.duan@gmail.com>,
	Vern Hao <vernhao@tencent.com>, Vern Hao <haoxing990@gmail.com>,
	Len Brown <len.brown@intel.com>, Aubrey Li <aubrey.li@intel.com>,
	Zhao Liu <zhao1.liu@intel.com>, Chen Yu <yu.chen.surf@gmail.com>,
	Chen Yu <yu.c.chen@intel.com>,
	Adam Li <adamli@os.amperecomputing.com>,
	Aaron Lu <ziqianlu@bytedance.com>,
	Tim Chen <tim.c.chen@intel.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH v2 00/23] Cache aware scheduling
Date: Wed,  3 Dec 2025 15:07:19 -0800
Message-ID: <cover.1764801860.git.tim.c.chen@linux.intel.com>

This patch series introduces infrastructure for cache-aware load
balancing, with the goal of co-locating tasks that share data on the
same Last Level Cache (LLC) domain. By improving cache locality, the
scheduler can reduce cache bouncing and cache misses, ultimately
improving data access efficiency. The design builds on the initial
prototype from Peter [1].
 
In this initial implementation, threads within the same process are
treated as entities that are likely to share data. During load
balancing, the scheduler attempts to aggregate these threads onto the
same LLC domain whenever possible.
 
We would like to thank everyone who provided feedback on the v1
series [1]. Most of the comments have been addressed in this revision.
Several broader suggestions surfaced during review, and we believe
they are best approached in follow-up work once the foundational
cache-aware scheduling infrastructure is merged:
 
1. **Generalizing task grouping beyond processes.**
   While v2 focuses on grouping threads within a single process, other
   classes of workloads naturally share data and could benefit from LLC
   co-location, such as:
   a) Tasks from different processes that operate on shared data.
   b) Tasks belonging to the same NUMA group.
   c) Tasks with strong waker/wakee relationships.
   d) User-defined groups via cgroups or other user interfaces.
 
2. **Configurable cache-aware scheduling policies.**
   The current iteration implements a global cache-aware scheduling
   policy. Future work may introduce per-process or per-task-group
   policies, exposed through prctl() or other mechanisms.
 
**v2 Changes:**
1. Align NUMA balancing and cache affinity by
   prioritizing NUMA balancing when their decisions differ.
2. Dynamically resize per-LLC statistics structures based on the LLC
   size.
3. Switch to a contiguous LLC-ID space so these IDs can be used
   directly as array indices for LLC statistics.
4. Add clarification comments.
5. Add 3 debug patches (not meant for merging).
6. Other changes to address feedback from review of the v1 patch set
   (see the individual patch changelogs).
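
Item 3 above can be illustrated with a minimal sketch. It assumes made-up sparse hardware LLC identifiers and a hypothetical helper name (`build_dense_llc_ids`); it only shows why a contiguous ID space lets per-LLC statistics live in a flat array, and is not the kernel code from this series.

```python
from typing import Dict, List


def build_dense_llc_ids(sparse_ids: List[int]) -> Dict[int, int]:
    """Map each distinct sparse LLC id to a dense index 0..n-1 (first-seen order)."""
    dense: Dict[int, int] = {}
    for llc in sparse_ids:
        if llc not in dense:
            dense[llc] = len(dense)
    return dense


# Hypothetical sparse LLC id of each CPU (values are invented for illustration).
cpu_to_sparse_llc = [0, 0, 16, 16, 32, 32]
dense_ids = build_dense_llc_ids(cpu_to_sparse_llc)

# With dense ids, a per-LLC statistic is a plain list sized by the LLC count.
llc_cpu_count = [0] * len(dense_ids)
for sparse in cpu_to_sparse_llc:
    llc_cpu_count[dense_ids[sparse]] += 1

print(dense_ids, llc_cpu_count)
```

With sparse IDs the array would need to be sized by the largest ID (33 slots here) or replaced by a lookup structure; the dense mapping needs only one slot per LLC.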
 
Test results:

The patch series was applied and tested on v6.18-rc7.
See: https://github.com/timcchen1298/linux/commits/cache_aware_v2

The first test platform is a 2-socket Intel Sapphire Rapids system with 30
cores per socket. DRAM interleaving is enabled in the BIOS, so the system
essentially has one NUMA node with two last-level caches. There are 60
CPUs associated with each last-level cache.

The second test platform is an AMD Genoa system. It has 4 nodes with 32
CPUs per node. Each node has 2 CCXs, and each CCX has 16 CPUs.

hackbench/schbench/netperf/stream/stress-ng/chacha20 were launched on
these two platforms.

[TL;DR]
Sapphire Rapids:
hackbench shows significant improvement when the number of
different active threads is below the capacity of an LLC.
schbench shows overall wakeup latency improvement.
ChaCha20-xiangshan shows good throughput improvement.

Genoa:
ChaCha20-xiangshan shows huge throughput improvement.
No obvious difference is observed in hackbench/schbench/netperf/
stream/stress-ng.
Phoronix tested v1 and reported good improvements
in 33 cases [2].

Details:
Due to length constraints, only part of the data is presented.

Sapphire Rapids:

hackbench thread pipes
                           baseline            sched_cache
       groups
Amean     1      38.8224 (   0.00%)     26.4582 *  31.85%*
Amean     3      38.2358 (   0.00%)     38.0758 (   0.42%)
Amean     5      40.7282 (   0.00%)     41.1568 (  -1.05%)
Amean     7      51.1720 (   0.00%)     50.6646 (   0.99%)
Amean     12     63.1562 (   0.00%)     63.3516 (  -0.31%)
Amean     16     73.9584 (   0.00%)     75.5596 (  -2.17%)
Max       1      39.4140 (   0.00%)     26.7590 (  32.11%)
Max       3      40.8310 (   0.00%)     39.8000 (   2.53%)
Max       5      42.2150 (   0.00%)     42.4860 (  -0.64%)
Max       7      52.1800 (   0.00%)     51.9370 (   0.47%)
Max       12     63.9430 (   0.00%)     64.2820 (  -0.53%)
Max       16     74.3710 (   0.00%)     76.4170 (  -2.75%)
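
The percentage column in the table above is consistent with a lower-is-better convention, where a positive value means sched_cache took less time than the baseline. A small sketch of that arithmetic follows; the function name is hypothetical and the formula is inferred from the numbers, not taken from the benchmark tooling.

```python
def improvement_lower_is_better(baseline: float, patched: float) -> float:
    """Percent improvement when a smaller value is better (e.g. elapsed time)."""
    return (baseline - patched) / baseline * 100.0


# Values copied from the "Amean 1" and "Amean 5" rows of the table.
amean_1 = improvement_lower_is_better(38.8224, 26.4582)
amean_5 = improvement_lower_is_better(40.7282, 41.1568)

print(f"Amean 1: {amean_1:.2f}%")
print(f"Amean 5: {amean_5:.2f}%")
```

The two results round to the 31.85% and -1.05% shown in the corresponding rows.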

Further hackbench tests using other numbers of fds:

case-fd                 groups           baseline(std%)  compare%( std%)
threads-pipe-2          1-groups         1.00 (  1.25)  +38.52 (  1.33)
threads-pipe-2          2-groups         1.00 ( 12.52)  +12.74 (  1.31)
threads-pipe-2          4-groups         1.00 (  7.91)  +12.29 (  1.86)
threads-pipe-4          1-groups         1.00 (  0.55)  +34.99 (  0.45)
threads-pipe-4          2-groups         1.00 ( 16.00)  +27.32 (  0.75)
threads-pipe-4          4-groups         1.00 ( 17.37)  +25.75 (  0.20)
threads-pipe-8          1-groups         1.00 (  0.74)  +27.13 (  0.44)
threads-pipe-8          2-groups         1.00 (  8.82)  +23.79 (  0.32)
threads-pipe-8          4-groups         1.00 (  1.30)  +27.64 (  0.51)
threads-pipe-16         1-groups         1.00 (  1.03)  +30.55 (  0.27)
threads-pipe-16         2-groups         1.00 (  6.43)  +29.52 (  0.20)
threads-pipe-16         4-groups         1.00 (  1.36)   -1.85 (  1.43)
threads-pipe-20         1-groups         1.00 (  0.45)  +30.88 (  0.42)
threads-pipe-20         2-groups         1.00 (  1.95)   -0.81 (  5.84)
threads-pipe-20         4-groups         1.00 (  2.09)   -1.77 (  7.57)

stream:
                              baseline            sched_cache
GB/sec copy-2        36.48 (   0.00%)       36.55 (   0.18%)
GB/sec scale-2       36.83 (   0.00%)       36.97 (   0.38%)
GB/sec add-2         37.92 (   0.00%)       38.03 (   0.31%)
GB/sec triad-2       37.83 (   0.00%)       37.97 (   0.37%)

stress-ng context switch:
                                    baseline            sched_cache
Min       context-1       2957.81 (   0.00%)     2966.17 (   0.28%)
Min       context-2       5931.68 (   0.00%)     5930.17 (  -0.03%)
Min       context-4      11874.20 (   0.00%)    11875.68 (   0.01%)
Min       context-8      23755.30 (   0.00%)    23762.43 (   0.03%)
Min       context-16     47535.14 (   0.00%)    47526.46 (  -0.02%)
Min       context-32     95078.66 (   0.00%)    94356.39 (  -0.76%)
Min       context-64    190074.62 (   0.00%)   190042.93 (  -0.02%)
Min       context-128   371107.12 (   0.00%)   371008.10 (  -0.03%)
Min       context-256   578443.73 (   0.00%)   579037.86 (   0.10%)
Min       context-480   580203.34 (   0.00%)   580499.43 (   0.05%)
Hmean     context-1       2964.59 (   0.00%)     2967.69 (   0.10%)
Hmean     context-2       5936.41 (   0.00%)     5935.51 (  -0.02%)
Hmean     context-4      11879.56 (   0.00%)    11881.70 (   0.02%)
Hmean     context-8      23771.92 (   0.00%)    23770.28 (  -0.01%)
Hmean     context-16     47552.23 (   0.00%)    47538.01 (  -0.03%)
Hmean     context-32     95102.67 (   0.00%)    94969.43 (  -0.14%)
Hmean     context-64    190129.74 (   0.00%)   190088.68 (  -0.02%)
Hmean     context-128   371291.95 (   0.00%)   371114.82 (  -0.05%)
Hmean     context-256   578907.96 (   0.00%)   579338.99 (   0.07%)
Hmean     context-480   580541.78 (   0.00%)   580726.13 (   0.03%)
Max       context-1       2967.93 (   0.00%)     2968.90 (   0.03%)
Max       context-2       5942.37 (   0.00%)     5940.40 (  -0.03%)
Max       context-4      11885.25 (   0.00%)    11886.43 (   0.01%)
Max       context-8      23784.17 (   0.00%)    23783.31 (  -0.00%)
Max       context-16     47576.84 (   0.00%)    47561.42 (  -0.03%)
Max       context-32     95139.03 (   0.00%)    95094.86 (  -0.05%)
Max       context-64    190180.08 (   0.00%)   190123.31 (  -0.03%)
Max       context-128   371451.73 (   0.00%)   371240.25 (  -0.06%)
Max       context-256   579355.24 (   0.00%)   579731.37 (   0.06%)
Max       context-480   580750.44 (   0.00%)   581118.33 (   0.06%)
BHmean-50 context-1       2966.80 (   0.00%)     2968.82 (   0.07%)
BHmean-50 context-2       5939.32 (   0.00%)     5939.49 (   0.00%)
BHmean-50 context-4      11883.02 (   0.00%)    11886.08 (   0.03%)
BHmean-50 context-8      23778.40 (   0.00%)    23775.90 (  -0.01%)
BHmean-50 context-16     47568.31 (   0.00%)    47546.19 (  -0.05%)
BHmean-50 context-32     95125.84 (   0.00%)    95087.06 (  -0.04%)
BHmean-50 context-64    190165.37 (   0.00%)   190117.94 (  -0.02%)
BHmean-50 context-128   371405.28 (   0.00%)   371168.75 (  -0.06%)
BHmean-50 context-256   579137.11 (   0.00%)   579609.35 (   0.08%)
BHmean-50 context-480   580646.72 (   0.00%)   580920.46 (   0.05%)
BHmean-95 context-1       2965.72 (   0.00%)     2967.94 (   0.07%)
BHmean-95 context-2       5937.20 (   0.00%)     5936.40 (  -0.01%)
BHmean-95 context-4      11880.45 (   0.00%)    11882.71 (   0.02%)
BHmean-95 context-8      23774.69 (   0.00%)    23771.59 (  -0.01%)
BHmean-95 context-16     47555.08 (   0.00%)    47539.93 (  -0.03%)
BHmean-95 context-32     95106.67 (   0.00%)    95072.38 (  -0.04%)
BHmean-95 context-64    190138.93 (   0.00%)   190096.30 (  -0.02%)
BHmean-95 context-128   371322.78 (   0.00%)   371132.61 (  -0.05%)
BHmean-95 context-256   578985.41 (   0.00%)   579389.21 (   0.07%)
BHmean-95 context-480   580598.22 (   0.00%)   580763.93 (   0.03%)
BHmean-99 context-1       2965.72 (   0.00%)     2967.94 (   0.07%)
BHmean-99 context-2       5937.20 (   0.00%)     5936.40 (  -0.01%)
BHmean-99 context-4      11880.45 (   0.00%)    11882.71 (   0.02%)
BHmean-99 context-8      23774.69 (   0.00%)    23771.59 (  -0.01%)
BHmean-99 context-16     47555.08 (   0.00%)    47539.93 (  -0.03%)
BHmean-99 context-32     95106.67 (   0.00%)    95072.38 (  -0.04%)
BHmean-99 context-64    190138.93 (   0.00%)   190096.30 (  -0.02%)
BHmean-99 context-128   371322.78 (   0.00%)   371132.61 (  -0.05%)
BHmean-99 context-256   578985.41 (   0.00%)   579389.21 (   0.07%)
BHmean-99 context-480   580598.22 (   0.00%)   580763.93 (   0.03%)

schbench thread = 1
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        10.71(0.76)          9.86(1.46)           +7.94%    
Request Latencies 99.0th       4036.00(6.53)        4054.29(10.03)       -0.45%    
RPS 50.0th                     267.29(0.49)         266.86(0.38)         -0.16%    
Average RPS                    268.42(0.16)         267.86(0.31)         -0.21%    

schbench thread = 2
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        11.43(1.13)          8.00(2.00)           +30.01%   
Request Latencies 99.0th       4007.43(34.52)       3967.43(70.03)       +1.00%    
RPS 50.0th                     536.71(0.76)         536.14(1.57)         -0.11%    
Average RPS                    536.59(0.55)         535.33(1.34)         -0.23%    

schbench thread = 4
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        9.57(0.79)           6.14(1.46)           +35.84%   
Request Latencies 99.0th       3789.14(31.47)       3810.86(48.97)       -0.57%    
RPS 50.0th                     1074.00(0.00)        1073.43(2.76)        -0.05%    
Average RPS                    1075.03(1.07)        1072.93(2.13)        -0.20%    

schbench thread = 8
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        9.29(0.49)           6.57(1.81)           +29.28%   
Request Latencies 99.0th       3756.00(19.60)       3769.71(23.87)       -0.37%    
RPS 50.0th                     2152.57(4.28)        2152.57(4.28)        0.00%     
Average RPS                    2151.07(2.71)        2150.58(3.41)        -0.02%    

schbench thread = 16
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        9.43(0.53)           6.86(0.90)           +27.25%   
Request Latencies 99.0th       3780.00(32.98)       3774.29(11.04)       +0.15%    
RPS 50.0th                     4305.14(8.55)        4307.43(7.81)        +0.05%    
Average RPS                    4303.47(5.74)        4301.71(4.35)        -0.04%    

schbench thread = 32
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        10.14(0.38)          6.86(0.69)           +32.35%   
Request Latencies 99.0th       3764.00(21.66)       3806.29(32.24)       -1.12%    
RPS 50.0th                     8624.00(0.00)        8619.43(12.09)       -0.05%    
Average RPS                    8607.36(5.29)        8602.69(7.08)        -0.05%    

schbench thread = 64
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        11.71(0.49)          8.43(1.81)           +28.01%   
Request Latencies 99.0th       3796.00(62.48)       3860.25(147.35)      -1.69%  
RPS 50.0th                     17238.86(24.19)      16411.43(88.95)      -4.80%    
Average RPS                    17209.02(10.18)      16389.73(100.27)     -4.76%    

schbench thread = 128
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        13.29(0.49)          12.00(0.00)          +9.71%    
Request Latencies 99.0th       7893.71(11.04)       7909.71(17.10)       -0.20%    
RPS 50.0th                     32013.71(194.52)     32068.57(50.35)      +0.17%    
Average RPS                    31762.03(238.18)     31884.81(300.85)     +0.39%    

schbench thread = 239
Metric                         Base (mean±std)      Compare (mean±std)   Change    
-------------------------------------------------------------------------------------
Wakeup Latencies 99.0th        13.29(0.49)          14.43(0.53)          -8.58%    
Request Latencies 99.0th       8174.86(8.55)        8244.57(12.09)       -0.85%    
RPS 50.0th                     30624.00(0.00)       30614.86(24.19)      -0.03%    
Average RPS                    30695.86(11.03)      30673.35(17.31)      -0.07%    

chacha20:
baseline:
Host time spent: 66,320ms
sched_cache:
Host time spent: 53,859ms
Time reduced by 18%, throughput increased by 23%

Genoa:
chacha20:
baseline:
Host time spent: 51,848ms
sched_cache:
Host time spent: 28,439ms

Time reduced by 45%, throughput increased by 82%
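
The chacha20 percentages for both platforms can be re-derived from the host times above. The sketch below assumes each run performs the same fixed amount of work, so throughput is proportional to 1/time; the function names are hypothetical.

```python
def time_reduction_pct(base_ms: float, new_ms: float) -> float:
    """Percent of elapsed time saved relative to the baseline."""
    return (1.0 - new_ms / base_ms) * 100.0


def throughput_gain_pct(base_ms: float, new_ms: float) -> float:
    """Percent throughput gain, assuming the same work is done in each run."""
    return (base_ms / new_ms - 1.0) * 100.0


# Host times in milliseconds, copied from the results above.
runs = {
    "Sapphire Rapids": (66320, 53859),
    "Genoa": (51848, 28439),
}

for platform, (base_ms, new_ms) in runs.items():
    print(
        f"{platform}: time -{time_reduction_pct(base_ms, new_ms):.1f}%, "
        f"throughput +{throughput_gain_pct(base_ms, new_ms):.1f}%"
    )
```

The computed values land within one percentage point of the figures quoted above, which appear to be truncated to whole percents.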

[1] https://lore.kernel.org/all/cover.1760206683.git.tim.c.chen@linux.intel.com/
[2] https://www.phoronix.com/review/cache-aware-scheduling-amd-turin

Chen Yu (10):
  sched/cache: Record per-LLC utilization to guide cache-aware
    scheduling decisions
  sched/cache: Introduce helper functions to enforce LLC migration
    policy
  sched/cache: Introduce sched_cache_present to enable cache aware
    scheduling for multi LLCs NUMA node
  sched/cache: Record the number of active threads per process for
    cache-aware scheduling
  sched/cache: Disable cache aware scheduling for processes with high
    thread counts
  sched/cache: Avoid cache-aware scheduling for memory-heavy processes
  sched/cache: Add user control to adjust the parameters of cache-aware
    scheduling
  -- DO NOT APPLY!!! -- sched/cache/stats: Add schedstat for cache aware
    load balancing
  -- DO NOT APPLY!!! -- sched/cache/debug: Add ftrace to track the load
    balance statistics
  -- DO NOT APPLY!!! -- sched/cache/debug: Display the per LLC occupancy
    for each process via proc fs

Peter Zijlstra (Intel) (1):
  sched/cache: Introduce infrastructure for cache-aware load balancing

Tim Chen (12):
  sched/cache: Make LLC id continuous
  sched/cache: Assign preferred LLC ID to processes
  sched/cache: Track LLC-preferred tasks per runqueue
  sched/cache: Introduce per runqueue task LLC preference counter
  sched/cache: Calculate the per runqueue task LLC preference
  sched/cache: Count tasks prefering destination LLC in a sched group
  sched/cache: Check local_group only once in update_sg_lb_stats()
  sched/cache: Prioritize tasks preferring destination LLC during
    balancing
  sched/cache: Add migrate_llc_task migration type for cache-aware
    balancing
  sched/cache: Handle moving single tasks to/from their preferred LLC
  sched/cache: Consider LLC preference when selecting tasks for load
    balancing
  sched/cache: Respect LLC preference in task migration and detach

 fs/proc/base.c                 |   22 +
 include/linux/cacheinfo.h      |   21 +-
 include/linux/mm_types.h       |   60 ++
 include/linux/sched.h          |   19 +
 include/linux/sched/topology.h |    5 +
 include/trace/events/sched.h   |   31 +
 init/Kconfig                   |   11 +
 init/init_task.c               |    4 +
 kernel/fork.c                  |    6 +
 kernel/sched/core.c            |   12 +
 kernel/sched/debug.c           |   62 ++
 kernel/sched/fair.c            | 1034 +++++++++++++++++++++++++++++++-
 kernel/sched/sched.h           |   39 ++
 kernel/sched/stats.c           |    5 +-
 kernel/sched/topology.c        |  239 +++++++-
 15 files changed, 1543 insertions(+), 27 deletions(-)

-- 
2.32.0

