* [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality
@ 2026-02-24 5:56 David Carlier
2026-02-24 17:43 ` Tejun Heo
2026-02-24 18:06 ` Tejun Heo
0 siblings, 2 replies; 4+ messages in thread
From: David Carlier @ 2026-02-24 5:56 UTC (permalink / raw)
To: Tejun Heo, David Vernet; +Cc: linux-kernel, David Carlier
Reorder struct sched_ext_entity to place ops_state, ddsp_dsq_id, and
ddsp_enq_flags immediately after dsq. These fields are accessed together
in the do_enqueue_task() and finish_dispatch() hot paths but were
previously spread across three different cache lines. Grouping them on
the same cache line reduces cache misses on every enqueue and dispatch
operation.
Signed-off-by: David Carlier <devnexen@gmail.com>
---
include/linux/sched/ext.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
index bcb962d5ee7d..80e70a9642fd 100644
--- a/include/linux/sched/ext.h
+++ b/include/linux/sched/ext.h
@@ -162,6 +162,9 @@ struct scx_dsq_list_node {
*/
struct sched_ext_entity {
struct scx_dispatch_q *dsq;
+ atomic_long_t ops_state;
+ u64 ddsp_dsq_id;
+ u64 ddsp_enq_flags;
struct scx_dsq_list_node dsq_list; /* dispatch order */
struct rb_node dsq_priq; /* p->scx.dsq_vtime order */
u32 dsq_seq;
@@ -173,7 +176,6 @@ struct sched_ext_entity {
s32 selected_cpu;
u32 kf_mask; /* see scx_kf_mask above */
struct task_struct *kf_tasks[2]; /* see SCX_CALL_OP_TASK() */
- atomic_long_t ops_state;
struct list_head runnable_node; /* rq->scx.runnable_list */
unsigned long runnable_at;
@@ -181,8 +183,6 @@ struct sched_ext_entity {
#ifdef CONFIG_SCHED_CORE
u64 core_sched_at; /* see scx_prio_less() */
#endif
- u64 ddsp_dsq_id;
- u64 ddsp_enq_flags;
/* BPF scheduler modifiable fields */
--
2.51.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality
2026-02-24 5:56 [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality David Carlier
@ 2026-02-24 17:43 ` Tejun Heo
2026-02-24 18:24 ` David CARLIER
2026-02-24 18:06 ` Tejun Heo
1 sibling, 1 reply; 4+ messages in thread
From: Tejun Heo @ 2026-02-24 17:43 UTC (permalink / raw)
To: David Carlier; +Cc: David Vernet, linux-kernel
On Tue, Feb 24, 2026 at 05:56:37AM +0000, David Carlier wrote:
> Reorder struct sched_ext_entity to place ops_state, ddsp_dsq_id, and
> ddsp_enq_flags immediately after dsq. These fields are accessed together
> in the do_enqueue_task() and finish_dispatch() hot paths but were
> previously spread across three different cache lines. Grouping them on
> the same cache line reduces cache misses on every enqueue and dispatch
> operation.
>
> Signed-off-by: David Carlier <devnexen@gmail.com>
Were you able to measure any difference by any chance?
Thanks.
--
tejun
* Re: [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality
2026-02-24 5:56 [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality David Carlier
2026-02-24 17:43 ` Tejun Heo
@ 2026-02-24 18:06 ` Tejun Heo
1 sibling, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2026-02-24 18:06 UTC (permalink / raw)
To: David Carlier; +Cc: David Vernet, linux-kernel, emil
> Reorder struct sched_ext_entity to place ops_state, ddsp_dsq_id, and
> ddsp_enq_flags immediately after dsq. These fields are accessed together
> in the do_enqueue_task() and finish_dispatch() hot paths but were
> previously spread across three different cache lines. Grouping them on
> the same cache line reduces cache misses on every enqueue and dispatch
> operation.
Applied to sched_ext/for-7.1 with the subject line capitalized.
Thanks.
--
tejun
* Re: [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality
2026-02-24 17:43 ` Tejun Heo
@ 2026-02-24 18:24 ` David CARLIER
0 siblings, 0 replies; 4+ messages in thread
From: David CARLIER @ 2026-02-24 18:24 UTC (permalink / raw)
To: Tejun Heo; +Cc: David Vernet, linux-kernel
Thanks for merging. I honestly haven't run a formal benchmark yet, but
here is the pahole output before and after the patch, both built from
the same tree (compiled with make kernel/sched/core.o, which pulls in
sched_ext_entity via sched/ext.h):
Before:
struct sched_ext_entity {
struct scx_dispatch_q * dsq; /* 0 8 */
struct scx_dsq_list_node dsq_list; /* 8 24 */
struct rb_node dsq_priq; /* 32 24 */
u32 dsq_seq; /* 56 4 */
u32 dsq_flags; /* 60 4 */
/* --- cacheline 1 boundary (64 bytes) --- */
u32 flags; /* 64 4 */
u32 weight; /* 68 4 */
s32 sticky_cpu; /* 72 4 */
s32 holding_cpu; /* 76 4 */
s32 selected_cpu; /* 80 4 */
u32 kf_mask; /* 84 4 */
struct task_struct * kf_tasks[2]; /* 88 16 */
atomic_long_t ops_state; /* 104 8 */
struct list_head runnable_node; /* 112 16 */
/* --- cacheline 2 boundary (128 bytes) --- */
long unsigned int runnable_at; /* 128 8 */
u64 core_sched_at; /* 136 8 */
u64 ddsp_dsq_id; /* 144 8 */
u64 ddsp_enq_flags; /* 152 8 */
u64 slice; /* 160 8 */
u64 dsq_vtime; /* 168 8 */
bool disallow; /* 176 1 */
/* XXX 7 bytes hole, try to pack */
struct cgroup * cgrp_moving_from; /* 184 8 */
/* --- cacheline 3 boundary (192 bytes) --- */
struct list_head tasks_node; /* 192 16 */
/* size: 208, cachelines: 4, members: 23 */
/* sum members: 201, holes: 1, sum holes: 7 */
};
dsq sits at offset 0 (cacheline 0), ops_state at offset 104
(cacheline 1), and ddsp_dsq_id/ddsp_enq_flags at offsets 144 and 152
(cacheline 2), so three cache lines are touched on every
do_enqueue_task(), finish_dispatch(), and direct_dispatch() call.
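The cache line arithmetic behind that count can be checked with a small
userspace sketch. It is only an illustration: the offsets are copied from
the "Before" pahole listing, the 64-byte line size is taken from the
boundaries pahole prints, and the helper names (cacheline_of,
distinct_cachelines, before_offsets) are hypothetical and not part of the
kernel.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE_BYTES 64u	/* the pahole boundaries above fall every 64 bytes */

/* Byte offsets copied from the "Before" pahole listing. */
const unsigned int before_offsets[] = {
	0,	/* dsq */
	104,	/* ops_state */
	144,	/* ddsp_dsq_id */
	152,	/* ddsp_enq_flags */
};

/* Hypothetical helper: index of the cache line containing a byte offset. */
unsigned int cacheline_of(unsigned int offset)
{
	return offset / CACHELINE_BYTES;
}

/*
 * Hypothetical helper: count the distinct cache lines covered by a set of
 * field offsets. Assumes each field fits inside one line and that every
 * line index is below 64 so that it fits in the bitmap.
 */
unsigned int distinct_cachelines(const unsigned int *offsets, size_t n)
{
	unsigned long long seen = 0;
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned int line = cacheline_of(offsets[i]);

		assert(line < 64);
		if (!(seen & (1ULL << line))) {
			seen |= 1ULL << line;
			count++;
		}
	}
	return count;
}
```

With the four offsets above, distinct_cachelines(before_offsets, 4)
evaluates to 3: the offsets map to lines 0, 1, 2 and 2.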
After:
struct sched_ext_entity {
struct scx_dispatch_q * dsq; /* 0 8 */
atomic_long_t ops_state; /* 8 8 */
u64 ddsp_dsq_id; /* 16 8 */
u64 ddsp_enq_flags; /* 24 8 */
struct scx_dsq_list_node dsq_list; /* 32 24 */
struct rb_node dsq_priq; /* 56 24 */
/* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
u32 dsq_seq; /* 80 4 */
u32 dsq_flags; /* 84 4 */
u32 flags; /* 88 4 */
u32 weight; /* 92 4 */
s32 sticky_cpu; /* 96 4 */
s32 holding_cpu; /* 100 4 */
s32 selected_cpu; /* 104 4 */
u32 kf_mask; /* 108 4 */
struct task_struct * kf_tasks[2]; /* 112 16 */
/* --- cacheline 2 boundary (128 bytes) --- */
struct list_head runnable_node; /* 128 16 */
long unsigned int runnable_at; /* 144 8 */
u64 core_sched_at; /* 152 8 */
u64 slice; /* 160 8 */
u64 dsq_vtime; /* 168 8 */
bool disallow; /* 176 1 */
/* XXX 7 bytes hole, try to pack */
struct cgroup * cgrp_moving_from; /* 184 8 */
/* --- cacheline 3 boundary (192 bytes) --- */
struct list_head tasks_node; /* 192 16 */
/* size: 208, cachelines: 4, members: 23 */
/* sum members: 201, holes: 1, sum holes: 7 */
};
All four hot-path fields now sit within the first 32 bytes of
cacheline 0. Struct size and total cacheline count are unchanged (208
bytes, 4 cachelines); it is purely a field reorder.
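One way to keep such a grouping from regressing is a compile-time check on
the member offsets. The sketch below is a userspace illustration only,
not the kernel code: struct scx_entity_head_sketch is a hypothetical
stand-in for the first four members of the reordered struct, with the
kernel types replaced by plain C types of the same size on a 64-bit
build, and the macro and function names are made up for the example. The
kernel has its own compile-time assertion helpers, which are not used
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_BYTES 64	/* assumed line size, as in the pahole output */

/*
 * Hypothetical stand-in for the first four members of struct
 * sched_ext_entity after the reorder. Kernel types are swapped for plain
 * C types of the same size on a 64-bit build.
 */
struct scx_entity_head_sketch {
	void *dsq;			/* struct scx_dispatch_q * in the kernel */
	long ops_state;			/* atomic_long_t in the kernel */
	uint64_t ddsp_dsq_id;
	uint64_t ddsp_enq_flags;
};

/* True when the member ends at or before the first cache line boundary. */
#define ENDS_IN_FIRST_LINE(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member) <= CACHELINE_BYTES)

/* The build fails if a later reorder pushes a hot field out of cacheline 0. */
static_assert(ENDS_IN_FIRST_LINE(struct scx_entity_head_sketch, dsq),
	      "dsq left cacheline 0");
static_assert(ENDS_IN_FIRST_LINE(struct scx_entity_head_sketch, ops_state),
	      "ops_state left cacheline 0");
static_assert(ENDS_IN_FIRST_LINE(struct scx_entity_head_sketch, ddsp_dsq_id),
	      "ddsp_dsq_id left cacheline 0");
static_assert(ENDS_IN_FIRST_LINE(struct scx_entity_head_sketch, ddsp_enq_flags),
	      "ddsp_enq_flags left cacheline 0");

/* Byte offset just past the last hot field in the stand-in struct. */
size_t hot_fields_end(void)
{
	return offsetof(struct scx_entity_head_sketch, ddsp_enq_flags) +
	       sizeof(uint64_t);
}
```

The check is on the end of each member rather than its start, so a field
that begins in cacheline 0 but straddles the 64-byte boundary would also
be caught.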
If you want, I might follow up with perf stat cache-miss numbers
(hackbench/schbench under scx_simple) once I can test on appropriate
hardware.
On Tue, 24 Feb 2026 at 17:43, Tejun Heo <tj@kernel.org> wrote:
>
> On Tue, Feb 24, 2026 at 05:56:37AM +0000, David Carlier wrote:
> > Reorder struct sched_ext_entity to place ops_state, ddsp_dsq_id, and
> > ddsp_enq_flags immediately after dsq. These fields are accessed together
> > in the do_enqueue_task() and finish_dispatch() hot paths but were
> > previously spread across three different cache lines. Grouping them on
> > the same cache line reduces cache misses on every enqueue and dispatch
> > operation.
> >
> > Signed-off-by: David Carlier <devnexen@gmail.com>
>
> Were you able to measure any difference by any chance?
>
> Thanks.
>
> --
> tejun