* [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality
@ 2026-02-24  5:56 David Carlier
  2026-02-24 17:43 ` Tejun Heo
  2026-02-24 18:06 ` Tejun Heo
  0 siblings, 2 replies; 4+ messages in thread
From: David Carlier @ 2026-02-24  5:56 UTC (permalink / raw)
  To: Tejun Heo, David Vernet; +Cc: linux-kernel, David Carlier

Reorder struct sched_ext_entity to place ops_state, ddsp_dsq_id, and
ddsp_enq_flags immediately after dsq. These fields are accessed together
in the do_enqueue_task() and finish_dispatch() hot paths but were
previously spread across three different cache lines. Grouping them on
the same cache line reduces cache misses on every enqueue and dispatch
operation.

Signed-off-by: David Carlier <devnexen@gmail.com>
---
 include/linux/sched/ext.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched/ext.h b/include/linux/sched/ext.h
index bcb962d5ee7d..80e70a9642fd 100644
--- a/include/linux/sched/ext.h
+++ b/include/linux/sched/ext.h
@@ -162,6 +162,9 @@ struct scx_dsq_list_node {
  */
 struct sched_ext_entity {
 	struct scx_dispatch_q	*dsq;
+	atomic_long_t		ops_state;
+	u64			ddsp_dsq_id;
+	u64			ddsp_enq_flags;
 	struct scx_dsq_list_node dsq_list;	/* dispatch order */
 	struct rb_node		dsq_priq;	/* p->scx.dsq_vtime order */
 	u32			dsq_seq;
@@ -173,7 +176,6 @@ struct sched_ext_entity {
 	s32			selected_cpu;
 	u32			kf_mask;	/* see scx_kf_mask above */
 	struct task_struct	*kf_tasks[2];	/* see SCX_CALL_OP_TASK() */
-	atomic_long_t		ops_state;
 	struct list_head	runnable_node;	/* rq->scx.runnable_list */
 	unsigned long		runnable_at;

@@ -181,8 +183,6 @@ struct sched_ext_entity {
 #ifdef CONFIG_SCHED_CORE
 	u64			core_sched_at;	/* see scx_prio_less() */
 #endif
-	u64			ddsp_dsq_id;
-	u64			ddsp_enq_flags;

 	/* BPF scheduler modifiable fields */
-- 
2.51.0

^ permalink raw reply related	[flat|nested] 4+ messages in thread
* Re: [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality
  2026-02-24  5:56 [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality David Carlier
@ 2026-02-24 17:43 ` Tejun Heo
  2026-02-24 18:24   ` David CARLIER
  2026-02-24 18:06 ` Tejun Heo
  1 sibling, 1 reply; 4+ messages in thread
From: Tejun Heo @ 2026-02-24 17:43 UTC (permalink / raw)
  To: David Carlier; +Cc: David Vernet, linux-kernel

On Tue, Feb 24, 2026 at 05:56:37AM +0000, David Carlier wrote:
> Reorder struct sched_ext_entity to place ops_state, ddsp_dsq_id, and
> ddsp_enq_flags immediately after dsq. These fields are accessed together
> in the do_enqueue_task() and finish_dispatch() hot paths but were
> previously spread across three different cache lines. Grouping them on
> the same cache line reduces cache misses on every enqueue and dispatch
> operation.
>
> Signed-off-by: David Carlier <devnexen@gmail.com>

Were you able to measure any difference by any chance?

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality
  2026-02-24 17:43 ` Tejun Heo
@ 2026-02-24 18:24   ` David CARLIER
  0 siblings, 0 replies; 4+ messages in thread
From: David CARLIER @ 2026-02-24 18:24 UTC (permalink / raw)
  To: Tejun Heo; +Cc: David Vernet, linux-kernel

Thanks for merging. I honestly haven't run a formal benchmark yet, but
here is the pahole output before and after the patch, both built from the
same tree (compiled with "make kernel/sched/core.o", which pulls in
sched_ext_entity via sched/ext.h):

Before:

struct sched_ext_entity {
	struct scx_dispatch_q *    dsq;                  /*     0     8 */
	struct scx_dsq_list_node   dsq_list;             /*     8    24 */
	struct rb_node             dsq_priq;             /*    32    24 */
	u32                        dsq_seq;              /*    56     4 */
	u32                        dsq_flags;            /*    60     4 */
	/* --- cacheline 1 boundary (64 bytes) --- */
	u32                        flags;                /*    64     4 */
	u32                        weight;               /*    68     4 */
	s32                        sticky_cpu;           /*    72     4 */
	s32                        holding_cpu;          /*    76     4 */
	s32                        selected_cpu;         /*    80     4 */
	u32                        kf_mask;              /*    84     4 */
	struct task_struct *       kf_tasks[2];          /*    88    16 */
	atomic_long_t              ops_state;            /*   104     8 */
	struct list_head           runnable_node;        /*   112    16 */
	/* --- cacheline 2 boundary (128 bytes) --- */
	long unsigned int          runnable_at;          /*   128     8 */
	u64                        core_sched_at;        /*   136     8 */
	u64                        ddsp_dsq_id;          /*   144     8 */
	u64                        ddsp_enq_flags;       /*   152     8 */
	u64                        slice;                /*   160     8 */
	u64                        dsq_vtime;            /*   168     8 */
	bool                       disallow;             /*   176     1 */

	/* XXX 7 bytes hole, try to pack */

	struct cgroup *            cgrp_moving_from;     /*   184     8 */
	/* --- cacheline 3 boundary (192 bytes) --- */
	struct list_head           tasks_node;           /*   192    16 */

	/* size: 208, cachelines: 4, members: 23 */
	/* sum members: 201, holes: 1, sum holes: 7 */
};

dsq sits at offset 0 (cacheline 0), ops_state at offset 104 (cacheline 1),
and ddsp_dsq_id/ddsp_enq_flags at offsets 144-152 (cacheline 2) -- three
cache lines touched on every do_enqueue_task(), finish_dispatch(), and
direct_dispatch() call.
After:

struct sched_ext_entity {
	struct scx_dispatch_q *    dsq;                  /*     0     8 */
	atomic_long_t              ops_state;            /*     8     8 */
	u64                        ddsp_dsq_id;          /*    16     8 */
	u64                        ddsp_enq_flags;       /*    24     8 */
	struct scx_dsq_list_node   dsq_list;             /*    32    24 */
	struct rb_node             dsq_priq;             /*    56    24 */
	/* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
	u32                        dsq_seq;              /*    80     4 */
	u32                        dsq_flags;            /*    84     4 */
	u32                        flags;                /*    88     4 */
	u32                        weight;               /*    92     4 */
	s32                        sticky_cpu;           /*    96     4 */
	s32                        holding_cpu;          /*   100     4 */
	s32                        selected_cpu;         /*   104     4 */
	u32                        kf_mask;              /*   108     4 */
	struct task_struct *       kf_tasks[2];          /*   112    16 */
	/* --- cacheline 2 boundary (128 bytes) --- */
	struct list_head           runnable_node;        /*   128    16 */
	long unsigned int          runnable_at;          /*   144     8 */
	u64                        core_sched_at;        /*   152     8 */
	u64                        slice;                /*   160     8 */
	u64                        dsq_vtime;             /*   168     8 */
	bool                       disallow;             /*   176     1 */

	/* XXX 7 bytes hole, try to pack */

	struct cgroup *            cgrp_moving_from;     /*   184     8 */
	/* --- cacheline 3 boundary (192 bytes) --- */
	struct list_head           tasks_node;           /*   192    16 */

	/* size: 208, cachelines: 4, members: 23 */
	/* sum members: 201, holes: 1, sum holes: 7 */
};

All four hot-path fields now sit within the first 32 bytes of cacheline 0.
Struct size and total cacheline count are unchanged (208 bytes, 4
cachelines) -- it is purely a field reorder.

If you want, I might follow up with perf stat cache-miss numbers
(hackbench/schbench under scx_simple) once I can test on appropriate
hardware.

On Tue, 24 Feb 2026 at 17:43, Tejun Heo <tj@kernel.org> wrote:
>
> On Tue, Feb 24, 2026 at 05:56:37AM +0000, David Carlier wrote:
> > Reorder struct sched_ext_entity to place ops_state, ddsp_dsq_id, and
> > ddsp_enq_flags immediately after dsq. These fields are accessed together
> > in the do_enqueue_task() and finish_dispatch() hot paths but were
> > previously spread across three different cache lines. Grouping them on
> > the same cache line reduces cache misses on every enqueue and dispatch
> > operation.
> >
> > Signed-off-by: David Carlier <devnexen@gmail.com>
>
> Were you able to measure any difference by any chance?
>
> Thanks.
>
> --
> tejun

^ permalink raw reply	[flat|nested] 4+ messages in thread
* Re: [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality
  2026-02-24  5:56 [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality David Carlier
  2026-02-24 17:43 ` Tejun Heo
@ 2026-02-24 18:06 ` Tejun Heo
  1 sibling, 0 replies; 4+ messages in thread
From: Tejun Heo @ 2026-02-24 18:06 UTC (permalink / raw)
  To: David Carlier; +Cc: David Vernet, linux-kernel, emil

> Reorder struct sched_ext_entity to place ops_state, ddsp_dsq_id, and
> ddsp_enq_flags immediately after dsq. These fields are accessed together
> in the do_enqueue_task() and finish_dispatch() hot paths but were
> previously spread across three different cache lines. Grouping them on
> the same cache line reduces cache misses on every enqueue and dispatch
> operation.

Applied to sched_ext/for-7.1 with the subject line capitalized.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-02-24 18:24 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-02-24  5:56 [PATCH] sched_ext: optimize sched_ext_entity layout for cache locality David Carlier
2026-02-24 17:43 ` Tejun Heo
2026-02-24 18:24   ` David CARLIER
2026-02-24 18:06 ` Tejun Heo