public inbox for sched-ext@lists.linux.dev
 help / color / mirror / Atom feed
* [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing
@ 2026-03-31  9:18 Cheng-Yang Chou
  2026-03-31  9:24 ` Cheng-Yang Chou
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Cheng-Yang Chou @ 2026-03-31  9:18 UTC (permalink / raw)
  To: sched-ext, Tejun Heo, David Vernet, Andrea Righi, Changwoo Min
  Cc: Ching-Chun Huang, Chia-Ping Tsai, yphbchou0911

scx_alloc_free_idx() zeroes the payload of a freed arena allocation
one word at a time. The loop bound was alloc->pool.elem_size / 8, but
elem_size includes sizeof(struct sdt_data) (the 8-byte union sdt_id
header). This caused the loop to write one extra u64 past the
allocation, corrupting the tid field of the adjacent pool element.

Fix the loop bound to (elem_size - sizeof(struct sdt_data)) / 8 so
only the payload portion is zeroed.

Test plan:
- Add a temporary sanity check in scx_task_free() before the free call:

  if (mval->data->tid.idx != mval->tid.idx)
      scx_bpf_error("tid corruption: arena=%d storage=%d",
                    mval->data->tid.idx, (int)mval->tid.idx);

- stress-ng --fork 100 -t 10 & sudo ./build/bin/scx_sdt

Without this fix, running scx_sdt under fork-heavy load triggers the
corruption error. With the fix applied, the same workload completes
without error.

Fixes: 36929ebd17ae ("tools/sched_ext: add arena based scheduler")
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
---
 tools/sched_ext/scx_sdt.bpf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/sched_ext/scx_sdt.bpf.c b/tools/sched_ext/scx_sdt.bpf.c
index 10248b71ef02..a1e33e6c412b 100644
--- a/tools/sched_ext/scx_sdt.bpf.c
+++ b/tools/sched_ext/scx_sdt.bpf.c
@@ -317,7 +317,8 @@ int scx_alloc_free_idx(struct scx_allocator *alloc, __u64 idx)
 		};
 
 		/* Zero out one word at a time. */
-		for (i = zero; i < alloc->pool.elem_size / 8 && can_loop; i++) {
+		for (i = zero; i < (alloc->pool.elem_size - sizeof(struct sdt_data)) / 8
+		     && can_loop; i++) {
 			data->payload[i] = 0;
 		}
 	}
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing
  2026-03-31  9:18 [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing Cheng-Yang Chou
@ 2026-03-31  9:24 ` Cheng-Yang Chou
  2026-03-31  9:42   ` Andrea Righi
  2026-04-04  6:14 ` Cheng-Yang Chou
  2026-04-06 18:09 ` Tejun Heo
  2 siblings, 1 reply; 7+ messages in thread
From: Cheng-Yang Chou @ 2026-03-31  9:24 UTC (permalink / raw)
  To: sched-ext, Tejun Heo, David Vernet, Andrea Righi, Changwoo Min
  Cc: Ching-Chun Huang, Chia-Ping Tsai

On Tue, Mar 31, 2026 at 05:18:33PM +0800, Cheng-Yang Chou wrote:
> Test plan:
> - Add a temporary sanity check in scx_task_free() before the free call:
> 
>   if (mval->data->tid.idx != mval->tid.idx)
>       scx_bpf_error("tid corruption: arena=%d storage=%d",
>                     mval->data->tid.idx, (int)mval->tid.idx);
> 
> - stress-ng --fork 100 -t 10 & sudo ./build/bin/scx_sdt
> 

While testing in virtme-ng, I encountered a bug (see below).
IIRC, Andrea sent a patch to fix this.
Has the patch landed yet, or is it perhaps in a different tree?

$ vng -m 2G -v --rw
$ stress-ng --fork 100 -t 10 & sudo ./build/bin/scx_sdt
[  111.268570] irq event stamp: 24073
[  111.268577] hardirqs last  enabled at (24073): [<ffffffff98dcc597>] _raw_spin_unlock_irqrestore+0x57/0x80
[  111.268654] =============================
[  111.268655] [ BUG: Invalid wait context ]
[  111.268732] hardirqs last disabled at (24072): [<ffffffff98dcc277>] _raw_spin_lock_irqsave+0x57/0x60
[  111.268882] 7.0.0-rc2-g94555ca6d0bb #8 Not tainted
[  111.268952] softirqs last  enabled at (24060): [<ffffffff957da30c>] fpu_clone+0xdc/0x520
[  111.269016] -----------------------------
[  111.269180] softirqs last disabled at (24058): [<ffffffff957da2b4>] fpu_clone+0x84/0x520
[  111.269282] stress-ng-fork/703 is trying to lock:
[  111.269762] ffffffff9aa64650 (rcu_tasks_trace_srcu_struct_srcu_usage.lock){....}-{3:3}, at: spin_lock_irqsave_sdp_contention+0x107/0x260
[  111.269996] other info that might help us debug this:
[  111.270098] context-{5:5}
[  111.270149] 3 locks held by stress-ng-fork/703:
[  111.270279]  #0: ffff8880060a2820 (&p->pi_lock){-.-.}-{2:2}, at: _task_rq_lock+0x6c/0x4d0
[  111.270563]  #1: ffff88806cfc9b20 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x31/0x150
[  111.270747]  #2: ffffffff9aa65560 (rcu_read_lock){....}-{1:3}, at: __bpf_prog_enter+0x97/0x290
[  111.270933] stack backtrace:
[  111.271009] CPU: 23 UID: 1000 PID: 703 Comm: stress-ng-fork Not tainted 7.0.0-rc2-g94555ca6d0bb #8 PREEMPT(full) 
[  111.271014] Hardware name: QEMU Ubuntu 25.04 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[  111.271017] Sched_ext: sdt (enabled+all), task: runnable_at=-8ms
[  111.271019] Call Trace:
[  111.271021]  <TASK>
[  111.271023]  dump_stack_lvl+0x6f/0xb0
[  111.271028]  __lock_acquire+0xfaf/0x1f30
[  111.271034]  lock_acquire+0x1b4/0x360
[  111.271037]  ? spin_lock_irqsave_sdp_contention+0x107/0x260
[  111.271042]  ? __pfx_do_raw_spin_trylock+0x10/0x10
[  111.271046]  _raw_spin_lock_irqsave+0x3f/0x60
[  111.271050]  ? spin_lock_irqsave_sdp_contention+0x107/0x260
[  111.271053]  spin_lock_irqsave_sdp_contention+0x107/0x260
[  111.271057]  srcu_gp_start_if_needed+0x1b9/0xcd0
[  111.271062]  ? __pfx_bpf_selem_unlink_storage_nolock_misc+0x10/0x10
[  111.271065]  ? __pfx_srcu_gp_start_if_needed+0x10/0x10
[  111.271070]  ? bpf_selem_unlink_storage_nolock+0x92/0x340
[  111.271074]  bpf_selem_unlink+0x5c7/0xab0
[  111.271078]  ? __pfx_bpf_selem_unlink+0x10/0x10
[  111.271084]  bpf_task_storage_delete+0x3a/0x90
[  111.271088]  bpf_prog_c5eb66872eaef407_sdt_exit_task+0x257/0x25e
[  111.271092]  bpf__sched_ext_ops_exit_task+0x4b/0xa7
[  111.271096]  __scx_disable_and_exit_task+0x2b5/0x6f0
[  111.271100]  ? __pfx___scx_disable_and_exit_task+0x10/0x10
[  111.271105]  scx_disable_and_exit_task+0x16/0x110
[  111.271109]  sched_ext_dead+0x1b5/0x3f0
[  111.271112]  ? __pfx_sched_ext_dead+0x10/0x10
[  111.271115]  ? finish_task_switch.isra.0+0x69d/0xa30
[  111.271118]  ? kmem_cache_free+0x10b/0x5b0
[  111.271121]  ? finish_task_switch.isra.0+0x20b/0xa30
[  111.271125]  finish_task_switch.isra.0+0x70b/0xa30
[  111.271129]  __schedule+0x2ab1/0x62f0
[  111.271137]  ? __pfx___schedule+0x10/0x10
[  111.271139]  ? find_held_lock+0x2b/0x80
[  111.271142]  ? schedule+0x2c4/0x390
[  111.271147]  schedule+0xe2/0x390
[  111.271150]  do_wait+0x14c/0x460
[  111.271154]  kernel_wait4+0x103/0x1e0
[  111.271157]  ? __pfx_kernel_wait4+0x10/0x10
[  111.271162]  ? __pfx_child_wait_callback+0x10/0x10
[  111.271166]  ? __pfx___handle_mm_fault+0x10/0x10
[  111.271171]  __do_sys_wait4+0x109/0x120
[  111.271175]  ? __pfx___do_sys_wait4+0x10/0x10
[  111.271181]  ? exc_page_fault+0x66/0xc0
[  111.271184]  ? handle_mm_fault+0x151/0x430
[  111.271193]  do_syscall_64+0xfc/0x670
[  111.271197]  ? irqentry_exit+0xdf/0x570
[  111.271200]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  111.271204] RIP: 0033:0x7f66078e1ba6
[  111.271207] Code: 00 00 48 8b 15 53 12 17 00 64 89 02 48 c7 c2 ff ff ff ff 48 8b 5d f8 c9 48 89 d0 c3 0f 1f 84 00 00 00 00 00 48 8b 45 10 0f 05 <48> 63 d0 3d 00 f0 ff ff 77 10 48 8b 5d f8 48 89 d0 c9 c3 0f 1f 80
[  111.271210] RSP: 002b:00007fff733cdb60 EFLAGS: 00000202 ORIG_RAX: 000000000000003d
[  111.271213] RAX: ffffffffffffffda RBX: 00007f6606ff6b00 RCX: 00007f66078e1ba6
[  111.271215] RDX: 0000000000000000 RSI: 00007fff733cdc14 RDI: 0000000000000a5c
[  111.271217] RBP: 00007fff733cdb70 R08: 0000000000000000 R09: 0000000000000000
[  111.271219] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000000
[  111.271221] R13: 00007f6606ff66e8 R14: 0000000000000a5c R15: 00007fff733cdc14
[  111.271226]  </TASK>

-- 
Thanks,
Cheng-Yang

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing
  2026-03-31  9:24 ` Cheng-Yang Chou
@ 2026-03-31  9:42   ` Andrea Righi
  2026-03-31 10:58     ` Cheng-Yang Chou
  0 siblings, 1 reply; 7+ messages in thread
From: Andrea Righi @ 2026-03-31  9:42 UTC (permalink / raw)
  To: Cheng-Yang Chou
  Cc: sched-ext, Tejun Heo, David Vernet, Changwoo Min,
	Ching-Chun Huang, Chia-Ping Tsai

Hi Cheng-Yang,

On Tue, Mar 31, 2026 at 05:24:35PM +0800, Cheng-Yang Chou wrote:
> On Tue, Mar 31, 2026 at 05:18:33PM +0800, Cheng-Yang Chou wrote:
> > Test plan:
> > - Add a temporary sanity check in scx_task_free() before the free call:
> > 
> >   if (mval->data->tid.idx != mval->tid.idx)
> >       scx_bpf_error("tid corruption: arena=%d storage=%d",
> >                     mval->data->tid.idx, (int)mval->tid.idx);
> > 
> > - stress-ng --fork 100 -t 10 & sudo ./build/bin/scx_sdt
> > 
> 
> While testing in virtme-ng, I encountered a bug (see below).
> IIRC, Andrea sent a patch to fix this.
> Has the patch landed yet, or is it perhaps in a different tree?
> 
> $ vng -m 2G -v --rw
> $ stress-ng --fork 100 -t 10 & sudo ./build/bin/scx_sdt
> [  111.268570] irq event stamp: 24073
> [  111.268577] hardirqs last  enabled at (24073): [<ffffffff98dcc597>] _raw_spin_unlock_irqrestore+0x57/0x80
> [  111.268654] =============================
> [  111.268655] [ BUG: Invalid wait context ]
> [  111.268732] hardirqs last disabled at (24072): [<ffffffff98dcc277>] _raw_spin_lock_irqsave+0x57/0x60
> [  111.268882] 7.0.0-rc2-g94555ca6d0bb #8 Not tainted
> [  111.268952] softirqs last  enabled at (24060): [<ffffffff957da30c>] fpu_clone+0xdc/0x520
> [  111.269016] -----------------------------
> [  111.269180] softirqs last disabled at (24058): [<ffffffff957da2b4>] fpu_clone+0x84/0x520
> [  111.269282] stress-ng-fork/703 is trying to lock:
> [  111.269762] ffffffff9aa64650 (rcu_tasks_trace_srcu_struct_srcu_usage.lock){....}-{3:3}, at: spin_lock_irqsave_sdp_contention+0x107/0x260

Yes, landed upstream, you need these:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=175b45ed343a9c547b5f45293d3ea08d38a7b6f4
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=61bbcfb50514a8a94e035a7349697a3790ab4783
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7c405fb3279b39244b260b54f1bd6488689ae235

-Andrea

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing
  2026-03-31  9:42   ` Andrea Righi
@ 2026-03-31 10:58     ` Cheng-Yang Chou
  0 siblings, 0 replies; 7+ messages in thread
From: Cheng-Yang Chou @ 2026-03-31 10:58 UTC (permalink / raw)
  To: Andrea Righi
  Cc: sched-ext, Tejun Heo, David Vernet, Changwoo Min,
	Ching-Chun Huang, Chia-Ping Tsai

Hi Andrea,

On Tue, Mar 31, 2026 at 11:42:39AM +0200, Andrea Righi wrote:
> 
> Yes, landed upstream, you need these:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=175b45ed343a9c547b5f45293d3ea08d38a7b6f4
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=61bbcfb50514a8a94e035a7349697a3790ab4783
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7c405fb3279b39244b260b54f1bd6488689ae235
> 

Applied all, and tested w/o errors.
Thanks!

-- 
Thanks,
Cheng-Yang

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing
  2026-03-31  9:18 [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing Cheng-Yang Chou
  2026-03-31  9:24 ` Cheng-Yang Chou
@ 2026-04-04  6:14 ` Cheng-Yang Chou
  2026-04-04 16:40   ` Emil Tsalapatis
  2026-04-06 18:09 ` Tejun Heo
  2 siblings, 1 reply; 7+ messages in thread
From: Cheng-Yang Chou @ 2026-04-04  6:14 UTC (permalink / raw)
  To: sched-ext, Tejun Heo, David Vernet, Andrea Righi, Changwoo Min
  Cc: Ching-Chun Huang, Chia-Ping Tsai

Hi all,

On Tue, Mar 31, 2026 at 05:18:33PM +0800, Cheng-Yang Chou wrote:
> --- a/tools/sched_ext/scx_sdt.bpf.c
> +++ b/tools/sched_ext/scx_sdt.bpf.c
> @@ -317,7 +317,8 @@ int scx_alloc_free_idx(struct scx_allocator *alloc, __u64 idx)
>  		};
>  
>  		/* Zero out one word at a time. */
> -		for (i = zero; i < alloc->pool.elem_size / 8 && can_loop; i++) {
> +		for (i = zero; i < (alloc->pool.elem_size - sizeof(struct sdt_data)) / 8
> +		     && can_loop; i++) {
>  			data->payload[i] = 0;
>  		}

Same fix landed in upstream SCX:
https://github.com/sched-ext/scx/commit/a01197a

Minor, but it does address a real issue. Thanks!

-- 
Thanks,
Cheng-Yang

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing
  2026-04-04  6:14 ` Cheng-Yang Chou
@ 2026-04-04 16:40   ` Emil Tsalapatis
  0 siblings, 0 replies; 7+ messages in thread
From: Emil Tsalapatis @ 2026-04-04 16:40 UTC (permalink / raw)
  To: Cheng-Yang Chou
  Cc: sched-ext, Tejun Heo, David Vernet, Andrea Righi, Changwoo Min,
	Ching-Chun Huang, Chia-Ping Tsai

On Fri, Apr 3, 2026 at 11:14 PM Cheng-Yang Chou <yphbchou0911@gmail.com> wrote:
>
> >
> Hi all,
>
> On Tue, Mar 31, 2026 at 05:18:33PM +0800, Cheng-Yang Chou wrote:
> > --- a/tools/sched_ext/scx_sdt.bpf.c
> > +++ b/tools/sched_ext/scx_sdt.bpf.c
> > @@ -317,7 +317,8 @@ int scx_alloc_free_idx(struct scx_allocator *alloc, __u64 idx)
> >               };
> >
> >               /* Zero out one word at a time. */
> > -             for (i = zero; i < alloc->pool.elem_size / 8 && can_loop; i++) {
> > +             for (i = zero; i < (alloc->pool.elem_size - sizeof(struct sdt_data)) / 8
> > +                  && can_loop; i++) {
> >                       data->payload[i] = 0;
> >               }
>
> Same fix landed in upstream SCX:
> https://github.com/sched-ext/scx/commit/a01197a
>
> Minor, but it does address a real issue. Thanks!
>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>

> --
> Thanks,
> Cheng-Yang
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing
  2026-03-31  9:18 [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing Cheng-Yang Chou
  2026-03-31  9:24 ` Cheng-Yang Chou
  2026-04-04  6:14 ` Cheng-Yang Chou
@ 2026-04-06 18:09 ` Tejun Heo
  2 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2026-04-06 18:09 UTC (permalink / raw)
  To: Cheng-Yang Chou, sched-ext, David Vernet, Andrea Righi,
	Changwoo Min
  Cc: Ching-Chun Huang, Chia-Ping Tsai, Emil Tsalapatis, linux-kernel

> tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing

Applied to sched_ext/for-7.1.

Thanks.

--
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2026-04-06 18:09 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-31  9:18 [PATCH] tools/sched_ext: Fix off-by-one in scx_sdt payload zeroing Cheng-Yang Chou
2026-03-31  9:24 ` Cheng-Yang Chou
2026-03-31  9:42   ` Andrea Righi
2026-03-31 10:58     ` Cheng-Yang Chou
2026-04-04  6:14 ` Cheng-Yang Chou
2026-04-04 16:40   ` Emil Tsalapatis
2026-04-06 18:09 ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox