* [PATCH sched_ext/for-7.1-fixes 0/2] sched_ext: Deny SCX kfuncs to non-SCX struct_ops programs
@ 2026-04-16  6:46 Cheng-Yang Chou
  2026-04-16  6:46 ` [PATCH 1/2] " Cheng-Yang Chou
  2026-04-16  6:46 ` [PATCH 2/2] selftests/sched_ext: Add non_scx_kfunc_deny test Cheng-Yang Chou
  0 siblings, 2 replies; 5+ messages in thread
From: Cheng-Yang Chou @ 2026-04-16  6:46 UTC
  To: sched-ext, Tejun Heo, David Vernet, Andrea Righi, Changwoo Min
  Cc: Ching-Chun Huang, Chia-Ping Tsai, yphbchou0911

As discussed in [1], scx_kfunc_context_filter() currently allows non-SCX
struct_ops programs (e.g. tcp_congestion_ops) to call SCX kfuncs that are
only meaningful inside an SCX scheduler. This is wrong for two reasons.

First, it is semantically incorrect: a TCP congestion control program
has no business calling SCX kfuncs such as scx_bpf_kick_cpu().

Second, with CONFIG_EXT_SUB_SCHED=y, kfuncs like scx_bpf_kick_cpu()
call scx_prog_sched(aux), which retrieves the struct_ops kdata via
bpf_prog_get_assoc_struct_ops() and casts it to struct sched_ext_ops *
before reading ops->priv.  For a non-SCX struct_ops program the kdata
is far smaller than sched_ext_ops, turning the read into an
out-of-bounds access (confirmed with KASAN).
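
For illustration, the size mismatch can be modeled in userspace C. This is
only a sketch: struct small_ops_kdata, struct big_ops_kdata, and
priv_read_in_bounds() are hypothetical stand-ins, and the field layouts are
invented. The only property taken from the description above is that the
priv field of the larger type lies past the end of the smaller object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kdata of a non-SCX struct_ops type. */
struct small_ops_kdata {
	void *cb[4];
	char name[16];
};

/* Hypothetical stand-in for struct sched_ext_ops (layout is invented). */
struct big_ops_kdata {
	void *cb[32];
	char name[128];
	void *priv;	/* models the field read through the cast pointer */
};

/* The model only makes sense if priv lies beyond the small object. */
static_assert(sizeof(struct small_ops_kdata) <=
	      offsetof(struct big_ops_kdata, priv),
	      "priv must lie past the end of the small object");

/*
 * Return true if reading ->priv through a struct big_ops_kdata pointer
 * stays inside an object that is really obj_size bytes long.
 */
static bool priv_read_in_bounds(size_t obj_size)
{
	return offsetof(struct big_ops_kdata, priv) + sizeof(void *) <= obj_size;
}
```

With these stand-ins, the read is in bounds only when the object really has
the larger type, which is the condition the filter change is meant to
guarantee before the kfunc can run.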

Patch 1 extends scx_kfunc_context_filter() to also cover
scx_kfunc_set_any and denies all context-sensitive kfuncs to any
struct_ops program that is not the SCX struct_ops.

Patch 2 adds a selftest that loads a TCP congestion control program
calling scx_bpf_kick_cpu() and verifies the BPF verifier rejects it.

[1] https://lore.kernel.org/r/f2ab3yg5niso6hxqe7sd4jmv5xzdizk3khcspm5bylfbn3mj44@tpyiezvs4cod/

Thanks,
Cheng-Yang

---
Cheng-Yang Chou (2):
  sched_ext: Deny SCX kfuncs to non-SCX struct_ops programs
  selftests/sched_ext: Add non_scx_kfunc_deny test

 kernel/sched/ext.c                            | 25 +++++-----
 tools/testing/selftests/sched_ext/Makefile    |  1 +
 .../sched_ext/non_scx_kfunc_deny.bpf.c        | 44 +++++++++++++++++
 .../selftests/sched_ext/non_scx_kfunc_deny.c  | 47 +++++++++++++++++++
 4 files changed, 106 insertions(+), 11 deletions(-)
 create mode 100644 tools/testing/selftests/sched_ext/non_scx_kfunc_deny.bpf.c
 create mode 100644 tools/testing/selftests/sched_ext/non_scx_kfunc_deny.c

-- 
2.48.1



* [PATCH 1/2] sched_ext: Deny SCX kfuncs to non-SCX struct_ops programs
  2026-04-16  6:46 [PATCH sched_ext/for-7.1-fixes 0/2] sched_ext: Deny SCX kfuncs to non-SCX struct_ops programs Cheng-Yang Chou
@ 2026-04-16  6:46 ` Cheng-Yang Chou
  2026-04-17 18:53   ` Tejun Heo
  2026-04-16  6:46 ` [PATCH 2/2] selftests/sched_ext: Add non_scx_kfunc_deny test Cheng-Yang Chou
  1 sibling, 1 reply; 5+ messages in thread
From: Cheng-Yang Chou @ 2026-04-16  6:46 UTC
  To: sched-ext, Tejun Heo, David Vernet, Andrea Righi, Changwoo Min
  Cc: Ching-Chun Huang, Chia-Ping Tsai, yphbchou0911

scx_kfunc_context_filter() currently allows non-SCX struct_ops programs
(e.g. tcp_congestion_ops) to call SCX kfuncs from the unlocked set and
from scx_kfunc_ids_any, such as scx_bpf_kick_cpu(). This is wrong for
two reasons:

- It is semantically incorrect: a TCP congestion control program has no
  business calling SCX kfuncs such as scx_bpf_kick_cpu().

- With CONFIG_EXT_SUB_SCHED=y, kfuncs like scx_bpf_kick_cpu() call
  scx_prog_sched(aux), which invokes bpf_prog_get_assoc_struct_ops(aux)
  and casts the result to struct sched_ext_ops * before reading ops->priv.
  For a non-SCX struct_ops program the returned pointer is the kdata of
  that struct_ops type, which is far smaller than sched_ext_ops, making
  the read an out-of-bounds access (confirmed with KASAN).

Extend the filter to cover scx_kfunc_set_any as well, and deny all
context-sensitive kfuncs for any struct_ops program that is not the SCX
struct_ops. This addresses both issues: the semantic contract is enforced
at the verifier level, and the runtime out-of-bounds access becomes
unreachable.
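
As a reading aid, the decision order of the filter after this patch can be
modeled in userspace C. This is a simplified sketch, not the kernel code:
enum prog_type, struct kfunc_sets, filter_model(), and the is_scx_ops and
op_allows parameters are hypothetical. The per-op allow-flag lookup is
reduced to a boolean, and the early return that follows the
add_subprog_and_kfunc() comment is left out because its condition is not
visible in the hunk.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical reduction of the BPF program types the filter looks at. */
enum prog_type { PROG_SYSCALL, PROG_STRUCT_OPS, PROG_OTHER };

/* Which SCX kfunc ID sets contain the kfunc being checked. */
struct kfunc_sets {
	bool unlocked, select_cpu, enqueue, dispatch, cpu_release, any;
};

/* Model of the post-patch decision order; 0 allows, -EACCES denies. */
int filter_model(enum prog_type type, bool is_scx_ops, bool op_allows,
		 struct kfunc_sets in)
{
	/* Not in any set the filter governs: allow. */
	if (!(in.unlocked || in.select_cpu || in.enqueue || in.dispatch ||
	      in.cpu_release || in.any))
		return 0;

	if (type == PROG_SYSCALL)
		return (in.unlocked || in.select_cpu || in.any) ? 0 : -EACCES;

	if (type != PROG_STRUCT_OPS)
		return in.any ? 0 : -EACCES;

	/* Non-SCX struct_ops: every governed kfunc is denied. */
	if (!is_scx_ops)
		return -EACCES;

	/* SCX struct_ops: the "any" set is always allowed. */
	if (in.any)
		return 0;

	/* Stand-in for the scx_kf_allow_flags[] lookup. */
	return op_allows ? 0 : -EACCES;
}

/* Usage example covering the cases named in the commit message. */
void filter_model_examples(void)
{
	/* tcp_congestion_ops calling scx_bpf_kick_cpu(): now denied. */
	assert(filter_model(PROG_STRUCT_OPS, false, false,
			    (struct kfunc_sets){ .any = true }) == -EACCES);
	/* Non-SCX struct_ops calling an unlocked kfunc: now denied. */
	assert(filter_model(PROG_STRUCT_OPS, false, false,
			    (struct kfunc_sets){ .unlocked = true }) == -EACCES);
	/* SCX struct_ops calling an "any" kfunc: still allowed. */
	assert(filter_model(PROG_STRUCT_OPS, true, false,
			    (struct kfunc_sets){ .any = true }) == 0);
}
```

The model also shows that non-struct_ops, non-syscall program types keep
access to the "any" set, which matches the "return in_any ? 0 : -EACCES"
branch in the diff below.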

Fixes: d1d3c1c6ae36 ("sched_ext: Add verifier-time kfunc context filter")
Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
---
 kernel/sched/ext.c | 25 ++++++++++++++-----------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
index 012ca8bd70fb..768abc6cf7ad 100644
--- a/kernel/sched/ext.c
+++ b/kernel/sched/ext.c
@@ -9479,6 +9479,7 @@ BTF_KFUNCS_END(scx_kfunc_ids_any)
 static const struct btf_kfunc_id_set scx_kfunc_set_any = {
 	.owner			= THIS_MODULE,
 	.set			= &scx_kfunc_ids_any,
+	.filter			= scx_kfunc_context_filter,
 };
 
 /*
@@ -9530,9 +9531,9 @@ static const u32 scx_kf_allow_flags[] = {
  * .filter field on each per-group btf_kfunc_id_set. The BPF core invokes this
  * for every kfunc call in the registered hook (BPF_PROG_TYPE_STRUCT_OPS or
  * BPF_PROG_TYPE_SYSCALL), regardless of which set originally introduced the
- * kfunc - so the filter must short-circuit on kfuncs it doesn't govern (e.g.
- * scx_kfunc_ids_any) by falling through to "allow" when none of the
- * context-sensitive sets contain the kfunc.
+ * kfunc - so the filter must short-circuit on kfuncs it doesn't govern by
+ * falling through to "allow" when none of the context-sensitive sets contain
+ * the kfunc.
  */
 int scx_kfunc_context_filter(const struct bpf_prog *prog, u32 kfunc_id)
 {
@@ -9541,18 +9542,19 @@ int scx_kfunc_context_filter(const struct bpf_prog *prog, u32 kfunc_id)
 	bool in_enqueue = btf_id_set8_contains(&scx_kfunc_ids_enqueue_dispatch, kfunc_id);
 	bool in_dispatch = btf_id_set8_contains(&scx_kfunc_ids_dispatch, kfunc_id);
 	bool in_cpu_release = btf_id_set8_contains(&scx_kfunc_ids_cpu_release, kfunc_id);
+	bool in_any = btf_id_set8_contains(&scx_kfunc_ids_any, kfunc_id);
 	u32 moff, flags;
 
-	/* Not a context-sensitive kfunc (e.g. from scx_kfunc_ids_any) - allow. */
-	if (!(in_unlocked || in_select_cpu || in_enqueue || in_dispatch || in_cpu_release))
+	/* Not a context-sensitive kfunc - allow. */
+	if (!(in_unlocked || in_select_cpu || in_enqueue || in_dispatch || in_cpu_release || in_any))
 		return 0;
 
 	/* SYSCALL progs (e.g. BPF test_run()) may call unlocked and select_cpu kfuncs. */
 	if (prog->type == BPF_PROG_TYPE_SYSCALL)
-		return (in_unlocked || in_select_cpu) ? 0 : -EACCES;
+		return (in_unlocked || in_select_cpu || in_any) ? 0 : -EACCES;
 
 	if (prog->type != BPF_PROG_TYPE_STRUCT_OPS)
-		return -EACCES;
+		return in_any ? 0 : -EACCES;
 
 	/*
 	 * add_subprog_and_kfunc() collects all kfunc calls, including dead code
@@ -9565,14 +9567,15 @@ int scx_kfunc_context_filter(const struct bpf_prog *prog, u32 kfunc_id)
 		return 0;
 
 	/*
-	 * Non-SCX struct_ops: only unlocked kfuncs are safe. The other
-	 * context-sensitive kfuncs assume the rq lock is held by the SCX
-	 * dispatch path, which doesn't apply to other struct_ops users.
+	 * Non-SCX struct_ops: context-sensitive kfuncs are not permitted.
 	 */
 	if (prog->aux->st_ops != &bpf_sched_ext_ops)
-		return in_unlocked ? 0 : -EACCES;
+		return -EACCES;
 
 	/* SCX struct_ops: check the per-op allow list. */
+	if (in_any)
+		return 0;
+
 	moff = prog->aux->attach_st_ops_member_off;
 	flags = scx_kf_allow_flags[SCX_MOFF_IDX(moff)];
 
-- 
2.48.1



* [PATCH 2/2] selftests/sched_ext: Add non_scx_kfunc_deny test
  2026-04-16  6:46 [PATCH sched_ext/for-7.1-fixes 0/2] sched_ext: Deny SCX kfuncs to non-SCX struct_ops programs Cheng-Yang Chou
  2026-04-16  6:46 ` [PATCH 1/2] " Cheng-Yang Chou
@ 2026-04-16  6:46 ` Cheng-Yang Chou
  1 sibling, 0 replies; 5+ messages in thread
From: Cheng-Yang Chou @ 2026-04-16  6:46 UTC
  To: sched-ext, Tejun Heo, David Vernet, Andrea Righi, Changwoo Min
  Cc: Ching-Chun Huang, Chia-Ping Tsai, yphbchou0911

Verify that the BPF verifier rejects a non-SCX struct_ops program
(tcp_congestion_ops) that attempts to call an SCX kfunc (scx_bpf_kick_cpu).
The test expects the load to fail: scx_kfunc_context_filter() returns
-EACCES for the call, and the test passes on any load error.

Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com>
---
 tools/testing/selftests/sched_ext/Makefile    |  1 +
 .../sched_ext/non_scx_kfunc_deny.bpf.c        | 44 +++++++++++++++++
 .../selftests/sched_ext/non_scx_kfunc_deny.c  | 47 +++++++++++++++++++
 3 files changed, 92 insertions(+)
 create mode 100644 tools/testing/selftests/sched_ext/non_scx_kfunc_deny.bpf.c
 create mode 100644 tools/testing/selftests/sched_ext/non_scx_kfunc_deny.c

diff --git a/tools/testing/selftests/sched_ext/Makefile b/tools/testing/selftests/sched_ext/Makefile
index 789037be44c7..2880a122a214 100644
--- a/tools/testing/selftests/sched_ext/Makefile
+++ b/tools/testing/selftests/sched_ext/Makefile
@@ -190,6 +190,7 @@ auto-test-targets :=			\
 	test_example			\
 	total_bw			\
 	cyclic_kick_wait		\
+	non_scx_kfunc_deny		\
 
 testcase-targets := $(addsuffix .o,$(addprefix $(SCXOBJ_DIR)/,$(auto-test-targets)))
 
diff --git a/tools/testing/selftests/sched_ext/non_scx_kfunc_deny.bpf.c b/tools/testing/selftests/sched_ext/non_scx_kfunc_deny.bpf.c
new file mode 100644
index 000000000000..9f16d39255e7
--- /dev/null
+++ b/tools/testing/selftests/sched_ext/non_scx_kfunc_deny.bpf.c
@@ -0,0 +1,44 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Verify that context-sensitive SCX kfuncs (even "unlocked" ones) are
+ * restricted to only SCX struct_ops programs. Non-SCX struct_ops programs,
+ * such as TCP congestion control programs, should be rejected by the BPF
+ * verifier when attempting to call these kfuncs.
+ *
+ * Copyright (C) 2026 Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
+ * Copyright (C) 2026 Cheng-Yang Chou <yphbchou0911@gmail.com>
+ */
+
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+/* SCX kfunc from scx_kfunc_ids_any set */
+void scx_bpf_kick_cpu(s32 cpu, u64 flags) __ksym;
+
+SEC("struct_ops/ssthresh")
+__u32 BPF_PROG(tcp_ca_ssthresh, struct sock *sk)
+{
+	/*
+	 * This call should be rejected by the verifier because this is a
+	 * TCP congestion control program (non-SCX struct_ops).
+	 */
+	scx_bpf_kick_cpu(0, 0);
+	return 2;
+}
+
+SEC("struct_ops/cong_avoid")
+void BPF_PROG(tcp_ca_cong_avoid, struct sock *sk, __u32 ack, __u32 acked) {}
+
+SEC("struct_ops/undo_cwnd")
+__u32 BPF_PROG(tcp_ca_undo_cwnd, struct sock *sk) { return 2; }
+
+SEC(".struct_ops")
+struct tcp_congestion_ops tcp_non_scx_ca = {
+	.ssthresh   = (void *)tcp_ca_ssthresh,
+	.cong_avoid = (void *)tcp_ca_cong_avoid,
+	.undo_cwnd  = (void *)tcp_ca_undo_cwnd,
+	.name       = "tcp_kfunc_deny",
+};
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/sched_ext/non_scx_kfunc_deny.c b/tools/testing/selftests/sched_ext/non_scx_kfunc_deny.c
new file mode 100644
index 000000000000..1c031575fb87
--- /dev/null
+++ b/tools/testing/selftests/sched_ext/non_scx_kfunc_deny.c
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Verify that context-sensitive SCX kfuncs (even "unlocked" ones) are
+ * restricted to only SCX struct_ops programs. Non-SCX struct_ops programs,
+ * such as TCP congestion control programs, should be rejected by the BPF
+ * verifier when attempting to call these kfuncs.
+ *
+ * Copyright (C) 2026 Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
+ * Copyright (C) 2026 Cheng-Yang Chou <yphbchou0911@gmail.com>
+ */
+
+#include <bpf/bpf.h>
+#include <scx/common.h>
+#include <unistd.h>
+#include <errno.h>
+#include <stdio.h>
+#include "non_scx_kfunc_deny.bpf.skel.h"
+#include "scx_test.h"
+
+static enum scx_test_status run(void *ctx)
+{
+	struct non_scx_kfunc_deny *skel;
+	int err;
+
+	skel = non_scx_kfunc_deny__open();
+	if (!skel) {
+		SCX_ERR("Failed to open skel");
+		return SCX_TEST_FAIL;
+	}
+
+	err = non_scx_kfunc_deny__load(skel);
+	non_scx_kfunc_deny__destroy(skel);
+
+	if (err == 0) {
+		SCX_ERR("non-SCX BPF program loaded when it should have been rejected");
+		return SCX_TEST_FAIL;
+	}
+
+	return SCX_TEST_PASS;
+}
+
+struct scx_test non_scx_kfunc_deny = {
+	.name = "non_scx_kfunc_deny",
+	.description = "Verify that non-SCX struct_ops programs cannot call SCX kfuncs",
+	.run = run,
+};
+REGISTER_SCX_TEST(&non_scx_kfunc_deny)
-- 
2.48.1



* Re: [PATCH 1/2] sched_ext: Deny SCX kfuncs to non-SCX struct_ops programs
  2026-04-16  6:46 ` [PATCH 1/2] " Cheng-Yang Chou
@ 2026-04-17 18:53   ` Tejun Heo
  2026-04-18  6:50     ` Cheng-Yang Chou
  0 siblings, 1 reply; 5+ messages in thread
From: Tejun Heo @ 2026-04-17 18:53 UTC
  To: Cheng-Yang Chou
  Cc: sched-ext, David Vernet, Andrea Righi, Changwoo Min,
	Ching-Chun Huang, Chia-Ping Tsai

Hello,

A few things:

scx_kfunc_set_idle (ext_idle.c) isn't covered by the patch. Its kfuncs
(scx_bpf_cpu_node(), scx_bpf_get_idle_cpumask(), scx_bpf_pick_idle_cpu()
etc.) also call scx_prog_sched(aux) and hit the same OOB pattern on
non-SCX struct_ops programs. Please add an in_idle check in the filter
and treat it the same as in_any.

Separately, please also set .filter on scx_kfunc_set_idle itself. In
practice, the BPF core dedups filters per hook in
btf_populate_kfunc_set(), so the filter is already invoked for idle
kfuncs via the other sets' registrations on the same hook. But it's
confusing to read the code without setting .filter on every set.
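
The per-hook dedup described above can be pictured with a small userspace
model. This is a sketch under assumptions, not the btf_populate_kfunc_set()
implementation: struct hook_filters, hook_add_filter(), hook_check_kfunc(),
deny_42(), and the MAX_FILTERS bound are all hypothetical. It only shows why
one registration of a filter on a hook is enough for that filter to see
every kfunc checked on the hook.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FILTERS 16	/* arbitrary bound for this model */

typedef int (*kfunc_filter_fn)(unsigned int kfunc_id);

/* One hook (e.g. one program type) holds a deduplicated filter list. */
struct hook_filters {
	kfunc_filter_fn filters[MAX_FILTERS];
	size_t nr;
};

/* Add a set's filter to the hook; NULL and duplicates are no-ops. */
static int hook_add_filter(struct hook_filters *h, kfunc_filter_fn fn)
{
	if (!fn)
		return 0;
	for (size_t i = 0; i < h->nr; i++)
		if (h->filters[i] == fn)
			return 0;
	if (h->nr == MAX_FILTERS)
		return -1;
	h->filters[h->nr++] = fn;
	return 0;
}

/* Every filter on the hook runs for every kfunc, whatever set added it. */
static int hook_check_kfunc(const struct hook_filters *h, unsigned int kfunc_id)
{
	for (size_t i = 0; i < h->nr; i++) {
		int ret = h->filters[i](kfunc_id);

		if (ret)
			return ret;
	}
	return 0;
}

/* Example filter for the model: denies kfunc ID 42, allows the rest. */
static int deny_42(unsigned int kfunc_id)
{
	return kfunc_id == 42 ? -1 : 0;
}

/* Usage example: three sets on one hook, two of them share a filter. */
void hook_filters_example(void)
{
	struct hook_filters hook = { .nr = 0 };

	hook_add_filter(&hook, deny_42);	/* set with .filter */
	hook_add_filter(&hook, deny_42);	/* second set, same filter */
	hook_add_filter(&hook, NULL);		/* set without .filter */

	assert(hook.nr == 1);
	assert(hook_check_kfunc(&hook, 42) == -1);
}
```

In this model a set registered without a filter is still covered by the
filter another set put on the same hook, which is the behavior the
paragraph above relies on. Setting .filter on every set changes nothing at
runtime in the model and serves readability only.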

The following line is too long - please break and indent (it gets
longer still once in_idle is added):

> +	if (!(in_unlocked || in_select_cpu || in_enqueue || in_dispatch || in_cpu_release || in_any))

Once idle is covered, every SCX kfunc set ends up in this "context-
sensitive" list. At that point "context-sensitive" no longer
distinguishes anything - it just means "SCX kfuncs". Please update the
function-level comment and the two inline comments ("Not a context-
sensitive kfunc - allow." and "Non-SCX struct_ops: context-sensitive
kfuncs are not permitted.") to drop the term and talk about SCX kfuncs
directly.
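
One possible shape for the requested changes, as a userspace sketch only:
struct scx_kfunc_membership and scx_filter_non_scx_struct_ops() are
hypothetical, the in_idle member assumes the idle set is checked the same
way as the others, and the comment wording is a guess at what "talk about
SCX kfuncs directly" could look like. It models just the non-SCX struct_ops
path, with the long condition wrapped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical: which SCX kfunc ID sets contain the kfunc being checked. */
struct scx_kfunc_membership {
	bool in_unlocked, in_select_cpu, in_enqueue, in_dispatch;
	bool in_cpu_release, in_any, in_idle;
};

/* Models only the path taken by a non-SCX struct_ops program. */
int scx_filter_non_scx_struct_ops(const struct scx_kfunc_membership *m)
{
	/* Not an SCX kfunc - allow. */
	if (!(m->in_unlocked || m->in_select_cpu || m->in_enqueue ||
	      m->in_dispatch || m->in_cpu_release || m->in_any || m->in_idle))
		return 0;

	/* Non-SCX struct_ops: SCX kfuncs are not permitted. */
	return -EACCES;
}

/* Usage example: idle kfuncs are denied just like the "any" set. */
void scx_filter_sketch_examples(void)
{
	struct scx_kfunc_membership idle = { .in_idle = true };
	struct scx_kfunc_membership none = { 0 };

	assert(scx_filter_non_scx_struct_ops(&idle) == -EACCES);
	assert(scx_filter_non_scx_struct_ops(&none) == 0);
}
```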

Thanks.

--
tejun


* Re: [PATCH 1/2] sched_ext: Deny SCX kfuncs to non-SCX struct_ops programs
  2026-04-17 18:53   ` Tejun Heo
@ 2026-04-18  6:50     ` Cheng-Yang Chou
  0 siblings, 0 replies; 5+ messages in thread
From: Cheng-Yang Chou @ 2026-04-18  6:50 UTC
  To: Tejun Heo
  Cc: sched-ext, David Vernet, Andrea Righi, Changwoo Min,
	Ching-Chun Huang, Chia-Ping Tsai

Hi Tejun,

On Fri, Apr 17, 2026 at 08:53:40AM -1000, Tejun Heo wrote:
> A few things:
> 
> scx_kfunc_set_idle (ext_idle.c) isn't covered by the patch. Its kfuncs
> (scx_bpf_cpu_node(), scx_bpf_get_idle_cpumask(), scx_bpf_pick_idle_cpu()
> etc.) also call scx_prog_sched(aux) and hit the same OOB pattern on
> non-SCX struct_ops programs. Please add an in_idle check in the filter
> and treat it the same as in_any.

Ack.

> Separately, please also set .filter on scx_kfunc_set_idle itself. In
> practice, the BPF core dedups filters per hook in
> btf_populate_kfunc_set(), so the filter is already invoked for idle
> kfuncs via the other sets' registrations on the same hook. But it's
> confusing to read the code without setting .filter on every set.

Agree, will do.

> The following line is too long - please break and indent (it gets
> longer still once in_idle is added):
> 
> > +	if (!(in_unlocked || in_select_cpu || in_enqueue || in_dispatch || in_cpu_release || in_any))

Ack.

> Once idle is covered, every SCX kfunc set ends up in this "context-
> sensitive" list. At that point "context-sensitive" no longer
> distinguishes anything - it just means "SCX kfuncs". Please update the
> function-level comment and the two inline comments ("Not a context-
> sensitive kfunc - allow." and "Non-SCX struct_ops: context-sensitive
> kfuncs are not permitted.") to drop the term and talk about SCX kfuncs
> directly.

Ahh, thanks for pointing that out! I'll reword them and send a v2 patch.

-- 
Thanks,
Cheng-Yang

