From: Andrea Righi <arighi@nvidia.com>
To: "zhidao su (Xiaomi)" <soolaugust@gmail.com>
Cc: Tejun Heo <tj@kernel.org>, David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Valentin Schneider <vschneid@redhat.com>,
K Prateek Nayak <kprateek.nayak@amd.com>,
Shuah Khan <shuah@kernel.org>,
Joel Fernandes <joelagnelf@nvidia.com>,
Christian Loehle <christian.loehle@arm.com>,
zhidao su <suzhidao@xiaomi.com>,
Cheng-Yang Chou <yphbchou0911@gmail.com>,
Zhao Mengmeng <zhaomengmeng@kylinos.cn>,
linux-kernel@vger.kernel.org, sched-ext@lists.linux.dev,
linux-kselftest@vger.kernel.org, bpf@vger.kernel.org
Subject: Re: [PATCH] sched_ext: Reject task setter kfuncs outside SCX contexts
Date: Wed, 1 Jul 2026 18:47:32 +0200 [thread overview]
Message-ID: <akVEpEArgmcEJVqA@gpd4> (raw)
In-Reply-To: <20260701091954.384565-1-soolaugust@gmail.com>
Hi zhidao,
On Wed, Jul 01, 2026 at 05:19:54PM +0800, zhidao su (Xiaomi) wrote:
> scx_bpf_task_set_slice() and scx_bpf_task_set_dsq_vtime() mutate
> sched_ext task state and rely on scx_prog_sched() and
> scx_task_on_sched() to validate the calling scheduler's authority over
> the target task.
>
> However, the kfuncs were part of scx_kfunc_ids_any. That set is also
> registered for tracing and syscall programs and the verifier-time
> context filter allows any-set kfuncs from non-STRUCT_OPS program types.
> As a result, a non-SCX tracing program can load after calling the task
> setter wrappers, reaching the mutator kfuncs outside an SCX scheduler
> context.
>
> Split the task setter kfuncs into their own filtered kfunc set. Register
> that set for STRUCT_OPS, TRACING, and SYSCALL so the verifier can find
> the kfuncs, but let scx_kfunc_context_filter() reject them outside SCX
> struct_ops contexts. SCX schedulers can still call the setters, and the
> existing runtime task authority checks remain in place.
>
> Add a negative sched_ext selftest which tries to call both task setter
> kfuncs from a tracing program. The test fails before this change because
> the program loads successfully, and passes after this change because the
> verifier rejects the task setter call.
>
> Signed-off-by: zhidao su (Xiaomi) <soolaugust@gmail.com>
This patch doesn't apply cleanly, it looks like it's based on old for-7.2
branch. Can you rebase it to a more recent kernel (we've moved some code
around)?
In general, the restriction makes sense to me. It may be worth clarifying in the
description that this change isn't expected to introduce any runtime overhead,
as everything is handled by the verifier.
Few comments below.
>
> ---
> Validation:
> - Before fix: task_setter_kfunc_deny failed because non-SCX BPF loaded after calling task setter kfuncs
> - After fix: runner -t task_setter_kfunc_deny passed (PASSED=1 FAILED=0)
> - checkpatch --strict: 0 errors, 1 warning (MAINTAINERS new-file prompt), 0 checks
>
> kernel/sched/ext.c | 24 ++++++++++++++++++------
> tools/testing/selftests/sched_ext/Makefile | 1 +
> tools/testing/selftests/sched_ext/task_setter_kfunc_deny.bpf.c | 17 +++++++++++++++++
> tools/testing/selftests/sched_ext/task_setter_kfunc_deny.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 113 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index b8dd3358959d..257ffaca45f6 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -10254,9 +10254,12 @@ __bpf_kfunc struct cgroup *scx_bpf_task_cgroup(struct task_struct *p,
>
> __bpf_kfunc_end_defs();
>
> -BTF_KFUNCS_START(scx_kfunc_ids_any)
> +BTF_KFUNCS_START(scx_kfunc_ids_task_setter)
> BTF_ID_FLAGS(func, scx_bpf_task_set_slice, KF_IMPLICIT_ARGS | KF_RCU);
> BTF_ID_FLAGS(func, scx_bpf_task_set_dsq_vtime, KF_IMPLICIT_ARGS | KF_RCU);
> +BTF_KFUNCS_END(scx_kfunc_ids_task_setter)
> +
> +BTF_KFUNCS_START(scx_kfunc_ids_any)
> BTF_ID_FLAGS(func, scx_bpf_kick_cpu, KF_IMPLICIT_ARGS)
> BTF_ID_FLAGS(func, scx_bpf_kick_cid, KF_IMPLICIT_ARGS)
> BTF_ID_FLAGS(func, scx_bpf_dsq_nr_queued, KF_IMPLICIT_ARGS)
> @@ -10299,6 +10302,12 @@ BTF_ID_FLAGS(func, scx_bpf_task_cgroup, KF_IMPLICIT_ARGS | KF_RCU | KF_ACQUIRE)
> #endif
> BTF_KFUNCS_END(scx_kfunc_ids_any)
>
> +static const struct btf_kfunc_id_set scx_kfunc_set_task_setter = {
> + .owner = THIS_MODULE,
> + .set = &scx_kfunc_ids_task_setter,
> + .filter = scx_kfunc_context_filter,
> +};
> +
> static const struct btf_kfunc_id_set scx_kfunc_set_any = {
> .owner = THIS_MODULE,
> .set = &scx_kfunc_ids_any,
> @@ -10408,13 +10417,14 @@ int scx_kfunc_context_filter(const struct bpf_prog *prog, u32 kfunc_id)
> bool in_dispatch = btf_id_set8_contains(&scx_kfunc_ids_dispatch, kfunc_id);
> bool in_cpu_release = btf_id_set8_contains(&scx_kfunc_ids_cpu_release, kfunc_id);
> bool in_idle = btf_id_set8_contains(&scx_kfunc_ids_idle, kfunc_id);
> + bool in_task_setter = btf_id_set8_contains(&scx_kfunc_ids_task_setter, kfunc_id);
> bool in_any = btf_id_set8_contains(&scx_kfunc_ids_any, kfunc_id);
> bool in_cpu_only = btf_id_set8_contains(&scx_kfunc_ids_cpu_only, kfunc_id);
> u32 moff, flags;
>
> /* Not an SCX kfunc - allow. */
> if (!(in_unlocked || in_init || in_select_cpu || in_enqueue || in_dispatch ||
> - in_cpu_release || in_idle || in_any))
> + in_cpu_release || in_idle || in_task_setter || in_any))
> return 0;
>
> /* SYSCALL progs (e.g. BPF test_run()) may call unlocked and select_cpu kfuncs. */
> @@ -10457,7 +10467,7 @@ int scx_kfunc_context_filter(const struct bpf_prog *prog, u32 kfunc_id)
> return -EACCES;
>
> /* SCX struct_ops: check the per-op allow list. */
> - if (in_any || in_idle)
> + if (in_any || in_idle || in_task_setter)
> return 0;
>
> moff = prog->aux->attach_st_ops_member_off;
> @@ -10572,6 +10582,12 @@ static int __init scx_init(void)
> &scx_kfunc_set_unlocked)) ||
> (ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL,
> &scx_kfunc_set_unlocked)) ||
> + (ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
> + &scx_kfunc_set_task_setter)) ||
> + (ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING,
> + &scx_kfunc_set_task_setter)) ||
> + (ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL,
> + &scx_kfunc_set_task_setter)) ||
> (ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS,
> &scx_kfunc_set_any)) ||
> (ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING,
> diff --git a/tools/testing/selftests/sched_ext/Makefile b/tools/testing/selftests/sched_ext/Makefile
> index 5d2dffca0e91..000000000000 100644
> --- a/tools/testing/selftests/sched_ext/Makefile
> +++ b/tools/testing/selftests/sched_ext/Makefile
> @@ -176,6 +176,7 @@ auto-test-targets := \
> maybe_null \
> minimal \
> non_scx_kfunc_deny \
> + task_setter_kfunc_deny \
> numa \
> allowed_cpus \
> peek_dsq \
> diff --git a/tools/testing/selftests/sched_ext/task_setter_kfunc_deny.bpf.c b/tools/testing/selftests/sched_ext/task_setter_kfunc_deny.bpf.c
> new file mode 100644
> index 000000000000..79b619a3445f
> --- /dev/null
> +++ b/tools/testing/selftests/sched_ext/task_setter_kfunc_deny.bpf.c
> @@ -0,0 +1,17 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Verify that non-SCX programs cannot call task-mutating SCX kfuncs.
> + */
> +
> +#include <scx/common.bpf.h>
> +
> +SEC("tp_btf/sched_switch")
> +int BPF_PROG(non_scx_task_setter, bool preempt, struct task_struct *prev,
> + struct task_struct *next, unsigned int prev_state)
> +{
> + scx_bpf_task_set_slice(next, 1000000);
> + scx_bpf_task_set_dsq_vtime(next, 1);
This only verifies that at least one of the two calls is rejected. Once the
verifier rejects the first call, the test does not independently demonstrate
that scx_bpf_task_set_dsq_vtime() is filtered too.
Can we use two BPF progs to test each kfunc?
> + return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> diff --git a/tools/testing/selftests/sched_ext/task_setter_kfunc_deny.c b/tools/testing/selftests/sched_ext/task_setter_kfunc_deny.c
> new file mode 100644
> index 000000000000..1450a7133a5c
> --- /dev/null
> +++ b/tools/testing/selftests/sched_ext/task_setter_kfunc_deny.c
> @@ -0,0 +1,77 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Verify that task-mutating SCX kfuncs are rejected outside SCX contexts.
> + */
> +
> +#include <bpf/libbpf.h>
> +#include <stdarg.h>
> +#include <scx/common.h>
> +#include <stdio.h>
> +#include <string.h>
> +#include "scx_test.h"
> +#include "task_setter_kfunc_deny.bpf.skel.h"
> +
> +static char verifier_log[8192];
> +static size_t verifier_log_pos;
> +
> +static int libbpf_print_fn(enum libbpf_print_level level, const char *format, va_list args)
> +{
> + if (level == LIBBPF_DEBUG && verifier_log_pos < sizeof(verifier_log)) {
I think libbpf emits a failed program's verifier log at LIBBPF_WARN, not
LIBBPF_DEBUG. I think you should capture LIBBPF_WARN, or preferably provide a
verifier buffer directly with bpf_program__set_log_buf().
> + int ret = vsnprintf(verifier_log + verifier_log_pos,
> + sizeof(verifier_log) - verifier_log_pos, format, args);
vsnprintf() returns the required length when truncated, so
verifier_log_pos += ret may advance beyond the buffer size.
> +
> + if (ret > 0)
> + verifier_log_pos += ret;
> + }
> +
> + return 0;
> +}
> +
> +static bool expected_rejection_seen(void)
> +{
> + return strstr(verifier_log, "scx_bpf_task_set_slice") ||
> + strstr(verifier_log, "scx_bpf_task_set_dsq_vtime");
> +}
With separate progs, each load should check for its corresponding kfunc name
rather than accepting either name. That would also ensure both paths are
covered.
Thanks,
-Andrea
next prev parent reply other threads:[~2026-07-01 16:47 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-07-01 9:19 [PATCH] sched_ext: Reject task setter kfuncs outside SCX contexts zhidao su (Xiaomi)
2026-07-01 16:47 ` Andrea Righi [this message]
2026-07-01 18:26 ` Tejun Heo
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=akVEpEArgmcEJVqA@gpd4 \
--to=arighi@nvidia.com \
--cc=bpf@vger.kernel.org \
--cc=bsegall@google.com \
--cc=changwoo@igalia.com \
--cc=christian.loehle@arm.com \
--cc=dietmar.eggemann@arm.com \
--cc=joelagnelf@nvidia.com \
--cc=juri.lelli@redhat.com \
--cc=kprateek.nayak@amd.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=sched-ext@lists.linux.dev \
--cc=shuah@kernel.org \
--cc=soolaugust@gmail.com \
--cc=suzhidao@xiaomi.com \
--cc=tj@kernel.org \
--cc=vincent.guittot@linaro.org \
--cc=void@manifault.com \
--cc=vschneid@redhat.com \
--cc=yphbchou0911@gmail.com \
--cc=zhaomengmeng@kylinos.cn \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox