From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: sched-ext@lists.linux.dev, David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Cheng-Yang Chou <yphbchou0911@gmail.com>,
Juntong Deng <juntong.deng@outlook.com>,
Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
Chia-Ping Tsai <chia7712@gmail.com>,
Emil Tsalapatis <emil@etsalapatis.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/10] sched_ext: Drop TRACING access to select_cpu kfuncs
Date: Fri, 10 Apr 2026 18:04:15 +0200 [thread overview]
Message-ID: <adkffz6gR0vGKgqX@gpd4> (raw)
In-Reply-To: <20260410063046.3556100-2-tj@kernel.org>
Hi Tejun,
On Thu, Apr 09, 2026 at 08:30:37PM -1000, Tejun Heo wrote:
> The select_cpu kfuncs - scx_bpf_select_cpu_dfl(), scx_bpf_select_cpu_and()
> and __scx_bpf_select_cpu_and() - take task_rq_lock() internally. Exposing
> them via scx_kfunc_set_idle to BPF_PROG_TYPE_TRACING is unsafe: arbitrary
> tracing contexts (kprobes, tracepoints, fentry, LSM) may run with @p's
> pi_lock state unknown.
>
> Move them out of scx_kfunc_ids_idle into a new scx_kfunc_ids_select_cpu
> set registered only for STRUCT_OPS and SYSCALL.
In addition of being unsafe it also makes sense from a logical perspective to
not "allocate" idle CPUs from a BPF_PROG_TYPE_TRACING context.
>
> Extracted from a larger verifier-time kfunc context filter patch
> originally written by Juntong Deng.
>
> Original-patch-by: Juntong Deng <juntong.deng@outlook.com>
> Cc: Cheng-Yang Chou <yphbchou0911@gmail.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Thanks,
-Andrea
> ---
> kernel/sched/ext_idle.c | 25 +++++++++++++++++++++----
> 1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index ecf7e09b54ae..cd88aee47bd8 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -1469,9 +1469,6 @@ BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu_node, KF_IMPLICIT_ARGS | KF_RCU)
> BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu, KF_IMPLICIT_ARGS | KF_RCU)
> BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu_node, KF_IMPLICIT_ARGS | KF_RCU)
> BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu, KF_IMPLICIT_ARGS | KF_RCU)
> -BTF_ID_FLAGS(func, __scx_bpf_select_cpu_and, KF_IMPLICIT_ARGS | KF_RCU)
> -BTF_ID_FLAGS(func, scx_bpf_select_cpu_and, KF_RCU)
> -BTF_ID_FLAGS(func, scx_bpf_select_cpu_dfl, KF_IMPLICIT_ARGS | KF_RCU)
> BTF_KFUNCS_END(scx_kfunc_ids_idle)
>
> static const struct btf_kfunc_id_set scx_kfunc_set_idle = {
> @@ -1479,13 +1476,33 @@ static const struct btf_kfunc_id_set scx_kfunc_set_idle = {
> .set = &scx_kfunc_ids_idle,
> };
>
> +/*
> + * The select_cpu kfuncs internally call task_rq_lock() when invoked from an
> + * rq-unlocked context, and thus cannot be safely called from arbitrary tracing
> + * contexts where @p's pi_lock state is unknown. Keep them out of
> + * BPF_PROG_TYPE_TRACING by registering them in their own set which is exposed
> + * only to STRUCT_OPS and SYSCALL programs.
> + */
> +BTF_KFUNCS_START(scx_kfunc_ids_select_cpu)
> +BTF_ID_FLAGS(func, __scx_bpf_select_cpu_and, KF_IMPLICIT_ARGS | KF_RCU)
> +BTF_ID_FLAGS(func, scx_bpf_select_cpu_and, KF_RCU)
> +BTF_ID_FLAGS(func, scx_bpf_select_cpu_dfl, KF_IMPLICIT_ARGS | KF_RCU)
> +BTF_KFUNCS_END(scx_kfunc_ids_select_cpu)
> +
> +static const struct btf_kfunc_id_set scx_kfunc_set_select_cpu = {
> + .owner = THIS_MODULE,
> + .set = &scx_kfunc_ids_select_cpu,
> +};
> +
> int scx_idle_init(void)
> {
> int ret;
>
> ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &scx_kfunc_set_idle) ||
> register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &scx_kfunc_set_idle) ||
> - register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_idle);
> + register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_idle) ||
> + register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &scx_kfunc_set_select_cpu) ||
> + register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_select_cpu);
>
> return ret;
> }
> --
> 2.53.0
>
next prev parent reply other threads:[~2026-04-10 16:04 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-10 6:30 [PATCHSET sched_ext/for-7.1] sched_ext: Add verifier-time kfunc context filter Tejun Heo
2026-04-10 6:30 ` [PATCH 01/10] sched_ext: Drop TRACING access to select_cpu kfuncs Tejun Heo
2026-04-10 16:04 ` Andrea Righi [this message]
2026-04-10 6:30 ` [PATCH 02/10] sched_ext: Add select_cpu kfuncs to scx_kfunc_ids_unlocked Tejun Heo
2026-04-10 16:07 ` Andrea Righi
2026-04-10 17:51 ` [PATCH v2 " Tejun Heo
2026-04-10 6:30 ` [PATCH 03/10] sched_ext: Track @p's rq lock across set_cpus_allowed_scx -> ops.set_cpumask Tejun Heo
2026-04-10 16:12 ` Andrea Righi
2026-04-10 17:51 ` [PATCH v2 " Tejun Heo
2026-04-10 6:30 ` [PATCH 04/10] sched_ext: Fix ops.cgroup_move() invocation kf_mask and rq tracking Tejun Heo
2026-04-10 16:16 ` Andrea Righi
2026-04-10 17:51 ` [PATCH v2 " Tejun Heo
2026-04-10 6:30 ` [PATCH 05/10] sched_ext: Decouple kfunc unlocked-context check from kf_mask Tejun Heo
2026-04-10 16:34 ` Andrea Righi
2026-04-10 17:51 ` [PATCH v2 " Tejun Heo
2026-04-10 6:30 ` [PATCH 06/10] sched_ext: Drop redundant rq-locked check from scx_bpf_task_cgroup() Tejun Heo
2026-04-10 16:36 ` Andrea Righi
2026-04-10 6:30 ` [PATCH 07/10] sched_ext: Add verifier-time kfunc context filter Tejun Heo
2026-04-10 16:49 ` Andrea Righi
2026-04-10 6:30 ` [PATCH 08/10] sched_ext: Remove runtime kfunc mask enforcement Tejun Heo
2026-04-10 16:50 ` Andrea Righi
2026-04-10 6:30 ` [PATCH 09/10] sched_ext: Rename scx_kf_allowed_on_arg_tasks() to scx_kf_arg_task_ok() Tejun Heo
2026-04-10 16:55 ` Andrea Righi
2026-04-10 6:30 ` [PATCH 10/10] sched_ext: Warn on task-based SCX op recursion Tejun Heo
2026-04-10 17:38 ` Andrea Righi
2026-04-10 17:45 ` [PATCHSET sched_ext/for-7.1] sched_ext: Add verifier-time kfunc context filter Andrea Righi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=adkffz6gR0vGKgqX@gpd4 \
--to=arighi@nvidia.com \
--cc=changwoo@igalia.com \
--cc=chia7712@gmail.com \
--cc=emil@etsalapatis.com \
--cc=jserv@ccns.ncku.edu.tw \
--cc=juntong.deng@outlook.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sched-ext@lists.linux.dev \
--cc=tj@kernel.org \
--cc=void@manifault.com \
--cc=yphbchou0911@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox