From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: sched-ext@lists.linux.dev, David Vernet <void@manifault.com>,
Changwoo Min <changwoo@igalia.com>,
Cheng-Yang Chou <yphbchou0911@gmail.com>,
Juntong Deng <juntong.deng@outlook.com>,
Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
Chia-Ping Tsai <chia7712@gmail.com>,
Emil Tsalapatis <emil@etsalapatis.com>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/10] sched_ext: Drop TRACING access to select_cpu kfuncs
Date: Fri, 10 Apr 2026 18:04:15 +0200 [thread overview]
Message-ID: <adkffz6gR0vGKgqX@gpd4> (raw)
In-Reply-To: <20260410063046.3556100-2-tj@kernel.org>
Hi Tejun,
On Thu, Apr 09, 2026 at 08:30:37PM -1000, Tejun Heo wrote:
> The select_cpu kfuncs - scx_bpf_select_cpu_dfl(), scx_bpf_select_cpu_and()
> and __scx_bpf_select_cpu_and() - take task_rq_lock() internally. Exposing
> them via scx_kfunc_set_idle to BPF_PROG_TYPE_TRACING is unsafe: arbitrary
> tracing contexts (kprobes, tracepoints, fentry, LSM) may run with @p's
> pi_lock state unknown.
>
> Move them out of scx_kfunc_ids_idle into a new scx_kfunc_ids_select_cpu
> set registered only for STRUCT_OPS and SYSCALL.
In addition of being unsafe it also makes sense from a logical perspective to
not "allocate" idle CPUs from a BPF_PROG_TYPE_TRACING context.
>
> Extracted from a larger verifier-time kfunc context filter patch
> originally written by Juntong Deng.
>
> Original-patch-by: Juntong Deng <juntong.deng@outlook.com>
> Cc: Cheng-Yang Chou <yphbchou0911@gmail.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>
Reviewed-by: Andrea Righi <arighi@nvidia.com>
Thanks,
-Andrea
> ---
> kernel/sched/ext_idle.c | 25 +++++++++++++++++++++----
> 1 file changed, 21 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index ecf7e09b54ae..cd88aee47bd8 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -1469,9 +1469,6 @@ BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu_node, KF_IMPLICIT_ARGS | KF_RCU)
> BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu, KF_IMPLICIT_ARGS | KF_RCU)
> BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu_node, KF_IMPLICIT_ARGS | KF_RCU)
> BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu, KF_IMPLICIT_ARGS | KF_RCU)
> -BTF_ID_FLAGS(func, __scx_bpf_select_cpu_and, KF_IMPLICIT_ARGS | KF_RCU)
> -BTF_ID_FLAGS(func, scx_bpf_select_cpu_and, KF_RCU)
> -BTF_ID_FLAGS(func, scx_bpf_select_cpu_dfl, KF_IMPLICIT_ARGS | KF_RCU)
> BTF_KFUNCS_END(scx_kfunc_ids_idle)
>
> static const struct btf_kfunc_id_set scx_kfunc_set_idle = {
> @@ -1479,13 +1476,33 @@ static const struct btf_kfunc_id_set scx_kfunc_set_idle = {
> .set = &scx_kfunc_ids_idle,
> };
>
> +/*
> + * The select_cpu kfuncs internally call task_rq_lock() when invoked from an
> + * rq-unlocked context, and thus cannot be safely called from arbitrary tracing
> + * contexts where @p's pi_lock state is unknown. Keep them out of
> + * BPF_PROG_TYPE_TRACING by registering them in their own set which is exposed
> + * only to STRUCT_OPS and SYSCALL programs.
> + */
> +BTF_KFUNCS_START(scx_kfunc_ids_select_cpu)
> +BTF_ID_FLAGS(func, __scx_bpf_select_cpu_and, KF_IMPLICIT_ARGS | KF_RCU)
> +BTF_ID_FLAGS(func, scx_bpf_select_cpu_and, KF_RCU)
> +BTF_ID_FLAGS(func, scx_bpf_select_cpu_dfl, KF_IMPLICIT_ARGS | KF_RCU)
> +BTF_KFUNCS_END(scx_kfunc_ids_select_cpu)
> +
> +static const struct btf_kfunc_id_set scx_kfunc_set_select_cpu = {
> + .owner = THIS_MODULE,
> + .set = &scx_kfunc_ids_select_cpu,
> +};
> +
> int scx_idle_init(void)
> {
> int ret;
>
> ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &scx_kfunc_set_idle) ||
> register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &scx_kfunc_set_idle) ||
> - register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_idle);
> + register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_idle) ||
> + register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &scx_kfunc_set_select_cpu) ||
> + register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_select_cpu);
>
> return ret;
> }
> --
> 2.53.0
>
next prev parent reply other threads:[~2026-04-10 16:04 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-10 6:30 [PATCHSET sched_ext/for-7.1] sched_ext: Add verifier-time kfunc context filter Tejun Heo
2026-04-10 6:30 ` [PATCH 01/10] sched_ext: Drop TRACING access to select_cpu kfuncs Tejun Heo
2026-04-10 16:04 ` Andrea Righi [this message]
2026-04-10 6:30 ` [PATCH 02/10] sched_ext: Add select_cpu kfuncs to scx_kfunc_ids_unlocked Tejun Heo
2026-04-10 16:07 ` Andrea Righi
2026-04-10 17:51 ` [PATCH v2 " Tejun Heo
2026-04-10 6:30 ` [PATCH 03/10] sched_ext: Track @p's rq lock across set_cpus_allowed_scx -> ops.set_cpumask Tejun Heo
2026-04-10 16:12 ` Andrea Righi
2026-04-10 17:51 ` [PATCH v2 " Tejun Heo
2026-04-10 6:30 ` [PATCH 04/10] sched_ext: Fix ops.cgroup_move() invocation kf_mask and rq tracking Tejun Heo
2026-04-10 16:16 ` Andrea Righi
2026-04-10 17:51 ` [PATCH v2 " Tejun Heo
2026-04-10 6:30 ` [PATCH 05/10] sched_ext: Decouple kfunc unlocked-context check from kf_mask Tejun Heo
2026-04-10 16:34 ` Andrea Righi
2026-04-10 17:51 ` [PATCH v2 " Tejun Heo
2026-04-10 6:30 ` [PATCH 06/10] sched_ext: Drop redundant rq-locked check from scx_bpf_task_cgroup() Tejun Heo
2026-04-10 16:36 ` Andrea Righi
2026-04-10 6:30 ` [PATCH 07/10] sched_ext: Add verifier-time kfunc context filter Tejun Heo
2026-04-10 16:49 ` Andrea Righi
2026-04-14 12:38 ` Cheng-Yang Chou
2026-04-14 17:25 ` Tejun Heo
2026-04-10 6:30 ` [PATCH 08/10] sched_ext: Remove runtime kfunc mask enforcement Tejun Heo
2026-04-10 16:50 ` Andrea Righi
2026-04-10 6:30 ` [PATCH 09/10] sched_ext: Rename scx_kf_allowed_on_arg_tasks() to scx_kf_arg_task_ok() Tejun Heo
2026-04-10 16:55 ` Andrea Righi
2026-04-10 6:30 ` [PATCH 10/10] sched_ext: Warn on task-based SCX op recursion Tejun Heo
2026-04-10 17:38 ` Andrea Righi
2026-04-10 17:45 ` [PATCHSET sched_ext/for-7.1] sched_ext: Add verifier-time kfunc context filter Andrea Righi
2026-04-11 6:17 ` Cheng-Yang Chou
2026-04-11 7:41 ` Tejun Heo
2026-04-11 15:09 ` Cheng-Yang Chou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=adkffz6gR0vGKgqX@gpd4 \
--to=arighi@nvidia.com \
--cc=changwoo@igalia.com \
--cc=chia7712@gmail.com \
--cc=emil@etsalapatis.com \
--cc=jserv@ccns.ncku.edu.tw \
--cc=juntong.deng@outlook.com \
--cc=linux-kernel@vger.kernel.org \
--cc=sched-ext@lists.linux.dev \
--cc=tj@kernel.org \
--cc=void@manifault.com \
--cc=yphbchou0911@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.