public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: sched-ext@lists.linux.dev, David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	Cheng-Yang Chou <yphbchou0911@gmail.com>,
	Juntong Deng <juntong.deng@outlook.com>,
	Ching-Chun Huang <jserv@ccns.ncku.edu.tw>,
	Chia-Ping Tsai <chia7712@gmail.com>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 01/10] sched_ext: Drop TRACING access to select_cpu kfuncs
Date: Fri, 10 Apr 2026 18:04:15 +0200	[thread overview]
Message-ID: <adkffz6gR0vGKgqX@gpd4> (raw)
In-Reply-To: <20260410063046.3556100-2-tj@kernel.org>

Hi Tejun,

On Thu, Apr 09, 2026 at 08:30:37PM -1000, Tejun Heo wrote:
> The select_cpu kfuncs - scx_bpf_select_cpu_dfl(), scx_bpf_select_cpu_and()
> and __scx_bpf_select_cpu_and() - take task_rq_lock() internally. Exposing
> them via scx_kfunc_set_idle to BPF_PROG_TYPE_TRACING is unsafe: arbitrary
> tracing contexts (kprobes, tracepoints, fentry, LSM) may run with @p's
> pi_lock state unknown.
> 
> Move them out of scx_kfunc_ids_idle into a new scx_kfunc_ids_select_cpu
> set registered only for STRUCT_OPS and SYSCALL.

In addition of being unsafe it also makes sense from a logical perspective to
not "allocate" idle CPUs from a BPF_PROG_TYPE_TRACING context.

> 
> Extracted from a larger verifier-time kfunc context filter patch
> originally written by Juntong Deng.
> 
> Original-patch-by: Juntong Deng <juntong.deng@outlook.com>
> Cc: Cheng-Yang Chou <yphbchou0911@gmail.com>
> Signed-off-by: Tejun Heo <tj@kernel.org>

Reviewed-by: Andrea Righi <arighi@nvidia.com>

Thanks,
-Andrea

> ---
>  kernel/sched/ext_idle.c | 25 +++++++++++++++++++++----
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index ecf7e09b54ae..cd88aee47bd8 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -1469,9 +1469,6 @@ BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu_node, KF_IMPLICIT_ARGS | KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu, KF_IMPLICIT_ARGS | KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu_node, KF_IMPLICIT_ARGS | KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu, KF_IMPLICIT_ARGS | KF_RCU)
> -BTF_ID_FLAGS(func, __scx_bpf_select_cpu_and, KF_IMPLICIT_ARGS | KF_RCU)
> -BTF_ID_FLAGS(func, scx_bpf_select_cpu_and, KF_RCU)
> -BTF_ID_FLAGS(func, scx_bpf_select_cpu_dfl, KF_IMPLICIT_ARGS | KF_RCU)
>  BTF_KFUNCS_END(scx_kfunc_ids_idle)
>  
>  static const struct btf_kfunc_id_set scx_kfunc_set_idle = {
> @@ -1479,13 +1476,33 @@ static const struct btf_kfunc_id_set scx_kfunc_set_idle = {
>  	.set			= &scx_kfunc_ids_idle,
>  };
>  
> +/*
> + * The select_cpu kfuncs internally call task_rq_lock() when invoked from an
> + * rq-unlocked context, and thus cannot be safely called from arbitrary tracing
> + * contexts where @p's pi_lock state is unknown. Keep them out of
> + * BPF_PROG_TYPE_TRACING by registering them in their own set which is exposed
> + * only to STRUCT_OPS and SYSCALL programs.
> + */
> +BTF_KFUNCS_START(scx_kfunc_ids_select_cpu)
> +BTF_ID_FLAGS(func, __scx_bpf_select_cpu_and, KF_IMPLICIT_ARGS | KF_RCU)
> +BTF_ID_FLAGS(func, scx_bpf_select_cpu_and, KF_RCU)
> +BTF_ID_FLAGS(func, scx_bpf_select_cpu_dfl, KF_IMPLICIT_ARGS | KF_RCU)
> +BTF_KFUNCS_END(scx_kfunc_ids_select_cpu)
> +
> +static const struct btf_kfunc_id_set scx_kfunc_set_select_cpu = {
> +	.owner			= THIS_MODULE,
> +	.set			= &scx_kfunc_ids_select_cpu,
> +};
> +
>  int scx_idle_init(void)
>  {
>  	int ret;
>  
>  	ret = register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &scx_kfunc_set_idle) ||
>  	      register_btf_kfunc_id_set(BPF_PROG_TYPE_TRACING, &scx_kfunc_set_idle) ||
> -	      register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_idle);
> +	      register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_idle) ||
> +	      register_btf_kfunc_id_set(BPF_PROG_TYPE_STRUCT_OPS, &scx_kfunc_set_select_cpu) ||
> +	      register_btf_kfunc_id_set(BPF_PROG_TYPE_SYSCALL, &scx_kfunc_set_select_cpu);
>  
>  	return ret;
>  }
> -- 
> 2.53.0
> 

  reply	other threads:[~2026-04-10 16:04 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-10  6:30 [PATCHSET sched_ext/for-7.1] sched_ext: Add verifier-time kfunc context filter Tejun Heo
2026-04-10  6:30 ` [PATCH 01/10] sched_ext: Drop TRACING access to select_cpu kfuncs Tejun Heo
2026-04-10 16:04   ` Andrea Righi [this message]
2026-04-10  6:30 ` [PATCH 02/10] sched_ext: Add select_cpu kfuncs to scx_kfunc_ids_unlocked Tejun Heo
2026-04-10 16:07   ` Andrea Righi
2026-04-10 17:51   ` [PATCH v2 " Tejun Heo
2026-04-10  6:30 ` [PATCH 03/10] sched_ext: Track @p's rq lock across set_cpus_allowed_scx -> ops.set_cpumask Tejun Heo
2026-04-10 16:12   ` Andrea Righi
2026-04-10 17:51   ` [PATCH v2 " Tejun Heo
2026-04-10  6:30 ` [PATCH 04/10] sched_ext: Fix ops.cgroup_move() invocation kf_mask and rq tracking Tejun Heo
2026-04-10 16:16   ` Andrea Righi
2026-04-10 17:51   ` [PATCH v2 " Tejun Heo
2026-04-10  6:30 ` [PATCH 05/10] sched_ext: Decouple kfunc unlocked-context check from kf_mask Tejun Heo
2026-04-10 16:34   ` Andrea Righi
2026-04-10 17:51   ` [PATCH v2 " Tejun Heo
2026-04-10  6:30 ` [PATCH 06/10] sched_ext: Drop redundant rq-locked check from scx_bpf_task_cgroup() Tejun Heo
2026-04-10 16:36   ` Andrea Righi
2026-04-10  6:30 ` [PATCH 07/10] sched_ext: Add verifier-time kfunc context filter Tejun Heo
2026-04-10 16:49   ` Andrea Righi
2026-04-10  6:30 ` [PATCH 08/10] sched_ext: Remove runtime kfunc mask enforcement Tejun Heo
2026-04-10 16:50   ` Andrea Righi
2026-04-10  6:30 ` [PATCH 09/10] sched_ext: Rename scx_kf_allowed_on_arg_tasks() to scx_kf_arg_task_ok() Tejun Heo
2026-04-10 16:55   ` Andrea Righi
2026-04-10  6:30 ` [PATCH 10/10] sched_ext: Warn on task-based SCX op recursion Tejun Heo
2026-04-10 17:38   ` Andrea Righi
2026-04-10 17:45 ` [PATCHSET sched_ext/for-7.1] sched_ext: Add verifier-time kfunc context filter Andrea Righi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=adkffz6gR0vGKgqX@gpd4 \
    --to=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=chia7712@gmail.com \
    --cc=emil@etsalapatis.com \
    --cc=jserv@ccns.ncku.edu.tw \
    --cc=juntong.deng@outlook.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    --cc=yphbchou0911@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox