From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	linux-kernel@vger.kernel.org, sched-ext@lists.linux.dev
Subject: Re: [PATCH 3/4] sched_ext: Wrap kfunc args in struct to prepare for aux__prog
Date: Tue, 7 Oct 2025 11:48:44 +0200
Message-ID: <aOTh_PAGkX9y2Tsy@gpd4>
In-Reply-To: <20251007015147.2496026-4-tj@kernel.org>

Hi Tejun,

On Mon, Oct 06, 2025 at 03:51:46PM -1000, Tejun Heo wrote:
> scx_bpf_dsq_insert_vtime() and scx_bpf_select_cpu_and() currently have 5
> parameters. An upcoming change will add aux__prog parameter which will exceed
> BPF's 5 argument limit.
> 
> Prepare by adding new kfuncs __scx_bpf_dsq_insert_vtime() and
> __scx_bpf_select_cpu_and() that take args structs. The existing kfuncs are
> kept as compatibility wrappers. BPF programs use inline wrappers that detect
> kernel API version via bpf_core_type_exists() and use the new struct-based
> kfuncs when available, falling back to compat kfuncs otherwise. This allows
> BPF programs to work with both old and new kernels.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
>  kernel/sched/ext.c                       | 82 ++++++++++++++++++------
>  kernel/sched/ext_idle.c                  | 43 +++++++++++--
>  tools/sched_ext/include/scx/common.bpf.h |  6 +-
>  tools/sched_ext/include/scx/compat.bpf.h | 72 +++++++++++++++++++++
>  4 files changed, 173 insertions(+), 30 deletions(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 6d76efaaa9b2..a34e731229de 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -5345,54 +5345,94 @@ __bpf_kfunc void scx_bpf_dsq_insert(struct task_struct *p, u64 dsq_id, u64 slice
>  	scx_dsq_insert_commit(sch, p, dsq_id, enq_flags);
>  }
>  
> +static void scx_dsq_insert_vtime(struct scx_sched *sch, struct task_struct *p,
> +				 u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags)
> +{
> +	if (!scx_dsq_insert_preamble(sch, p, enq_flags))
> +		return;
> +
> +	if (slice)
> +		p->scx.slice = slice;
> +	else
> +		p->scx.slice = p->scx.slice ?: 1;
> +
> +	p->scx.dsq_vtime = vtime;
> +
> +	scx_dsq_insert_commit(sch, p, dsq_id, enq_flags | SCX_ENQ_DSQ_PRIQ);
> +}
> +
> +struct scx_bpf_dsq_insert_vtime_args {
> +	/* @p can't be packed together as KF_RCU is not transitive */
> +	u64			dsq_id;
> +	u64			slice;
> +	u64			vtime;
> +	u64			enq_flags;
> +};

With PATCH 2/4 introducing scx_bpf_task_set_slice() and
scx_bpf_task_set_dsq_vtime(), would it be reasonable to use those to set
these task properties and then completely get rid of these args in
scx_bpf_dsq_insert[_vtime]()?
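
To make the question concrete, here is a rough userspace mock of the calling
pattern it implies. This is not kernel or BPF code: every mock_* name and
MOCK_ENQ_DSQ_PRIQ is a hypothetical stand-in, and the real signatures of the
setters from PATCH 2/4 may differ. It only assumes that the insert path keeps
the "slice of 0 becomes 1" fallback from the quoted helper.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for SCX_ENQ_DSQ_PRIQ; the value is arbitrary. */
#define MOCK_ENQ_DSQ_PRIQ (1ULL << 0)

/* Userspace stand-in for the task fields the quoted helper touches. */
struct mock_task {
	uint64_t slice;
	uint64_t dsq_vtime;
	uint64_t dsq_id;
	uint64_t enq_flags;
	bool queued;
};

/* Mock setters, modelled loosely on the kfuncs added in PATCH 2/4. */
static void mock_task_set_slice(struct mock_task *p, uint64_t slice)
{
	p->slice = slice;
}

static void mock_task_set_dsq_vtime(struct mock_task *p, uint64_t vtime)
{
	p->dsq_vtime = vtime;
}

/*
 * Mock insert without slice/vtime arguments: it uses whatever the task
 * already carries and only needs three parameters.
 */
static void mock_dsq_insert_vtime(struct mock_task *p, uint64_t dsq_id,
				  uint64_t enq_flags)
{
	if (!p->slice)
		p->slice = 1;	/* same fallback as "p->scx.slice ?: 1" */

	p->dsq_id = dsq_id;
	p->enq_flags = enq_flags | MOCK_ENQ_DSQ_PRIQ;
	p->queued = true;
}
```

A caller would then do mock_task_set_slice(), mock_task_set_dsq_vtime() and
finally mock_dsq_insert_vtime(), at the cost of two extra calls per insertion.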

> +
>  /**
> - * scx_bpf_dsq_insert_vtime - Insert a task into the vtime priority queue of a DSQ
> + * __scx_bpf_dsq_insert_vtime - Arg-wrapped vtime DSQ insertion
>   * @p: task_struct to insert
> - * @dsq_id: DSQ to insert into
> - * @slice: duration @p can run for in nsecs, 0 to keep the current value
> - * @vtime: @p's ordering inside the vtime-sorted queue of the target DSQ
> - * @enq_flags: SCX_ENQ_*
> + * @args: struct containing the rest of the arguments
> + *       @args->dsq_id: DSQ to insert into
> + *       @args->slice: duration @p can run for in nsecs, 0 to keep the current value
> + *       @args->vtime: @p's ordering inside the vtime-sorted queue of the target DSQ
> + *       @args->enq_flags: SCX_ENQ_*
>   *
> - * Insert @p into the vtime priority queue of the DSQ identified by @dsq_id.
> - * Tasks queued into the priority queue are ordered by @vtime. All other aspects
> - * are identical to scx_bpf_dsq_insert().
> + * Wrapper kfunc that takes arguments via struct to work around BPF's 5 argument
> + * limit. BPF programs should use scx_bpf_dsq_insert_vtime() which is provided
> + * as an inline wrapper in common.bpf.h.
>   *
> - * @vtime ordering is according to time_before64() which considers wrapping. A
> - * numerically larger vtime may indicate an earlier position in the ordering and
> - * vice-versa.
> + * Insert @p into the vtime priority queue of the DSQ identified by
> + * @args->dsq_id. Tasks queued into the priority queue are ordered by
> + * @args->vtime. All other aspects are identical to scx_bpf_dsq_insert().
> + *
> + * @args->vtime ordering is according to time_before64() which considers
> + * wrapping. A numerically larger vtime may indicate an earlier position in the
> + * ordering and vice-versa.
>   *
>   * A DSQ can only be used as a FIFO or priority queue at any given time and this
>   * function must not be called on a DSQ which already has one or more FIFO tasks
>   * queued and vice-versa. Also, the built-in DSQs (SCX_DSQ_LOCAL and
>   * SCX_DSQ_GLOBAL) cannot be used as priority queues.
>   */
> -__bpf_kfunc void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id,
> -					  u64 slice, u64 vtime, u64 enq_flags)
> +__bpf_kfunc void
> +__scx_bpf_dsq_insert_vtime(struct task_struct *p,
> +			   struct scx_bpf_dsq_insert_vtime_args *args)
>  {
>  	struct scx_sched *sch;
>  
>  	guard(rcu)();
> +
>  	sch = rcu_dereference(scx_root);
>  	if (unlikely(!sch))
>  		return;
>  
> -	if (!scx_dsq_insert_preamble(sch, p, enq_flags))
> -		return;
> +	scx_dsq_insert_vtime(sch, p, args->dsq_id, args->slice, args->vtime,
> +			     args->enq_flags);
> +}
>  
> -	if (slice)
> -		p->scx.slice = slice;
> -	else
> -		p->scx.slice = p->scx.slice ?: 1;
> +/*
> + * COMPAT: Will be removed in v6.23.
> + */
> +__bpf_kfunc void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id,
> +					  u64 slice, u64 vtime, u64 enq_flags)
> +{
> +	struct scx_sched *sch;
>  
> -	p->scx.dsq_vtime = vtime;
> +	guard(rcu)();
>  
> -	scx_dsq_insert_commit(sch, p, dsq_id, enq_flags | SCX_ENQ_DSQ_PRIQ);
> +	sch = rcu_dereference(scx_root);
> +	if (unlikely(!sch))
> +		return;
> +
> +	scx_dsq_insert_vtime(sch, p, dsq_id, slice, vtime, enq_flags);
>  }
>  
>  __bpf_kfunc_end_defs();
>  
>  BTF_KFUNCS_START(scx_kfunc_ids_enqueue_dispatch)
>  BTF_ID_FLAGS(func, scx_bpf_dsq_insert, KF_RCU)
> +BTF_ID_FLAGS(func, __scx_bpf_dsq_insert_vtime, KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_dsq_insert_vtime, KF_RCU)
>  BTF_KFUNCS_END(scx_kfunc_ids_enqueue_dispatch)
>  
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index d2434c954848..3d9d404d5cd2 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -995,26 +995,56 @@ __bpf_kfunc s32 scx_bpf_select_cpu_dfl(struct task_struct *p, s32 prev_cpu,
>  	return prev_cpu;
>  }
>  
> +struct scx_bpf_select_cpu_and_args {
> +	/* @p and @cpus_allowed can't be packed together as KF_RCU is not transitive */
> +	s32			prev_cpu;
> +	u64			wake_flags;
> +	u64			flags;
> +};

And for this one, would it make sense to pack flags and wake_flags in a
single u64?
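
Something along these lines, as a minimal sketch only. The layout
(wake_flags in the upper 32 bits, flags in the lower 32 bits) and all ex_*
names are assumptions made up for illustration, and it presumes both values
fit in 32 bits, which would need to be checked against the actual flag
definitions.

```c
#include <stdint.h>

/* Hypothetical layout: wake_flags in bits 63..32, flags in bits 31..0. */
#define EX_WAKE_SHIFT	32
#define EX_FLAGS_MASK	0xffffffffULL

/* Combine the two values; bits above 31 in either input are dropped. */
static inline uint64_t ex_pack_flags(uint64_t wake_flags, uint64_t flags)
{
	return (wake_flags << EX_WAKE_SHIFT) | (flags & EX_FLAGS_MASK);
}

/* Recover wake_flags from the upper half. */
static inline uint64_t ex_unpack_wake_flags(uint64_t packed)
{
	return packed >> EX_WAKE_SHIFT;
}

/* Recover flags from the lower half. */
static inline uint64_t ex_unpack_flags(uint64_t packed)
{
	return packed & EX_FLAGS_MASK;
}
```

That would save one u64 in the args struct, but any flag defined above bit 31
would be silently lost, so it trades struct size for a limit on future flags.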

Thanks,
-Andrea
