All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Tejun Heo <tj@kernel.org>
Cc: David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	linux-kernel@vger.kernel.org, sched-ext@lists.linux.dev
Subject: Re: [PATCH 3/4] sched_ext: Wrap kfunc args in struct to prepare for aux__prog
Date: Tue, 7 Oct 2025 11:48:44 +0200	[thread overview]
Message-ID: <aOTh_PAGkX9y2Tsy@gpd4> (raw)
In-Reply-To: <20251007015147.2496026-4-tj@kernel.org>

Hi Tejun,

On Mon, Oct 06, 2025 at 03:51:46PM -1000, Tejun Heo wrote:
> scx_bpf_dsq_insert_vtime() and scx_bpf_select_cpu_and() currently have 5
> parameters. An upcoming change will add aux__prog parameter which will exceed
> BPF's 5 argument limit.
> 
> Prepare by adding new kfuncs __scx_bpf_dsq_insert_vtime() and
> __scx_bpf_select_cpu_and() that take args structs. The existing kfuncs are
> kept as compatibility wrappers. BPF programs use inline wrappers that detect
> kernel API version via bpf_core_type_exists() and use the new struct-based
> kfuncs when available, falling back to compat kfuncs otherwise. This allows
> BPF programs to work with both old and new kernels.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> ---
>  kernel/sched/ext.c                       | 82 ++++++++++++++++++------
>  kernel/sched/ext_idle.c                  | 43 +++++++++++--
>  tools/sched_ext/include/scx/common.bpf.h |  6 +-
>  tools/sched_ext/include/scx/compat.bpf.h | 72 +++++++++++++++++++++
>  4 files changed, 173 insertions(+), 30 deletions(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 6d76efaaa9b2..a34e731229de 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -5345,54 +5345,94 @@ __bpf_kfunc void scx_bpf_dsq_insert(struct task_struct *p, u64 dsq_id, u64 slice
>  	scx_dsq_insert_commit(sch, p, dsq_id, enq_flags);
>  }
>  
> +static void scx_dsq_insert_vtime(struct scx_sched *sch, struct task_struct *p,
> +				 u64 dsq_id, u64 slice, u64 vtime, u64 enq_flags)
> +{
> +	if (!scx_dsq_insert_preamble(sch, p, enq_flags))
> +		return;
> +
> +	if (slice)
> +		p->scx.slice = slice;
> +	else
> +		p->scx.slice = p->scx.slice ?: 1;
> +
> +	p->scx.dsq_vtime = vtime;
> +
> +	scx_dsq_insert_commit(sch, p, dsq_id, enq_flags | SCX_ENQ_DSQ_PRIQ);
> +}
> +
> +struct scx_bpf_dsq_insert_vtime_args {
> +	/* @p can't be packed together as KF_RCU is not transitive */
> +	u64			dsq_id;
> +	u64			slice;
> +	u64			vtime;
> +	u64			enq_flags;
> +};

With PATCH 2/4 introducing scx_bpf_task_set_slice() and
scx_bpf_task_set_dsq_vtime(), would it be reasonable to use those to set
these task properties and then completely get rid of these args in
scx_bpf_dsq_insert[_vtime]()?

> +
>  /**
> - * scx_bpf_dsq_insert_vtime - Insert a task into the vtime priority queue of a DSQ
> + * __scx_bpf_dsq_insert_vtime - Arg-wrapped vtime DSQ insertion
>   * @p: task_struct to insert
> - * @dsq_id: DSQ to insert into
> - * @slice: duration @p can run for in nsecs, 0 to keep the current value
> - * @vtime: @p's ordering inside the vtime-sorted queue of the target DSQ
> - * @enq_flags: SCX_ENQ_*
> + * @args: struct containing the rest of the arguments
> + *       @args->dsq_id: DSQ to insert into
> + *       @args->slice: duration @p can run for in nsecs, 0 to keep the current value
> + *       @args->vtime: @p's ordering inside the vtime-sorted queue of the target DSQ
> + *       @args->enq_flags: SCX_ENQ_*
>   *
> - * Insert @p into the vtime priority queue of the DSQ identified by @dsq_id.
> - * Tasks queued into the priority queue are ordered by @vtime. All other aspects
> - * are identical to scx_bpf_dsq_insert().
> + * Wrapper kfunc that takes arguments via struct to work around BPF's 5 argument
> + * limit. BPF programs should use scx_bpf_dsq_insert_vtime() which is provided
> + * as an inline wrapper in common.bpf.h.
>   *
> - * @vtime ordering is according to time_before64() which considers wrapping. A
> - * numerically larger vtime may indicate an earlier position in the ordering and
> - * vice-versa.
> + * Insert @p into the vtime priority queue of the DSQ identified by
> + * @args->dsq_id. Tasks queued into the priority queue are ordered by
> + * @args->vtime. All other aspects are identical to scx_bpf_dsq_insert().
> + *
> + * @args->vtime ordering is according to time_before64() which considers
> + * wrapping. A numerically larger vtime may indicate an earlier position in the
> + * ordering and vice-versa.
>   *
>   * A DSQ can only be used as a FIFO or priority queue at any given time and this
>   * function must not be called on a DSQ which already has one or more FIFO tasks
>   * queued and vice-versa. Also, the built-in DSQs (SCX_DSQ_LOCAL and
>   * SCX_DSQ_GLOBAL) cannot be used as priority queues.
>   */
> -__bpf_kfunc void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id,
> -					  u64 slice, u64 vtime, u64 enq_flags)
> +__bpf_kfunc void
> +__scx_bpf_dsq_insert_vtime(struct task_struct *p,
> +			   struct scx_bpf_dsq_insert_vtime_args *args)
>  {
>  	struct scx_sched *sch;
>  
>  	guard(rcu)();
> +
>  	sch = rcu_dereference(scx_root);
>  	if (unlikely(!sch))
>  		return;
>  
> -	if (!scx_dsq_insert_preamble(sch, p, enq_flags))
> -		return;
> +	scx_dsq_insert_vtime(sch, p, args->dsq_id, args->slice, args->vtime,
> +			     args->enq_flags);
> +}
>  
> -	if (slice)
> -		p->scx.slice = slice;
> -	else
> -		p->scx.slice = p->scx.slice ?: 1;
> +/*
> + * COMPAT: Will be removed in v6.23.
> + */
> +__bpf_kfunc void scx_bpf_dsq_insert_vtime(struct task_struct *p, u64 dsq_id,
> +					  u64 slice, u64 vtime, u64 enq_flags)
> +{
> +	struct scx_sched *sch;
>  
> -	p->scx.dsq_vtime = vtime;
> +	guard(rcu)();
>  
> -	scx_dsq_insert_commit(sch, p, dsq_id, enq_flags | SCX_ENQ_DSQ_PRIQ);
> +	sch = rcu_dereference(scx_root);
> +	if (unlikely(!sch))
> +		return;
> +
> +	scx_dsq_insert_vtime(sch, p, dsq_id, slice, vtime, enq_flags);
>  }
>  
>  __bpf_kfunc_end_defs();
>  
>  BTF_KFUNCS_START(scx_kfunc_ids_enqueue_dispatch)
>  BTF_ID_FLAGS(func, scx_bpf_dsq_insert, KF_RCU)
> +BTF_ID_FLAGS(func, __scx_bpf_dsq_insert_vtime, KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_dsq_insert_vtime, KF_RCU)
>  BTF_KFUNCS_END(scx_kfunc_ids_enqueue_dispatch)
>  
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index d2434c954848..3d9d404d5cd2 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -995,26 +995,56 @@ __bpf_kfunc s32 scx_bpf_select_cpu_dfl(struct task_struct *p, s32 prev_cpu,
>  	return prev_cpu;
>  }
>  
> +struct scx_bpf_select_cpu_and_args {
> +	/* @p and @cpus_allowed can't be packed together as KF_RCU is not transitive */
> +	s32			prev_cpu;
> +	u64			wake_flags;
> +	u64			flags;
> +};

And for this one, would it make sense to pack flags and wake_flags in a
single u64?

Thanks,
-Andrea

  reply	other threads:[~2025-10-07  9:49 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-07  1:51 [PATCHSET sched_ext/for-6.19] sched_ext: Misc changes with some prep patches for sub-sched support Tejun Heo
2025-10-07  1:51 ` [PATCH 1/4] tools/sched_ext: Strip compatibility macros for cgroup and dispatch APIs Tejun Heo
2025-10-07  2:42   ` Emil Tsalapatis
2025-10-07  9:42   ` Andrea Righi
2025-10-07 16:22   ` Changwoo Min
2025-10-07  1:51 ` [PATCH 2/4] sched_ext: Add scx_bpf_task_set_slice() and scx_bpf_task_set_dsq_vtime() Tejun Heo
2025-10-07  2:56   ` Emil Tsalapatis
2025-10-07 18:09     ` Tejun Heo
2025-10-07  9:34   ` Andrea Righi
2025-10-07 18:09     ` Tejun Heo
2025-10-07 16:28   ` Changwoo Min
2025-10-07 18:11     ` Tejun Heo
2025-10-07  1:51 ` [PATCH 3/4] sched_ext: Wrap kfunc args in struct to prepare for aux__prog Tejun Heo
2025-10-07  9:48   ` Andrea Righi [this message]
2025-10-07 18:24     ` Tejun Heo
2025-10-07 18:37       ` Andrea Righi
2025-10-07 16:04   ` Emil Tsalapatis
2025-10-07 16:38   ` Changwoo Min
2025-10-07  1:51 ` [PATCH 4/4] sched_ext: Make scx_bpf_dsq_insert*() return bool Tejun Heo
2025-10-07  4:28   ` Emil Tsalapatis
2025-10-07  9:41   ` Andrea Righi
2025-10-07 15:03     ` Emil Tsalapatis
2025-10-07 17:25       ` Andrea Righi
2025-10-07 16:47   ` Changwoo Min
2025-10-07 19:06 ` [PATCH 5/4] sched_ext/tools: Add compat wrapper for scx_bpf_task_set_slice/dsq_vtime() Tejun Heo
2025-10-13 18:53 ` [PATCHSET sched_ext/for-6.19] sched_ext: Misc changes with some prep patches for sub-sched support Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aOTh_PAGkX9y2Tsy@gpd4 \
    --to=arighi@nvidia.com \
    --cc=changwoo@igalia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sched-ext@lists.linux.dev \
    --cc=tj@kernel.org \
    --cc=void@manifault.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.