public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrea Righi <arighi@nvidia.com>
To: Ryan Newton <rrnewton@gmail.com>
Cc: linux-kernel@vger.kernel.org, tj@kernel.org, newton@meta.com
Subject: Re: [PATCH 2/3] sched_ext: optimize first_task update logic
Date: Thu, 2 Oct 2025 09:27:12 +0200	[thread overview]
Message-ID: <aN4pUAfE30rF6-n4@gpd4> (raw)
In-Reply-To: <20251002025722.3420916-3-rrnewton@gmail.com>

On Wed, Oct 01, 2025 at 10:57:20PM -0400, Ryan Newton wrote:
> From: Ryan Newton <newton@meta.com>
> 
> This is a follow-on optimization to the prior commit which added a
> lockless peek operation on DSQs. That implementation is correct and
> simple, but elides several optimizations.
> 
> Previously, we read the first_task using the same slowpath, irrespective
> of where we enqueue the task. With this change, we instead base the
> update on what we know about the calling context. On both insert and
> removal we can break down whether the change (1) definitely, (2) never,
> or (3) sometimes changes first task. In some cases we know what the new
> first task will be, and can set it more directly.
> 
> Signed-off-by: Ryan Newton <newton@meta.com>
> ---
>  kernel/sched/ext.c | 26 ++++++++++++++++++++------
>  1 file changed, 20 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index fd0121c03311..1cb10aa9913a 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -953,8 +953,11 @@ static void dispatch_enqueue(struct scx_sched *sch, struct scx_dispatch_q *dsq,
>  				container_of(rbp, struct task_struct,
>  					     scx.dsq_priq);
>  			list_add(&p->scx.dsq_list.node, &prev->scx.dsq_list.node);
> +			/* first task unchanged - no update needed */
>  		} else {
>  			list_add(&p->scx.dsq_list.node, &dsq->list);
> +			/* new task is at head - use fastpath */
> +			rcu_assign_pointer(dsq->first_task, p);
>  		}
>  	} else {
>  		/* a FIFO DSQ shouldn't be using PRIQ enqueuing */
> @@ -962,15 +965,20 @@ static void dispatch_enqueue(struct scx_sched *sch, struct scx_dispatch_q *dsq,
>  			scx_error(sch, "DSQ ID 0x%016llx already had PRIQ-enqueued tasks",
>  				  dsq->id);
>  
> -		if (enq_flags & (SCX_ENQ_HEAD | SCX_ENQ_PREEMPT))
> +		if (enq_flags & (SCX_ENQ_HEAD | SCX_ENQ_PREEMPT)) {
>  			list_add(&p->scx.dsq_list.node, &dsq->list);
> -		else
> +			/* new task inserted at head - use fastpath */
> +			rcu_assign_pointer(dsq->first_task, p);
> +		} else {
> +			bool was_empty;
> +
> +			was_empty = list_empty(&dsq->list);
>  			list_add_tail(&p->scx.dsq_list.node, &dsq->list);
> +			if (was_empty)
> +				rcu_assign_pointer(dsq->first_task, p);
> +		}
>  	}
>  
> -	/* even the add_tail code path may have changed the first element */
> -	dsq_update_first_task(dsq);
> -
>  	/* seq records the order tasks are queued, used by BPF DSQ iterator */
>  	dsq->seq++;
>  	p->scx.dsq_seq = dsq->seq;
> @@ -1023,9 +1031,15 @@ static void task_unlink_from_dsq(struct task_struct *p,
>  		p->scx.dsq_flags &= ~SCX_TASK_DSQ_ON_PRIQ;
>  	}
>  
> +	if (dsq->first_task == p) {
> +		if (dsq->id & SCX_DSQ_FLAG_BUILTIN)
> +			rcu_assign_pointer(dsq->first_task,
> +			  list_next_entry(p, scx.dsq_list.node));

nit: no need to split in two lines, it should fit in the 100 characters per
line limit.

> +		else
> +			dsq_update_first_task(dsq);
> +	}

However, from my comment in PATCH 1/3, if we allow to use
scx_bpf_dsq_peek() only with user DSQs this would become:

if (!(dsq->id & SCX_DSQ_FLAG_BUILTIN) && dsq->first_task == p)
	dsq_update_first_task(dsq);

>  	list_del_init(&p->scx.dsq_list.node);
>  	dsq_mod_nr(dsq, -1);
> -	dsq_update_first_task(dsq);
>  }
>  
>  static void dispatch_dequeue(struct rq *rq, struct task_struct *p)
> -- 
> 2.51.0
> 

Thanks,
-Andrea

  reply	other threads:[~2025-10-02  7:27 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-02  2:57 [PATCH 0/3] sched_ext: lockless peek operation for DSQs Ryan Newton
2025-10-02  2:57 ` [PATCH 1/3] sched_ext: Add " Ryan Newton
2025-10-02  7:13   ` Andrea Righi
2025-10-02  2:57 ` [PATCH 2/3] sched_ext: optimize first_task update logic Ryan Newton
2025-10-02  7:27   ` Andrea Righi [this message]
2025-10-02  2:57 ` [PATCH 3/3] sched_ext: Add a selftest for scx_bpf_dsq_peek Ryan Newton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aN4pUAfE30rF6-n4@gpd4 \
    --to=arighi@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=newton@meta.com \
    --cc=rrnewton@gmail.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox