Sched_ext development
From: sashiko-bot@kernel.org
To: "Andrea Righi" <arighi@nvidia.com>
Cc: sched-ext@lists.linux.dev
Subject: Re: [PATCH 7.0.y] sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu()
Date: Thu, 14 May 2026 05:02:20 +0000	[thread overview]
Message-ID: <20260514050221.56321C2BCB7@smtp.kernel.org> (raw)
In-Reply-To: <20260513130111.689740-1-arighi@nvidia.com>

Thank you for your contribution! Sashiko AI review found 2 potential issues to consider:
- [High] Unbounded lock hold time with IRQs disabled when skipping tasks in `bypass_lb_cpu()`.
- [Low] Obsolete comment contradicts the new task_rq lock constraints.
--

commit 68327c3a383140f9157a96a24a7b23be2c135ac5
Author: Tejun Heo <tj@kernel.org>

sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu()

This commit updates bypass_lb_cpu() to skip tasks whose task_rq() does not
match the donor rq. Such tasks were placed on the donor DSQ from a different
CPU and haven't been consumed yet, so they are still on their previous CPU
and can't be migrated without holding their rq lock.

> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c

[ ... ]

>  	while ((p = n)) {
>  		struct rq *donee_rq;

[ ... ]

> +		/*
> +		 * If an earlier pass placed @p on @donor_dsq from a different
> +		 * CPU and the donee hasn't consumed it yet, @p is still on the
> +		 * previous CPU and task_rq(@p) != @rq. @p can't be moved
> +		 * without its rq locked. Skip.
> +		 */
> +		if (task_rq(p) != rq)
> +			continue;

Because this uses continue, does it bypass the lock-yielding check at the
bottom of the loop (if (!(nr_balanced % SCX_BYPASS_LB_BATCH)))?

If there is a large batch of unconsumed tasks left over from previous passes,
could skipping them without incrementing nr_balanced result in an unbounded
lock hold time?

Traversing a long list here while holding the rq lock and donor_dsq lock
with IRQs disabled could potentially cause latency spikes or trigger the
soft lockup watchdog.

>  		donee = cpumask_any_and_distribute(donee_mask, p->cpus_ptr);
>  		if (donee >= nr_cpu_ids)
>  			continue;
>  
>  		donee_rq = cpu_rq(donee);
>  		donee_dsq = &donee_rq->scx.bypass_dsq;
>  
>  		/*
>  		 * $p's rq is not locked but $p's DSQ lock protects its
>  		 * scheduling properties making this test safe.
>  		 */

This isn't a bug, but is this comment still accurate?

Since the function initially acquires raw_spin_rq_lock_irq(rq) and the
newly added check guarantees task_rq(p) == rq, it appears the task's rq is
actually locked when it reaches this point.
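If the comment is indeed stale, one possible rewording (an assumption about
the intended meaning, not text from the patch author):

```c
/*
 * @p's rq is @rq, which the caller holds locked (guaranteed by the
 * task_rq(@p) == rq check above), so reading its scheduling
 * properties here is safe.
 */
```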

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260513130111.689740-1-arighi@nvidia.com?part=1

Thread overview: 4+ messages
2026-05-13 13:01 [PATCH 7.0.y] sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu() Andrea Righi
2026-05-14  5:02 ` sashiko-bot [this message]
2026-05-15  9:18 ` Patch "sched_ext: Skip tasks with stale task_rq in bypass_lb_cpu()" has been added to the 7.0-stable tree gregkh
2026-05-15  9:39   ` sashiko-bot
