public inbox for linux-kernel@vger.kernel.org
From: Tejun Heo <tj@kernel.org>
To: Andrea Righi <arighi@nvidia.com>
Cc: David Vernet <void@manifault.com>,
	Changwoo Min <changwoo@igalia.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5] sched_ext: Fix lock imbalance in dispatch_to_local_dsq()
Date: Mon, 27 Jan 2025 12:41:30 -1000
Message-ID: <Z5gLmsbE73PYPd-Q@slm.duckdns.org>
In-Reply-To: <20250127220616.620097-1-arighi@nvidia.com>

On Mon, Jan 27, 2025 at 11:06:16PM +0100, Andrea Righi wrote:
> While performing the rq locking dance in dispatch_to_local_dsq(), we may
> trigger the following lock imbalance condition, in particular when
> multiple tasks are rapidly changing CPU affinity (e.g., when running
> `stress-ng --race-sched 0`):
> 
> [   13.413579] =====================================
> [   13.413660] WARNING: bad unlock balance detected!
> [   13.413729] 6.13.0-virtme #15 Not tainted
> [   13.413792] -------------------------------------
> [   13.413859] kworker/1:1/80 is trying to release lock (&rq->__lock) at:
> [   13.413954] [<ffffffff873c6c48>] dispatch_to_local_dsq+0x108/0x1a0
> [   13.414111] but there are no more locks to release!
> [   13.414176]
> [   13.414176] other info that might help us debug this:
> [   13.414258] 1 lock held by kworker/1:1/80:
> [   13.414318]  #0: ffff8b66feb41698 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x20/0x90
> [   13.414612]
> [   13.414612] stack backtrace:
> [   13.415255] CPU: 1 UID: 0 PID: 80 Comm: kworker/1:1 Not tainted 6.13.0-virtme #15
> [   13.415505] Workqueue:  0x0 (events)
> [   13.415567] Sched_ext: dsp_local_on (enabled+all), task: runnable_at=-2ms
> [   13.415570] Call Trace:
> [   13.415700]  <TASK>
> [   13.415744]  dump_stack_lvl+0x78/0xe0
> [   13.415806]  ? dispatch_to_local_dsq+0x108/0x1a0
> [   13.415884]  print_unlock_imbalance_bug+0x11b/0x130
> [   13.415965]  ? dispatch_to_local_dsq+0x108/0x1a0
> [   13.416226]  lock_release+0x231/0x2c0
> [   13.416326]  _raw_spin_unlock+0x1b/0x40
> [   13.416422]  dispatch_to_local_dsq+0x108/0x1a0
> [   13.416554]  flush_dispatch_buf+0x199/0x1d0
> [   13.416652]  balance_one+0x194/0x370
> [   13.416751]  balance_scx+0x61/0x1e0
> [   13.416848]  prev_balance+0x43/0xb0
> [   13.416947]  __pick_next_task+0x6b/0x1b0
> [   13.417052]  __schedule+0x20d/0x1740
> 
> This happens because dispatch_to_local_dsq() is racing with
> dispatch_dequeue() and, when the latter wins, we incorrectly assume that
> the task has been moved to dst_rq.
> 
> Fix by properly tracking the currently locked rq.
> 
> Fixes: 4d3ca89bdd31 ("sched_ext: Refactor consume_remote_task()")
> Signed-off-by: Andrea Righi <arighi@nvidia.com>
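
To illustrate the pattern the fix describes, here is a minimal
user-space sketch using a pthread analogue. All names, the helper
task_moved_to(), and the simplified control flow are illustrative
only, not the actual kernel code or the patch itself: the point is
that after the locking dance the code must remember which rq lock it
actually holds and release that one, instead of unconditionally
assuming the task made it to dst_rq.

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct rq {
	pthread_mutex_t lock;
	const char *name;
};

static struct rq src_rq = { PTHREAD_MUTEX_INITIALIZER, "src" };
static struct rq dst_rq = { PTHREAD_MUTEX_INITIALIZER, "dst" };

/* Stand-in for the race with dispatch_dequeue(): returns true if the
 * task actually moved to dst, false if a concurrent dequeue won. */
static bool task_moved_to(struct rq *dst)
{
	(void)dst;
	return false;	/* pretend the race was lost */
}

static void dispatch_to_local(struct rq *src, struct rq *dst)
{
	struct rq *locked;

	pthread_mutex_lock(&src->lock);
	locked = src;	/* track which rq lock is held */

	if (task_moved_to(dst)) {
		/* Locking dance: drop src's lock, take dst's. */
		pthread_mutex_unlock(&src->lock);
		pthread_mutex_lock(&dst->lock);
		locked = dst;
	}
	/* else: the dequeue won the race; src's lock is still held. */

	/*
	 * Release whichever lock is actually held. Unconditionally
	 * unlocking dst here, on the assumption that the task moved,
	 * is the "bad unlock balance" in the report above.
	 */
	pthread_mutex_unlock(&locked->lock);
	printf("released %s_rq lock\n", locked->name);
}

int main(void)
{
	dispatch_to_local(&src_rq, &dst_rq);
	return 0;
}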

Applied to sched_ext/for-6.14-fixes.

Thanks.

-- 
tejun
