From: Tejun Heo <tj@kernel.org>
To: David Vernet <void@manifault.com>,
	Andrea Righi <arighi@nvidia.com>,
	Changwoo Min <changwoo@igalia.com>
Cc: Christian Loehle <christian.loehle@arm.com>,
	Emil Tsalapatis <emil@etsalapatis.com>,
	sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org,
	Tejun Heo <tj@kernel.org>
Subject: [PATCHSET sched_ext/for-7.0-fixes] sched_ext: Fix SCX_KICK_WAIT deadlock
Date: Sat, 28 Mar 2026 14:18:54 -1000
Message-ID: <20260329001856.835643-1-tj@kernel.org>

Hello,

SCX_KICK_WAIT busy-waits in kick_cpus_irq_workfn() until the target CPU
reschedules. Because the irq_work runs in hardirq context, the waiting
CPU's kick_sync never advances, and if multiple CPUs form a wait cycle, all
deadlock. This was reported by Christian while testing on arm64.

0001 fixes the deadlock by deferring the wait to a balance callback which
drops the rq lock and enables IRQs, allowing IPIs to be processed and
kick_sync to keep advancing during the wait.
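
To make the cycle concrete, here is a minimal user-space toy model. It is
not the kernel code: struct toy_cpu, toy_run() and MAX_ROUNDS are
hypothetical names invented for illustration, and the only assumption taken
from the description above is that a waiter completes once its target's
kick_sync moves past the value seen when the wait began. With
can_resched_while_waiting == false (the hardirq busy-wait), no CPU in the
cycle can advance its own kick_sync, so nobody finishes. With it set to true
(the deferred wait with IRQs enabled), every CPU advances and the cycle
resolves.

```c
#include <stdbool.h>

#define NR_TOY_CPUS	3	/* same cycle size as the selftest in 0002 */
#define MAX_ROUNDS	16	/* arbitrary bound standing in for "forever" */

/* Toy model only; field names echo the cover letter, not kernel structs. */
struct toy_cpu {
	unsigned long kick_sync;	/* bumped when this CPU "reschedules" */
	unsigned long wait_snap;	/* target's kick_sync when the wait began */
	int target;			/* CPU this one is waiting on */
	bool waiting;
};

/*
 * Returns true if every CPU finished waiting within MAX_ROUNDS.
 * can_resched_while_waiting == false models busy-waiting in hardirq
 * context; true models the deferred wait where IRQs are enabled.
 */
static bool toy_run(bool can_resched_while_waiting)
{
	struct toy_cpu cpus[NR_TOY_CPUS];
	int i, round;

	/* Build the cycle 0 -> 1 -> 2 -> 0, with everyone waiting. */
	for (i = 0; i < NR_TOY_CPUS; i++) {
		cpus[i].kick_sync = 0;
		cpus[i].wait_snap = 0;
		cpus[i].target = (i + 1) % NR_TOY_CPUS;
		cpus[i].waiting = true;
	}

	for (round = 0; round < MAX_ROUNDS; round++) {
		int still_waiting = 0;

		/* Every CPU was kicked; it reschedules only if it is able to. */
		for (i = 0; i < NR_TOY_CPUS; i++)
			if (!cpus[i].waiting || can_resched_while_waiting)
				cpus[i].kick_sync++;

		/* A waiter is done once its target's kick_sync has moved. */
		for (i = 0; i < NR_TOY_CPUS; i++) {
			struct toy_cpu *tgt = &cpus[cpus[i].target];

			if (cpus[i].waiting && tgt->kick_sync != cpus[i].wait_snap)
				cpus[i].waiting = false;
			if (cpus[i].waiting)
				still_waiting++;
		}

		if (!still_waiting)
			return true;
	}

	return false;	/* nobody made progress: the modelled deadlock */
}
```

In this model toy_run(false) returns false and toy_run(true) returns true.
The model is single-threaded and deliberately ignores the rq lock, IPIs and
memory ordering; it only shows why a waiter that cannot reschedule keeps the
whole cycle stuck.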

0002 adds a selftest that creates a 3-CPU kick_wait cycle to reproduce the
issue.

Based on sched_ext/for-7.0-fixes (db08b1940f4b).

 0001-sched_ext-Fix-SCX_KICK_WAIT-deadlock-by-deferring-wa.patch
 0002-selftests-sched_ext-Add-cyclic-SCX_KICK_WAIT-stress-.patch

 kernel/sched/ext.c                                 |  95 +++++++---
 kernel/sched/sched.h                               |   3 +
 tools/testing/selftests/sched_ext/Makefile         |   1 +
 .../selftests/sched_ext/cyclic_kick_wait.bpf.c     |  68 ++++++++
 .../testing/selftests/sched_ext/cyclic_kick_wait.c | 194 +++++++++++++++++++++
 5 files changed, 336 insertions(+), 25 deletions(-)

--
tejun

Thread overview: 10+ messages
2026-03-29  0:18 Tejun Heo [this message]
2026-03-29  0:18 ` [PATCH 1/2] sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback Tejun Heo
2026-03-29 16:26   ` Andrea Righi
2026-03-29  0:18 ` [PATCH 2/2] selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test Tejun Heo
2026-03-29  9:06   ` Cheng-Yang Chou
2026-03-29 15:52     ` Andrea Righi
2026-03-30  4:40       ` Cheng-Yang Chou
2026-03-30  8:51   ` Christian Loehle
2026-03-30  8:52 ` [PATCHSET sched_ext/for-7.0-fixes] sched_ext: Fix SCX_KICK_WAIT deadlock Christian Loehle
2026-03-30 18:56 ` Tejun Heo
