From: Tejun Heo <tj@kernel.org>
To: David Vernet <void@manifault.com>,
    Andrea Righi <arighi@nvidia.com>,
    Changwoo Min <changwoo@igalia.com>
Cc: Cheng-Yang Chou <yphbchou0911@gmail.com>,
    Emil Tsalapatis <emil@etsalapatis.com>,
    sched-ext@lists.linux.dev, linux-kernel@vger.kernel.org
Subject: [PATCHSET sched_ext/for-7.1-fixes] sched_ext: Fix cgroup iter coverage of in-do_exit tasks
Date: Mon, 27 Apr 2026 14:16:33 -1000
Message-ID: <20260428001635.3293997-1-tj@kernel.org>

Hello,

a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup") made
css_task_iter_advance() skip exiting tasks. That broke scx_task_iter's
cgroup-scoped mode: it now silently skips tasks that are still on
scx_tasks but past exit_signals(), so the abort path in
scx_sub_enable_workfn() can miss SCX_TASK_SUB_INIT-marked exiting tasks
and leak __scx_init_task() state.
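
To make the failure mode concrete, here is a minimal userspace sketch,
not kernel code. Every identifier in it (model_task,
model_abort_cleanup(), model_leaked(), model_demo_leak()) is
hypothetical. The sub_init field only stands in for SCX_TASK_SUB_INIT
and the exiting field for a task that is past exit_signals(). It assumes
the abort path simply walks the iterator and undoes init state on every
marked task it is handed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_task {
	bool exiting;	/* stands in for "past exit_signals()" */
	bool sub_init;	/* stands in for SCX_TASK_SUB_INIT */
};

/*
 * Walk the tasks the way a cgroup-scoped iterator would and undo init
 * state on every marked task that is visited. When @skip_exiting is
 * set, exiting tasks are never returned to the caller.
 */
static size_t model_abort_cleanup(struct model_task *tasks, size_t n,
				  bool skip_exiting)
{
	size_t cleaned = 0;

	assert(tasks != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (skip_exiting && tasks[i].exiting)
			continue;	/* the iterator hides this task */
		if (tasks[i].sub_init) {
			tasks[i].sub_init = false;	/* undo init state */
			cleaned++;
		}
	}
	return cleaned;
}

/* Count tasks that still hold init state after the abort path ran. */
static size_t model_leaked(const struct model_task *tasks, size_t n)
{
	size_t leaked = 0;

	for (size_t i = 0; i < n; i++)
		if (tasks[i].sub_init)
			leaked++;
	return leaked;
}

/* Fixed three-task scenario; returns how many tasks leaked. */
static size_t model_demo_leak(bool skip_exiting)
{
	struct model_task tasks[] = {
		{ .exiting = false, .sub_init = true  },	/* live, marked */
		{ .exiting = true,  .sub_init = true  },	/* exiting, marked */
		{ .exiting = false, .sub_init = false },	/* live, unmarked */
	};
	size_t n = sizeof(tasks) / sizeof(tasks[0]);

	model_abort_cleanup(tasks, n, skip_exiting);
	return model_leaked(tasks, n);
}
```

In this model, model_demo_leak(true) returns 1 because the exiting but
still marked task is never visited, while model_demo_leak(false)
returns 0.
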
Restoring iter coverage exposes a separate latent issue: cgroup
iteration can return tasks whose sched_ext_dead() has already torn down
their per-task SCX state (cgroup_task_dead() runs after sched_ext_dead()
in finish_task_switch() and is irq-work deferred on PREEMPT_RT). Callers
trip WARN_ON_ONCE() / fail assertions when they see such a task.

This pair fixes both:

 0001 sched_ext: Include exiting tasks in cgroup iter
      Adds CSS_TASK_ITER_WITH_DEAD; scx_task_iter opts in.

 0002 sched_ext: Skip past-sched_ext_dead() tasks in
      scx_task_iter_next_locked()
      Adds SCX_TASK_OFF_TASKS, set in sched_ext_dead() under the rq
      lock; scx_task_iter_next_locked() skips flagged tasks under the
      same lock.
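
The combined effect of the two patches can be sketched as a small
userspace model. This is an illustration under stated assumptions, not
the actual patch code: MODEL_ITER_WITH_DEAD and MODEL_TASK_OFF_TASKS
only stand in for CSS_TASK_ITER_WITH_DEAD and SCX_TASK_OFF_TASKS, all
fix_* names are hypothetical, and the rq lock is not modeled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the values are arbitrary, not kernel values. */
#define MODEL_ITER_WITH_DEAD	0x1u	/* models CSS_TASK_ITER_WITH_DEAD */
#define MODEL_TASK_OFF_TASKS	0x1u	/* models SCX_TASK_OFF_TASKS */

struct fix_task {
	bool exiting;		/* stands in for "past exit_signals()" */
	unsigned int flags;	/* per-task flags in this model */
};

struct fix_iter {
	struct fix_task *tasks;
	size_t n;
	size_t pos;
	unsigned int iter_flags;
};

/* Models sched_ext_dead(): per-task state is gone, so flag the task. */
static void fix_task_dead(struct fix_task *t)
{
	t->flags |= MODEL_TASK_OFF_TASKS;
}

/* Return the next task the caller may safely touch, or NULL at the end. */
static struct fix_task *fix_iter_next(struct fix_iter *it)
{
	while (it->pos < it->n) {
		struct fix_task *t = &it->tasks[it->pos++];

		/* Exiting tasks are visible only if the iterator opted in. */
		if (t->exiting && !(it->iter_flags & MODEL_ITER_WITH_DEAD))
			continue;
		/* Tasks that are already torn down are always skipped. */
		if (t->flags & MODEL_TASK_OFF_TASKS)
			continue;
		return t;
	}
	return NULL;
}

/* Fixed three-task scenario; returns how many tasks the iterator yields. */
static size_t fix_demo_count(unsigned int iter_flags)
{
	struct fix_task tasks[] = {
		{ .exiting = false },	/* live */
		{ .exiting = true  },	/* exiting, state still intact */
		{ .exiting = true  },	/* exiting, torn down below */
	};
	struct fix_iter it = {
		.tasks = tasks,
		.n = sizeof(tasks) / sizeof(tasks[0]),
		.pos = 0,
		.iter_flags = iter_flags,
	};
	size_t seen = 0;

	fix_task_dead(&tasks[2]);
	while (fix_iter_next(&it))
		seen++;
	return seen;
}
```

In this model, fix_demo_count(MODEL_ITER_WITH_DEAD) returns 2: the live
task and the exiting task whose state is still intact. Without the
opt-in flag, fix_demo_count(0) returns 1.
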
Verified with a stress harness that runs a 4-deep nested sub-sched
hierarchy with continuous fork/switch workers and random sub-sched
restarts at 5s intervals. Baseline (without the patches) wedged a
192-CPU bare-metal box in 66s and oopsed a 24-thread bare-metal box at
227s. The patched kernel ran clean for 30min on both, plus an 8-vCPU
vng guest, with 0 WARN/BUG/lockdep reports across ~1000 sub-restarts.

Based on sched_ext/for-7.1-fixes (deb7b2f93d01).

 include/linux/cgroup.h    |  1 +
 include/linux/sched/ext.h |  1 +
 kernel/cgroup/cgroup.c    |  8 +++++---
 kernel/sched/ext.c        | 39 +++++++++++++++++++++++++++++----------
 4 files changed, 36 insertions(+), 13 deletions(-)

Git tree: git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext.git iter-include-dead-v1

Thanks.

--
tejun