public inbox for linux-kernel@vger.kernel.org
* [patch 0/4] sched/mmcid: Cure fork()/vfork() related problems
From: Thomas Gleixner @ 2026-03-10 20:28 UTC
  To: LKML; +Cc: Peter Zijlstra, Mathieu Desnoyers, Matthieu Baerts, Jiri Slaby

Matthieu and Jiri reported CPU stalls where a CPU got stuck in mm_get_cid():

 https://lore.kernel.org/b24ffcb3-09d5-4e48-9070-0b69bc654281@kernel.org

After some tedious debugging it turned out to be another subtle (or not so
subtle) ownership mode change issue.

The logic handling vfork()'ed tasks in sched_mmcid_fixup_tasks_to_cpus() is
broken. It is invoked when the number of tasks associated with a process is
smaller than the number of MMCID users. It then walks the task list to find
the vfork()'ed task, but accounts all the already processed tasks as well.

If that double processing brings the number of tasks still to be handled
to 0, the walk stops and the vfork()'ed task's CID is not fixed up. As a
consequence a subsequent schedule-in fails to acquire a (transitional) CID
and the machine stalls.
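
For illustration only, the double accounting can be modelled with a toy
in plain C. This is a sketch under assumptions, not the code in
kernel/sched/core.c: struct toy_task, toy_fixup_group(), toy_fixup_all()
and the skip_fixed switch are all made up for this example, and skipping
already handled tasks is just one way to avoid the problem in the toy. The
actual patches may solve it differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a task. Hypothetical, not the kernel's task_struct. */
struct toy_task {
	int  mm_id;     /* address space the task runs on */
	bool in_group;  /* thread of the process (false: vfork()'ed child) */
	bool fixed;     /* CID has been fixed up */
};

/* Pass 1: fix up the threads of the process, return users still left. */
static int toy_fixup_group(struct toy_task *t, size_t n, int mm_id, int users)
{
	for (size_t i = 0; i < n; i++) {
		if (t[i].mm_id == mm_id && t[i].in_group) {
			t[i].fixed = true;
			users--;
		}
	}
	return users;
}

/*
 * Pass 2: walk all tasks until nothing is left to handle. With
 * skip_fixed == false, tasks handled in pass 1 are accounted again.
 */
static int toy_fixup_all(struct toy_task *t, size_t n, int mm_id,
			 int left, bool skip_fixed)
{
	for (size_t i = 0; i < n && left > 0; i++) {
		if (t[i].mm_id != mm_id)
			continue;
		if (skip_fixed && t[i].fixed)
			continue;
		t[i].fixed = true;
		left--;
	}
	return left;
}

/* Two threads plus one vfork()'ed child. True if the child was missed. */
static bool toy_vfork_child_missed(bool skip_fixed)
{
	struct toy_task tasks[] = {
		{ .mm_id = 1, .in_group = true  },
		{ .mm_id = 1, .in_group = true  },
		{ .mm_id = 1, .in_group = false },
	};
	size_t n = sizeof(tasks) / sizeof(tasks[0]);
	int left = toy_fixup_group(tasks, n, 1, 3);

	if (left > 0)
		toy_fixup_all(tasks, n, 1, left, skip_fixed);
	return !tasks[2].fixed;
}
```

With three users and two threads, pass 1 leaves one task to handle. When
already handled tasks are counted again, the first thread consumes that
last slot and the walk ends before it reaches the vfork()'ed child.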

Peter and I also discovered a not yet reported issue with concurrent
forks. Jiri noticed it independently.

The following series fixes those issues. It applies on top of Linus' tree.

Thanks a lot to Matthieu and Jiri for providing valuable debug
information and running the debug patches!

Thanks,

	tglx
---
 include/linux/rseq_types.h |    6 ++-
 include/linux/sched.h      |    2 -
 kernel/fork.c              |    3 -
 kernel/sched/core.c        |   79 +++++++++++++++------------------------------
 4 files changed, 34 insertions(+), 56 deletions(-)



Thread overview: 12+ messages
2026-03-10 20:28 [patch 0/4] sched/mmcid: Cure fork()/vfork() related problems Thomas Gleixner
2026-03-10 20:28 ` [patch 1/4] sched/mmcid: Prevent CID stalls due to concurrent forks Thomas Gleixner
2026-03-11  7:33   ` Jiri Slaby
2026-03-11  7:49     ` Peter Zijlstra
2026-03-11 11:04   ` [tip: sched/urgent] " tip-bot2 for Thomas Gleixner
2026-03-10 20:28 ` [patch 2/4] sched/mmcid: Handle vfork()/CLONE_VM correctly Thomas Gleixner
2026-03-11 11:04   ` [tip: sched/urgent] " tip-bot2 for Thomas Gleixner
2026-03-10 20:29 ` [patch 3/4] sched/mmcid: Remove pointless preempt guard Thomas Gleixner
2026-03-11 11:04   ` [tip: sched/urgent] " tip-bot2 for Thomas Gleixner
2026-03-10 20:29 ` [patch 4/4] sched/mmcid: Avoid full tasklist walks Thomas Gleixner
2026-03-11 11:04   ` [tip: sched/urgent] " tip-bot2 for Thomas Gleixner
2026-03-11  9:43 ` [patch 0/4] sched/mmcid: Cure fork()/vfork() related problems Matthieu Baerts
