public inbox for linux-kernel@vger.kernel.org
From: Thomas Gleixner <tglx@kernel.org>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Matthieu Baerts <matttbe@kernel.org>,
	Jiri Slaby <jirislaby@kernel.org>
Subject: [patch 0/4] sched/mmcid: Cure fork()/vfork() related problems
Date: Tue, 10 Mar 2026 21:28:47 +0100
Message-ID: <20260310201009.257617049@kernel.org>

Matthieu and Jiri reported CPU stalls where a CPU got stuck in mm_get_cid():

 https://lore.kernel.org/b24ffcb3-09d5-4e48-9070-0b69bc654281@kernel.org

After some tedious debugging it turned out to be another subtle (or not so
subtle) ownership mode change issue.

The logic handling vfork()'ed tasks in sched_mmcid_fixup_tasks_to_cpus() is
broken. It is invoked when the number of tasks associated with a process is
smaller than the number of MMCID users. It then walks the task list to find
the vfork()'ed task, but also accounts all the already processed tasks.

If that double accounting brings the number of tasks still to be handled
down to 0, the walk stops and the vfork()'ed task's CID is not fixed up. As
a consequence a subsequent schedule-in fails to acquire a (transitional) CID
and the machine stalls.
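
To illustrate the accounting problem, here is a rough userspace toy model.
It is not the kernel code: struct toy_task, toy_fixup() and toy_demo() are
made-up names, and the two-pass structure is only an assumption derived
from the description above. It merely shows how counting already processed
tasks a second time can drain the counter before the vfork()'ed task is
reached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only. All names are hypothetical, none exist in the kernel. */
struct toy_task {
	int	mm_id;	/* Address space the task runs on */
	int	tgid;	/* Process (thread group) the task belongs to */
	bool	fixed;	/* CID has been fixed up */
};

/*
 * Pass 1 fixes up the threads of process @tgid. Pass 2 walks the whole
 * task list to find the remaining users of @mm_id, e.g. a vfork()'ed task.
 *
 * With @skip_processed == false, pass 2 accounts the tasks from pass 1
 * again, so @users can hit 0 before the vfork()'ed task is found.
 *
 * Returns the number of @mm_id users which were not fixed up.
 */
static size_t toy_fixup(struct toy_task *tasks, size_t n, int mm_id, int tgid,
			size_t users, bool skip_processed)
{
	size_t i, missed = 0;

	assert(tasks != NULL);

	/* Pass 1: threads of the process itself */
	for (i = 0; i < n && users; i++) {
		if (tasks[i].tgid == tgid && tasks[i].mm_id == mm_id) {
			tasks[i].fixed = true;
			users--;
		}
	}

	/* Pass 2: full task list walk for the leftover users */
	for (i = 0; i < n && users; i++) {
		if (tasks[i].mm_id != mm_id)
			continue;
		if (skip_processed && tasks[i].fixed)
			continue;
		tasks[i].fixed = true;
		users--;
	}

	/* Count the users which the walk never reached */
	for (i = 0; i < n; i++) {
		if (tasks[i].mm_id == mm_id && !tasks[i].fixed)
			missed++;
	}
	return missed;
}

/* Two threads of process 1 plus one vfork()'ed task sharing the same mm */
static size_t toy_demo(bool skip_processed)
{
	struct toy_task tasks[] = {
		{ .mm_id = 7, .tgid = 1, .fixed = false },
		{ .mm_id = 7, .tgid = 1, .fixed = false },
		{ .mm_id = 7, .tgid = 2, .fixed = false },
	};

	return toy_fixup(tasks, 3, 7, 1, 3, skip_processed);
}
```

In this model toy_demo(false) leaves the vfork()'ed task without a fixup,
while toy_demo(true) reaches it because the already processed tasks are
not accounted twice.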

Peter and I also discovered that there is a yet unreported issue with
concurrent forks. Jiri noticed it independently.

The following series fixes those issues. It applies on top of Linus' tree.

Thanks a lot to Matthieu and Jiri for providing valuable debug
information and running the debug patches!

Thanks,

	tglx
---
 include/linux/rseq_types.h |    6 ++-
 include/linux/sched.h      |    2 -
 kernel/fork.c              |    3 -
 kernel/sched/core.c        |   79 +++++++++++++++------------------------------
 4 files changed, 34 insertions(+), 56 deletions(-)



Thread overview: 12+ messages
2026-03-10 20:28 Thomas Gleixner [this message]
2026-03-10 20:28 ` [patch 1/4] sched/mmcid: Prevent CID stalls due to concurrent forks Thomas Gleixner
2026-03-11  7:33   ` Jiri Slaby
2026-03-11  7:49     ` Peter Zijlstra
2026-03-11 11:04   ` [tip: sched/urgent] " tip-bot2 for Thomas Gleixner
2026-03-10 20:28 ` [patch 2/4] sched/mmcid: Handle vfork()/CLONE_VM correctly Thomas Gleixner
2026-03-11 11:04   ` [tip: sched/urgent] " tip-bot2 for Thomas Gleixner
2026-03-10 20:29 ` [patch 3/4] sched/mmcid: Remove pointless preempt guard Thomas Gleixner
2026-03-11 11:04   ` [tip: sched/urgent] " tip-bot2 for Thomas Gleixner
2026-03-10 20:29 ` [patch 4/4] sched/mmcid: Avoid full tasklist walks Thomas Gleixner
2026-03-11 11:04   ` [tip: sched/urgent] " tip-bot2 for Thomas Gleixner
2026-03-11  9:43 ` [patch 0/4] sched/mmcid: Cure fork()/vfork() related problems Matthieu Baerts
