Date: Tue, 10 Mar 2026 21:28:47 +0100
Message-ID: <20260310201009.257617049@kernel.org>
User-Agent: quilt/0.68
From: Thomas Gleixner
To: LKML
Cc: Peter Zijlstra, Mathieu Desnoyers, Matthieu Baerts, Jiri Slaby
Subject: [patch 0/4] sched/mmcid: Cure fork()/vfork() related problems
X-Mailing-List: linux-kernel@vger.kernel.org

Matthieu and Jiri reported CPU stalls where a CPU got stuck in
mm_get_cid():

  https://lore.kernel.org/b24ffcb3-09d5-4e48-9070-0b69bc654281@kernel.org

After some tedious debugging it turned out to be another subtle (or not so
subtle) ownership mode change issue.

The logic handling vfork()'ed tasks in sched_mmcid_fixup_tasks_to_cpus() is
broken. It is invoked when the number of tasks associated with a process is
smaller than the number of MMCID users. It then walks the task list to find
the vfork()'ed task, but accounts all the already processed tasks as
well.

If that double processing brings the number of tasks still to be handled
to 0, the walk stops and the vfork()'ed task's CID is not fixed up. As a
consequence a subsequent schedule-in fails to acquire a (transitional) CID
and the machine stalls.

Peter and I also discovered a not yet reported issue with concurrent
forks. Jiri noticed it independently.

The following series fixes those issues. It applies on top of Linus' tree.

Thanks a lot to Matthieu and Jiri for providing valuable debug information
and running the debug patches!

Thanks,

	tglx
---
 include/linux/rseq_types.h |    6 ++-
 include/linux/sched.h      |    2 -
 kernel/fork.c              |    3 -
 kernel/sched/core.c        |   79 +++++++++++++++------------------------------
 4 files changed, 34 insertions(+), 56 deletions(-)
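
For readers who want to see the double accounting in isolation, here is a
minimal user-space toy model of the failure described above. It is not the
kernel code: struct toy_task, fixup_group(), fixup_all() and toy_run() are
hypothetical names made up for this sketch, and the skip_fixed variant is
only one conceivable way to avoid the double count, not necessarily what the
series does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types and names, not kernel code. */
struct toy_task {
	int  mm_id;     /* address space the task runs on */
	bool in_group;  /* member of the process's thread group */
	bool fixed;     /* CID has been fixed up */
};

/* Pass 1: fix up the thread group, return the users still to be handled. */
static int fixup_group(struct toy_task *t, size_t n, int mm_id, int users)
{
	for (size_t i = 0; i < n; i++) {
		if (t[i].mm_id == mm_id && t[i].in_group) {
			t[i].fixed = true;
			users--;
		}
	}
	return users;
}

/*
 * Pass 2: walk all tasks to find the remaining users of the mm.
 * With skip_fixed == false, tasks already handled in pass 1 are
 * accounted a second time, so 'remaining' can hit 0 too early.
 */
static void fixup_all(struct toy_task *t, size_t n, int mm_id,
		      int remaining, bool skip_fixed)
{
	for (size_t i = 0; i < n && remaining > 0; i++) {
		if (t[i].mm_id != mm_id)
			continue;
		if (skip_fixed && t[i].fixed)
			continue;
		t[i].fixed = true;
		remaining--;
	}
}

/* Returns true if every user of mm 1 ended up fixed. */
static bool toy_run(bool skip_fixed)
{
	struct toy_task tasks[] = {
		{ 1, true,  false },	/* thread group leader */
		{ 1, true,  false },	/* second thread */
		{ 2, false, false },	/* unrelated process */
		{ 1, false, false },	/* vfork()'ed child sharing mm 1 */
	};
	size_t n = sizeof(tasks) / sizeof(tasks[0]);
	int users = 3;	/* two threads plus the vfork()'ed child */
	int remaining = fixup_group(tasks, n, 1, users);

	fixup_all(tasks, n, 1, remaining, skip_fixed);

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].mm_id == 1 && !tasks[i].fixed)
			return false;
	}
	return true;
}
```

In this model toy_run(false) returns false: the second walk counts the
thread group leader again, the counter drops to 0, and the vfork()'ed child
is never reached. toy_run(true) returns true because already handled tasks
no longer consume the counter.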