* Re: [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0
  [not found] <20260212211213.F1BE52A1C1D@windowsforum.com>
From: Mathieu Desnoyers @ 2026-02-12 21:19 UTC
To: root, Thomas Gleixner
Cc: peterz, mingo, linux-kernel, mjfara, Greg Kroah-Hartman, stable@vger.kernel.org

On 2026-02-12 16:12, root wrote:
> To: mathieu.desnoyers@efficios.com
> Cc: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
> Subject: [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0
>
> Hi Mathieu,
>
> I'm hitting a repeatable page fault in sched_mm_cid_exit() on 6.19.0
> when booting with nopti. The crash occurs during process exit
> (do_exit -> sched_mm_cid_exit) on an atomic bit-clear (lock btr) of
> the CID bitmap. The faulting address is within a 2MB huge page that
> returns a permissions violation on supervisor write access.
>
> The bug triggered 8 times over ~20 hours on a single boot, hitting
> multiple unrelated processes (git, gce_workload_ce). Eventually D-Bus
> died and systemd became non-functional, requiring a hard power-off.

Can you confirm whether the following fix in Linus' tree fixes your issue ?

commit 1e83ccd5921a ("sched/mmcid: Don't assume CID is CPU owned on mode switch")

I suspect that it will soon be cherry picked into stable for an eventual v6.19.1.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

* Re: [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0
From: Thomas Gleixner @ 2026-02-12 23:21 UTC
To: Mathieu Desnoyers, root
Cc: peterz, mingo, linux-kernel, mjfara, Greg Kroah-Hartman, stable@vger.kernel.org

On Thu, Feb 12 2026 at 16:19, Mathieu Desnoyers wrote:
> On 2026-02-12 16:12, root wrote:
>> I'm hitting a repeatable page fault in sched_mm_cid_exit() on 6.19.0
>> when booting with nopti. The crash occurs during process exit
>> (do_exit -> sched_mm_cid_exit) on an atomic bit-clear (lock btr) of
>> the CID bitmap. The faulting address is within a 2MB huge page that
>> returns a permissions violation on supervisor write access.
>>
>> The bug triggered 8 times over ~20 hours on a single boot, hitting
>> multiple unrelated processes (git, gce_workload_ce). Eventually D-Bus
>> died and systemd became non-functional, requiring a hard power-off.
>
> Can you confirm whether the following fix in Linus' tree fixes your issue ?

It's exactly that problem:

  2a:*	f0 48 0f b3 10       	lock btr %rdx,(%rax)	<-- trapping instruction

  RDX: 0000000020000006

which has the TRANSIT bit set and that's what below fixes:

> commit 1e83ccd5921a ("sched/mmcid: Don't assume CID is CPU owned on mode switch")
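
Editorial note, not part of the original thread: a minimal sketch of why the RDX value quoted above would make the bit-clear land far away from the CID bitmap. It assumes the TRANSIT flag is bit 29, which matches the quoted RDX value but is not stated anywhere in the thread. It also relies on the x86 rule that a 64-bit btr with a register bit offset and a memory operand accesses the quadword at base + 8 * (offset / 64). The names TRANSIT_FLAG, split_cid and btr_qword_offset are made up for illustration and are not kernel identifiers.

```python
# Assumption: the TRANSIT flag lives in bit 29 of the CID word. This is
# inferred from RDX = 0x20000006 in the report, not taken from kernel source.
TRANSIT_FLAG = 1 << 29


def split_cid(raw: int) -> tuple[bool, int]:
    """Return (is the TRANSIT flag set?, CID index with the flag masked off)."""
    return bool(raw & TRANSIT_FLAG), raw & ~TRANSIT_FLAG


def btr_qword_offset(bit_offset: int) -> int:
    """Byte offset, relative to the memory operand's base address, of the
    quadword touched by a 64-bit `btr reg, mem` for a non-negative bit offset."""
    return 8 * (bit_offset // 64)


if __name__ == "__main__":
    rdx = 0x20000006  # value from the register dump in the thread
    in_transit, cid = split_cid(rdx)
    print(f"TRANSIT set: {in_transit}, CID index: {cid}")
    print(f"offset with flag left in:  {btr_qword_offset(rdx):#x} bytes")
    print(f"offset with flag masked:   {btr_qword_offset(cid):#x} bytes")
```

Under these assumptions, passing the unmasked value as the bit index moves the access 0x4000000 bytes (64 MiB) past the address in RAX, whereas the masked CID 6 would stay inside the first quadword of the bitmap. That is consistent with the report of a supervisor write hitting an unrelated, write-protected 2MB mapping.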

* Re: [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0
From: Greg Kroah-Hartman @ 2026-02-13 11:16 UTC
To: Thomas Gleixner
Cc: Mathieu Desnoyers, root, peterz, mingo, linux-kernel, mjfara, stable@vger.kernel.org

On Fri, Feb 13, 2026 at 12:21:52AM +0100, Thomas Gleixner wrote:
> On Thu, Feb 12 2026 at 16:19, Mathieu Desnoyers wrote:
> > On 2026-02-12 16:12, root wrote:
> >> I'm hitting a repeatable page fault in sched_mm_cid_exit() on 6.19.0
> >> when booting with nopti. The crash occurs during process exit
> >> (do_exit -> sched_mm_cid_exit) on an atomic bit-clear (lock btr) of
> >> the CID bitmap. The faulting address is within a 2MB huge page that
> >> returns a permissions violation on supervisor write access.
> >>
> >> The bug triggered 8 times over ~20 hours on a single boot, hitting
> >> multiple unrelated processes (git, gce_workload_ce). Eventually D-Bus
> >> died and systemd became non-functional, requiring a hard power-off.
> >
> > Can you confirm whether the following fix in Linus' tree fixes your issue ?
>
> It's exactly that problem:
>
>   2a:*	f0 48 0f b3 10       	lock btr %rdx,(%rax)	<-- trapping instruction
>
>   RDX: 0000000020000006
>
> which has the TRANSIT bit set and that's what below fixes:
>
> > commit 1e83ccd5921a ("sched/mmcid: Don't assume CID is CPU owned on mode switch")
>

Great, I'll go grab it now.

thanks,

greg k-h