* Re: [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0
[not found] <20260212211213.F1BE52A1C1D@windowsforum.com>
@ 2026-02-12 21:19 ` Mathieu Desnoyers
2026-02-12 23:21 ` Thomas Gleixner
0 siblings, 1 reply; 3+ messages in thread
From: Mathieu Desnoyers @ 2026-02-12 21:19 UTC (permalink / raw)
To: root, Thomas Gleixner
Cc: peterz, mingo, linux-kernel, mjfara, Greg Kroah-Hartman,
stable@vger.kernel.org
On 2026-02-12 16:12, root wrote:
> To: mathieu.desnoyers@efficios.com
> Cc: peterz@infradead.org, mingo@redhat.com, linux-kernel@vger.kernel.org
> Subject: [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0
>
> Hi Mathieu,
>
> I'm hitting a repeatable page fault in sched_mm_cid_exit() on 6.19.0
> when booting with nopti. The crash occurs during process exit
> (do_exit -> sched_mm_cid_exit) on an atomic bit-clear (lock btr) of
> the CID bitmap. The faulting address is within a 2MB huge page that
> returns a permissions violation on supervisor write access.
>
> The bug triggered 8 times over ~20 hours on a single boot, hitting
> multiple unrelated processes (git, gce_workload_ce). Eventually D-Bus
> died and systemd became non-functional, requiring a hard power-off.
Can you confirm whether the following fix in Linus' tree fixes your issue?
commit 1e83ccd5921a ("sched/mmcid: Don't assume CID is CPU owned on mode switch")
I suspect that it will soon be cherry-picked into stable for an eventual v6.19.1.
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com
* Re: [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0
2026-02-12 21:19 ` [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0 Mathieu Desnoyers
@ 2026-02-12 23:21 ` Thomas Gleixner
2026-02-13 11:16 ` Greg Kroah-Hartman
0 siblings, 1 reply; 3+ messages in thread
From: Thomas Gleixner @ 2026-02-12 23:21 UTC (permalink / raw)
To: Mathieu Desnoyers, root
Cc: peterz, mingo, linux-kernel, mjfara, Greg Kroah-Hartman,
stable@vger.kernel.org
On Thu, Feb 12 2026 at 16:19, Mathieu Desnoyers wrote:
> On 2026-02-12 16:12, root wrote:
>> I'm hitting a repeatable page fault in sched_mm_cid_exit() on 6.19.0
>> when booting with nopti. The crash occurs during process exit
>> (do_exit -> sched_mm_cid_exit) on an atomic bit-clear (lock btr) of
>> the CID bitmap. The faulting address is within a 2MB huge page that
>> returns a permissions violation on supervisor write access.
>>
>> The bug triggered 8 times over ~20 hours on a single boot, hitting
>> multiple unrelated processes (git, gce_workload_ce). Eventually D-Bus
>> died and systemd became non-functional, requiring a hard power-off.
>
> Can you confirm whether the following fix in Linus' tree fixes your issue?
It's exactly that problem:
2a:* f0 48 0f b3 10 lock btr %rdx,(%rax) <-- trapping instruction
RDX: 0000000020000006
which has the TRANSIT bit set, and that's what the commit below fixes:
> commit 1e83ccd5921a ("sched/mmcid: Don't assume CID is CPU owned on mode switch")
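
[Editor's note] The register dump above can be read with a small sketch. It rests on one assumption that is not stated in the thread: that the TRANSIT flag is bit 29 of the CID value, which is what RDX = 0x20000006 suggests (flag bit plus CID 6). All `example_*` names are hypothetical and are not kernel identifiers. The offset helper follows the documented x86 behaviour of a 64-bit `btr` with a register bit offset and a memory operand, which touches the quadword at base + 8 * (bit / 64).

```c
#include <stdint.h>

/* Assumption: TRANSIT is flag bit 29, inferred from RDX = 0x20000006. */
#define EXAMPLE_CID_TRANSIT (UINT32_C(1) << 29)

/* Nonzero if the assumed TRANSIT flag is set in a raw CID value. */
static inline int example_cid_in_transit(uint32_t cid)
{
	return (cid & EXAMPLE_CID_TRANSIT) != 0;
}

/* The plain CID number with the assumed flag bit masked off. */
static inline uint32_t example_cid_number(uint32_t cid)
{
	return cid & ~EXAMPLE_CID_TRANSIT;
}

/*
 * Byte offset, relative to the bitmap base, of the 64-bit word that a
 * 64-bit "btr %reg,(mem)" accesses for a non-negative bit offset.
 */
static inline uint64_t example_btr_byte_offset(uint64_t bit)
{
	return (bit / 64) * 8;
}
```

Under that assumption, `example_cid_number(0x20000006)` yields 6, while `example_btr_byte_offset(0x20000006)` yields 0x4000000, i.e. 64 MiB past the bitmap base. With the flag masked off, the offset for bit 6 is 0. That would be consistent with the write landing in an unrelated, write-protected 2MB mapping rather than in the CID bitmap.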
* Re: [BUG] sched_mm_cid_exit+0xe2: page fault on CID bitmap write with nopti on 6.19.0
2026-02-12 23:21 ` Thomas Gleixner
@ 2026-02-13 11:16 ` Greg Kroah-Hartman
0 siblings, 0 replies; 3+ messages in thread
From: Greg Kroah-Hartman @ 2026-02-13 11:16 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Mathieu Desnoyers, root, peterz, mingo, linux-kernel, mjfara,
stable@vger.kernel.org
On Fri, Feb 13, 2026 at 12:21:52AM +0100, Thomas Gleixner wrote:
> On Thu, Feb 12 2026 at 16:19, Mathieu Desnoyers wrote:
> > On 2026-02-12 16:12, root wrote:
> >> I'm hitting a repeatable page fault in sched_mm_cid_exit() on 6.19.0
> >> when booting with nopti. The crash occurs during process exit
> >> (do_exit -> sched_mm_cid_exit) on an atomic bit-clear (lock btr) of
> >> the CID bitmap. The faulting address is within a 2MB huge page that
> >> returns a permissions violation on supervisor write access.
> >>
> >> The bug triggered 8 times over ~20 hours on a single boot, hitting
> >> multiple unrelated processes (git, gce_workload_ce). Eventually D-Bus
> >> died and systemd became non-functional, requiring a hard power-off.
> >
> > Can you confirm whether the following fix in Linus' tree fixes your issue?
>
> It's exactly that problem:
>
> 2a:* f0 48 0f b3 10 lock btr %rdx,(%rax) <-- trapping instruction
>
> RDX: 0000000020000006
>
> which has the TRANSIT bit set, and that's what the commit below fixes:
>
> > commit 1e83ccd5921a ("sched/mmcid: Don't assume CID is CPU owned on mode switch")
>
Great, I'll go grab it now.
thanks,
greg k-h