* Regression on linux-next (next-20250414)
@ 2025-04-16 18:09 Borah, Chaitanya Kumar
2025-04-24 13:27 ` Borah, Chaitanya Kumar
0 siblings, 1 reply; 9+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-16 18:09 UTC (permalink / raw)
To: luto@kernel.org
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
linux-kernel@vger.kernel.org
Hello Andy,
Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.
This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.
Since version next-20250414 [2], we are seeing the following regression:
`````````````````````````````````````````````````````````````````````````````````
<4>[ 0.203154] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795 switch_mm_irqs_off+0x389/0x410
<5>[ 0.203173] Modules linked in:
<5>[ 0.203184] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.15.0-rc2-next-20250414-next-20250414-gb425262c07a6+ #1 PREEMPT(voluntary)
<5>[ 0.203207] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021
<5>[ 0.203229] RIP: 0010:switch_mm_irqs_off+0x389/0x410
<5>[ 0.203241] Code: e9 4d fd ff ff be 00 01 00 00 31 ff e8 60 ba f9 ff e9 29 fe ff ff 48 c7 c7 60 25 a1 82 e8 bf 73 a2 00 84 c0 0f 85 d4 fc ff ff <0f> 0b e9 cd fc ff ff bf 0b 01 00 00 be 01 00 00 00 31 d2 e8 1f e9
<5>[ 0.203271] RSP: 0000:ffffffff83403d90 EFLAGS: 00010246
<5>[ 0.203283] RAX: 0000000000000000 RBX: ffffffff8389f080 RCX: 0000000100a8c000
<5>[ 0.203296] RDX: ffffffff83414200 RSI: 0000000000000000 RDI: 0000000000000000
<5>[ 0.203309] RBP: ffffffff83403dc8 R08: 000000008d3ea018 R09: 0000000000000000
<5>[ 0.203322] R10: 0000000000000000 R11: 0000000003f55067 R12: 0000000000000000
<5>[ 0.203335] R13: ffffffff836d0b40 R14: ffffffff83414200 R15: 0000000000000000
<5>[ 0.203348] FS: 0000000000000000(0000) GS:ffff8884d94f6000(0000) knlGS:0000000000000000
<5>[ 0.203363] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<5>[ 0.203374] CR2: ffff88846dfff000 CR3: 000000000344a001 CR4: 00000000003706f0
<5>[ 0.203387] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<5>[ 0.203400] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<5>[ 0.203412] Call Trace:
<5>[ 0.203418] <TASK>
<5>[ 0.203428] use_temporary_mm+0x5b/0x130
<5>[ 0.203439] efi_set_virtual_address_map+0x4c/0x250
<5>[ 0.203452] ? efi_sync_low_kernel_mappings+0x10a/0x220
<5>[ 0.203467] efi_enter_virtual_mode+0x205/0x5b0
<5>[ 0.203482] start_kernel+0xa38/0xc60
<5>[ 0.203492] ? sme_unmap_bootdata+0x14/0x80
<5>[ 0.203504] x86_64_start_reservations+0x18/0x30
<5>[ 0.203516] x86_64_start_kernel+0xbf/0x110
<5>[ 0.203526] ? soft_restart_cpu+0x14/0x14
<5>[ 0.203536] common_startup_64+0x13e/0x141
<5>[ 0.203555] </TASK>
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [3].
After bisecting the tree, the following patch [4] seems to be the first "bad"
commit
`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit e7021e2fe0b4335523d3f6e2221000bdfc633b62
Author: Andy Lutomirski <luto@kernel.org>
Date: Wed Apr 2 11:45:39 2025 +0200
x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery
`````````````````````````````````````````````````````````````````````````````````````````````````````````
We also verified that the issue is no longer seen when the patch is reverted.
Could you please check why the patch causes this regression and provide a fix if necessary?
Thank you.
Regards
Chaitanya
[1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250414
[3] https://intel-gfx-ci.01.org/tree/linux-next/next-20250414/bat-dg2-8/boot0.txt
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250414&id=e7021e2fe0b4335523d3f6e2221000bdfc633b62
^ permalink raw reply [flat|nested] 9+ messages in thread
* RE: Regression on linux-next (next-20250414)
2025-04-16 18:09 Regression on linux-next (next-20250414) Borah, Chaitanya Kumar
@ 2025-04-24 13:27 ` Borah, Chaitanya Kumar
2025-04-29 9:01 ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Jani Nikula
0 siblings, 1 reply; 9+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-24 13:27 UTC (permalink / raw)
To: luto@kernel.org
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
linux-kernel@vger.kernel.org, peterz@infradead.org, Ingo Molnar
+Andy, Ingo
Friendly reminder.
The issue is still seen on the latest linux-next runs.
https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
Regards
Chaitanya
^ permalink raw reply [flat|nested] 9+ messages in thread
* [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
2025-04-24 13:27 ` Borah, Chaitanya Kumar
@ 2025-04-29 9:01 ` Jani Nikula
2025-04-29 18:29 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Jani Nikula @ 2025-04-29 9:01 UTC (permalink / raw)
To: Borah, Chaitanya Kumar, luto@kernel.org
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
linux-kernel@vger.kernel.org, peterz@infradead.org, Ingo Molnar
On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> +Andy, Ingo
>
> Friendly reminder.
> Issue is still seen on latest linux-next runs.
>
> https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
>
> Regards
>
> Chaitanya
Andy, Ingo -
Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
use_/unuse_temporary_mm() machinery") on linux-next regresses as
reported by Chaitanya.
Please look into it.
Thanks,
Jani.
--
Jani Nikula, Intel
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
2025-04-29 9:01 ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Jani Nikula
@ 2025-04-29 18:29 ` Peter Zijlstra
2025-04-30 6:07 ` Hugh Dickins
2025-04-30 8:47 ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
0 siblings, 2 replies; 9+ messages in thread
From: Peter Zijlstra @ 2025-04-29 18:29 UTC (permalink / raw)
To: Jani Nikula
Cc: Borah, Chaitanya Kumar, luto@kernel.org,
intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
linux-kernel@vger.kernel.org, Ingo Molnar
On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> > +Andy, Ingo
> >
> > Friendly reminder.
> > Issue is still seen on latest linux-next runs.
> >
> > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> >
> > Regards
> >
> > Chaitanya
>
> Andy, Ingo -
>
> Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> use_/unuse_temporary_mm() machinery") on linux-next regresses as
> reported by Chaitanya
>
> Please look into it.
Does your kernel include the below?
---
commit aef1d0209ddf127a8069aca5fa3a062be4136b76
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri Apr 18 11:50:34 2025 +0200

    x86/mm: Fix {,un}use_temporary_mm() IRQ state

    As the function switch_mm_irqs_off() implies, it ought to be called with
    IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs
    are on") caused this to not be the case for EFI.

    Ensure IRQs are off where it matters.

    Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
    Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Rik van Riel <riel@surriel.com>
    Link: https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kicks-ass.net

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 79c124f6f3f2..39761c7765bd 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
 	struct mm_struct *prev_mm;
 
 	lockdep_assert_preemption_disabled();
+	guard(irqsave)();
 
 	/*
 	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
@@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
 void unuse_temporary_mm(struct mm_struct *prev_mm)
 {
 	lockdep_assert_preemption_disabled();
+	guard(irqsave)();
 
 	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
 	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
2025-04-29 18:29 ` Peter Zijlstra
@ 2025-04-30 6:07 ` Hugh Dickins
2025-04-30 8:11 ` Peter Zijlstra
2025-04-30 8:47 ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
1 sibling, 1 reply; 9+ messages in thread
From: Hugh Dickins @ 2025-04-30 6:07 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jani Nikula, Borah, Chaitanya Kumar, luto@kernel.org,
intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
linux-kernel@vger.kernel.org, Ingo Molnar
On Tue, 29 Apr 2025, Peter Zijlstra wrote:
> On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> > > +Andy, Ingo
> > >
> > > Friendly reminder.
> > > Issue is still seen on latest linux-next runs.
> > >
> > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> > >
> > > Regards
> > >
> > > Chaitanya
> >
> > Andy, Ingo -
> >
> > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > reported by Chaitanya
> >
> > Please look into it.
>
> Does your kernel include the below?
>
> ---
> commit aef1d0209ddf127a8069aca5fa3a062be4136b76
> Author: Peter Zijlstra <peterz@infradead.org>
> Date: Fri Apr 18 11:50:34 2025 +0200
>
> x86/mm: Fix {,un}use_temporary_mm() IRQ state
>
> As the function switch_mm_irqs_off() implies, it ought to be called with
> IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs
> are on") caused this to not be the case for EFI.
>
> Ensure IRQs are off where it matters.
>
> Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
> Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
> Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Rik van Riel <riel@surriel.com>
> Link: https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kicks-ass.net
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 79c124f6f3f2..39761c7765bd 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
> struct mm_struct *prev_mm;
>
> lockdep_assert_preemption_disabled();
> + guard(irqsave)();
>
> /*
> * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
> void unuse_temporary_mm(struct mm_struct *prev_mm)
> {
> lockdep_assert_preemption_disabled();
> + guard(irqsave)();
>
> /* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
> cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
Hi Peter, I haven't checked on most recent -nexts, but earlier found that
patch to be not quite enough, at least if you have CONFIG_DEBUG_VM=y:
because switch_mm_irqs_off() contains a
VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu,
mm_cpumask(prev)));
which doesn't like what (un)use_temporary_mm() is now doing. I couldn't
be sure who was right or wrong, and just proceeded by commenting out
the warning - ONCE shouldn't be much trouble, except xfstests uses
some nefarious mechanism to resurrect ONCE repeatedly.
Hugh
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
2025-04-30 6:07 ` Hugh Dickins
@ 2025-04-30 8:11 ` Peter Zijlstra
2025-05-06 9:42 ` [tip: x86/alternatives] x86/mm: Fix false positive warning in switch_mm_irqs_off() tip-bot2 for Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2025-04-30 8:11 UTC (permalink / raw)
To: Hugh Dickins
Cc: Jani Nikula, Borah, Chaitanya Kumar, luto@kernel.org,
intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
linux-kernel@vger.kernel.org, Ingo Molnar, riel
On Tue, Apr 29, 2025 at 11:07:45PM -0700, Hugh Dickins wrote:
> On Tue, 29 Apr 2025, Peter Zijlstra wrote:
> > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> > index 79c124f6f3f2..39761c7765bd 100644
> > --- a/arch/x86/mm/tlb.c
> > +++ b/arch/x86/mm/tlb.c
> > @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
> > struct mm_struct *prev_mm;
> >
> > lockdep_assert_preemption_disabled();
> > + guard(irqsave)();
> >
> > /*
> > * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> > @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
> > void unuse_temporary_mm(struct mm_struct *prev_mm)
> > {
> > lockdep_assert_preemption_disabled();
> > + guard(irqsave)();
> >
> > /* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
> > cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
>
> Hi Peter, I haven't checked on most recent -nexts, but earlier found that
> patch to be not quite enough, at least if you have CONFIG_DEBUG_VM=y:
> because switch_mm_irqs_off() contains a
>
> VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu,
> mm_cpumask(prev)));
>
> which doesn't like what (un)use_temporary_mm() is now doing. I couldn't
> be sure who was right or wrong, and just proceeded by commenting out
> the warning - ONCE shouldn't be much trouble, except xfstests uses
> some nefarious mechanism to resurrect ONCE repeatedly.
Oh that one. Yeah, I thought Ingo had already deleted that WARN, but it
seems it's still there.
So the problem is that unuse_temporary_mm() explicitly clears that bit;
and it has to, because otherwise the flush_tlb_mm_range() in
__text_poke() will try sending IPIs, which are not at all needed.
(See also:
https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/
)
Notably, the whole {,un}use_temporary_mm() thing requires preemption to
be disabled across it with the express purpose of keeping all TLB
nonsense CPU local, such that invalidations can also stay local etc.
However, as a side-effect, we violate this above WARN(), which sorta
makes sense for the normal case, but very much doesn't make sense here.
There are two ways out: either have unuse_temporary_mm() mark the mm_struct
such that a further exception (beyond init_mm) can be grafted onto the
check, or simply delete the whole check.
Anyway, something like the below, or just delete the check I suppose.
Opinions?
---
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 8b8055a8eb9e..0fe9c569d171 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -16,6 +16,8 @@
 #define MM_CONTEXT_LOCK_LAM		2
 /* Allow LAM and SVA coexisting */
 #define MM_CONTEXT_FORCE_TAGGED_SVA	3
+/* Tracks mm_cpumask */
+#define MM_CONTEXT_NOTRACK		4
 
 /*
  * x86 has arch-specific MMU state beyond what lives in mm_struct.
@@ -44,9 +46,7 @@ typedef struct {
 	struct ldt_struct	*ldt;
 #endif
 
-#ifdef CONFIG_X86_64
 	unsigned long flags;
-#endif
 
 #ifdef CONFIG_ADDRESS_MASKING
 	/* Active LAM mode:  X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index c511f8584ae4..73bf3b1b44e8 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -247,6 +247,16 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
 }
 #endif
 
+static inline bool is_notrack_mm(struct mm_struct *mm)
+{
+	return test_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
+static inline void set_notrack_mm(struct mm_struct *mm)
+{
+	set_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
 /*
  * We only want to enforce protection keys on the current process
  * because we effectively have no access to PKRU for other
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index f8c74d19bebb..aa56d9ac0b8f 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -28,6 +28,7 @@
 #include <asm/text-patching.h>
 #include <asm/memtype.h>
 #include <asm/paravirt.h>
+#include <asm/mmu_context.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
@@ -830,6 +831,8 @@ void __init poking_init(void)
 	/* Xen PV guests need the PGD to be pinned. */
 	paravirt_enter_mmap(text_poke_mm);
 
+	set_notrack_mm(text_poke_mm);
+
 	/*
 	 * Randomize the poking address, but make sure that the following page
 	 * will be mapped at the same PMD. We need 2 pages, so find space for 3,
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 1451e022129a..25bfc3305158 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -852,7 +852,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		 * mm_cpumask. The TLB shootdown code can figure out from
 		 * cpu_tlbstate_shared.is_lazy whether or not to send an IPI.
 		 */
-		if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
+		if (IS_ENABLED(CONFIG_DEBUG_VM) &&
+		    WARN_ON_ONCE(prev != &init_mm && !is_notrack_mm(prev) &&
 				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
 
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 8e1796dd6c68..e7e8f77f77f8 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -89,6 +89,7 @@ int __init efi_alloc_page_tables(void)
 	efi_mm.pgd = efi_pgd;
 	mm_init_cpumask(&efi_mm);
 	init_new_context(NULL, &efi_mm);
+	set_notrack_mm(&efi_mm);
 
 	return 0;
^ permalink raw reply related [flat|nested] 9+ messages in thread
* RE: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
2025-04-29 18:29 ` Peter Zijlstra
2025-04-30 6:07 ` Hugh Dickins
@ 2025-04-30 8:47 ` Borah, Chaitanya Kumar
2025-04-30 8:51 ` Peter Zijlstra
1 sibling, 1 reply; 9+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-30 8:47 UTC (permalink / raw)
To: Peter Zijlstra, Jani Nikula
Cc: luto@kernel.org, intel-gfx@lists.freedesktop.org,
intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar,
Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org,
Ingo Molnar, hughd@google.com
> -----Original Message-----
> From: Peter Zijlstra <peterz@infradead.org>
> Sent: Tuesday, April 29, 2025 11:59 PM
> To: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>;
> luto@kernel.org; intel-gfx@lists.freedesktop.org; intel-
> xe@lists.freedesktop.org; Kurmi, Suresh Kumar
> <suresh.kumar.kurmi@intel.com>; Saarinen, Jani <jani.saarinen@intel.com>;
> De Marchi, Lucas <lucas.demarchi@intel.com>; linux-kernel@vger.kernel.org;
> Ingo Molnar <mingo@kernel.org>
> Subject: Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the
> use_/unuse_temporary_mm() machinery (linux-next)
>
> On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar"
> <chaitanya.kumar.borah@intel.com> wrote:
> > > +Andy, Ingo
> > >
> > > Friendly reminder.
> > > Issue is still seen on latest linux-next runs.
> > >
> > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4
> > > /boot0.txt
> > >
> > > Regards
> > >
> > > Chaitanya
> >
> > Andy, Ingo -
> >
> > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > reported by Chaitanya
> >
> > Please look into it.
>
> Does your kernel include the below?
This change has not yet landed in linux-next. However, applying it locally on top of next-20250429 seems to resolve the issue for us.
It is important to note that we do not have CONFIG_DEBUG_VM=y (the option Hugh mentioned) set.
Any idea when this will land in linux-next?
Regards
Chaitanya
>
> ---
> commit aef1d0209ddf127a8069aca5fa3a062be4136b76
> Author: Peter Zijlstra <peterz@infradead.org>
> Date: Fri Apr 18 11:50:34 2025 +0200
>
> x86/mm: Fix {,un}use_temporary_mm() IRQ state
>
> As the function switch_mm_irqs_off() implies, it ought to be called with
> IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when
> IRQs
> are on") caused this to not be the case for EFI.
>
> Ensure IRQs are off where it matters.
>
> Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
> Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
> Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> Signed-off-by: Ingo Molnar <mingo@kernel.org>
> Cc: H. Peter Anvin <hpa@zytor.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Andy Lutomirski <luto@kernel.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Rik van Riel <riel@surriel.com>
> Link:
> https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kick
> s-ass.net
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index
> 79c124f6f3f2..39761c7765bd 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct
> mm_struct *temp_mm)
> struct mm_struct *prev_mm;
>
> lockdep_assert_preemption_disabled();
> + guard(irqsave)();
>
> /*
> * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct
> mm_struct *temp_mm) void unuse_temporary_mm(struct mm_struct
> *prev_mm) {
> lockdep_assert_preemption_disabled();
> + guard(irqsave)();
>
> /* Clear the cpumask, to indicate no TLB flushing is needed anywhere
> */
> cpumask_clear_cpu(smp_processor_id(),
> mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
2025-04-30 8:47 ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
@ 2025-04-30 8:51 ` Peter Zijlstra
0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2025-04-30 8:51 UTC (permalink / raw)
To: Borah, Chaitanya Kumar
Cc: Jani Nikula, luto@kernel.org, intel-gfx@lists.freedesktop.org,
intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar,
Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org,
Ingo Molnar, hughd@google.com
On Wed, Apr 30, 2025 at 08:47:43AM +0000, Borah, Chaitanya Kumar wrote:
>
>
> > -----Original Message-----
> > From: Peter Zijlstra <peterz@infradead.org>
> > Sent: Tuesday, April 29, 2025 11:59 PM
> > To: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>;
> > luto@kernel.org; intel-gfx@lists.freedesktop.org; intel-
> > xe@lists.freedesktop.org; Kurmi, Suresh Kumar
> > <suresh.kumar.kurmi@intel.com>; Saarinen, Jani <jani.saarinen@intel.com>;
> > De Marchi, Lucas <lucas.demarchi@intel.com>; linux-kernel@vger.kernel.org;
> > Ingo Molnar <mingo@kernel.org>
> > Subject: Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery (linux-next)
> >
> > On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar"
> > <chaitanya.kumar.borah@intel.com> wrote:
> > > > +Andy, Ingo
> > > >
> > > > Friendly reminder.
> > > > Issue is still seen on latest linux-next runs.
> > > >
> > > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> > > >
> > > > Regards
> > > >
> > > > Chaitanya
> > >
> > > Andy, Ingo -
> > >
> > > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > > reported by Chaitanya
> > >
> > > Please look into it.
> >
> > Does your kernel include the below?
>
> This change has not yet landed in linux-next. However, applying it locally on top of next-20250429 seems to help.
>
> Important to note that we don't have CONFIG_DEBUG_VM=y set, as mentioned by Hugh.
>
> Any idea when this lands in linux-next?
This is the top commit in tip/x86/alternatives and should already be in
-next. Ingo, any idea what is going wrong?
* [tip: x86/alternatives] x86/mm: Fix false positive warning in switch_mm_irqs_off()
2025-04-30 8:11 ` Peter Zijlstra
@ 2025-05-06 9:42 ` tip-bot2 for Peter Zijlstra
0 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2025-05-06 9:42 UTC (permalink / raw)
To: linux-tip-commits
Cc: Chaitanya Kumar Borah, Jani Nikula, Peter Zijlstra, Ingo Molnar,
Andrew Cooper, Andy Lutomirski, Brian Gerst, H. Peter Anvin,
Juergen Gross, Linus Torvalds, Rik van Riel, x86, linux-kernel
The following commit has been merged into the x86/alternatives branch of tip:
Commit-ID: 7f9958230d8a79d474829bee25ec9426397335ce
Gitweb: https://git.kernel.org/tip/7f9958230d8a79d474829bee25ec9426397335ce
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 30 Apr 2025 10:11:54 +02:00
Committer: Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 06 May 2025 11:28:57 +02:00
x86/mm: Fix false positive warning in switch_mm_irqs_off()
Multiple testers reported the following new warning:

  WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795

Which corresponds to:

  if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
      !cpumask_test_cpu(cpu, mm_cpumask(next))))
          cpumask_set_cpu(cpu, mm_cpumask(next));

So the problem is that unuse_temporary_mm() explicitly clears
that bit; and it has to, because otherwise the flush_tlb_mm_range() in
__text_poke() will try sending IPIs, which are not at all needed.

See also:

  https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/

Notably, the whole {,un}use_temporary_mm() thing requires preemption to
be disabled across it with the express purpose of keeping all TLB
nonsense CPU local, such that invalidations can also stay local etc.

However, as a side-effect, we violate this above WARN(), which sorta
makes sense for the normal case, but very much doesn't make sense here.

Change unuse_temporary_mm() to mark the mm_struct such that a further
exception (beyond init_mm) can be grafted, to keep the warning for all
the other cases.

Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reported-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/20250430081154.GH4439@noisy.programming.kicks-ass.net
---
arch/x86/include/asm/mmu.h | 4 ++--
arch/x86/include/asm/mmu_context.h | 10 ++++++++++
arch/x86/mm/init.c | 3 +++
arch/x86/mm/tlb.c | 3 ++-
arch/x86/platform/efi/efi_64.c | 1 +
5 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 8b8055a..0fe9c56 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -16,6 +16,8 @@
#define MM_CONTEXT_LOCK_LAM 2
/* Allow LAM and SVA coexisting */
#define MM_CONTEXT_FORCE_TAGGED_SVA 3
+/* Tracks mm_cpumask */
+#define MM_CONTEXT_NOTRACK 4
/*
* x86 has arch-specific MMU state beyond what lives in mm_struct.
@@ -44,9 +46,7 @@ typedef struct {
struct ldt_struct *ldt;
#endif
-#ifdef CONFIG_X86_64
unsigned long flags;
-#endif
#ifdef CONFIG_ADDRESS_MASKING
/* Active LAM mode: X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index c511f85..73bf3b1 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -247,6 +247,16 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
}
#endif
+static inline bool is_notrack_mm(struct mm_struct *mm)
+{
+ return test_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
+static inline void set_notrack_mm(struct mm_struct *mm)
+{
+ set_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
/*
* We only want to enforce protection keys on the current process
* because we effectively have no access to PKRU for other
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index f8c74d1..aa56d9a 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -28,6 +28,7 @@
#include <asm/text-patching.h>
#include <asm/memtype.h>
#include <asm/paravirt.h>
+#include <asm/mmu_context.h>
/*
* We need to define the tracepoints somewhere, and tlb.c
@@ -830,6 +831,8 @@ void __init poking_init(void)
/* Xen PV guests need the PGD to be pinned. */
paravirt_enter_mmap(text_poke_mm);
+ set_notrack_mm(text_poke_mm);
+
/*
* Randomize the poking address, but make sure that the following page
* will be mapped at the same PMD. We need 2 pages, so find space for 3,
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 39761c7..f5b990e 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -847,7 +847,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
* mm_cpumask. The TLB shootdown code can figure out from
* cpu_tlbstate_shared.is_lazy whether or not to send an IPI.
*/
- if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
+ if (IS_ENABLED(CONFIG_DEBUG_VM) &&
+ WARN_ON_ONCE(prev != &init_mm && !is_notrack_mm(prev) &&
!cpumask_test_cpu(cpu, mm_cpumask(next))))
cpumask_set_cpu(cpu, mm_cpumask(next));
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index a5d3496..ce4c08a 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -89,6 +89,7 @@ int __init efi_alloc_page_tables(void)
efi_mm.pgd = efi_pgd;
mm_init_cpumask(&efi_mm);
init_new_context(NULL, &efi_mm);
+ set_notrack_mm(&efi_mm);
return 0;
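
The effect of the tlb.c hunk above can be illustrated in isolation. The following is a minimal userspace sketch that models only the boolean predicate guarding the warning, assuming a single unsigned long as the CPU mask. All fake_* names are hypothetical stand-ins; the real helpers use test_bit()/set_bit() on mm->context.flags and the kernel cpumask API.

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit number as MM_CONTEXT_NOTRACK in the patch. */
#define FAKE_MM_CONTEXT_NOTRACK 4

/* Hypothetical, heavily reduced stand-in for struct mm_struct. */
struct fake_mm {
	unsigned long flags;	/* plays the role of mm->context.flags */
	unsigned long cpumask;	/* one bit per CPU, in place of mm_cpumask() */
};

/* Plays the role of init_mm, which the check already exempted. */
static struct fake_mm fake_init_mm;

static bool fake_is_notrack_mm(const struct fake_mm *mm)
{
	return (mm->flags & (1UL << FAKE_MM_CONTEXT_NOTRACK)) != 0;
}

static void fake_set_notrack_mm(struct fake_mm *mm)
{
	mm->flags |= 1UL << FAKE_MM_CONTEXT_NOTRACK;
}

/*
 * Mirrors the patched condition: the warning fires only when prev is
 * neither init_mm nor marked notrack, and this CPU's bit is missing
 * from next's mask.
 */
static bool fake_should_warn(const struct fake_mm *prev,
			     const struct fake_mm *next, unsigned int cpu)
{
	return prev != &fake_init_mm &&
	       !fake_is_notrack_mm(prev) &&
	       !(next->cpumask & (1UL << cpu));
}
```

With an unmarked prev and an empty mask in next, fake_should_warn() returns true; once prev has been passed to fake_set_notrack_mm(), the same call returns false, which is the additional exception the patch introduces for text_poke_mm and efi_mm.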
Thread overview: 9+ messages
2025-04-16 18:09 Regression on linux-next (next-20250414) Borah, Chaitanya Kumar
2025-04-24 13:27 ` Borah, Chaitanya Kumar
2025-04-29 9:01 ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Jani Nikula
2025-04-29 18:29 ` Peter Zijlstra
2025-04-30 6:07 ` Hugh Dickins
2025-04-30 8:11 ` Peter Zijlstra
2025-05-06 9:42 ` [tip: x86/alternatives] x86/mm: Fix false positive warning in switch_mm_irqs_off() tip-bot2 for Peter Zijlstra
2025-04-30 8:47 ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
2025-04-30 8:51 ` Peter Zijlstra