public inbox for linux-kernel@vger.kernel.org
* Regression on linux-next (next-20250414)
@ 2025-04-16 18:09 Borah, Chaitanya Kumar
  2025-04-24 13:27 ` Borah, Chaitanya Kumar
  0 siblings, 1 reply; 9+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-16 18:09 UTC (permalink / raw)
  To: luto@kernel.org
  Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
	linux-kernel@vger.kernel.org

Hello Andy,

Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.

This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.

Since version next-20250414 [2], we have been seeing the following regression:

`````````````````````````````````````````````````````````````````````````````````
<4>[    0.203154] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795 switch_mm_irqs_off+0x389/0x410
<5>[    0.203173] Modules linked in:
<5>[    0.203184] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.15.0-rc2-next-20250414-next-20250414-gb425262c07a6+ #1 PREEMPT(voluntary) 
<5>[    0.203207] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021
<5>[    0.203229] RIP: 0010:switch_mm_irqs_off+0x389/0x410
<5>[    0.203241] Code: e9 4d fd ff ff be 00 01 00 00 31 ff e8 60 ba f9 ff e9 29 fe ff ff 48 c7 c7 60 25 a1 82 e8 bf 73 a2 00 84 c0 0f 85 d4 fc ff ff <0f> 0b e9 cd fc ff ff bf 0b 01 00 00 be 01 00 00 00 31 d2 e8 1f e9
<5>[    0.203271] RSP: 0000:ffffffff83403d90 EFLAGS: 00010246
<5>[    0.203283] RAX: 0000000000000000 RBX: ffffffff8389f080 RCX: 0000000100a8c000
<5>[    0.203296] RDX: ffffffff83414200 RSI: 0000000000000000 RDI: 0000000000000000
<5>[    0.203309] RBP: ffffffff83403dc8 R08: 000000008d3ea018 R09: 0000000000000000
<5>[    0.203322] R10: 0000000000000000 R11: 0000000003f55067 R12: 0000000000000000
<5>[    0.203335] R13: ffffffff836d0b40 R14: ffffffff83414200 R15: 0000000000000000
<5>[    0.203348] FS:  0000000000000000(0000) GS:ffff8884d94f6000(0000) knlGS:0000000000000000
<5>[    0.203363] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<5>[    0.203374] CR2: ffff88846dfff000 CR3: 000000000344a001 CR4: 00000000003706f0
<5>[    0.203387] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<5>[    0.203400] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<5>[    0.203412] Call Trace:
<5>[    0.203418]  <TASK>
<5>[    0.203428]  use_temporary_mm+0x5b/0x130
<5>[    0.203439]  efi_set_virtual_address_map+0x4c/0x250
<5>[    0.203452]  ? efi_sync_low_kernel_mappings+0x10a/0x220
<5>[    0.203467]  efi_enter_virtual_mode+0x205/0x5b0
<5>[    0.203482]  start_kernel+0xa38/0xc60
<5>[    0.203492]  ? sme_unmap_bootdata+0x14/0x80
<5>[    0.203504]  x86_64_start_reservations+0x18/0x30
<5>[    0.203516]  x86_64_start_kernel+0xbf/0x110
<5>[    0.203526]  ? soft_restart_cpu+0x14/0x14
<5>[    0.203536]  common_startup_64+0x13e/0x141
<5>[    0.203555]  </TASK>
`````````````````````````````````````````````````````````````````````````````````
A detailed log can be found in [3].

After bisecting the tree, the following patch [4] seems to be the first "bad"
commit:

`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit e7021e2fe0b4335523d3f6e2221000bdfc633b62
Author: Andy Lutomirski <luto@kernel.org>
Date:   Wed Apr 2 11:45:39 2025 +0200

    x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery

`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that the issue is no longer seen when the patch is reverted.

Could you please check why the patch causes this regression and provide a fix if necessary?

Thank you.

Regards

Chaitanya

[1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250414
[3] https://intel-gfx-ci.01.org/tree/linux-next/next-20250414/bat-dg2-8/boot0.txt 
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250414&id=e7021e2fe0b4335523d3f6e2221000bdfc633b62



* RE: Regression on linux-next (next-20250414)
  2025-04-16 18:09 Regression on linux-next (next-20250414) Borah, Chaitanya Kumar
@ 2025-04-24 13:27 ` Borah, Chaitanya Kumar
  2025-04-29  9:01   ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Jani Nikula
  0 siblings, 1 reply; 9+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-24 13:27 UTC (permalink / raw)
  To: luto@kernel.org
  Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
	linux-kernel@vger.kernel.org, peterz@infradead.org, Ingo Molnar

+Andy, Ingo

Friendly reminder.
The issue is still seen on the latest linux-next runs.

https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt

Regards

Chaitanya




* [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
  2025-04-24 13:27 ` Borah, Chaitanya Kumar
@ 2025-04-29  9:01   ` Jani Nikula
  2025-04-29 18:29     ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Jani Nikula @ 2025-04-29  9:01 UTC (permalink / raw)
  To: Borah, Chaitanya Kumar, luto@kernel.org
  Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
	linux-kernel@vger.kernel.org, peterz@infradead.org, Ingo Molnar

On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> +Andy, Ingo
>
> Friendly reminder.
> Issue is still seen on latest linux-next runs.
>
> https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
>
> Regards
>
> Chaitanya

Andy, Ingo -

Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
use_/unuse_temporary_mm() machinery") on linux-next regresses as
reported by Chaitanya.

Please look into it.


Thanks,
Jani.




-- 
Jani Nikula, Intel


* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
  2025-04-29  9:01   ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Jani Nikula
@ 2025-04-29 18:29     ` Peter Zijlstra
  2025-04-30  6:07       ` Hugh Dickins
  2025-04-30  8:47       ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
  0 siblings, 2 replies; 9+ messages in thread
From: Peter Zijlstra @ 2025-04-29 18:29 UTC (permalink / raw)
  To: Jani Nikula
  Cc: Borah, Chaitanya Kumar, luto@kernel.org,
	intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
	linux-kernel@vger.kernel.org, Ingo Molnar

On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> > +Andy, Ingo
> >
> > Friendly reminder.
> > Issue is still seen on latest linux-next runs.
> >
> > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> >
> > Regards
> >
> > Chaitanya
> 
> Andy, Ingo -
> 
> Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> use_/unuse_temporary_mm() machinery") on linux-next regresses as
> reported by Chaitanya
> 
> Please look into it.

Does your kernel include the below?

---
commit aef1d0209ddf127a8069aca5fa3a062be4136b76
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Fri Apr 18 11:50:34 2025 +0200

    x86/mm: Fix {,un}use_temporary_mm() IRQ state
    
    As the function switch_mm_irqs_off() implies, it ought to be called with
    IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs
    are on") caused this to not be the case for EFI.
    
    Ensure IRQs are off where it matters.
    
    Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
    Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Rik van Riel <riel@surriel.com>
    Link: https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kicks-ass.net

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 79c124f6f3f2..39761c7765bd 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
 	struct mm_struct *prev_mm;
 
 	lockdep_assert_preemption_disabled();
+	guard(irqsave)();
 
 	/*
 	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
@@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
 void unuse_temporary_mm(struct mm_struct *prev_mm)
 {
 	lockdep_assert_preemption_disabled();
+	guard(irqsave)();
 
 	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
 	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));

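For readers unfamiliar with the scope-based guard used in the patch above, the following userspace sketch mimics its effect: the interrupt state is saved and interrupts are disabled on entry, and the saved state is restored automatically when the function returns. Everything here is a hypothetical stand-in (the `irqs_enabled` variable, `struct irq_guard`, `GUARD_IRQSAVE()` and the `fake_*` functions are not kernel APIs); it only relies on the GCC/Clang `cleanup` variable attribute, which is the compiler feature that the kernel's guard() helpers are built on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's interrupt-enable flag. */
static bool irqs_enabled = true;

/* Holds the interrupt state that was in effect when the guard was taken. */
struct irq_guard {
	bool saved;
};

/* Save the current state, then "disable interrupts". */
static inline struct irq_guard irq_guard_enter(void)
{
	struct irq_guard g = { .saved = irqs_enabled };

	irqs_enabled = false;
	return g;
}

/* Restore whatever state was saved; runs automatically at scope exit. */
static inline void irq_guard_exit(struct irq_guard *g)
{
	irqs_enabled = g->saved;
}

/* Toy analogue of guard(irqsave)(): the cleanup attribute ties the restore
 * to the lifetime of the local variable. */
#define GUARD_IRQSAVE() \
	struct irq_guard __attribute__((cleanup(irq_guard_exit), unused)) \
		irq_guard_var = irq_guard_enter()

/* Records whether the callee ever observed interrupts enabled. */
static bool seen_irqs_enabled_in_switch;

/* Toy callee that, like switch_mm_irqs_off(), expects IRQs to be off. */
static void fake_switch_mm_irqs_off(void)
{
	seen_irqs_enabled_in_switch = irqs_enabled;
}

/* Toy caller: the guard covers the whole function body. */
static void fake_use_temporary_mm(void)
{
	GUARD_IRQSAVE();

	fake_switch_mm_irqs_off();
}
```

Calling `fake_use_temporary_mm()` with `irqs_enabled` set leaves the flag set again afterwards, while the callee only ever sees it cleared. If the flag was already cleared on entry, it stays cleared on return, which is the point of saving and restoring rather than unconditionally re-enabling.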

* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
  2025-04-29 18:29     ` Peter Zijlstra
@ 2025-04-30  6:07       ` Hugh Dickins
  2025-04-30  8:11         ` Peter Zijlstra
  2025-04-30  8:47       ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
  1 sibling, 1 reply; 9+ messages in thread
From: Hugh Dickins @ 2025-04-30  6:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jani Nikula, Borah, Chaitanya Kumar, luto@kernel.org,
	intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
	linux-kernel@vger.kernel.org, Ingo Molnar

On Tue, 29 Apr 2025, Peter Zijlstra wrote:
> On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> > > +Andy, Ingo
> > >
> > > Friendly reminder.
> > > Issue is still seen on latest linux-next runs.
> > >
> > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> > >
> > > Regards
> > >
> > > Chaitanya
> > 
> > Andy, Ingo -
> > 
> > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > reported by Chaitanya
> > 
> > Please look into it.
> 
> Does your kernel include the below?
> 
> ---
> commit aef1d0209ddf127a8069aca5fa3a062be4136b76
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Fri Apr 18 11:50:34 2025 +0200
> 
>     x86/mm: Fix {,un}use_temporary_mm() IRQ state
>     
>     As the function switch_mm_irqs_off() implies, it ought to be called with
>     IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs
>     are on") caused this to not be the case for EFI.
>     
>     Ensure IRQs are off where it matters.
>     
>     Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
>     Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
>     Cc: H. Peter Anvin <hpa@zytor.com>
>     Cc: Andrew Morton <akpm@linux-foundation.org>
>     Cc: Andy Lutomirski <luto@kernel.org>
>     Cc: Linus Torvalds <torvalds@linux-foundation.org>
>     Cc: Rik van Riel <riel@surriel.com>
>     Link: https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kicks-ass.net
> 
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 79c124f6f3f2..39761c7765bd 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
>  	struct mm_struct *prev_mm;
>  
>  	lockdep_assert_preemption_disabled();
> +	guard(irqsave)();
>  
>  	/*
>  	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
>  void unuse_temporary_mm(struct mm_struct *prev_mm)
>  {
>  	lockdep_assert_preemption_disabled();
> +	guard(irqsave)();
>  
>  	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
>  	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));

Hi Peter, I haven't checked on most recent -nexts, but earlier found that
patch to be not quite enough, at least if you have CONFIG_DEBUG_VM=y:
because switch_mm_irqs_off() contains a

		VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu,
				mm_cpumask(prev)));

which doesn't like what (un)use_temporary_mm() is now doing. I couldn't
be sure who was right or wrong, and just proceeded by commenting out
the warning - ONCE shouldn't be much trouble, except xfstests uses
some nefarious mechanism to resurrect ONCE repeatedly.

Hugh


* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
  2025-04-30  6:07       ` Hugh Dickins
@ 2025-04-30  8:11         ` Peter Zijlstra
  2025-05-06  9:42           ` [tip: x86/alternatives] x86/mm: Fix false positive warning in switch_mm_irqs_off() tip-bot2 for Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2025-04-30  8:11 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Jani Nikula, Borah, Chaitanya Kumar, luto@kernel.org,
	intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org,
	Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas,
	linux-kernel@vger.kernel.org, Ingo Molnar, riel

On Tue, Apr 29, 2025 at 11:07:45PM -0700, Hugh Dickins wrote:
> On Tue, 29 Apr 2025, Peter Zijlstra wrote:

> > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> > index 79c124f6f3f2..39761c7765bd 100644
> > --- a/arch/x86/mm/tlb.c
> > +++ b/arch/x86/mm/tlb.c
> > @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
> >  	struct mm_struct *prev_mm;
> >  
> >  	lockdep_assert_preemption_disabled();
> > +	guard(irqsave)();
> >  
> >  	/*
> >  	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> > @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
> >  void unuse_temporary_mm(struct mm_struct *prev_mm)
> >  {
> >  	lockdep_assert_preemption_disabled();
> > +	guard(irqsave)();
> >  
> >  	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
> >  	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
> 
> Hi Peter, I haven't checked on most recent -nexts, but earlier found that
> patch to be not quite enough, at least if you have CONFIG_DEBUG_VM=y:
> because switch_mm_irqs_off() contains a
> 
> 		VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu,
> 				mm_cpumask(prev)));
> 
> which doesn't like what (un)use_temporary_mm() is now doing. I couldn't
> be sure who was right or wrong, and just proceeded by commenting out
> the warning - ONCE shouldn't be much trouble, except xfstests uses
> some nefarious mechanism to resurrect ONCE repeatedly.

Oh that one. Yeah, I thought Ingo had already deleted that WARN, but it
seems it's still there.

So the problem is that unuse_temporary_mm() explicitly clears that bit;
and it has to, because otherwise the flush_tlb_mm_range() in
__text_poke() will try sending IPIs, which are not at all needed.

 (See also:
   https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/
 )

Notably, the whole {,un}use_temporary_mm() thing requires preemption to
be disabled across it with the express purpose of keeping all TLB
nonsense CPU local, such that invalidations can also stay local etc.

However, as a side-effect, we violate this above WARN(), which sorta
makes sense for the normal case, but very much doesn't make sense here.

There are two ways out: either have unuse_temporary_mm() mark the mm_struct
such that a further exception (beyond init_mm) can be grafted on, or simply
delete the whole check.

Anyway, something like the below, or just delete the check I suppose.

Opinions?

---
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 8b8055a8eb9e..0fe9c569d171 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -16,6 +16,8 @@
 #define MM_CONTEXT_LOCK_LAM		2
 /* Allow LAM and SVA coexisting */
 #define MM_CONTEXT_FORCE_TAGGED_SVA	3
+/* Tracks mm_cpumask */
+#define MM_CONTEXT_NOTRACK		4
 
 /*
  * x86 has arch-specific MMU state beyond what lives in mm_struct.
@@ -44,9 +46,7 @@ typedef struct {
 	struct ldt_struct	*ldt;
 #endif
 
-#ifdef CONFIG_X86_64
 	unsigned long flags;
-#endif
 
 #ifdef CONFIG_ADDRESS_MASKING
 	/* Active LAM mode:  X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index c511f8584ae4..73bf3b1b44e8 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -247,6 +247,16 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
 }
 #endif
 
+static inline bool is_notrack_mm(struct mm_struct *mm)
+{
+	return test_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
+static inline void set_notrack_mm(struct mm_struct *mm)
+{
+	set_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
 /*
  * We only want to enforce protection keys on the current process
  * because we effectively have no access to PKRU for other
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index f8c74d19bebb..aa56d9ac0b8f 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -28,6 +28,7 @@
 #include <asm/text-patching.h>
 #include <asm/memtype.h>
 #include <asm/paravirt.h>
+#include <asm/mmu_context.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
@@ -830,6 +831,8 @@ void __init poking_init(void)
 	/* Xen PV guests need the PGD to be pinned. */
 	paravirt_enter_mmap(text_poke_mm);
 
+	set_notrack_mm(text_poke_mm);
+
 	/*
 	 * Randomize the poking address, but make sure that the following page
 	 * will be mapped at the same PMD. We need 2 pages, so find space for 3,
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 1451e022129a..25bfc3305158 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -852,7 +852,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		 * mm_cpumask. The TLB shootdown code can figure out from
 		 * cpu_tlbstate_shared.is_lazy whether or not to send an IPI.
 		 */
-		if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
+		if (IS_ENABLED(CONFIG_DEBUG_VM) &&
+		    WARN_ON_ONCE(prev != &init_mm && !is_notrack_mm(prev) &&
 				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
 
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 8e1796dd6c68..e7e8f77f77f8 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -89,6 +89,7 @@ int __init efi_alloc_page_tables(void)
 	efi_mm.pgd = efi_pgd;
 	mm_init_cpumask(&efi_mm);
 	init_new_context(NULL, &efi_mm);
+	set_notrack_mm(&efi_mm);
 
 	return 0;
 

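As a rough illustration of the condition the patch above changes, the userspace model below evaluates the same three-part test on toy data. It is only a sketch under stated assumptions: `struct toy_mm`, `toy_init_mm` and the `toy_*` helpers are hypothetical, the cpumask is reduced to a single 64-bit word, and only the `MM_CONTEXT_NOTRACK` bit number is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MM_CONTEXT_NOTRACK	4	/* bit number as proposed in the patch */

/* Hypothetical, heavily reduced model of an mm: context flags plus a
 * one-word cpumask. */
struct toy_mm {
	unsigned long flags;
	uint64_t cpumask;
};

/* Stand-in for init_mm, which the check has always exempted. */
static struct toy_mm toy_init_mm;

static bool toy_is_notrack_mm(const struct toy_mm *mm)
{
	return mm->flags & (1UL << MM_CONTEXT_NOTRACK);
}

static void toy_set_notrack_mm(struct toy_mm *mm)
{
	mm->flags |= 1UL << MM_CONTEXT_NOTRACK;
}

static bool toy_cpumask_test_cpu(int cpu, const struct toy_mm *mm)
{
	return mm->cpumask & (UINT64_C(1) << cpu);
}

/* Mirrors the shape of the patched condition: warn only if prev is neither
 * init_mm nor marked notrack, and this CPU is missing from next's mask. */
static bool toy_would_warn(const struct toy_mm *prev,
			   const struct toy_mm *next, int cpu)
{
	return prev != &toy_init_mm &&
	       !toy_is_notrack_mm(prev) &&
	       !toy_cpumask_test_cpu(cpu, next);
}
```

In this model, an mm whose CPU bit has been cleared trips the check until it is marked with `toy_set_notrack_mm()`, which corresponds to what the patch does once for `text_poke_mm` and `efi_mm` at initialization time.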

* RE: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
  2025-04-29 18:29     ` Peter Zijlstra
  2025-04-30  6:07       ` Hugh Dickins
@ 2025-04-30  8:47       ` Borah, Chaitanya Kumar
  2025-04-30  8:51         ` Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-30  8:47 UTC (permalink / raw)
  To: Peter Zijlstra, Jani Nikula
  Cc: luto@kernel.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar,
	Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org,
	Ingo Molnar, hughd@google.com



> -----Original Message-----
> From: Peter Zijlstra <peterz@infradead.org>
> Sent: Tuesday, April 29, 2025 11:59 PM
> To: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>;
> luto@kernel.org; intel-gfx@lists.freedesktop.org; intel-
> xe@lists.freedesktop.org; Kurmi, Suresh Kumar
> <suresh.kumar.kurmi@intel.com>; Saarinen, Jani <jani.saarinen@intel.com>;
> De Marchi, Lucas <lucas.demarchi@intel.com>; linux-kernel@vger.kernel.org;
> Ingo Molnar <mingo@kernel.org>
> Subject: Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the
> use_/unuse_temporary_mm() machinery (linux-next)
> 
> On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar"
> <chaitanya.kumar.borah@intel.com> wrote:
> > > +Andy, Ingo
> > >
> > > Friendly reminder.
> > > Issue is still seen on latest linux-next runs.
> > >
> > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4
> > > /boot0.txt
> > >
> > > Regards
> > >
> > > Chaitanya
> >
> > Andy, Ingo -
> >
> > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > reported by Chaitanya
> >
> > Please look into it.
> 
> Does your kernel include the below?

This change has not yet landed in linux-next. However, applying it locally on top of next-20250429 seems to help us.

Note that we do not have CONFIG_DEBUG_VM=y set, which is the configuration Hugh mentioned.

Any idea when this will land in linux-next?

Regards

Chaitanya

> 
> ---
> commit aef1d0209ddf127a8069aca5fa3a062be4136b76
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Fri Apr 18 11:50:34 2025 +0200
> 
>     x86/mm: Fix {,un}use_temporary_mm() IRQ state
> 
>     As the function switch_mm_irqs_off() implies, it ought to be called with
>     IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when
> IRQs
>     are on") caused this to not be the case for EFI.
> 
>     Ensure IRQs are off where it matters.
> 
>     Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
>     Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
>     Cc: H. Peter Anvin <hpa@zytor.com>
>     Cc: Andrew Morton <akpm@linux-foundation.org>
>     Cc: Andy Lutomirski <luto@kernel.org>
>     Cc: Linus Torvalds <torvalds@linux-foundation.org>
>     Cc: Rik van Riel <riel@surriel.com>
>     Link: https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kicks-ass.net
> 
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 79c124f6f3f2..39761c7765bd 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
>  	struct mm_struct *prev_mm;
> 
>  	lockdep_assert_preemption_disabled();
> +	guard(irqsave)();
> 
>  	/*
>  	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
>  void unuse_temporary_mm(struct mm_struct *prev_mm)
>  {
>  	lockdep_assert_preemption_disabled();
> +	guard(irqsave)();
> 
>  	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
>  	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
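
Editorial sketch of what the guard(irqsave)() lines in the quoted patch achieve: interrupts are turned off for the rest of the enclosing scope and the previous state is restored on every exit path. The code below is a userspace model only, built on the GCC/Clang cleanup attribute. The names irqs_enabled, irqsave_guard, GUARD_IRQSAVE and switch_with_guard are hypothetical stand-ins and are not the kernel's implementation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the CPU interrupt flag; true means IRQs are on. */
static bool irqs_enabled = true;

/* State saved by the guard for the lifetime of the enclosing scope. */
struct irqsave_guard {
	bool was_enabled;
};

static struct irqsave_guard irqsave_guard_enter(void)
{
	struct irqsave_guard g = { .was_enabled = irqs_enabled };

	irqs_enabled = false;	/* models local_irq_save() */
	return g;
}

static void irqsave_guard_exit(struct irqsave_guard *g)
{
	irqs_enabled = g->was_enabled;	/* models local_irq_restore() */
}

/* Simplified model of guard(irqsave)(): cleanup runs when the scope ends. */
#define GUARD_IRQSAVE() \
	struct irqsave_guard __attribute__((cleanup(irqsave_guard_exit), unused)) \
		irq_guard_ = irqsave_guard_enter()

/* Returns whether IRQs were off while the guarded section ran. */
static bool switch_with_guard(void)
{
	GUARD_IRQSAVE();

	/* The return value is computed first, then the guard restores state. */
	return !irqs_enabled;
}
```

In this model the caller's interrupt state is the same before and after switch_with_guard(), whether it started enabled or disabled, which is the property the patch relies on for callers such as EFI that may enter with IRQs on.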

* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
  2025-04-30  8:47       ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
@ 2025-04-30  8:51         ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2025-04-30  8:51 UTC (permalink / raw)
  To: Borah, Chaitanya Kumar
  Cc: Jani Nikula, luto@kernel.org, intel-gfx@lists.freedesktop.org,
	intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar,
	Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org,
	Ingo Molnar, hughd@google.com

On Wed, Apr 30, 2025 at 08:47:43AM +0000, Borah, Chaitanya Kumar wrote:
> 
> 
> > -----Original Message-----
> > From: Peter Zijlstra <peterz@infradead.org>
> > Sent: Tuesday, April 29, 2025 11:59 PM
> > To: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>;
> > luto@kernel.org; intel-gfx@lists.freedesktop.org;
> > intel-xe@lists.freedesktop.org; Kurmi, Suresh Kumar
> > <suresh.kumar.kurmi@intel.com>; Saarinen, Jani <jani.saarinen@intel.com>;
> > De Marchi, Lucas <lucas.demarchi@intel.com>; linux-kernel@vger.kernel.org;
> > Ingo Molnar <mingo@kernel.org>
> > Subject: Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery (linux-next)
> > 
> > On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> > > > +Andy, Ingo
> > > >
> > > > Friendly reminder.
> > > > Issue is still seen on latest linux-next runs.
> > > >
> > > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> > > >
> > > > Regards
> > > >
> > > > Chaitanya
> > >
> > > Andy, Ingo -
> > >
> > > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > > reported by Chaitanya
> > >
> > > Please look into it.
> > 
> > Does your kernel include the below?
> 
> This change has not yet landed in linux-next. However, applying it locally on top of next-20250429 seems to fix the issue for us.
> 
> It is important to note that we don't set the CONFIG_DEBUG_VM=y option mentioned by Hugh.
> 
> Any idea when this will land in linux-next?

This is the top commit in tip/x86/alternatives and should already be in
-next. Ingo, any idea what is going wrong?

* [tip: x86/alternatives] x86/mm: Fix false positive warning in switch_mm_irqs_off()
  2025-04-30  8:11         ` Peter Zijlstra
@ 2025-05-06  9:42           ` tip-bot2 for Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2025-05-06  9:42 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Chaitanya Kumar Borah, Jani Nikula, Peter Zijlstra, Ingo Molnar,
	Andrew Cooper, Andy Lutomirski, Brian Gerst, H. Peter Anvin,
	Juergen Gross, Linus Torvalds, Rik van Riel, x86, linux-kernel

The following commit has been merged into the x86/alternatives branch of tip:

Commit-ID:     7f9958230d8a79d474829bee25ec9426397335ce
Gitweb:        https://git.kernel.org/tip/7f9958230d8a79d474829bee25ec9426397335ce
Author:        Peter Zijlstra <peterz@infradead.org>
AuthorDate:    Wed, 30 Apr 2025 10:11:54 +02:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Tue, 06 May 2025 11:28:57 +02:00

x86/mm: Fix false positive warning in switch_mm_irqs_off()

Multiple testers reported the following new warning:

	WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795

Which corresponds to:

	if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
	    !cpumask_test_cpu(cpu, mm_cpumask(next))))
		cpumask_set_cpu(cpu, mm_cpumask(next));

So the problem is that unuse_temporary_mm() explicitly clears
that bit; and it has to, because otherwise the flush_tlb_mm_range() in
__text_poke() will try sending IPIs, which are not at all needed.

See also:

   https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/

Notably, the whole {,un}use_temporary_mm() thing requires preemption to
be disabled across it with the express purpose of keeping all TLB
nonsense CPU local, such that invalidations can also stay local etc.

However, as a side-effect, we violate this above WARN(), which sorta
makes sense for the normal case, but very much doesn't make sense here.

Change unuse_temporary_mm() to mark the mm_struct such that a further
exception (beyond init_mm) can be grafted, to keep the warning for all
the other cases.

Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reported-by: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@surriel.com>
Link: https://lore.kernel.org/r/20250430081154.GH4439@noisy.programming.kicks-ass.net
---
 arch/x86/include/asm/mmu.h         |  4 ++--
 arch/x86/include/asm/mmu_context.h | 10 ++++++++++
 arch/x86/mm/init.c                 |  3 +++
 arch/x86/mm/tlb.c                  |  3 ++-
 arch/x86/platform/efi/efi_64.c     |  1 +
 5 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 8b8055a..0fe9c56 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -16,6 +16,8 @@
 #define MM_CONTEXT_LOCK_LAM		2
 /* Allow LAM and SVA coexisting */
 #define MM_CONTEXT_FORCE_TAGGED_SVA	3
+/* Tracks mm_cpumask */
+#define MM_CONTEXT_NOTRACK		4
 
 /*
  * x86 has arch-specific MMU state beyond what lives in mm_struct.
@@ -44,9 +46,7 @@ typedef struct {
 	struct ldt_struct	*ldt;
 #endif
 
-#ifdef CONFIG_X86_64
 	unsigned long flags;
-#endif
 
 #ifdef CONFIG_ADDRESS_MASKING
 	/* Active LAM mode:  X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index c511f85..73bf3b1 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -247,6 +247,16 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
 }
 #endif
 
+static inline bool is_notrack_mm(struct mm_struct *mm)
+{
+	return test_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
+static inline void set_notrack_mm(struct mm_struct *mm)
+{
+	set_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
 /*
  * We only want to enforce protection keys on the current process
  * because we effectively have no access to PKRU for other
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index f8c74d1..aa56d9a 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -28,6 +28,7 @@
 #include <asm/text-patching.h>
 #include <asm/memtype.h>
 #include <asm/paravirt.h>
+#include <asm/mmu_context.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
@@ -830,6 +831,8 @@ void __init poking_init(void)
 	/* Xen PV guests need the PGD to be pinned. */
 	paravirt_enter_mmap(text_poke_mm);
 
+	set_notrack_mm(text_poke_mm);
+
 	/*
 	 * Randomize the poking address, but make sure that the following page
 	 * will be mapped at the same PMD. We need 2 pages, so find space for 3,
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 39761c7..f5b990e 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -847,7 +847,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		 * mm_cpumask. The TLB shootdown code can figure out from
 		 * cpu_tlbstate_shared.is_lazy whether or not to send an IPI.
 		 */
-		if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
+		if (IS_ENABLED(CONFIG_DEBUG_VM) &&
+		    WARN_ON_ONCE(prev != &init_mm && !is_notrack_mm(prev) &&
 				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
 
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index a5d3496..ce4c08a 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -89,6 +89,7 @@ int __init efi_alloc_page_tables(void)
 	efi_mm.pgd = efi_pgd;
 	mm_init_cpumask(&efi_mm);
 	init_new_context(NULL, &efi_mm);
+	set_notrack_mm(&efi_mm);
 
 	return 0;
 

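Editorial sketch of the condition that the tlb.c hunk above changes. It models only the boolean predicate inside WARN_ON_ONCE() after the patch, not the surrounding switch_mm_irqs_off() logic. The types and names mm_model, init_mm_model, is_notrack and would_warn are hypothetical stand-ins; only the MM_CONTEXT_NOTRACK bit number is taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MM_CONTEXT_NOTRACK 4	/* bit number, as in the patch */

/* Hypothetical, heavily reduced stand-in for struct mm_struct. */
struct mm_model {
	unsigned long flags;	/* models mm->context.flags */
	uint64_t cpumask;	/* models mm_cpumask(mm), one bit per CPU */
};

/* Stand-in for init_mm, which the check has always exempted. */
static struct mm_model init_mm_model;

static bool is_notrack(const struct mm_model *mm)
{
	return mm->flags & (1UL << MM_CONTEXT_NOTRACK);
}

/* True when the patched condition would trigger the warning. */
static bool would_warn(const struct mm_model *prev,
		       const struct mm_model *next, unsigned int cpu)
{
	return prev != &init_mm_model &&
	       !is_notrack(prev) &&
	       !(next->cpumask & (1ULL << cpu));
}
```

Under this model an mm marked with the NOTRACK bit, as efi_mm and text_poke_mm are after the patch, never trips the check even when its CPU bit is clear, while an unmarked mm with a clear CPU bit still does.
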
Thread overview: 9+ messages
2025-04-16 18:09 Regression on linux-next (next-20250414) Borah, Chaitanya Kumar
2025-04-24 13:27 ` Borah, Chaitanya Kumar
2025-04-29  9:01   ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Jani Nikula
2025-04-29 18:29     ` Peter Zijlstra
2025-04-30  6:07       ` Hugh Dickins
2025-04-30  8:11         ` Peter Zijlstra
2025-05-06  9:42           ` [tip: x86/alternatives] x86/mm: Fix false positive warning in switch_mm_irqs_off() tip-bot2 for Peter Zijlstra
2025-04-30  8:47       ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
2025-04-30  8:51         ` Peter Zijlstra
