* Regression on linux-next (next-20250414)
From: Borah, Chaitanya Kumar @ 2025-04-16 18:09 UTC
To: luto@kernel.org
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org

Hello Andy,

Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.

This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.

Since the version next-20250414 [2], we are seeing the following regression:

`````````````````````````````````````````````````````````````````````````````````
<4>[ 0.203154] WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795 switch_mm_irqs_off+0x389/0x410
<5>[ 0.203173] Modules linked in:
<5>[ 0.203184] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.15.0-rc2-next-20250414-next-20250414-gb425262c07a6+ #1 PREEMPT(voluntary)
<5>[ 0.203207] Hardware name: Intel Corporation CoffeeLake Client Platform/CoffeeLake S UDIMM RVP, BIOS CNLSFWR1.R00.X220.B00.2103302221 03/30/2021
<5>[ 0.203229] RIP: 0010:switch_mm_irqs_off+0x389/0x410
<5>[ 0.203241] Code: e9 4d fd ff ff be 00 01 00 00 31 ff e8 60 ba f9 ff e9 29 fe ff ff 48 c7 c7 60 25 a1 82 e8 bf 73 a2 00 84 c0 0f 85 d4 fc ff ff <0f> 0b e9 cd fc ff ff bf 0b 01 00 00 be 01 00 00 00 31 d2 e8 1f e9
<5>[ 0.203271] RSP: 0000:ffffffff83403d90 EFLAGS: 00010246
<5>[ 0.203283] RAX: 0000000000000000 RBX: ffffffff8389f080 RCX: 0000000100a8c000
<5>[ 0.203296] RDX: ffffffff83414200 RSI: 0000000000000000 RDI: 0000000000000000
<5>[ 0.203309] RBP: ffffffff83403dc8 R08: 000000008d3ea018 R09: 0000000000000000
<5>[ 0.203322] R10: 0000000000000000 R11: 0000000003f55067 R12: 0000000000000000
<5>[ 0.203335] R13: ffffffff836d0b40 R14: ffffffff83414200 R15: 0000000000000000
<5>[ 0.203348] FS: 0000000000000000(0000) GS:ffff8884d94f6000(0000) knlGS:0000000000000000
<5>[ 0.203363] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<5>[ 0.203374] CR2: ffff88846dfff000 CR3: 000000000344a001 CR4: 00000000003706f0
<5>[ 0.203387] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
<5>[ 0.203400] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
<5>[ 0.203412] Call Trace:
<5>[ 0.203418] <TASK>
<5>[ 0.203428] use_temporary_mm+0x5b/0x130
<5>[ 0.203439] efi_set_virtual_address_map+0x4c/0x250
<5>[ 0.203452] ? efi_sync_low_kernel_mappings+0x10a/0x220
<5>[ 0.203467] efi_enter_virtual_mode+0x205/0x5b0
<5>[ 0.203482] start_kernel+0xa38/0xc60
<5>[ 0.203492] ? sme_unmap_bootdata+0x14/0x80
<5>[ 0.203504] x86_64_start_reservations+0x18/0x30
<5>[ 0.203516] x86_64_start_kernel+0xbf/0x110
<5>[ 0.203526] ? soft_restart_cpu+0x14/0x14
<5>[ 0.203536] common_startup_64+0x13e/0x141
<5>[ 0.203555] </TASK>
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [3].

After bisecting the tree, the following patch [4] seems to be the first "bad" commit:

`````````````````````````````````````````````````````````````````````````````````````````````````````````
commit e7021e2fe0b4335523d3f6e2221000bdfc633b62
Author: Andy Lutomirski <luto@kernel.org>
Date: Wed Apr 2 11:45:39 2025 +0200

    x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery
`````````````````````````````````````````````````````````````````````````````````````````````````````````

We also verified that if we revert the patch the issue is not seen.

Could you please check why the patch causes this regression and provide a fix if necessary?

Thank you.

Regards

Chaitanya

[1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250414
[3] https://intel-gfx-ci.01.org/tree/linux-next/next-20250414/bat-dg2-8/boot0.txt
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20250414&id=e7021e2fe0b4335523d3f6e2221000bdfc633b62

* RE: Regression on linux-next (next-20250414)
From: Borah, Chaitanya Kumar @ 2025-04-24 13:27 UTC
To: luto@kernel.org
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org, peterz@infradead.org, Ingo Molnar

+Andy, Ingo

Friendly reminder.
Issue is still seen on latest linux-next runs.

https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt

Regards

Chaitanya

* [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
From: Jani Nikula @ 2025-04-29 9:01 UTC
To: Borah, Chaitanya Kumar, luto@kernel.org
Cc: intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org, peterz@infradead.org, Ingo Molnar

On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> +Andy, Ingo
>
> Friendly reminder.
> Issue is still seen on latest linux-next runs.
>
> https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
>
> Regards
>
> Chaitanya

Andy, Ingo -

Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
use_/unuse_temporary_mm() machinery") on linux-next regresses as
reported by Chaitanya

Please look into it.

Thanks,
Jani.

--
Jani Nikula, Intel

* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
From: Peter Zijlstra @ 2025-04-29 18:29 UTC
To: Jani Nikula
Cc: Borah, Chaitanya Kumar, luto@kernel.org, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org, Ingo Molnar

On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> > +Andy, Ingo
> >
> > Friendly reminder.
> > Issue is still seen on latest linux-next runs.
> >
> > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> >
> > Regards
> >
> > Chaitanya
>
> Andy, Ingo -
>
> Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> use_/unuse_temporary_mm() machinery") on linux-next regresses as
> reported by Chaitanya
>
> Please look into it.

Does your kernel include the below?

---
commit aef1d0209ddf127a8069aca5fa3a062be4136b76
Author: Peter Zijlstra <peterz@infradead.org>
Date: Fri Apr 18 11:50:34 2025 +0200

    x86/mm: Fix {,un}use_temporary_mm() IRQ state

    As the function switch_mm_irqs_off() implies, it ought to be called with
    IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs
    are on") caused this to not be the case for EFI.

    Ensure IRQs are off where it matters.

    Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
    Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
    Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
    Cc: H. Peter Anvin <hpa@zytor.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: Andy Lutomirski <luto@kernel.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Rik van Riel <riel@surriel.com>
    Link: https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kicks-ass.net

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 79c124f6f3f2..39761c7765bd 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
 	struct mm_struct *prev_mm;
 
 	lockdep_assert_preemption_disabled();
+	guard(irqsave)();
 
 	/*
 	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
@@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
 void unuse_temporary_mm(struct mm_struct *prev_mm)
 {
 	lockdep_assert_preemption_disabled();
+	guard(irqsave)();
 
 	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
 	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
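
For readers unfamiliar with the guard(irqsave)() idiom added by the patch above: it saves and disables interrupts where it is declared and restores the saved state when the enclosing scope ends. The following is only a rough userspace sketch of that behaviour, assuming a GCC or Clang compiler (it relies on the cleanup attribute). Every sim_* name is hypothetical and the "IRQ state" is a plain boolean, so this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated interrupt-enable flag; a stand-in for the real CPU state. */
static bool sim_irqs_enabled = true;

/* Hypothetical guard object: remembers the state found on entry. */
struct sim_irq_guard {
	bool saved;
};

/* Save the current state and "disable interrupts". */
static inline struct sim_irq_guard sim_irq_save(void)
{
	struct sim_irq_guard g = { .saved = sim_irqs_enabled };

	sim_irqs_enabled = false;
	return g;
}

/* Runs automatically when the guard variable goes out of scope. */
static inline void sim_irq_restore(struct sim_irq_guard *g)
{
	sim_irqs_enabled = g->saved;
}

/* Rough analogue of guard(irqsave)(), built on the cleanup attribute. */
#define SIM_GUARD_IRQSAVE() \
	struct sim_irq_guard sim_guard \
		__attribute__((cleanup(sim_irq_restore), unused)) = sim_irq_save()

/* Stand-in for switch_mm_irqs_off(): reports whether IRQs were off. */
static bool sim_switch_mm_irqs_off(void)
{
	return !sim_irqs_enabled;
}

/* Stand-in for use_temporary_mm() with the guard in place. */
static bool sim_use_temporary_mm(void)
{
	SIM_GUARD_IRQSAVE();
	bool called_with_irqs_off = sim_switch_mm_irqs_off();

	assert(called_with_irqs_off);
	return called_with_irqs_off;
	/* sim_irq_restore() runs here and re-enables "interrupts". */
}
```

In this model a caller that enters sim_use_temporary_mm() with interrupts enabled still reaches the inner function with them disabled, and finds them enabled again on return, which is the property the fix is after.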

* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
From: Hugh Dickins @ 2025-04-30 6:07 UTC
To: Peter Zijlstra
Cc: Jani Nikula, Borah, Chaitanya Kumar, luto@kernel.org, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org, Ingo Molnar

On Tue, 29 Apr 2025, Peter Zijlstra wrote:
> On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar" <chaitanya.kumar.borah@intel.com> wrote:
> > > +Andy, Ingo
> > >
> > > Friendly reminder.
> > > Issue is still seen on latest linux-next runs.
> > >
> > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> > >
> > > Regards
> > >
> > > Chaitanya
> >
> > Andy, Ingo -
> >
> > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > reported by Chaitanya
> >
> > Please look into it.
>
> Does your kernel include the below?
>
> ---
> commit aef1d0209ddf127a8069aca5fa3a062be4136b76
> Author: Peter Zijlstra <peterz@infradead.org>
> Date: Fri Apr 18 11:50:34 2025 +0200
>
>     x86/mm: Fix {,un}use_temporary_mm() IRQ state
>
>     As the function switch_mm_irqs_off() implies, it ought to be called with
>     IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs
>     are on") caused this to not be the case for EFI.
>
>     Ensure IRQs are off where it matters.
>
>     Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
>     Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
>     Cc: H. Peter Anvin <hpa@zytor.com>
>     Cc: Andrew Morton <akpm@linux-foundation.org>
>     Cc: Andy Lutomirski <luto@kernel.org>
>     Cc: Linus Torvalds <torvalds@linux-foundation.org>
>     Cc: Rik van Riel <riel@surriel.com>
>     Link: https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kicks-ass.net
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 79c124f6f3f2..39761c7765bd 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
>  	struct mm_struct *prev_mm;
>
>  	lockdep_assert_preemption_disabled();
> +	guard(irqsave)();
>
>  	/*
>  	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
>  void unuse_temporary_mm(struct mm_struct *prev_mm)
>  {
>  	lockdep_assert_preemption_disabled();
> +	guard(irqsave)();
>
>  	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
>  	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));

Hi Peter, I haven't checked on most recent -nexts, but earlier found that
patch to be not quite enough, at least if you have CONFIG_DEBUG_VM=y:
because switch_mm_irqs_off() contains a

	VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu,
			mm_cpumask(prev)));

which doesn't like what (un)use_temporary_mm() is now doing. I couldn't
be sure who was right or wrong, and just proceeded by commenting out
the warning - ONCE shouldn't be much trouble, except xfstests uses
some nefarious mechanism to resurrect ONCE repeatedly.

Hugh

* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
From: Peter Zijlstra @ 2025-04-30 8:11 UTC
To: Hugh Dickins
Cc: Jani Nikula, Borah, Chaitanya Kumar, luto@kernel.org, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org, Ingo Molnar, riel

On Tue, Apr 29, 2025 at 11:07:45PM -0700, Hugh Dickins wrote:
> On Tue, 29 Apr 2025, Peter Zijlstra wrote:

> > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> > index 79c124f6f3f2..39761c7765bd 100644
> > --- a/arch/x86/mm/tlb.c
> > +++ b/arch/x86/mm/tlb.c
> > @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
> >  	struct mm_struct *prev_mm;
> >
> >  	lockdep_assert_preemption_disabled();
> > +	guard(irqsave)();
> >
> >  	/*
> >  	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> > @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
> >  void unuse_temporary_mm(struct mm_struct *prev_mm)
> >  {
> >  	lockdep_assert_preemption_disabled();
> > +	guard(irqsave)();
> >
> >  	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
> >  	cpumask_clear_cpu(smp_processor_id(), mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
>
> Hi Peter, I haven't checked on most recent -nexts, but earlier found that
> patch to be not quite enough, at least if you have CONFIG_DEBUG_VM=y:
> because switch_mm_irqs_off() contains a
>
> 	VM_WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu,
> 			mm_cpumask(prev)));
>
> which doesn't like what (un)use_temporary_mm() is now doing. I couldn't
> be sure who was right or wrong, and just proceeded by commenting out
> the warning - ONCE shouldn't be much trouble, except xfstests uses
> some nefarious mechanism to resurrect ONCE repeatedly.

Oh that one. Yeah, I thought Ingo had already deleted that WARN, but it
seems it's still there.

So the problem is that unuse_temporary_mm() explicitly clears that bit;
and it has to, because otherwise the flush_tlb_mm_range() in
__text_poke() will try sending IPIs, which are not at all needed.

(See also:

   https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/
)

Notably, the whole {,un}use_temporary_mm() thing requires preemption to
be disabled across it with the express purpose of keeping all TLB
nonsense CPU local, such that invalidations can also stay local etc.

However, as a side-effect, we violate this above WARN(), which sorta
makes sense for the normal case, but very much doesn't make sense here.

There are two ways out: one is to have unuse_temporary_mm() mark the
mm_struct such that a further exception (beyond init_mm) can be grafted,
the other is to simply delete the whole check.

Anyway, something like the below, or just delete the check I suppose.
Opinions?

---
diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 8b8055a8eb9e..0fe9c569d171 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -16,6 +16,8 @@
 #define MM_CONTEXT_LOCK_LAM		2
 /* Allow LAM and SVA coexisting */
 #define MM_CONTEXT_FORCE_TAGGED_SVA	3
+/* Tracks mm_cpumask */
+#define MM_CONTEXT_NOTRACK		4
 
 /*
  * x86 has arch-specific MMU state beyond what lives in mm_struct.
@@ -44,9 +46,7 @@ typedef struct {
 	struct ldt_struct	*ldt;
 #endif
 
-#ifdef CONFIG_X86_64
 	unsigned long flags;
-#endif
 
 #ifdef CONFIG_ADDRESS_MASKING
 	/* Active LAM mode: X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index c511f8584ae4..73bf3b1b44e8 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -247,6 +247,16 @@ static inline bool is_64bit_mm(struct mm_struct *mm)
 }
 #endif
 
+static inline bool is_notrack_mm(struct mm_struct *mm)
+{
+	return test_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
+static inline void set_notrack_mm(struct mm_struct *mm)
+{
+	set_bit(MM_CONTEXT_NOTRACK, &mm->context.flags);
+}
+
 /*
  * We only want to enforce protection keys on the current process
  * because we effectively have no access to PKRU for other
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index f8c74d19bebb..aa56d9ac0b8f 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -28,6 +28,7 @@
 #include <asm/text-patching.h>
 #include <asm/memtype.h>
 #include <asm/paravirt.h>
+#include <asm/mmu_context.h>
 
 /*
  * We need to define the tracepoints somewhere, and tlb.c
@@ -830,6 +831,8 @@ void __init poking_init(void)
 	/* Xen PV guests need the PGD to be pinned. */
 	paravirt_enter_mmap(text_poke_mm);
 
+	set_notrack_mm(text_poke_mm);
+
 	/*
 	 * Randomize the poking address, but make sure that the following page
 	 * will be mapped at the same PMD. We need 2 pages, so find space for 3,
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 1451e022129a..25bfc3305158 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -852,7 +852,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		 * mm_cpumask. The TLB shootdown code can figure out from
 		 * cpu_tlbstate_shared.is_lazy whether or not to send an IPI.
 		 */
-		if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
+		if (IS_ENABLED(CONFIG_DEBUG_VM) &&
+		    WARN_ON_ONCE(prev != &init_mm && !is_notrack_mm(prev) &&
 				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
 
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index 8e1796dd6c68..e7e8f77f77f8 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -89,6 +89,7 @@ int __init efi_alloc_page_tables(void)
 	efi_mm.pgd = efi_pgd;
 	mm_init_cpumask(&efi_mm);
 	init_new_context(NULL, &efi_mm);
+	set_notrack_mm(&efi_mm);
 
 	return 0;
 
* [tip: x86/alternatives] x86/mm: Fix false positive warning in switch_mm_irqs_off() 2025-04-30 8:11 ` Peter Zijlstra @ 2025-05-06 9:42 ` tip-bot2 for Peter Zijlstra 0 siblings, 0 replies; 9+ messages in thread From: tip-bot2 for Peter Zijlstra @ 2025-05-06 9:42 UTC (permalink / raw) To: linux-tip-commits Cc: Chaitanya Kumar Borah, Jani Nikula, Peter Zijlstra, Ingo Molnar, Andrew Cooper, Andy Lutomirski, Brian Gerst, H. Peter Anvin, Juergen Gross, Linus Torvalds, Rik van Riel, x86, linux-kernel The following commit has been merged into the x86/alternatives branch of tip: Commit-ID: 7f9958230d8a79d474829bee25ec9426397335ce Gitweb: https://git.kernel.org/tip/7f9958230d8a79d474829bee25ec9426397335ce Author: Peter Zijlstra <peterz@infradead.org> AuthorDate: Wed, 30 Apr 2025 10:11:54 +02:00 Committer: Ingo Molnar <mingo@kernel.org> CommitterDate: Tue, 06 May 2025 11:28:57 +02:00 x86/mm: Fix false positive warning in switch_mm_irqs_off() Multiple testers reported the following new warning: WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795 Which corresponds to: if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next)))) cpumask_set_cpu(cpu, mm_cpumask(next)); So the problem is that unuse_temporary_mm() explicitly clears that bit; and it has to, because otherwise the flush_tlb_mm_range() in __text_poke() will try sending IPIs, which are not at all needed. See also: https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/ Notably, the whole {,un}use_temporary_mm() thing requires preemption to be disabled across it with the express purpose of keeping all TLB nonsense CPU local, such that invalidations can also stay local etc. However, as a side-effect, we violate this above WARN(), which sorta makes sense for the normal case, but very much doesn't make sense here. 
Change unuse_temporary_mm() to mark the mm_struct such that a further exception (beyond init_mm) can be grafted, to keep the warning for all the other cases. Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reported-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@surriel.com> Link: https://lore.kernel.org/r/20250430081154.GH4439@noisy.programming.kicks-ass.net --- arch/x86/include/asm/mmu.h | 4 ++-- arch/x86/include/asm/mmu_context.h | 10 ++++++++++ arch/x86/mm/init.c | 3 +++ arch/x86/mm/tlb.c | 3 ++- arch/x86/platform/efi/efi_64.c | 1 + 5 files changed, 18 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 8b8055a..0fe9c56 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -16,6 +16,8 @@ #define MM_CONTEXT_LOCK_LAM 2 /* Allow LAM and SVA coexisting */ #define MM_CONTEXT_FORCE_TAGGED_SVA 3 +/* Tracks mm_cpumask */ +#define MM_CONTEXT_NOTRACK 4 /* * x86 has arch-specific MMU state beyond what lives in mm_struct. 
@@ -44,9 +46,7 @@ typedef struct { struct ldt_struct *ldt; #endif -#ifdef CONFIG_X86_64 unsigned long flags; -#endif #ifdef CONFIG_ADDRESS_MASKING /* Active LAM mode: X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */ diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index c511f85..73bf3b1 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -247,6 +247,16 @@ static inline bool is_64bit_mm(struct mm_struct *mm) } #endif +static inline bool is_notrack_mm(struct mm_struct *mm) +{ + return test_bit(MM_CONTEXT_NOTRACK, &mm->context.flags); +} + +static inline void set_notrack_mm(struct mm_struct *mm) +{ + set_bit(MM_CONTEXT_NOTRACK, &mm->context.flags); +} + /* * We only want to enforce protection keys on the current process * because we effectively have no access to PKRU for other diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c index f8c74d1..aa56d9a 100644 --- a/arch/x86/mm/init.c +++ b/arch/x86/mm/init.c @@ -28,6 +28,7 @@ #include <asm/text-patching.h> #include <asm/memtype.h> #include <asm/paravirt.h> +#include <asm/mmu_context.h> /* * We need to define the tracepoints somewhere, and tlb.c @@ -830,6 +831,8 @@ void __init poking_init(void) /* Xen PV guests need the PGD to be pinned. */ paravirt_enter_mmap(text_poke_mm); + set_notrack_mm(text_poke_mm); + /* * Randomize the poking address, but make sure that the following page * will be mapped at the same PMD. We need 2 pages, so find space for 3, diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 39761c7..f5b990e 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -847,7 +847,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next, * mm_cpumask. The TLB shootdown code can figure out from * cpu_tlbstate_shared.is_lazy whether or not to send an IPI. 
 		 */
-		if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm &&
+		if (IS_ENABLED(CONFIG_DEBUG_VM) &&
+		    WARN_ON_ONCE(prev != &init_mm && !is_notrack_mm(prev) &&
 				 !cpumask_test_cpu(cpu, mm_cpumask(next))))
 			cpumask_set_cpu(cpu, mm_cpumask(next));

diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index a5d3496..ce4c08a 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -89,6 +89,7 @@ int __init efi_alloc_page_tables(void)
 	efi_mm.pgd = efi_pgd;
 	mm_init_cpumask(&efi_mm);
 	init_new_context(NULL, &efi_mm);
+	set_notrack_mm(&efi_mm);

 	return 0;
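For readers who want to see the effect of the patched condition in isolation, here is a rough userspace model. It is only a sketch under stated assumptions: struct mm_model, init_mm_model, is_notrack(), set_notrack() and would_warn() are hypothetical stand-ins invented for illustration, the cpumask is reduced to a single unsigned long, and the kernel's atomic test_bit()/set_bit() are replaced by plain bit operations. Only the bit number and the shape of the predicate are taken from the patch.

```c
#include <stdbool.h>

#define MM_CONTEXT_NOTRACK 4	/* bit number, as in the patch above */

/* Hypothetical stand-in for struct mm_struct; not the kernel type. */
struct mm_model {
	unsigned long flags;	/* models mm->context.flags */
	unsigned long cpumask;	/* models mm_cpumask(mm), one bit per CPU */
};

/* Models init_mm, which the warning has always exempted. */
static struct mm_model init_mm_model;

/* Non-atomic analogue of is_notrack_mm(). */
static bool is_notrack(const struct mm_model *mm)
{
	return (mm->flags & (1UL << MM_CONTEXT_NOTRACK)) != 0;
}

/* Non-atomic analogue of set_notrack_mm(). */
static void set_notrack(struct mm_model *mm)
{
	mm->flags |= 1UL << MM_CONTEXT_NOTRACK;
}

/*
 * Mirrors the patched predicate: warn only when prev is neither init_mm
 * nor a notrack mm, and this CPU is missing from next's cpumask.
 */
static bool would_warn(const struct mm_model *prev,
		       const struct mm_model *next, int cpu)
{
	return prev != &init_mm_model && !is_notrack(prev) &&
	       !(next->cpumask & (1UL << cpu));
}
```

In this model, switching away from an unmarked temporary mm to an mm whose cpumask lacks the current CPU makes would_warn() return true; after set_notrack() on that temporary mm it returns false, which is the behaviour the patch aims for with efi_mm and text_poke_mm.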
* RE: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
  2025-04-29 18:29     ` Peter Zijlstra
  2025-04-30  6:07       ` Hugh Dickins
@ 2025-04-30  8:47       ` Borah, Chaitanya Kumar
  2025-04-30  8:51         ` Peter Zijlstra
  1 sibling, 1 reply; 9+ messages in thread
From: Borah, Chaitanya Kumar @ 2025-04-30 8:47 UTC (permalink / raw)
  To: Peter Zijlstra, Jani Nikula
  Cc: luto@kernel.org, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org, Ingo Molnar, hughd@google.com

> -----Original Message-----
> From: Peter Zijlstra <peterz@infradead.org>
> Sent: Tuesday, April 29, 2025 11:59 PM
> To: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>;
> luto@kernel.org; intel-gfx@lists.freedesktop.org;
> intel-xe@lists.freedesktop.org; Kurmi, Suresh Kumar
> <suresh.kumar.kurmi@intel.com>; Saarinen, Jani <jani.saarinen@intel.com>;
> De Marchi, Lucas <lucas.demarchi@intel.com>; linux-kernel@vger.kernel.org;
> Ingo Molnar <mingo@kernel.org>
> Subject: Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the
> use_/unuse_temporary_mm() machinery (linux-next)
>
> On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar"
> <chaitanya.kumar.borah@intel.com> wrote:
> > > +Andy, Ingo
> > >
> > > Friendly reminder.
> > > Issue is still seen on latest linux-next runs.
> > >
> > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> > >
> > > Regards
> > >
> > > Chaitanya
> >
> > Andy, Ingo -
> >
> > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > reported by Chaitanya
> >
> > Please look into it.
>
> Does your kernel include the below?

This change has not yet landed in linux-next.
However, making a local change on top of next-20250429 seems to help us.

Important to note that we don't have CONFIG_DEBUG_VM=y, as mentioned by Hugh.

Any idea when this lands in linux-next?

Regards

Chaitanya

>
> ---
> commit aef1d0209ddf127a8069aca5fa3a062be4136b76
> Author: Peter Zijlstra <peterz@infradead.org>
> Date:   Fri Apr 18 11:50:34 2025 +0200
>
>     x86/mm: Fix {,un}use_temporary_mm() IRQ state
>
>     As the function switch_mm_irqs_off() implies, it ought to be called with
>     IRQs *off*. Commit 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs
>     are on") caused this to not be the case for EFI.
>
>     Ensure IRQs are off where it matters.
>
>     Fixes: 58f8ffa91766 ("x86/mm: Allow temporary MMs when IRQs are on")
>     Reported-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
>     Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
>     Signed-off-by: Ingo Molnar <mingo@kernel.org>
>     Cc: H. Peter Anvin <hpa@zytor.com>
>     Cc: Andrew Morton <akpm@linux-foundation.org>
>     Cc: Andy Lutomirski <luto@kernel.org>
>     Cc: Linus Torvalds <torvalds@linux-foundation.org>
>     Cc: Rik van Riel <riel@surriel.com>
>     Link: https://lore.kernel.org/r/20250418095034.GR38216@noisy.programming.kicks-ass.net
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 79c124f6f3f2..39761c7765bd 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -986,6 +986,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
>  	struct mm_struct *prev_mm;
>
>  	lockdep_assert_preemption_disabled();
> +	guard(irqsave)();
>
>  	/*
>  	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
> @@ -1018,6 +1019,7 @@ struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
>  void unuse_temporary_mm(struct mm_struct *prev_mm)
>  {
>  	lockdep_assert_preemption_disabled();
> +	guard(irqsave)();
>
>  	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
>  	cpumask_clear_cpu(smp_processor_id(),
>  			  mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
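The guard(irqsave)() lines in the quoted patch rely on scope-based cleanup: interrupts are disabled at the point of the guard and the saved state is restored automatically when the enclosing scope ends. A rough userspace sketch of that idea follows. It is an illustration only, assuming a GCC or Clang compiler for __attribute__((cleanup)); irqs_enabled, struct irq_guard, GUARD_IRQSAVE() and switch_mm_model() are hypothetical names, and a plain boolean stands in for the CPU interrupt flag.

```c
#include <stdbool.h>

/* Hypothetical model of the CPU interrupt-enable flag. */
static bool irqs_enabled = true;

/* Holds the state that was in effect when the guard was taken. */
struct irq_guard {
	bool saved;
};

/* Save the current state and "disable interrupts". */
static struct irq_guard irq_guard_enter(void)
{
	struct irq_guard g = { .saved = irqs_enabled };

	irqs_enabled = false;
	return g;
}

/* Runs automatically when the guard variable goes out of scope. */
static void irq_guard_exit(struct irq_guard *g)
{
	irqs_enabled = g->saved;
}

/* Declares a scoped guard; relies on the GCC/Clang cleanup attribute. */
#define GUARD_IRQSAVE() \
	struct irq_guard irq_guard_var \
		__attribute__((cleanup(irq_guard_exit))) = irq_guard_enter()

/* Stand-in for a function that must run with interrupts off. */
static bool switch_mm_model(void)
{
	GUARD_IRQSAVE();

	/* Evaluated before the cleanup handler restores the saved state. */
	return !irqs_enabled;
}
```

In this model, switch_mm_model() observes interrupts as disabled for its whole body, and the previous state is back in place once it returns, without an explicit restore on each exit path.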
* Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next)
  2025-04-30  8:47       ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
@ 2025-04-30  8:51         ` Peter Zijlstra
  0 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2025-04-30 8:51 UTC (permalink / raw)
  To: Borah, Chaitanya Kumar
  Cc: Jani Nikula, luto@kernel.org, intel-gfx@lists.freedesktop.org, intel-xe@lists.freedesktop.org, Kurmi, Suresh Kumar, Saarinen, Jani, De Marchi, Lucas, linux-kernel@vger.kernel.org, Ingo Molnar, hughd@google.com

On Wed, Apr 30, 2025 at 08:47:43AM +0000, Borah, Chaitanya Kumar wrote:
>
>
> > -----Original Message-----
> > From: Peter Zijlstra <peterz@infradead.org>
> > Sent: Tuesday, April 29, 2025 11:59 PM
> > To: Jani Nikula <jani.nikula@linux.intel.com>
> > Cc: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>;
> > luto@kernel.org; intel-gfx@lists.freedesktop.org;
> > intel-xe@lists.freedesktop.org; Kurmi, Suresh Kumar
> > <suresh.kumar.kurmi@intel.com>; Saarinen, Jani <jani.saarinen@intel.com>;
> > De Marchi, Lucas <lucas.demarchi@intel.com>; linux-kernel@vger.kernel.org;
> > Ingo Molnar <mingo@kernel.org>
> > Subject: Re: [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the
> > use_/unuse_temporary_mm() machinery (linux-next)
> >
> > On Tue, Apr 29, 2025 at 12:01:22PM +0300, Jani Nikula wrote:
> > > On Thu, 24 Apr 2025, "Borah, Chaitanya Kumar"
> > <chaitanya.kumar.borah@intel.com> wrote:
> > > > +Andy, Ingo
> > > >
> > > > Friendly reminder.
> > > > Issue is still seen on latest linux-next runs.
> > > >
> > > > https://intel-gfx-ci.01.org/tree/linux-next/next-20250424/bat-rpls-4/boot0.txt
> > > >
> > > > Regards
> > > >
> > > > Chaitanya
> > >
> > > Andy, Ingo -
> > >
> > > Commit e7021e2fe0b4 ("x86/efi: Make efi_enter/leave_mm() use the
> > > use_/unuse_temporary_mm() machinery") on linux-next regresses as
> > > reported by Chaitanya
> > >
> > > Please look into it.
> >
> > Does your kernel include the below?
>
> This change has not yet landed in linux-next. However, making a local change on top of next-20250429 seems to help us.
>
> Important to note that we don't have CONFIG_DEBUG_VM=y, as mentioned by Hugh.
>
> Any idea when this lands in linux-next?

This is the top commit in tip/x86/alternatives and should already be in
-next. Ingo, any idea what is going wrong?
end of thread, other threads:[~2025-05-06  9:42 UTC | newest]

Thread overview: 9+ messages
2025-04-16 18:09 Regression on linux-next (next-20250414) Borah, Chaitanya Kumar
2025-04-24 13:27 ` Borah, Chaitanya Kumar
2025-04-29  9:01   ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Jani Nikula
2025-04-29 18:29     ` Peter Zijlstra
2025-04-30  6:07       ` Hugh Dickins
2025-04-30  8:11         ` Peter Zijlstra
2025-05-06  9:42           ` [tip: x86/alternatives] x86/mm: Fix false positive warning in switch_mm_irqs_off() tip-bot2 for Peter Zijlstra
2025-04-30  8:47       ` [REGRESSION] x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machinery (linux-next) Borah, Chaitanya Kumar
2025-04-30  8:51         ` Peter Zijlstra