public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/7] x86/mm: Clean up and use temportary_mm more
@ 2024-11-19 16:25 Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 1/7] x86/mm: Add mm argument to unuse_temporary_mm() Peter Zijlstra
                   ` (6 more replies)
  0 siblings, 7 replies; 9+ messages in thread
From: Peter Zijlstra @ 2024-11-19 16:25 UTC (permalink / raw)
  To: x86, "To:riel"; +Cc: linux-kernel, peterz

Hi,

These are most of the patches I had in x86/lazy and are the preparatory patches
for x86 to eventually run with MMU_LAZY_TLB_REFCOUNT=n.

The rebase and submission were prompted by Rik's recent context switch
optimization, which touched unuse_temporary_mm() in a way incompatible with
these patches.

I'll see if I can find time this merge window to also finish that last patch
so that x86 can indeed switch to MMU_LAZY_TLB_REFCOUNT=n; that too should help
with context switch times.



* [PATCH 1/7] x86/mm: Add mm argument to unuse_temporary_mm()
  2024-11-19 16:25 [PATCH 0/7] x86/mm: Clean up and use temportary_mm more Peter Zijlstra
@ 2024-11-19 16:25 ` Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 2/7] x86/events, x86/insn-eval: Remove incorrect active_mm references Peter Zijlstra
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2024-11-19 16:25 UTC (permalink / raw)
  To: x86, "To:riel"; +Cc: linux-kernel, peterz

In commit 209954cbc7d0 ("x86/mm/tlb: Update mm_cpumask lazily")
unuse_temporary_mm() grew the assumption that it gets used on
poking_mm exclusively. While this is currently true, let's not hard-code
this assumption.
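
For readers following along, here is a minimal userspace sketch of the idea,
not kernel code. All names (struct mm_model, loaded_mm[], the model_*()
helpers) are hypothetical stand-ins, and setting the CPU's bit when the
temporary mm is loaded is a simplifying assumption about what the switch does.
The point is only that the mm whose mask gets cleared is passed in by the
caller instead of being a hard-coded global.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Hypothetical stand-in for struct mm_struct: just a CPU bitmask. */
struct mm_model {
	unsigned long cpumask;
};

/* Hypothetical per-CPU record of the currently loaded mm. */
static struct mm_model *loaded_mm[NR_CPUS];

/* Model of use_temporary_mm(): load temp, mark this CPU, return the old mm. */
static struct mm_model *model_use_temporary_mm(int cpu, struct mm_model *temp)
{
	struct mm_model *prev = loaded_mm[cpu];

	loaded_mm[cpu] = temp;
	temp->cpumask |= 1UL << cpu;
	return prev;
}

/*
 * Model of unuse_temporary_mm() with the extra argument: the caller names
 * the mm whose mask bit is cleared, so nothing assumes one global mm.
 */
static void model_unuse_temporary_mm(int cpu, struct mm_model *mm,
				     struct mm_model *prev)
{
	loaded_mm[cpu] = prev;
	mm->cpumask &= ~(1UL << cpu);
}

static bool model_cpu_in_mask(const struct mm_model *mm, int cpu)
{
	return (mm->cpumask & (1UL << cpu)) != 0;
}
```

With a hard-coded global, a caller using any other temporary mm would leave
its own bit set and clear the wrong one; with the argument, each caller cleans
up the mm it actually loaded.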

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/alternative.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1828,14 +1828,14 @@ static inline temp_mm_state_t use_tempor
 __ro_after_init struct mm_struct *poking_mm;
 __ro_after_init unsigned long poking_addr;
 
-static inline void unuse_temporary_mm(temp_mm_state_t prev_state)
+static inline void unuse_temporary_mm(struct mm_struct *mm, temp_mm_state_t prev_state)
 {
 	lockdep_assert_irqs_disabled();
 
 	switch_mm_irqs_off(NULL, prev_state.mm, current);
 
 	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
-	cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(poking_mm));
+	cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(mm));
 
 	/*
 	 * Restore the breakpoints if they were disabled before the temporary mm
@@ -1942,7 +1942,7 @@ static void *__text_poke(text_poke_f fun
 	 * instruction that already allows the core to see the updated version.
 	 * Xen-PV is assumed to serialize execution in a similar manner.
 	 */
-	unuse_temporary_mm(prev);
+	unuse_temporary_mm(poking_mm, prev);
 
 	/*
 	 * Flushing the TLB might involve IPIs, which would require enabled




* [PATCH 2/7] x86/events, x86/insn-eval: Remove incorrect active_mm references
  2024-11-19 16:25 [PATCH 0/7] x86/mm: Clean up and use temportary_mm more Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 1/7] x86/mm: Add mm argument to unuse_temporary_mm() Peter Zijlstra
@ 2024-11-19 16:25 ` Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 3/7] x86/mm: Make use/unuse_temporary_mm() non-static Peter Zijlstra
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2024-11-19 16:25 UTC (permalink / raw)
  To: x86, "To:riel"; +Cc: linux-kernel, peterz, Andy Lutomirski

From: Andy Lutomirski <luto@kernel.org>

When decoding an instruction or handling a perf event that references an
LDT segment, if we don't have a valid user context, trying to access the
LDT by any means other than SLDT is racy.  Certainly, using
current->active_mm is wrong, as active_mm can point to a real user mm when
CR3 and LDTR no longer reference that mm.

Clean up the code.  If nmi_uaccess_okay() says we don't have a valid
context, just fail.  Otherwise use current->mm.
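
The rule above can be illustrated with a small userspace model. This is only
a sketch under stated assumptions: struct task_model, struct mm_model,
struct ldt_model and the model_*() functions are hypothetical, and
model_uaccess_okay() approximates nmi_uaccess_okay() as "the task has its own
mm and that mm is the one currently loaded", which leaves out whatever other
checks the real helper performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct ldt_model {
	unsigned int nr_entries;
	unsigned long base[4];
};

struct mm_model {
	struct ldt_model *ldt;
};

struct task_model {
	struct mm_model *mm;        /* NULL for kernel threads */
	struct mm_model *active_mm; /* may be a lazily borrowed user mm */
};

/* Assumed approximation of nmi_uaccess_okay(): own mm, and it is loaded. */
static bool model_uaccess_okay(const struct task_model *t,
			       const struct mm_model *hw_loaded)
{
	return t->mm != NULL && t->mm == hw_loaded;
}

/* Fail early without a valid user context; otherwise consult t->mm only. */
static unsigned long model_segment_base(const struct task_model *t,
					const struct mm_model *hw_loaded,
					unsigned int idx)
{
	const struct ldt_model *ldt;

	if (!model_uaccess_okay(t, hw_loaded))
		return 0;

	ldt = t->mm->ldt;
	if (!ldt || idx >= ldt->nr_entries)
		return 0;

	return ldt->base[idx];
}
```

Note that active_mm is never read: a kernel thread that lazily borrows a user
mm, or a task running on a temporary mm, simply gets 0 back.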

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/d456e7da9dbd271aacd14812d4b9b74e7d7edd52.1641659630.git.luto@kernel.org
---
 arch/x86/events/core.c   |    9 ++++++++-
 arch/x86/lib/insn-eval.c |   13 ++++++++++---
 2 files changed, 18 insertions(+), 4 deletions(-)

--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2798,8 +2798,15 @@ static unsigned long get_segment_base(un
 #ifdef CONFIG_MODIFY_LDT_SYSCALL
 		struct ldt_struct *ldt;
 
+		/*
+		 * If we're not in a valid context with a real (not just lazy)
+		 * user mm, then don't even try.
+		 */
+		if (!nmi_uaccess_okay())
+			return 0;
+
 		/* IRQs are off, so this synchronizes with smp_store_release */
-		ldt = READ_ONCE(current->active_mm->context.ldt);
+		ldt = smp_load_acquire(&current->mm->context.ldt);
 		if (!ldt || idx >= ldt->nr_entries)
 			return 0;
 
--- a/arch/x86/lib/insn-eval.c
+++ b/arch/x86/lib/insn-eval.c
@@ -631,14 +631,21 @@ static bool get_desc(struct desc_struct
 		/* Bits [15:3] contain the index of the desired entry. */
 		sel >>= 3;
 
-		mutex_lock(&current->active_mm->context.lock);
-		ldt = current->active_mm->context.ldt;
+		/*
+		 * If we're not in a valid context with a real (not just lazy)
+		 * user mm, then don't even try.
+		 */
+		if (!nmi_uaccess_okay())
+			return false;
+
+		mutex_lock(&current->mm->context.lock);
+		ldt = current->mm->context.ldt;
 		if (ldt && sel < ldt->nr_entries) {
 			*out = ldt->entries[sel];
 			success = true;
 		}
 
-		mutex_unlock(&current->active_mm->context.lock);
+		mutex_unlock(&current->mm->context.lock);
 
 		return success;
 	}




* [PATCH 3/7] x86/mm: Make use/unuse_temporary_mm() non-static
  2024-11-19 16:25 [PATCH 0/7] x86/mm: Clean up and use temportary_mm more Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 1/7] x86/mm: Add mm argument to unuse_temporary_mm() Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 2/7] x86/events, x86/insn-eval: Remove incorrect active_mm references Peter Zijlstra
@ 2024-11-19 16:25 ` Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 4/7] x86/mm: Remove mm argument from unuse_temporary_mm() again Peter Zijlstra
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2024-11-19 16:25 UTC (permalink / raw)
  To: x86, "To:riel"; +Cc: linux-kernel, peterz, Andy Lutomirski

From: Andy Lutomirski <luto@kernel.org>

This prepares them for use outside of the alternative machinery.
The code is unchanged.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/d1205bc7e165e249c52b7fe8cb1254f06e8a0e2a.1641659630.git.luto@kernel.org
---
 arch/x86/include/asm/mmu_context.h |    7 +++
 arch/x86/kernel/alternative.c      |   68 -------------------------------------
 arch/x86/mm/tlb.c                  |   63 ++++++++++++++++++++++++++++++++++
 3 files changed, 70 insertions(+), 68 deletions(-)

--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -263,4 +263,11 @@ unsigned long __get_current_cr3_fast(voi
 
 #include <asm-generic/mmu_context.h>
 
+typedef struct {
+	struct mm_struct *mm;
+} temp_mm_state_t;
+
+extern temp_mm_state_t use_temporary_mm(struct mm_struct *mm);
+extern void unuse_temporary_mm(struct mm_struct *mm, temp_mm_state_t prev_state);
+
 #endif /* _ASM_X86_MMU_CONTEXT_H */
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1774,77 +1774,9 @@ void __init_or_module text_poke_early(vo
 	}
 }
 
-typedef struct {
-	struct mm_struct *mm;
-} temp_mm_state_t;
-
-/*
- * Using a temporary mm allows to set temporary mappings that are not accessible
- * by other CPUs. Such mappings are needed to perform sensitive memory writes
- * that override the kernel memory protections (e.g., W^X), without exposing the
- * temporary page-table mappings that are required for these write operations to
- * other CPUs. Using a temporary mm also allows to avoid TLB shootdowns when the
- * mapping is torn down.
- *
- * Context: The temporary mm needs to be used exclusively by a single core. To
- *          harden security IRQs must be disabled while the temporary mm is
- *          loaded, thereby preventing interrupt handler bugs from overriding
- *          the kernel memory protection.
- */
-static inline temp_mm_state_t use_temporary_mm(struct mm_struct *mm)
-{
-	temp_mm_state_t temp_state;
-
-	lockdep_assert_irqs_disabled();
-
-	/*
-	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
-	 * with a stale address space WITHOUT being in lazy mode after
-	 * restoring the previous mm.
-	 */
-	if (this_cpu_read(cpu_tlbstate_shared.is_lazy))
-		leave_mm();
-
-	temp_state.mm = this_cpu_read(cpu_tlbstate.loaded_mm);
-	switch_mm_irqs_off(NULL, mm, current);
-
-	/*
-	 * If breakpoints are enabled, disable them while the temporary mm is
-	 * used. Userspace might set up watchpoints on addresses that are used
-	 * in the temporary mm, which would lead to wrong signals being sent or
-	 * crashes.
-	 *
-	 * Note that breakpoints are not disabled selectively, which also causes
-	 * kernel breakpoints (e.g., perf's) to be disabled. This might be
-	 * undesirable, but still seems reasonable as the code that runs in the
-	 * temporary mm should be short.
-	 */
-	if (hw_breakpoint_active())
-		hw_breakpoint_disable();
-
-	return temp_state;
-}
-
 __ro_after_init struct mm_struct *poking_mm;
 __ro_after_init unsigned long poking_addr;
 
-static inline void unuse_temporary_mm(struct mm_struct *mm, temp_mm_state_t prev_state)
-{
-	lockdep_assert_irqs_disabled();
-
-	switch_mm_irqs_off(NULL, prev_state.mm, current);
-
-	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
-	cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(mm));
-
-	/*
-	 * Restore the breakpoints if they were disabled before the temporary mm
-	 * was loaded.
-	 */
-	if (hw_breakpoint_active())
-		hw_breakpoint_restore();
-}
-
 static void text_poke_memcpy(void *dst, const void *src, size_t len)
 {
 	memcpy(dst, src, len);
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -674,6 +674,69 @@ void enter_lazy_tlb(struct mm_struct *mm
 }
 
 /*
+ * Using a temporary mm allows to set temporary mappings that are not accessible
+ * by other CPUs. Such mappings are needed to perform sensitive memory writes
+ * that override the kernel memory protections (e.g., W^X), without exposing the
+ * temporary page-table mappings that are required for these write operations to
+ * other CPUs. Using a temporary mm also allows to avoid TLB shootdowns when the
+ * mapping is torn down.
+ *
+ * Context: The temporary mm needs to be used exclusively by a single core. To
+ *          harden security IRQs must be disabled while the temporary mm is
+ *          loaded, thereby preventing interrupt handler bugs from overriding
+ *          the kernel memory protection.
+ */
+temp_mm_state_t use_temporary_mm(struct mm_struct *mm)
+{
+	temp_mm_state_t temp_state;
+
+	lockdep_assert_irqs_disabled();
+
+	/*
+	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
+	 * with a stale address space WITHOUT being in lazy mode after
+	 * restoring the previous mm.
+	 */
+	if (this_cpu_read(cpu_tlbstate_shared.is_lazy))
+		leave_mm();
+
+	temp_state.mm = this_cpu_read(cpu_tlbstate.loaded_mm);
+	switch_mm_irqs_off(NULL, mm, current);
+
+	/*
+	 * If breakpoints are enabled, disable them while the temporary mm is
+	 * used. Userspace might set up watchpoints on addresses that are used
+	 * in the temporary mm, which would lead to wrong signals being sent or
+	 * crashes.
+	 *
+	 * Note that breakpoints are not disabled selectively, which also causes
+	 * kernel breakpoints (e.g., perf's) to be disabled. This might be
+	 * undesirable, but still seems reasonable as the code that runs in the
+	 * temporary mm should be short.
+	 */
+	if (hw_breakpoint_active())
+		hw_breakpoint_disable();
+
+	return temp_state;
+}
+
+void unuse_temporary_mm(struct mm_struct *mm, temp_mm_state_t prev_state)
+{
+	lockdep_assert_irqs_disabled();
+	switch_mm_irqs_off(NULL, prev_state.mm, current);
+
+	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
+	cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(mm));
+
+	/*
+	 * Restore the breakpoints if they were disabled before the temporary mm
+	 * was loaded.
+	 */
+	if (hw_breakpoint_active())
+		hw_breakpoint_restore();
+}
+
+/*
  * Call this when reinitializing a CPU.  It fixes the following potential
  * problems:
  *




* [PATCH 4/7] x86/mm: Remove mm argument from unuse_temporary_mm() again
  2024-11-19 16:25 [PATCH 0/7] x86/mm: Clean up and use temportary_mm more Peter Zijlstra
                   ` (2 preceding siblings ...)
  2024-11-19 16:25 ` [PATCH 3/7] x86/mm: Make use/unuse_temporary_mm() non-static Peter Zijlstra
@ 2024-11-19 16:25 ` Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 5/7] x86/mm: Allow temporary mms when IRQs are on Peter Zijlstra
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2024-11-19 16:25 UTC (permalink / raw)
  To: x86, "To:riel"; +Cc: linux-kernel, peterz

Now that unuse_temporary_mm() lives in tlb.c it can access loaded_mm.
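
One consequence worth spelling out is the ordering: once the mm to clean up is
derived from loaded_mm, the mask bit has to be cleared before switching back,
because afterwards loaded_mm no longer points at the temporary mm. A
single-CPU userspace sketch follows; every name in it is a hypothetical
stand-in, and marking the CPU in the temporary mm's mask on entry is a
simplifying assumption.

```c
#include <assert.h>

enum { THIS_CPU = 0 };

/* Hypothetical stand-in for struct mm_struct: just a CPU bitmask. */
struct mm_model {
	unsigned long cpumask;
};

/* Hypothetical model of cpu_tlbstate.loaded_mm for a single CPU. */
static struct mm_model *loaded_mm;

static struct mm_model *model_use(struct mm_model *temp)
{
	struct mm_model *prev = loaded_mm;

	loaded_mm = temp;
	temp->cpumask |= 1UL << THIS_CPU;
	return prev;
}

static void model_unuse(struct mm_model *prev)
{
	/* loaded_mm still names the temporary mm here, so clear first ... */
	loaded_mm->cpumask &= ~(1UL << THIS_CPU);

	/* ... and only then switch back to the previous mm. */
	loaded_mm = prev;
}
```

Swapping the two statements in model_unuse() would clear the bit in the
previous mm's mask instead, which is the mistake the reordering in the diff
below avoids.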

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/mmu_context.h |    2 +-
 arch/x86/kernel/alternative.c      |    2 +-
 arch/x86/mm/tlb.c                  |    8 +++++---
 3 files changed, 7 insertions(+), 5 deletions(-)

--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -268,6 +268,6 @@ typedef struct {
 } temp_mm_state_t;
 
 extern temp_mm_state_t use_temporary_mm(struct mm_struct *mm);
-extern void unuse_temporary_mm(struct mm_struct *mm, temp_mm_state_t prev_state);
+extern void unuse_temporary_mm(temp_mm_state_t prev_state);
 
 #endif /* _ASM_X86_MMU_CONTEXT_H */
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1874,7 +1874,7 @@ static void *__text_poke(text_poke_f fun
 	 * instruction that already allows the core to see the updated version.
 	 * Xen-PV is assumed to serialize execution in a similar manner.
 	 */
-	unuse_temporary_mm(poking_mm, prev);
+	unuse_temporary_mm(prev);
 
 	/*
 	 * Flushing the TLB might involve IPIs, which would require enabled
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -720,13 +720,15 @@ temp_mm_state_t use_temporary_mm(struct
 	return temp_state;
 }
 
-void unuse_temporary_mm(struct mm_struct *mm, temp_mm_state_t prev_state)
+void unuse_temporary_mm(temp_mm_state_t prev_state)
 {
 	lockdep_assert_irqs_disabled();
-	switch_mm_irqs_off(NULL, prev_state.mm, current);
 
 	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
-	cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(mm));
+	cpumask_clear_cpu(smp_processor_id(),
+			  mm_cpumask(this_cpu_read(cpu_tlbstate.loaded_mm)));
+
+	switch_mm_irqs_off(NULL, prev_state.mm, current);
 
 	/*
 	 * Restore the breakpoints if they were disabled before the temporary mm




* [PATCH 5/7] x86/mm: Allow temporary mms when IRQs are on
  2024-11-19 16:25 [PATCH 0/7] x86/mm: Clean up and use temportary_mm more Peter Zijlstra
                   ` (3 preceding siblings ...)
  2024-11-19 16:25 ` [PATCH 4/7] x86/mm: Remove mm argument from unuse_temporary_mm() again Peter Zijlstra
@ 2024-11-19 16:25 ` Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 6/7] x86/efi: Make efi_enter/leave_mm use the temporary_mm machinery Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 7/7] x86/mm: Opt in to IRQs-off activate_mm() Peter Zijlstra
  6 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2024-11-19 16:25 UTC (permalink / raw)
  To: x86, "To:riel"; +Cc: linux-kernel, peterz, Andy Lutomirski

From: Andy Lutomirski <luto@kernel.org>

EFI runtime services should use temporary mms, but EFI runtime services
want IRQs on.  Preemption must still be disabled in a temporary mm context.
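
To make the relaxed requirement concrete, here is a small userspace model of
the two checks. It is a sketch, not the lockdep implementation: struct
cpu_model and both predicates are hypothetical, and the second predicate
assumes that "preemption disabled" means either a non-zero preempt count or
IRQs being off.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the CPU state that the assertions care about. */
struct cpu_model {
	int preempt_count;
	bool irqs_enabled;
};

/* Old requirement: IRQs must be off. */
static bool model_irqs_disabled_ok(const struct cpu_model *c)
{
	return !c->irqs_enabled;
}

/*
 * New requirement (assumed semantics): the task must not be preemptible,
 * which holds with a non-zero preempt count or with IRQs off.
 */
static bool model_preemption_disabled_ok(const struct cpu_model *c)
{
	return c->preempt_count != 0 || !c->irqs_enabled;
}
```

A text-poke style caller (IRQs off) passes both checks, while an EFI style
caller (IRQs on, preemption disabled) passes only the new one, which is the
case this patch is meant to allow.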

At some point, the entire temporary mm mechanism should be moved out of
arch code.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/a8a92ce490b57447ef56898c55133473e481896e.1641659630.git.luto@kernel.org
---
 arch/x86/mm/tlb.c |   19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -679,18 +679,23 @@ void enter_lazy_tlb(struct mm_struct *mm
  * that override the kernel memory protections (e.g., W^X), without exposing the
  * temporary page-table mappings that are required for these write operations to
  * other CPUs. Using a temporary mm also allows to avoid TLB shootdowns when the
- * mapping is torn down.
+ * mapping is torn down.  Temporary mms can also be used for EFI runtime service
+ * calls or similar functionality.
  *
- * Context: The temporary mm needs to be used exclusively by a single core. To
- *          harden security IRQs must be disabled while the temporary mm is
- *          loaded, thereby preventing interrupt handler bugs from overriding
- *          the kernel memory protection.
+ * It is illegal to schedule while using a temporary mm -- the context switch
+ * code is unaware of the temporary mm and does not know how to context switch.
+ * Use a real (non-temporary) mm in a kernel thread if you need to sleep.
+ *
+ * Note: For sensitive memory writes, the temporary mm needs to be used
+ *       exclusively by a single core, and IRQs should be disabled while the
+ *       temporary mm is loaded, thereby preventing interrupt handler bugs from
+ *       overriding the kernel memory protection.
  */
 temp_mm_state_t use_temporary_mm(struct mm_struct *mm)
 {
 	temp_mm_state_t temp_state;
 
-	lockdep_assert_irqs_disabled();
+	lockdep_assert_preemption_disabled();
 
 	/*
 	 * Make sure not to be in TLB lazy mode, as otherwise we'll end up
@@ -722,7 +727,7 @@ temp_mm_state_t use_temporary_mm(struct
 
 void unuse_temporary_mm(temp_mm_state_t prev_state)
 {
-	lockdep_assert_irqs_disabled();
+	lockdep_assert_preemption_disabled();
 
 	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
 	cpumask_clear_cpu(smp_processor_id(),




* [PATCH 6/7] x86/efi: Make efi_enter/leave_mm use the temporary_mm machinery
  2024-11-19 16:25 [PATCH 0/7] x86/mm: Clean up and use temportary_mm more Peter Zijlstra
                   ` (4 preceding siblings ...)
  2024-11-19 16:25 ` [PATCH 5/7] x86/mm: Allow temporary mms when IRQs are on Peter Zijlstra
@ 2024-11-19 16:25 ` Peter Zijlstra
  2024-11-19 16:25 ` [PATCH 7/7] x86/mm: Opt in to IRQs-off activate_mm() Peter Zijlstra
  6 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2024-11-19 16:25 UTC (permalink / raw)
  To: x86, "To:riel"; +Cc: linux-kernel, peterz, Andy Lutomirski

From: Andy Lutomirski <luto@kernel.org>

This should be considerably more robust.  It's also necessary for optimized
for_each_possible_lazymm_cpu() on x86 -- without this patch, EFI calls in
lazy context would remove the lazy mm from mm_cpumask().

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/3efc4cfd1d7c45a32752ced389d6666be15cde56.1641659630.git.luto@kernel.org
---
 arch/x86/platform/efi/efi_64.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -54,7 +54,7 @@
  * 0xffff_ffff_0000_0000 and limit EFI VA mapping space to 64G.
  */
 static u64 efi_va = EFI_VA_START;
-static struct mm_struct *efi_prev_mm;
+static temp_mm_state_t efi_temp_mm_state;
 
 /*
  * We need our own copy of the higher levels of the page tables
@@ -476,15 +476,12 @@ void __init efi_dump_pagetable(void)
  */
 static void efi_enter_mm(void)
 {
-	efi_prev_mm = current->active_mm;
-	current->active_mm = &efi_mm;
-	switch_mm(efi_prev_mm, &efi_mm, NULL);
+	efi_temp_mm_state = use_temporary_mm(&efi_mm);
 }
 
 static void efi_leave_mm(void)
 {
-	current->active_mm = efi_prev_mm;
-	switch_mm(&efi_mm, efi_prev_mm, NULL);
+	unuse_temporary_mm(efi_temp_mm_state);
 }
 
 void arch_efi_call_virt_setup(void)




* [PATCH 7/7] x86/mm: Opt in to IRQs-off activate_mm()
  2024-11-19 16:25 [PATCH 0/7] x86/mm: Clean up and use temportary_mm more Peter Zijlstra
                   ` (5 preceding siblings ...)
  2024-11-19 16:25 ` [PATCH 6/7] x86/efi: Make efi_enter/leave_mm use the temporary_mm machinery Peter Zijlstra
@ 2024-11-19 16:25 ` Peter Zijlstra
  6 siblings, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2024-11-19 16:25 UTC (permalink / raw)
  To: x86, "To:riel"; +Cc: linux-kernel, peterz, Andy Lutomirski

From: Andy Lutomirski <luto@kernel.org>

We gain nothing by having the core code enable IRQs right before calling
activate_mm() only for us to turn them right back off again in switch_mm().

This will save a few cycles, so execve() should be blazingly fast with this
patch applied!
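
The saving can be counted with a toy userspace model. This is a sketch under
assumptions: struct irq_model and the model_*() functions are hypothetical,
and the two exec paths only mimic the sequence described above (the core code
disables IRQs to install the new mm, and either re-enables them before
activate_mm() or keeps them off across it).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical IRQ state that counts every real enable/disable transition. */
struct irq_model {
	bool enabled;
	int toggles;
};

static void model_irq_disable(struct irq_model *m)
{
	if (m->enabled) {
		m->enabled = false;
		m->toggles++;
	}
}

static void model_irq_enable(struct irq_model *m)
{
	if (!m->enabled) {
		m->enabled = true;
		m->toggles++;
	}
}

/* Model of switch_mm(): disable IRQs, switch, restore the previous state. */
static void model_switch_mm(struct irq_model *m)
{
	bool was_enabled = m->enabled;

	model_irq_disable(m);
	/* the address-space switch would happen here */
	if (was_enabled)
		model_irq_enable(m);
}

/* Model of switch_mm_irqs_off(): the caller already has IRQs off. */
static void model_switch_mm_irqs_off(struct irq_model *m)
{
	assert(!m->enabled);
	/* the address-space switch would happen here */
}

/* Without the opt-in: IRQs come back on just before activate_mm(). */
static int model_exec_old(void)
{
	struct irq_model m = { .enabled = true, .toggles = 0 };

	model_irq_disable(&m);	/* install the new mm */
	model_irq_enable(&m);
	model_switch_mm(&m);	/* activate_mm() via switch_mm() */
	return m.toggles;
}

/* With the opt-in: IRQs stay off across activate_mm(). */
static int model_exec_new(void)
{
	struct irq_model m = { .enabled = true, .toggles = 0 };

	model_irq_disable(&m);		/* install the new mm */
	model_switch_mm_irqs_off(&m);	/* activate_mm() */
	model_irq_enable(&m);
	return m.toggles;
}
```

In this model the old path changes IRQ state four times and the new path
twice, which is the "few cycles" the message refers to.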

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/69c7d711f240cfec23e6024e940d31af2990db36.1641659630.git.luto@kernel.org
---
 arch/x86/Kconfig                   |    1 +
 arch/x86/include/asm/mmu_context.h |    2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

Index: linux-2.6/arch/x86/Kconfig
===================================================================
--- linux-2.6.orig/arch/x86/Kconfig
+++ linux-2.6/arch/x86/Kconfig
@@ -133,6 +133,7 @@ config X86
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP	if X86_64
 	select ARCH_WANTS_THP_SWAP		if X86_64
 	select ARCH_HAS_PARANOID_L1D_FLUSH
+	select ARCH_WANT_IRQS_OFF_ACTIVATE_MM
 	select BUILDTIME_TABLE_SORT
 	select CLKEVT_I8253
 	select CLOCKSOURCE_VALIDATE_LAST_CYCLE
Index: linux-2.6/arch/x86/include/asm/mmu_context.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/mmu_context.h
+++ linux-2.6/arch/x86/include/asm/mmu_context.h
@@ -175,7 +175,7 @@ extern void switch_mm_irqs_off(struct mm
 #define activate_mm(prev, next)			\
 do {						\
 	paravirt_enter_mmap(next);		\
-	switch_mm((prev), (next), NULL);	\
+	switch_mm_irqs_off((prev), (next), NULL);	\
 } while (0);
 
 #ifdef CONFIG_X86_32




* [PATCH 1/7] x86/mm: Add 'mm' argument to unuse_temporary_mm()
  2025-04-02  9:45 [PATCH 0/7 -v2] Factor out, clean up and use the use_/unuse_temporary_mm() APIs some more Ingo Molnar
@ 2025-04-02  9:45 ` Ingo Molnar
  0 siblings, 0 replies; 9+ messages in thread
From: Ingo Molnar @ 2025-04-02  9:45 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andy Lutomirski, Rik van Riel, H . Peter Anvin, Peter Zijlstra,
	Linus Torvalds, Andrew Morton, Ingo Molnar

From: Peter Zijlstra <peterz@infradead.org>

In commit 209954cbc7d0 ("x86/mm/tlb: Update mm_cpumask lazily")
unuse_temporary_mm() grew the assumption that it gets used on
poking_mm exclusively. While this is currently true, let's not hard-code
this assumption.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20241119163035.322525475@infradead.org
---
 arch/x86/kernel/alternative.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 5b1a6252a4b9..cfffcb80f564 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -2161,14 +2161,14 @@ static inline struct mm_struct *use_temporary_mm(struct mm_struct *temp_mm)
 __ro_after_init struct mm_struct *text_poke_mm;
 __ro_after_init unsigned long text_poke_mm_addr;
 
-static inline void unuse_temporary_mm(struct mm_struct *prev_mm)
+static inline void unuse_temporary_mm(struct mm_struct *mm, struct mm_struct *prev_mm)
 {
 	lockdep_assert_irqs_disabled();
 
 	switch_mm_irqs_off(NULL, prev_mm, current);
 
 	/* Clear the cpumask, to indicate no TLB flushing is needed anywhere */
-	cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(text_poke_mm));
+	cpumask_clear_cpu(raw_smp_processor_id(), mm_cpumask(mm));
 
 	/*
 	 * Restore the breakpoints if they were disabled before the temporary mm
@@ -2275,7 +2275,7 @@ static void *__text_poke(text_poke_f func, void *addr, const void *src, size_t l
 	 * instruction that already allows the core to see the updated version.
 	 * Xen-PV is assumed to serialize execution in a similar manner.
 	 */
-	unuse_temporary_mm(prev_mm);
+	unuse_temporary_mm(text_poke_mm, prev_mm);
 
 	/*
 	 * Flushing the TLB might involve IPIs, which would require enabled
-- 
2.45.2


