* [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs
@ 2023-06-20 14:46 Yair Podemsky
  2023-06-20 14:46 ` [PATCH v2 1/2] arch: Introduce ARCH_HAS_CPUMASK_BITS Yair Podemsky
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Yair Podemsky @ 2023-06-20 14:46 UTC (permalink / raw)
  To: mtosatti, ppandit, david, linux, mpe, npiggin, christophe.leroy,
	hca, gor, agordeev, borntraeger, svens, davem, tglx, mingo, bp,
	dave.hansen, hpa, keescook, paulmck, frederic, will, peterz, ardb,
	samitolvanen, juerg.haefliger, arnd, rmk+kernel, geert+renesas,
	linus.walleij, akpm, sebastian.reichel, rppt, aneesh.kumar, x86,
	linux-arm-kernel, linuxppc-dev, linux-s390, sparclinux,
	linux-arch, linux-mm, linux-kernel
  Cc: ypodemsk
Currently the tlb_remove_table_smp_sync IPI is sent to all CPUs
indiscriminately. This causes unnecessary work and delays, which are
notable in real-time use cases and on isolated CPUs.
This series limits the IPI to the CPUs referencing the affected mm.
A config option that differentiates architectures that maintain
mm_cpumask from those that don't allows safe usage of this feature.
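
To make the saving concrete, here is a minimal userspace sketch, not kernel
code. It assumes a 64-bit integer with one bit per CPU as a stand-in for a
cpumask; the names cpumask_model_t and ipi_targets() are hypothetical and
exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical userspace model: one bit per CPU, up to 64 CPUs. */
typedef uint64_t cpumask_model_t;

/* Count the remote CPUs that would be interrupted for a given mask. */
static int ipi_targets(cpumask_model_t mask, int sender)
{
	int n = 0;

	/* The sending CPU does not need to interrupt itself. */
	mask &= ~((cpumask_model_t)1 << sender);

	/* Each iteration clears the lowest set bit. */
	for (; mask; mask &= mask - 1)
		n++;

	return n;
}
```

With CPUs 0-7 online (mask 0xff) and an mm loaded only on CPUs 0 and 2
(mask 0x05), a broadcast from CPU 0 interrupts seven CPUs, while the
restricted mask interrupts one.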
Changes from v1:
- The previous version included a patch to only send the IPI to CPUs with
  context_tracking in kernel space. This was removed due to race
  condition concerns.
- For archs that do not maintain mm_cpumask, the mask used should be
  cpu_online_mask (Peter Zijlstra).

v1: https://lore.kernel.org/all/20230404134224.137038-1-ypodemsk@redhat.com/
Yair Podemsky (2):
  arch: Introduce ARCH_HAS_CPUMASK_BITS
  mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
 arch/Kconfig              |  8 ++++++++
 arch/arm/Kconfig          |  1 +
 arch/powerpc/Kconfig      |  1 +
 arch/s390/Kconfig         |  1 +
 arch/sparc/Kconfig        |  1 +
 arch/x86/Kconfig          |  1 +
 include/asm-generic/tlb.h |  4 ++--
 mm/khugepaged.c           |  4 ++--
 mm/mmu_gather.c           | 17 ++++++++++++-----
 9 files changed, 29 insertions(+), 9 deletions(-)
-- 
2.39.3
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply	[flat|nested] 16+ messages in thread
* [PATCH v2 1/2] arch: Introduce ARCH_HAS_CPUMASK_BITS
  2023-06-20 14:46 [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Yair Podemsky
@ 2023-06-20 14:46 ` Yair Podemsky
  2023-06-20 14:46 ` [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs Yair Podemsky
  2023-06-21  7:43 ` [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Peter Zijlstra
  2 siblings, 0 replies; 16+ messages in thread
From: Yair Podemsky @ 2023-06-20 14:46 UTC (permalink / raw)
  To: mtosatti, ppandit, david, linux, mpe, npiggin, christophe.leroy,
	hca, gor, agordeev, borntraeger, svens, davem, tglx, mingo, bp,
	dave.hansen, hpa, keescook, paulmck, frederic, will, peterz, ardb,
	samitolvanen, juerg.haefliger, arnd, rmk+kernel, geert+renesas,
	linus.walleij, akpm, sebastian.reichel, rppt, aneesh.kumar, x86,
	linux-arm-kernel, linuxppc-dev, linux-s390, sparclinux,
	linux-arch, linux-mm, linux-kernel
  Cc: ypodemsk
Some architectures set and maintain the mm_cpumask bits when loading
a process onto a CPU or removing it from one.
This Kconfig option marks those architectures, allowing different
behavior between kernels that maintain the mm_cpumask and those that
do not.
Signed-off-by: Yair Podemsky <ypodemsk@redhat.com>
---
 arch/Kconfig         | 8 ++++++++
 arch/arm/Kconfig     | 1 +
 arch/powerpc/Kconfig | 1 +
 arch/s390/Kconfig    | 1 +
 arch/sparc/Kconfig   | 1 +
 arch/x86/Kconfig     | 1 +
 6 files changed, 13 insertions(+)
diff --git a/arch/Kconfig b/arch/Kconfig
index 205fd23e0cad..953fbfa5a2ad 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1466,6 +1466,14 @@ config ARCH_HAS_NONLEAF_PMD_YOUNG
 	  address translations. Page table walkers that clear the accessed bit
 	  may use this capability to reduce their search space.
 
+config ARCH_HAS_CPUMASK_BITS
+	bool
+	help
+	  Architectures that select this option set bits on the mm_cpumask
+	  to mark which CPUs loaded the mm. The mask can then be used to
+	  control mm-specific actions such as tlb_flush.
+
+
 source "kernel/gcov/Kconfig"
 
 source "scripts/gcc-plugins/Kconfig"
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0fb4b218f665..cd20e96bc1dc 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -70,6 +70,7 @@ config ARM
 	select GENERIC_SMP_IDLE_THREAD
 	select HARDIRQS_SW_RESEND
 	select HAS_IOPORT
+	select ARCH_HAS_CPUMASK_BITS
 	select HAVE_ARCH_AUDITSYSCALL if AEABI && !OABI_COMPAT
 	select HAVE_ARCH_BITREVERSE if (CPU_32v7M || CPU_32v7) && !CPU_32v6
 	select HAVE_ARCH_JUMP_LABEL if !XIP_KERNEL && !CPU_ENDIAN_BE32 && MMU
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index bff5820b7cda..c9218722aa2f 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -156,6 +156,7 @@ config PPC
 	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UACCESS_FLUSHCACHE
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
+	select ARCH_HAS_CPUMASK_BITS
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_KEEP_MEMBLOCK
 	select ARCH_MIGHT_HAVE_PC_PARPORT
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 6dab9c1be508..60bf29bc3f87 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -84,6 +84,7 @@ config S390
 	select ARCH_HAS_SYSCALL_WRAPPER
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
 	select ARCH_HAS_VDSO_DATA
+	select ARCH_HAS_CPUMASK_BITS
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select ARCH_INLINE_READ_LOCK
 	select ARCH_INLINE_READ_LOCK_BH
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 8535e19062f6..e8bf4d769306 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -99,6 +99,7 @@ config SPARC64
 	select ARCH_HAS_PTE_SPECIAL
 	select PCI_DOMAINS if PCI
 	select ARCH_HAS_GIGANTIC_PAGE
+	select ARCH_HAS_CPUMASK_BITS
 	select HAVE_SOFTIRQ_ON_OWN_STACK
 	select HAVE_SETUP_PER_CPU_AREA
 	select NEED_PER_CPU_EMBED_FIRST_CHUNK
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 53bab123a8ee..b351421695f3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -185,6 +185,7 @@ config X86
 	select HAVE_ARCH_THREAD_STRUCT_WHITELIST
 	select HAVE_ARCH_STACKLEAK
 	select HAVE_ARCH_TRACEHOOK
+	select ARCH_HAS_CPUMASK_BITS
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE
 	select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
 	select HAVE_ARCH_USERFAULTFD_WP         if X86_64 && USERFAULTFD
-- 
2.39.3
^ permalink raw reply related	[flat|nested] 16+ messages in thread
* [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-20 14:46 [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Yair Podemsky
  2023-06-20 14:46 ` [PATCH v2 1/2] arch: Introduce ARCH_HAS_CPUMASK_BITS Yair Podemsky
@ 2023-06-20 14:46 ` Yair Podemsky
  2023-06-21 17:42   ` Dave Hansen
                     ` (2 more replies)
  2023-06-21  7:43 ` [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Peter Zijlstra
  2 siblings, 3 replies; 16+ messages in thread
From: Yair Podemsky @ 2023-06-20 14:46 UTC (permalink / raw)
  To: mtosatti, ppandit, david, linux, mpe, npiggin, christophe.leroy,
	hca, gor, agordeev, borntraeger, svens, davem, tglx, mingo, bp,
	dave.hansen, hpa, keescook, paulmck, frederic, will, peterz, ardb,
	samitolvanen, juerg.haefliger, arnd, rmk+kernel, geert+renesas,
	linus.walleij, akpm, sebastian.reichel, rppt, aneesh.kumar, x86,
	linux-arm-kernel, linuxppc-dev, linux-s390, sparclinux,
	linux-arch, linux-mm, linux-kernel
  Cc: ypodemsk
Currently the tlb_remove_table_smp_sync IPI is sent to all CPUs
indiscriminately. This causes unnecessary work and delays, which are
notable in real-time use cases and on isolated CPUs.
This patch limits the IPI on systems with ARCH_HAS_CPUMASK_BITS,
where it is only sent to CPUs referencing the affected mm.
Signed-off-by: Yair Podemsky <ypodemsk@redhat.com>
Suggested-by: David Hildenbrand <david@redhat.com>
---
 include/asm-generic/tlb.h |  4 ++--
 mm/khugepaged.c           |  4 ++--
 mm/mmu_gather.c           | 17 ++++++++++++-----
 3 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index b46617207c93..0b6ba17cc8d3 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -222,7 +222,7 @@ extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
 #define tlb_needs_table_invalidate() (true)
 #endif
 
-void tlb_remove_table_sync_one(void);
+void tlb_remove_table_sync_one(struct mm_struct *mm);
 
 #else
 
@@ -230,7 +230,7 @@ void tlb_remove_table_sync_one(void);
 #error tlb_needs_table_invalidate() requires MMU_GATHER_RCU_TABLE_FREE
 #endif
 
-static inline void tlb_remove_table_sync_one(void) { }
+static inline void tlb_remove_table_sync_one(struct mm_struct *mm) { }
 
 #endif /* CONFIG_MMU_GATHER_RCU_TABLE_FREE */
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6b9d39d65b73..3e5cb079d268 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1166,7 +1166,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	_pmd = pmdp_collapse_flush(vma, address, pmd);
 	spin_unlock(pmd_ptl);
 	mmu_notifier_invalidate_range_end(&range);
-	tlb_remove_table_sync_one();
+	tlb_remove_table_sync_one(mm);
 
 	spin_lock(pte_ptl);
 	result =  __collapse_huge_page_isolate(vma, address, pte, cc,
@@ -1525,7 +1525,7 @@ static void collapse_and_free_pmd(struct mm_struct *mm, struct vm_area_struct *v
 				addr + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 	pmd = pmdp_collapse_flush(vma, addr, pmdp);
-	tlb_remove_table_sync_one();
+	tlb_remove_table_sync_one(mm);
 	mmu_notifier_invalidate_range_end(&range);
 	mm_dec_nr_ptes(mm);
 	page_table_check_pte_clear_range(mm, addr, pmd);
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index ea9683e12936..692d8175a88e 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -191,7 +191,13 @@ static void tlb_remove_table_smp_sync(void *arg)
 	/* Simply deliver the interrupt */
 }
 
-void tlb_remove_table_sync_one(void)
+#ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
+#define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
+#else
+#define REMOVE_TABLE_IPI_MASK cpu_online_mask
+#endif /* CONFIG_ARCH_HAS_CPUMASK_BITS */
+
+void tlb_remove_table_sync_one(struct mm_struct *mm)
 {
 	/*
 	 * This isn't an RCU grace period and hence the page-tables cannot be
@@ -200,7 +206,8 @@ void tlb_remove_table_sync_one(void)
 	 * It is however sufficient for software page-table walkers that rely on
 	 * IRQ disabling.
 	 */
-	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
+	on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
+			NULL, true);
 }
 
 static void tlb_remove_table_rcu(struct rcu_head *head)
@@ -237,9 +244,9 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 	}
 }
 
-static void tlb_remove_table_one(void *table)
+static void tlb_remove_table_one(struct mm_struct *mm, void *table)
 {
-	tlb_remove_table_sync_one();
+	tlb_remove_table_sync_one(mm);
 	__tlb_remove_table(table);
 }
 
@@ -262,7 +269,7 @@ void tlb_remove_table(struct mmu_gather *tlb, void *table)
 		*batch = (struct mmu_table_batch *)__get_free_page(GFP_NOWAIT | __GFP_NOWARN);
 		if (*batch == NULL) {
 			tlb_table_invalidate(tlb);
-			tlb_remove_table_one(table);
+			tlb_remove_table_one(tlb->mm, table);
 			return;
 		}
 		(*batch)->nr = 0;
---
v2: without ARCH_HAS_CPUMASK_BITS, REMOVE_TABLE_IPI_MASK now falls back to cpu_online_mask
-- 
2.39.3
^ permalink raw reply related	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs
  2023-06-20 14:46 [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Yair Podemsky
  2023-06-20 14:46 ` [PATCH v2 1/2] arch: Introduce ARCH_HAS_CPUMASK_BITS Yair Podemsky
  2023-06-20 14:46 ` [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs Yair Podemsky
@ 2023-06-21  7:43 ` Peter Zijlstra
  2023-06-22 12:47   ` Marcelo Tosatti
  2023-06-22 13:11   ` ypodemsk
  2 siblings, 2 replies; 16+ messages in thread
From: Peter Zijlstra @ 2023-06-21  7:43 UTC (permalink / raw)
  To: Yair Podemsky
  Cc: mtosatti, ppandit, david, linux, mpe, npiggin, christophe.leroy,
	hca, gor, agordeev, borntraeger, svens, davem, tglx, mingo, bp,
	dave.hansen, hpa, keescook, paulmck, frederic, will, ardb,
	samitolvanen, juerg.haefliger, arnd, rmk+kernel, geert+renesas,
	linus.walleij, akpm, sebastian.reichel, rppt, aneesh.kumar, x86,
	linux-arm-kernel, linuxppc-dev, linux-s390, sparclinux,
	linux-arch, linux-mm, linux-kernel
On Tue, Jun 20, 2023 at 05:46:16PM +0300, Yair Podemsky wrote:
> Currently the tlb_remove_table_smp_sync IPI is sent to all CPUs
> indiscriminately, this causes unnecessary work and delays notable in
> real-time use-cases and isolated cpus.
> By limiting the IPI to only be sent to cpus referencing the effected
> mm.
> a config to differentiate architectures that support mm_cpumask from
> those that don't will allow safe usage of this feature.
> 
> changes from -v1:
> - Previous version included a patch to only send the IPI to CPU's with
> context_tracking in the kernel space, this was removed due to race 
> condition concerns.
> - for archs that do not maintain mm_cpumask the mask used should be
>  cpu_online_mask (Peter Zijlstra).
>  
Would it not be much better to fix the root cause? As noted last time,
there are patches that cure the THP abuse of this.
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-20 14:46 ` [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs Yair Podemsky
@ 2023-06-21 17:42   ` Dave Hansen
  2023-06-22 13:14     ` ypodemsk
  2023-06-21 18:02   ` Nadav Amit
  2023-07-03 13:57   ` Peter Zijlstra
  2 siblings, 1 reply; 16+ messages in thread
From: Dave Hansen @ 2023-06-21 17:42 UTC (permalink / raw)
  To: Yair Podemsky, mtosatti, ppandit, david, linux, mpe, npiggin,
	christophe.leroy, hca, gor, agordeev, borntraeger, svens, davem,
	tglx, mingo, bp, dave.hansen, hpa, keescook, paulmck, frederic,
	will, peterz, ardb, samitolvanen, juerg.haefliger, arnd,
	rmk+kernel, geert+renesas, linus.walleij, akpm, sebastian.reichel,
	rppt, aneesh.kumar, x86, linux-arm-kernel, linuxppc-dev,
	linux-s390, sparclinux, linux-arch, linux-mm, linux-kernel
On 6/20/23 07:46, Yair Podemsky wrote:
> -void tlb_remove_table_sync_one(void)
> +#ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
> +#define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
> +#else
> +#define REMOVE_TABLE_IPI_MASK cpu_online_mask
> +#endif /* CONFIG_ARCH_HAS_CPUMASK_BITS */
> +
> +void tlb_remove_table_sync_one(struct mm_struct *mm)
>  {
>  	/*
>  	 * This isn't an RCU grace period and hence the page-tables cannot be
> @@ -200,7 +206,8 @@ void tlb_remove_table_sync_one(void)
>  	 * It is however sufficient for software page-table walkers that rely on
>  	 * IRQ disabling.
>  	 */
> -	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
> +	on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
> +			NULL, true);
>  }
That "REMOVE_TABLE_IPI_MASK" thing is pretty confusing.  It *looks* like
a constant.  It does *NOT* look at all like it consumes 'mm'.  Worst
case, just create a local variable:
	if (IS_ENABLED(CONFIG_ARCH_HAS_CPUMASK_BITS))
		ipi_mask = mm_cpumask(mm);
	else
		ipi_mask = cpu_online_mask;
	on_each_cpu_mask(ipi_mask, ...);
That's a billion times more clear and it'll compile down to the same thing.
I do think the CONFIG_ARCH_HAS_CPUMASK_BITS naming is also pretty
confusing, but I don't have any better suggestions.  Maybe something
with "MM_CPUMASK" in it?
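
As a rough illustration of the local-variable form, the following is a
userspace sketch under stated assumptions, not the kernel implementation.
The types mm_model and cpumask_model_t, the MODEL_ARCH_HAS_MM_CPUMASK
switch, and table_sync_mask() are all hypothetical stand-ins for
struct mm_struct, a cpumask, IS_ENABLED(CONFIG_...), and the body of
tlb_remove_table_sync_one().

```c
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU. */
typedef uint64_t cpumask_model_t;

/* Hypothetical stand-in for struct mm_struct. */
struct mm_model {
	cpumask_model_t cpu_bitmap;	/* CPUs that have loaded this mm */
};

/* Assumption: 1 models an arch that keeps the per-mm mask up to date. */
#ifndef MODEL_ARCH_HAS_MM_CPUMASK
#define MODEL_ARCH_HAS_MM_CPUMASK 1
#endif

/* Assumption: CPUs 0-7 are online in this model. */
static const cpumask_model_t model_cpu_online_mask = 0xff;

/* The mask is picked in ordinary code, so the use of 'mm' is visible. */
static cpumask_model_t table_sync_mask(const struct mm_model *mm)
{
	cpumask_model_t ipi_mask;

	/* Compile-time constant condition; compilers typically drop the dead branch. */
	if (MODEL_ARCH_HAS_MM_CPUMASK)
		ipi_mask = mm->cpu_bitmap;
	else
		ipi_mask = model_cpu_online_mask;

	return ipi_mask;
}
```

Compared with a function-like use hidden behind an object-like macro, both
branches here are parsed and type-checked in every configuration.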
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-20 14:46 ` [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs Yair Podemsky
  2023-06-21 17:42   ` Dave Hansen
@ 2023-06-21 18:02   ` Nadav Amit
  2023-06-22 13:57     ` ypodemsk
  2023-06-23  3:38     ` Yang Shi
  2023-07-03 13:57   ` Peter Zijlstra
  2 siblings, 2 replies; 16+ messages in thread
From: Nadav Amit @ 2023-06-21 18:02 UTC (permalink / raw)
  To: Yair Podemsky
  Cc: mtosatti, ppandit, David Hildenbrand, Russell King (Oracle),
	Michael Ellerman, Nicholas Piggin, Christophe Leroy,
	Heiko Carstens, gor, agordeev, Christian Borntraeger, svens,
	David S. Miller, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Kees Cook, Paul E. McKenney,
	frederic, Will Deacon, Peter Zijlstra, ardb, samitolvanen,
	juerg.haefliger, Arnd Bergmann, rmk+kernel, geert+renesas,
	linus.walleij, Andrew Morton, sebastian.reichel, Mike Rapoport,
	aneesh.kumar, the arch/x86 maintainers, linux-arm-kernel,
	linuxppc-dev, linux-s390, sparclinux, linux-arch, linux-mm,
	linux-kernel
> 
> On Jun 20, 2023, at 7:46 AM, Yair Podemsky <ypodemsk@redhat.com> wrote:
> 
> @@ -1525,7 +1525,7 @@ static void collapse_and_free_pmd(struct mm_struct *mm, struct vm_area_struct *v
> 				addr + HPAGE_PMD_SIZE);
> 	mmu_notifier_invalidate_range_start(&range);
> 	pmd = pmdp_collapse_flush(vma, addr, pmdp);
> -	tlb_remove_table_sync_one();
> +	tlb_remove_table_sync_one(mm);
Can’t pmdp_collapse_flush() have one additional argument “freed_tables”
that it would propagate, for instance on x86 to flush_tlb_mm_range() ?
Then you would not need tlb_remove_table_sync_one() to issue an additional
IPI, no?
It just seems that you might still have 2 IPIs in many cases instead of
one, and unless I am missing something, I don’t see why.
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs
  2023-06-21  7:43 ` [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Peter Zijlstra
@ 2023-06-22 12:47   ` Marcelo Tosatti
  2023-07-03 14:09     ` Peter Zijlstra
  2023-06-22 13:11   ` ypodemsk
  1 sibling, 1 reply; 16+ messages in thread
From: Marcelo Tosatti @ 2023-06-22 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Yair Podemsky, ppandit, david, linux, mpe, npiggin,
	christophe.leroy, hca, gor, agordeev, borntraeger, svens, davem,
	tglx, mingo, bp, dave.hansen, hpa, keescook, paulmck, frederic,
	will, ardb, samitolvanen, juerg.haefliger, arnd, rmk+kernel,
	geert+renesas, linus.walleij, akpm, sebastian.reichel, rppt,
	aneesh.kumar, x86, linux-arm-kernel, linuxppc-dev, linux-s390,
	sparclinux, linux-arch, linux-mm, linux-kernel
On Wed, Jun 21, 2023 at 09:43:37AM +0200, Peter Zijlstra wrote:
> On Tue, Jun 20, 2023 at 05:46:16PM +0300, Yair Podemsky wrote:
> > Currently the tlb_remove_table_smp_sync IPI is sent to all CPUs
> > indiscriminately, this causes unnecessary work and delays notable in
> > real-time use-cases and isolated cpus.
> > By limiting the IPI to only be sent to cpus referencing the effected
> > mm.
> > a config to differentiate architectures that support mm_cpumask from
> > those that don't will allow safe usage of this feature.
> > 
> > changes from -v1:
> > - Previous version included a patch to only send the IPI to CPU's with
> > context_tracking in the kernel space, this was removed due to race 
> > condition concerns.
> > - for archs that do not maintain mm_cpumask the mask used should be
> >  cpu_online_mask (Peter Zijlstra).
> >  
> 
> Would it not be much better to fix the root cause? As per the last time,
> there's patches that cure the thp abuse of this.
The other case where the IPI can happen is:
CPU-0                                   CPU-1
tlb_remove_table
tlb_remove_table_sync_one
IPI
                                        local_irq_disable
                                        gup_fast
                                        local_irq_enable
So it's not only the THP case.
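
The interaction in the diagram can be sketched with a small single-threaded
model. This is only an illustration under simplifying assumptions: one
target CPU, no real concurrency, and hypothetical names (cpu_model,
model_send_ipi() and friends) that do not exist in the kernel. It shows
why a synchronous IPI cannot complete while the target is inside an
IRQ-disabled walk such as gup_fast.

```c
#include <stdbool.h>

/* Hypothetical model of one CPU's interrupt state. */
struct cpu_model {
	bool irqs_disabled;	/* inside a local_irq_disable() section */
	bool ipi_pending;	/* IPI raised but not yet handled */
};

/* An IPI is only handled once interrupts are enabled on the target. */
static void model_deliver_if_possible(struct cpu_model *cpu)
{
	if (cpu->ipi_pending && !cpu->irqs_disabled)
		cpu->ipi_pending = false;
}

static void model_send_ipi(struct cpu_model *cpu)
{
	cpu->ipi_pending = true;
	model_deliver_if_possible(cpu);
}

static void model_irq_disable(struct cpu_model *cpu)
{
	cpu->irqs_disabled = true;
}

static void model_irq_enable(struct cpu_model *cpu)
{
	cpu->irqs_disabled = false;
	model_deliver_if_possible(cpu);
}

/* The sender may free the page table only after the IPI was handled. */
static bool model_sync_complete(const struct cpu_model *cpu)
{
	return !cpu->ipi_pending;
}
```

If the walker disables interrupts before the IPI arrives, the sender keeps
waiting until the walker re-enables them, so the table is not freed under
the walker.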
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs
  2023-06-21  7:43 ` [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Peter Zijlstra
  2023-06-22 12:47   ` Marcelo Tosatti
@ 2023-06-22 13:11   ` ypodemsk
  1 sibling, 0 replies; 16+ messages in thread
From: ypodemsk @ 2023-06-22 13:11 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mtosatti, ppandit, david, linux, mpe, npiggin, christophe.leroy,
	hca, gor, agordeev, borntraeger, svens, davem, tglx, mingo, bp,
	dave.hansen, hpa, keescook, paulmck, frederic, will, ardb,
	samitolvanen, juerg.haefliger, arnd, rmk+kernel, geert+renesas,
	linus.walleij, akpm, sebastian.reichel, rppt, aneesh.kumar, x86,
	linux-arm-kernel, linuxppc-dev, linux-s390, sparclinux,
	linux-arch, linux-mm, linux-kernel
On Wed, 2023-06-21 at 09:43 +0200, Peter Zijlstra wrote:
> On Tue, Jun 20, 2023 at 05:46:16PM +0300, Yair Podemsky wrote:
> > Currently the tlb_remove_table_smp_sync IPI is sent to all CPUs
> > indiscriminately, this causes unnecessary work and delays notable in
> > real-time use-cases and isolated cpus.
> > By limiting the IPI to only be sent to cpus referencing the effected
> > mm.
> > a config to differentiate architectures that support mm_cpumask from
> > those that don't will allow safe usage of this feature.
> > 
> > changes from -v1:
> > - Previous version included a patch to only send the IPI to CPU's with
> > context_tracking in the kernel space, this was removed due to race 
> > condition concerns.
> > - for archs that do not maintain mm_cpumask the mask used should be
> >  cpu_online_mask (Peter Zijlstra).
> >  
> 
> Would it not be much better to fix the root cause? As per the last time,
> there's patches that cure the thp abuse of this.
> 
Hi Peter,
Thanks for your reply.
There are two code paths leading to this IPI. One is THP, but the
other is the failure to allocate a page in tlb_remove_table().
It is the second path that we are most interested in, as it was found
to cause interference in a real-time process for a client (that system
did not have THP).
So while curing the THP abuses is a good thing, it will unfortunately
not solve our root cause.
If you have any idea of how to remove the tlb_remove_table_sync_one()
usage in the tlb_remove_table()->tlb_remove_table_one() call path (the
usage that's relevant for us), that would be great. As long as we can't
remove that, I'm afraid all we can do is optimize it so that it does not
broadcast an IPI to all CPUs in the system, as done in this patch.
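
For readers unfamiliar with that second path, here is a simplified
userspace sketch of the fallback. It is an assumption-laden model, not the
code in mm/mmu_gather.c: gather_model, batch_model, MODEL_BATCH_MAX, the
fail_alloc test hook, and the sync_ipis counter are all hypothetical, and
free() stands in for both the deferred batch free and __tlb_remove_table().

```c
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_BATCH_MAX 4	/* hypothetical; the real batch fills a page */

struct batch_model {
	int nr;
	void *tables[MODEL_BATCH_MAX];
};

struct gather_model {
	struct batch_model *batch;
	int sync_ipis;		/* times the slow path had to synchronize */
	bool fail_alloc;	/* test hook: pretend the allocation failed */
};

/* Free every queued table (stands in for the deferred batch free). */
static void model_flush(struct gather_model *tlb)
{
	if (!tlb->batch)
		return;
	for (int i = 0; i < tlb->batch->nr; i++)
		free(tlb->batch->tables[i]);
	free(tlb->batch);
	tlb->batch = NULL;
}

static void model_remove_table(struct gather_model *tlb, void *table)
{
	if (!tlb->batch) {
		tlb->batch = tlb->fail_alloc ?
			NULL : calloc(1, sizeof(*tlb->batch));
		if (!tlb->batch) {
			/* Slow path: synchronize first, then free this table now. */
			tlb->sync_ipis++;
			free(table);
			return;
		}
	}

	/* Fast path: queue the table and flush once the batch is full. */
	tlb->batch->tables[tlb->batch->nr++] = table;
	if (tlb->batch->nr == MODEL_BATCH_MAX)
		model_flush(tlb);
}
```

Only the slow path bumps sync_ipis, which corresponds to the point where
the series narrows the set of CPUs that get interrupted.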
Thanks,
Yair
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-21 17:42   ` Dave Hansen
@ 2023-06-22 13:14     ` ypodemsk
  2023-06-22 13:37       ` Dave Hansen
  0 siblings, 1 reply; 16+ messages in thread
From: ypodemsk @ 2023-06-22 13:14 UTC (permalink / raw)
  To: Dave Hansen, mtosatti, ppandit, david, linux, mpe, npiggin,
	christophe.leroy, hca, gor, agordeev, borntraeger, svens, davem,
	tglx, mingo, bp, dave.hansen, hpa, keescook, paulmck, frederic,
	will, peterz, ardb, samitolvanen, juerg.haefliger, arnd,
	rmk+kernel, geert+renesas, linus.walleij, akpm, sebastian.reichel,
	rppt, aneesh.kumar, x86, linux-arm-kernel, linuxppc-dev,
	linux-s390, sparclinux, linux-arch, linux-mm, linux-kernel
On Wed, 2023-06-21 at 10:42 -0700, Dave Hansen wrote:
> On 6/20/23 07:46, Yair Podemsky wrote:
> > -void tlb_remove_table_sync_one(void)
> > +#ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
> > +#define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
> > +#else
> > +#define REMOVE_TABLE_IPI_MASK cpu_online_mask
> > +#endif /* CONFIG_ARCH_HAS_CPUMASK_BITS */
> > +
> > +void tlb_remove_table_sync_one(struct mm_struct *mm)
> >  {
> >  	/*
> >  	 * This isn't an RCU grace period and hence the page-tables cannot be
> > @@ -200,7 +206,8 @@ void tlb_remove_table_sync_one(void)
> >  	 * It is however sufficient for software page-table walkers that rely on
> >  	 * IRQ disabling.
> >  	 */
> > -	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
> > +	on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
> > +			NULL, true);
> >  }
> 
> That "REMOVE_TABLE_IPI_MASK" thing is pretty confusing.  It *looks* like
> a constant.  It does *NOT* look at all like it consumes 'mm'.  Worst
> case, just create a local variable:
> 
> 	if (IS_ENABLED(CONFIG_ARCH_HAS_CPUMASK_BITS))
> 		ipi_mask = mm_cpumask(mm);
> 	else
> 		ipi_mask = cpu_online_mask;
> 
> 	on_each_cpu_mask(ipi_mask, ...);
> 
> That's a billion times more clear and it'll compile down to the same thing.
> 
> I do think the CONFIG_ARCH_HAS_CPUMASK_BITS naming is also pretty
> confusing, but I don't have any better suggestions.  Maybe something
> with "MM_CPUMASK" in it?
> 
Hi Dave,
Thanks for your suggestions!
I will send a new version with the local variable as you suggested
soon.
As for the config name, what about CONFIG_ARCH_HAS_MM_CPUMASK?
Thanks,
Yair
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-22 13:14     ` ypodemsk
@ 2023-06-22 13:37       ` Dave Hansen
  2023-06-26 14:36         ` ypodemsk
  0 siblings, 1 reply; 16+ messages in thread
From: Dave Hansen @ 2023-06-22 13:37 UTC (permalink / raw)
  To: ypodemsk, mtosatti, ppandit, david, linux, mpe, npiggin,
	christophe.leroy, hca, gor, agordeev, borntraeger, svens, davem,
	tglx, mingo, bp, dave.hansen, hpa, keescook, paulmck, frederic,
	will, peterz, ardb, samitolvanen, juerg.haefliger, arnd,
	rmk+kernel, geert+renesas, linus.walleij, akpm, sebastian.reichel,
	rppt, aneesh.kumar, x86, linux-arm-kernel, linuxppc-dev,
	linux-s390, sparclinux, linux-arch, linux-mm, linux-kernel
On 6/22/23 06:14, ypodemsk@redhat.com wrote:
> I will send a new version with the local variable as you suggested
> soon.
> As for the config name, what about CONFIG_ARCH_HAS_MM_CPUMASK?
The confusing part about that name is that mm_cpumask() and
mm->cpu_bitmap[] are defined unconditionally.  So, they're *around*
unconditionally but just aren't updated.
BTW, it would also be nice to have _some_ kind of data behind this patch.
Fewer IPIs are better I guess, but it would still be nice if you could say:
	Before this patch, /proc/interrupts showed 123 IPIs/hour for an
	isolated CPU.  After the approach here, it was 0.
... or something.
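
One way to collect such a number is to sample the function-call interrupt
row of /proc/interrupts before and after a test run. The helper below is
a sketch only: irq_count() is a hypothetical name, and the assumption that
the relevant row is labelled "CAL:" holds on x86 but not necessarily on
other architectures. It parses text already read into memory rather than
opening the file itself.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the count in column 'cpu' of the row labelled 'label'
 * (for example "CAL:") in /proc/interrupts-style text, or -1 if the
 * row or the column is missing.
 */
static long irq_count(const char *text, const char *label, int cpu)
{
	const char *line = text;
	size_t len = strlen(label);

	while (line && *line) {
		const char *p = line + strspn(line, " \t");

		if (strncmp(p, label, len) == 0) {
			p += len;
			for (int col = 0; ; col++) {
				char *end;
				long val;

				/* Stay on this line: skip blanks, not newlines. */
				p += strspn(p, " \t");
				if (!isdigit((unsigned char)*p))
					return -1;	/* no more numeric columns */
				val = strtol(p, &end, 10);
				if (col == cpu)
					return val;
				p = end;
			}
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

Calling it on two snapshots taken an hour apart and subtracting the values
for the isolated CPU gives the kind of per-hour figure asked for above.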
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-21 18:02   ` Nadav Amit
@ 2023-06-22 13:57     ` ypodemsk
  2023-06-23  3:38     ` Yang Shi
  1 sibling, 0 replies; 16+ messages in thread
From: ypodemsk @ 2023-06-22 13:57 UTC (permalink / raw)
  To: Nadav Amit
  Cc: mtosatti, ppandit, David Hildenbrand, Russell King (Oracle),
	Michael Ellerman, Nicholas Piggin, Christophe Leroy,
	Heiko Carstens, gor, agordeev, Christian Borntraeger, svens,
	David S. Miller, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, H. Peter Anvin, Kees Cook, Paul E. McKenney,
	frederic, Will Deacon, Peter Zijlstra, ardb, samitolvanen,
	juerg.haefliger, Arnd Bergmann, rmk+kernel, geert+renesas,
	linus.walleij, Andrew Morton, sebastian.reichel, Mike Rapoport,
	aneesh.kumar, the arch/x86 maintainers, linux-arm-kernel,
	linuxppc-dev, linux-s390, sparclinux, linux-arch, linux-mm,
	linux-kernel
On Wed, 2023-06-21 at 11:02 -0700, Nadav Amit wrote:
> > On Jun 20, 2023, at 7:46 AM, Yair Podemsky <ypodemsk@redhat.com> wrote:
> > 
> > @@ -1525,7 +1525,7 @@ static void collapse_and_free_pmd(struct mm_struct *mm, struct vm_area_struct *v
> > 				addr + HPAGE_PMD_SIZE);
> > 	mmu_notifier_invalidate_range_start(&range);
> > 	pmd = pmdp_collapse_flush(vma, addr, pmdp);
> > -	tlb_remove_table_sync_one();
> > +	tlb_remove_table_sync_one(mm);
> 
> Can’t pmdp_collapse_flush() have one additional argument “freed_tables”
> that it would propagate, for instance on x86 to flush_tlb_mm_range() ?
> Then you would not need tlb_remove_table_sync_one() to issue an additional
> IPI, no?
> 
> It just seems that you might still have 2 IPIs in many cases instead of
> one, and unless I am missing something, I don’t see why.
> 
Hi Nadav,
Thanks for your comment.
I think you are right, and in some configurations two IPIs might occur.
However, I am not really dealing with the THP code at the moment.
This patch is about mmu_gather and mostly deals with IPIs sent via the
other code path.
Thanks,
Yair
^ permalink raw reply	[flat|nested] 16+ messages in thread
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-21 18:02   ` Nadav Amit
  2023-06-22 13:57     ` ypodemsk
@ 2023-06-23  3:38     ` Yang Shi
  1 sibling, 0 replies; 16+ messages in thread
From: Yang Shi @ 2023-06-23  3:38 UTC (permalink / raw)
  To: Nadav Amit, Jann Horn
  Cc: Yair Podemsky, mtosatti, ppandit, David Hildenbrand,
	Russell King (Oracle), Michael Ellerman, Nicholas Piggin,
	Christophe Leroy, Heiko Carstens, gor, agordeev,
	Christian Borntraeger, svens, David S. Miller, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, H. Peter Anvin,
	Kees Cook, Paul E. McKenney, frederic, Will Deacon,
	Peter Zijlstra, ardb, samitolvanen, juerg.haefliger,
	Arnd Bergmann, rmk+kernel, geert+renesas, linus.walleij,
	Andrew Morton, sebastian.reichel, Mike Rapoport, aneesh.kumar,
	the arch/x86 maintainers, linux-arm-kernel, linuxppc-dev,
	linux-s390, sparclinux, linux-arch, linux-mm, linux-kernel
On Wed, Jun 21, 2023 at 11:02 AM Nadav Amit <nadav.amit@gmail.com> wrote:
>
> >
> > On Jun 20, 2023, at 7:46 AM, Yair Podemsky <ypodemsk@redhat.com> wrote:
> >
> > @@ -1525,7 +1525,7 @@ static void collapse_and_free_pmd(struct mm_struct *mm, struct vm_area_struct *v
> >                               addr + HPAGE_PMD_SIZE);
> >       mmu_notifier_invalidate_range_start(&range);
> >       pmd = pmdp_collapse_flush(vma, addr, pmdp);
> > -     tlb_remove_table_sync_one();
> > +     tlb_remove_table_sync_one(mm);
>
> Can’t pmdp_collapse_flush() have one additional argument “freed_tables”
> that it would propagate, for instance on x86 to flush_tlb_mm_range() ?
> Then you would not need tlb_remove_table_sync_one() to issue an additional
> IPI, no?
>
> It just seems that you might still have 2 IPIs in many cases instead of
> one, and unless I am missing something, I don’t see why.
The tlb_remove_table_sync_one() call is used to serialize against fast
GUP on architectures which don't broadcast TLB flushes by IPI, for
example arm64. It may incur one extra IPI on x86 and some others, but
x86 virtualization needs this since, IIUC, the guest may not flush the
TLB by sending an IPI. So if the one extra IPI is really a problem, we
may be able to define an arch-specific function to deal with it, for
example a pv op, off the top of my head. But I'm not a virtualization
expert, so I'm not entirely sure whether that is the best way or not.
The added complexity seems like overkill TBH, since khugepaged is
usually not called that often.
>
>
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-22 13:37       ` Dave Hansen
@ 2023-06-26 14:36         ` ypodemsk
  2023-06-26 15:23           ` Dave Hansen
  0 siblings, 1 reply; 16+ messages in thread
From: ypodemsk @ 2023-06-26 14:36 UTC (permalink / raw)
  To: Dave Hansen, mtosatti, ppandit, david, linux, mpe, npiggin,
	christophe.leroy, hca, gor, agordeev, borntraeger, svens, davem,
	tglx, mingo, bp, dave.hansen, hpa, keescook, paulmck, frederic,
	will, peterz, ardb, samitolvanen, juerg.haefliger, arnd,
	rmk+kernel, geert+renesas, linus.walleij, akpm, sebastian.reichel,
	rppt, aneesh.kumar, x86, linux-arm-kernel, linuxppc-dev,
	linux-s390, sparclinux, linux-arch, linux-mm, linux-kernel
On Thu, 2023-06-22 at 06:37 -0700, Dave Hansen wrote:
> On 6/22/23 06:14, ypodemsk@redhat.com wrote:
> > I will send a new version with the local variable as you suggested
> > soon.
> > As for the config name, what about CONFIG_ARCH_HAS_MM_CPUMASK?
> 
> The confusing part about that name is that mm_cpumask() and
> mm->cpu_bitmap[] are defined unconditionally.  So, they're *around*
> unconditionally but just aren't updated.
> 
I think you're right about the config name.
How about
CONFIG_ARCH_USE_MM_CPUMASK?
This has the right semantics, as these archs use the cpumask field of
the mm struct.
> BTW, it would also be nice to have _some_ kind of data behind this
> patch.
> 
> Fewer IPIs are better I guess, but it would still be nice if you
> could say:
> 
> 	Before this patch, /proc/interrupts showed 123 IPIs/hour for an
> 	isolated CPU.  After the approach here, it was 0.
> 
> ... or something.
This is part of an ongoing effort to remove IPIs and this one was found
via code inspection.
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-26 14:36         ` ypodemsk
@ 2023-06-26 15:23           ` Dave Hansen
  0 siblings, 0 replies; 16+ messages in thread
From: Dave Hansen @ 2023-06-26 15:23 UTC (permalink / raw)
  To: ypodemsk, mtosatti, ppandit, david, linux, mpe, npiggin,
	christophe.leroy, hca, gor, agordeev, borntraeger, svens, davem,
	tglx, mingo, bp, dave.hansen, hpa, keescook, paulmck, frederic,
	will, peterz, ardb, samitolvanen, juerg.haefliger, arnd,
	rmk+kernel, geert+renesas, linus.walleij, akpm, sebastian.reichel,
	rppt, aneesh.kumar, x86, linux-arm-kernel, linuxppc-dev,
	linux-s390, sparclinux, linux-arch, linux-mm, linux-kernel
On 6/26/23 07:36, ypodemsk@redhat.com wrote:
> On Thu, 2023-06-22 at 06:37 -0700, Dave Hansen wrote:
>> On 6/22/23 06:14, ypodemsk@redhat.com wrote:
>>> I will send a new version with the local variable as you suggested
>>> soon.
>>> As for the config name, what about CONFIG_ARCH_HAS_MM_CPUMASK?
>>
>> The confusing part about that name is that mm_cpumask() and
>> mm->cpu_bitmap[] are defined unconditionally.  So, they're *around*
>> unconditionally but just aren't updated.
>>
> I think you're right about the config name.
> How about
> CONFIG_ARCH_USE_MM_CPUMASK?
> This has the right semantics, as these archs use the cpumask field of
> the mm struct.
"USE" is still a command.  It should, at worst, be "USES".  But that's
still kinda generic.  How about:
	CONFIG_ARCH_UPDATES_MM_CPUMASK
?
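For illustration, the mask selection discussed here can be written with
a local variable instead of a macro. This is a user-space sketch, not the
actual patch: struct mm_model, remove_table_ipi_mask() and the
arch_updates_mm_cpumask flag are hypothetical stand-ins for struct
mm_struct, the body of tlb_remove_table_sync_one() and the config symbol,
whose name was still under discussion at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU mask: bit n set means CPU n is in the mask (hypothetical model). */
typedef uint8_t cpumask_model_t;

/* Stand-in for struct mm_struct; only the per-mm CPU bitmap is modelled. */
struct mm_model {
	cpumask_model_t cpu_bitmap; /* only trustworthy if the arch updates it */
};

/*
 * Pick the IPI target mask with a local variable. Architectures that do
 * not keep the per-mm mask up to date fall back to all online CPUs.
 */
static cpumask_model_t remove_table_ipi_mask(const struct mm_model *mm,
					     bool arch_updates_mm_cpumask,
					     cpumask_model_t online)
{
	cpumask_model_t mask = online;

	if (arch_updates_mm_cpumask)
		mask = mm->cpu_bitmap;

	return mask;
}

/* Small self-check of the model: 4 online CPUs, mm only ran on CPU 2. */
static void mask_model_self_check(void)
{
	struct mm_model mm = { .cpu_bitmap = 0x04 };

	assert(remove_table_ipi_mask(&mm, true, 0x0F) == 0x04);
	assert(remove_table_ipi_mask(&mm, false, 0x0F) == 0x0F);
}
```

The fallback branch is the conservative one: when the per-mm mask cannot
be trusted, every online CPU is still targeted, which matches the
behaviour before the patch.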
>> BTW, it would also be nice to have _some_ kind of data behind this
>> patch.
>>
>> Fewer IPIs are better I guess, but it would still be nice if you
>> could say:
>>
>> 	Before this patch, /proc/interrupts showed 123 IPIs/hour for an
>> 	isolated CPU.  After the approach here, it was 0.
>>
>> ... or something.
> 
> This is part of an ongoing effort to remove IPIs and this one was found
> via code inspection.
OK, so it should be something more like:
	This was found via code inspection, but fixing it isn't very
	important so we didn't bother to test it any more than just
	making sure the thing still boots when it is applied.
Does that cover it?
* Re: [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs
  2023-06-20 14:46 ` [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs Yair Podemsky
  2023-06-21 17:42   ` Dave Hansen
  2023-06-21 18:02   ` Nadav Amit
@ 2023-07-03 13:57   ` Peter Zijlstra
  2 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2023-07-03 13:57 UTC (permalink / raw)
  To: Yair Podemsky
  Cc: mtosatti, ppandit, david, linux, mpe, npiggin, christophe.leroy,
	hca, gor, agordeev, borntraeger, svens, davem, tglx, mingo, bp,
	dave.hansen, hpa, keescook, paulmck, frederic, will, ardb,
	samitolvanen, juerg.haefliger, arnd, rmk+kernel, geert+renesas,
	linus.walleij, akpm, sebastian.reichel, rppt, aneesh.kumar, x86,
	linux-arm-kernel, linuxppc-dev, linux-s390, sparclinux,
	linux-arch, linux-mm, linux-kernel
On Tue, Jun 20, 2023 at 05:46:18PM +0300, Yair Podemsky wrote:
> @@ -191,7 +191,13 @@ static void tlb_remove_table_smp_sync(void *arg)
>  	/* Simply deliver the interrupt */
>  }
>  
> -void tlb_remove_table_sync_one(void)
> +#ifdef CONFIG_ARCH_HAS_CPUMASK_BITS
> +#define REMOVE_TABLE_IPI_MASK mm_cpumask(mm)
> +#else
> +#define REMOVE_TABLE_IPI_MASK cpu_online_mask
> +#endif /* CONFIG_ARCH_HAS_CPUMASK_BITS */
> +
> +void tlb_remove_table_sync_one(struct mm_struct *mm)
>  {
>  	/*
>  	 * This isn't an RCU grace period and hence the page-tables cannot be
> @@ -200,7 +206,8 @@ void tlb_remove_table_sync_one(void)
>  	 * It is however sufficient for software page-table walkers that rely on
>  	 * IRQ disabling.
>  	 */
> -	smp_call_function(tlb_remove_table_smp_sync, NULL, 1);
> +	on_each_cpu_mask(REMOVE_TABLE_IPI_MASK, tlb_remove_table_smp_sync,
> +			NULL, true);
Aside from what Dave said about the REMOVE_TABLE_IPI_MASK thing, this
isn't right.
on_each_cpu_mask() includes the current cpu, while smp_call_function()
explicitly does not.
Yes, they all end up in smp_call_function_many_cond(), but the
on_each_cpu*() family will have SCF_RUN_LOCAL set, while the
smp_call_function*() family will not.
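The difference can be illustrated with a small user-space model. This is
only a sketch, not kernel code: the 8-bit CPU mask and the helper names
targets_on_each_cpu_mask() and targets_smp_call_function_many() are
hypothetical, and only the rule about the local CPU is taken from the
explanation above. In the kernel, smp_call_function_many() is the
mask-taking member of the smp_call_function*() family, so it also skips
the calling CPU.

```c
#include <assert.h>
#include <stdint.h>

/* Toy CPU mask: bit n set means CPU n is in the mask (hypothetical model). */
typedef uint8_t cpumask_model_t;

/*
 * Model of the on_each_cpu*() family (SCF_RUN_LOCAL set): every CPU in
 * the mask runs the callback, including the caller if it is in the mask.
 */
static cpumask_model_t targets_on_each_cpu_mask(cpumask_model_t mask,
						unsigned int this_cpu)
{
	(void)this_cpu; /* the caller is not filtered out */
	return mask;
}

/*
 * Model of the smp_call_function*() family (SCF_RUN_LOCAL not set): the
 * calling CPU is always excluded from the targets.
 */
static cpumask_model_t targets_smp_call_function_many(cpumask_model_t mask,
						       unsigned int this_cpu)
{
	return mask & (cpumask_model_t)~(1u << this_cpu);
}

/* Small self-check: mask 0x0B is CPUs 0, 1 and 3; the caller is CPU 1. */
static void ipi_model_self_check(void)
{
	assert(targets_on_each_cpu_mask(0x0B, 1) == 0x0B);
	assert(targets_smp_call_function_many(0x0B, 1) == 0x09);
}
```

In this model the two families only differ when the calling CPU is in
the mask, which is the common case when the mask is the mm's own CPU
mask.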
* Re: [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs
  2023-06-22 12:47   ` Marcelo Tosatti
@ 2023-07-03 14:09     ` Peter Zijlstra
  0 siblings, 0 replies; 16+ messages in thread
From: Peter Zijlstra @ 2023-07-03 14:09 UTC (permalink / raw)
  To: Marcelo Tosatti
  Cc: Yair Podemsky, ppandit, david, linux, mpe, npiggin,
	christophe.leroy, hca, gor, agordeev, borntraeger, svens, davem,
	tglx, mingo, bp, dave.hansen, hpa, keescook, paulmck, frederic,
	will, ardb, samitolvanen, juerg.haefliger, arnd, rmk+kernel,
	geert+renesas, linus.walleij, akpm, sebastian.reichel, rppt,
	aneesh.kumar, x86, linux-arm-kernel, linuxppc-dev, linux-s390,
	sparclinux, linux-arch, linux-mm, linux-kernel
On Thu, Jun 22, 2023 at 09:47:22AM -0300, Marcelo Tosatti wrote:
> > there's patches that cure the thp abuse of this.
> 
> The other case where the IPI can happen is:
> 
> CPU-0                                   CPU-1
> 
> tlb_remove_table
> tlb_remove_table_sync_one
> IPI
>                                         local_irq_disable
>                                         gup_fast
>                                         local_irq_enable
> 
> 
> So its not only the THP case.
(your CPU-1 thing is wholly irrelevant)
Well, I know, but this case *should* be exceedingly rare. Last time
around I asked why you all were tripping this, you pointed at the THP
case.
The THP case should be fixed along the lines of Jann's original patches.
If you can trip this at any significant rate, then we should probably
look at a better allocation scheme. It means you're really low on
memory.
Thread overview: 16+ messages
2023-06-20 14:46 [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Yair Podemsky
2023-06-20 14:46 ` [PATCH v2 1/2] arch: Introduce ARCH_HAS_CPUMASK_BITS Yair Podemsky
2023-06-20 14:46 ` [PATCH v2 2/2] mm/mmu_gather: send tlb_remove_table_smp_sync IPI only to MM CPUs Yair Podemsky
2023-06-21 17:42   ` Dave Hansen
2023-06-22 13:14     ` ypodemsk
2023-06-22 13:37       ` Dave Hansen
2023-06-26 14:36         ` ypodemsk
2023-06-26 15:23           ` Dave Hansen
2023-06-21 18:02   ` Nadav Amit
2023-06-22 13:57     ` ypodemsk
2023-06-23  3:38     ` Yang Shi
2023-07-03 13:57   ` Peter Zijlstra
2023-06-21  7:43 ` [PATCH v2 0/2] send tlb_remove_table_smp_sync IPI only to necessary CPUs Peter Zijlstra
2023-06-22 12:47   ` Marcelo Tosatti
2023-07-03 14:09     ` Peter Zijlstra
2023-06-22 13:11   ` ypodemsk