* [RFC / PoC v1 0/1] powerpc: Add support for batched unmap TLB flush
@ 2024-09-22 12:46 Ritesh Harjani (IBM)
2024-09-22 12:46 ` [RFC / PoC v1 1/1] " Ritesh Harjani (IBM)
0 siblings, 1 reply; 2+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-09-22 12:46 UTC (permalink / raw)
To: linuxppc-dev
Cc: Michael Ellerman, Madhavan Srinivasan, Nicholas Piggin,
Aneesh Kumar K . V, LKML, Christophe Leroy, Ritesh Harjani (IBM)
Hello All,
This is a quick PoC to add ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH support to
powerpc for book3s64. The ISA, in section 6.10 "Translation Table Update
Synchronization Requirements", says that the architecture allows translation
cache invalidation to be optimized by doing it in bulk, later, after the PTE
changes have been made.
That means that if we add ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH support, the page
reclaim and page migration paths can use their existing optimization of
deferring the TLB invalidations and doing them in bulk after all the page unmap
operations have been completed.
This is a quick PoC for the same. Note that this may not be a complete patch yet:
TLB handling on Power is already complex on the hardware side :) and many
optimizations are also done in software (e.g. exit_lazy_flush_tlb to avoid tlbies).
But since the current patch looked somewhat sane to me, I wanted to share it to get
early feedback from people who are well versed with this side of the code.
Meanwhile, I have many TODOs to look into, which I am working on in parallel.
Later I will also get some benchmarks w.r.t. promotion / demotion.
I ran a micro-benchmark which was shared in the commits that added this
support on other archs. I can see some good initial improvements.
without patch (perf report showing 7% in radix__flush_tlb_page_psize, even with
single thread)
==================
root# time ./a.out
real 0m23.538s
user 0m0.191s
sys 0m5.270s
# Overhead Command Shared Object Symbol
# ........ ....... .......................... .............................................
#
7.19% a.out [kernel.vmlinux] [k] radix__flush_tlb_page_psize
5.63% a.out [kernel.vmlinux] [k] _raw_spin_lock
3.21% a.out a.out [.] main
2.93% a.out [kernel.vmlinux] [k] page_counter_cancel
2.58% a.out [kernel.vmlinux] [k] page_counter_try_charge
2.56% a.out [kernel.vmlinux] [k] _raw_spin_lock_irq
2.30% a.out [kernel.vmlinux] [k] try_to_unmap_one
with patch
============
root# time ./a.out
real 0m8.593s
user 0m0.064s
sys 0m1.610s
# Overhead Command Shared Object Symbol
# ........ ....... .......................... .............................................
#
5.10% a.out [kernel.vmlinux] [k] _raw_spin_lock
3.55% a.out [kernel.vmlinux] [k] __mod_memcg_lruvec_state
3.13% a.out a.out [.] main
3.00% a.out [kernel.vmlinux] [k] page_counter_try_charge
2.62% a.out [kernel.vmlinux] [k] _raw_spin_lock_irq
2.58% a.out [kernel.vmlinux] [k] page_counter_cancel
2.22% a.out [kernel.vmlinux] [k] try_to_unmap_one
<micro-benchmark>
====================
#include <string.h>
#include <sys/mman.h>

#define PAGESIZE 65536
#define SIZE (1 * 1024 * 1024 * 10)

int main(void)
{
	volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
					 MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return 1;

	/* populate every page once */
	memset((void *)p, 0x88, SIZE);

	for (int k = 0; k < 10000; k++) {
		/* swap in */
		for (int i = 0; i < SIZE; i += PAGESIZE) {
			(void)p[i];
		}

		/* swap out */
		madvise((void *)p, SIZE, MADV_PAGEOUT);
	}

	return 0;
}
Ritesh Harjani (IBM) (1):
powerpc: Add support for batched unmap TLB flush
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/book3s/64/tlbflush.h | 5 +++
arch/powerpc/include/asm/tlbbatch.h | 14 ++++++++
arch/powerpc/mm/book3s64/radix_tlb.c | 32 +++++++++++++++++++
4 files changed, 52 insertions(+)
create mode 100644 arch/powerpc/include/asm/tlbbatch.h
--
2.46.0
* [RFC / PoC v1 1/1] powerpc: Add support for batched unmap TLB flush
2024-09-22 12:46 [RFC / PoC v1 0/1] powerpc: Add support for batched unmap TLB flush Ritesh Harjani (IBM)
@ 2024-09-22 12:46 ` Ritesh Harjani (IBM)
0 siblings, 0 replies; 2+ messages in thread
From: Ritesh Harjani (IBM) @ 2024-09-22 12:46 UTC (permalink / raw)
To: linuxppc-dev
Cc: Michael Ellerman, Madhavan Srinivasan, Nicholas Piggin,
Aneesh Kumar K . V, LKML, Christophe Leroy, Ritesh Harjani (IBM)
=== NOT FOR MERGE YET ===
This adds the support for ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH.
More details are added to the cover letter.
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/book3s/64/tlbflush.h | 5 +++
arch/powerpc/include/asm/tlbbatch.h | 14 ++++++++
arch/powerpc/mm/book3s64/radix_tlb.c | 32 +++++++++++++++++++
4 files changed, 52 insertions(+)
create mode 100644 arch/powerpc/include/asm/tlbbatch.h
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 0b8b2e3a6381..c3a23c1894dd 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -171,6 +171,7 @@ config PPC
select ARCH_USE_CMPXCHG_LOCKREF if PPC64
select ARCH_USE_MEMTEST
select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS
+ select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH if PPC64 && PPC_BOOK3S_64
select ARCH_WANT_DEFAULT_BPF_JIT
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
select ARCH_WANT_IPC_PARSE_VERSION
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index fd642b729775..f872537715e7 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -222,4 +222,9 @@ static inline bool cputlb_use_tlbie(void)
return tlbie_enabled;
}
+bool arch_tlbbatch_should_defer(struct mm_struct *mm);
+void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
+ struct mm_struct *mm, unsigned long uaddr);
+void arch_flush_tlb_batched_pending(struct mm_struct *mm);
+void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch);
#endif /* _ASM_POWERPC_BOOK3S_64_TLBFLUSH_H */
diff --git a/arch/powerpc/include/asm/tlbbatch.h b/arch/powerpc/include/asm/tlbbatch.h
new file mode 100644
index 000000000000..fa738462a242
--- /dev/null
+++ b/arch/powerpc/include/asm/tlbbatch.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2024 IBM Corporation.
+ */
+#ifndef _ASM_POWERPC_TLBBATCH_H
+#define _ASM_POWERPC_TLBBATCH_H
+
+#include <linux/cpumask.h>
+
+struct arch_tlbflush_unmap_batch {
+ struct cpumask cpumask;
+};
+
+#endif /* _ASM_POWERPC_TLBBATCH_H */
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 9e1f6558d026..2b1b2f7429fc 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -11,6 +11,7 @@
#include <linux/mmu_context.h>
#include <linux/sched/mm.h>
#include <linux/debugfs.h>
+#include <linux/smp.h>
#include <asm/ppc-opcode.h>
#include <asm/tlb.h>
@@ -1585,3 +1586,34 @@ static int __init create_tlb_single_page_flush_ceiling(void)
}
late_initcall(create_tlb_single_page_flush_ceiling);
+#ifdef CONFIG_ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
+bool arch_tlbbatch_should_defer(struct mm_struct *mm)
+{
+ if (!radix_enabled())
+ return false;
+ return true;
+}
+
+void arch_tlbbatch_add_pending(struct arch_tlbflush_unmap_batch *batch,
+ struct mm_struct *mm,
+ unsigned long uaddr)
+{
+ cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
+}
+
+void arch_flush_tlb_batched_pending(struct mm_struct *mm)
+{
+ flush_tlb_mm(mm);
+}
+
+static inline void tlbiel_flush_all_lpid(void *arg)
+{
+ tlbiel_all_lpid(true);
+}
+
+void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
+{
+ on_each_cpu_mask(&batch->cpumask, tlbiel_flush_all_lpid, NULL, 1);
+ cpumask_clear(&batch->cpumask);
+}
+#endif
--
2.46.0