* [PATCH] arm64: tlbi: Set MAX_TLBI_OPS to PTRS_PER_PTE
From: Will Deacon @ 2018-11-26 17:01 UTC
To: linux-arm-kernel

In order to reduce the possibility of soft lock-ups, we bound the
maximum number of TLBI operations performed by a single call to
flush_tlb_range() to an arbitrary constant of 1024.

Whilst this does the job of avoiding lock-ups, we can actually be a bit
smarter by defining this as PTRS_PER_PTE. Due to the structure of our
page tables, using PTRS_PER_PTE means that an outer loop calling
flush_tlb_range() for entire table entries will end up performing just a
single TLBI operation for each entry. As an example, mremap()ing a 1GB
range mapped using 4k pages now requires only 512 TLBI operations when
moving the page tables as opposed to 262144 operations (512*512) when
using the current threshold of 1024.

Cc: Joel Fernandes <joel@joelfernandes.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
arch/arm64/include/asm/tlbflush.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index c3c0387aee18..460fdd69ad5b 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -179,7 +179,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
  * This is meant to avoid soft lock-ups on large TLB flushing ranges and not
  * necessarily a performance improvement.
  */
-#define MAX_TLBI_OPS	1024UL
+#define MAX_TLBI_OPS	PTRS_PER_PTE
 
 static inline void __flush_tlb_range(struct vm_area_struct *vma,
 				     unsigned long start, unsigned long end,
@@ -188,7 +188,7 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	unsigned long asid = ASID(vma->vm_mm);
 	unsigned long addr;
 
-	if ((end - start) > (MAX_TLBI_OPS * stride)) {
+	if ((end - start) >= (MAX_TLBI_OPS * stride)) {
 		flush_tlb_mm(vma->vm_mm);
 		return;
 	}
--
2.1.4
* [PATCH] arm64: tlbi: Set MAX_TLBI_OPS to PTRS_PER_PTE
From: Catalin Marinas @ 2018-11-27 18:57 UTC
To: linux-arm-kernel

On Mon, Nov 26, 2018 at 05:01:07PM +0000, Will Deacon wrote:
> In order to reduce the possibility of soft lock-ups, we bound the
> maximum number of TLBI operations performed by a single call to
> flush_tlb_range() to an arbitrary constant of 1024.
>
> Whilst this does the job of avoiding lock-ups, we can actually be a bit
> smarter by defining this as PTRS_PER_PTE. Due to the structure of our
> page tables, using PTRS_PER_PTE means that an outer loop calling
> flush_tlb_range() for entire table entries will end up performing just a
> single TLBI operation for each entry. As an example, mremap()ing a 1GB
> range mapped using 4k pages now requires only 512 TLBI operations when
> moving the page tables as opposed to 262144 operations (512*512) when
> using the current threshold of 1024.

To be more precise, we'd have 512 TLBI ASIDE1IS vs 262144 TLBI VAE1IS
(or VALE1IS). But since it only affects the given ASID, I don't think it
matters.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
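
The operation counts quoted in this thread can be reproduced with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: it assumes 4k pages with 512 entries per table, an outer loop that
flushes one page-table-sized (2MB) chunk per call, and it counts a fallback
to flush_tlb_mm() as one by-ASID operation. The function names
tlbi_ops_for_range() and total_tlbi_ops() are hypothetical, as is the
"inclusive" flag used to select between the old ">" and new ">=" comparison.

```c
/* Assumed configuration: 4k pages, 8-byte entries, 512 PTEs per table. */
#define PAGE_SIZE     4096UL
#define PTRS_PER_PTE  512UL
#define TABLE_SPAN    (PTRS_PER_PTE * PAGE_SIZE)   /* 2MB per PTE table */

/*
 * Hypothetical model of the range check in __flush_tlb_range().
 * Returns the number of TLBI operations a single call would issue:
 * 1 if it falls back to a whole-ASID flush, otherwise one per stride.
 */
static unsigned long tlbi_ops_for_range(unsigned long start,
					unsigned long end,
					unsigned long stride,
					unsigned long max_ops,
					int inclusive)
{
	unsigned long len = end - start;
	unsigned long limit = max_ops * stride;

	if (inclusive ? (len >= limit) : (len > limit))
		return 1;			/* by-ASID invalidation */

	return (len + stride - 1) / stride;	/* one per-VA op per stride */
}

/*
 * Hypothetical outer loop: flush a region one PTE table at a time and
 * add up the operations issued across all calls.
 */
unsigned long total_tlbi_ops(unsigned long size, unsigned long max_ops,
			     int inclusive)
{
	unsigned long addr, total = 0;

	for (addr = 0; addr < size; addr += TABLE_SPAN)
		total += tlbi_ops_for_range(addr, addr + TABLE_SPAN,
					    PAGE_SIZE, max_ops, inclusive);
	return total;
}
```

Under these assumptions, total_tlbi_ops(1UL << 30, 1024UL, 0) gives 262144
and total_tlbi_ops(1UL << 30, PTRS_PER_PTE, 1) gives 512. Keeping the old
">" comparison with the new limit, total_tlbi_ops(1UL << 30, PTRS_PER_PTE, 0),
still gives 262144 in this model, because a range of exactly
PTRS_PER_PTE * PAGE_SIZE does not exceed the limit. That is why the patch
changes the comparison as well as the constant.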