linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH 0/2] arm64: TLB flush issues
@ 2014-05-02 15:20 Mark Salter
  2014-05-02 15:20 ` [PATCH 1/2] arm64: fix unnecessary tlb flushes Mark Salter
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Mark Salter @ 2014-05-02 15:20 UTC (permalink / raw)
  To: linux-arm-kernel

As explained in more detail in the second patch, I have observed a soft
lockup under some loads. These lockups were in flush_tlb_kernel_range()
which was looping through a very large address range. While looking into
this, I also noticed the flush routines in tlb.S were not properly
handling pages larger than 4k. This is corrected in the first patch.
The second patch limits the loop size for the flush_tlb_[kernel_]range
functions. It uses an arbitrary constant to limit the loop, but it
would be better if it were based on actual tlb size or some other
heuristic.

Mark Salter (2):
  arm64: fix unnecessary tlb flushes
  arm64: fix soft lockup due to large tlb flush range

 arch/arm64/include/asm/tlbflush.h | 28 +++++++++++++++++++++++++---
 arch/arm64/mm/tlb.S               |  4 ++--
 2 files changed, 27 insertions(+), 5 deletions(-)

-- 
1.9.0


* [PATCH 1/2] arm64: fix unnecessary tlb flushes
  2014-05-02 15:20 [PATCH 0/2] arm64: TLB flush issues Mark Salter
@ 2014-05-02 15:20 ` Mark Salter
  2014-05-02 15:20 ` [PATCH 2/2] arm64: fix soft lockup due to large tlb flush range Mark Salter
  2014-05-02 15:30 ` [PATCH 0/2] arm64: TLB flush issues Steve Capper
  2 siblings, 0 replies; 5+ messages in thread
From: Mark Salter @ 2014-05-02 15:20 UTC (permalink / raw)
  To: linux-arm-kernel

The __cpu_flush_user_tlb_range() and __cpu_flush_kern_tlb_range()
functions loop through an address range page by page to flush tlb
entries. However, these functions assume a 4k page size. If the kernel
is configured for 64k pages, these functions would execute the tlbi
instruction 16 times per page rather than once. This patch uses the
PAGE_SHIFT definition to ensure one tlb flush for any given page in
the range.
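
To make the arithmetic concrete (an illustrative sketch of mine, not
part of the patch): the tlbi address operand is the VA shifted right
by 12, so one page corresponds to (1 << (PAGE_SHIFT - 12)) of those
4k-sized units:

	/* illustrative C, mirroring the loop stride used in tlb.S */
	unsigned long stride = 1UL << (PAGE_SHIFT - 12);
	/* 4k pages:  PAGE_SHIFT == 12  ->  stride == 1  */
	/* 64k pages: PAGE_SHIFT == 16  ->  stride == 16 */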

Signed-off-by: Mark Salter <msalter@redhat.com>
---
 arch/arm64/mm/tlb.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/tlb.S b/arch/arm64/mm/tlb.S
index 19da91e..b818073 100644
--- a/arch/arm64/mm/tlb.S
+++ b/arch/arm64/mm/tlb.S
@@ -42,7 +42,7 @@ ENTRY(__cpu_flush_user_tlb_range)
 	bfi	x0, x3, #48, #16		// start VA and ASID
 	bfi	x1, x3, #48, #16		// end VA and ASID
 1:	tlbi	vae1is, x0			// TLB invalidate by address and ASID
-	add	x0, x0, #1
+	add	x0, x0, #(1 << (PAGE_SHIFT - 12))
 	cmp	x0, x1
 	b.lo	1b
 	dsb	sy
@@ -62,7 +62,7 @@ ENTRY(__cpu_flush_kern_tlb_range)
 	lsr	x0, x0, #12			// align address
 	lsr	x1, x1, #12
 1:	tlbi	vaae1is, x0			// TLB invalidate by address
-	add	x0, x0, #1
+	add	x0, x0, #(1 << (PAGE_SHIFT - 12))
 	cmp	x0, x1
 	b.lo	1b
 	dsb	sy
-- 
1.9.0


* [PATCH 2/2] arm64: fix soft lockup due to large tlb flush range
  2014-05-02 15:20 [PATCH 0/2] arm64: TLB flush issues Mark Salter
  2014-05-02 15:20 ` [PATCH 1/2] arm64: fix unnecessary tlb flushes Mark Salter
@ 2014-05-02 15:20 ` Mark Salter
  2014-05-02 15:30 ` [PATCH 0/2] arm64: TLB flush issues Steve Capper
  2 siblings, 0 replies; 5+ messages in thread
From: Mark Salter @ 2014-05-02 15:20 UTC (permalink / raw)
  To: linux-arm-kernel

Under certain loads, this soft lockup has been observed:

   BUG: soft lockup - CPU#2 stuck for 22s! [ip6tables:1016]
   Modules linked in: ip6t_rpfilter ip6t_REJECT cfg80211 rfkill xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw vfat fat efivarfs xfs libcrc32c

   CPU: 2 PID: 1016 Comm: ip6tables Not tainted 3.13.0-0.rc7.30.sa2.aarch64 #1
   task: fffffe03e81d1400 ti: fffffe03f01f8000 task.ti: fffffe03f01f8000
   PC is at __cpu_flush_kern_tlb_range+0xc/0x40
   LR is at __purge_vmap_area_lazy+0x28c/0x3ac
   pc : [<fffffe000009c5cc>] lr : [<fffffe0000182710>] pstate: 80000145
   sp : fffffe03f01fbb70
   x29: fffffe03f01fbb70 x28: fffffe03f01f8000
   x27: fffffe0000b19000 x26: 00000000000000d0
   x25: 000000000000001c x24: fffffe03f01fbc50
   x23: fffffe03f01fbc58 x22: fffffe03f01fbc10
   x21: fffffe0000b2a3f8 x20: 0000000000000802
   x19: fffffe0000b2a3c8 x18: 000003fffdf52710
   x17: 000003ff9d8bb910 x16: fffffe000050fbfc
   x15: 0000000000005735 x14: 000003ff9d7e1a5c
   x13: 0000000000000000 x12: 000003ff9d7e1a5c
   x11: 0000000000000007 x10: fffffe0000c09af0
   x9 : fffffe0000ad1000 x8 : 000000000000005c
   x7 : fffffe03e8624000 x6 : 0000000000000000
   x5 : 0000000000000000 x4 : 0000000000000000
   x3 : fffffe0000c09cc8 x2 : 0000000000000000
   x1 : 000fffffdfffca80 x0 : 000fffffcd742150

The __cpu_flush_kern_tlb_range() function looks like:

  ENTRY(__cpu_flush_kern_tlb_range)
	dsb	sy
	lsr	x0, x0, #12
	lsr	x1, x1, #12
  1:	tlbi	vaae1is, x0
	add	x0, x0, #1
	cmp	x0, x1
	b.lo	1b
	dsb	sy
	isb
	ret
  ENDPROC(__cpu_flush_kern_tlb_range)

The above soft lockup shows the PC at tlbi insn with:

  x0 = 0x000fffffcd742150
  x1 = 0x000fffffdfffca80

So __cpu_flush_kern_tlb_range() has 0x128ba930 tlbi flushes left
after it has already been looping for 22 seconds!
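
For scale, a back-of-the-envelope calculation (my arithmetic, not from
the original report):

	0x000fffffdfffca80 - 0x000fffffcd742150 = 0x128ba930
	0x128ba930 = ~311 million tlbi operations still to issue

Even at an optimistic one broadcast invalidate per microsecond, that
is roughly five minutes of flushing, far past the watchdog threshold.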

Looking up one frame at __purge_vmap_area_lazy(), there is:

	...
	list_for_each_entry_rcu(va, &vmap_area_list, list) {
		if (va->flags & VM_LAZY_FREE) {
			if (va->va_start < *start)
				*start = va->va_start;
			if (va->va_end > *end)
				*end = va->va_end;
			nr += (va->va_end - va->va_start) >> PAGE_SHIFT;
			list_add_tail(&va->purge_list, &valist);
			va->flags |= VM_LAZY_FREEING;
			va->flags &= ~VM_LAZY_FREE;
		}
	}
	...
	if (nr || force_flush)
		flush_tlb_kernel_range(*start, *end);

So if two areas at opposite ends of the address space are being
freed, the range passed to flush_tlb_kernel_range() may be as large
as the vmalloc space itself. For arm64, that is ~240GB with a 4k
page size and ~2TB with a 64k page size.
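
A rough worked example (my numbers, assuming the 4k configuration):

	240GB / 4KB per page = ~63 million pages

so a single flush_tlb_kernel_range() call spanning the whole vmalloc
area would issue on the order of 63 million tlbi instructions.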

This patch works around the problem by adding a loop limit. If the
range is larger than the limit, flush_tlb_all() is used rather than
flushing page by page. The chosen limit is arbitrary; it would be
better if it were based on the actual size of the tlb. I looked
through the ARM ARM but didn't see an easy way to discover the actual
tlb size, so for now an arbitrary limit is better than a soft lockup.

Signed-off-by: Mark Salter <msalter@redhat.com>
---
 arch/arm64/include/asm/tlbflush.h | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 8b48203..34ea52a 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -99,10 +99,32 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 }
 
 /*
- * Convert calls to our calling convention.
+ * The flush by range functions may take a very large range.
+ * If we need to invalidate a large range, it may be better
+ * to invalidate all tlb entries at once rather than looping
+ * through and invalidating individual entries.
+ *
+ * Here, we just use a fixed (arbitrary) number. It would be
+ * better if this was based on the actual size of the tlb...
  */
-#define flush_tlb_range(vma,start,end)	__cpu_flush_user_tlb_range(start,end,vma)
-#define flush_tlb_kernel_range(s,e)	__cpu_flush_kern_tlb_range(s,e)
+#define MAX_TLB_LOOP 128
+
+static inline void flush_tlb_range(struct vm_area_struct *vma,
+				   unsigned long start, unsigned long end)
+{
+	if (((end - start) >> PAGE_SHIFT) < MAX_TLB_LOOP)
+		__cpu_flush_user_tlb_range(start, end, vma);
+	else
+		flush_tlb_mm(vma->vm_mm);
+}
+
+static inline void flush_tlb_kernel_range(unsigned long start, unsigned long end)
+{
+	if (((end - start) >> PAGE_SHIFT) < MAX_TLB_LOOP)
+		__cpu_flush_kern_tlb_range(start, end);
+	else
+		flush_tlb_all();
+}
 
 /*
  * On AArch64, the cache coherency is handled via the set_pte_at() function.
-- 
1.9.0


* [PATCH 0/2] arm64: TLB flush issues
  2014-05-02 15:20 [PATCH 0/2] arm64: TLB flush issues Mark Salter
  2014-05-02 15:20 ` [PATCH 1/2] arm64: fix unnecessary tlb flushes Mark Salter
  2014-05-02 15:20 ` [PATCH 2/2] arm64: fix soft lockup due to large tlb flush range Mark Salter
@ 2014-05-02 15:30 ` Steve Capper
  2014-05-02 15:50   ` Mark Salter
  2 siblings, 1 reply; 5+ messages in thread
From: Steve Capper @ 2014-05-02 15:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, May 02, 2014 at 11:20:33AM -0400, Mark Salter wrote:
> As explained in more detail in the second patch, I have observed a soft
> lockup under some loads. These lockups were in flush_tlb_kernel_range()
> which was looping through a very large address range. While looking into
> this, I also noticed the flush routines in tlb.S were not properly
> handling pages larger than 4k. This is corrected in the first patch.
> The second patch limits the loop size for the flush_tlb_[kernel_]range
> functions. It uses an arbitrary constant to limit the loop, but it
> would be better if it were based on actual tlb size or some other
> heuristic.
> 
> Mark Salter (2):
>   arm64: fix unnecessary tlb flushes
>   arm64: fix soft lockup due to large tlb flush range
> 
>  arch/arm64/include/asm/tlbflush.h | 28 +++++++++++++++++++++++++---
>  arch/arm64/mm/tlb.S               |  4 ++--
>  2 files changed, 27 insertions(+), 5 deletions(-)
> 

Hi Mark,
As a heads-up, I posted this earlier today:
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/252837.html

Cheers,
-- 
Steve


* [PATCH 0/2] arm64: TLB flush issues
  2014-05-02 15:30 ` [PATCH 0/2] arm64: TLB flush issues Steve Capper
@ 2014-05-02 15:50   ` Mark Salter
  0 siblings, 0 replies; 5+ messages in thread
From: Mark Salter @ 2014-05-02 15:50 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2014-05-02 at 16:30 +0100, Steve Capper wrote:
> On Fri, May 02, 2014 at 11:20:33AM -0400, Mark Salter wrote:
> > As explained in more detail in the second patch, I have observed a soft
> > lockup under some loads. These lockups were in flush_tlb_kernel_range()
> > which was looping through a very large address range. While looking into
> > this, I also noticed the flush routines in tlb.S were not properly
> > handling pages larger than 4k. This is corrected in the first patch.
> > The second patch limits the loop size for the flush_tlb_[kernel_]range
> > functions. It uses an arbitrary constant to limit the loop, but it
> > would be better if it were based on actual tlb size or some other
> > heuristic.
> > 
> > Mark Salter (2):
> >   arm64: fix unnecessary tlb flushes
> >   arm64: fix soft lockup due to large tlb flush range
> > 
> >  arch/arm64/include/asm/tlbflush.h | 28 +++++++++++++++++++++++++---
> >  arch/arm64/mm/tlb.S               |  4 ++--
> >  2 files changed, 27 insertions(+), 5 deletions(-)
> > 
> 
> Hi Mark,
> As a heads-up, I posted this earlier today:
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-May/252837.html
> 
> Cheers,

Thanks. That takes care of the page size handling (my patch 1/2).


end of thread (newest: 2014-05-02 15:50 UTC)

Thread overview: 5 messages
2014-05-02 15:20 [PATCH 0/2] arm64: TLB flush issues Mark Salter
2014-05-02 15:20 ` [PATCH 1/2] arm64: fix unnecessary tlb flushes Mark Salter
2014-05-02 15:20 ` [PATCH 2/2] arm64: fix soft lockup due to large tlb flush range Mark Salter
2014-05-02 15:30 ` [PATCH 0/2] arm64: TLB flush issues Steve Capper
2014-05-02 15:50   ` Mark Salter
