All of lore.kernel.org
 help / color / mirror / Atom feed
From: gregory.clement@free-electrons.com (Gregory CLEMENT)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH 3/4] ARM: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead
Date: Wed, 15 May 2013 15:18:53 +0200	[thread overview]
Message-ID: <51938B3D.2070508@free-electrons.com> (raw)
In-Reply-To: <1364235581-17900-4-git-send-email-will.deacon@arm.com>

Hi Will,

On 03/25/2013 07:19 PM, Will Deacon wrote:
> Many ARMv7 cores have hardware page table walkers that can read the L1
> cache. This is discoverable from the ID_MMFR3 register, although this
> can be expensive to access from the low-level set_pte functions and is a
> pain to cache, particularly with multi-cluster systems.
> 
> A useful observation is that the multi-processing extensions for ARMv7
> require coherent table walks, meaning that we can make use of ALT_SMP
> patching in proc-v7-* to patch away the cache flush safely for these
> cores.

I encountered a regression with 3.10-rc1 on the Armada 370 based boards.
With the 3.10-rc1 they hang during auto testy of the xor engine which are
mainly DMA transfers. If I revert this patch, it no more hang. I found this
by using bisect, it was not obvious at all for me that this patch may have
cause this regression.
The issue appear in SMP and in UP. However I think that  the PJ4B-v7 used in
 the Armada 370 are not MP capable.

I made some investigation. And in UP if I remove the line:
	ALT_UP(W(nop))

at the begining of the cpu_v7_dcache_clean_are macro located in
arch/arm/mm/proc-v7.S

Then the kernel boot again. It is not surprising because in this case
we found the same generated code that before this patch was applied.

now I don't really understand why a W(nop) will cause this issue.

In SMP mode even with this line removed the kernel hang, but in this case
I am not sure of what happen exactly and how the .alt.smp.init section is
used.

I don't know if it is relevant but I tested with these 2 version of gcc:
gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5)
and
gcc version 4.7.3 (Ubuntu/Linaro 4.7.3-1ubuntu1)

I hope you will find some explanation and solution to this bug, because currently
the only solution I have is to revert this patch.

Thanks,
Gregory
> 
> Reported-by: Albin Tonnerre <Albin.Tonnerre@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  arch/arm/include/asm/tlbflush.h | 2 +-
>  arch/arm/mm/proc-v6.S           | 2 --
>  arch/arm/mm/proc-v7-2level.S    | 3 ++-
>  arch/arm/mm/proc-v7-3level.S    | 3 ++-
>  arch/arm/mm/proc-v7.S           | 4 ++--
>  5 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm/include/asm/tlbflush.h b/arch/arm/include/asm/tlbflush.h
> index c7cdb59..42d155e 100644
> --- a/arch/arm/include/asm/tlbflush.h
> +++ b/arch/arm/include/asm/tlbflush.h
> @@ -169,7 +169,7 @@
>  # define v6wbi_always_flags	(-1UL)
>  #endif
>  
> -#define v7wbi_tlb_flags_smp	(TLB_WB | TLB_DCLEAN | TLB_BARRIER | \
> +#define v7wbi_tlb_flags_smp	(TLB_WB | TLB_BARRIER | \
>  				 TLB_V6_U_FULL | TLB_V6_U_PAGE | \
>  				 TLB_V6_U_ASID | TLB_V6_BP | \
>  				 TLB_V7_UIS_FULL | TLB_V7_UIS_PAGE | \
> diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
> index bcaaa8d..a286d47 100644
> --- a/arch/arm/mm/proc-v6.S
> +++ b/arch/arm/mm/proc-v6.S
> @@ -80,12 +80,10 @@ ENTRY(cpu_v6_do_idle)
>  	mov	pc, lr
>  
>  ENTRY(cpu_v6_dcache_clean_area)
> -#ifndef TLB_CAN_READ_FROM_L1_CACHE
>  1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
>  	add	r0, r0, #D_CACHE_LINE_SIZE
>  	subs	r1, r1, #D_CACHE_LINE_SIZE
>  	bhi	1b
> -#endif
>  	mov	pc, lr
>  
>  /*
> diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
> index 78f520b..9704097 100644
> --- a/arch/arm/mm/proc-v7-2level.S
> +++ b/arch/arm/mm/proc-v7-2level.S
> @@ -110,7 +110,8 @@ ENTRY(cpu_v7_set_pte_ext)
>   ARM(	str	r3, [r0, #2048]! )
>   THUMB(	add	r0, r0, #2048 )
>   THUMB(	str	r3, [r0] )
> -	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
> +	ALT_SMP(mov	pc,lr)
> +	ALT_UP (mcr	p15, 0, r0, c7, c10, 1)		@ flush_pte
>  #endif
>  	mov	pc, lr
>  ENDPROC(cpu_v7_set_pte_ext)
> diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
> index 6ffd78c..363027e 100644
> --- a/arch/arm/mm/proc-v7-3level.S
> +++ b/arch/arm/mm/proc-v7-3level.S
> @@ -73,7 +73,8 @@ ENTRY(cpu_v7_set_pte_ext)
>  	tst	r3, #1 << (55 - 32)		@ L_PTE_DIRTY
>  	orreq	r2, #L_PTE_RDONLY
>  1:	strd	r2, r3, [r0]
> -	mcr	p15, 0, r0, c7, c10, 1		@ flush_pte
> +	ALT_SMP(mov	pc, lr)
> +	ALT_UP (mcr	p15, 0, r0, c7, c10, 1)		@ flush_pte
>  #endif
>  	mov	pc, lr
>  ENDPROC(cpu_v7_set_pte_ext)
> diff --git a/arch/arm/mm/proc-v7.S b/arch/arm/mm/proc-v7.S
> index 3a3c015..37716b0 100644
> --- a/arch/arm/mm/proc-v7.S
> +++ b/arch/arm/mm/proc-v7.S
> @@ -75,14 +75,14 @@ ENTRY(cpu_v7_do_idle)
>  ENDPROC(cpu_v7_do_idle)
>  
>  ENTRY(cpu_v7_dcache_clean_area)
> -#ifndef TLB_CAN_READ_FROM_L1_CACHE
> +	ALT_SMP(mov	pc, lr)			@ MP extensions imply L1 PTW
> +	ALT_UP(W(nop))
>  	dcache_line_size r2, r3
>  1:	mcr	p15, 0, r0, c7, c10, 1		@ clean D entry
>  	add	r0, r0, r2
>  	subs	r1, r1, r2
>  	bhi	1b
>  	dsb
> -#endif
>  	mov	pc, lr
>  ENDPROC(cpu_v7_dcache_clean_area)
>  
> 


-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

  parent reply	other threads:[~2013-05-15 13:18 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-03-25 18:19 [PATCH 0/4] TLB and mm-related optimisations Will Deacon
2013-03-25 18:19 ` [PATCH 1/4] ARM: tlb: don't perform inner-shareable invalidation for local TLB ops Will Deacon
2013-03-27 10:34   ` Catalin Marinas
2013-03-27 12:07     ` Will Deacon
2013-03-27 12:30       ` Catalin Marinas
2013-03-27 12:56         ` Will Deacon
2013-03-27 13:40           ` Catalin Marinas
2013-03-27 13:54             ` Will Deacon
2013-03-25 18:19 ` [PATCH 2/4] ARM: tlb: don't perform inner-shareable invalidation for local BP ops Will Deacon
2013-03-27 10:36   ` Catalin Marinas
2013-03-25 18:19 ` [PATCH 3/4] ARM: mm: kill unused TLB_CAN_READ_FROM_L1_CACHE and use ALT_SMP instead Will Deacon
2013-03-27 10:53   ` Catalin Marinas
2013-03-27 12:20     ` Will Deacon
2013-05-15 13:18   ` Gregory CLEMENT [this message]
2013-05-15 13:41     ` Will Deacon
2013-05-15 13:54       ` Gregory CLEMENT
2013-05-15 14:06         ` Will Deacon
2013-05-15 14:46           ` Gregory CLEMENT
2013-05-15 15:04             ` Will Deacon
2013-05-15 15:36               ` Gregory CLEMENT
2013-05-15 15:41                 ` Will Deacon
2013-05-15 16:29                   ` Gregory CLEMENT
2013-05-15 16:48                     ` Will Deacon
2013-05-15 17:16                       ` Russell King - ARM Linux
2013-03-25 18:19 ` [PATCH 4/4] ARM: atomics: don't use exclusives for atomic64 read/set with LPAE Will Deacon
2013-03-27 10:57   ` Catalin Marinas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=51938B3D.2070508@free-electrons.com \
    --to=gregory.clement@free-electrons.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.