linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] ARM: iwmmxt/pj4 fixes for v7/Thumb-2
@ 2011-09-08 16:04 Dave Martin
  2011-09-08 16:04 ` [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2 Dave Martin
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Dave Martin @ 2011-09-08 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

The iwmmxt and pj4 support code contains some code which isn't
compatible with Thumb-2 and looks possibly inappropriate for v7.

There is also an issue with Makefile rules causing iwmmxt.S to be
built for the wrong architecture.

The Makefile and .S changes for iwmmxt may get folded into a
single patch; they don't make much sense independently any more.

v2 changes:

*   Makefile logic changed, so that -march=armv7-a+iwmmxt is
    demanded for Thumb-2 kernels, and a warning is printed if the
    assembler doesn't support this when buiding a v7 ARM kernel.

Dave Martin (3):
  ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2
  ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  ARM: pxa/pj4: Port problematic pj4 support code to v7/Thumb-2

 arch/arm/kernel/Makefile  |   12 ++++++++++++
 arch/arm/kernel/iwmmxt.S  |    9 +++++++++
 arch/arm/kernel/pj4-cp0.c |    4 ++++
 3 files changed, 25 insertions(+), 0 deletions(-)

--
1.7.4.1

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2
  2011-09-08 16:04 [PATCH v2 0/3] ARM: iwmmxt/pj4 fixes for v7/Thumb-2 Dave Martin
@ 2011-09-08 16:04 ` Dave Martin
  2011-09-08 16:45   ` Arnd Bergmann
  2011-09-08 17:09   ` Nicolas Pitre
  2011-09-08 16:04 ` [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2 Dave Martin
  2011-09-08 16:04 ` [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 " Dave Martin
  2 siblings, 2 replies; 30+ messages in thread
From: Dave Martin @ 2011-09-08 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

Because gcc/gas have no sane way to turn on individual CPU
extensions from the command-line, iwmmxt.S was previously built
with -mcpu=iwmmxt.  Unfortunately, this also downgrades the CPU to
v5, with the result that this file fails to build for a Thumb-2
kernel.

New versions of the tools support -march=<base arch>+iwmmxt, and it
seems reasonable to require up-to-date tools when building in
Thumb-2.  So, this patch uses -march=armv7-a+iwmmxt for
CONFIG_THUMB2_KERNEL=y.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
---
 arch/arm/kernel/Makefile |   12 ++++++++++++
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index f7887dc..e03691e 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -66,6 +66,18 @@ obj-$(CONFIG_IWMMXT)		+= iwmmxt.o
 obj-$(CONFIG_CPU_HAS_PMU)	+= pmu.o
 obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
 AFLAGS_iwmmxt.o			:= -Wa,-mcpu=iwmmxt
+ifdef CONFIG_CPU_V7
+AFLAGS_iwmmxt.o			:= $(call cc-option,-Wa$(comma)-march=armv7-a+iwmmxt,-DIWMMXT_LEGACY_ASSEMBLER)
+ifeq ($(AFLAGS_iwmmxt.o),-DIWMMXT_LEGACY_ASSEMBLER)
+ifdef CONFIG_THUMB2_KERNEL
+$(error Newer binutils is needed to build iwmmxt support for Thumb-2 kernels.)
+else
+$(warning Warning: Building legacy pre-v7 iwmmxt code for a v7 target.\
+  This is inadvisable.  Please upgrade to newer binutils.)
+AFLAGS_iwmmxt.o			+= -Wa,-mcpu=iwmmxt
+endif
+endif
+endif # CONFIG_CPU_V7
 
 ifneq ($(CONFIG_ARCH_EBSA110),y)
   obj-y		+= io.o
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 16:04 [PATCH v2 0/3] ARM: iwmmxt/pj4 fixes for v7/Thumb-2 Dave Martin
  2011-09-08 16:04 ` [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2 Dave Martin
@ 2011-09-08 16:04 ` Dave Martin
  2011-09-08 16:45   ` Arnd Bergmann
  2011-09-08 16:04 ` [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 " Dave Martin
  2 siblings, 1 reply; 30+ messages in thread
From: Dave Martin @ 2011-09-08 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

The iwmmxt code contains some code to implement a pseudo-ISB, but
this is not buildable for Thumb-2.

This patch replaces the pseudo-ISB with a real one for Thumb-2
kernels.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
---
 arch/arm/kernel/iwmmxt.S |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index a087838..7e049b0 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -319,8 +319,17 @@ ENTRY(iwmmxt_task_switch)
 	PJ4(eor r1, r1, #0xf)
 	PJ4(mcr	p15, 0, r1, c1, c0, 2)
 
+/*
+ * This should be ported to XSC()/PJ4() when everyone has new enough binutils
+ * to support the -march=...+iwmmxt command-line option syntax.
+ */
+#if __LINUX_ARM_ARCH__ < 7 || defined(IWMMXT_LEGACY_ASSEMBLER)
 	mrc	p15, 0, r1, c2, c0, 0
 	sub	pc, lr, r1, lsr #32		@ cpwait and return
+#else
+	isb					@ ISB needed instead on ARMv7
+	mov	pc, lr
+#endif
 
 /*
  * Remove Concan ownership of given task
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 support code to v7/Thumb-2
  2011-09-08 16:04 [PATCH v2 0/3] ARM: iwmmxt/pj4 fixes for v7/Thumb-2 Dave Martin
  2011-09-08 16:04 ` [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2 Dave Martin
  2011-09-08 16:04 ` [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2 Dave Martin
@ 2011-09-08 16:04 ` Dave Martin
  2011-09-08 16:44   ` Arnd Bergmann
  2011-09-08 17:06   ` Nicolas Pitre
  2 siblings, 2 replies; 30+ messages in thread
From: Dave Martin @ 2011-09-08 16:04 UTC (permalink / raw)
  To: linux-arm-kernel

The iwmmxt code contains some code to implement a pseudo-ISB, but
this is not buildable for Thumb-2.

This patch replaces the pseudo-ISB with a real one on v7 and above.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
---
 arch/arm/kernel/pj4-cp0.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/arch/arm/kernel/pj4-cp0.c b/arch/arm/kernel/pj4-cp0.c
index a4b1b07..5117d9d 100644
--- a/arch/arm/kernel/pj4-cp0.c
+++ b/arch/arm/kernel/pj4-cp0.c
@@ -66,9 +66,13 @@ static void __init pj4_cp_access_write(u32 value)
 
 	__asm__ __volatile__ (
 		"mcr	p15, 0, %1, c1, c0, 2\n\t"
+#ifdef __LINUX_ARM_ARCH__ >= 7
+		"isb\n\t"
+#else
 		"mrc	p15, 0, %0, c1, c0, 2\n\t"
 		"mov	%0, %0\n\t"
 		"sub	pc, pc, #4\n\t"
+#endif
 		: "=r" (temp) : "r" (value));
 }
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 support code to v7/Thumb-2
  2011-09-08 16:04 ` [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 " Dave Martin
@ 2011-09-08 16:44   ` Arnd Bergmann
  2011-09-08 17:06     ` Eric Miao
  2011-09-08 17:23     ` Dave Martin
  2011-09-08 17:06   ` Nicolas Pitre
  1 sibling, 2 replies; 30+ messages in thread
From: Arnd Bergmann @ 2011-09-08 16:44 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 08 September 2011, Dave Martin wrote:
>         __asm__ __volatile__ (
>                 "mcr    p15, 0, %1, c1, c0, 2\n\t"
> +#ifdef __LINUX_ARM_ARCH__ >= 7
> +               "isb\n\t"
> +#else
>                 "mrc    p15, 0, %0, c1, c0, 2\n\t"
>                 "mov    %0, %0\n\t"
>                 "sub    pc, pc, #4\n\t"
> +#endif
>                 : "=r" (temp) : "r" (value));
>  }

I thought we had concluded that we don't actually need the #else
path because there is no pre-v7 support for pj4 any more.

	Arnd

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2
  2011-09-08 16:04 ` [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2 Dave Martin
@ 2011-09-08 16:45   ` Arnd Bergmann
  2011-09-08 17:14     ` Dave Martin
  2011-09-08 17:09   ` Nicolas Pitre
  1 sibling, 1 reply; 30+ messages in thread
From: Arnd Bergmann @ 2011-09-08 16:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 08 September 2011, Dave Martin wrote:
> Because gcc/gas have no sane way to turn on individual CPU
> extensions from the command-line, iwmmxt.S was previously built
> with -mcpu=iwmmxt.  Unfortunately, this also downgrades the CPU to
> v5, with the result that this file fails to build for a Thumb-2
> kernel.
> 
> New versions of the tools support -march=<base arch>+iwmmxt, and it
> seems reasonable to require up-to-date tools when building in
> Thumb-2.  So, this patch uses -march=armv7-a+iwmmxt for
> CONFIG_THUMB2_KERNEL=y.
> 
> Signed-off-by: Dave Martin <dave.martin@linaro.org>

Ok. More complex than I would have liked, but this definitely does
the right thing in every case, as far as I can tell.

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 16:04 ` [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2 Dave Martin
@ 2011-09-08 16:45   ` Arnd Bergmann
  2011-09-08 17:03     ` Eric Miao
  0 siblings, 1 reply; 30+ messages in thread
From: Arnd Bergmann @ 2011-09-08 16:45 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday 08 September 2011, Dave Martin wrote:
> The iwmmxt code contains some code to implement a pseudo-ISB, but
> this is not buildable for Thumb-2.
> 
> This patch replaces the pseudo-ISB with a real one for Thumb-2
> kernels.
> 
> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> ---
>  arch/arm/kernel/iwmmxt.S |    9 +++++++++
>  1 files changed, 9 insertions(+), 0 deletions(-)

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 16:45   ` Arnd Bergmann
@ 2011-09-08 17:03     ` Eric Miao
  2011-09-08 17:13       ` Jon Medhurst (Tixy)
  2011-09-08 17:20       ` Dave Martin
  0 siblings, 2 replies; 30+ messages in thread
From: Eric Miao @ 2011-09-08 17:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 8, 2011 at 9:45 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 08 September 2011, Dave Martin wrote:
>> The iwmmxt code contains some code to implement a pseudo-ISB, but
>> this is not buildable for Thumb-2.
>>
>> This patch replaces the pseudo-ISB with a real one for Thumb-2
>> kernels.
>>
>> Signed-off-by: Dave Martin <dave.martin@linaro.org>
>> ---
>> ?arch/arm/kernel/iwmmxt.S | ? ?9 +++++++++
>> ?1 files changed, 9 insertions(+), 0 deletions(-)
>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
>

Maybe it'll be much simpler to have something like below:

diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index a087838..5998f7d 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -319,8 +319,9 @@ ENTRY(iwmmxt_task_switch)
        PJ4(eor r1, r1, #0xf)
        PJ4(mcr p15, 0, r1, c1, c0, 2)

-       mrc     p15, 0, r1, c2, c0, 0
-       sub     pc, lr, r1, lsr #32             @ cpwait and return
+       XSC(mrc p15, 0, r1, c2, c0, 0)
+       PJ4(isb)
+       mov     pc, lr                          @ cpwait and return

 /*
  * Remove Concan ownership of given task

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 support code to v7/Thumb-2
  2011-09-08 16:04 ` [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 " Dave Martin
  2011-09-08 16:44   ` Arnd Bergmann
@ 2011-09-08 17:06   ` Nicolas Pitre
  2011-09-08 17:06     ` Eric Miao
  1 sibling, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2011-09-08 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 8 Sep 2011, Dave Martin wrote:

> The iwmmxt code contains some code to implement a pseudo-ISB, but
> this is not buildable for Thumb-2.
> 
> This patch replaces the pseudo-ISB with a real one on v7 and above.
> 
> Signed-off-by: Dave Martin <dave.martin@linaro.org>

In arch/arm/mm/Kconfig:

config CPU_PJ4
        bool
        select CPU_V7
        select ARM_THUMBEE

So this code may only be ARMv7 aware now.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 support code to v7/Thumb-2
  2011-09-08 16:44   ` Arnd Bergmann
@ 2011-09-08 17:06     ` Eric Miao
  2011-09-08 17:23     ` Dave Martin
  1 sibling, 0 replies; 30+ messages in thread
From: Eric Miao @ 2011-09-08 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 8, 2011 at 9:44 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 08 September 2011, Dave Martin wrote:
>> ? ? ? ? __asm__ __volatile__ (
>> ? ? ? ? ? ? ? ? "mcr ? ?p15, 0, %1, c1, c0, 2\n\t"
>> +#ifdef __LINUX_ARM_ARCH__ >= 7
>> + ? ? ? ? ? ? ? "isb\n\t"
>> +#else
>> ? ? ? ? ? ? ? ? "mrc ? ?p15, 0, %0, c1, c0, 2\n\t"
>> ? ? ? ? ? ? ? ? "mov ? ?%0, %0\n\t"
>> ? ? ? ? ? ? ? ? "sub ? ?pc, pc, #4\n\t"
>> +#endif
>> ? ? ? ? ? ? ? ? : "=r" (temp) : "r" (value));
>> ?}
>
> I thought we had concluded that we don't actually need the #else
> path because there is no pre-v7 support for pj4 any more.

Or if isb() works fine, will the below change be cleaner:

diff --git a/arch/arm/kernel/pj4-cp0.c b/arch/arm/kernel/pj4-cp0.c
index a4b1b07..dc4272d 100644
--- a/arch/arm/kernel/pj4-cp0.c
+++ b/arch/arm/kernel/pj4-cp0.c
@@ -62,14 +62,10 @@ static u32 __init pj4_cp_access_read(void)

 static void __init pj4_cp_access_write(u32 value)
 {
-       u32 temp;
-
        __asm__ __volatile__ (
-               "mcr    p15, 0, %1, c1, c0, 2\n\t"
-               "mrc    p15, 0, %0, c1, c0, 2\n\t"
-               "mov    %0, %0\n\t"
-               "sub    pc, pc, #4\n\t"
-               : "=r" (temp) : "r" (value));
+               "mcr    p15, 0, %0, c1, c0, 2\n\t"
+               : : "r" (value));
+       isb();
 }

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 support code to v7/Thumb-2
  2011-09-08 17:06   ` Nicolas Pitre
@ 2011-09-08 17:06     ` Eric Miao
  0 siblings, 0 replies; 30+ messages in thread
From: Eric Miao @ 2011-09-08 17:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 8, 2011 at 10:06 AM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Thu, 8 Sep 2011, Dave Martin wrote:
>
>> The iwmmxt code contains some code to implement a pseudo-ISB, but
>> this is not buildable for Thumb-2.
>>
>> This patch replaces the pseudo-ISB with a real one on v7 and above.
>>
>> Signed-off-by: Dave Martin <dave.martin@linaro.org>
>
> In arch/arm/mm/Kconfig:
>
> config CPU_PJ4
> ? ? ? ?bool
> ? ? ? ?select CPU_V7
> ? ? ? ?select ARM_THUMBEE
>
> So this code may only be ARMv7 aware now.
>

True.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2
  2011-09-08 16:04 ` [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2 Dave Martin
  2011-09-08 16:45   ` Arnd Bergmann
@ 2011-09-08 17:09   ` Nicolas Pitre
  1 sibling, 0 replies; 30+ messages in thread
From: Nicolas Pitre @ 2011-09-08 17:09 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 8 Sep 2011, Dave Martin wrote:

> Because gcc/gas have no sane way to turn on individual CPU
> extensions from the command-line, iwmmxt.S was previously built
> with -mcpu=iwmmxt.  Unfortunately, this also downgrades the CPU to
> v5, with the result that this file fails to build for a Thumb-2
> kernel.
> 
> New versions of the tools support -march=<base arch>+iwmmxt, and it
> seems reasonable to require up-to-date tools when building in
> Thumb-2.  So, this patch uses -march=armv7-a+iwmmxt for
> CONFIG_THUMB2_KERNEL=y.
> 
> Signed-off-by: Dave Martin <dave.martin@linaro.org>

I like this one better.

Acked-by: Nicolas Pitre <nicolas.pitre@linaro.org>

As you said, this should be merged with #2/3.



> ---
>  arch/arm/kernel/Makefile |   12 ++++++++++++
>  1 files changed, 12 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
> index f7887dc..e03691e 100644
> --- a/arch/arm/kernel/Makefile
> +++ b/arch/arm/kernel/Makefile
> @@ -66,6 +66,18 @@ obj-$(CONFIG_IWMMXT)		+= iwmmxt.o
>  obj-$(CONFIG_CPU_HAS_PMU)	+= pmu.o
>  obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
>  AFLAGS_iwmmxt.o			:= -Wa,-mcpu=iwmmxt
> +ifdef CONFIG_CPU_V7
> +AFLAGS_iwmmxt.o			:= $(call cc-option,-Wa$(comma)-march=armv7-a+iwmmxt,-DIWMMXT_LEGACY_ASSEMBLER)
> +ifeq ($(AFLAGS_iwmmxt.o),-DIWMMXT_LEGACY_ASSEMBLER)
> +ifdef CONFIG_THUMB2_KERNEL
> +$(error Newer binutils is needed to build iwmmxt support for Thumb-2 kernels.)
> +else
> +$(warning Warning: Building legacy pre-v7 iwmmxt code for a v7 target.\
> +  This is inadvisable.  Please upgrade to newer binutils.)
> +AFLAGS_iwmmxt.o			+= -Wa,-mcpu=iwmmxt
> +endif
> +endif
> +endif # CONFIG_CPU_V7
>  
>  ifneq ($(CONFIG_ARCH_EBSA110),y)
>    obj-y		+= io.o
> -- 
> 1.7.4.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 17:03     ` Eric Miao
@ 2011-09-08 17:13       ` Jon Medhurst (Tixy)
  2011-09-08 17:15         ` Eric Miao
  2011-09-08 17:33         ` Nicolas Pitre
  2011-09-08 17:20       ` Dave Martin
  1 sibling, 2 replies; 30+ messages in thread
From: Jon Medhurst (Tixy) @ 2011-09-08 17:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2011-09-08 at 10:03 -0700, Eric Miao wrote:
> On Thu, Sep 8, 2011 at 9:45 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Thursday 08 September 2011, Dave Martin wrote:
> >> The iwmmxt code contains some code to implement a pseudo-ISB, but
> >> this is not buildable for Thumb-2.
> >>
> >> This patch replaces the pseudo-ISB with a real one for Thumb-2
> >> kernels.
> >>
> >> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> >> ---
> >>  arch/arm/kernel/iwmmxt.S |    9 +++++++++
> >>  1 files changed, 9 insertions(+), 0 deletions(-)
> >
> > Acked-by: Arnd Bergmann <arnd@arndb.de>
> >
> 
> Maybe it'll be much simpler to have something like below:
> 
> diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
> index a087838..5998f7d 100644
> --- a/arch/arm/kernel/iwmmxt.S
> +++ b/arch/arm/kernel/iwmmxt.S
> @@ -319,8 +319,9 @@ ENTRY(iwmmxt_task_switch)
>         PJ4(eor r1, r1, #0xf)
>         PJ4(mcr p15, 0, r1, c1, c0, 2)
> 
> -       mrc     p15, 0, r1, c2, c0, 0
> -       sub     pc, lr, r1, lsr #32             @ cpwait and return
> +       XSC(mrc p15, 0, r1, c2, c0, 0)
> +       PJ4(isb)
> +       mov     pc, lr                          @ cpwait and return

Don't we still need "sub pc, lr, r1, lsr #32" in the XSC case?
I had assumed this was to introduce a data dependency. I.e. PC depends
on value of R1 which is loaded by the MRC, so it can't complete until
that has.

-- 
Tixy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2
  2011-09-08 16:45   ` Arnd Bergmann
@ 2011-09-08 17:14     ` Dave Martin
  0 siblings, 0 replies; 30+ messages in thread
From: Dave Martin @ 2011-09-08 17:14 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2011 at 06:45:32PM +0200, Arnd Bergmann wrote:
> On Thursday 08 September 2011, Dave Martin wrote:
> > Because gcc/gas have no sane way to turn on individual CPU
> > extensions from the command-line, iwmmxt.S was previously built
> > with -mcpu=iwmmxt.  Unfortunately, this also downgrades the CPU to
> > v5, with the result that this file fails to build for a Thumb-2
> > kernel.
> > 
> > New versions of the tools support -march=<base arch>+iwmmxt, and it
> > seems reasonable to require up-to-date tools when building in
> > Thumb-2.  So, this patch uses -march=armv7-a+iwmmxt for
> > CONFIG_THUMB2_KERNEL=y.
> > 
> > Signed-off-by: Dave Martin <dave.martin@linaro.org>
> 
> Ok. More complex than I would have liked, but this definitely does
> the right thing in every case, as far as I can tell.

Actually, there is still one problem, which is that the errors/warnings
will be triggered even if iwmmxt is not being built.

This is straightforward to work around, and I have a local fix for it
which I'll incorporate in v3.

> Acked-by: Arnd Bergmann <arnd@arndb.de>

Thanks

I certainly agree with the comment about complexity -- if anyone
has any better ideas, I'm still interested...

We should eventually be able to get rid of this, but not for years
(until currentl binutils versions are effecitvely obsolete).

Cheers
---Dave

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 17:13       ` Jon Medhurst (Tixy)
@ 2011-09-08 17:15         ` Eric Miao
  2011-09-08 17:33         ` Nicolas Pitre
  1 sibling, 0 replies; 30+ messages in thread
From: Eric Miao @ 2011-09-08 17:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 8, 2011 at 10:13 AM, Jon Medhurst (Tixy)
<jon.medhurst@linaro.org> wrote:
> On Thu, 2011-09-08 at 10:03 -0700, Eric Miao wrote:
>> On Thu, Sep 8, 2011 at 9:45 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Thursday 08 September 2011, Dave Martin wrote:
>> >> The iwmmxt code contains some code to implement a pseudo-ISB, but
>> >> this is not buildable for Thumb-2.
>> >>
>> >> This patch replaces the pseudo-ISB with a real one for Thumb-2
>> >> kernels.
>> >>
>> >> Signed-off-by: Dave Martin <dave.martin@linaro.org>
>> >> ---
>> >> ?arch/arm/kernel/iwmmxt.S | ? ?9 +++++++++
>> >> ?1 files changed, 9 insertions(+), 0 deletions(-)
>> >
>> > Acked-by: Arnd Bergmann <arnd@arndb.de>
>> >
>>
>> Maybe it'll be much simpler to have something like below:
>>
>> diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
>> index a087838..5998f7d 100644
>> --- a/arch/arm/kernel/iwmmxt.S
>> +++ b/arch/arm/kernel/iwmmxt.S
>> @@ -319,8 +319,9 @@ ENTRY(iwmmxt_task_switch)
>> ? ? ? ? PJ4(eor r1, r1, #0xf)
>> ? ? ? ? PJ4(mcr p15, 0, r1, c1, c0, 2)
>>
>> - ? ? ? mrc ? ? p15, 0, r1, c2, c0, 0
>> - ? ? ? sub ? ? pc, lr, r1, lsr #32 ? ? ? ? ? ? @ cpwait and return
>> + ? ? ? XSC(mrc p15, 0, r1, c2, c0, 0)
>> + ? ? ? PJ4(isb)
>> + ? ? ? mov ? ? pc, lr ? ? ? ? ? ? ? ? ? ? ? ? ?@ cpwait and return
>
> Don't we still need "sub pc, lr, r1, lsr #32" in the XSC case?
> I had assumed this was to introduce a data dependency. I.e. PC depends
> on value of R1 which is loaded by the MRC, so it can't complete until
> that has.

Indeed, that's actually a good point.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 17:03     ` Eric Miao
  2011-09-08 17:13       ` Jon Medhurst (Tixy)
@ 2011-09-08 17:20       ` Dave Martin
  2011-09-08 17:32         ` Nicolas Pitre
  2011-09-08 18:49         ` Eric Miao
  1 sibling, 2 replies; 30+ messages in thread
From: Dave Martin @ 2011-09-08 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2011 at 10:03:52AM -0700, Eric Miao wrote:
> On Thu, Sep 8, 2011 at 9:45 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Thursday 08 September 2011, Dave Martin wrote:
> >> The iwmmxt code contains some code to implement a pseudo-ISB, but
> >> this is not buildable for Thumb-2.
> >>
> >> This patch replaces the pseudo-ISB with a real one for Thumb-2
> >> kernels.
> >>
> >> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> >> ---
> >> ?arch/arm/kernel/iwmmxt.S | ? ?9 +++++++++
> >> ?1 files changed, 9 insertions(+), 0 deletions(-)
> >
> > Acked-by: Arnd Bergmann <arnd@arndb.de>
> >
> 
> Maybe it'll be much simpler to have something like below:
> 
> diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
> index a087838..5998f7d 100644
> --- a/arch/arm/kernel/iwmmxt.S
> +++ b/arch/arm/kernel/iwmmxt.S
> @@ -319,8 +319,9 @@ ENTRY(iwmmxt_task_switch)
>         PJ4(eor r1, r1, #0xf)
>         PJ4(mcr p15, 0, r1, c1, c0, 2)
> 
> -       mrc     p15, 0, r1, c2, c0, 0
> -       sub     pc, lr, r1, lsr #32             @ cpwait and return
> +       XSC(mrc p15, 0, r1, c2, c0, 0)
> +       PJ4(isb)
> +       mov     pc, lr                          @ cpwait and return

This won't allow the building of this code for a v7 ARM kernel with
current tools.

I think this is likely to be too much dirsuption: requiring really
new tools for Thumb-2 is unavoidable, but we should allow people
using existing tools to build this to carry on with the current
situation for the time being.  This is because of the fact that
not all current tools can support "isb" and the iwmmxt mnemonics
in the same file.

One thing we could do is to code the ISB using .long, since we
only need this for Thumb kernels.

This would allow us to get rid of the Makefile warning, since the
generated code would be the preferred v7 code in that case.

Any comments on this?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 support code to v7/Thumb-2
  2011-09-08 16:44   ` Arnd Bergmann
  2011-09-08 17:06     ` Eric Miao
@ 2011-09-08 17:23     ` Dave Martin
  1 sibling, 0 replies; 30+ messages in thread
From: Dave Martin @ 2011-09-08 17:23 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 08, 2011 at 06:44:21PM +0200, Arnd Bergmann wrote:
> On Thursday 08 September 2011, Dave Martin wrote:
> >         __asm__ __volatile__ (
> >                 "mcr    p15, 0, %1, c1, c0, 2\n\t"
> > +#ifdef __LINUX_ARM_ARCH__ >= 7
> > +               "isb\n\t"
> > +#else
> >                 "mrc    p15, 0, %0, c1, c0, 2\n\t"
> >                 "mov    %0, %0\n\t"
> >                 "sub    pc, pc, #4\n\t"
> > +#endif
> >                 : "=r" (temp) : "r" (value));
> >  }
> 
> I thought we had concluded that we don't actually need the #else
> path because there is no pre-v7 support for pj4 any more.
> 

Ah, OK -- I somewhat missed the point, the was focusing on the other
patches (I notice I didn't fix #ifdef to #if either...)

Anyway, this patch can be drastically simplified -- i.e.,
the #if conditional should just go away, along with the
xscale-style code (which really doesn't make sense for a v7-only
file).

So we'd end up with something like:

        __asm__ __volatile__ (
                "mcr    p15, 0, %0, c1, c0, 2\n\t"
                "isb"
                : "r" (value));

Better?

Cheers
---Dave

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 17:20       ` Dave Martin
@ 2011-09-08 17:32         ` Nicolas Pitre
  2011-09-08 18:49         ` Eric Miao
  1 sibling, 0 replies; 30+ messages in thread
From: Nicolas Pitre @ 2011-09-08 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 8 Sep 2011, Dave Martin wrote:

> On Thu, Sep 08, 2011 at 10:03:52AM -0700, Eric Miao wrote:
> > On Thu, Sep 8, 2011 at 9:45 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Thursday 08 September 2011, Dave Martin wrote:
> > >> The iwmmxt code contains some code to implement a pseudo-ISB, but
> > >> this is not buildable for Thumb-2.
> > >>
> > >> This patch replaces the pseudo-ISB with a real one for Thumb-2
> > >> kernels.
> > >>
> > >> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> > >> ---
> > >> ?arch/arm/kernel/iwmmxt.S | ? ?9 +++++++++
> > >> ?1 files changed, 9 insertions(+), 0 deletions(-)
> > >
> > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > >
> > 
> > Maybe it'll be much simpler to have something like below:
> > 
> > diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
> > index a087838..5998f7d 100644
> > --- a/arch/arm/kernel/iwmmxt.S
> > +++ b/arch/arm/kernel/iwmmxt.S
> > @@ -319,8 +319,9 @@ ENTRY(iwmmxt_task_switch)
> >         PJ4(eor r1, r1, #0xf)
> >         PJ4(mcr p15, 0, r1, c1, c0, 2)
> > 
> > -       mrc     p15, 0, r1, c2, c0, 0
> > -       sub     pc, lr, r1, lsr #32             @ cpwait and return
> > +       XSC(mrc p15, 0, r1, c2, c0, 0)
> > +       PJ4(isb)
> > +       mov     pc, lr                          @ cpwait and return
> 
> This won't allow the building of this code for a v7 ARM kernel with
> current tools.
> 
> I think this is likely to be too much dirsuption: requiring really
> new tools for Thumb-2 is unavoidable, but we should allow people
> using existing tools to build this to carry on with the current
> situation for the time being.  This is because of the fact that
> not all current tools can support "isb" and the iwmmxt mnemonics
> in the same file.
> 
> One thing we could do is to code the ISB using .long, since we
> only need this for Thumb kernels.

You would have an ARM and a Thumb variant then, plus the legacy variant, 
right?

> This would allow us to get rid of the Makefile warning, since the
> generated code would be the preferred v7 code in that case.

Makes sense.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 17:13       ` Jon Medhurst (Tixy)
  2011-09-08 17:15         ` Eric Miao
@ 2011-09-08 17:33         ` Nicolas Pitre
  1 sibling, 0 replies; 30+ messages in thread
From: Nicolas Pitre @ 2011-09-08 17:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 8 Sep 2011, Jon Medhurst (Tixy) wrote:

> On Thu, 2011-09-08 at 10:03 -0700, Eric Miao wrote:
> > On Thu, Sep 8, 2011 at 9:45 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > > On Thursday 08 September 2011, Dave Martin wrote:
> > >> The iwmmxt code contains some code to implement a pseudo-ISB, but
> > >> this is not buildable for Thumb-2.
> > >>
> > >> This patch replaces the pseudo-ISB with a real one for Thumb-2
> > >> kernels.
> > >>
> > >> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> > >> ---
> > >>  arch/arm/kernel/iwmmxt.S |    9 +++++++++
> > >>  1 files changed, 9 insertions(+), 0 deletions(-)
> > >
> > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > >
> > 
> > Maybe it'll be much simpler to have something like below:
> > 
> > diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
> > index a087838..5998f7d 100644
> > --- a/arch/arm/kernel/iwmmxt.S
> > +++ b/arch/arm/kernel/iwmmxt.S
> > @@ -319,8 +319,9 @@ ENTRY(iwmmxt_task_switch)
> >         PJ4(eor r1, r1, #0xf)
> >         PJ4(mcr p15, 0, r1, c1, c0, 2)
> > 
> > -       mrc     p15, 0, r1, c2, c0, 0
> > -       sub     pc, lr, r1, lsr #32             @ cpwait and return
> > +       XSC(mrc p15, 0, r1, c2, c0, 0)
> > +       PJ4(isb)
> > +       mov     pc, lr                          @ cpwait and return
> 
> Don't we still need "sub pc, lr, r1, lsr #32" in the XSC case?
> I had assumed this was to introduce a data dependency. I.e. PC depends
> on value of R1 which is loaded by the MRC, so it can't complete until
> that has.

Yes, that's exactly the reason why this is coded that way.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 17:20       ` Dave Martin
  2011-09-08 17:32         ` Nicolas Pitre
@ 2011-09-08 18:49         ` Eric Miao
  2011-09-08 19:30           ` Nicolas Pitre
  2011-09-09  9:55           ` Jon Medhurst (Tixy)
  1 sibling, 2 replies; 30+ messages in thread
From: Eric Miao @ 2011-09-08 18:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Sep 8, 2011 at 10:20 AM, Dave Martin <dave.martin@linaro.org> wrote:
> On Thu, Sep 08, 2011 at 10:03:52AM -0700, Eric Miao wrote:
>> On Thu, Sep 8, 2011 at 9:45 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Thursday 08 September 2011, Dave Martin wrote:
>> >> The iwmmxt code contains some code to implement a pseudo-ISB, but
>> >> this is not buildable for Thumb-2.
>> >>
>> >> This patch replaces the pseudo-ISB with a real one for Thumb-2
>> >> kernels.
>> >>
>> >> Signed-off-by: Dave Martin <dave.martin@linaro.org>
>> >> ---
>> >> ?arch/arm/kernel/iwmmxt.S | ? ?9 +++++++++
>> >> ?1 files changed, 9 insertions(+), 0 deletions(-)
>> >
>> > Acked-by: Arnd Bergmann <arnd@arndb.de>
>> >
>>
>> Maybe it'll be much simpler to have something like below:
>>
>> diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
>> index a087838..5998f7d 100644
>> --- a/arch/arm/kernel/iwmmxt.S
>> +++ b/arch/arm/kernel/iwmmxt.S
>> @@ -319,8 +319,9 @@ ENTRY(iwmmxt_task_switch)
>> ? ? ? ? PJ4(eor r1, r1, #0xf)
>> ? ? ? ? PJ4(mcr p15, 0, r1, c1, c0, 2)
>>
>> - ? ? ? mrc ? ? p15, 0, r1, c2, c0, 0
>> - ? ? ? sub ? ? pc, lr, r1, lsr #32 ? ? ? ? ? ? @ cpwait and return
>> + ? ? ? XSC(mrc p15, 0, r1, c2, c0, 0)
>> + ? ? ? PJ4(isb)
>> + ? ? ? mov ? ? pc, lr ? ? ? ? ? ? ? ? ? ? ? ? ?@ cpwait and return
>
> This won't allow the building of this code for a v7 ARM kernel with
> current tools.

Ah I missed that point.

So the problem is really when compiling this file with existing toolchain,
it's downgrading to v5 compatible mode, and the instruction below

sub pc, lr, r1, lsr #32

wouldn't be encoded when building a THUMB2 kernel. Considering the
r1, lsr #32 is actually to create an explicit data dependency of the previous
co-processor instruction, would it be one option to rewrite this as something
like:

mov r1, r1
mov pc, lr

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 18:49         ` Eric Miao
@ 2011-09-08 19:30           ` Nicolas Pitre
  2011-09-09  9:55           ` Jon Medhurst (Tixy)
  1 sibling, 0 replies; 30+ messages in thread
From: Nicolas Pitre @ 2011-09-08 19:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 8 Sep 2011, Eric Miao wrote:

> So the problem is really when compiling this file with existing toolchain,
> it's downgrading to v5 compatible mode, and the instruction below
> 
> sub pc, lr, r1, lsr #32
> 
> wouldn't be encoded when building a THUMB2 kernel. Considering the
> r1, lsr #32 is actually to create an explicit data dependency of the previous
> co-processor instruction, would it be one option to rewrite this as something
> like:
> 
> mov r1, r1
> mov pc, lr

That is certainly a pretty simple, obvious and straight forward way to 
deal with this issue.  I think I like this even more.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-08 18:49         ` Eric Miao
  2011-09-08 19:30           ` Nicolas Pitre
@ 2011-09-09  9:55           ` Jon Medhurst (Tixy)
  2011-09-09 13:21             ` Nicolas Pitre
  1 sibling, 1 reply; 30+ messages in thread
From: Jon Medhurst (Tixy) @ 2011-09-09  9:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, 2011-09-08 at 11:49 -0700, Eric Miao wrote:
> So the problem is really when compiling this file with existing toolchain,
> it's downgrading to v5 compatible mode, and the instruction below
> 
> sub pc, lr, r1, lsr #32
> 
> wouldn't be encoded when building a THUMB2 kernel. Considering the
> r1, lsr #32 is actually to create an explicit data dependency of the previous
> co-processor instruction, would it be one option to rewrite this as something
> like:
> 
> mov r1, r1
> mov pc, lr

That doesn't include a data dependency of PC on R1, so it's possible for
MOV PC, LR and subsequent instructions to be executed before MOV R1, R1
has completed. We would want...

add lr, lr, r1, lsr #32
mov pc, lr

-- 
Tixy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-09  9:55           ` Jon Medhurst (Tixy)
@ 2011-09-09 13:21             ` Nicolas Pitre
  2011-09-09 14:05               ` Jon Medhurst (Tixy)
  2011-09-09 16:41               ` Dave Martin
  0 siblings, 2 replies; 30+ messages in thread
From: Nicolas Pitre @ 2011-09-09 13:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 9 Sep 2011, Jon Medhurst (Tixy) wrote:

> On Thu, 2011-09-08 at 11:49 -0700, Eric Miao wrote:
> > So the problem is really when compiling this file with existing toolchain,
> > it's downgrading to v5 compatible mode, and the instruction below
> > 
> > sub pc, lr, r1, lsr #32
> > 
> > wouldn't be encoded when building a THUMB2 kernel. Considering the
> > r1, lsr #32 is actually to create an explicit data dependency of the previous
> > co-processor instruction, would it be one option to rewrite this as something
> > like:
> > 
> > mov r1, r1
> > mov pc, lr
> 
> That doesn't include a data dependency of PC on R1, so it's possible for
> MOV PC, LR and subsequent instructions to be executed before MOV R1, R1
> has completed. We would want...
> 
> add lr, lr, r1, lsr #32
> mov pc, lr

But isn't the first insn unavailable with Thumb2?

Maybe something like:

	sub	r1, r1, r1
	add	pc, lr, r1


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-09 13:21             ` Nicolas Pitre
@ 2011-09-09 14:05               ` Jon Medhurst (Tixy)
  2011-09-09 16:41               ` Dave Martin
  1 sibling, 0 replies; 30+ messages in thread
From: Jon Medhurst (Tixy) @ 2011-09-09 14:05 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 2011-09-09 at 09:21 -0400, Nicolas Pitre wrote:
> On Fri, 9 Sep 2011, Jon Medhurst (Tixy) wrote:
> 
> > On Thu, 2011-09-08 at 11:49 -0700, Eric Miao wrote:
> > > So the problem is really when compiling this file with existing toolchain,
> > > it's downgrading to v5 compatible mode, and the instruction below
> > > 
> > > sub pc, lr, r1, lsr #32
> > > 
> > > wouldn't be encoded when building a THUMB2 kernel. Considering the
> > > r1, lsr #32 is actually to create an explicit data dependency of the previous
> > > co-processor instruction, would it be one option to rewrite this as something
> > > like:
> > > 
> > > mov r1, r1
> > > mov pc, lr
> > 
> > That doesn't include a data dependency of PC on R1, so it's possible for
> > MOV PC, LR and subsequent instructions to be executed before MOV R1, R1
> > has completed. We would want...
> > 
> > add lr, lr, r1, lsr #32
> > mov pc, lr
> 
> But isn't the first insn unavailable with Thumb2?

It is available in Thumb2.

> Maybe something like:
> 
> 	sub	r1, r1, r1
> 	add	pc, lr, r1

That's not allowed in Thumb2. ;-)
There aren't many instruction forms which are allowed to set PC.

-- 
Tixy

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-09 13:21             ` Nicolas Pitre
  2011-09-09 14:05               ` Jon Medhurst (Tixy)
@ 2011-09-09 16:41               ` Dave Martin
  2011-09-09 17:51                 ` Nicolas Pitre
  1 sibling, 1 reply; 30+ messages in thread
From: Dave Martin @ 2011-09-09 16:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 09, 2011 at 09:21:28AM -0400, Nicolas Pitre wrote:
> On Fri, 9 Sep 2011, Jon Medhurst (Tixy) wrote:
> 
> > On Thu, 2011-09-08 at 11:49 -0700, Eric Miao wrote:
> > > So the problem is really when compiling this file with existing toolchain,
> > > it's downgrading to v5 compatible mode, and the instruction below
> > > 
> > > sub pc, lr, r1, lsr #32
> > > 
> > > wouldn't be encoded when building a THUMB2 kernel. Considering the
> > > r1, lsr #32 is actually to create an explicit data dependency of the previous
> > > co-processor instruction, would it be one option to rewrite this as something
> > > like:
> > > 
> > > mov r1, r1
> > > mov pc, lr
> > 
> > That doesn't include a data dependency of PC on R1, so it's possible for
> > MOV PC, LR and subsequent instructions to be executed before MOV R1, R1
> > has completed. We would want...
> > 
> > add lr, lr, r1, lsr #32
> > mov pc, lr
> 
> But isn't the first insn unavailable with Thumb2?
> 
> Maybe something like:
> 
> 	sub	r1, r1, r1
> 	add	pc, lr, r1

Thumb-2 implies v7, the v7 implies that we have (and need) a real ISB.  So for the
Thumb-2 case, this should always become

	isb

Ideally, we should do this even for ARM v7 kernels, but in order not to break older
tools which don't understand -march=armv7-a+iwmmxt we may need to use a .long to
encode the ISB manually.  I'll give that a try...

Cheers
---Dave

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-09 16:41               ` Dave Martin
@ 2011-09-09 17:51                 ` Nicolas Pitre
  2011-09-12 14:37                   ` Dave Martin
  0 siblings, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2011-09-09 17:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, 9 Sep 2011, Dave Martin wrote:

> On Fri, Sep 09, 2011 at 09:21:28AM -0400, Nicolas Pitre wrote:
> > On Fri, 9 Sep 2011, Jon Medhurst (Tixy) wrote:
> > 
> > > On Thu, 2011-09-08 at 11:49 -0700, Eric Miao wrote:
> > > > So the problem is really when compiling this file with existing toolchain,
> > > > it's downgrading to v5 compatible mode, and the instruction below
> > > > 
> > > > sub pc, lr, r1, lsr #32
> > > > 
> > > > wouldn't be encoded when building a THUMB2 kernel. Considering the
> > > > r1, lsr #32 is actually to create an explicit data dependency of the previous
> > > > co-processor instruction, would it be one option to rewrite this as something
> > > > like:
> > > > 
> > > > mov r1, r1
> > > > mov pc, lr
> > > 
> > > That doesn't include a data dependency of PC on R1, so it's possible for
> > > MOV PC, LR and subsequent instructions to be executed before MOV R1, R1
> > > has completed. We would want...
> > > 
> > > add lr, lr, r1, lsr #32
> > > mov pc, lr
> > 
> > But isn't the first insn unavailable with Thumb2?
> > 
> > Maybe something like:
> > 
> > 	sub	r1, r1, r1
> > 	add	pc, lr, r1
> 
> Thumb-2 implies v7, the v7 implies that we have (and need) a real ISB.  So for the
> Thumb-2 case, this should always become
> 
> 	isb

Don't forget that here we are talking about the Marvell implementation.  
Since I've no doubt that the real isb certainly works, so far the 
advertized isb has always been implemented with a CP readback with a 
data dependency.  Don't forget that this CPU core can also pretend to be 
an ARMv6 and they probably didn't duplicate the whole instruction 
decoder for each modes.

So I really doubt the Marvell implementation would behave any 
differently if this special isb is expressed in terms of Thumb2 rather 
than ARM instructions as the backend is certainly common to all the 
modes. If we can use the same implementation for all modes then we'll 
just simplify maintenance of this code.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-09 17:51                 ` Nicolas Pitre
@ 2011-09-12 14:37                   ` Dave Martin
  2011-09-12 14:43                     ` Russell King - ARM Linux
  2011-09-12 14:55                     ` Nicolas Pitre
  0 siblings, 2 replies; 30+ messages in thread
From: Dave Martin @ 2011-09-12 14:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 09, 2011 at 01:51:30PM -0400, Nicolas Pitre wrote:
> On Fri, 9 Sep 2011, Dave Martin wrote:
> 
> > On Fri, Sep 09, 2011 at 09:21:28AM -0400, Nicolas Pitre wrote:
> > > On Fri, 9 Sep 2011, Jon Medhurst (Tixy) wrote:
> > > 
> > > > On Thu, 2011-09-08 at 11:49 -0700, Eric Miao wrote:
> > > > > So the problem is really when compiling this file with existing toolchain,
> > > > > it's downgrading to v5 compatible mode, and the instruction below
> > > > > 
> > > > > sub pc, lr, r1, lsr #32
> > > > > 
> > > > > wouldn't be encoded when building a THUMB2 kernel. Considering the
> > > > > r1, lsr #32 is actually to create an explicit data dependency of the previous
> > > > > co-processor instruction, would it be one option to rewrite this as something
> > > > > like:
> > > > > 
> > > > > mov r1, r1
> > > > > mov pc, lr
> > > > 
> > > > That doesn't include a data dependency of PC on R1, so it's possible for
> > > > MOV PC, LR and subsequent instructions to be executed before MOV R1, R1
> > > > has completed. We would want...
> > > > 
> > > > add lr, lr, r1, lsr #32
> > > > mov pc, lr
> > > 
> > > But isn't the first insn unavailable with Thumb2?
> > > 
> > > Maybe something like:
> > > 
> > > 	sub	r1, r1, r1
> > > 	add	pc, lr, r1
> > 
> > Thumb-2 implies v7, the v7 implies that we have (and need) a real ISB.  So for the
> > Thumb-2 case, this should always become
> > 
> > 	isb
> 
> Don't forget that here we are talking about the Marvell implementation.  
> Since I've no doubt that the real isb certainly works, so far the 
> advertized isb has always been implemented with a CP readback with a 
> data dependency.  Don't forget that this CPU core can also pretend to be 
> an ARMv6 and they probably didn't duplicate the whole instruction 
> decoder for each modes.

If the isb instruction doesn't work for this in v7 mode, I think that
would be an architecture violation, but you're right that this may
not be available, or may not behave identically, when the CPU is
behaving like a v6 part.

> So I really doubt the Marvell implementation would behave any 
> differently if this special isb is expressed in terms of Thumb2 rather 
> than ARM instructions as the backend is certainly common to all the 
> modes. If we can use the same implementation for all modes then we'll 
> just simplify maintenance of this code.

Agreed, but the trouble is that what constitutes a suitable data
dependency is completely microarchitecture dependent -- we're relying
on side-effects outside the architecture here.

We can't code _exactly_ the same thing in Thumb-2 because of restrictions
in the instruction set, so we have to settle for something that "looks
similar" -- but there's no guarantee it will work.

Do you feel your judgement on this is authoritative?  If so, then
we can go ahead with your suggestion; otherwise we might need input
from someone else.

Cheers
---Dave

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-12 14:37                   ` Dave Martin
@ 2011-09-12 14:43                     ` Russell King - ARM Linux
  2011-09-12 14:55                     ` Nicolas Pitre
  1 sibling, 0 replies; 30+ messages in thread
From: Russell King - ARM Linux @ 2011-09-12 14:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Sep 12, 2011 at 03:37:08PM +0100, Dave Martin wrote:
> Agreed, but the trouble is that what constitutes a suitable data
> dependency is completely microarchitecture dependent -- we're relying
> on side-effects outside the architecture here.
> 
> We can't code _exactly_ the same thing in Thumb-2 because of restrictions
> in the instruction set, so we have to settle for something that "looks
> similar" -- but there's no guarantee it will work.
> 
> Do you feel your judgement on this is authoritative?  If so, then
> we can go ahead with your suggestion; otherwise we might need input
> from someone else.

How about we ask Haojian Zhuang what the recommended sequence is for
ensuring that writes hit CP15 when executing in Thumb-2 mode on PJ4,
instead of blindly beating around the issue without really knowing
what we're doing?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-12 14:37                   ` Dave Martin
  2011-09-12 14:43                     ` Russell King - ARM Linux
@ 2011-09-12 14:55                     ` Nicolas Pitre
  2011-09-12 16:22                       ` Dave Martin
  1 sibling, 1 reply; 30+ messages in thread
From: Nicolas Pitre @ 2011-09-12 14:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, 12 Sep 2011, Dave Martin wrote:

> On Fri, Sep 09, 2011 at 01:51:30PM -0400, Nicolas Pitre wrote:
> > On Fri, 9 Sep 2011, Dave Martin wrote:
> > 
> > > On Fri, Sep 09, 2011 at 09:21:28AM -0400, Nicolas Pitre wrote:
> > > > On Fri, 9 Sep 2011, Jon Medhurst (Tixy) wrote:
> > > > 
> > > > > On Thu, 2011-09-08 at 11:49 -0700, Eric Miao wrote:
> > > > > > So the problem is really when compiling this file with existing toolchain,
> > > > > > it's downgrading to v5 compatible mode, and the instruction below
> > > > > > 
> > > > > > sub pc, lr, r1, lsr #32
> > > > > > 
> > > > > > wouldn't be encoded when building a THUMB2 kernel. Considering the
> > > > > > r1, lsr #32 is actually to create an explicit data dependency of the previous
> > > > > > co-processor instruction, would it be one option to rewrite this as something
> > > > > > like:
> > > > > > 
> > > > > > mov r1, r1
> > > > > > mov pc, lr
> > > > > 
> > > > > That doesn't include a data dependency of PC on R1, so it's possible for
> > > > > MOV PC, LR and subsequent instructions to be executed before MOV R1, R1
> > > > > has completed. We would want...
> > > > > 
> > > > > add lr, lr, r1, lsr #32
> > > > > mov pc, lr
> > > > 
> > > > But isn't the first insn unavailable with Thumb2?
> > > > 
> > > > Maybe something like:
> > > > 
> > > > 	sub	r1, r1, r1
> > > > 	add	pc, lr, r1
> > > 
> > > Thumb-2 implies v7, the v7 implies that we have (and need) a real ISB.  So for the
> > > Thumb-2 case, this should always become
> > > 
> > > 	isb
> > 
> > Don't forget that here we are talking about the Marvell implementation.  
> > Since I've no doubt that the real isb certainly works, so far the 
> > advertized isb has always been implemented with a CP readback with a 
> > data dependency.  Don't forget that this CPU core can also pretend to be 
> > an ARMv6 and they probably didn't duplicate the whole instruction 
> > decoder for each modes.
> 
> If the isb instruction doesn't work for this in v7 mode, I think that
> would be an architecture violation, but you're right that this may
> not be available, or may not behave identically, when the CPU is
> behaving like a v6 part.

I'm sure isb is available in ARMv7 mode.

What I'm saying is that the advertised trick for ARMv6 mode certainly 
must work just as well here in ARMv7 mode.

> > So I really doubt the Marvell implementation would behave any 
> > differently if this special isb is expressed in terms of Thumb2 rather 
> > than ARM instructions as the backend is certainly common to all the 
> > modes. If we can use the same implementation for all modes then we'll 
> > just simplify maintenance of this code.
> 
> Agreed, but the trouble is that what constitutes a suitable data
> dependency is completely microarchitecture dependent -- we're relying
> on side-effects outside the architecture here.

Sure.  But what is there now is what is documented by Marvell docs i.e. 
create a data dependency on the register used to read back the CP value.  
I've seen a few implementation variations on that theme too.

> We can't code _exactly_ the same thing in Thumb-2 because of restrictions
> in the instruction set, so we have to settle for something that "looks
> similar" -- but there's no guarantee it will work.
> 
> Do you feel your judgement on this is authoritative?  If so, then
> we can go ahead with your suggestion; otherwise we might need input
> from someone else.

My suggestion would be what Tixy also proposed:

	mrc	p15, 0, r1, c2, c0, 0
	add	lr, lr, r1, lsr #32
	mov	pc, lr

Running this past people still at Marvell (I'm not anymore) wouldn't 
hurt of course.


Nicolas

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2
  2011-09-12 14:55                     ` Nicolas Pitre
@ 2011-09-12 16:22                       ` Dave Martin
  0 siblings, 0 replies; 30+ messages in thread
From: Dave Martin @ 2011-09-12 16:22 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Sep 12, 2011 at 10:55:30AM -0400, Nicolas Pitre wrote:
> On Mon, 12 Sep 2011, Dave Martin wrote:
> 
> > On Fri, Sep 09, 2011 at 01:51:30PM -0400, Nicolas Pitre wrote:
> > > On Fri, 9 Sep 2011, Dave Martin wrote:
> > > 
> > > > On Fri, Sep 09, 2011 at 09:21:28AM -0400, Nicolas Pitre wrote:
> > > > > On Fri, 9 Sep 2011, Jon Medhurst (Tixy) wrote:
> > > > > 
> > > > > > On Thu, 2011-09-08 at 11:49 -0700, Eric Miao wrote:
> > > > > > > So the problem is really when compiling this file with existing toolchain,
> > > > > > > it's downgrading to v5 compatible mode, and the instruction below
> > > > > > > 
> > > > > > > sub pc, lr, r1, lsr #32
> > > > > > > 
> > > > > > > wouldn't be encoded when building a THUMB2 kernel. Considering the
> > > > > > > r1, lsr #32 is actually to create an explicit data dependency of the previous
> > > > > > > co-processor instruction, would it be one option to rewrite this as something
> > > > > > > like:
> > > > > > > 
> > > > > > > mov r1, r1
> > > > > > > mov pc, lr
> > > > > > 
> > > > > > That doesn't include a data dependency of PC on R1, so it's possible for
> > > > > > MOV PC, LR and subsequent instructions to be executed before MOV R1, R1
> > > > > > has completed. We would want...
> > > > > > 
> > > > > > add lr, lr, r1, lsr #32
> > > > > > mov pc, lr
> > > > > 
> > > > > But isn't the first insn unavailable with Thumb2?
> > > > > 
> > > > > Maybe something like:
> > > > > 
> > > > > 	sub	r1, r1, r1
> > > > > 	add	pc, lr, r1
> > > > 
> > > > Thumb-2 implies v7, the v7 implies that we have (and need) a real ISB.  So for the
> > > > Thumb-2 case, this should always become
> > > > 
> > > > 	isb
> > > 
> > > Don't forget that here we are talking about the Marvell implementation.  
> > > Since I've no doubt that the real isb certainly works, so far the 
> > > advertized isb has always been implemented with a CP readback with a 
> > > data dependency.  Don't forget that this CPU core can also pretend to be 
> > > an ARMv6 and they probably didn't duplicate the whole instruction 
> > > decoder for each modes.
> > 
> > If the isb instruction doesn't work for this in v7 mode, I think that
> > would be an architecture violation, but you're right that this may
> > not be available, or may not behave identically, when the CPU is
> > behaving like a v6 part.
> 
> I'm sure isb is available in ARMv7 mode.
> 
> What I'm saying is that the advertised trick for ARMv6 mode certainly 
> must work just as well here in ARMv7 mode.
> 
> > > So I really doubt the Marvell implementation would behave any 
> > > differently if this special isb is expressed in terms of Thumb2 rather 
> > > than ARM instructions as the backend is certainly common to all the 
> > > modes. If we can use the same implementation for all modes then we'll 
> > > just simplify maintenance of this code.
> > 
> > Agreed, but the trouble is that what constitutes a suitable data
> > dependency is completely microarchitecture dependent -- we're relying
> > on side-effects outside the architecture here.
> 
> Sure.  But what is there now is what is documented by Marvell docs i.e. 
> create a data dependency on the register used to read back the CP value.  
> I've seen a few implementation variations on that theme too.
> 
> > We can't code _exactly_ the same thing in Thumb-2 because of restrictions
> > in the instruction set, so we have to settle for something that "looks
> > similar" -- but there's no guarantee it will work.
> > 
> > Do you feel your judgement on this is authoritative?  If so, then
> > we can go ahead with your suggestion; otherwise we might need input
> > from someone else.
> 
> My suggestion would be what Tixy also proposed:
> 
> 	mrc	p15, 0, r1, c2, c0, 0
> 	add	lr, lr, r1, lsr #32
> 	mov	pc, lr
>
> Running this past people still at Marvell (I'm not anymore) wouldn't 
> hurt of course.

Sure, that might work, and if we have a sequence which works for all
build configurations, that's great.  But it needs to work for the right
reasons, otherwise we may be storing up nasty surprises for later.

As you/Russell suggest, I think Haojian or someone should comment.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2011-09-12 16:22 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-09-08 16:04 [PATCH v2 0/3] ARM: iwmmxt/pj4 fixes for v7/Thumb-2 Dave Martin
2011-09-08 16:04 ` [PATCH v2 1/3] ARM: iwmmxt: Fix Makefile rules for building iwmmxt for Thumb-2 Dave Martin
2011-09-08 16:45   ` Arnd Bergmann
2011-09-08 17:14     ` Dave Martin
2011-09-08 17:09   ` Nicolas Pitre
2011-09-08 16:04 ` [PATCH v2 2/3] ARM: iwmmxt: Port problematic iwmmxt support code to v7/Thumb-2 Dave Martin
2011-09-08 16:45   ` Arnd Bergmann
2011-09-08 17:03     ` Eric Miao
2011-09-08 17:13       ` Jon Medhurst (Tixy)
2011-09-08 17:15         ` Eric Miao
2011-09-08 17:33         ` Nicolas Pitre
2011-09-08 17:20       ` Dave Martin
2011-09-08 17:32         ` Nicolas Pitre
2011-09-08 18:49         ` Eric Miao
2011-09-08 19:30           ` Nicolas Pitre
2011-09-09  9:55           ` Jon Medhurst (Tixy)
2011-09-09 13:21             ` Nicolas Pitre
2011-09-09 14:05               ` Jon Medhurst (Tixy)
2011-09-09 16:41               ` Dave Martin
2011-09-09 17:51                 ` Nicolas Pitre
2011-09-12 14:37                   ` Dave Martin
2011-09-12 14:43                     ` Russell King - ARM Linux
2011-09-12 14:55                     ` Nicolas Pitre
2011-09-12 16:22                       ` Dave Martin
2011-09-08 16:04 ` [PATCH v2 3/3] ARM: pxa/pj4: Port problematic pj4 " Dave Martin
2011-09-08 16:44   ` Arnd Bergmann
2011-09-08 17:06     ` Eric Miao
2011-09-08 17:23     ` Dave Martin
2011-09-08 17:06   ` Nicolas Pitre
2011-09-08 17:06     ` Eric Miao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).