linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH] arm: Fix mounting root on omaps with CPU_V6 and CPU_V7
@ 2010-03-15 22:25 Tony Lindgren
  0 siblings, 0 replies; 15+ messages in thread
From: Tony Lindgren @ 2010-03-15 22:25 UTC (permalink / raw)
  To: linux-arm-kernel

To mount root on omap2420, we need to disable VFPv3 and
HAS_TLS_REG.

VFPv3 is only available on CPU_V7. The TLS register is only
available on ARM11 from r1p0 onward. As omap2420 is r0p2,
it does not have the TLS register.

Otherwise we'll get something like this for VFPv3:

Freeing init memory: 184K
Internal error: Oops - undefined instruction: 0 [#1]
last sysfs file:
Modules linked in:
CPU: 0    Not tainted  (2.6.33-rc8-07824-gf2e1d91-dirty #36)
PC is at no_old_VFP_process+0x8/0x3c
LR is at __und_usr_unknown+0x0/0x14
...

Or the system just hangs if HAS_TLS_REG is set.

Signed-off-by: Tony Lindgren <tony@atomide.com>

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index d97d893..409ae23 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1549,7 +1549,7 @@ config VFP
 config VFPv3
 	bool
 	depends on VFP
-	default y if CPU_V7
+	default y if CPU_V7 && !CPU_V6
 
 config NEON
 	bool "Advanced SIMD (NEON) Extension support"
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index c4ed9f9..ff0c829 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -718,11 +718,11 @@ config TLS_REG_EMUL
 config HAS_TLS_REG
 	bool
 	depends on !TLS_REG_EMUL
-	default y if SMP || CPU_32v7
+	default y if (SMP || CPU_32v7) && !ARCH_OMAP2
 	help
 	  This selects support for the CP15 thread register.
-	  It is defined to be available on some ARMv6 processors (including
-	  all SMP capable ARMv6's) or later processors.  User space may
+	  It is defined to be available on some ARMv6 processors (r1p0 and
+	  later, including all SMP capable ARMv6's).  User space may
 	  assume directly accessing that register and always obtain the
 	  expected value only on ARMv7 and above.
 



* [PATCH] arm: Fix mounting root on omaps with CPU_V6 and CPU_V7
@ 2010-03-17 17:57 Tony Lindgren
  2010-03-17 18:07 ` Catalin Marinas
  0 siblings, 1 reply; 15+ messages in thread
From: Tony Lindgren @ 2010-03-17 17:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

Here's an updated version of this patch with more details.

Looks like VFPv3 is only available on V7:

http://www.arm.com/products/processors/technologies/vector-floating-point.php

HAS_TLS reg is only on ARM11 starting with r1p0:

http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211k/Babeihid.html

So that explains why it won't work on omap2420 as it's r0p2.
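
As a rough illustration of the rNpM naming used here, the sketch below
formats the variant and revision fields of a CPU ID value. The field
positions (variant in bits 23:20, revision in bits 3:0) follow the ARM
main ID register layout; the helper name and the sample value 0x4107b362
in the usage note are assumptions for illustration, not values taken from
this thread.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format the rNpM revision encoded in a CPU ID value. */
static void sk_midr_revision(uint32_t midr, char *buf, size_t len)
{
	unsigned int variant  = (midr >> 20) & 0xf;	/* the "r" number */
	unsigned int revision = midr & 0xf;		/* the "p" number */

	snprintf(buf, len, "r%up%u", variant, revision);
}
```

With an assumed ID of 0x4107b362 this fills the buffer with "r0p2",
which is below the r1p0 cutoff mentioned above.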

Regards,

Tony


* [PATCH] arm: Fix mounting root on omaps with CPU_V6 and CPU_V7
  2010-03-17 17:57 [PATCH] arm: Fix mounting root on omaps with CPU_V6 and CPU_V7 Tony Lindgren
@ 2010-03-17 18:07 ` Catalin Marinas
  2010-03-17 19:11   ` Tony Lindgren
  0 siblings, 1 reply; 15+ messages in thread
From: Catalin Marinas @ 2010-03-17 18:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2010-03-17 at 17:57 +0000, Tony Lindgren wrote:
> Here's an updated version of this patch with more details.
> 
> Looks like VFPv3 is only available on V7:
> 
> http://www.arm.com/products/processors/technologies/vector-floating-point.php

But does it cause any problem if the feature is enabled in the kernel?
The vfp_init() code should check for its presence and set the hwcap
accordingly.

Ideally, we should fix the VFP handling code to cope with dynamic
detection.
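
To make the idea of dynamic detection concrete, here is a minimal
user-space sketch of how capability flags might be derived from an
FPSID-style value at init time. It is not the kernel's vfp_init(); the
SK_ names and flag bits are made up, and it assumes the architecture
number sits in bits 19:16 with 2 or higher meaning VFPv3.

```c
#include <stdint.h>

/* Illustrative flag bits; the real values live in the kernel's hwcap header. */
#define SK_HWCAP_VFP	(1u << 0)
#define SK_HWCAP_VFPV3	(1u << 1)

/*
 * Hypothetical sketch: derive capability flags from an FPSID-style value.
 * Assumes the architecture number is in bits 19:16 and that a value of
 * 2 or higher means VFPv3.
 */
static uint32_t sk_vfp_hwcap(int vfp_present, uint32_t fpsid)
{
	uint32_t hwcap = 0;
	unsigned int arch;

	if (!vfp_present)
		return 0;

	hwcap |= SK_HWCAP_VFP;
	arch = (fpsid >> 16) & 0xf;
	if (arch >= 2)
		hwcap |= SK_HWCAP_VFPV3;

	return hwcap;
}
```

The point is only that the VFPv3 decision moves from a Kconfig default
to a value computed once at boot, which later code can test.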

> HAS_TLS reg is only on ARM11 starting with r1p0:
> 
> http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211k/Babeihid.html
> 
> So that explains why it won't work on omap2420 as it's r0p2.

Same here, would it work with dynamic detection?

I would like to get v6+v7 support working fine together on RealView
boards as well (though not much spare time) but without disabling the
features that are present on v7 if they can be detected at run-time.

-- 
Catalin


* [PATCH] arm: Fix mounting root on omaps with CPU_V6 and CPU_V7
  2010-03-17 18:07 ` Catalin Marinas
@ 2010-03-17 19:11   ` Tony Lindgren
  2010-03-18 11:13     ` Catalin Marinas
  0 siblings, 1 reply; 15+ messages in thread
From: Tony Lindgren @ 2010-03-17 19:11 UTC (permalink / raw)
  To: linux-arm-kernel

* Catalin Marinas <catalin.marinas@arm.com> [100317 11:04]:
> On Wed, 2010-03-17 at 17:57 +0000, Tony Lindgren wrote:
> > Here's an updated version of this patch with more details.
> > 
> > Looks like VFPv3 is only available on V7:
> > 
> > http://www.arm.com/products/processors/technologies/vector-floating-point.php
> 
> But does it cause any problem if the feature is enabled in the kernel?
> The vfp_init() code should check for its presence and set the hwcap
> accordingly.

Yeah, it causes the problem posted in the patch description. I took a
quick look at it and at least the VFPFMRX in vfpmacros.h for CONFIG_VFPv3
is a problem.

Also I think we would need to have separate vfp_get_double functions
in vfphw.S for VFPv2 and 3 that get used based on the features.
 
> Ideally, we should fix the VFP handling code to cope with dynamic
> detection.

I agree, being able to boot the same kernel and avoiding tens of
recompiles to test something is a major time saver :)
 
> > HAS_TLS reg is only on ARM11 starting with r1p0:
> > 
> > http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211k/Babeihid.html
> > 
> > So that explains why it won't work on omap2420 as it's r0p2.
> 
> Same here, would it work with dynamic detection?

Hmm I believe here the problem is __switch_to in entry-armv.S.
I don't think we want to dynamically test it every time.. Or
at least it would have to be optimized out in most cases.

> I would like to get v6+v7 support working fine together on RealView
> boards as well (though not much spare time) but without disabling the
> features that are present on v7 if they can be detected at run-time.

I totally agree with you there.

Regards,

Tony


* [PATCH] arm: Fix mounting root on omaps with CPU_V6 and CPU_V7
  2010-03-17 19:11   ` Tony Lindgren
@ 2010-03-18 11:13     ` Catalin Marinas
  2010-03-18 17:00       ` Tony Lindgren
  0 siblings, 1 reply; 15+ messages in thread
From: Catalin Marinas @ 2010-03-18 11:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, 2010-03-17 at 19:11 +0000, Tony Lindgren wrote:
> * Catalin Marinas <catalin.marinas@arm.com> [100317 11:04]:
> > On Wed, 2010-03-17 at 17:57 +0000, Tony Lindgren wrote:
> > > Here's an updated version of this patch with more details.
> > >
> > > Looks like VFPv3 is only available on V7:
> > >
> > > http://www.arm.com/products/processors/technologies/vector-floating-point.php
> >
> > But does it cause any problem if the feature is enabled in the kernel?
> > The vfp_init() code should check for its presence and set the hwcap
> > accordingly.
> 
> Yeah, it causes the problem posted in the patch description. I took a
> quick look at it and at least the VFPFMRX in vfpmacros.h for CONFIG_VFPv3
> is a problem.

This would indeed need more checking to avoid reading some registers
which aren't present on ARMv6.

I think the main problem with just falling back to VFPv2 is the lack of
NEON support even if the CPU supports it.

> Also I think we would need to have separate vfp_get_double functions
> in vfphw.S for VFPv2 and 3 that get used based on the features.

I don't think that's causing problems (or at least we can identify where
the function gets called for higher VFP registers). Even with VFPv3, you
may not have the D16-D31 registers.

> > > HAS_TLS reg is only on ARM11 starting with r1p0:
> > >
> > > http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211k/Babeihid.html
> > >
> > > So that explains why it won't work on omap2420 as it's r0p2.
> >
> > Same here, would it work with dynamic detection?
> 
> Hmm I believe here the problem is __switch_to in entry-armv.S.
> I don't think we want to dynamically test it every time.. Or
> at least it would have to be optimized out in most cases.

But if you disable this, you won't be able to use an SMP build on both
v6 and v7. Anyway, I don't think that dynamically checking this would
introduce performance penalties, the __switch_to code is pretty complex
already with all the notifier calls.
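
A minimal sketch of what such a dynamic check amounts to, modelled in
user-space C rather than the real assembly. The struct, the SK_ names,
and the flag bit are hypothetical stand-ins so the logic can be run
anywhere; the real code would issue an mcr or a store to the vector page.

```c
#include <stdint.h>

#define SK_HWCAP_TLS	(1u << 0)	/* illustrative bit, not the kernel's value */

/* Stand-ins for hardware and kernel state so the sketch runs in user space. */
struct sk_cpu {
	uint32_t hwcap;		/* capability flags detected at boot */
	uint32_t tls_reg;	/* models the CP15 thread register */
	uint32_t tls_word;	/* models the word at 0xffff0ff0 */
};

/* Hypothetical context-switch hook: pick the TLS store at run time. */
static void sk_set_tls(struct sk_cpu *cpu, uint32_t next_tls)
{
	if (cpu->hwcap & SK_HWCAP_TLS)
		cpu->tls_reg = next_tls;	/* would be the mcr to c13 */
	else
		cpu->tls_word = next_tls;	/* software fallback */
}
```

The extra cost per switch is one load and one test, which is the
overhead being weighed against the rest of the switch path.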

We may also have optimised user space that reads the TLS register
directly rather than going through the kuser helper, in which case we
would need a kernel built only for ARMv7 (maybe that's acceptable in
this situation).

-- 
Catalin


* [PATCH] arm: Fix mounting root on omaps with CPU_V6 and CPU_V7
  2010-03-18 11:13     ` Catalin Marinas
@ 2010-03-18 17:00       ` Tony Lindgren
  2010-03-19  1:35         ` [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6 Tony Lindgren
  0 siblings, 1 reply; 15+ messages in thread
From: Tony Lindgren @ 2010-03-18 17:00 UTC (permalink / raw)
  To: linux-arm-kernel

* Catalin Marinas <catalin.marinas@arm.com> [100318 04:10]:
> On Wed, 2010-03-17 at 19:11 +0000, Tony Lindgren wrote:
> > * Catalin Marinas <catalin.marinas@arm.com> [100317 11:04]:
> > > On Wed, 2010-03-17 at 17:57 +0000, Tony Lindgren wrote:
> > > > Here's an updated version of this patch with more details.
> > > >
> > > > Looks like VFPv3 is only available on V7:
> > > >
> > > > http://www.arm.com/products/processors/technologies/vector-floating-point.php
> > >
> > > But does it cause any problem if the feature is enabled in the kernel?
> > > The vfp_init() code should check for its presence and set the hwcap
> > > accordingly.
> > 
> > Yeah, it causes the problem posted in the patch description. I took a
> > quick look at it and at least the VFPFMRX in vfpmacros.h for CONFIG_VFPv3
> > is a problem.
> 
> This would indeed need more checking to avoid reading some registers
> which aren't present on ARMv6.
> 
> I think the main problem with just falling back to VFPv2 is the lack of
> NEON support even if the CPU supports it.

Yeah it would be nice to have things also working in a reasonably fast
and usable way for distros etc.
 
> > Also I think we would need to have separate vfp_get_double functions
> > in vfphw.S for VFPv2 and 3 that get used based on the features.
> 
> I don't think that's causing problems (or at least we can identify where
> the function gets called for higher VFP registers). Even with VFPv3, you
> may not have the D16-D31 registers.

OK. There's also an ifdef else define for VFP_REG_ZERO in vfp.h. Sounds
like that test would also need to be done dynamically.
 
> > > > HAS_TLS reg is only on ARM11 starting with r1p0:
> > > >
> > > > http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211k/Babeihid.html
> > > >
> > > > So that explains why it won't work on omap2420 as it's r0p2.
> > >
> > > Same here, would it work with dynamic detection?
> > 
> > Hmm I believe here the problem is __switch_to in entry-armv.S.
> > I don't think we want to dynamically test it every time.. Or
> > at least it would have to be optimized out in most cases.
> 
> But if you disable this, you won't be able to use an SMP build on both
> v6 and v7. Anyway, I don't think that dynamically checking this would
> introduce performance penalties, the __switch_to code is pretty complex
> already with all the notifier calls.

OK. I'll take a look at making TLS a HWCAP flag.
 
> We may also have optimised user space that reads the TLS register
> directly rather than going through the kuser helper, in which case we
> would need a kernel built only for ARMv7 (maybe that's acceptable in
> this situation).

Sounds like more of a hassle to me :)

Tony


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-18 17:00       ` Tony Lindgren
@ 2010-03-19  1:35         ` Tony Lindgren
  2010-03-19  3:24           ` Tony Lindgren
                             ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Tony Lindgren @ 2010-03-19  1:35 UTC (permalink / raw)
  To: linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [100318 09:55]:
> * Catalin Marinas <catalin.marinas@arm.com> [100318 04:10]:
> > On Wed, 2010-03-17 at 19:11 +0000, Tony Lindgren wrote:
> > > * Catalin Marinas <catalin.marinas@arm.com> [100317 11:04]:
> > > > On Wed, 2010-03-17 at 17:57 +0000, Tony Lindgren wrote:
> > > > > HAS_TLS reg is only on ARM11 starting with r1p0:
> > > > >
> > > > > http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211k/Babeihid.html
> > > > >
> > > > > So that explains why it won't work on omap2420 as it's r0p2.
> > > >
> > > > Same here, would it work with dynamic detection?
> > > 
> > > Hmm I believe here the problem is __switch_to in entry-armv.S.
> > > I don't think we want to dynamically test it every time.. Or
> > > at least it would have to be optimized out in most cases.
> > 
> > But if you disable this, you won't be able to use an SMP build on both
> > v6 and v7. Anyway, I don't think that dynamically checking this would
> > introduce performance penalties, the __switch_to code is pretty complex
> > already with all the notifier calls.
> 
> OK. I'll take a look at making TLS a HWCAP flag.

Below is a patch to convert CONFIG_HAS_TLS_REG into HWCAP_TLS.

I've tested it with V6 r0p2 with no HWCAP_TLS, and V7 that has HWCAP_TLS.
I also forced CONFIG_TLS_REG_EMUL and booted on V6 r0p2, and it booted OK.

Could somebody please test this patch on a real CONFIG_TLS_REG_EMUL
system?

Also, I wonder if the change to __kuser_get_tls is safe?

I changed it to assume that if 0xffff0ff0 == 0, then we have HWCAP_TLS.

Regards,

Tony


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-19  1:35         ` [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6 Tony Lindgren
@ 2010-03-19  3:24           ` Tony Lindgren
  2010-03-19  3:46           ` Jamie Lokier
  2010-03-19  8:53           ` Russell King - ARM Linux
  2 siblings, 0 replies; 15+ messages in thread
From: Tony Lindgren @ 2010-03-19  3:24 UTC (permalink / raw)
  To: linux-arm-kernel

* Tony Lindgren <tony@atomide.com> [100318 18:31]:
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -269,6 +269,24 @@ static void __init cacheid_init(void)
>  extern struct proc_info_list *lookup_processor_type(unsigned int);
>  extern struct machine_desc *lookup_machine_type(unsigned int);
>  
> +#ifdef CONFIG_CPU_V6
> +static void __init feat_v6_fixup(void)
> +{
> +	int id = read_cpuid_id();
> +
> +	if ((id & 0x000f0000) != 0x00070000)
> +		return;
> +
> +	/* HWCAP_TLS is available only on V6 r1p0 and later */
> +	if (((id >> 20) & 3) == 0)
> +		elf_hwcap &= ~HWCAP_TLS;
> +}

This test probably needs to only look at ARM1136, and ignore others
such as ARM1176. Will take a look tomorrow.
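
Something along these lines is what I have in mind, written as a
standalone sketch rather than the actual patch. It assumes the part
number is in bits 15:4 of the ID and that ARM1136 reports 0xb36 there;
the SK_ names and the flag bit are placeholders.

```c
#include <stdint.h>

#define SK_HWCAP_TLS	(1u << 0)	/* illustrative bit value */
#define SK_PART_ARM1136	0xb36u		/* assumed part number for ARM1136 */

/*
 * Hypothetical variant of the fixup: only ARM1136 parts with an r0
 * variant lose the TLS capability; other IDs are left alone.
 */
static uint32_t sk_feat_v6_fixup(uint32_t id, uint32_t hwcap)
{
	unsigned int part    = (id >> 4) & 0xfff;
	unsigned int variant = (id >> 20) & 0xf;

	if (part == SK_PART_ARM1136 && variant == 0)
		hwcap &= ~SK_HWCAP_TLS;

	return hwcap;
}
```

Keying on the part number keeps other V6 cores from losing the flag
just because their variant field happens to be zero.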

Tony


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-19  1:35         ` [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6 Tony Lindgren
  2010-03-19  3:24           ` Tony Lindgren
@ 2010-03-19  3:46           ` Jamie Lokier
  2010-03-19  8:54             ` Russell King - ARM Linux
  2010-03-19  8:53           ` Russell King - ARM Linux
  2 siblings, 1 reply; 15+ messages in thread
From: Jamie Lokier @ 2010-03-19  3:46 UTC (permalink / raw)
  To: linux-arm-kernel

Tony Lindgren wrote:
> Also, I wonder if the change to __kuser_get_tls is safe?
> 
> +	ldr     r0, [pc, #(16 - 8)]		@ TLS set at 0xffff0ff0?
> +	cmp	r0, #0				@ assume hw TLS if not set
> +	mrceq	p15, 0, r0, c13, c0, 3		@ read TLS register

You cannot assume the TLS value is non-zero, because it's provided by
userspace to use however it wants.  It doesn't even have to be an address.
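
To spell out the failure case, here is a small C model of the selection
logic in the quoted hunk. The enum and function names are hypothetical;
the only thing modelled is the "zero means hardware TLS" rule.

```c
#include <stdint.h>

enum sk_tls_source { SK_TLS_FROM_WORD, SK_TLS_FROM_HW };

/* Models the quoted logic: a zero word at 0xffff0ff0 selects the register. */
static enum sk_tls_source sk_pick_source(uint32_t tls_word)
{
	return tls_word == 0 ? SK_TLS_FROM_HW : SK_TLS_FROM_WORD;
}
```

On a CPU without the register, a thread that legitimately sets its TLS
value to 0 gets SK_TLS_FROM_HW here, so the helper would go on to
execute the mrc that the CPU cannot handle.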

I'm thinking, why not an alternative() macro like on x86, which is a
very nice way to describe run-time patches of one or a few instructions
which depend on arch feature bits.

Then all that switch_to() logic could be made the size it was before.

An alternative() macro could make a lot of other chip-dependent calls
smaller too, i.e. all those which dispatch through function pointers
at present for cache flushing etc - they could become direct calls, or
an inline instruction or two when possible.
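
Roughly, the mechanism is a table of patch sites applied once at boot.
The sketch below models that in plain C with the kernel text as an array
of words; it is not the x86 implementation, every name in it is made up,
and it only works if the text being patched is writable.

```c
#include <stddef.h>
#include <stdint.h>

/* One patch site: replace text[index] when the feature bit is present. */
struct sk_alt {
	size_t   index;
	uint32_t feature;
	uint32_t replacement;
};

/* Apply every matching entry once; returns the number of words patched. */
static size_t sk_apply_alternatives(uint32_t *text, size_t text_len,
				    const struct sk_alt *alts, size_t n,
				    uint32_t features)
{
	size_t i, patched = 0;

	for (i = 0; i < n; i++) {
		if (!(features & alts[i].feature))
			continue;
		if (alts[i].index >= text_len)
			continue;	/* ignore out-of-range entries */
		text[alts[i].index] = alts[i].replacement;
		patched++;
	}
	return patched;
}
```

After the one-time pass the hot path contains no feature test at all.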

-- Jamie


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-19  1:35         ` [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6 Tony Lindgren
  2010-03-19  3:24           ` Tony Lindgren
  2010-03-19  3:46           ` Jamie Lokier
@ 2010-03-19  8:53           ` Russell King - ARM Linux
  2010-03-19 15:58             ` Tony Lindgren
  2 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2010-03-19  8:53 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Mar 18, 2010 at 06:35:21PM -0700, Tony Lindgren wrote:
> -#if defined(CONFIG_HAS_TLS_REG)
> -	mcr	p15, 0, r3, c13, c0, 3		@ set TLS register
> -#elif !defined(CONFIG_TLS_REG_EMUL)
> -	mov	r4, #0xffff0fff
> -	str	r3, [r4, #-15]			@ TLS val at 0xffff0ff0
> +#if !defined(CONFIG_TLS_REG_EMUL)
> +	ldr	r4, =elf_hwcap
> +	ldr	r4, [r4, #0]
> +	tst	r4, #HWCAP_TLS			@ hardware with TLS?

This is really really inefficient.  Both the second ldr and tst will stall
the pipeline because they need to wait for the result of the preceding
ldr.  Can we do better by re-ordering some instructions?

Also, the ifndef seems incorrect - if we have TLS_REG_EMUL we seem to omit
all this code.


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-19  3:46           ` Jamie Lokier
@ 2010-03-19  8:54             ` Russell King - ARM Linux
  2010-03-19 15:32               ` Tony Lindgren
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2010-03-19  8:54 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Mar 19, 2010 at 03:46:45AM +0000, Jamie Lokier wrote:
> I'm thinking, why not an alternative() macro like on x86, which is a
> very nice way to describe run-time patches of one or a few instructions
> which depend on arch feature bits.

Having XIP support prevents that kind of thing.


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-19  8:54             ` Russell King - ARM Linux
@ 2010-03-19 15:32               ` Tony Lindgren
  0 siblings, 0 replies; 15+ messages in thread
From: Tony Lindgren @ 2010-03-19 15:32 UTC (permalink / raw)
  To: linux-arm-kernel

* Russell King - ARM Linux <linux@arm.linux.org.uk> [100319 01:50]:
> On Fri, Mar 19, 2010 at 03:46:45AM +0000, Jamie Lokier wrote:
> > I'm thinking, why not an alternative() macro like on x86, which is a
> > very nice way to describe run-time patches of one or a few instructions
> > which depend on arch feature bits.
> 
> Having XIP support prevents that kind of thing.

How about we store the HWCAP_TLS flag into 0xffff0ff4 for
__kuser_get_tls? That way the userspace won't be able to set
it.
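
In C terms the idea would look something like the sketch below. The
struct is a hypothetical model of the two words in the vector page; the
field and function names are not from any patch.

```c
#include <stdint.h>

/* Hypothetical model of the two words near the top of the vector page. */
struct sk_kuser_page {
	uint32_t tls_word;	/* models 0xffff0ff0, value chosen by user space */
	uint32_t has_hw_tls;	/* models 0xffff0ff4, written only by the kernel */
};

/* The flag word, not the TLS value itself, decides where to read from. */
static uint32_t sk_get_tls(const struct sk_kuser_page *page, uint32_t hw_reg)
{
	return page->has_hw_tls ? hw_reg : page->tls_word;
}
```

A TLS value of 0 on a software-TLS CPU is then returned as 0 instead of
being mistaken for "use the hardware register".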

Tony


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-19  8:53           ` Russell King - ARM Linux
@ 2010-03-19 15:58             ` Tony Lindgren
  2010-03-23  0:16               ` Russell King - ARM Linux
  0 siblings, 1 reply; 15+ messages in thread
From: Tony Lindgren @ 2010-03-19 15:58 UTC (permalink / raw)
  To: linux-arm-kernel

* Russell King - ARM Linux <linux@arm.linux.org.uk> [100319 01:49]:
> On Thu, Mar 18, 2010 at 06:35:21PM -0700, Tony Lindgren wrote:
> > -#if defined(CONFIG_HAS_TLS_REG)
> > -	mcr	p15, 0, r3, c13, c0, 3		@ set TLS register
> > -#elif !defined(CONFIG_TLS_REG_EMUL)
> > -	mov	r4, #0xffff0fff
> > -	str	r3, [r4, #-15]			@ TLS val at 0xffff0ff0
> > +#if !defined(CONFIG_TLS_REG_EMUL)
> > +	ldr	r4, =elf_hwcap
> > +	ldr	r4, [r4, #0]
> > +	tst	r4, #HWCAP_TLS			@ hardware with TLS?
> 
> This is really really inefficient.  Both the second ldr and tst will stall
> the pipeline because they need to wait for the result of the preceding
> ldr.  Can we do better by re-ordering some instructions?

Or set ifdef CONFIG_CPU_V6 and test for the cp15 id register every time..
 
> Also, the ifndef seems incorrect - if we have TLS_REG_EMUL we seem to omit
> all this code.

Is the current ifdef elif wrong? The current code does not seem to
do anything if TLS_REG_EMUL is set and HAS_TLS_REG is not set.
HAS_TLS_REG depends on !TLS_REG_EMUL.
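
As a sanity check of that reading, the existing #if/#elif can be
restated as a small C function over the two options. This is only a
model of the preprocessor logic in the removed lines; the enum and
function names are made up.

```c
enum sk_tls_action { SK_TLS_MCR, SK_TLS_STORE_WORD, SK_TLS_NOTHING };

/* Mirrors: #if HAS_TLS_REG ... #elif !TLS_REG_EMUL ... #endif */
static enum sk_tls_action sk_switch_tls_action(int has_tls_reg, int tls_reg_emul)
{
	if (has_tls_reg)
		return SK_TLS_MCR;		/* set the CP15 register */
	else if (!tls_reg_emul)
		return SK_TLS_STORE_WORD;	/* store to 0xffff0ff0 */
	return SK_TLS_NOTHING;			/* emulation: no code emitted */
}
```

Since HAS_TLS_REG cannot be set together with TLS_REG_EMUL, only three
of the four input combinations can occur, one per return value.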

Regards,

Tony


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-19 15:58             ` Tony Lindgren
@ 2010-03-23  0:16               ` Russell King - ARM Linux
  2010-03-23  0:54                 ` Tony Lindgren
  0 siblings, 1 reply; 15+ messages in thread
From: Russell King - ARM Linux @ 2010-03-23  0:16 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Mar 19, 2010 at 08:58:05AM -0700, Tony Lindgren wrote:
> * Russell King - ARM Linux <linux@arm.linux.org.uk> [100319 01:49]:
> > On Thu, Mar 18, 2010 at 06:35:21PM -0700, Tony Lindgren wrote:
> > > -#if defined(CONFIG_HAS_TLS_REG)
> > > -	mcr	p15, 0, r3, c13, c0, 3		@ set TLS register
> > > -#elif !defined(CONFIG_TLS_REG_EMUL)
> > > -	mov	r4, #0xffff0fff
> > > -	str	r3, [r4, #-15]			@ TLS val at 0xffff0ff0
> > > +#if !defined(CONFIG_TLS_REG_EMUL)
> > > +	ldr	r4, =elf_hwcap
> > > +	ldr	r4, [r4, #0]
> > > +	tst	r4, #HWCAP_TLS			@ hardware with TLS?
> > 
> > This is really really inefficient.  Both the second ldr and tst will stall
> > > the pipeline because they need to wait for the result of the preceding
> > ldr.  Can we do better by re-ordering some instructions?
> 
> Or set ifdef CONFIG_CPU_V6 and test for the cp15 id register every time..

I was suggesting that it might be worth trying to reorder the instructions
here so that we're not immediately using the result of the ldr in the
next instruction.  We have plenty of registers available here (everything
except r0-r2, r6, fp.)

> > Also, the ifndef seems incorrect - if we have TLS_REG_EMUL we seem to omit
> > all this code.
> 
> Is the current ifdef elif wrong? The current code does not seem to
> do anything if TLS_REG_EMUL is set and HAS_TLS_REG is not set.
> HAS_TLS_REG depends on !TLS_REG_EMUL.

Now I look back, I don't think so.  Ignore that comment.


* [PATCH] arm: Replace CONFIG_HAS_TLS_REG with HWCAP_TLS and check for it on V6
  2010-03-23  0:16               ` Russell King - ARM Linux
@ 2010-03-23  0:54                 ` Tony Lindgren
  0 siblings, 0 replies; 15+ messages in thread
From: Tony Lindgren @ 2010-03-23  0:54 UTC (permalink / raw)
  To: linux-arm-kernel

* Russell King - ARM Linux <linux@arm.linux.org.uk> [100322 17:12]:
> On Fri, Mar 19, 2010 at 08:58:05AM -0700, Tony Lindgren wrote:
> > * Russell King - ARM Linux <linux@arm.linux.org.uk> [100319 01:49]:
> > > On Thu, Mar 18, 2010 at 06:35:21PM -0700, Tony Lindgren wrote:
> > > > -#if defined(CONFIG_HAS_TLS_REG)
> > > > -	mcr	p15, 0, r3, c13, c0, 3		@ set TLS register
> > > > -#elif !defined(CONFIG_TLS_REG_EMUL)
> > > > -	mov	r4, #0xffff0fff
> > > > -	str	r3, [r4, #-15]			@ TLS val at 0xffff0ff0
> > > > +#if !defined(CONFIG_TLS_REG_EMUL)
> > > > +	ldr	r4, =elf_hwcap
> > > > +	ldr	r4, [r4, #0]
> > > > +	tst	r4, #HWCAP_TLS			@ hardware with TLS?
> > > 
> > > This is really really inefficient.  Both the second ldr and tst will stall
> > > > the pipeline because they need to wait for the result of the preceding
> > > ldr.  Can we do better by re-ordering some instructions?
> > 
> > Or set ifdef CONFIG_CPU_V6 and test for the cp15 id register every time..
> 
> I was suggesting that it might be worth trying to reorder the instructions
> here so that we're not immediately using the result of the ldr in the
> next instruction.  We have plenty of registers available here (everything
> except r0-r2, r6, fp.)

Yeah sure, I'll take a look. I'll repost an updated version after I get a
chance to play with this again. Might be a little while before I get back to
this, but this would be for the next merge window anyways.
 
> > > Also, the ifndef seems incorrect - if we have TLS_REG_EMUL we seem to omit
> > > all this code.
> > 
> > Is the current ifdef elif wrong? The current code does not seem to
> > do anything if TLS_REG_EMUL is set and HAS_TLS_REG is not set.
> > > HAS_TLS_REG depends on !TLS_REG_EMUL.
> 
> Now I look back, I don't think so.  Ignore that comment.

OK

Regards,

Tony

