* Problem with 64-bit registers on i.MX53
2012-10-08 17:10 ` Russell King - ARM Linux
@ 2012-10-08 17:50 ` Dave Martin
2012-10-08 18:03 ` Russell King - ARM Linux
2012-10-09 9:02 ` Michael Olbrich
2012-10-09 8:52 ` Michael Olbrich
2012-10-09 9:02 ` Uwe Kleine-König
2 siblings, 2 replies; 10+ messages in thread
From: Dave Martin @ 2012-10-08 17:50 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 08, 2012 at 06:10:11PM +0100, Russell King - ARM Linux wrote:
> On Mon, Oct 08, 2012 at 06:01:24PM +0100, Russell King - ARM Linux wrote:
> > On Mon, Oct 08, 2012 at 06:08:41PM +0200, Michael Olbrich wrote:
> > >
> > > Hi,
> > >
> > > > I have a problem that looks like 64-bit registers (I think) are not
> > > saved/restored correctly on a context switch. I've reduced it to the
> > > following test case:
> > >
> > > - Latest Linux mainline kernel (v3.6-8559-ge9eca4d)
> > > v3.5 is also affected
> > > - imx_v6_v7_defconfig
> > > - arch/arm/boot/dts/imx53-evk.dts
> > >
> > > The following test program is compiled with "-mcpu=cortex-a8 -mfpu=neon
> > > -O2".
> > > ------------------------>8--------------------------------
> > > > #include <inttypes.h>
> > > > #include <assert.h>
> > > >
> > > > volatile int x = 2;
> > > > volatile int64_t y = 2;
> > > >
> > > > int main() {
> > > > 	volatile int a = 0;
> > > > 	volatile int64_t b = 0;
> > > > 	while (1) {
> > > > 		a = (a + x) % (1 << 30);
> > > > 		b = (b + y) % (1 << 30);
> > > > 		assert(a == b);
> > > > 	}
> > > > }
> > > ------------------------>8--------------------------------
> > > The ".. (b + y) .." should result in "vadd.i64 d19, d18, d16" or
> > > something like that.
Just for my curiosity, can you let me know what compiler version you're
using and the disassembly? I'm actually a little surprised to see
NEON code being generated here, though the patch below fixes what
definitely looks like a context switch bug for combined v6+v7 kernels...
Cheers
---Dave
> >
> > Hmm.
> >
> > Can you send me the output of 'grep ^Features /proc/cpuinfo' please?
>
> You may also like to try the patch below... it will probably fix your
> problem.
>
> diff --git a/arch/arm/include/asm/vfpmacros.h b/arch/arm/include/asm/vfpmacros.h
> index a7aadbd..6a6f1e4 100644
> --- a/arch/arm/include/asm/vfpmacros.h
> +++ b/arch/arm/include/asm/vfpmacros.h
> @@ -28,7 +28,7 @@
> ldr \tmp, =elf_hwcap @ may not have MVFR regs
> ldr \tmp, [\tmp, #0]
> tst \tmp, #HWCAP_VFPv3D16
> - ldceq p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31}
> + ldceql p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31}
> addne \base, \base, #32*4 @ step over unused register space
> #else
> VFPFMRX \tmp, MVFR0 @ Media and VFP Feature Register 0
> @@ -52,7 +52,7 @@
> ldr \tmp, =elf_hwcap @ may not have MVFR regs
> ldr \tmp, [\tmp, #0]
> tst \tmp, #HWCAP_VFPv3D16
> - stceq p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31}
> + stceql p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31}
> addne \base, \base, #32*4 @ step over unused register space
> #else
> VFPFMRX \tmp, MVFR0 @ Media and VFP Feature Register 0
>
>
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 10+ messages in thread
* Problem with 64-bit registers on i.MX53
2012-10-08 17:50 ` Dave Martin
@ 2012-10-08 18:03 ` Russell King - ARM Linux
2012-10-08 18:04 ` Dave Martin
2012-10-09 9:02 ` Michael Olbrich
1 sibling, 1 reply; 10+ messages in thread
From: Russell King - ARM Linux @ 2012-10-08 18:03 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 08, 2012 at 06:50:52PM +0100, Dave Martin wrote:
> Just for my curiosity, can you let me know what compiler version you're
> using and the disassembly? I'm actually a little surprised to see
> NEON code being generated here, though the patch below fixes what
> definitely looks like a context switch bug for combined v6+v7 kernels...
Well, one such compiler is gcc 4.6.3 in Ubuntu Precise LTS.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Problem with 64-bit registers on i.MX53
2012-10-08 18:03 ` Russell King - ARM Linux
@ 2012-10-08 18:04 ` Dave Martin
0 siblings, 0 replies; 10+ messages in thread
From: Dave Martin @ 2012-10-08 18:04 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 08, 2012 at 07:03:37PM +0100, Russell King - ARM Linux wrote:
> On Mon, Oct 08, 2012 at 06:50:52PM +0100, Dave Martin wrote:
> > Just for my curiosity, can you let me know what compiler version you're
> > using and the disassembly? I'm actually a little surprised to see
> > NEON code being generated here, though the patch below fixes what
> > definitely looks like a context switch bug for combined v6+v7 kernels...
>
> Well, one such compiler is gcc 4.6.3 in Ubuntu Precise LTS.
Hmmm, I really need to upgrade...
Cheers
---Dave
^ permalink raw reply [flat|nested] 10+ messages in thread
* Problem with 64-bit registers on i.MX53
2012-10-08 17:50 ` Dave Martin
2012-10-08 18:03 ` Russell King - ARM Linux
@ 2012-10-09 9:02 ` Michael Olbrich
1 sibling, 0 replies; 10+ messages in thread
From: Michael Olbrich @ 2012-10-09 9:02 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 08, 2012 at 06:50:52PM +0100, Dave Martin wrote:
> On Mon, Oct 08, 2012 at 06:10:11PM +0100, Russell King - ARM Linux wrote:
> > On Mon, Oct 08, 2012 at 06:01:24PM +0100, Russell King - ARM Linux wrote:
> > > On Mon, Oct 08, 2012 at 06:08:41PM +0200, Michael Olbrich wrote:
> > > > I have a problem that looks like 64-bit registers (I think) are not
> > > > saved/restored correctly on a context switch. I've reduced it to the
> > > > following test case:
> > > >
> > > > - Latest Linux mainline kernel (v3.6-8559-ge9eca4d)
> > > > v3.5 is also affected
> > > > - imx_v6_v7_defconfig
> > > > - arch/arm/boot/dts/imx53-evk.dts
> > > >
> > > > The following test program is compiled with "-mcpu=cortex-a8 -mfpu=neon
> > > > -O2".
> > > > ------------------------>8--------------------------------
> > > > #include <inttypes.h>
> > > > #include <assert.h>
> > > >
> > > > volatile int x = 2;
> > > > volatile int64_t y = 2;
> > > >
> > > > int main() {
> > > > 	volatile int a = 0;
> > > > 	volatile int64_t b = 0;
> > > > 	while (1) {
> > > > 		a = (a + x) % (1 << 30);
> > > > 		b = (b + y) % (1 << 30);
> > > > 		assert(a == b);
> > > > 	}
> > > > }
> > > > ------------------------>8--------------------------------
> > > > The ".. (b + y) .." should result in "vadd.i64 d19, d18, d16" or
> > > > something like that.
>
> Just for my curiosity, can you let me know what compiler version you're
> using and the disassembly? I'm actually a little surprised to see
> NEON code being generated here,
I'm using OSELAS.Toolchain, which includes gcc-linaro-4.6-2011.11. It
generates quite a bit of NEON code, actually. I originally tracked the
issue down to a commit in libxcb: "xcb_in: Use 64-bit sequence numbers
internally everywhere.". The compiler generated NEON code to calculate
sequence numbers...
Regards,
Michael
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
^ permalink raw reply [flat|nested] 10+ messages in thread
* Problem with 64-bit registers on i.MX53
2012-10-08 17:10 ` Russell King - ARM Linux
2012-10-08 17:50 ` Dave Martin
@ 2012-10-09 8:52 ` Michael Olbrich
2012-10-09 9:02 ` Uwe Kleine-König
2 siblings, 0 replies; 10+ messages in thread
From: Michael Olbrich @ 2012-10-09 8:52 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Oct 08, 2012 at 06:10:11PM +0100, Russell King - ARM Linux wrote:
> On Mon, Oct 08, 2012 at 06:01:24PM +0100, Russell King - ARM Linux wrote:
> > On Mon, Oct 08, 2012 at 06:08:41PM +0200, Michael Olbrich wrote:
> > > I have a problem that looks like 64-bit registers (I think) are not
> > > saved/restored correctly on a context switch. I've reduced it to the
> > > following test case:
> > >
> > > - Latest Linux mainline kernel (v3.6-8559-ge9eca4d)
> > > v3.5 is also affected
> > > - imx_v6_v7_defconfig
> > > - arch/arm/boot/dts/imx53-evk.dts
> > >
> > > The following test program is compiled with "-mcpu=cortex-a8 -mfpu=neon
> > > -O2".
> > > ------------------------>8--------------------------------
> > > #include <inttypes.h>
> > > #include <assert.h>
> > >
> > > volatile int x = 2;
> > > volatile int64_t y = 2;
> > >
> > > int main() {
> > > 	volatile int a = 0;
> > > 	volatile int64_t b = 0;
> > > 	while (1) {
> > > 		a = (a + x) % (1 << 30);
> > > 		b = (b + y) % (1 << 30);
> > > 		assert(a == b);
> > > 	}
> > > }
> > > ------------------------>8--------------------------------
> > > The ".. (b + y) .." should result in "vadd.i64 d19, d18, d16" or
> > > something like that.
> >
> > Hmm.
> >
> > Can you send me the output of 'grep ^Features /proc/cpuinfo' please?
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls
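The Features line above can be related to the HWCAP_VFPv3D16 test in the quoted patch. The following is a minimal sketch, assuming the hwcap bit positions used by 32-bit ARM Linux (bit 14 for vfpv3d16); the SK_ macro names and the helper functions are hypothetical and exist only for this illustration. It mirrors the "tst \tmp, #HWCAP_VFPv3D16" check: when the bit is clear, the kernel macro takes the path that is meant to save/restore d16-d31.

```c
#include <assert.h>

/* Assumed ARM Linux hwcap bit positions, renamed with an SK_ prefix
 * so they cannot clash with the real kernel/libc macros. */
#define SK_HWCAP_SWP       (1ul << 0)
#define SK_HWCAP_HALF      (1ul << 1)
#define SK_HWCAP_THUMB     (1ul << 2)
#define SK_HWCAP_FAST_MULT (1ul << 4)
#define SK_HWCAP_VFP       (1ul << 6)
#define SK_HWCAP_EDSP      (1ul << 7)
#define SK_HWCAP_NEON      (1ul << 12)
#define SK_HWCAP_VFPv3     (1ul << 13)
#define SK_HWCAP_VFPv3D16  (1ul << 14)
#define SK_HWCAP_TLS       (1ul << 15)

/* "swp half thumb fastmult vfp edsp neon vfpv3 tls" as a bit mask. */
#define SK_REPORTED_HWCAP \
	(SK_HWCAP_SWP | SK_HWCAP_HALF | SK_HWCAP_THUMB | SK_HWCAP_FAST_MULT | \
	 SK_HWCAP_VFP | SK_HWCAP_EDSP | SK_HWCAP_NEON | SK_HWCAP_VFPv3 | \
	 SK_HWCAP_TLS)

/* Returns 1 when the VFPv3D16 bit is clear, i.e. when the "eq"
 * conditional load/store in the macro would execute. */
static inline int sk_touches_d16_d31(unsigned long hwcap)
{
	return (hwcap & SK_HWCAP_VFPv3D16) == 0;
}

/* Self-check against the Features line reported in this message. */
static inline void sk_check_reported_features(void)
{
	assert(sk_touches_d16_d31(SK_REPORTED_HWCAP));
}
```

Since vfpv3d16 is absent from the reported Features, this CPU has the full d0-d31 register bank, so the conditional load/store in the macro is executed on every context switch. On a running ARM Linux system the same mask could be read with getauxval(AT_HWCAP) instead of being built by hand.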
> You may also like to try the patch below... it will probably fix your
> problem.
This does indeed fix my problem. Is this a real fix or just a test to
narrow down the issue? I don't really understand what it does.
If it is a real fix,
Tested-By: Michael Olbrich <m.olbrich@pengutronix.de>
Regards,
Michael
> diff --git a/arch/arm/include/asm/vfpmacros.h b/arch/arm/include/asm/vfpmacros.h
> index a7aadbd..6a6f1e4 100644
> --- a/arch/arm/include/asm/vfpmacros.h
> +++ b/arch/arm/include/asm/vfpmacros.h
> @@ -28,7 +28,7 @@
> ldr \tmp, =elf_hwcap @ may not have MVFR regs
> ldr \tmp, [\tmp, #0]
> tst \tmp, #HWCAP_VFPv3D16
> - ldceq p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31}
> + ldceql p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31}
> addne \base, \base, #32*4 @ step over unused register space
> #else
> VFPFMRX \tmp, MVFR0 @ Media and VFP Feature Register 0
> @@ -52,7 +52,7 @@
> ldr \tmp, =elf_hwcap @ may not have MVFR regs
> ldr \tmp, [\tmp, #0]
> tst \tmp, #HWCAP_VFPv3D16
> - stceq p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31}
> + stceql p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31}
> addne \base, \base, #32*4 @ step over unused register space
> #else
> VFPFMRX \tmp, MVFR0 @ Media and VFP Feature Register 0
^ permalink raw reply [flat|nested] 10+ messages in thread
* Problem with 64-bit registers on i.MX53
2012-10-08 17:10 ` Russell King - ARM Linux
2012-10-08 17:50 ` Dave Martin
2012-10-09 8:52 ` Michael Olbrich
@ 2012-10-09 9:02 ` Uwe Kleine-König
2012-10-09 14:05 ` Dave Martin
2 siblings, 1 reply; 10+ messages in thread
From: Uwe Kleine-König @ 2012-10-09 9:02 UTC (permalink / raw)
To: linux-arm-kernel
Hello,
On Mon, Oct 08, 2012 at 06:10:11PM +0100, Russell King - ARM Linux wrote:
> On Mon, Oct 08, 2012 at 06:01:24PM +0100, Russell King - ARM Linux wrote:
> > On Mon, Oct 08, 2012 at 06:08:41PM +0200, Michael Olbrich wrote:
> You may also like to try the patch below... it will probably fix your
> problem.
>
> diff --git a/arch/arm/include/asm/vfpmacros.h b/arch/arm/include/asm/vfpmacros.h
> index a7aadbd..6a6f1e4 100644
> --- a/arch/arm/include/asm/vfpmacros.h
> +++ b/arch/arm/include/asm/vfpmacros.h
> @@ -28,7 +28,7 @@
> ldr \tmp, =elf_hwcap @ may not have MVFR regs
> ldr \tmp, [\tmp, #0]
> tst \tmp, #HWCAP_VFPv3D16
> - ldceq p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31}
> + ldceql p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31}
> addne \base, \base, #32*4 @ step over unused register space
> #else
> VFPFMRX \tmp, MVFR0 @ Media and VFP Feature Register 0
> @@ -52,7 +52,7 @@
> ldr \tmp, =elf_hwcap @ may not have MVFR regs
> ldr \tmp, [\tmp, #0]
> tst \tmp, #HWCAP_VFPv3D16
> - stceq p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31}
> + stceql p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31}
According to the ARMARM for v7-A and v7-R (ARM DDI 0406B errata 2010 Q2)
the syntax is "STC{L}<c> ...", with a note "The pre-UAL syntax STC<c>L
is equivalent to STCL<c>.". Maybe the UAL syntax should be used instead?
Best regards
Uwe
--
Pengutronix e.K. | Uwe Kleine-König |
Industrial Linux Solutions | http://www.pengutronix.de/ |
^ permalink raw reply [flat|nested] 10+ messages in thread
* Problem with 64-bit registers on i.MX53
2012-10-09 9:02 ` Uwe Kleine-König
@ 2012-10-09 14:05 ` Dave Martin
0 siblings, 0 replies; 10+ messages in thread
From: Dave Martin @ 2012-10-09 14:05 UTC (permalink / raw)
To: linux-arm-kernel
On Tue, Oct 09, 2012 at 11:02:37AM +0200, Uwe Kleine-König wrote:
> Hello,
>
> On Mon, Oct 08, 2012 at 06:10:11PM +0100, Russell King - ARM Linux wrote:
> > On Mon, Oct 08, 2012 at 06:01:24PM +0100, Russell King - ARM Linux wrote:
> > > On Mon, Oct 08, 2012 at 06:08:41PM +0200, Michael Olbrich wrote:
> > You may also like to try the patch below... it will probably fix your
> > problem.
> >
> > diff --git a/arch/arm/include/asm/vfpmacros.h b/arch/arm/include/asm/vfpmacros.h
> > index a7aadbd..6a6f1e4 100644
> > --- a/arch/arm/include/asm/vfpmacros.h
> > +++ b/arch/arm/include/asm/vfpmacros.h
> > @@ -28,7 +28,7 @@
> > ldr \tmp, =elf_hwcap @ may not have MVFR regs
> > ldr \tmp, [\tmp, #0]
> > tst \tmp, #HWCAP_VFPv3D16
> > - ldceq p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31}
> > + ldceql p11, cr0, [\base],#32*4 @ FLDMIAD \base!, {d16-d31}
> > addne \base, \base, #32*4 @ step over unused register space
> > #else
> > VFPFMRX \tmp, MVFR0 @ Media and VFP Feature Register 0
> > @@ -52,7 +52,7 @@
> > ldr \tmp, =elf_hwcap @ may not have MVFR regs
> > ldr \tmp, [\tmp, #0]
> > tst \tmp, #HWCAP_VFPv3D16
> > - stceq p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31}
> > + stceql p11, cr0, [\base],#32*4 @ FSTMIAD \base!, {d16-d31}
> According to the ARMARM for v7-A and v7-R (ARM DDI 0406B errata 2010 Q2)
> the syntax is "STC{L}<c> ...", with a note "The pre-UAL syntax STC<c>L
> is equivalent to STCL<c>.". Maybe the UAL syntax should be used instead?
The older stc<c>l type of syntax is used all over the place. Code which
might need to be built by tools which pre-date unified syntax needs to
use the old syntax, so it is in common usage in the kernel in general.
This code presumably only gets built by new-enough tools for the unified
syntax to be usable, but support for the old syntax isn't going to
disappear from the tools any time soon, AFAIK.
The bug here was that the presence or absence of the "L" suffix is used
to encode bit 4 of the starting d-register number for these instructions.
The comment says d16-d31, but the instructions as written are actually
saving and restoring d0-d15...which is not so helpful since we already
handled those registers in the neighbouring code.
We could avoid this kind of bug by writing those VFP instructions
using the unified syntax native mnemonics (vstmia, vldmia -- since
d16-d31 never existed while the old fldmiad/fstmiad mnemonics were in
use, and the assembler doesn't accept them), but it is tricky to change
the assembler's notion of target CPU and FPU on-the-fly inside a header
or macro without messing things up.
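
The effect of the "L" suffix described above can be checked by hand. The following is a minimal sketch, assuming the ARM (A32) LDC/STC immediate encoding in which the "L" suffix sets bit 22, and the VFP view of the same word in which bit 22 is the top bit of the first d-register number and imm8/2 is the register count. The function names are hypothetical, and only the post-indexed form used in the patch (P=0, U=1, W=1) is modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Build an A32 LDC/STC word for "[Rn], #imm8*4" (post-indexed).
 * long_flag corresponds to the "L" suffix (ldcl/stcl). */
static inline uint32_t sk_encode_ldc_stc(uint32_t cond, int is_load,
					 int long_flag, uint32_t rn,
					 uint32_t crd, uint32_t coproc,
					 uint32_t imm8)
{
	assert(cond <= 0xfu && rn <= 0xfu && crd <= 0xfu &&
	       coproc <= 0xfu && imm8 <= 0xffu);
	return (cond << 28) | (0x6u << 25)	/* coprocessor load/store */
	     | (1u << 23)			/* U: add offset */
	     | ((uint32_t)(long_flag != 0) << 22) /* D: the "L" suffix */
	     | (1u << 21)			/* W: writeback */
	     | ((uint32_t)(is_load != 0) << 20)
	     | (rn << 16) | (crd << 12) | (coproc << 8) | imm8;
}

/* VFP interpretation: first register is D:Vd (bit 22, bits 15:12). */
static inline unsigned sk_first_dreg(uint32_t insn)
{
	return (((insn >> 22) & 1u) << 4) | ((insn >> 12) & 0xfu);
}

/* VFP interpretation: imm8 counts words, two per d-register. */
static inline unsigned sk_dreg_count(uint32_t insn)
{
	return (insn & 0xffu) / 2u;
}
```

With cond = 0 (eq), Rn = r0, CRd = cr0, coprocessor 11 and imm8 = 32 (that is, #32*4 bytes), the word without the "L" suffix starts at d0 and the word with it starts at d16, both covering 16 registers. This matches the description of the bug: the old ldceq/stceq lines touched d0-d15 again instead of d16-d31.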
Cheers
---Dave
>
> Best regards
> Uwe
>
> --
> Pengutronix e.K. | Uwe Kleine-König |
> Industrial Linux Solutions | http://www.pengutronix.de/ |
^ permalink raw reply [flat|nested] 10+ messages in thread