* [PATCH v2] arm64: fpsimd: avoid restoring fpcr if the contents haven't changed
@ 2014-06-27 15:53 Will Deacon
2014-06-27 17:33 ` Ard Biesheuvel
0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2014-06-27 15:53 UTC (permalink / raw)
To: linux-arm-kernel
Writing to the FPCR is commonly implemented as a self-synchronising
operation in the CPU, so avoid writing to the register when the saved
value matches that in the hardware already.
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
v1 -> v2 : Move FPCR restoration out into a macro and update partial
restore code too.
arch/arm64/include/asm/fpsimdmacros.h | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
index 768414d55e64..f6b3eb5f4517 100644
--- a/arch/arm64/include/asm/fpsimdmacros.h
+++ b/arch/arm64/include/asm/fpsimdmacros.h
@@ -40,6 +40,19 @@
str w\tmpnr, [\state, #16 * 2 + 4]
.endm
+.macro fpsimd_restore_fpcr state, tmp
+ /*
+ * Writes to fpcr may be self-synchronising, so avoid restoring
+ * the register if it hasn't changed.
+ */
+ mrs \tmp, fpcr
+ cmp \tmp, \state
+ b.eq 9999f
+ msr fpcr, \state
+9999:
+.endm
+
+/* Clobbers \state */
.macro fpsimd_restore state, tmpnr
ldp q0, q1, [\state, #16 * 0]
ldp q2, q3, [\state, #16 * 2]
@@ -60,7 +73,7 @@
ldr w\tmpnr, [\state, #16 * 2]
msr fpsr, x\tmpnr
ldr w\tmpnr, [\state, #16 * 2 + 4]
- msr fpcr, x\tmpnr
+ fpsimd_restore_fpcr x\tmpnr, \state
.endm
.altmacro
@@ -84,7 +97,7 @@
.macro fpsimd_restore_partial state, tmpnr1, tmpnr2
ldp w\tmpnr1, w\tmpnr2, [\state]
msr fpsr, x\tmpnr1
- msr fpcr, x\tmpnr2
+ fpsimd_restore_fpcr x\tmpnr1, x\tmpnr2
adr x\tmpnr1, 0f
ldr w\tmpnr2, [\state, #8]
add \state, \state, x\tmpnr2, lsl #4
--
2.0.0
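
The compare-before-write idea in fpsimd_restore_fpcr can be modelled in
plain C. This is only a sketch: struct fake_cpu, its fpcr_writes counter
and model_restore_fpcr() are hypothetical names invented for illustration,
not kernel code, and the real macro operates on registers rather than on
memory.

```c
#include <stdint.h>

/* Hypothetical stand-in for the hardware register; not a kernel API. */
struct fake_cpu {
	uint64_t fpcr;			/* modelled FPCR contents */
	unsigned int fpcr_writes;	/* counts modelled "msr fpcr" executions */
};

/*
 * Mirrors the macro body: read the current FPCR, compare it with the
 * saved value, and write only when the two differ.
 */
static void model_restore_fpcr(struct fake_cpu *cpu, uint64_t saved)
{
	uint64_t cur = cpu->fpcr;	/* mrs  \tmp, fpcr */

	if (cur == saved)		/* cmp  \tmp, \state ; b.eq 9999f */
		return;

	cpu->fpcr = saved;		/* msr  fpcr, \state */
	cpu->fpcr_writes++;
}
```

Restoring a value that already matches leaves fpcr_writes untouched, which
is the case the patch is trying to make cheap.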
* [PATCH v2] arm64: fpsimd: avoid restoring fpcr if the contents haven't changed
2014-06-27 15:53 [PATCH v2] arm64: fpsimd: avoid restoring fpcr if the contents haven't changed Will Deacon
@ 2014-06-27 17:33 ` Ard Biesheuvel
2014-06-30 9:03 ` Will Deacon
0 siblings, 1 reply; 4+ messages in thread
From: Ard Biesheuvel @ 2014-06-27 17:33 UTC (permalink / raw)
To: linux-arm-kernel
On 27 June 2014 17:53, Will Deacon <will.deacon@arm.com> wrote:
> Writing to the FPCR is commonly implemented as a self-synchronising
> operation in the CPU, so avoid writing to the register when the saved
> value matches that in the hardware already.
>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>
> v1 -> v2 : Move FPCR restoration out into a macro and update partial
> restore code too.
>
> arch/arm64/include/asm/fpsimdmacros.h | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/fpsimdmacros.h b/arch/arm64/include/asm/fpsimdmacros.h
> index 768414d55e64..f6b3eb5f4517 100644
> --- a/arch/arm64/include/asm/fpsimdmacros.h
> +++ b/arch/arm64/include/asm/fpsimdmacros.h
> @@ -40,6 +40,19 @@
> str w\tmpnr, [\state, #16 * 2 + 4]
> .endm
>
> +.macro fpsimd_restore_fpcr state, tmp
> + /*
> + * Writes to fpcr may be self-synchronising, so avoid restoring
> + * the register if it hasn't changed.
> + */
> + mrs \tmp, fpcr
> + cmp \tmp, \state
> + b.eq 9999f
> + msr fpcr, \state
> +9999:
> +.endm
> +
> +/* Clobbers \state */
> .macro fpsimd_restore state, tmpnr
> ldp q0, q1, [\state, #16 * 0]
> ldp q2, q3, [\state, #16 * 2]
> @@ -60,7 +73,7 @@
> ldr w\tmpnr, [\state, #16 * 2]
> msr fpsr, x\tmpnr
> ldr w\tmpnr, [\state, #16 * 2 + 4]
> - msr fpcr, x\tmpnr
> + fpsimd_restore_fpcr x\tmpnr, \state
> .endm
>
> .altmacro
> @@ -84,7 +97,7 @@
> .macro fpsimd_restore_partial state, tmpnr1, tmpnr2
> ldp w\tmpnr1, w\tmpnr2, [\state]
> msr fpsr, x\tmpnr1
> - msr fpcr, x\tmpnr2
> + fpsimd_restore_fpcr x\tmpnr1, x\tmpnr2
Ehm, isn't this the wrong way around?
> adr x\tmpnr1, 0f
> ldr w\tmpnr2, [\state, #8]
> add \state, \state, x\tmpnr2, lsl #4
> --
> 2.0.0
>
* [PATCH v2] arm64: fpsimd: avoid restoring fpcr if the contents haven't changed
2014-06-27 17:33 ` Ard Biesheuvel
@ 2014-06-30 9:03 ` Will Deacon
2014-06-30 10:26 ` Ard Biesheuvel
0 siblings, 1 reply; 4+ messages in thread
From: Will Deacon @ 2014-06-30 9:03 UTC (permalink / raw)
To: linux-arm-kernel
On Fri, Jun 27, 2014 at 06:33:58PM +0100, Ard Biesheuvel wrote:
> On 27 June 2014 17:53, Will Deacon <will.deacon@arm.com> wrote:
> > Writing to the FPCR is commonly implemented as a self-synchronising
> > operation in the CPU, so avoid writing to the register when the saved
> > value matches that in the hardware already.
[...]
> > .macro fpsimd_restore_partial state, tmpnr1, tmpnr2
> > ldp w\tmpnr1, w\tmpnr2, [\state]
> > msr fpsr, x\tmpnr1
> > - msr fpcr, x\tmpnr2
> > + fpsimd_restore_fpcr x\tmpnr1, x\tmpnr2
>
> Ehm, isn't this the wrong way around?
Yup, well spotted. I'll give the crypto selftests a run to test this path
with the fix (I was just running paranoia in userspace before).
Will
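
For readers following along, the argument-order problem can be reproduced
with a small C model. It is a sketch under assumptions: struct model_regs
and the three functions are hypothetical, the fpsr write is left out, and
the "swapped back" variant is my reading of what the fix would look like,
not a quote from a later posting. The macro signature is
"fpsimd_restore_fpcr state, tmp", and the ldp in fpsimd_restore_partial
puts the saved fpsr in tmpnr1 and the saved fpcr in tmpnr2.

```c
#include <stdint.h>

/* Hypothetical model of the one register that matters here. */
struct model_regs {
	uint64_t fpcr;
};

/* state: value to restore; *tmp: scratch register, overwritten. */
static void model_restore_fpcr(struct model_regs *hw, uint64_t state,
			       uint64_t *tmp)
{
	*tmp = hw->fpcr;		/* mrs  \tmp, fpcr */
	if (*tmp != state)		/* cmp  \tmp, \state ; b.eq 9999f */
		hw->fpcr = state;	/* msr  fpcr, \state */
}

/* Call as posted in v2: fpsimd_restore_fpcr x\tmpnr1, x\tmpnr2 */
static void partial_restore_as_posted(struct model_regs *hw,
				      uint64_t saved_fpsr, uint64_t saved_fpcr)
{
	uint64_t x1 = saved_fpsr, x2 = saved_fpcr;	/* ldp w1, w2, [state] */

	model_restore_fpcr(hw, x1, &x2);	/* fpsr value lands in fpcr */
}

/* Assumed correction: fpsimd_restore_fpcr x\tmpnr2, x\tmpnr1 */
static void partial_restore_swapped_back(struct model_regs *hw,
					 uint64_t saved_fpsr,
					 uint64_t saved_fpcr)
{
	uint64_t x1 = saved_fpsr, x2 = saved_fpcr;	/* ldp w1, w2, [state] */

	model_restore_fpcr(hw, x2, &x1);	/* x1 is free after msr fpsr */
}
```

In the as-posted form the saved fpcr value is discarded by the mrs and the
saved fpsr value is what gets compared and written, which matches the
"wrong way around" observation.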
* [PATCH v2] arm64: fpsimd: avoid restoring fpcr if the contents haven't changed
2014-06-30 9:03 ` Will Deacon
@ 2014-06-30 10:26 ` Ard Biesheuvel
0 siblings, 0 replies; 4+ messages in thread
From: Ard Biesheuvel @ 2014-06-30 10:26 UTC (permalink / raw)
To: linux-arm-kernel
On 30 June 2014 11:03, Will Deacon <will.deacon@arm.com> wrote:
> On Fri, Jun 27, 2014 at 06:33:58PM +0100, Ard Biesheuvel wrote:
>> On 27 June 2014 17:53, Will Deacon <will.deacon@arm.com> wrote:
>> > Writing to the FPCR is commonly implemented as a self-synchronising
>> > operation in the CPU, so avoid writing to the register when the saved
>> > value matches that in the hardware already.
>
> [...]
>
>> > .macro fpsimd_restore_partial state, tmpnr1, tmpnr2
>> > ldp w\tmpnr1, w\tmpnr2, [\state]
>> > msr fpsr, x\tmpnr1
>> > - msr fpcr, x\tmpnr2
>> > + fpsimd_restore_fpcr x\tmpnr1, x\tmpnr2
>>
>> Ehm, isn't this the wrong way around?
>
> Yup, well spotted. I'll give the crypto selftests a run to test this path
> with the fix (I was just running paranoia in userspace before).
>
That won't cut it, I'm afraid. What I used is
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -221,6 +221,8 @@ void fpsimd_flush_task_state(struct task_struct *t)
static DEFINE_PER_CPU(struct fpsimd_partial_state, hardirq_fpsimdstate);
static DEFINE_PER_CPU(struct fpsimd_partial_state, softirq_fpsimdstate);
+#define in_interrupt() (1)
+
/*
* Kernel-side NEON support functions
*/
and run my [lengthy] userland benchmark while doing 'modprobe tcrypt
mode=x' a number of times in another terminal (with only 1 CPU up).
[with x=2 for SHA1 and x=6 for SHA256, for instance]
The first iteration of the interrupt context NEON series had an ARM
counterpart which I was able to test using iperf over a WPA2/CCMP link
(with the bit-sliced AES in CTR mode). No such luck [yet] on arm64,
unfortunately, so I had to improvise a bit.
--
Ard.
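
The effect of the in_interrupt() override above can be pictured with a
userspace C model. This is a rough sketch, assuming the kernel-mode NEON
entry point chooses between a partial save/restore (interrupt context) and
a full one (task context) based on in_interrupt(). The enum and
model_kernel_neon_begin() are hypothetical names, not the kernel's.

```c
/* Which save/restore flavour the modelled entry point would pick. */
enum neon_path {
	NEON_PATH_FULL,		/* task context: full state save */
	NEON_PATH_PARTIAL,	/* interrupt context: partial save/restore */
};

/* The override from the hunk above: always claim interrupt context. */
#define in_interrupt() (1)

/*
 * Hypothetical model of the context check. With the override in place
 * the partial path is taken on every call, so the partial restore
 * macro is exercised even when called from task context.
 */
static enum neon_path model_kernel_neon_begin(void)
{
	return in_interrupt() ? NEON_PATH_PARTIAL : NEON_PATH_FULL;
}
```

With that in place, any kernel-mode NEON user (the tcrypt runs here) goes
through the partial path while the userland benchmark keeps live FP state
around for it to corrupt if the restore is wrong.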