* [PATCH] ARM: vfp: use asm volatile for FP control register accesses
@ 2024-03-18 9:30 Ard Biesheuvel
2024-03-26 23:55 ` Nathan Chancellor
0 siblings, 1 reply; 5+ messages in thread
From: Ard Biesheuvel @ 2024-03-18 9:30 UTC (permalink / raw)
To: linux-arm-kernel
Cc: linux, arnd, nathan, linus.walleij, Ard Biesheuvel, stable
From: Ard Biesheuvel <ardb@kernel.org>
Clang may reorder FP control register reads and writes, due to the fact
that the inline asm() blocks in the read/write wrappers are not volatile
qualified, and the compiler has no idea that these reads and writes may
have side effects.
In particular, reads of FPSCR may generate an UNDEF exception if a
floating point exception is pending, and the FP emulation code in
VFP_bounce() explicitly clears FP exceptions temporarily in order to be
able to perform the emulation on behalf of user space. This requires
that the writes to FPEXC are never reordered with respect to accesses to
other FP control registers, such as FPSCR.
So use asm volatile for both the read and the write helpers.
Cc: <stable@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm/vfp/vfpinstr.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
index 3c7938fd40aa..c4ac778e6fc9 100644
--- a/arch/arm/vfp/vfpinstr.h
+++ b/arch/arm/vfp/vfpinstr.h
@@ -66,14 +66,14 @@
#define fmrx(_vfp_) ({ \
u32 __v; \
- asm(".fpu vfpv2\n" \
+ asm volatile(".fpu vfpv2\n" \
"vmrs %0, " #_vfp_ \
: "=r" (__v) : : "cc"); \
__v; \
})
#define fmxr(_vfp_,_var_) \
- asm(".fpu vfpv2\n" \
+ asm volatile(".fpu vfpv2\n" \
"vmsr " #_vfp_ ", %0" \
: : "r" (_var_) : "cc")
--
2.39.2
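
For readers without an ARM toolchain, the shape of the patched helpers can be sketched portably. This is only an illustration under stated assumptions: fake_reg, sketch_read() and sketch_write() are hypothetical names, and the empty asm templates stand in for the real vmrs/vmsr instructions. The volatile qualifier tells the compiler that the statement has side effects, so it is not deleted when its result is unused, and in practice GCC and Clang keep volatile asm statements in program order relative to one another, which is the property the patch relies on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an FP control register such as FPEXC. */
static uint32_t fake_reg;

/* Read helper: the "0" matching constraint passes the value through the asm. */
static inline uint32_t sketch_read(void)
{
	uint32_t v;

	/* __asm__ is the GNU spelling of asm that also works in strict ISO modes. */
	__asm__ volatile("" : "=r" (v) : "0" (fake_reg) : "cc");
	return v;
}

/* Write helper: the asm "produces" the new register value from its input. */
static inline void sketch_write(uint32_t val)
{
	__asm__ volatile("" : "=r" (fake_reg) : "0" (val) : "cc");
}

/* Usage: the read is expected to observe the preceding write. */
int sketch_demo(void)
{
	sketch_write(0x1234u);
	assert(sketch_read() == 0x1234u);
	return 1;
}
```

The value 0x1234 is arbitrary. In the real helpers the register name is pasted into the template with the # operator, which this sketch omits.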
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [PATCH] ARM: vfp: use asm volatile for FP control register accesses
2024-03-18 9:30 [PATCH] ARM: vfp: use asm volatile for FP control register accesses Ard Biesheuvel
@ 2024-03-26 23:55 ` Nathan Chancellor
2024-03-27 7:05 ` Ard Biesheuvel
0 siblings, 1 reply; 5+ messages in thread
From: Nathan Chancellor @ 2024-03-26 23:55 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux, arnd, linus.walleij, Ard Biesheuvel,
stable
On Mon, Mar 18, 2024 at 10:30:05AM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Clang may reorder FP control register reads and writes, due to the fact
> that the inline asm() blocks in the read/write wrappers are not volatile
> qualified, and the compiler has no idea that these reads and writes may
> have side effects.
>
> In particular, reads of FPSCR may generate an UNDEF exception if a
> floating point exception is pending, and the FP emulation code in
> VFP_bounce() explicitly clears FP exceptions temporarily in order to be
> able to perform the emulation on behalf of user space. This requires
> that the writes to FPEXC are never reordered with respect to accesses to
> other FP control registers, such as FPSCR.
>
> So use asm volatile for both the read and the write helpers.
>
> Cc: <stable@kernel.org>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
This seems reasonable to me based on my understanding of GCC's
documentation. However, their documentation states "the compiler can
move even volatile asm instructions relative to other code, including
across jump instructions" and I feel like there was some discussion
around this sentence in the past but I can't remember what the
conclusion was, although I want to say Clang did not have the same
behavior. Regardless:
Acked-by: Nathan Chancellor <nathan@kernel.org>
I am just curious, how was this discovered or noticed? Was there a
report I missed?
> ---
> arch/arm/vfp/vfpinstr.h | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
> index 3c7938fd40aa..c4ac778e6fc9 100644
> --- a/arch/arm/vfp/vfpinstr.h
> +++ b/arch/arm/vfp/vfpinstr.h
> @@ -66,14 +66,14 @@
>
> #define fmrx(_vfp_) ({ \
> u32 __v; \
> - asm(".fpu vfpv2\n" \
> + asm volatile(".fpu vfpv2\n" \
> "vmrs %0, " #_vfp_ \
> : "=r" (__v) : : "cc"); \
> __v; \
> })
>
> #define fmxr(_vfp_,_var_) \
> - asm(".fpu vfpv2\n" \
> + asm volatile(".fpu vfpv2\n" \
> "vmsr " #_vfp_ ", %0" \
> : : "r" (_var_) : "cc")
>
> --
> 2.39.2
>
* Re: [PATCH] ARM: vfp: use asm volatile for FP control register accesses
2024-03-26 23:55 ` Nathan Chancellor
@ 2024-03-27 7:05 ` Ard Biesheuvel
2024-03-27 7:31 ` Ard Biesheuvel
2024-03-27 14:41 ` Nathan Chancellor
0 siblings, 2 replies; 5+ messages in thread
From: Ard Biesheuvel @ 2024-03-27 7:05 UTC (permalink / raw)
To: Nathan Chancellor; +Cc: linux-arm-kernel, linux, arnd, linus.walleij, stable
On Wed, 27 Mar 2024 at 01:55, Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Mon, Mar 18, 2024 at 10:30:05AM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Clang may reorder FP control register reads and writes, due to the fact
> > that the inline asm() blocks in the read/write wrappers are not volatile
> > qualified, and the compiler has no idea that these reads and writes may
> > have side effects.
> >
> > In particular, reads of FPSCR may generate an UNDEF exception if a
> > floating point exception is pending, and the FP emulation code in
> > VFP_bounce() explicitly clears FP exceptions temporarily in order to be
> > able to perform the emulation on behalf of user space. This requires
> > that the writes to FPEXC are never reordered with respect to accesses to
> > other FP control registers, such as FPSCR.
> >
> > So use asm volatile for both the read and the write helpers.
> >
> > Cc: <stable@kernel.org>
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>
> This seems reasonable to me based on my understanding of GCC's
> documentation. However, their documentation states "the compiler can
> move even volatile asm instructions relative to other code, including
> across jump instructions" and I feel like there was some discussion
> around this sentence in the past but I can't remember what the
> conclusion was, although I want to say Clang did not have the same
> behavior.
The only thing that matters here is whether two asm blocks are emitted
in a different order than they appear in the program, and I would be
very surprised if volatile permits that. Otherwise, we might introduce
a fake input dependency or a memory clobber instead.
> Regardless:
>
> Acked-by: Nathan Chancellor <nathan@kernel.org>
>
Thanks.
> I am just curious, how was this discovered or noticed? Was there a
> report I missed?
>
I noticed this when building a recent kernel for the original
Raspberry Pi, which is ARMv6 not ARMv7, and has a VFP which partially
relies on emulation. On more recent cores, we never hit the issue
because emulation is never needed. On older cores, there is no VFP so
we never reach this code path either.
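
The memory clobber mentioned earlier as a possible fallback could look roughly like the sketch below. It is an assumption-laden illustration, not the code proposed in this thread: fake_fpscr, clobber_read() and clobber_write() are hypothetical names, and the empty templates stand in for vmrs/vmsr so that the example builds on any architecture. A "memory" clobber makes the compiler assume the statement may read or write arbitrary memory, so two such statements conflict with each other and should not be swapped, at the cost of also acting as a compiler barrier for unrelated loads and stores.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a control register such as FPSCR. */
static uint32_t fake_fpscr;

static inline uint32_t clobber_read(void)
{
	uint32_t v;

	/* Not volatile: the compiler may still drop this if v is never used. */
	__asm__("" : "=r" (v) : "0" (fake_fpscr) : "memory");
	return v;
}

static inline void clobber_write(uint32_t val)
{
	/* "memory": assume any memory may be read or written by this asm. */
	__asm__("" : "=r" (fake_fpscr) : "0" (val) : "memory");
}

/* Usage: write a value, then read it back through the helpers. */
int clobber_demo(void)
{
	clobber_write(0xabcdu);
	assert(clobber_read() == 0xabcdu);
	return 1;
}
```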
* Re: [PATCH] ARM: vfp: use asm volatile for FP control register accesses
2024-03-27 7:05 ` Ard Biesheuvel
@ 2024-03-27 7:31 ` Ard Biesheuvel
2024-03-27 14:41 ` Nathan Chancellor
1 sibling, 0 replies; 5+ messages in thread
From: Ard Biesheuvel @ 2024-03-27 7:31 UTC (permalink / raw)
To: Nathan Chancellor; +Cc: linux-arm-kernel, linux, arnd, linus.walleij, stable
On Wed, 27 Mar 2024 at 09:05, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 27 Mar 2024 at 01:55, Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Mon, Mar 18, 2024 at 10:30:05AM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Clang may reorder FP control register reads and writes, due to the fact
> > > that the inline asm() blocks in the read/write wrappers are not volatile
> > > qualified, and the compiler has no idea that these reads and writes may
> > > have side effects.
> > >
> > > In particular, reads of FPSCR may generate an UNDEF exception if a
> > > floating point exception is pending, and the FP emulation code in
> > > VFP_bounce() explicitly clears FP exceptions temporarily in order to be
> > > able to perform the emulation on behalf of user space. This requires
> > > that the writes to FPEXC are never reordered with respect to accesses to
> > > other FP control registers, such as FPSCR.
> > >
> > > So use asm volatile for both the read and the write helpers.
> > >
> > > Cc: <stable@kernel.org>
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> >
> > This seems reasonable to me based on my understanding of GCC's
> > documentation. However, their documentation states "the compiler can
> > move even volatile asm instructions relative to other code, including
> > across jump instructions" and I feel like there was some discussion
> > around this sentence in the past but I can't remember what the
> > conclusion was, although I want to say Clang did not have the same
> > behavior.
>
> The only thing that matters here is whether two asm blocks are emitted
> in a different order than they appear in the program, and I would be
> very surprised if volatile permits that. Otherwise, we might introduce
> a fake input dependency or a memory clobber instead.
>
> > Regardless:
> >
> > Acked-by: Nathan Chancellor <nathan@kernel.org>
> >
>
> Thanks.
>
> > I am just curious, how was this discovered or noticed? Was there a
> > report I missed?
> >
>
> I noticed this when building a recent kernel for the original
> Raspberry Pi, which is ARMv6 not ARMv7, and has a VFP which partially
> relies on emulation. On more recent cores, we never hit the issue
> because emulation is never needed. On older cores, there is no VFP so
> we never reach this code path either.
An alternative approach might be to do the following, which tricks the
compiler into thinking fmxr might update *current, and fmrx might
access it. Given that current == current_thread_info(), which is used
all the time in the VFP code, it will already be available in a
register, and so it doesn't require the compiler to generate any
additional code.
--- a/arch/arm/vfp/vfpinstr.h
+++ b/arch/arm/vfp/vfpinstr.h
@@ -68,14 +68,14 @@
u32 __v; \
asm(".fpu vfpv2\n" \
"vmrs %0, " #_vfp_ \
- : "=r" (__v) : : "cc"); \
+ : "=r" (__v) : "Q" (*current)); \
__v; \
})
#define fmxr(_vfp_,_var_) \
asm(".fpu vfpv2\n" \
- "vmsr " #_vfp_ ", %0" \
- : : "r" (_var_) : "cc")
+ "vmsr " #_vfp_ ", %1" \
+ : "=Q" (*current) : "r" (_var_))
#else
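
The constraint trick in the diff above can be tried outside the kernel with a portable sketch. Everything named here is hypothetical: struct fake_task and fake_current stand in for *current, the generic "m" constraint replaces the ARM-specific "Q" constraint, and the empty templates replace vmrs/vmsr. The write helper lists the dummy object as a memory output and the read helper lists it as a memory input, so the compiler sees a data dependency from each write to every later read and should not move a read ahead of a write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the task structure that current points to. */
struct fake_task {
	uint32_t dummy;
};

static struct fake_task fake_current;
static uint32_t fake_ctrl;	/* hypothetical control register */

static inline uint32_t dep_read(void)
{
	uint32_t v;

	/* "m" input: the compiler assumes the asm reads fake_current. */
	__asm__("" : "=r" (v) : "0" (fake_ctrl), "m" (fake_current));
	return v;
}

static inline void dep_write(uint32_t val)
{
	/* "=m" output: the compiler assumes the asm writes fake_current. */
	__asm__("" : "=m" (fake_current), "=r" (fake_ctrl) : "1" (val));
}

/* Usage: the read depends on the write through fake_current. */
int dep_demo(void)
{
	dep_write(42u);
	assert(dep_read() == 42u);
	return 1;
}
```

Because these statements are not volatile, a read whose result is unused can still be discarded, and the dependency alone does not order one read against another.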
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] ARM: vfp: use asm volatile for FP control register accesses
2024-03-27 7:05 ` Ard Biesheuvel
2024-03-27 7:31 ` Ard Biesheuvel
@ 2024-03-27 14:41 ` Nathan Chancellor
1 sibling, 0 replies; 5+ messages in thread
From: Nathan Chancellor @ 2024-03-27 14:41 UTC (permalink / raw)
To: Ard Biesheuvel; +Cc: linux-arm-kernel, linux, arnd, linus.walleij, stable
On Wed, Mar 27, 2024 at 09:05:17AM +0200, Ard Biesheuvel wrote:
> On Wed, 27 Mar 2024 at 01:55, Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Mon, Mar 18, 2024 at 10:30:05AM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Clang may reorder FP control register reads and writes, due to the fact
> > > that the inline asm() blocks in the read/write wrappers are not volatile
> > > qualified, and the compiler has no idea that these reads and writes may
> > > have side effects.
> > >
> > > In particular, reads of FPSCR may generate an UNDEF exception if a
> > > floating point exception is pending, and the FP emulation code in
> > > VFP_bounce() explicitly clears FP exceptions temporarily in order to be
> > > able to perform the emulation on behalf of user space. This requires
> > > that the writes to FPEXC are never reordered with respect to accesses to
> > > other FP control registers, such as FPSCR.
> > >
> > > So use asm volatile for both the read and the write helpers.
> > >
> > > Cc: <stable@kernel.org>
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> >
> > This seems reasonable to me based on my understanding of GCC's
> > documentation. However, their documentation states "the compiler can
> > move even volatile asm instructions relative to other code, including
> > across jump instructions" and I feel like there was some discussion
> > around this sentence in the past but I can't remember what the
> > conclusion was, although I want to say Clang did not have the same
> > behavior.
>
> The only thing that matters here is whether two asm blocks are emitted
> in a different order than they appear in the program, and I would be
> very surprised if volatile permits that. Otherwise, we might introduce
> a fake input dependency or a memory clobber instead.
Yeah, I think it is reasonable to go with this approach and fall back to
your follow up suggestion if this proves not to be robust enough.
> > Regardless:
> >
> > Acked-by: Nathan Chancellor <nathan@kernel.org>
> >
>
> Thanks.
>
> > I am just curious, how was this discovered or noticed? Was there a
> > report I missed?
> >
>
> I noticed this when building a recent kernel for the original
> Raspberry Pi, which is ARMv6 not ARMv7, and has a VFP which partially
> relies on emulation. On more recent cores, we never hit the issue
> because emulation is never needed. On older cores, there is no VFP so
> we never reach this code path either.
Ah, good to know. I boot test -next on a Pi 3 with a 32-bit kernel, so
that explains why I have not seen any issues. Just wanted to make sure I
did not miss something :)
Cheers,
Nathan