* [PATCH] ARM: vfp: use asm volatile for FP control register accesses
From: Ard Biesheuvel @ 2024-03-18  9:30 UTC
To: linux-arm-kernel
Cc: linux, arnd, nathan, linus.walleij, Ard Biesheuvel, stable

From: Ard Biesheuvel <ardb@kernel.org>

Clang may reorder FP control register reads and writes, due to the fact
that the inline asm() blocks in the read/write wrappers are not volatile
qualified, and the compiler has no idea that these reads and writes may
have side effects.

In particular, reads of FPSCR may generate an UNDEF exception if a
floating point exception is pending, and the FP emulation code in
VFP_bounce() explicitly clears FP exceptions temporarily in order to be
able to perform the emulation on behalf of user space. This requires
that the writes to FPEXC are never reordered with respect to accesses to
other FP control registers, such as FPSCR.

So use asm volatile for both the read and the write helpers.

Cc: <stable@kernel.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/vfp/vfpinstr.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
index 3c7938fd40aa..c4ac778e6fc9 100644
--- a/arch/arm/vfp/vfpinstr.h
+++ b/arch/arm/vfp/vfpinstr.h
@@ -66,14 +66,14 @@
 
 #define fmrx(_vfp_) ({ \
 	u32 __v; \
-	asm(".fpu vfpv2\n" \
+	asm volatile(".fpu vfpv2\n" \
 	    "vmrs %0, " #_vfp_ \
 	    : "=r" (__v) : : "cc"); \
 	__v; \
  })
 
 #define fmxr(_vfp_,_var_) \
-	asm(".fpu vfpv2\n" \
+	asm volatile(".fpu vfpv2\n" \
 	    "vmsr " #_vfp_ ", %0" \
 	   : : "r" (_var_) : "cc")
 
-- 
2.39.2

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
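The ordering requirement described in the patch can be modelled in portable C. What follows is only a sketch under stated assumptions, not kernel code: plain variables stand in for FPEXC and FPSCR, the asm templates are empty instead of containing vmrs/vmsr, and every model_* and MODEL_* name is invented for illustration. Because the "registers" are ordinary variables, the compiler keeps the order here regardless; the sketch only shows the shape of the volatile-qualified helpers and what a read hoisted above the FPEXC write would observe. It uses the __asm__ __volatile__ spelling, which GCC and Clang also accept in strict -std= modes.

```c
#include <stdint.h>

/* Hypothetical model, not kernel code: plain variables stand in for the
 * FP control registers, and all MODEL_* / model_* names are invented. */
#define MODEL_FPEXC_EX	(1u << 31)	/* "exception pending" flag in the model */
#define MODEL_UNDEF	0xffffffffu	/* returned where hardware would UNDEF */

static uint32_t model_fpexc;
static uint32_t model_fpscr;

/* Stand-in for fmxr(FPEXC, val); the real template contains "vmsr". */
static inline void model_fmxr_fpexc(uint32_t val)
{
	/* Empty template: volatile marks the statement as having side effects. */
	__asm__ __volatile__("" : "+r" (val));
	model_fpexc = val;
}

/* Stand-in for fmrx(FPSCR); the real template contains "vmrs". */
static inline uint32_t model_fmrx_fpscr(void)
{
	uint32_t v = (model_fpexc & MODEL_FPEXC_EX) ? MODEL_UNDEF : model_fpscr;

	__asm__ __volatile__("" : "+r" (v));
	return v;
}

/* Program order the emulation path depends on: clear, then read. */
uint32_t model_bounce(uint32_t fpexc, uint32_t fpscr)
{
	model_fpexc = fpexc;
	model_fpscr = fpscr;
	model_fmxr_fpexc(fpexc & ~MODEL_FPEXC_EX);	/* clear pending exception */
	return model_fmrx_fpscr();			/* now safe to read */
}

/* What a reordered read amounts to: FPSCR is read before the clear. */
uint32_t model_bounce_reordered(uint32_t fpexc, uint32_t fpscr)
{
	uint32_t seen;

	model_fpexc = fpexc;
	model_fpscr = fpscr;
	seen = model_fmrx_fpscr();			/* exception still pending */
	model_fmxr_fpexc(fpexc & ~MODEL_FPEXC_EX);
	return seen;
}
```

In this model, model_bounce() returns the FPSCR value, while model_bounce_reordered() returns MODEL_UNDEF whenever the pending flag was set on entry, which corresponds to the UNDEF exception the patch description warns about.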

* Re: [PATCH] ARM: vfp: use asm volatile for FP control register accesses
From: Nathan Chancellor @ 2024-03-26 23:55 UTC
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux, arnd, linus.walleij, Ard Biesheuvel, stable

On Mon, Mar 18, 2024 at 10:30:05AM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> Clang may reorder FP control register reads and writes, due to the fact
> that the inline asm() blocks in the read/write wrappers are not volatile
> qualified, and the compiler has no idea that these reads and writes may
> have side effects.
>
> In particular, reads of FPSCR may generate an UNDEF exception if a
> floating point exception is pending, and the FP emulation code in
> VFP_bounce() explicitly clears FP exceptions temporarily in order to be
> able to perform the emulation on behalf of user space. This requires
> that the writes to FPEXC are never reordered with respect to accesses to
> other FP control registers, such as FPSCR.
>
> So use asm volatile for both the read and the write helpers.
>
> Cc: <stable@kernel.org>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

This seems reasonable to me based on my understanding of GCC's
documentation. However, their documentation states "the compiler can
move even volatile asm instructions relative to other code, including
across jump instructions" and I feel like there was some discussion
around this sentence in the past but I can't remember what the
conclusion was, although I want to say Clang did not have the same
behavior.

Regardless:

Acked-by: Nathan Chancellor <nathan@kernel.org>

I am just curious, how was this discovered or noticed? Was there a
report I missed?

> ---
>  arch/arm/vfp/vfpinstr.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/vfp/vfpinstr.h b/arch/arm/vfp/vfpinstr.h
> index 3c7938fd40aa..c4ac778e6fc9 100644
> --- a/arch/arm/vfp/vfpinstr.h
> +++ b/arch/arm/vfp/vfpinstr.h
> @@ -66,14 +66,14 @@
>
>  #define fmrx(_vfp_) ({ \
>  	u32 __v; \
> -	asm(".fpu vfpv2\n" \
> +	asm volatile(".fpu vfpv2\n" \
>  	    "vmrs %0, " #_vfp_ \
>  	    : "=r" (__v) : : "cc"); \
>  	__v; \
>   })
>
>  #define fmxr(_vfp_,_var_) \
> -	asm(".fpu vfpv2\n" \
> +	asm volatile(".fpu vfpv2\n" \
>  	    "vmsr " #_vfp_ ", %0" \
>  	   : : "r" (_var_) : "cc")
>
> --
> 2.39.2
>

* Re: [PATCH] ARM: vfp: use asm volatile for FP control register accesses
From: Ard Biesheuvel @ 2024-03-27  7:05 UTC
To: Nathan Chancellor
Cc: linux-arm-kernel, linux, arnd, linus.walleij, stable

On Wed, 27 Mar 2024 at 01:55, Nathan Chancellor <nathan@kernel.org> wrote:
>
> On Mon, Mar 18, 2024 at 10:30:05AM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > Clang may reorder FP control register reads and writes, due to the fact
> > that the inline asm() blocks in the read/write wrappers are not volatile
> > qualified, and the compiler has no idea that these reads and writes may
> > have side effects.
> >
> > In particular, reads of FPSCR may generate an UNDEF exception if a
> > floating point exception is pending, and the FP emulation code in
> > VFP_bounce() explicitly clears FP exceptions temporarily in order to be
> > able to perform the emulation on behalf of user space. This requires
> > that the writes to FPEXC are never reordered with respect to accesses to
> > other FP control registers, such as FPSCR.
> >
> > So use asm volatile for both the read and the write helpers.
> >
> > Cc: <stable@kernel.org>
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>
> This seems reasonable to me based on my understanding of GCC's
> documentation. However, their documentation states "the compiler can
> move even volatile asm instructions relative to other code, including
> across jump instructions" and I feel like there was some discussion
> around this sentence in the past but I can't remember what the
> conclusion was, although I want to say Clang did not have the same
> behavior.

The only thing that matters here is whether two asm blocks are emitted
in a different order than they appear in the program, and I would be
very surprised if volatile permits that. Otherwise, we might introduce
a fake input dependency or a memory clobber instead.

> Regardless:
>
> Acked-by: Nathan Chancellor <nathan@kernel.org>
>

Thanks.

> I am just curious, how was this discovered or noticed? Was there a
> report I missed?
>

I noticed this when building a recent kernel for the original
Raspberry Pi, which is ARMv6 not ARMv7, and has a VFP which partially
relies on emulation. On more recent cores, we never hit the issue
because emulation is never needed. On older cores, there is no VFP so
we never reach this code path either.

* Re: [PATCH] ARM: vfp: use asm volatile for FP control register accesses
From: Ard Biesheuvel @ 2024-03-27  7:31 UTC
To: Nathan Chancellor
Cc: linux-arm-kernel, linux, arnd, linus.walleij, stable

On Wed, 27 Mar 2024 at 09:05, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> On Wed, 27 Mar 2024 at 01:55, Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Mon, Mar 18, 2024 at 10:30:05AM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Clang may reorder FP control register reads and writes, due to the fact
> > > that the inline asm() blocks in the read/write wrappers are not volatile
> > > qualified, and the compiler has no idea that these reads and writes may
> > > have side effects.
> > >
> > > In particular, reads of FPSCR may generate an UNDEF exception if a
> > > floating point exception is pending, and the FP emulation code in
> > > VFP_bounce() explicitly clears FP exceptions temporarily in order to be
> > > able to perform the emulation on behalf of user space. This requires
> > > that the writes to FPEXC are never reordered with respect to accesses to
> > > other FP control registers, such as FPSCR.
> > >
> > > So use asm volatile for both the read and the write helpers.
> > >
> > > Cc: <stable@kernel.org>
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> >
> > This seems reasonable to me based on my understanding of GCC's
> > documentation. However, their documentation states "the compiler can
> > move even volatile asm instructions relative to other code, including
> > across jump instructions" and I feel like there was some discussion
> > around this sentence in the past but I can't remember what the
> > conclusion was, although I want to say Clang did not have the same
> > behavior.
>
> The only thing that matters here is whether two asm blocks are emitted
> in a different order than they appear in the program, and I would be
> very surprised if volatile permits that. Otherwise, we might introduce
> a fake input dependency or a memory clobber instead.
>
> > Regardless:
> >
> > Acked-by: Nathan Chancellor <nathan@kernel.org>
> >
>
> Thanks.
>
> > I am just curious, how was this discovered or noticed? Was there a
> > report I missed?
> >
>
> I noticed this when building a recent kernel for the original
> Raspberry Pi, which is ARMv6 not ARMv7, and has a VFP which partially
> relies on emulation. On more recent cores, we never hit the issue
> because emulation is never needed. On older cores, there is no VFP so
> we never reach this code path either.

An alternative approach might be to do the following, which tricks the
compiler into thinking fmxr might update *current, and fmrx might
access it. Given that current == current_thread_info(), which is used
all the time in the VFP code, it will already be available in a
register, and so it doesn't require the compiler to generate any
additional code.

--- a/arch/arm/vfp/vfpinstr.h
+++ b/arch/arm/vfp/vfpinstr.h
@@ -68,14 +68,14 @@
 	u32 __v; \
 	asm(".fpu vfpv2\n" \
 	    "vmrs %0, " #_vfp_ \
-	    : "=r" (__v) : : "cc"); \
+	    : "=r" (__v) : "Q" (*current)); \
 	__v; \
  })
 
 #define fmxr(_vfp_,_var_) \
 	asm(".fpu vfpv2\n" \
-	    "vmsr " #_vfp_ ", %0" \
-	   : : "r" (_var_) : "cc")
+	    "vmsr " #_vfp_ ", %1" \
+	   : "=Q" (*current) : "r" (_var_))
 
 #else
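
The fake-dependency idea in the message above can also be tried out in portable C. This is a rough sketch under assumptions, not the proposed kernel change: model_current is a hypothetical dummy object playing the role of *current, the asm templates are empty rather than vmrs/vmsr, and all model_* and MODEL_* names are invented. Two deliberate differences from the diff: the generic "m" constraint replaces "Q", which is an ARM machine constraint, and the write helper uses "+m" instead of a pure "=" output, since a pure output tells the compiler that the previous contents of the operand are discarded.

```c
#include <stdint.h>

/* Hypothetical model, not kernel code: every model_* / MODEL_* name is
 * invented here. model_current plays the role of *current. */
struct model_task { uint32_t unused; };
static struct model_task model_current;

#define MODEL_FPEXC_EX	(1u << 31)	/* "exception pending" flag in the model */
#define MODEL_UNDEF	0xffffffffu	/* returned where hardware would UNDEF */

static uint32_t model_fpexc;
static uint32_t model_fpscr;

/* Write helper: plain (non-volatile) asm whose "+m" operand claims that
 * it updates model_current, although the template never touches it. */
static inline void model_fmxr_fpexc(uint32_t val)
{
	__asm__("" : "+m" (model_current) : "r" (val));
	model_fpexc = val;
}

/* Read helper: plain asm whose "m" input claims that it reads
 * model_current, which gives the compiler a data dependency on any
 * earlier write helper. */
static inline uint32_t model_fmrx_fpscr(void)
{
	uint32_t v = (model_fpexc & MODEL_FPEXC_EX) ? MODEL_UNDEF : model_fpscr;

	__asm__("" : "+r" (v) : "m" (model_current));
	return v;
}

/* Clear the pending exception, then read FPSCR, in program order. */
uint32_t model_clear_then_read(uint32_t fpexc, uint32_t fpscr)
{
	model_fpexc = fpexc;
	model_fpscr = fpscr;
	model_fmxr_fpexc(fpexc & ~MODEL_FPEXC_EX);
	return model_fmrx_fpscr();
}
```

The point of the operand is ordering between the two asm statements through a declared memory dependency rather than through the volatile qualifier; whether that is preferable to asm volatile in the real helpers is the open question the thread leaves to follow-up.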

* Re: [PATCH] ARM: vfp: use asm volatile for FP control register accesses
From: Nathan Chancellor @ 2024-03-27 14:41 UTC
To: Ard Biesheuvel
Cc: linux-arm-kernel, linux, arnd, linus.walleij, stable

On Wed, Mar 27, 2024 at 09:05:17AM +0200, Ard Biesheuvel wrote:
> On Wed, 27 Mar 2024 at 01:55, Nathan Chancellor <nathan@kernel.org> wrote:
> >
> > On Mon, Mar 18, 2024 at 10:30:05AM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@kernel.org>
> > >
> > > Clang may reorder FP control register reads and writes, due to the fact
> > > that the inline asm() blocks in the read/write wrappers are not volatile
> > > qualified, and the compiler has no idea that these reads and writes may
> > > have side effects.
> > >
> > > In particular, reads of FPSCR may generate an UNDEF exception if a
> > > floating point exception is pending, and the FP emulation code in
> > > VFP_bounce() explicitly clears FP exceptions temporarily in order to be
> > > able to perform the emulation on behalf of user space. This requires
> > > that the writes to FPEXC are never reordered with respect to accesses to
> > > other FP control registers, such as FPSCR.
> > >
> > > So use asm volatile for both the read and the write helpers.
> > >
> > > Cc: <stable@kernel.org>
> > > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> >
> > This seems reasonable to me based on my understanding of GCC's
> > documentation. However, their documentation states "the compiler can
> > move even volatile asm instructions relative to other code, including
> > across jump instructions" and I feel like there was some discussion
> > around this sentence in the past but I can't remember what the
> > conclusion was, although I want to say Clang did not have the same
> > behavior.
>
> The only thing that matters here is whether two asm blocks are emitted
> in a different order than they appear in the program, and I would be
> very surprised if volatile permits that. Otherwise, we might introduce
> a fake input dependency or a memory clobber instead.

Yeah, I think it is reasonable to go with this approach and fall back to
your follow up suggestion if this proves not to be robust enough.

> > Regardless:
> >
> > Acked-by: Nathan Chancellor <nathan@kernel.org>
> >
>
> Thanks.
>
> > I am just curious, how was this discovered or noticed? Was there a
> > report I missed?
> >
>
> I noticed this when building a recent kernel for the original
> Raspberry Pi, which is ARMv6 not ARMv7, and has a VFP which partially
> relies on emulation. On more recent cores, we never hit the issue
> because emulation is never needed. On older cores, there is no VFP so
> we never reach this code path either.

Ah, good to know. I boot test -next on a Pi 3 with a 32-bit kernel, so
that explains why I have not seen any issues. Just wanted to make sure I
did not miss something :)

Cheers,
Nathan