* [PATCH] arm64/io: add constant-argument check
@ 2024-05-28 12:08 Arnd Bergmann
2024-05-28 15:30 ` Arnd Bergmann
2024-05-29 11:14 ` Mark Rutland
0 siblings, 2 replies; 7+ messages in thread
From: Arnd Bergmann @ 2024-05-28 12:08 UTC (permalink / raw)
To: Catalin Marinas, Will Deacon, Jason Gunthorpe
Cc: Arnd Bergmann, Valentin Schneider, Mark Rutland, Baoquan He,
Peter Zijlstra, linux-arm-kernel, linux-kernel
From: Arnd Bergmann <arnd@arndb.de>
In some configurations __const_iowrite32_copy() does not get inlined
and gcc runs into the BUILD_BUG():
In file included from <command-line>:
In function '__const_memcpy_toio_aligned32',
inlined from '__const_iowrite32_copy' at arch/arm64/include/asm/io.h:203:3,
inlined from '__const_iowrite32_copy' at arch/arm64/include/asm/io.h:199:20:
include/linux/compiler_types.h:487:45: error: call to '__compiletime_assert_538' declared with attribute error: BUILD_BUG failed
487 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^
include/linux/compiler_types.h:468:25: note: in definition of macro '__compiletime_assert'
468 | prefix ## suffix(); \
| ^~~~~~
include/linux/compiler_types.h:487:9: note: in expansion of macro '_compiletime_assert'
487 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__)
| ^~~~~~~~~~~~~~~~~~~
include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert'
39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
| ^~~~~~~~~~~~~~~~~~
include/linux/build_bug.h:59:21: note: in expansion of macro 'BUILD_BUG_ON_MSG'
59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed")
| ^~~~~~~~~~~~~~~~
arch/arm64/include/asm/io.h:193:17: note: in expansion of macro 'BUILD_BUG'
193 | BUILD_BUG();
| ^~~~~~~~~
Add a check to ensure that the argument is in fact a constant before
calling into __const_memcpy_toio_aligned32().
Fixes: ead79118dae6 ("arm64/io: Provide a WC friendly __iowriteXX_copy()")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
arch/arm64/include/asm/io.h | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 4ff0ae3f6d66..44913f227060 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -199,7 +199,8 @@ void __iowrite32_copy_full(void __iomem *to, const void *from, size_t count);
static inline void __const_iowrite32_copy(void __iomem *to, const void *from,
size_t count)
{
- if (count == 8 || count == 4 || count == 2 || count == 1) {
+ if (__builtin_constant_p(count) &&
+ (count == 8 || count == 4 || count == 2 || count == 1)) {
__const_memcpy_toio_aligned32(to, from, count);
dgh();
} else {
--
2.39.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
^ permalink raw reply related [flat|nested] 7+ messages in thread* Re: [PATCH] arm64/io: add constant-argument check 2024-05-28 12:08 [PATCH] arm64/io: add constant-argument check Arnd Bergmann @ 2024-05-28 15:30 ` Arnd Bergmann 2024-05-29 11:14 ` Mark Rutland 1 sibling, 0 replies; 7+ messages in thread From: Arnd Bergmann @ 2024-05-28 15:30 UTC (permalink / raw) To: Arnd Bergmann, Catalin Marinas, Will Deacon, Jason Gunthorpe Cc: Valentin Schneider, Mark Rutland, Baoquan He, Peter Zijlstra, linux-arm-kernel, linux-kernel On Tue, May 28, 2024, at 14:08, Arnd Bergmann wrote: > diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h > index 4ff0ae3f6d66..44913f227060 100644 > --- a/arch/arm64/include/asm/io.h > +++ b/arch/arm64/include/asm/io.h > @@ -199,7 +199,8 @@ void __iowrite32_copy_full(void __iomem *to, const > void *from, size_t count); > static inline void __const_iowrite32_copy(void __iomem *to, const void > *from, > size_t count) > { > - if (count == 8 || count == 4 || count == 2 || count == 1) { > + if (__builtin_constant_p(count) && > + (count == 8 || count == 4 || count == 2 || count == 1)) { > __const_memcpy_toio_aligned32(to, from, count); > dgh(); > } else { I ran into the same issue on __const_iowrite64_copy() now, will send an updated patch. I also noticed that the __iowrite32_copy() and __iowrite64_copy() look redundant after my change, so I wonder if we should just remove those and rename __const_iowrite32_copy() and __const_iowrite64_copy() accordingly. Arnd _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] arm64/io: add constant-argument check 2024-05-28 12:08 [PATCH] arm64/io: add constant-argument check Arnd Bergmann 2024-05-28 15:30 ` Arnd Bergmann @ 2024-05-29 11:14 ` Mark Rutland 2024-05-29 12:29 ` Arnd Bergmann 1 sibling, 1 reply; 7+ messages in thread From: Mark Rutland @ 2024-05-29 11:14 UTC (permalink / raw) To: Arnd Bergmann Cc: Catalin Marinas, Will Deacon, Jason Gunthorpe, Arnd Bergmann, Valentin Schneider, Baoquan He, Peter Zijlstra, linux-arm-kernel, linux-kernel On Tue, May 28, 2024 at 02:08:38PM +0200, Arnd Bergmann wrote: > From: Arnd Bergmann <arnd@arndb.de> > > In some configurations __const_iowrite32_copy() does not get inlined > and gcc runs into the BUILD_BUG(): > > In file included from <command-line>: > In function '__const_memcpy_toio_aligned32', > inlined from '__const_iowrite32_copy' at arch/arm64/include/asm/io.h:203:3, > inlined from '__const_iowrite32_copy' at arch/arm64/include/asm/io.h:199:20: > include/linux/compiler_types.h:487:45: error: call to '__compiletime_assert_538' declared with attribute error: BUILD_BUG failed > 487 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) > | ^ > include/linux/compiler_types.h:468:25: note: in definition of macro '__compiletime_assert' > 468 | prefix ## suffix(); \ > | ^~~~~~ > include/linux/compiler_types.h:487:9: note: in expansion of macro '_compiletime_assert' > 487 | _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) > | ^~~~~~~~~~~~~~~~~~~ > include/linux/build_bug.h:39:37: note: in expansion of macro 'compiletime_assert' > 39 | #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg) > | ^~~~~~~~~~~~~~~~~~ > include/linux/build_bug.h:59:21: note: in expansion of macro 'BUILD_BUG_ON_MSG' > 59 | #define BUILD_BUG() BUILD_BUG_ON_MSG(1, "BUILD_BUG failed") > | ^~~~~~~~~~~~~~~~ > arch/arm64/include/asm/io.h:193:17: note: in expansion of macro 'BUILD_BUG' > 193 | BUILD_BUG(); > | ^~~~~~~~~ > > Add a check to ensure that the argument is in fact a constant before > calling into __const_memcpy_toio_aligned32(). > > Fixes: ead79118dae6 ("arm64/io: Provide a WC friendly __iowriteXX_copy()") > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > --- > arch/arm64/include/asm/io.h | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h > index 4ff0ae3f6d66..44913f227060 100644 > --- a/arch/arm64/include/asm/io.h > +++ b/arch/arm64/include/asm/io.h > @@ -199,7 +199,8 @@ void __iowrite32_copy_full(void __iomem *to, const void *from, size_t count); > static inline void __const_iowrite32_copy(void __iomem *to, const void *from, > size_t count) > { > - if (count == 8 || count == 4 || count == 2 || count == 1) { > + if (__builtin_constant_p(count) && > + (count == 8 || count == 4 || count == 2 || count == 1)) { > __const_memcpy_toio_aligned32(to, from, count); > dgh(); > } else { I don't think this is the right fix. The idea was that this was checked in __iowrite32_copy(), which does: #define __iowrite32_copy(to, from, count) \ (__builtin_constant_p(count) ? \ __const_iowrite32_copy(to, from, count) : \ __iowrite32_copy_full(to, from, count)) ... and so __const_iowrite32_copy() should really be marked as __always_inline, and the same for __const_memcpy_toio_aligned32(), to guarantee that both get inlined and such that __const_memcpy_toio_aligned32() sees a constant. The same reasoning applies to __const_iowrite64_copy() and __const_memcpy_toio_aligned64(). Checking for a constant in __const_iowrite32_copy() doesn't guarantee that __const_memcpy_toio_aligned32() is inlined and will actually see a constant. Does diff the below you for you? Mark. ---->8---- diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h index 4ff0ae3f6d669..f4350aae92d5d 100644 --- a/arch/arm64/include/asm/io.h +++ b/arch/arm64/include/asm/io.h @@ -153,8 +153,9 @@ extern void __memset_io(volatile void __iomem *, int, size_t); * emit the large TLP from the CPU. */ -static inline void __const_memcpy_toio_aligned32(volatile u32 __iomem *to, - const u32 *from, size_t count) +static __always_inline void +__const_memcpy_toio_aligned32(volatile u32 __iomem *to, const u32 *from, + size_t count) { switch (count) { case 8: @@ -196,8 +197,8 @@ static inline void __const_memcpy_toio_aligned32(volatile u32 __iomem *to, void __iowrite32_copy_full(void __iomem *to, const void *from, size_t count); -static inline void __const_iowrite32_copy(void __iomem *to, const void *from, - size_t count) +static __always_inline void +__const_iowrite32_copy(void __iomem *to, const void *from, size_t count) { if (count == 8 || count == 4 || count == 2 || count == 1) { __const_memcpy_toio_aligned32(to, from, count); @@ -212,8 +213,9 @@ static inline void __const_iowrite32_copy(void __iomem *to, const void *from, __const_iowrite32_copy(to, from, count) : \ __iowrite32_copy_full(to, from, count)) -static inline void __const_memcpy_toio_aligned64(volatile u64 __iomem *to, - const u64 *from, size_t count) +static __always_inline void +__const_memcpy_toio_aligned64(volatile u64 __iomem *to, const u64 *from, + size_t count) { switch (count) { case 8: @@ -255,8 +257,8 @@ static inline void __const_memcpy_toio_aligned64(volatile u64 __iomem *to, void __iowrite64_copy_full(void __iomem *to, const void *from, size_t count); -static inline void __const_iowrite64_copy(void __iomem *to, const void *from, - size_t count) +static __always_inline void +__const_iowrite64_copy(void __iomem *to, const void *from, size_t count) { if (count == 8 || count == 4 || count == 2 || count == 1) { __const_memcpy_toio_aligned64(to, from, count); _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH] arm64/io: add constant-argument check 2024-05-29 11:14 ` Mark Rutland @ 2024-05-29 12:29 ` Arnd Bergmann 2024-05-29 15:08 ` Mark Rutland 0 siblings, 1 reply; 7+ messages in thread From: Arnd Bergmann @ 2024-05-29 12:29 UTC (permalink / raw) To: Mark Rutland, Arnd Bergmann Cc: Catalin Marinas, Will Deacon, Jason Gunthorpe, Valentin Schneider, Baoquan He, Peter Zijlstra, linux-arm-kernel, linux-kernel On Wed, May 29, 2024, at 13:14, Mark Rutland wrote: >> { >> - if (count == 8 || count == 4 || count == 2 || count == 1) { >> + if (__builtin_constant_p(count) && >> + (count == 8 || count == 4 || count == 2 || count == 1)) { >> __const_memcpy_toio_aligned32(to, from, count); >> dgh(); >> } else { > > I don't think this is the right fix. > > The idea was that this was checked in __iowrite32_copy(), which does: > > #define __iowrite32_copy(to, from, count) \ > (__builtin_constant_p(count) ? \ > __const_iowrite32_copy(to, from, count) : \ > __iowrite32_copy_full(to, from, count)) > > ... and so __const_iowrite32_copy() should really be marked as __always_inline, > and the same for __const_memcpy_toio_aligned32(), to guarantee that both get > inlined and such that __const_memcpy_toio_aligned32() sees a constant. > > The same reasoning applies to __const_iowrite64_copy() and > __const_memcpy_toio_aligned64(). > > Checking for a constant in __const_iowrite32_copy() doesn't guarantee > that __const_memcpy_toio_aligned32() is inlined and will actually see a > constant. > > Does diff the below you for you? Yes, your version addresses both failures I ran into, and I think all other theoretical cases. I would prefer to combine both though, using __always_inline to force the compiler to pick the inline version over __iowrite32_copy_full() even when it is optimizing for size and it decides the inline version is larger, but removing the extra complexity from the macro. According to Jason, he used a macro here to be sure that the compiler can detect an inline function argument as constant when the value is known but it is not a constant value according to the C standard. This was indeed a problem in older versions of clang that missed a lot of optimizations in the kernel, but clang-8 and higher were changed to have the same behavior as gcc here, so it is no longer necessary now that the older versions are unable to build kernels. Arnd _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] arm64/io: add constant-argument check 2024-05-29 12:29 ` Arnd Bergmann @ 2024-05-29 15:08 ` Mark Rutland 2024-05-29 16:15 ` Arnd Bergmann 0 siblings, 1 reply; 7+ messages in thread From: Mark Rutland @ 2024-05-29 15:08 UTC (permalink / raw) To: Arnd Bergmann Cc: Arnd Bergmann, Catalin Marinas, Will Deacon, Jason Gunthorpe, Valentin Schneider, Baoquan He, Peter Zijlstra, linux-arm-kernel, linux-kernel On Wed, May 29, 2024 at 02:29:37PM +0200, Arnd Bergmann wrote: > On Wed, May 29, 2024, at 13:14, Mark Rutland wrote: > >> { > >> - if (count == 8 || count == 4 || count == 2 || count == 1) { > >> + if (__builtin_constant_p(count) && > >> + (count == 8 || count == 4 || count == 2 || count == 1)) { > >> __const_memcpy_toio_aligned32(to, from, count); > >> dgh(); > >> } else { > > > > I don't think this is the right fix. > > > > The idea was that this was checked in __iowrite32_copy(), which does: > > > > #define __iowrite32_copy(to, from, count) \ > > (__builtin_constant_p(count) ? \ > > __const_iowrite32_copy(to, from, count) : \ > > __iowrite32_copy_full(to, from, count)) > > > > ... and so __const_iowrite32_copy() should really be marked as __always_inline, > > and the same for __const_memcpy_toio_aligned32(), to guarantee that both get > > inlined and such that __const_memcpy_toio_aligned32() sees a constant. > > > > The same reasoning applies to __const_iowrite64_copy() and > > __const_memcpy_toio_aligned64(). > > > > Checking for a constant in __const_iowrite32_copy() doesn't guarantee > > that __const_memcpy_toio_aligned32() is inlined and will actually see a > > constant. > > > > Does diff the below you for you? > > Yes, your version addresses both failures I ran into, and > I think all other theoretical cases. > > I would prefer to combine both though, using __always_inline > to force the compiler to pick the inline version over > __iowrite32_copy_full() even when it is optimizing for size > and it decides the inline version is larger, but removing > the extra complexity from the macro. Sorry, I'm not sure what you mean here. I don't see anything handling optimizing for size today so I'm not sure what change your suggesting to force the use of the inline version; AFAICT that'd always be forced for a suitable constant size. What change are you suggesting? > According to Jason, he used a macro here to be sure > that the compiler can detect an inline function argument > as constant when the value is known but it is not > a constant value according to the C standard. > > This was indeed a problem in older versions of clang > that missed a lot of optimizations in the kernel, but > clang-8 and higher were changed to have the same behavior > as gcc here, so it is no longer necessary now that the > older versions are unable to build kernels. Getting rid of the macro is fine by me. Mark. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] arm64/io: add constant-argument check 2024-05-29 15:08 ` Mark Rutland @ 2024-05-29 16:15 ` Arnd Bergmann 2024-05-29 16:36 ` Mark Rutland 0 siblings, 1 reply; 7+ messages in thread From: Arnd Bergmann @ 2024-05-29 16:15 UTC (permalink / raw) To: Mark Rutland Cc: Arnd Bergmann, Catalin Marinas, Will Deacon, Jason Gunthorpe, Valentin Schneider, Baoquan He, Peter Zijlstra, linux-arm-kernel, linux-kernel On Wed, May 29, 2024, at 17:08, Mark Rutland wrote: > On Wed, May 29, 2024 at 02:29:37PM +0200, Arnd Bergmann wrote: >> On Wed, May 29, 2024, at 13:14, Mark Rutland wrote: >> >> Yes, your version addresses both failures I ran into, and >> I think all other theoretical cases. >> >> I would prefer to combine both though, using __always_inline >> to force the compiler to pick the inline version over >> __iowrite32_copy_full() even when it is optimizing for size >> and it decides the inline version is larger, but removing >> the extra complexity from the macro. > > Sorry, I'm not sure what you mean here. I don't see anything handling > optimizing for size today so I'm not sure what change your suggesting to > force the use of the inline version; AFAICT that'd always be forced for > a suitable constant size. > > What change are you suggesting? What I meant is that reason gcc chooses to not inline the macro is when we build with CONFIG_CC_OPTIMIZE_FOR_SIZE. Since it doesn't know that __const_memcpy_toio_aligned64() is intended to be small after inlining, it sometimes decides against it, which (with just my patch) would fall back to the out-of-line __iowrite32_copy_full() while trying to generate smaller code. The __always_inline annotation just overrides the calculation. Arnd _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH] arm64/io: add constant-argument check 2024-05-29 16:15 ` Arnd Bergmann @ 2024-05-29 16:36 ` Mark Rutland 0 siblings, 0 replies; 7+ messages in thread From: Mark Rutland @ 2024-05-29 16:36 UTC (permalink / raw) To: Arnd Bergmann Cc: Arnd Bergmann, Catalin Marinas, Will Deacon, Jason Gunthorpe, Valentin Schneider, Baoquan He, Peter Zijlstra, linux-arm-kernel, linux-kernel On Wed, May 29, 2024 at 06:15:57PM +0200, Arnd Bergmann wrote: > On Wed, May 29, 2024, at 17:08, Mark Rutland wrote: > > On Wed, May 29, 2024 at 02:29:37PM +0200, Arnd Bergmann wrote: > >> On Wed, May 29, 2024, at 13:14, Mark Rutland wrote: > > >> > >> Yes, your version addresses both failures I ran into, and > >> I think all other theoretical cases. > >> > >> I would prefer to combine both though, using __always_inline > >> to force the compiler to pick the inline version over > >> __iowrite32_copy_full() even when it is optimizing for size > >> and it decides the inline version is larger, but removing > >> the extra complexity from the macro. > > > > Sorry, I'm not sure what you mean here. I don't see anything handling > > optimizing for size today so I'm not sure what change your suggesting to > > force the use of the inline version; AFAICT that'd always be forced for > > a suitable constant size. > > > > What change are you suggesting? > > What I meant is that reason gcc chooses to not inline > the macro is when we build with CONFIG_CC_OPTIMIZE_FOR_SIZE. > > Since it doesn't know that __const_memcpy_toio_aligned64() > is intended to be small after inlining, it sometimes > decides against it, which (with just my patch) would > fall back to the out-of-line __iowrite32_copy_full() > while trying to generate smaller code. > > The __always_inline annotation just overrides the > calculation. Ah, ok. I think what you're suggesting is: * Add the __always_inline annotations, as in my patch. * Move the __builtin_constant_p check into __const_iowrite32_copy(), as in your patch. * Remove the __iowrite32_copy() macro and rename __const_iowrite32_copy() to __iowrite32_copy(), removing the redundant logic. Assuming so, that makes total sense to me. Mark. _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-05-29 16:36 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-05-28 12:08 [PATCH] arm64/io: add constant-argument check Arnd Bergmann 2024-05-28 15:30 ` Arnd Bergmann 2024-05-29 11:14 ` Mark Rutland 2024-05-29 12:29 ` Arnd Bergmann 2024-05-29 15:08 ` Mark Rutland 2024-05-29 16:15 ` Arnd Bergmann 2024-05-29 16:36 ` Mark Rutland
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).