* [PATCH 0/3] s390: Fix and optimize __flogr() inline assembly
@ 2025-09-10 15:12 Heiko Carstens
2025-09-10 15:12 ` [PATCH 1/3] Compiler Attributes: Add __assume macro Heiko Carstens
` (2 more replies)
0 siblings, 3 replies; 15+ messages in thread
From: Heiko Carstens @ 2025-09-10 15:12 UTC (permalink / raw)
To: Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev,
Juergen Christ
Cc: linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger
A recent optimization of the s390 specific ffs() and ffs64()
implementations leads to a new compiler warning. Instead of reverting the
optimization address this with the rather new assume attribute, which
generates even better code, if supported by compilers.
Since the assume attribute may be useful for others as well, add the
__assume macro to compiler attributes, so it is kernel wide available,
instead of adding an s390 specific optimization.
Heiko Carstens (3):
Compiler Attributes: Add __assume macro
s390/bitops: Limit return value range of __flogr()
s390/bitops: Remove volatile qualifier from flogr() inline assembly
arch/s390/include/asm/bitops.h | 16 +++++++++-------
include/linux/compiler_attributes.h | 16 ++++++++++++++++
2 files changed, 25 insertions(+), 7 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 15+ messages in thread* [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-10 15:12 [PATCH 0/3] s390: Fix and optimize __flogr() inline assembly Heiko Carstens @ 2025-09-10 15:12 ` Heiko Carstens 2025-09-11 1:32 ` Nathan Chancellor 2025-09-11 18:59 ` Miguel Ojeda 2025-09-10 15:12 ` [PATCH 2/3] s390/bitops: Limit return value range of __flogr() Heiko Carstens 2025-09-10 15:12 ` [PATCH 3/3] s390/bitops: Remove volatile qualifier from flogr() inline assembly Heiko Carstens 2 siblings, 2 replies; 15+ messages in thread From: Heiko Carstens @ 2025-09-10 15:12 UTC (permalink / raw) To: Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ Cc: linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger Make the statement attribute "assume" with a new __assume macro available. This allows compilers to generate better code, however code which makes use of __assume must be written as if the compiler ignores the hint. Otherwise this may lead to subtle bugs if code is compiled with compilers which do not support the attribute. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> --- include/linux/compiler_attributes.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+) diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h index c16d4199bf92..16c3d4a865e2 100644 --- a/include/linux/compiler_attributes.h +++ b/include/linux/compiler_attributes.h @@ -54,6 +54,22 @@ */ #define __always_inline inline __attribute__((__always_inline__)) +/* + * Beware: Code which makes use of __assume must be written as if the compiler + * ignores the hint. Otherwise this may lead to subtle bugs if code is compiled + * with compilers which do not support the attribute. + * + * Optional: only supported since GCC >= 13.1, clang >= 12.0 + * + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html#index-assume-statement-attribute + * clang: https://clang.llvm.org/docs/AttributeReference.html#assume + */ +#if __has_attribute(assume) +# define __assume(expr) __attribute__((__assume__(expr))) +#else +# define __assume(expr) +#endif + /* * The second argument is optional (default 0), so we use a variadic macro * to make the shorthand. -- 2.48.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-10 15:12 ` [PATCH 1/3] Compiler Attributes: Add __assume macro Heiko Carstens @ 2025-09-11 1:32 ` Nathan Chancellor 2025-09-11 14:56 ` Heiko Carstens 2025-09-11 18:56 ` Miguel Ojeda 2025-09-11 18:59 ` Miguel Ojeda 1 sibling, 2 replies; 15+ messages in thread From: Nathan Chancellor @ 2025-09-11 1:32 UTC (permalink / raw) To: Heiko Carstens Cc: Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger Hi Heiko, On Wed, Sep 10, 2025 at 05:12:14PM +0200, Heiko Carstens wrote: > Make the statement attribute "assume" with a new __assume macro available. > > This allows compilers to generate better code, however code which makes use > of __assume must be written as if the compiler ignores the hint. Otherwise > this may lead to subtle bugs if code is compiled with compilers which do > not support the attribute. > > Signed-off-by: Heiko Carstens <hca@linux.ibm.com> > --- > include/linux/compiler_attributes.h | 16 ++++++++++++++++ > 1 file changed, 16 insertions(+) > > diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h > index c16d4199bf92..16c3d4a865e2 100644 > --- a/include/linux/compiler_attributes.h > +++ b/include/linux/compiler_attributes.h > @@ -54,6 +54,22 @@ > */ > #define __always_inline inline __attribute__((__always_inline__)) > > +/* > + * Beware: Code which makes use of __assume must be written as if the compiler > + * ignores the hint. Otherwise this may lead to subtle bugs if code is compiled > + * with compilers which do not support the attribute. It may be worth noting that careful analysis should be performed when adding this attribute since clang's documentation [1] (more on that below...) notes that it could hurt optimizations just as much as it could help it. > + * > + * Optional: only supported since GCC >= 13.1, clang >= 12.0 > + * > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html#index-assume-statement-attribute > + * clang: https://clang.llvm.org/docs/AttributeReference.html#assume Looking at this link sent me down a bit of a rabbit hole :) Prior to Clang 19.1.0 [2], assume was an OpenMP attribute, which has completely different semantics and errors out when used in the way the series does: In file included from kernel/bounds.c:13: In file included from include/linux/log2.h:12: In file included from include/linux/bitops.h:67: arch/s390/include/asm/bitops.h:173:12: error: expected string literal as argument of '__assume__' attribute 173 | __assume(bit <= 64); | ^ Unfortunately, I think __assume will need to be handled in the compiler specific headers :/ [1]: https://clang.llvm.org/docs/AttributeReference.html#id13 [2]: https://github.com/llvm/llvm-project/commit/c44fa3e8a9a44c2e9a575768a3c185354b9f6c17 Cheers, Nathan > + */ > +#if __has_attribute(assume) > +# define __assume(expr) __attribute__((__assume__(expr))) > +#else > +# define __assume(expr) > +#endif > + > /* > * The second argument is optional (default 0), so we use a variadic macro > * to make the shorthand. > -- > 2.48.1 > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-11 1:32 ` Nathan Chancellor @ 2025-09-11 14:56 ` Heiko Carstens 2025-09-11 18:44 ` Nathan Chancellor 2025-09-11 18:56 ` Miguel Ojeda 1 sibling, 1 reply; 15+ messages in thread From: Heiko Carstens @ 2025-09-11 14:56 UTC (permalink / raw) To: Nathan Chancellor Cc: Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger On Wed, Sep 10, 2025 at 06:32:43PM -0700, Nathan Chancellor wrote: > > + * > > + * Optional: only supported since GCC >= 13.1, clang >= 12.0 > > + * > > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html#index-assume-statement-attribute > > + * clang: https://clang.llvm.org/docs/AttributeReference.html#assume > > Looking at this link sent me down a bit of a rabbit hole :) Prior to > Clang 19.1.0 [2], assume was an OpenMP attribute, which has completely > different semantics and errors out when used in the way the series does: > > In file included from kernel/bounds.c:13: > In file included from include/linux/log2.h:12: > In file included from include/linux/bitops.h:67: > arch/s390/include/asm/bitops.h:173:12: error: expected string literal as argument of '__assume__' attribute > 173 | __assume(bit <= 64); > | ^ > > Unfortunately, I think __assume will need to be handled in the compiler > specific headers :/ > > [1]: https://clang.llvm.org/docs/AttributeReference.html#id13 > [2]: https://github.com/llvm/llvm-project/commit/c44fa3e8a9a44c2e9a575768a3c185354b9f6c17 Thank you for having look. This is quite surprising. So after looking into the various header files it might be acceptable to add this to compiler_types.h, since there seem to be a few similar constructs. Maybe something like this(?): From d9d67807e6854666507e55d9ac0c7b4ec659aa99 Mon Sep 17 00:00:00 2001 From: Heiko Carstens <hca@linux.ibm.com> Date: Wed, 10 Sep 2025 14:18:07 +0200 Subject: [PATCH] compiler_types: Add __assume macro Make the statement attribute "assume" with a new __assume macro available. This allows compilers to generate better code, however code which makes use of __assume must be written as if the compiler ignores the hint. Otherwise this may lead to subtle bugs if code is compiled with compilers which do not support the attribute. Signed-off-by: Heiko Carstens <hca@linux.ibm.com> --- include/linux/compiler_types.h | 20 ++++++++++++++++++++ init/Kconfig | 10 ++++++++++ 2 files changed, 30 insertions(+) diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index 16755431fc11..38a52a792e48 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -329,6 +329,26 @@ struct ftrace_likely_data { #define __no_sanitize_or_inline __always_inline #endif +/* + * Beware: Code which makes use of __assume must be written as if the compiler + * ignores the hint. Otherwise this may lead to subtle bugs if code is compiled + * with compilers which do not support the attribute. + * Using this attribute requires careful analysis, since in some cases it may + * generate worse code (see clang documentation). + * + * Optional: only supported since gcc >= 13 + * Optional: only supported since clang >= 19 + * + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html#index-assume-statement-attribute + * clang: https://clang.llvm.org/docs/AttributeReference.html#id13 + * + */ +#ifdef CONFIG_CC_HAS_ASSUME +# define __assume(expr) __attribute__((__assume__(expr))) +#else +# define __assume(expr) +#endif + /* * Optional: only supported since gcc >= 15 * Optional: only supported since clang >= 18 diff --git a/init/Kconfig b/init/Kconfig index e3eb63eadc87..5882c5e74047 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -112,6 +112,16 @@ config TOOLS_SUPPORT_RELR config CC_HAS_ASM_INLINE def_bool $(success,echo 'void foo(void) { asm inline (""); }' | $(CC) -x c - -c -o /dev/null) +config CC_HAS_ASSUME + bool + # clang needs to be at least 19.1.0 since the meaning of the assume + # attribute changed: + # https://github.com/llvm/llvm-project/commit/c44fa3e8a9a44c2e9a575768a3c185354b9f6c17 + default y if CC_IS_CLANG && CLANG_VERSION >= 190100 + # supported since gcc 13.1.0 + # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106654 + default y if CC_IS_GCC && GCC_VERSION >= 130100 + config CC_HAS_NO_PROFILE_FN_ATTR def_bool $(success,echo '__attribute__((no_profile_instrument_function)) int x();' | $(CC) -x c - -c -o /dev/null -Werror) -- 2.48.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-11 14:56 ` Heiko Carstens @ 2025-09-11 18:44 ` Nathan Chancellor 2025-09-11 19:04 ` Miguel Ojeda 0 siblings, 1 reply; 15+ messages in thread From: Nathan Chancellor @ 2025-09-11 18:44 UTC (permalink / raw) To: Heiko Carstens Cc: Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger On Thu, Sep 11, 2025 at 04:56:59PM +0200, Heiko Carstens wrote: > On Wed, Sep 10, 2025 at 06:32:43PM -0700, Nathan Chancellor wrote: > > > + * > > > + * Optional: only supported since GCC >= 13.1, clang >= 12.0 > > > + * > > > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html#index-assume-statement-attribute > > > + * clang: https://clang.llvm.org/docs/AttributeReference.html#assume > > > > Looking at this link sent me down a bit of a rabbit hole :) Prior to > > Clang 19.1.0 [2], assume was an OpenMP attribute, which has completely > > different semantics and errors out when used in the way the series does: > > > > In file included from kernel/bounds.c:13: > > In file included from include/linux/log2.h:12: > > In file included from include/linux/bitops.h:67: > > arch/s390/include/asm/bitops.h:173:12: error: expected string literal as argument of '__assume__' attribute > > 173 | __assume(bit <= 64); > > | ^ > > > > Unfortunately, I think __assume will need to be handled in the compiler > > specific headers :/ > > > > [1]: https://clang.llvm.org/docs/AttributeReference.html#id13 > > [2]: https://github.com/llvm/llvm-project/commit/c44fa3e8a9a44c2e9a575768a3c185354b9f6c17 > > Thank you for having look. This is quite surprising. So after looking into the > various header files it might be acceptable to add this to compiler_types.h, > since there seem to be a few similar constructs. > > Maybe something like this(?): Ah, yeah, that would work too. I had not considered compiler_types.h since most of those tend to involve dynamic checks via cc-option but I do like keeping the documentation attached to the attribute in a single location, rather than duplicating it in the individual compiler files. This will also make it easy to move this into compiler_attributes.h in the (likely distant) future when both compilers support this unconditionally (or clang 19.1.0 is the minimum supported version so __has_attribute can be used). > From d9d67807e6854666507e55d9ac0c7b4ec659aa99 Mon Sep 17 00:00:00 2001 > From: Heiko Carstens <hca@linux.ibm.com> > Date: Wed, 10 Sep 2025 14:18:07 +0200 > Subject: [PATCH] compiler_types: Add __assume macro > > Make the statement attribute "assume" with a new __assume macro available. > > This allows compilers to generate better code, however code which makes use > of __assume must be written as if the compiler ignores the hint. Otherwise > this may lead to subtle bugs if code is compiled with compilers which do > not support the attribute. > > Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> I do not think anyone really owns compiler_types.h so unless Miguel has any objections from the compiler attributes perspective, I think you can just take this via the s390 tree with the other two changes. > --- > include/linux/compiler_types.h | 20 ++++++++++++++++++++ > init/Kconfig | 10 ++++++++++ > 2 files changed, 30 insertions(+) > > diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h > index 16755431fc11..38a52a792e48 100644 > --- a/include/linux/compiler_types.h > +++ b/include/linux/compiler_types.h > @@ -329,6 +329,26 @@ struct ftrace_likely_data { > #define __no_sanitize_or_inline __always_inline > #endif > > +/* > + * Beware: Code which makes use of __assume must be written as if the compiler > + * ignores the hint. Otherwise this may lead to subtle bugs if code is compiled > + * with compilers which do not support the attribute. > + * Using this attribute requires careful analysis, since in some cases it may > + * generate worse code (see clang documentation). > + * > + * Optional: only supported since gcc >= 13 > + * Optional: only supported since clang >= 19 > + * > + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Statement-Attributes.html#index-assume-statement-attribute > + * clang: https://clang.llvm.org/docs/AttributeReference.html#id13 > + * > + */ > +#ifdef CONFIG_CC_HAS_ASSUME > +# define __assume(expr) __attribute__((__assume__(expr))) > +#else > +# define __assume(expr) > +#endif > + > /* > * Optional: only supported since gcc >= 15 > * Optional: only supported since clang >= 18 > diff --git a/init/Kconfig b/init/Kconfig > index e3eb63eadc87..5882c5e74047 100644 > --- a/init/Kconfig > +++ b/init/Kconfig > @@ -112,6 +112,16 @@ config TOOLS_SUPPORT_RELR > config CC_HAS_ASM_INLINE > def_bool $(success,echo 'void foo(void) { asm inline (""); }' | $(CC) -x c - -c -o /dev/null) > > +config CC_HAS_ASSUME > + bool > + # clang needs to be at least 19.1.0 since the meaning of the assume > + # attribute changed: > + # https://github.com/llvm/llvm-project/commit/c44fa3e8a9a44c2e9a575768a3c185354b9f6c17 > + default y if CC_IS_CLANG && CLANG_VERSION >= 190100 > + # supported since gcc 13.1.0 > + # https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106654 > + default y if CC_IS_GCC && GCC_VERSION >= 130100 > + > config CC_HAS_NO_PROFILE_FN_ATTR > def_bool $(success,echo '__attribute__((no_profile_instrument_function)) int x();' | $(CC) -x c - -c -o /dev/null -Werror) > > -- > 2.48.1 ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-11 18:44 ` Nathan Chancellor @ 2025-09-11 19:04 ` Miguel Ojeda 2025-09-11 20:42 ` Nathan Chancellor 0 siblings, 1 reply; 15+ messages in thread From: Miguel Ojeda @ 2025-09-11 19:04 UTC (permalink / raw) To: Nathan Chancellor Cc: Heiko Carstens, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger On Thu, Sep 11, 2025 at 8:44 PM Nathan Chancellor <nathan@kernel.org> wrote: > > I do not think anyone really owns compiler_types.h so unless Miguel has > any objections from the compiler attributes perspective, I think you can > just take this via the s390 tree with the other two changes. No objections from me, and thanks for spotting the OpenMP thing above. I would say, though, that this is a fairly general and subtle tool to have around, so it would be nice to have others chime in. In other words, do we want to start using `assume`s? Should we constrain its use a bit, e.g. say its use should really be justified etc.? (In the Rust side, a tool like this would require a SAFETY comment on top with a justification, which may give a developer pause). Thanks! Cheers, Miguel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-11 19:04 ` Miguel Ojeda @ 2025-09-11 20:42 ` Nathan Chancellor 0 siblings, 0 replies; 15+ messages in thread From: Nathan Chancellor @ 2025-09-11 20:42 UTC (permalink / raw) To: Miguel Ojeda Cc: Heiko Carstens, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger On Thu, Sep 11, 2025 at 09:04:36PM +0200, Miguel Ojeda wrote: > On Thu, Sep 11, 2025 at 8:44 PM Nathan Chancellor <nathan@kernel.org> wrote: > > > > I do not think anyone really owns compiler_types.h so unless Miguel has > > any objections from the compiler attributes perspective, I think you can > > just take this via the s390 tree with the other two changes. > > No objections from me, and thanks for spotting the OpenMP thing above. > > I would say, though, that this is a fairly general and subtle tool to > have around, so it would be nice to have others chime in. In other > words, do we want to start using `assume`s? Should we constrain its > use a bit, e.g. say its use should really be justified etc.? (In the > Rust side, a tool like this would require a SAFETY comment on top with > a justification, which may give a developer pause). I do think justification at the source level (i.e., a comment) would be a good baseline. I thought I remember a similar discussion around likely() / unlikely() annotations since those should have some evidence of benefit behind it. Applying the same policy to __assume() usage would help ensure there is sufficient justification for adding and maintaining such annotations, especially if they turn out to cause problems later. Not sure if there should be a format standard like exists for SAFETY comments but something is better than nothing. Cheers, Nathan ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-11 1:32 ` Nathan Chancellor 2025-09-11 14:56 ` Heiko Carstens @ 2025-09-11 18:56 ` Miguel Ojeda 1 sibling, 0 replies; 15+ messages in thread From: Miguel Ojeda @ 2025-09-11 18:56 UTC (permalink / raw) To: Nathan Chancellor Cc: Heiko Carstens, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger On Thu, Sep 11, 2025 at 3:32 AM Nathan Chancellor <nathan@kernel.org> wrote: > > It may be worth noting that careful analysis should be performed when > adding this attribute since clang's documentation [1] (more on that > below...) notes that it could hurt optimizations just as much as it > could help it. Yeah, it can be tricky, and I assume it may depend on the compiler version too, i.e. the result could change over time. At least for "build asserts relying on optimizations" it is clear if it stops working, but here I assume we may have new compiler versions getting released that stop doing what the developer intended. But perhaps is not a problem in practice for the cases we care? Does someone know? > Looking at this link sent me down a bit of a rabbit hole :) Prior to > Clang 19.1.0 [2], assume was an OpenMP attribute, which has completely > different semantics and errors out when used in the way the series does: Oh... :( Cheers, Miguel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-10 15:12 ` [PATCH 1/3] Compiler Attributes: Add __assume macro Heiko Carstens 2025-09-11 1:32 ` Nathan Chancellor @ 2025-09-11 18:59 ` Miguel Ojeda 2025-09-12 10:25 ` Heiko Carstens 1 sibling, 1 reply; 15+ messages in thread From: Miguel Ojeda @ 2025-09-11 18:59 UTC (permalink / raw) To: Heiko Carstens Cc: Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger On Wed, Sep 10, 2025 at 5:12 PM Heiko Carstens <hca@linux.ibm.com> wrote: > > + * Beware: Code which makes use of __assume must be written as if the compiler > + * ignores the hint. Otherwise this may lead to subtle bugs if code is compiled > + * with compilers which do not support the attribute. I am not sure I understand this "Beware:" comment: is it referring to evaluation side-effects? If so, the GCC docs say it is not evaluated. The real danger is triggering UB with it, but that is different, i.e. one needs to be really, really sure the expression is true. Cheers, Miguel ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 1/3] Compiler Attributes: Add __assume macro 2025-09-11 18:59 ` Miguel Ojeda @ 2025-09-12 10:25 ` Heiko Carstens 0 siblings, 0 replies; 15+ messages in thread From: Heiko Carstens @ 2025-09-12 10:25 UTC (permalink / raw) To: Miguel Ojeda Cc: Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger On Thu, Sep 11, 2025 at 08:59:29PM +0200, Miguel Ojeda wrote: > On Wed, Sep 10, 2025 at 5:12 PM Heiko Carstens <hca@linux.ibm.com> wrote: > > > > + * Beware: Code which makes use of __assume must be written as if the compiler > > + * ignores the hint. Otherwise this may lead to subtle bugs if code is compiled > > + * with compilers which do not support the attribute. > > I am not sure I understand this "Beware:" comment: is it referring to > evaluation side-effects? If so, the GCC docs say it is not evaluated. > The real danger is triggering UB with it, but that is different, i.e. > one needs to be really, really sure the expression is true. No, I was referring to the original build error where the missing "& 127" lead to a warning / build error. So what I was trying to say: if you have a construct like: ... return a & 127; and then make this: ... __assume(a < 64); return a & 127; then it is not possible to leave the "& 127" part away, since __assume() is optional. But thinking about this again, I guess the comment is misleading, like also your reply proved. This is not about subtle bugs, but just an optimization that is not being done, which may or may not lead to compile time warnings for the particular case I was trying to improve; but the code would be correct in any case, as long as __assume() is used correctly. I'll rephrase the comment, and split / reorder patches differently so it is (hopefully) more obvious what I try to achieve: allow for micro-optimizations of inline assembly outputs. ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 2/3] s390/bitops: Limit return value range of __flogr() 2025-09-10 15:12 [PATCH 0/3] s390: Fix and optimize __flogr() inline assembly Heiko Carstens 2025-09-10 15:12 ` [PATCH 1/3] Compiler Attributes: Add __assume macro Heiko Carstens @ 2025-09-10 15:12 ` Heiko Carstens 2025-09-11 7:44 ` Juergen Christ 2025-09-11 13:24 ` kernel test robot 2025-09-10 15:12 ` [PATCH 3/3] s390/bitops: Remove volatile qualifier from flogr() inline assembly Heiko Carstens 2 siblings, 2 replies; 15+ messages in thread From: Heiko Carstens @ 2025-09-10 15:12 UTC (permalink / raw) To: Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ Cc: linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger With the recent ffs() and ffs64() optimization a logical and operation was removed, which allows the compiler to tell that the return value range of both functions. This may lead to compile warnings as reported by the kernel test robot. Instead of only adding the not needed mask again, also add an __assume() statement to tell newer compilers that they can assume a specific value range. This allows newer compilers to optimize the not-needed logical and operation away. Also change the return type of flogr() to unsigned long and add the const attribute to the function. With this the reported warning is away, and in addition the kernel image size is reduced by ~4kb. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202508211859.UoYsJbLN-lkp@intel.com/ Fixes: de88e74889a3 ("s390/bitops: Slightly optimize ffs() and fls64()") Suggested-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> --- arch/s390/include/asm/bitops.h | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/arch/s390/include/asm/bitops.h b/arch/s390/include/asm/bitops.h index 9dfb687ba620..46b90b545986 100644 --- a/arch/s390/include/asm/bitops.h +++ b/arch/s390/include/asm/bitops.h @@ -130,11 +130,12 @@ static inline bool test_bit_inv(unsigned long nr, * where the most significant bit has bit number 0. * If no bit is set this function returns 64. */ -static __always_inline unsigned char __flogr(unsigned long word) +static __always_inline __attribute_const__ unsigned long __flogr(unsigned long word) { - if (__builtin_constant_p(word)) { - unsigned long bit = 0; + unsigned long bit; + if (__builtin_constant_p(word)) { + bit = 0; if (!word) return 64; if (!(word & 0xffffffff00000000UL)) { @@ -169,7 +170,9 @@ static __always_inline unsigned char __flogr(unsigned long word) asm volatile( " flogr %[rp],%[rp]\n" : [rp] "+d" (rp.pair) : : "cc"); - return rp.even; + bit = rp.even; + __assume(bit <= 64); + return bit & 127; } } -- 2.48.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] s390/bitops: Limit return value range of __flogr() 2025-09-10 15:12 ` [PATCH 2/3] s390/bitops: Limit return value range of __flogr() Heiko Carstens @ 2025-09-11 7:44 ` Juergen Christ 2025-09-11 13:24 ` kernel test robot 1 sibling, 0 replies; 15+ messages in thread From: Juergen Christ @ 2025-09-11 7:44 UTC (permalink / raw) To: Heiko Carstens Cc: Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger > With the recent ffs() and ffs64() optimization a logical and operation was > removed, which allows the compiler to tell that the return value range of > both functions. This may lead to compile warnings as reported by the kernel > test robot. > > Instead of only adding the not needed mask again, also add an __assume() > statement to tell newer compilers that they can assume a specific value > range. This allows newer compilers to optimize the not-needed logical and > operation away. > > Also change the return type of flogr() to unsigned long and add the const > attribute to the function. > > With this the reported warning is away, and in addition the kernel image > size is reduced by ~4kb. > > Reported-by: kernel test robot <lkp@intel.com> > Closes: https://lore.kernel.org/oe-kbuild-all/202508211859.UoYsJbLN-lkp@intel.com/ > Fixes: de88e74889a3 ("s390/bitops: Slightly optimize ffs() and fls64()") > Suggested-by: Juergen Christ <jchrist@linux.ibm.com> > Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH 2/3] s390/bitops: Limit return value range of __flogr() 2025-09-10 15:12 ` [PATCH 2/3] s390/bitops: Limit return value range of __flogr() Heiko Carstens 2025-09-11 7:44 ` Juergen Christ @ 2025-09-11 13:24 ` kernel test robot 1 sibling, 0 replies; 15+ messages in thread From: kernel test robot @ 2025-09-11 13:24 UTC (permalink / raw) To: Heiko Carstens, Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ Cc: llvm, oe-kbuild-all, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger Hi Heiko, kernel test robot noticed the following build errors: [auto build test ERROR on s390/features] [also build test ERROR on next-20250911] [cannot apply to linus/master v6.17-rc5] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Heiko-Carstens/Compiler-Attributes-Add-__assume-macro/20250910-231949 base: https://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git features patch link: https://lore.kernel.org/r/20250910151216.646600-3-hca%40linux.ibm.com patch subject: [PATCH 2/3] s390/bitops: Limit return value range of __flogr() config: s390-randconfig-002-20250911 (https://download.01.org/0day-ci/archive/20250911/202509112018.eZI47cSy-lkp@intel.com/config) compiler: clang version 16.0.6 (https://github.com/llvm/llvm-project 7cbf1a2591520c2491aa35339f227775f4d3adf6) reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250911/202509112018.eZI47cSy-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202509112018.eZI47cSy-lkp@intel.com/ All errors (new ones prefixed by >>): In file included from kernel/bounds.c:13: In file included from include/linux/log2.h:12: In file included from include/linux/bitops.h:67: >> arch/s390/include/asm/bitops.h:174:3: error: '__assume__' attribute cannot be applied to a statement __assume(bit <= 64); ^ ~ include/linux/compiler_attributes.h:68:56: note: expanded from macro '__assume' # define __assume(expr) __attribute__((__assume__(expr))) ^ 1 error generated. make[3]: *** [scripts/Makefile.build:182: kernel/bounds.s] Error 1 shuffle=1723937077 make[3]: Target 'prepare' not remade because of errors. make[2]: *** [Makefile:1282: prepare0] Error 2 shuffle=1723937077 make[2]: Target 'prepare' not remade because of errors. make[1]: *** [Makefile:248: __sub-make] Error 2 shuffle=1723937077 make[1]: Target 'prepare' not remade because of errors. make: *** [Makefile:248: __sub-make] Error 2 shuffle=1723937077 make: Target 'prepare' not remade because of errors. vim +/__assume__ +174 arch/s390/include/asm/bitops.h 124 125 /** 126 * __flogr - find leftmost one 127 * @word - The word to search 128 * 129 * Returns the bit number of the most significant bit set, 130 * where the most significant bit has bit number 0. 131 * If no bit is set this function returns 64. 132 */ 133 static __always_inline __attribute_const__ unsigned long __flogr(unsigned long word) 134 { 135 unsigned long bit; 136 137 if (__builtin_constant_p(word)) { 138 bit = 0; 139 if (!word) 140 return 64; 141 if (!(word & 0xffffffff00000000UL)) { 142 word <<= 32; 143 bit += 32; 144 } 145 if (!(word & 0xffff000000000000UL)) { 146 word <<= 16; 147 bit += 16; 148 } 149 if (!(word & 0xff00000000000000UL)) { 150 word <<= 8; 151 bit += 8; 152 } 153 if (!(word & 0xf000000000000000UL)) { 154 word <<= 4; 155 bit += 4; 156 } 157 if (!(word & 0xc000000000000000UL)) { 158 word <<= 2; 159 bit += 2; 160 } 161 if (!(word & 0x8000000000000000UL)) { 162 word <<= 1; 163 bit += 1; 164 } 165 return bit; 166 } else { 167 union register_pair rp __uninitialized; 168 169 rp.even = word; 170 asm volatile( 171 " flogr %[rp],%[rp]\n" 172 : [rp] "+d" (rp.pair) : : "cc"); 173 bit = rp.even; > 174 __assume(bit <= 64); 175 return bit & 127; 176 } 177 } 178 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH 3/3] s390/bitops: Remove volatile qualifier from flogr() inline assembly 2025-09-10 15:12 [PATCH 0/3] s390: Fix and optimize __flogr() inline assembly Heiko Carstens 2025-09-10 15:12 ` [PATCH 1/3] Compiler Attributes: Add __assume macro Heiko Carstens 2025-09-10 15:12 ` [PATCH 2/3] s390/bitops: Limit return value range of __flogr() Heiko Carstens @ 2025-09-10 15:12 ` Heiko Carstens 2025-09-11 7:45 ` Juergen Christ 2 siblings, 1 reply; 15+ messages in thread From: Heiko Carstens @ 2025-09-10 15:12 UTC (permalink / raw) To: Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, Juergen Christ Cc: linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger The flogr() inline assembly has no side effects and generates the same output if the input does not change. Therefore remove the volatile qualifier to allow the compiler to optimize the inline assembly away, if possible. This also removes the superfluous '\n' which makes the inline assembly appear larger than it is according to compiler heuristics (number of lines). Suggested-by: Juergen Christ <jchrist@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> --- arch/s390/include/asm/bitops.h | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/arch/s390/include/asm/bitops.h b/arch/s390/include/asm/bitops.h index 46b90b545986..5e04836c9c8d 100644 --- a/arch/s390/include/asm/bitops.h +++ b/arch/s390/include/asm/bitops.h @@ -167,9 +167,8 @@ static __always_inline __attribute_const__ unsigned long __flogr(unsigned long w union register_pair rp __uninitialized; rp.even = word; - asm volatile( - " flogr %[rp],%[rp]\n" - : [rp] "+d" (rp.pair) : : "cc"); + asm("flogr %[rp],%[rp]" + : [rp] "+d" (rp.pair) : : "cc"); bit = rp.even; __assume(bit <= 64); return bit & 127; -- 2.48.1 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH 3/3] s390/bitops: Remove volatile qualifier from flogr() inline assembly 2025-09-10 15:12 ` [PATCH 3/3] s390/bitops: Remove volatile qualifier from flogr() inline assembly Heiko Carstens @ 2025-09-11 7:45 ` Juergen Christ 0 siblings, 0 replies; 15+ messages in thread From: Juergen Christ @ 2025-09-11 7:45 UTC (permalink / raw) To: Heiko Carstens Cc: Nathan Chancellor, Miguel Ojeda, Vasily Gorbik, Alexander Gordeev, linux-kernel, linux-s390, Sven Schnelle, Christian Borntraeger > The flogr() inline assembly has no side effects and generates the same > output if the input does not change. Therefore remove the volatile > qualifier to allow the compiler to optimize the inline assembly away, > if possible. > > This also removes the superfluous '\n' which makes the inline assembly > appear larger than it is according to compiler heuristics (number of > lines). > > Suggested-by: Juergen Christ <jchrist@linux.ibm.com> > Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Juergen Christ <jchrist@linux.ibm.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-09-12 10:26 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-09-10 15:12 [PATCH 0/3] s390: Fix and optimize __flogr() inline assembly Heiko Carstens 2025-09-10 15:12 ` [PATCH 1/3] Compiler Attributes: Add __assume macro Heiko Carstens 2025-09-11 1:32 ` Nathan Chancellor 2025-09-11 14:56 ` Heiko Carstens 2025-09-11 18:44 ` Nathan Chancellor 2025-09-11 19:04 ` Miguel Ojeda 2025-09-11 20:42 ` Nathan Chancellor 2025-09-11 18:56 ` Miguel Ojeda 2025-09-11 18:59 ` Miguel Ojeda 2025-09-12 10:25 ` Heiko Carstens 2025-09-10 15:12 ` [PATCH 2/3] s390/bitops: Limit return value range of __flogr() Heiko Carstens 2025-09-11 7:44 ` Juergen Christ 2025-09-11 13:24 ` kernel test robot 2025-09-10 15:12 ` [PATCH 3/3] s390/bitops: Remove volatile qualifier from flogr() inline assembly Heiko Carstens 2025-09-11 7:45 ` Juergen Christ
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox