public inbox for linux-arch@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
@ 2023-08-01 22:22 ndesaulniers
  2023-08-01 22:45 ` Nick Desaulniers
  2023-08-02  1:07 ` Linus Torvalds
  0 siblings, 2 replies; 9+ messages in thread
From: ndesaulniers @ 2023-08-01 22:22 UTC (permalink / raw)
  To: Arnd Bergmann, Linus Torvalds
  Cc: Nathan Chancellor, Tom Rix, linux-arch, linux-kernel, llvm,
	Nick Desaulniers

Compiling big-endian targets with Clang produces the diagnostic:

fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands
[-Wbitwise-instead-of-logical]
} while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants)));
          ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                           ||
fs/namei.c:2173:13: note: cast one or both operands to int to silence
this warning

It appears that when has_zero was introduced, two definitions were
produced with different signatures (in particular different return types).

Looking at the usage in hash_name() in fs/namei.c, I suspect that
has_zero() is meant to be invoked twice per while loop iteration; using
logical-or would not update `bdata` when `a` did not have zeros. So I
think it's preferred to always return an unsigned long rather than a
bool then update the while loop in hash_name() to use a logical-or
rather than bitwise-or.

Link: https://github.com/ClangBuiltLinux/linux/issues/1832
Fixes: 36126f8f2ed8 ("word-at-a-time: make the interfaces truly generic")
Debugged-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
---
 include/asm-generic/word-at-a-time.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/asm-generic/word-at-a-time.h b/include/asm-generic/word-at-a-time.h
index 20c93f08c993..95a1d214108a 100644
--- a/include/asm-generic/word-at-a-time.h
+++ b/include/asm-generic/word-at-a-time.h
@@ -38,7 +38,7 @@ static inline long find_zero(unsigned long mask)
 	return (mask >> 8) ? byte : byte + 1;
 }
 
-static inline bool has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
+static inline unsigned long has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
 {
 	unsigned long rhs = val | c->low_bits;
 	*data = rhs;

---
base-commit: 18b44bc5a67275641fb26f2c54ba7eef80ac5950
change-id: 20230801-bitwise-7812b11e5fb7

Best regards,
-- 
Nick Desaulniers <ndesaulniers@google.com>


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
  2023-08-01 22:22 [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness ndesaulniers
@ 2023-08-01 22:45 ` Nick Desaulniers
  2023-08-02  1:07 ` Linus Torvalds
  1 sibling, 0 replies; 9+ messages in thread
From: Nick Desaulniers @ 2023-08-01 22:45 UTC (permalink / raw)
  To: Arnd Bergmann, Linus Torvalds
  Cc: Nathan Chancellor, Tom Rix, linux-arch, linux-kernel, llvm

On Tue, Aug 1, 2023 at 3:22 PM <ndesaulniers@google.com> wrote:
>
> Compiling big-endian targets with Clang produces the diagnostic:
>
> fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands
> [-Wbitwise-instead-of-logical]
> } while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants)));
>           ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>                                            ||
> fs/namei.c:2173:13: note: cast one or both operands to int to silence
> this warning
>
> It appears that when has_zero was introduced, two definitions were
> produced with different signatures (in particular different return types).
>
> Looking at the usage in hash_name() in fs/namei.c, I suspect that
> has_zero() is meant to be invoked twice per while loop iteration; using
> logical-or would not update `bdata` when `a` did not have zeros. So I
> think it's preferred to always return an unsigned long rather than a
> bool then update the while loop in hash_name() to use a logical-or

s/then/than/

> rather than bitwise-or.
>
> Link: https://github.com/ClangBuiltLinux/linux/issues/1832
> Fixes: 36126f8f2ed8 ("word-at-a-time: make the interfaces truly generic")
> Debugged-by: Nathan Chancellor <nathan@kernel.org>
> Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
> ---
>  include/asm-generic/word-at-a-time.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/word-at-a-time.h b/include/asm-generic/word-at-a-time.h
> index 20c93f08c993..95a1d214108a 100644
> --- a/include/asm-generic/word-at-a-time.h
> +++ b/include/asm-generic/word-at-a-time.h
> @@ -38,7 +38,7 @@ static inline long find_zero(unsigned long mask)
>         return (mask >> 8) ? byte : byte + 1;
>  }
>
> -static inline bool has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
> +static inline unsigned long has_zero(unsigned long val, unsigned long *data, const struct word_at_a_time *c)
>  {
>         unsigned long rhs = val | c->low_bits;
>         *data = rhs;
>
> ---
> base-commit: 18b44bc5a67275641fb26f2c54ba7eef80ac5950
> change-id: 20230801-bitwise-7812b11e5fb7
>
> Best regards,
> --
> Nick Desaulniers <ndesaulniers@google.com>
>


-- 
Thanks,
~Nick Desaulniers

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
  2023-08-01 22:22 [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness ndesaulniers
  2023-08-01 22:45 ` Nick Desaulniers
@ 2023-08-02  1:07 ` Linus Torvalds
  2023-08-02  5:59   ` Arnd Bergmann
  2023-08-02 16:15   ` Nathan Chancellor
  1 sibling, 2 replies; 9+ messages in thread
From: Linus Torvalds @ 2023-08-02  1:07 UTC (permalink / raw)
  To: ndesaulniers
  Cc: Arnd Bergmann, Nathan Chancellor, Tom Rix, linux-arch,
	linux-kernel, llvm

On Tue, 1 Aug 2023 at 15:22, <ndesaulniers@google.com> wrote:
>
> Compiling big-endian targets with Clang produces the diagnostic:
>
> fs/namei.c:2173:13: warning: use of bitwise '|' with boolean operands
> [-Wbitwise-instead-of-logical]
> } while (!(has_zero(a, &adata, &constants) | has_zero(b, &bdata, &constants)));

Gaah.

Yes, I think that 'has_zero()' should return the 'unsigned long' bits
on big-endian too, because I do think we always want the bit ops, and
turn it into a boolean only at the very end.

> It appears that when has_zero was introduced, two definitions were
> produced with different signatures (in particular different return types).

Big-endian was kind of a later addition, and while that file is called
"generic", it's really "little-endian has an easier time of this all,
but let's do the 'generic' file for the more complicated case".

Who ends up being affected by this? Powerpc does its own
word-at-a-time thing because the big-endian case is nasty and you can
do better with special instructions that they have.

Who else is even BE any more? Some old 32-bit arm setup?

I think the patch is fine, but I guess I'd like to know that people
who are affected actually don't see any code generation changes (or
possibly see improvements from not turning it into a bool until later)

            Linus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
  2023-08-02  1:07 ` Linus Torvalds
@ 2023-08-02  5:59   ` Arnd Bergmann
  2023-08-02  7:50     ` Heiko Carstens
  2023-08-02 16:15   ` Nathan Chancellor
  1 sibling, 1 reply; 9+ messages in thread
From: Arnd Bergmann @ 2023-08-02  5:59 UTC (permalink / raw)
  To: Linus Torvalds, Nick Desaulniers
  Cc: Nathan Chancellor, Tom Rix, Linux-Arch, linux-kernel, llvm,
	Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, linux-s390,
	Thomas Bogendoerfer

On Wed, Aug 2, 2023, at 03:07, Linus Torvalds wrote:
> On Tue, 1 Aug 2023 at 15:22, <ndesaulniers@google.com> wrote:

> Who ends up being affected by this? Powerpc does its own
> word-at-a-time thing because the big-endian case is nasty and you can
> do better with special instructions that they have.

powerpc needs the same patch though.

> Who else is even BE any more? Some old 32-bit arm setup?
>
> I think the patch is fine, but I guess I'd like to know that people
> who are affected actually don't see any code generation changes (or
> possibly see improvements from not turning it into a bool until later)

s390 is the main one here, maintainers added to Cc.

Atheros and Realtek MIPS are older but still shipping in low-end
network equipment, and there are still users on old embedded
big-endian mips (broadcom, cavium, intel/lantiq and arm
(intel ixp4xx) hardware. All modern Arm hardware (v6/v7/v8/v9)
can do both big-endian and little-endian, but actual user numbers
for big-endian are close to zero -- I have not heard from anyone
actually using it in years.

And then there was a lot of the old-school workstation/server
hardware (m68k, mips, parisc and sparc, not alpha/ia64) that is
mostly big-endian and still maintained to some degree.

      Arnd

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
  2023-08-02  5:59   ` Arnd Bergmann
@ 2023-08-02  7:50     ` Heiko Carstens
  0 siblings, 0 replies; 9+ messages in thread
From: Heiko Carstens @ 2023-08-02  7:50 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linus Torvalds, Nick Desaulniers, Nathan Chancellor, Tom Rix,
	Linux-Arch, linux-kernel, llvm, Vasily Gorbik, Alexander Gordeev,
	Christian Borntraeger, Sven Schnelle, linux-s390,
	Thomas Bogendoerfer

On Wed, Aug 02, 2023 at 07:59:48AM +0200, Arnd Bergmann wrote:
> On Wed, Aug 2, 2023, at 03:07, Linus Torvalds wrote:
> > On Tue, 1 Aug 2023 at 15:22, <ndesaulniers@google.com> wrote:
> 
> > Who ends up being affected by this? Powerpc does its own
> > word-at-a-time thing because the big-endian case is nasty and you can
> > do better with special instructions that they have.
> 
> powerpc needs the same patch though.
> 
> > Who else is even BE any more? Some old 32-bit arm setup?
> >
> > I think the patch is fine, but I guess I'd like to know that people
> > who are affected actually don't see any code generation changes (or
> > possibly see improvements from not turning it into a bool until later)
> 
> s390 is the main one here, maintainers added to Cc.

The generated code on s390 is identical with and without the patch
using gcc 13.2.
So this looks fine from my point of view.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
  2023-08-02  1:07 ` Linus Torvalds
  2023-08-02  5:59   ` Arnd Bergmann
@ 2023-08-02 16:15   ` Nathan Chancellor
  2023-08-02 17:37     ` Linus Torvalds
  1 sibling, 1 reply; 9+ messages in thread
From: Nathan Chancellor @ 2023-08-02 16:15 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: ndesaulniers, Arnd Bergmann, Tom Rix, linux-arch, linux-kernel,
	llvm

On Tue, Aug 01, 2023 at 06:07:08PM -0700, Linus Torvalds wrote:
> I think the patch is fine, but I guess I'd like to know that people
> who are affected actually don't see any code generation changes (or
> possibly see improvements from not turning it into a bool until later)

We see this warning with ARCH=arm64 defconfig + CONFIG_CPU_BIG_ENDIAN=y.
With both clang 18.0.0 (tip of tree) and GCC 13.1.0, I don't see any
actual code generation changes in fs/namei.o with this configuration.
I'd be pretty surprised if any of the other uses of has_zero() show any
changes, I at least checked lib/string.o with that configuration and
s390 and did not see anything.

As far as I can tell, arm and arm64 with CONFIG_CPU_BIG_ENDIAN=y are the
only configurations that can hit the particular bit of code with the
generic big endian has_zero() implementation because the version of
hash_name() that uses has_zero() in this manner is only used when
CONFIG_DCACHE_WORD_ACCESS is set, which only arm, arm64, powerpc (little
endian), and x86 select.

arch/arm/Kconfig:49:         select DCACHE_WORD_ACCESS if HAVE_EFFICIENT_UNALIGNED_ACCESS
arch/arm64/Kconfig:121:        select DCACHE_WORD_ACCESS
arch/powerpc/Kconfig:183:        select DCACHE_WORD_ACCESS                if PPC64 && CPU_LITTLE_ENDIAN
arch/x86/Kconfig:140:        select DCACHE_WORD_ACCESS                if !KMSAN
arch/x86/um/Kconfig:12:         select DCACHE_WORD_ACCESS

So seems like a pretty low risk patch to me but I could be missing
something.

Cheers,
Nathan

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
  2023-08-02 16:15   ` Nathan Chancellor
@ 2023-08-02 17:37     ` Linus Torvalds
  2023-08-02 18:17       ` Arnd Bergmann
  0 siblings, 1 reply; 9+ messages in thread
From: Linus Torvalds @ 2023-08-02 17:37 UTC (permalink / raw)
  To: Nathan Chancellor, Michael Ellerman, Will Deacon, Catalin Marinas
  Cc: ndesaulniers, Arnd Bergmann, Tom Rix, linux-arch, linux-kernel,
	llvm, linuxppc-dev

On Wed, 2 Aug 2023 at 09:16, Nathan Chancellor <nathan@kernel.org> wrote:
>
> We see this warning with ARCH=arm64 defconfig + CONFIG_CPU_BIG_ENDIAN=y.

Oh Christ. I didn't even realize that arm64 allowed a BE config.

The config option goes back to 2013 - are there actually BE user space
implementations around?

People, why do we do that? That's positively crazy. BE is dead and
should be relegated to legacy platforms. There are no advantages to
being different just for the sake of being different - any "security
by obscurity" argument would be far outweighed by the inconvenience to
actual users.

Yes, yes, I know the aarch64 architecture technically allows BE
implementations - and apparently you can even do it by exception
level, which I had to look up. But do any actually exist?

Does the kernel even work right in BE mode? It's really easy to miss
some endianness check when all the actual hardware and use is LE, and
when (for example) instruction encoding and IO is then always LE
anyway.

> With both clang 18.0.0 (tip of tree) and GCC 13.1.0, I don't see any
> actual code generation changes in fs/namei.o with this configuration.

Ok, since the main legacy platform confirmed that, I'll just apply
that patch directly.

I'll also do the powerpc version that Arnd pointed to at the same
time, since it seems silly to pick these off one at a time. It too
should just be 'unsigned long', so that the two values can be bitwise
or'ed together without any questions.

              Linus

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
  2023-08-02 17:37     ` Linus Torvalds
@ 2023-08-02 18:17       ` Arnd Bergmann
  2023-08-04 13:04         ` Will Deacon
  0 siblings, 1 reply; 9+ messages in thread
From: Arnd Bergmann @ 2023-08-02 18:17 UTC (permalink / raw)
  To: Linus Torvalds, Nathan Chancellor, Michael Ellerman, Will Deacon,
	Catalin Marinas
  Cc: Nick Desaulniers, Tom Rix, Linux-Arch, linux-kernel, llvm,
	linuxppc-dev

On Wed, Aug 2, 2023, at 19:37, Linus Torvalds wrote:
> On Wed, 2 Aug 2023 at 09:16, Nathan Chancellor <nathan@kernel.org> wrote:
>>
>> We see this warning with ARCH=arm64 defconfig + CONFIG_CPU_BIG_ENDIAN=y.
>
> Oh Christ. I didn't even realize that arm64 allowed a BE config.
>
> The config option goes back to 2013 - are there actually BE user space
> implementations around?

At least NXP's Layerscape and Huawei's SoCs ended up in big-endian
appliances, running legacy software ported from mips or powerpc.
I agree this was a mistake, but that wasn't nearly as obvious ten
years ago when there were still new BE-only sparc, mips and powerpc
put on the market -- that really only ended in 2017.

> People, why do we do that? That's positively crazy. BE is dead and
> should be relegated to legacy platforms. There are no advantages to
> being different just for the sake of being different - any "security
> by obscurity" argument would be far outweighed by the inconvenience to
> actual users.
>
> Yes, yes, I know the aarch64 architecture technically allows BE
> implementations - and apparently you can even do it by exception
> level, which I had to look up. But do any actually exist?
>
> Does the kernel even work right in BE mode? It's really easy to miss
> some endianness check when all the actual hardware and use is LE, and
> when (for example) instruction encoding and IO is then always LE
> anyway.

This was always only done for compatibility with non-portable
software when companies with large custom network stacks argued
that it was cheaper to build the entire open source software to
big-endian than port their own product to little-endian. ;-)

We (Linaro) used to test all toolchain and kernel releases in
big-endian mode as member companies had customers that asked
for it, but that stopped a while ago as those legacy software
stacks either got more portable or got replaced over time.

Many Arm systems won't boot BE kernels any more because UEFI
firmware only supports LE, or because of driver bugs.
Virtual machines are still likely to work fine though.
I'm fairly sure that all Arm Cortex and Neoverse cores still\
support BE mode in all exception levels, OTOH at least Apple's
custom CPUs do not implement it at all.

     Arnd

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness
  2023-08-02 18:17       ` Arnd Bergmann
@ 2023-08-04 13:04         ` Will Deacon
  0 siblings, 0 replies; 9+ messages in thread
From: Will Deacon @ 2023-08-04 13:04 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linus Torvalds, Nathan Chancellor, Michael Ellerman, Will Deacon,
	Catalin Marinas, Nick Desaulniers, Tom Rix, Linux-Arch,
	linux-kernel, llvm, linuxppc-dev

On Wed, Aug 02, 2023 at 08:17:32PM +0200, Arnd Bergmann wrote:
> On Wed, Aug 2, 2023, at 19:37, Linus Torvalds wrote:
> > On Wed, 2 Aug 2023 at 09:16, Nathan Chancellor <nathan@kernel.org> wrote:
> >>
> >> We see this warning with ARCH=arm64 defconfig + CONFIG_CPU_BIG_ENDIAN=y.
> >
> > Oh Christ. I didn't even realize that arm64 allowed a BE config.
> >
> > The config option goes back to 2013 - are there actually BE user space
> > implementations around?
> 
> At least NXP's Layerscape and Huawei's SoCs ended up in big-endian
> appliances, running legacy software ported from mips or powerpc.
> I agree this was a mistake, but that wasn't nearly as obvious ten
> years ago when there were still new BE-only sparc, mips and powerpc
> put on the market -- that really only ended in 2017.
> 
> > People, why do we do that? That's positively crazy. BE is dead and
> > should be relegated to legacy platforms. There are no advantages to
> > being different just for the sake of being different - any "security
> > by obscurity" argument would be far outweighed by the inconvenience to
> > actual users.
> >
> > Yes, yes, I know the aarch64 architecture technically allows BE
> > implementations - and apparently you can even do it by exception
> > level, which I had to look up. But do any actually exist?
> >
> > Does the kernel even work right in BE mode? It's really easy to miss
> > some endianness check when all the actual hardware and use is LE, and
> > when (for example) instruction encoding and IO is then always LE
> > anyway.
> 
> This was always only done for compatibility with non-portable
> software when companies with large custom network stacks argued
> that it was cheaper to build the entire open source software to
> big-endian than port their own product to little-endian. ;-)
> 
> We (Linaro) used to test all toolchain and kernel releases in
> big-endian mode as member companies had customers that asked
> for it, but that stopped a while ago as those legacy software
> stacks either got more portable or got replaced over time.
> 
> Many Arm systems won't boot BE kernels any more because UEFI
> firmware only supports LE, or because of driver bugs.
> Virtual machines are still likely to work fine though.
> I'm fairly sure that all Arm Cortex and Neoverse cores still\
> support BE mode in all exception levels, OTOH at least Apple's
> custom CPUs do not implement it at all.

Yes, that's right. The CPUs we have *do* tend to support BE, but
practically there isn't any software to run on them. I asked about
removing it a few years ago:

https://lore.kernel.org/linux-arm-kernel/20191011102747.lpbaur2e4nqyf7sw@willie-the-truck/

but Hanjun said that Huawei are using it, so it stayed.

Will

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2023-08-04 13:04 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-01 22:22 [PATCH] word-at-a-time: use the same return type for has_zero regardless of endianness ndesaulniers
2023-08-01 22:45 ` Nick Desaulniers
2023-08-02  1:07 ` Linus Torvalds
2023-08-02  5:59   ` Arnd Bergmann
2023-08-02  7:50     ` Heiko Carstens
2023-08-02 16:15   ` Nathan Chancellor
2023-08-02 17:37     ` Linus Torvalds
2023-08-02 18:17       ` Arnd Bergmann
2023-08-04 13:04         ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox