linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
@ 2019-06-21  8:58 Mathieu Malaterre
  2022-09-07 17:21 ` Christophe Leroy
  2025-08-27 17:15 ` Christophe Leroy
  0 siblings, 2 replies; 9+ messages in thread
From: Mathieu Malaterre @ 2019-06-21  8:58 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Mathieu Malaterre, linux-kernel, Paul Mackerras, Joel Stanley,
	linuxppc-dev

When building with clang-8 the frame size limit is hit:

  ../arch/powerpc/lib/xor_vmx.c:119:6: error: stack frame size of 1200 bytes in function '__xor_altivec_5' [-Werror,-Wframe-larger-than=]

Follow the same approach as commit 9c87156cce5a ("powerpc/xmon: Relax
frame size for clang") and relax the frame size limit for clang until a
proper fix is implemented upstream in clang.

Link: https://github.com/ClangBuiltLinux/linux/issues/563
Cc: Joel Stanley <joel@jms.id.au>
Signed-off-by: Mathieu Malaterre <malat@debian.org>
---
 arch/powerpc/lib/Makefile | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index c55f9c27bf79..b3f7d64caaf0 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -58,5 +58,9 @@ obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o
 
 obj-$(CONFIG_ALTIVEC)	+= xor_vmx.o xor_vmx_glue.o
 CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec)
+ifdef CONFIG_CC_IS_CLANG
+# See https://github.com/ClangBuiltLinux/linux/issues/563
+CFLAGS_xor_vmx.o += -Wframe-larger-than=4096
+endif
 
 obj-$(CONFIG_PPC64) += $(obj64-y)
-- 
2.20.1



* Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
  2019-06-21  8:58 [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang Mathieu Malaterre
@ 2022-09-07 17:21 ` Christophe Leroy
  2022-09-08  0:27   ` Michael Ellerman
  2025-08-27 17:15 ` Christophe Leroy
  1 sibling, 1 reply; 9+ messages in thread
From: Christophe Leroy @ 2022-09-07 17:21 UTC (permalink / raw)
  To: Mathieu Malaterre, Michael Ellerman, Nick Desaulniers
  Cc: linuxppc-dev, Paul Mackerras, linux-kernel, Joel Stanley



On 21/06/2019 at 10:58, Mathieu Malaterre wrote:
> When building with clang-8 the frame size limit is hit:
> 
>    ../arch/powerpc/lib/xor_vmx.c:119:6: error: stack frame size of 1200 bytes in function '__xor_altivec_5' [-Werror,-Wframe-larger-than=]
> 
> Follow the same approach as commit 9c87156cce5a ("powerpc/xmon: Relax
> frame size for clang") until a proper fix is implemented upstream in
> clang and relax requirement for clang.

With Clang 14 I get the following errors, but only with KASAN selected.

   CC      arch/powerpc/lib/xor_vmx.o
arch/powerpc/lib/xor_vmx.c:95:6: error: stack frame size (1040) exceeds limit (1024) in '__xor_altivec_4' [-Werror,-Wframe-larger-than]
void __xor_altivec_4(unsigned long bytes,
      ^
arch/powerpc/lib/xor_vmx.c:124:6: error: stack frame size (1312) exceeds limit (1024) in '__xor_altivec_5' [-Werror,-Wframe-larger-than]
void __xor_altivec_5(unsigned long bytes,
      ^


Is this patch still relevant?

Or should the frame size limit be relaxed when KASAN is selected? After
all, the stack size is doubled when KASAN is enabled, so maybe the
warning limit should be increased as well?

Thanks
Christophe

> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/563
> Cc: Joel Stanley <joel@jms.id.au>
> Signed-off-by: Mathieu Malaterre <malat@debian.org>
> ---
>   arch/powerpc/lib/Makefile | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
> index c55f9c27bf79..b3f7d64caaf0 100644
> --- a/arch/powerpc/lib/Makefile
> +++ b/arch/powerpc/lib/Makefile
> @@ -58,5 +58,9 @@ obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o
>   
>   obj-$(CONFIG_ALTIVEC)	+= xor_vmx.o xor_vmx_glue.o
>   CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec)
> +ifdef CONFIG_CC_IS_CLANG
> +# See https://github.com/ClangBuiltLinux/linux/issues/563
> +CFLAGS_xor_vmx.o += -Wframe-larger-than=4096
> +endif
>   
>   obj-$(CONFIG_PPC64) += $(obj64-y)


* Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
  2022-09-07 17:21 ` Christophe Leroy
@ 2022-09-08  0:27   ` Michael Ellerman
  2022-09-08  6:00     ` Christophe Leroy
  2022-09-08 15:07     ` Arnd Bergmann
  0 siblings, 2 replies; 9+ messages in thread
From: Michael Ellerman @ 2022-09-08  0:27 UTC (permalink / raw)
  To: Christophe Leroy, Mathieu Malaterre, Nick Desaulniers
  Cc: linuxppc-dev, Paul Mackerras, linux-kernel, Joel Stanley

Christophe Leroy <christophe.leroy@csgroup.eu> writes:
> On 21/06/2019 at 10:58, Mathieu Malaterre wrote:
>> When building with clang-8 the frame size limit is hit:
>> 
>>    ../arch/powerpc/lib/xor_vmx.c:119:6: error: stack frame size of 1200 bytes in function '__xor_altivec_5' [-Werror,-Wframe-larger-than=]
>> 
>> Follow the same approach as commit 9c87156cce5a ("powerpc/xmon: Relax
>> frame size for clang") until a proper fix is implemented upstream in
>> clang and relax requirement for clang.
>
> With Clang 14 I get the following errors, but only with KASAN selected.
>
>    CC      arch/powerpc/lib/xor_vmx.o
> arch/powerpc/lib/xor_vmx.c:95:6: error: stack frame size (1040) exceeds 
> limit (1024) in '__xor_altivec_4' [-Werror,-Wframe-larger-than]
> void __xor_altivec_4(unsigned long bytes,
>       ^
> arch/powerpc/lib/xor_vmx.c:124:6: error: stack frame size (1312) exceeds 
> limit (1024) in '__xor_altivec_5' [-Werror,-Wframe-larger-than]
> void __xor_altivec_5(unsigned long bytes,
>       ^

That's a 32-bit build?

> Is this patch still relevant ?

The clang issue was closed because a different change fixed the issue:

  https://github.com/ClangBuiltLinux/linux/issues/563

> Or should frame size be relaxed when KASAN is selected ? After all the 
> stack size is multiplied by 2 when we have KASAN, so maybe the warning 
> limit should be increased as well ?

Yeah that would make some sense.

On 64-bit the largest frame in that file is 1424, which is below the
default 2048 byte limit.

So maybe just increase it for 32-bit && KASAN.

What would be nice is if the FRAME_WARN value could be calculated as a
percentage of the thread stack size (1 << THREAD_SHIFT), but that's not
easily doable with the way things are structured in Kconfig.
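
As a rough illustration of that idea (not something Kconfig can express
today), the sketch below derives a warning limit from THREAD_SHIFT in
Python. The 12.5% ratio, the shift values and the helper name
frame_warn_limit are assumptions picked for the example, not values taken
from the kernel tree.

```python
def frame_warn_limit(thread_shift: int, percent: float = 12.5) -> int:
    """Return a frame-size warning limit as a percentage of the stack size.

    The stack size is 1 << thread_shift bytes. The result is rounded down
    to a multiple of 16 so it stays aligned like a real stack frame.
    """
    stack_size = 1 << thread_shift
    limit = int(stack_size * percent / 100)
    return limit - (limit % 16)


if __name__ == "__main__":
    # Hypothetical shifts: 13 (8 KiB stack) and 14 (stack doubled, e.g. KASAN).
    for shift in (13, 14):
        print(shift, frame_warn_limit(shift))
```

With these assumed inputs, doubling the stack doubles the limit, which is
the behaviour being asked for when KASAN is enabled.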

cheers


* Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
  2022-09-08  0:27   ` Michael Ellerman
@ 2022-09-08  6:00     ` Christophe Leroy
  2022-09-08 13:48       ` Segher Boessenkool
  2022-09-08 15:07     ` Arnd Bergmann
  1 sibling, 1 reply; 9+ messages in thread
From: Christophe Leroy @ 2022-09-08  6:00 UTC (permalink / raw)
  To: Michael Ellerman, Mathieu Malaterre, Nick Desaulniers,
	Segher Boessenkool
  Cc: linuxppc-dev@lists.ozlabs.org, Paul Mackerras,
	linux-kernel@vger.kernel.org, Joel Stanley



On 08/09/2022 at 02:27, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>> On 21/06/2019 at 10:58, Mathieu Malaterre wrote:
>>> When building with clang-8 the frame size limit is hit:
>>>
>>>     ../arch/powerpc/lib/xor_vmx.c:119:6: error: stack frame size of 1200 bytes in function '__xor_altivec_5' [-Werror,-Wframe-larger-than=]
>>>
>>> Follow the same approach as commit 9c87156cce5a ("powerpc/xmon: Relax
>>> frame size for clang") until a proper fix is implemented upstream in
>>> clang and relax requirement for clang.
>>
>> With Clang 14 I get the following errors, but only with KASAN selected.
>>
>>     CC      arch/powerpc/lib/xor_vmx.o
>> arch/powerpc/lib/xor_vmx.c:95:6: error: stack frame size (1040) exceeds
>> limit (1024) in '__xor_altivec_4' [-Werror,-Wframe-larger-than]
>> void __xor_altivec_4(unsigned long bytes,
>>        ^
>> arch/powerpc/lib/xor_vmx.c:124:6: error: stack frame size (1312) exceeds
>> limit (1024) in '__xor_altivec_5' [-Werror,-Wframe-larger-than]
>> void __xor_altivec_5(unsigned long bytes,
>>        ^
> 
> That's a 32-bit build?

Yes, pmac32_defconfig

> 
>> Is this patch still relevant ?
> 
> The clang issue was closed because a different change fixed the issue:
> 
>    https://github.com/ClangBuiltLinux/linux/issues/563
> 
>> Or should frame size be relaxed when KASAN is selected ? After all the
>> stack size is multiplied by 2 when we have KASAN, so maybe the warning
>> limit should be increased as well ?
> 
> Yeah that would make some sense.
> 
> On 64-bit the largest frame in that file is 1424, which is below the
> default 2048 byte limit.
> 
> So maybe just increase it for 32-bit && KASAN.
> 
> What would be nice is if the FRAME_WARN value could be calculated as a
> percentage of the THREAD_SHIFT, but that's not easily doable with the
> way things are structured in Kconfig.
> 

Looking at it more deeply, I see strange things.

What is that frame size? I thought it was the number of bytes by which
r1 is decremented at the beginning of the function, but it seems not, at
least with GCC. It seems GCC subtracts 112 bytes while clang doesn't.

I set CONFIG_FRAME_WARN to 8 and with GCC and without KASAN, I get no
warning, although I have:

00000000 <__xor_altivec_2>:
    0:	94 21 ff f0 	stwu    r1,-16(r1)
00000078 <__xor_altivec_3>:
   78:	94 21 ff f0 	stwu    r1,-16(r1)
0000010c <__xor_altivec_4>:
  10c:	94 21 ff f0 	stwu    r1,-16(r1)
000001c4 <__xor_altivec_5>:
  1c4:	94 21 ff e0 	stwu    r1,-32(r1)

With GCC and inline KASAN I get:

arch/powerpc/lib/xor_vmx.c: In function '__xor_altivec_2':
arch/powerpc/lib/xor_vmx.c:69:1: warning: the frame size of 96 bytes is larger than 8 bytes [-Wframe-larger-than=]
arch/powerpc/lib/xor_vmx.c: In function '__xor_altivec_3':
arch/powerpc/lib/xor_vmx.c:93:1: warning: the frame size of 128 bytes is larger than 8 bytes [-Wframe-larger-than=]
arch/powerpc/lib/xor_vmx.c: In function '__xor_altivec_4':
arch/powerpc/lib/xor_vmx.c:122:1: warning: the frame size of 80 bytes is larger than 8 bytes [-Wframe-larger-than=]
arch/powerpc/lib/xor_vmx.c: In function '__xor_altivec_5':
arch/powerpc/lib/xor_vmx.c:156:1: warning: the frame size of 128 bytes is larger than 8 bytes [-Wframe-larger-than=]

00000000 <__xor_altivec_2>:
        0:	94 21 ff 30 	stwu    r1,-208(r1)
00000458 <__xor_altivec_3>:
      458:	94 21 ff 00 	stwu    r1,-256(r1)
00000b94 <__xor_altivec_4>:
      b94:	94 21 fe b0 	stwu    r1,-336(r1)
000015b8 <__xor_altivec_5>:
     15b8:	94 21 fe 60 	stwu    r1,-416(r1)

With CLANG and without KASAN I get:

   CC      arch/powerpc/lib/xor_vmx.o
arch/powerpc/lib/xor_vmx.c:52:6: warning: stack frame size (144) exceeds limit (8) in '__xor_altivec_2' [-Wframe-larger-than]
void __xor_altivec_2(unsigned long bytes,
arch/powerpc/lib/xor_vmx.c:71:6: warning: stack frame size (144) exceeds limit (8) in '__xor_altivec_3' [-Wframe-larger-than]
void __xor_altivec_3(unsigned long bytes,
arch/powerpc/lib/xor_vmx.c:95:6: warning: stack frame size (160) exceeds limit (8) in '__xor_altivec_4' [-Wframe-larger-than]
void __xor_altivec_4(unsigned long bytes,
arch/powerpc/lib/xor_vmx.c:124:6: warning: stack frame size (144) exceeds limit (8) in '__xor_altivec_5' [-Wframe-larger-than]
void __xor_altivec_5(unsigned long bytes,

00000000 <__xor_altivec_2>:
        0:	94 21 ff 70 	stwu    r1,-144(r1)
00000528 <__xor_altivec_3>:
      528:	94 21 ff 70 	stwu    r1,-144(r1)
00000c4c <__xor_altivec_4>:
      c4c:	94 21 ff 60 	stwu    r1,-160(r1)
000015a4 <__xor_altivec_5>:
     15a4:	94 21 ff 70 	stwu    r1,-144(r1)

With CLANG and with inline KASAN I get:

arch/powerpc/lib/xor_vmx.c:52:6: warning: stack frame size (512) exceeds limit (8) in '__xor_altivec_2' [-Wframe-larger-than]
void __xor_altivec_2(unsigned long bytes,
arch/powerpc/lib/xor_vmx.c:71:6: warning: stack frame size (768) exceeds limit (8) in '__xor_altivec_3' [-Wframe-larger-than]
void __xor_altivec_3(unsigned long bytes,
arch/powerpc/lib/xor_vmx.c:95:6: warning: stack frame size (1040) exceeds limit (8) in '__xor_altivec_4' [-Wframe-larger-than]
void __xor_altivec_4(unsigned long bytes,
arch/powerpc/lib/xor_vmx.c:124:6: warning: stack frame size (1312) exceeds limit (8) in '__xor_altivec_5' [-Wframe-larger-than]
void __xor_altivec_5(unsigned long bytes,

00000000 <__xor_altivec_2>:
        8:	94 21 fe 00 	stwu    r1,-512(r1)
00000a24 <__xor_altivec_3>:
      a2c:	94 21 fd 00 	stwu    r1,-768(r1)
000019a4 <__xor_altivec_4>:
     19ac:	94 21 fb f0 	stwu    r1,-1040(r1)
00002f20 <__xor_altivec_5>:
     2f28:	94 21 fa e0 	stwu    r1,-1312(r1)


So it seems that GCC and clang don't warn on the same thing, is that
expected? GCC subtracts 112 bytes, which is the minimum frame size on
ppc64, but here I'm building a ppc32 kernel, where the minimum frame
size is 16.

And clang is still using a lot more stack than GCC.
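
The frame sizes quoted above can be read directly from the stwu encodings
in the objdump output. A minimal Python sketch, assuming the D-form layout
of stwu (primary opcode 37, 16-bit signed displacement in the low
half-word); the helper name stwu_frame_size is made up for this example:

```python
def stwu_frame_size(hex_bytes: str) -> int:
    """Decode 'stwu r1,-N(r1)' from objdump hex bytes and return N."""
    word = int.from_bytes(bytes.fromhex(hex_bytes), "big")
    opcode = word >> 26            # primary opcode, 37 for stwu
    rs = (word >> 21) & 0x1F       # source register
    ra = (word >> 16) & 0x1F       # base register, updated by the store
    disp = word & 0xFFFF           # 16-bit displacement
    if disp & 0x8000:              # sign-extend to a negative offset
        disp -= 0x10000
    if opcode != 37 or rs != 1 or ra != 1:
        raise ValueError("not a 'stwu r1,D(r1)' instruction")
    return -disp


if __name__ == "__main__":
    # Encodings copied from the clang + inline KASAN listing above.
    for name, enc in (("__xor_altivec_4", "94 21 fb f0"),
                      ("__xor_altivec_5", "94 21 fa e0")):
        print(name, stwu_frame_size(enc))
```

These two decode to 1040 and 1312, the same numbers clang reports, whereas
the GCC + inline KASAN encodings decode to 208, 256, 336 and 416 against
reported sizes of 96, 128, 80 and 128.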

Christophe


* Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
  2022-09-08  6:00     ` Christophe Leroy
@ 2022-09-08 13:48       ` Segher Boessenkool
  2022-09-09  5:01         ` Christophe Leroy
  0 siblings, 1 reply; 9+ messages in thread
From: Segher Boessenkool @ 2022-09-08 13:48 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Mathieu Malaterre, Nick Desaulniers, linux-kernel@vger.kernel.org,
	Paul Mackerras, Joel Stanley, linuxppc-dev@lists.ozlabs.org

On Thu, Sep 08, 2022 at 06:00:24AM +0000, Christophe Leroy wrote:
> Looking at it more deeply, I see strange things.

I'll have to see full generated machine code to be able to see strange
things, there isn't enough information at all here yet.  Sorry.

Use private mail if it is too big or uninteresting for the list :-)

> What is that frame size? I thought it was the number of bytes by which
> r1 is decremented at the beginning of the function, but it seems not, at
> least with GCC. It seems GCC subtracts 112 bytes while clang doesn't.

That is the vars size + the fixed size + the size of the parameter
save area + the size of the regs save area, rounded up to a multiple
of 16.  Fixed size is 8 on 32-bit PowerPC ELF.  Frame size used by GCC
here is just the vars size.
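
Read as arithmetic, the sum described here can be sketched as follows in
Python. The function name and the example byte counts are illustrative
assumptions; only the fixed size of 8 and the rounding to a multiple of 16
come from the description above.

```python
def total_frame_size(vars_size: int, params_size: int, regs_size: int,
                     fixed_size: int = 8, align: int = 16) -> int:
    """Sum the frame components and round up to a multiple of `align`.

    fixed_size defaults to 8, the value given for 32-bit PowerPC ELF.
    """
    raw = vars_size + fixed_size + params_size + regs_size
    return (raw + align - 1) // align * align


if __name__ == "__main__":
    # Hypothetical function: no locals, no outgoing parameters,
    # one 4-byte callee-saved register.
    print(total_frame_size(vars_size=0, params_size=0, regs_size=4))
    # Hypothetical function with 96 bytes of locals and the same save area.
    print(total_frame_size(vars_size=96, params_size=0, regs_size=4))
```

If the GCC warning compares only vars_size against the limit, as stated,
the number it reports can be smaller than the stwu offset computed this
way.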

> So it seems that GCC and clang don't warn on the same thing, is that
> expected? GCC subtracts 112 bytes, which is the minimum frame size on
> ppc64, but here I'm building a ppc32 kernel, where the minimum frame
> size is 16.

I need to see the generated code to make sense of what is happening
here.  It sounds like it is doing varargs calls or similar expensive
stack juggling.  Or just saving a boatload of registers on the stack.

> And CLANG is still using stack a lot more than GCC.

Good to hear!  Well, good for GCC, anyway ;-)


Segher


* Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
  2022-09-08  0:27   ` Michael Ellerman
  2022-09-08  6:00     ` Christophe Leroy
@ 2022-09-08 15:07     ` Arnd Bergmann
  2022-09-08 22:40       ` Segher Boessenkool
  1 sibling, 1 reply; 9+ messages in thread
From: Arnd Bergmann @ 2022-09-08 15:07 UTC (permalink / raw)
  To: Michael Ellerman, Christophe Leroy, Mathieu Malaterre,
	Nick Desaulniers
  Cc: Paul Mackerras, llvm, linuxppc-dev, linux-kernel, Joel Stanley

On Thu, Sep 8, 2022, at 2:27 AM, Michael Ellerman wrote:
> Christophe Leroy <christophe.leroy@csgroup.eu> writes:
>
> Yeah that would make some sense.
>
> On 64-bit the largest frame in that file is 1424, which is below the
> default 2048 byte limit.
>
> So maybe just increase it for 32-bit && KASAN.
>
> What would be nice is if the FRAME_WARN value could be calculated as a
> percentage of the THREAD_SHIFT, but that's not easily doable with the
> way things are structured in Kconfig.
>

Increasing the warning limit slightly for 32-bit with
CONFIG_KASAN_STACK makes sense, but there are a lot of
related concerns:

- I was hoping to still stay under 1280 bytes for the warning
  limit, so that even with KASAN_STACK enabled, we are able to
  catch warnings in functions that use a stupid amount of
  local variables, without getting too many false positives.

- if the XOR code has its frame size explode like this, it's
  probably an indication of the compiler doing something wrong,
  not the kernel code. The result is likely that the "optimized"
  XOR implementation is slower than the default version as a
  result, and the kernel will pick the other one at boot time.
  This needs to be confirmed of course, but an easier workaround
  for this instance might be to just disable the xor_vmx module
  when KASAN_STACK is set.

- The warning limit on 32-bit is actually 2048 bytes when
  GCC_PLUGIN_LATENT_ENTROPY is set. I think this is a mistake
  and we should lower /that/ limit instead, but a side-effect
  here is that an allmodconfig kernel build with gcc will fail
  to warn about bugs that exist both with gcc and clang, while
  clang complains about it.

      Arnd


* Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
  2022-09-08 15:07     ` Arnd Bergmann
@ 2022-09-08 22:40       ` Segher Boessenkool
  0 siblings, 0 replies; 9+ messages in thread
From: Segher Boessenkool @ 2022-09-08 22:40 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Mathieu Malaterre, llvm, Nick Desaulniers, linux-kernel,
	Paul Mackerras, Joel Stanley, linuxppc-dev

Hi!

On Thu, Sep 08, 2022 at 05:07:24PM +0200, Arnd Bergmann wrote:
> - if the XOR code has its frame size explode like this, it's
>   probably an indication of the compiler doing something wrong,
>   not the kernel code.

On the contrary, it is most likely an indication that the kernel code
wants something unreasonable.  Like, having 20 variables live at the
same time, but still wanting nicely scheduled machine code generated.

But I suspect GCC unrolled the loops here, even?  Best way to prevent
that here is to put an option in the Makefile, for these files.  We
don't want any of this unrolled after all?  Or, alternatively, remove
all the manual unrolling from this code, let GCC do its thing, without
painting it in a corner.

>   The result is likely that the "optimized"
>   XOR implementation is slower than the default version as a
>   result, and the kernel will pick the other one at boot time.

Yes.  So it's self-healing even, of a sort :-)


Segher


* Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
  2022-09-08 13:48       ` Segher Boessenkool
@ 2022-09-09  5:01         ` Christophe Leroy
  0 siblings, 0 replies; 9+ messages in thread
From: Christophe Leroy @ 2022-09-09  5:01 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Mathieu Malaterre, Nick Desaulniers, linux-kernel@vger.kernel.org,
	Paul Mackerras, Joel Stanley, linuxppc-dev@lists.ozlabs.org



On 08/09/2022 at 15:48, Segher Boessenkool wrote:
> On Thu, Sep 08, 2022 at 06:00:24AM +0000, Christophe Leroy wrote:
>> Looking at it more deeply, I see strange things.
> 
> I'll have to see full generated machine code to be able to see strange
> things, there isn't enough information at all here yet.  Sorry.

Well, what I call strange is the fact that with GCC the number of bytes
reported by -Wframe-larger-than doesn't match the offset used for the
stwu at the start of the function, while it does with clang.

> 
> Use private mail if it is too big or uninteresting for the list :-)
> 
>> What is that frame size? I thought it was the number of bytes by which
>> r1 is decremented at the beginning of the function, but it seems not,
>> at least with GCC. It seems GCC subtracts 112 bytes while clang doesn't.
> 
> That is the vars size + the fixed size + the size of the parameter
> save area + the size of the regs save area, rounded up to a multiple
> of 16.  Fixed size is 8 on 32-bit PowerPC ELF.  Frame size used by GCC
> here is just the vars size.

Ok, so does that mean stack utilisation is underestimated when using
GCC? Or is it clang that overestimates it?

> 
>> So it seems that GCC and clang don't warn on the same thing, is that
>> expected? GCC subtracts 112 bytes, which is the minimum frame size on
>> ppc64, but here I'm building a ppc32 kernel, where the minimum frame
>> size is 16.
> 
> I need to see the generated code to make sense of what is happening
> here.  It sounds like it is doing varargs calls or similar expensive
> stack juggling.  Or just saving a boatload of registers on the stack.
> 

Ok, I'll send it to you. But once again, I don't mind what the code
really looks like, I'm just worried that GCC doesn't report the entire
stack usage.


Christophe


* Re: [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang
  2019-06-21  8:58 [PATCH] powerpc/lib/xor_vmx: Relax frame size for clang Mathieu Malaterre
  2022-09-07 17:21 ` Christophe Leroy
@ 2025-08-27 17:15 ` Christophe Leroy
  1 sibling, 0 replies; 9+ messages in thread
From: Christophe Leroy @ 2025-08-27 17:15 UTC (permalink / raw)
  To: Mathieu Malaterre; +Cc: linux-kernel, Joel Stanley, linuxppc-dev



On 21/06/2019 at 10:58, Mathieu Malaterre wrote:
> When building with clang-8 the frame size limit is hit:
> 
>    ../arch/powerpc/lib/xor_vmx.c:119:6: error: stack frame size of 1200 bytes in function '__xor_altivec_5' [-Werror,-Wframe-larger-than=]
> 
> Follow the same approach as commit 9c87156cce5a ("powerpc/xmon: Relax
> frame size for clang") until a proper fix is implemented upstream in
> clang and relax requirement for clang.
> 
> Link: https://github.com/ClangBuiltLinux/linux/issues/563
> Cc: Joel Stanley <joel@jms.id.au>
> Signed-off-by: Mathieu Malaterre <malat@debian.org>

Apparently the problem is gone with recent versions of clang: the frame
size is now 16.

000001b0 <__xor_altivec_5>:
  1b0:	94 21 ff f0 	stwu    r1,-16(r1)
  1b4:	93 c1 00 08 	stw     r30,8(r1)
  1b8:	54 63 d1 be 	srwi    r3,r3,6
  1bc:	39 20 00 00 	li      r9,0
  1c0:	39 40 00 10 	li      r10,16
  1c4:	39 60 00 20 	li      r11,32
  1c8:	7c 69 03 a6 	mtctr   r3
  1cc:	38 60 00 30 	li      r3,48
  1d0:	7c 44 48 ce 	lvx     v2,r4,r9
  1d4:	7d 84 4a 14 	add     r12,r4,r9
  1d8:	7c 65 48 ce 	lvx     v3,r5,r9
  1dc:	7f c5 4a 14 	add     r30,r5,r9
  1e0:	7c 86 48 ce 	lvx     v4,r6,r9
  1e4:	7c a7 48 ce 	lvx     v5,r7,r9
  1e8:	10 43 14 c4 	vxor    v2,v3,v2
  1ec:	7c 2c 50 ce 	lvx     v1,r12,r10
  1f0:	10 42 24 c4 	vxor    v2,v2,v4
  1f4:	7d 1e 50 ce 	lvx     v8,r30,r10
  1f8:	10 42 2c c4 	vxor    v2,v2,v5
  1fc:	7d 3e 58 ce 	lvx     v9,r30,r11
  200:	7d 5e 18 ce 	lvx     v10,r30,r3
  204:	7f c6 4a 14 	add     r30,r6,r9
  208:	7c 08 48 ce 	lvx     v0,r8,r9
  20c:	10 68 0c c4 	vxor    v3,v8,v1
  210:	7c cc 58 ce 	lvx     v6,r12,r11
  214:	7c ec 18 ce 	lvx     v7,r12,r3
  218:	10 42 04 c4 	vxor    v2,v2,v0
  21c:	7d 7e 50 ce 	lvx     v11,r30,r10
  220:	10 29 34 c4 	vxor    v1,v9,v6
  224:	7d 9e 58 ce 	lvx     v12,r30,r11
  228:	10 aa 3c c4 	vxor    v5,v10,v7
  22c:	7d be 18 ce 	lvx     v13,r30,r3
  230:	7f c7 4a 14 	add     r30,r7,r9
  234:	7d de 50 ce 	lvx     v14,r30,r10
  238:	10 63 5c c4 	vxor    v3,v3,v11
  23c:	10 21 64 c4 	vxor    v1,v1,v12
  240:	7d fe 58 ce 	lvx     v15,r30,r11
  244:	7e 1e 18 ce 	lvx     v16,r30,r3
  248:	7f c8 4a 14 	add     r30,r8,r9
  24c:	7e 3e 50 ce 	lvx     v17,r30,r10
  250:	10 63 74 c4 	vxor    v3,v3,v14
  254:	7e 5e 58 ce 	lvx     v18,r30,r11
  258:	7c 9e 18 ce 	lvx     v4,r30,r3
  25c:	10 63 8c c4 	vxor    v3,v3,v17
  260:	7c 44 49 ce 	stvx    v2,r4,r9
  264:	10 45 6c c4 	vxor    v2,v5,v13
  268:	10 a1 7c c4 	vxor    v5,v1,v15
  26c:	39 29 00 40 	addi    r9,r9,64
  270:	10 42 84 c4 	vxor    v2,v2,v16
  274:	7c 6c 51 ce 	stvx    v3,r12,r10
  278:	10 65 94 c4 	vxor    v3,v5,v18
  27c:	10 42 24 c4 	vxor    v2,v2,v4
  280:	7c 6c 59 ce 	stvx    v3,r12,r11
  284:	7c 4c 19 ce 	stvx    v2,r12,r3
  288:	42 00 ff 48 	bdnz    1d0 <__xor_altivec_5+0x20>
  28c:	83 c1 00 08 	lwz     r30,8(r1)
  290:	38 21 00 10 	addi    r1,r1,16
  294:	4e 80 00 20 	blr

Christophe

> ---
>   arch/powerpc/lib/Makefile | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
> index c55f9c27bf79..b3f7d64caaf0 100644
> --- a/arch/powerpc/lib/Makefile
> +++ b/arch/powerpc/lib/Makefile
> @@ -58,5 +58,9 @@ obj-$(CONFIG_FTR_FIXUP_SELFTEST) += feature-fixups-test.o
>   
>   obj-$(CONFIG_ALTIVEC)	+= xor_vmx.o xor_vmx_glue.o
>   CFLAGS_xor_vmx.o += -maltivec $(call cc-option,-mabi=altivec)
> +ifdef CONFIG_CC_IS_CLANG
> +# See https://github.com/ClangBuiltLinux/linux/issues/563
> +CFLAGS_xor_vmx.o += -Wframe-larger-than=4096
> +endif
>   
>   obj-$(CONFIG_PPC64) += $(obj64-y)


