* riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
@ 2025-05-22 13:29 Naresh Kamboju
2025-05-22 16:48 ` Kent Overstreet
0 siblings, 1 reply; 14+ messages in thread
From: Naresh Kamboju @ 2025-05-22 13:29 UTC (permalink / raw)
To: linux-bcache, open list, lkft-triage, Linux Regressions
Cc: kent.overstreet, Arnd Bergmann, Dan Carpenter, Anders Roxell
The riscv allyesconfig build regressed and now fails with gcc-13 on the
linux-next tags next-20250516 and next-20250522.
First seen on next-20250516
Good: next-20250515
Bad: next-20250516
Regressions found on riscv:
- build/gcc-13-allyesconfig
Regression Analysis:
- New regression? Yes
- Reproducible? Yes
Build regression: riscv gcc-13 allyesconfig error the frame size of
2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
## Build log
fs/bcachefs/data_update.c: In function '__bch2_data_update_index_update':
fs/bcachefs/data_update.c:464:1: error: the frame size of 2064 bytes
is larger than 2048 bytes [-Werror=frame-larger-than=]
464 | }
| ^
cc1: all warnings being treated as errors
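For readers unfamiliar with this diagnostic, here is a minimal userspace sketch of the kind of code that trips it. It is not taken from fs/bcachefs; the names big_frame, sum_bytes and BUF_BYTES are made up for illustration. Compiling it with gcc and -Wframe-larger-than=2048 should produce a warning of the same form as the one in the log, and adding -Werror turns that into a build failure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_BYTES 2100	/* hypothetical size, chosen to exceed the 2048-byte limit */

static_assert(BUF_BYTES > 2048, "the buffer alone must exceed the frame limit");

/* noinline so the caller's buffer really has to live on the caller's stack */
__attribute__((noinline))
static unsigned int sum_bytes(const unsigned char *p, size_t n)
{
	unsigned int sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += p[i];
	return sum;
}

/* The local buffer makes this function's frame larger than 2048 bytes. */
unsigned int big_frame(void)
{
	unsigned char buf[BUF_BYTES];

	memset(buf, 1, sizeof(buf));
	return sum_bytes(buf, sizeof(buf));
}
```

The exact byte count the compiler reports depends on the target, the optimization level and any sanitizer instrumentation, which is why the same source can pass on one configuration and fail on another.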
## Source
* Kernel version: 6.15.0-rc7
* Git tree: https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git
* Git sha: 460178e842c7a1e48a06df684c66eb5fd630bcf7
* Git describe: next-20250522
## Build
* Build log: https://qa-reports.linaro.org/api/testruns/28521854/log_file/
* Build history:
https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20250522/testrun/28521854/suite/build/test/gcc-13-allyesconfig/history/
* Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/2xRoAAw5dl69AvvHb8oZ3pL1SFx/
* Kernel config:
https://storage.tuxsuite.com/public/linaro/lkft/builds/2xRoAAw5dl69AvvHb8oZ3pL1SFx/config
--
Linaro LKFT
https://lkft.linaro.org
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-22 13:29 riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=] Naresh Kamboju
@ 2025-05-22 16:48 ` Kent Overstreet
2025-05-23 13:19 ` Naresh Kamboju
0 siblings, 1 reply; 14+ messages in thread
From: Kent Overstreet @ 2025-05-22 16:48 UTC (permalink / raw)
To: Naresh Kamboju
Cc: linux-bcache, open list, lkft-triage, Linux Regressions,
Arnd Bergmann, Dan Carpenter, Anders Roxell
On Thu, May 22, 2025 at 06:59:53PM +0530, Naresh Kamboju wrote:
> Regressions on riscv allyesconfig build failed with gcc-13 on the Linux next
> tag next-20250516 and next-20250522.
>
> First seen on the next-20250516
> Good: next-20250515
> Bad: next-20250516
>
> Regressions found on riscv:
> - build/gcc-13-allyesconfig
>
> Regression Analysis:
> - New regression? Yes
> - Reproducible? Yes
>
> Build regression: riscv gcc-13 allyesconfig error the frame size of
> 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
Is this a kmsan build? kmsan seems to inflate stack usage by quite a
lot.
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-22 16:48 ` Kent Overstreet
@ 2025-05-23 13:19 ` Naresh Kamboju
2025-05-23 13:49 ` Arnd Bergmann
0 siblings, 1 reply; 14+ messages in thread
From: Naresh Kamboju @ 2025-05-23 13:19 UTC (permalink / raw)
To: Kent Overstreet
Cc: linux-bcache, open list, lkft-triage, Linux Regressions,
Arnd Bergmann, Dan Carpenter, Anders Roxell
On Thu, 22 May 2025 at 22:18, Kent Overstreet <kent.overstreet@linux.dev> wrote:
>
> On Thu, May 22, 2025 at 06:59:53PM +0530, Naresh Kamboju wrote:
> > Regressions on riscv allyesconfig build failed with gcc-13 on the Linux next
> > tag next-20250516 and next-20250522.
> >
> > First seen on the next-20250516
> > Good: next-20250515
> > Bad: next-20250516
> >
> > Regressions found on riscv:
> > - build/gcc-13-allyesconfig
> >
> > Regression Analysis:
> > - New regression? Yes
> > - Reproducible? Yes
> >
> > Build regression: riscv gcc-13 allyesconfig error the frame size of
> > 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
>
> Is this a kmsan build? kmsan seems to inflate stack usage by quite a
> lot.
This is an allyesconfig build, which has KASAN enabled.
CONFIG_HAVE_ARCH_KASAN=y
CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
CONFIG_CC_HAS_KASAN_GENERIC=y
CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
CONFIG_KASAN=y
CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y
CONFIG_KASAN_GENERIC=y
# CONFIG_KASAN_OUTLINE is not set
CONFIG_KASAN_INLINE=y
CONFIG_KASAN_STACK=y
CONFIG_KASAN_VMALLOC=y
CONFIG_KASAN_KUNIT_TEST=y
CONFIG_KASAN_EXTRA_INFO=y
- Naresh
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-23 13:19 ` Naresh Kamboju
@ 2025-05-23 13:49 ` Arnd Bergmann
2025-05-23 14:08 ` Kent Overstreet
0 siblings, 1 reply; 14+ messages in thread
From: Arnd Bergmann @ 2025-05-23 13:49 UTC (permalink / raw)
To: Naresh Kamboju, Kent Overstreet
Cc: linux-bcache, open list, lkft-triage, Linux Regressions,
Dan Carpenter, Anders Roxell
On Fri, May 23, 2025, at 15:19, Naresh Kamboju wrote:
> On Thu, 22 May 2025 at 22:18, Kent Overstreet <kent.overstreet@linux.dev> wrote:
>>
>> On Thu, May 22, 2025 at 06:59:53PM +0530, Naresh Kamboju wrote:
>> > Regressions on riscv allyesconfig build failed with gcc-13 on the Linux next
>> > tag next-20250516 and next-20250522.
>> >
>> > First seen on the next-20250516
>> > Good: next-20250515
>> > Bad: next-20250516
>> >
>> > Regressions found on riscv:
>> > - build/gcc-13-allyesconfig
>> >
>> > Regression Analysis:
>> > - New regression? Yes
>> > - Reproducible? Yes
>> >
>> > Build regression: riscv gcc-13 allyesconfig error the frame size of
>> > 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
>>
>> Is this a kmsan build? kmsan seems to inflate stack usage by quite a
>> lot.
KMSAN is currently a clang-only feature.
> This is allyesconfig build which has KASAN builds.
>
> CONFIG_HAVE_ARCH_KASAN=y
> CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
> CONFIG_CC_HAS_KASAN_GENERIC=y
> CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
> CONFIG_KASAN=y
> CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y
> CONFIG_KASAN_GENERIC=y
> # CONFIG_KASAN_OUTLINE is not set
> CONFIG_KASAN_INLINE=y
> CONFIG_KASAN_STACK=y
I reproduced the problem locally and found this to go down to
1440 bytes after I turn off KASAN_STACK. next-20250523 has
some changes that take the number down further to 1136 without
KASAN_STACK or 1552 with KASAN_STACK.
I've turned bcachefs with kasan-stack on for my randconfig
builds again to see if there are any remaining corner cases.
Arnd
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-23 13:49 ` Arnd Bergmann
@ 2025-05-23 14:08 ` Kent Overstreet
2025-05-23 15:17 ` Arnd Bergmann
0 siblings, 1 reply; 14+ messages in thread
From: Kent Overstreet @ 2025-05-23 14:08 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Naresh Kamboju, linux-bcache, open list, lkft-triage,
Linux Regressions, Dan Carpenter, Anders Roxell
On Fri, May 23, 2025 at 03:49:54PM +0200, Arnd Bergmann wrote:
> On Fri, May 23, 2025, at 15:19, Naresh Kamboju wrote:
> > On Thu, 22 May 2025 at 22:18, Kent Overstreet <kent.overstreet@linux.dev> wrote:
> >>
> >> On Thu, May 22, 2025 at 06:59:53PM +0530, Naresh Kamboju wrote:
> >> > Regressions on riscv allyesconfig build failed with gcc-13 on the Linux next
> >> > tag next-20250516 and next-20250522.
> >> >
> >> > First seen on the next-20250516
> >> > Good: next-20250515
> >> > Bad: next-20250516
> >> >
> >> > Regressions found on riscv:
> >> > - build/gcc-13-allyesconfig
> >> >
> >> > Regression Analysis:
> >> > - New regression? Yes
> >> > - Reproducible? Yes
> >> >
> >> > Build regression: riscv gcc-13 allyesconfig error the frame size of
> >> > 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
> >>
> >> Is this a kmsan build? kmsan seems to inflate stack usage by quite a
> >> lot.
>
> KMSAN is currently a clang-only feature.
>
> > This is allyesconfig build which has KASAN builds.
> >
> > CONFIG_HAVE_ARCH_KASAN=y
> > CONFIG_HAVE_ARCH_KASAN_VMALLOC=y
> > CONFIG_CC_HAS_KASAN_GENERIC=y
> > CONFIG_CC_HAS_WORKING_NOSANITIZE_ADDRESS=y
> > CONFIG_KASAN=y
> > CONFIG_CC_HAS_KASAN_MEMINTRINSIC_PREFIX=y
> > CONFIG_KASAN_GENERIC=y
> > # CONFIG_KASAN_OUTLINE is not set
> > CONFIG_KASAN_INLINE=y
> > CONFIG_KASAN_STACK=y
>
> I reproduced the problem locally and found this to go down to
> 1440 bytes after I turn off KASAN_STACK. next-20250523 has
> some changes that take the number down further to 1136 without
> KASAN_STACK or 1552 with KASAN_STACK.
>
> I've turned bcachefs with kasan-stack on for my randconfig
> builds again to see if there are any remaining corner cases.
Thanks for the numbers - that does still seem high, I'll have to have a
look with pahole.
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-23 14:08 ` Kent Overstreet
@ 2025-05-23 15:17 ` Arnd Bergmann
2025-05-23 17:11 ` Kent Overstreet
0 siblings, 1 reply; 14+ messages in thread
From: Arnd Bergmann @ 2025-05-23 15:17 UTC (permalink / raw)
To: Kent Overstreet
Cc: Naresh Kamboju, linux-bcache, open list, lkft-triage,
Linux Regressions, Dan Carpenter, Anders Roxell
On Fri, May 23, 2025, at 16:08, Kent Overstreet wrote:
> On Fri, May 23, 2025 at 03:49:54PM +0200, Arnd Bergmann wrote:
>> On Fri, May 23, 2025, at 15:19, Naresh Kamboju wrote:
>
>> I reproduced the problem locally and found this to go down to
>> 1440 bytes after I turn off KASAN_STACK. next-20250523 has
>> some changes that take the number down further to 1136 without
>> KASAN_STACK or 1552 with KASAN_STACK.
>>
>> I've turned bcachefs with kasan-stack on for my randconfig
>> builds again to see if there are any remaining corner cases.
>
> Thanks for the numbers - that does still seem high, I'll have to have a
> look with pahole.
I agree it's still larger than it should be: having more than
a few hundred bytes on a function usually means that there is
both the risk for actual overflow and general inefficiency if
all the stack data gets accessed as well.
It's probably not actually structure data though, but a
combination of effects:
- KASAN_STACK adds extra redzones for each variable
- KASAN_STACK further prevents stack slots from getting
reused inside one function, in order to better pinpoint
which instance caused problems like out-of-scope access
- passing structures by value causes them to be put on
the stack on some architectures, even when the structure
size is only one or two registers
- sanitizers turn off optimizations that lead to better
stack usage
- in some cases, the missed optimization ends up causing
local variables to get spilled to the stack many times
because of a combination of all the above.
The good news is that so far my randconfig builds have not
shown any more stack frame warnings on next-20250523 with
bcachefs force-enabled, now 55 builds into the change,
across arm32/arm64/x86 using gcc-15.1.
Arnd
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-23 15:17 ` Arnd Bergmann
@ 2025-05-23 17:11 ` Kent Overstreet
2025-05-23 18:01 ` Arnd Bergmann
0 siblings, 1 reply; 14+ messages in thread
From: Kent Overstreet @ 2025-05-23 17:11 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Naresh Kamboju, linux-bcache, open list, lkft-triage,
Linux Regressions, Dan Carpenter, Anders Roxell
On Fri, May 23, 2025 at 05:17:15PM +0200, Arnd Bergmann wrote:
> On Fri, May 23, 2025, at 16:08, Kent Overstreet wrote:
> > On Fri, May 23, 2025 at 03:49:54PM +0200, Arnd Bergmann wrote:
> >> On Fri, May 23, 2025, at 15:19, Naresh Kamboju wrote:
> >
> >> I reproduced the problem locally and found this to go down to
> >> 1440 bytes after I turn off KASAN_STACK. next-20250523 has
> >> some changes that take the number down further to 1136 without
> >> KASAN_STACK or 1552 with KASAN_STACK.
> >>
> >> I've turned bcachefs with kasan-stack on for my randconfig
> >> builds again to see if there are any remaining corner cases.
> >
> > Thanks for the numbers - that does still seem high, I'll have to have a
> > look with pahole.
>
> I agree it's still larger than it should be: having more than
> a few hundred bytes on a function usually means that there is
> both the risk for actual overflow and general inefficiency if
> all the stack data gets accessed as well.
>
> It's probably not actually structure data though, but a
> combination of effects:
>
> - KASAN_STACK adds extra redzones for each variable
> - KASAN_STACK further prevents stack slots from getting
> reused inside one function, in order to better pinpoint
> which instance caused problems like out-of-scope access
> - passing structures by value causes them to be put on
> the stack on some architectures, even when the structure
> size is only one or two registers
We mainly do this with bkey_s_c, which is just two words: on x86_64,
that gets passed in registers. Is riscv different?
> - sanitizers turn off optimizations that lead to better
> stack usage
> - in some cases, the missed optimization ends up causing
> local variables to get spilled to the stack many times
> because of a combination of all the above.
Yeesh.
I suspect we should be running with a larger stack when the sanitizers
are running, and perhaps tweak the warnings accordingly. I did a bunch
of stack usage work after I found a kmsan build was blowing out the
stack, but then running with max stack usage tracing enabled showed it
to be a largely non issue on non-sanitizer builds, IIRC.
> The good news is that so far my randconfig builds have not
> shown any more stack frame warnings on next-20250523 with
> bcachefs force-enabled, now 55 builds into the change,
> across arm32/arm64/x86 using gcc-15.1.
Good to know, thanks.
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-23 17:11 ` Kent Overstreet
@ 2025-05-23 18:01 ` Arnd Bergmann
2025-05-25 17:18 ` David Laight
0 siblings, 1 reply; 14+ messages in thread
From: Arnd Bergmann @ 2025-05-23 18:01 UTC (permalink / raw)
To: Kent Overstreet
Cc: Naresh Kamboju, linux-bcache, open list, lkft-triage,
Linux Regressions, Dan Carpenter, Anders Roxell
On Fri, May 23, 2025, at 19:11, Kent Overstreet wrote:
> On Fri, May 23, 2025 at 05:17:15PM +0200, Arnd Bergmann wrote:
>>
>> - KASAN_STACK adds extra redzones for each variable
>> - KASAN_STACK further prevents stack slots from getting
>> reused inside one function, in order to better pinpoint
>> which instance caused problems like out-of-scope access
>> - passing structures by value causes them to be put on
>> the stack on some architectures, even when the structure
>> size is only one or two registers
>
> We mainly do this with bkey_s_c, which is just two words: on x86_64,
> that gets passed in registers. Is riscv different?
Not sure, I think it's mostly older ABIs that are limited,
either not passing structures in registers at all, or only
possibly one but not two of them.
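To make the question concrete, here is a small sketch of a two-word structure passed by value. struct two_word_key is a hypothetical stand-in for a two-pointer handle such as bkey_s_c, not the bcachefs definition, and key_span is an invented helper. Whether the argument travels in registers or through the stack depends on the target ABI, so the useful check is to compile this for the architecture in question and read the generated assembly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-pointer handle; NOT the real bkey_s_c layout. */
struct two_word_key {
	const void *k;
	const void *v;
};

static_assert(sizeof(struct two_word_key) == 2 * sizeof(void *),
	      "expected exactly two machine words");

/*
 * noinline so that a real call with a by-value struct argument is emitted
 * and the argument passing can be inspected in the compiler output.
 */
__attribute__((noinline))
static size_t key_span(struct two_word_key key)
{
	return (size_t)((const char *)key.v - (const char *)key.k);
}
```

Building with -S (for example with a riscv64 cross compiler) shows how the two words reach key_span on that target.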
>> - sanitizers turn off optimizations that lead to better
>> stack usage
>> - in some cases, the missed optimization ends up causing
>> local variables to get spilled to the stack many times
>> because of a combination of all the above.
>
> Yeesh.
>
> I suspect we should be running with a larger stack when the sanitizers
> are running, and perhaps tweak the warnings accordingly. I did a bunch
> of stack usage work after I found a kmsan build was blowing out the
> stack, but then running with max stack usage tracing enabled showed it
> to be a largely non issue on non-sanitizer builds, IIRC.
Enabling KASAN does double the available stack space. However, I don't
think we should use that as an excuse to raise the per-function
warning limit, because
- the majority of all function stacks do not grow that much when
sanitizers are enabled
- allmodconfig enables KASAN and should still catch mistakes
where a driver accidentally puts a large structure on the stack
- 2KB on 64-bit targets is a really large limit. At some point
in the past I had a series that lowered the limit to 1536 bytes
for 64-bit targets, but I never managed to get all the changes
merged.
Arnd
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-23 18:01 ` Arnd Bergmann
@ 2025-05-25 17:18 ` David Laight
2025-05-25 17:36 ` Kent Overstreet
0 siblings, 1 reply; 14+ messages in thread
From: David Laight @ 2025-05-25 17:18 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Kent Overstreet, Naresh Kamboju, linux-bcache, open list,
lkft-triage, Linux Regressions, Dan Carpenter, Anders Roxell
On Fri, 23 May 2025 20:01:33 +0200
"Arnd Bergmann" <arnd@arndb.de> wrote:
> On Fri, May 23, 2025, at 19:11, Kent Overstreet wrote:
> > On Fri, May 23, 2025 at 05:17:15PM +0200, Arnd Bergmann wrote:
> >>
> >> - KASAN_STACK adds extra redzones for each variable
> >> - KASAN_STACK further prevents stack slots from getting
> >> reused inside one function, in order to better pinpoint
> >> which instance caused problems like out-of-scope access
> >> - passing structures by value causes them to be put on
> >> the stack on some architectures, even when the structure
> >> size is only one or two registers
> >
> > We mainly do this with bkey_s_c, which is just two words: on x86_64,
> > that gets passed in registers. Is riscv different?
>
> Not sure, I think it's mostly older ABIs that are limited,
> either not passing structures in registers at all, or only
> possibly one but not two of them.
>
> >> - sanitizers turn off optimizations that lead to better
> >> stack usage
> >> - in some cases, the missed optimization ends up causing
> >> local variables to get spilled to the stack many times
> >> because of a combination of all the above.
> >
> > Yeesh.
> >
> > I suspect we should be running with a larger stack when the sanitizers
> > are running, and perhaps tweak the warnings accordingly. I did a bunch
> > of stack usage work after I found a kmsan build was blowing out the
> > stack, but then running with max stack usage tracing enabled showed it
> > to be a largely non issue on non-sanitizer builds, IIRC.
>
> Enabling KASAN does double the available stack space. However, I don't
> think we should use that as an excuse to raise the per-function
> warning limit, because
>
> - the majority of all function stacks do not grow that much when
> sanitizers are enabled
> - allmodconfig enables KASAN and should still catch mistakes
> where a driver accidentally puts a large structure on the stack
That is rather annoying when you want to look at the generated code :-(
> - 2KB on 64-bit targets is a really large limit. At some point
> in the past I had a series that lowered the limit to 1536 bytes
> for 64-bit targets, but I never managed to get all the changes
> merged.
I've a cunning plan to do a proper static analysis of stack usage.
It is a 'simple' matter of getting objtool to output all calls with
the stack offset.
Indirect calls need the function hashes from fine-ibt, but also need
clang to support 'hash seeds' to disambiguate all the void (*)(void *)
functions.
That'll first barf at all recursion, and then, I expect, show a massive
stack use inside snprintf() in some error path.
Just need a big stack of 'round tuits'.
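The core of such an analysis can be sketched in userspace as a walk over a call graph annotated with per-function frame sizes. Everything here is an assumption made for illustration: struct func_node, MAX_CALLEES and worst_case_stack are invented names, the graph is a hand-written table rather than objtool output, and indirect calls are ignored. The walk reports a reachable recursion loop as -1 instead of trying to bound it.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLEES 4

/* One entry per function in a hypothetical call-graph table. */
struct func_node {
	size_t frame;			/* bytes used by this function's own frame */
	int callees[MAX_CALLEES];	/* table indices of callees, -1 terminates */
	int state;			/* 0 = unvisited, 1 = on current path, 2 = done */
	long worst;			/* memoized result once state == 2 */
};

/*
 * Worst-case stack bytes from function idx downwards, or -1 if a
 * recursion loop is reachable from it.
 */
static long worst_case_stack(struct func_node *tab, int idx)
{
	struct func_node *f = &tab[idx];
	long deepest = 0;

	assert(idx >= 0);
	if (f->state == 1)
		return -1;		/* back edge: we are already on this path */
	if (f->state == 2)
		return f->worst;

	f->state = 1;
	for (int i = 0; i < MAX_CALLEES && f->callees[i] >= 0; i++) {
		long d = worst_case_stack(tab, f->callees[i]);

		if (d < 0) {
			deepest = -1;	/* propagate "recursion reachable" */
			break;
		}
		if (d > deepest)
			deepest = d;
	}
	f->state = 2;
	f->worst = deepest < 0 ? -1 : (long)f->frame + deepest;
	return f->worst;
}
```

For a three-function table where function 0 (100 bytes) calls functions 1 and 2, function 1 (50 bytes) calls function 2, and function 2 (200 bytes) is a leaf, the result for function 0 is 350. A table in which two functions call each other yields -1.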
David
>
>
> Arnd
>
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-25 17:18 ` David Laight
@ 2025-05-25 17:36 ` Kent Overstreet
2025-05-25 17:47 ` David Laight
2025-05-25 19:25 ` Steven Rostedt
0 siblings, 2 replies; 14+ messages in thread
From: Kent Overstreet @ 2025-05-25 17:36 UTC (permalink / raw)
To: David Laight
Cc: Arnd Bergmann, Naresh Kamboju, linux-bcache, open list,
lkft-triage, Linux Regressions, Dan Carpenter, Anders Roxell,
Steven Rostedt
+cc Steve
On Sun, May 25, 2025 at 06:18:42PM +0100, David Laight wrote:
> On Fri, 23 May 2025 20:01:33 +0200
> "Arnd Bergmann" <arnd@arndb.de> wrote:
>
> > On Fri, May 23, 2025, at 19:11, Kent Overstreet wrote:
> > > On Fri, May 23, 2025 at 05:17:15PM +0200, Arnd Bergmann wrote:
> > >>
> > >> - KASAN_STACK adds extra redzones for each variable
> > >> - KASAN_STACK further prevents stack slots from getting
> > >> reused inside one function, in order to better pinpoint
> > >> which instance caused problems like out-of-scope access
> > >> - passing structures by value causes them to be put on
> > >> the stack on some architectures, even when the structure
> > >> size is only one or two registers
> > >
> > > We mainly do this with bkey_s_c, which is just two words: on x86_64,
> > > that gets passed in registers. Is riscv different?
> >
> > Not sure, I think it's mostly older ABIs that are limited,
> > either not passing structures in registers at all, or only
> > possibly one but not two of them.
> >
> > >> - sanitizers turn off optimizations that lead to better
> > >> stack usage
> > >> - in some cases, the missed optimization ends up causing
> > >> local variables to get spilled to the stack many times
> > >> because of a combination of all the above.
> > >
> > > Yeesh.
> > >
> > > I suspect we should be running with a larger stack when the sanitizers
> > > are running, and perhaps tweak the warnings accordingly. I did a bunch
> > > of stack usage work after I found a kmsan build was blowing out the
> > > stack, but then running with max stack usage tracing enabled showed it
> > > to be a largely non issue on non-sanitizer builds, IIRC.
> >
> > Enabling KASAN does double the available stack space. However, I don't
> > think we should use that as an excuse to raise the per-function
> > warning limit, because
> >
> > - the majority of all function stacks do not grow that much when
> > sanitizers are enabled
> > - allmodconfig enables KASAN and should still catch mistakes
> > where a driver accidentally puts a large structure on the stack
>
> That is rather annoying when you want to look at the generated code :-(
>
> > - 2KB on 64-bit targets is a really large limit. At some point
> > in the past I had a series that lowered the limit to 1536 bytes
> > for 64-bit targets, but I never managed to get all the changes
> > merged.
>
> I've a cunning plan to do a proper static analysis of stack usage.
> It is a 'simple' matter of getting objtool to output all calls with
> the stack offset.
> Indirect calls need the function hashes from fine-ibt, but also need
> clang to support 'hash seeds' to disambiguate all the void (*)(void *)
> functions.
> That'll first barf at all recursion, and then, I expect, show a massive
> stack use inside snprintf() in some error path.
I suspect recursion will make the results you get with that approach
useless.
We already have "trace max stack", but that only checks at process exit,
so it doesn't tell you much.
We could do better with tracing - just inject a trampoline that checks
the current stack usage against the maximum stack usage we've seen, and
emits a trace event with a stack trace if it's greater.
(and now Steve's going to tell us he's already done this :)
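The bookkeeping behind that trampoline idea is small. Here is a userspace sketch under stated assumptions: struct stack_watermark and stack_watermark_update are invented names, there is no locking or per-CPU handling, and the point where a real implementation would emit a trace event is represented only by the return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical high-water-mark state; a real tracer would need locking. */
struct stack_watermark {
	size_t max_seen;	/* deepest stack usage observed so far, in bytes */
};

/*
 * Compare the current usage against the recorded maximum. Returns true
 * when a new maximum is recorded, i.e. when a trace event with a stack
 * trace would be emitted.
 */
static inline bool stack_watermark_update(struct stack_watermark *w,
					  size_t current_bytes)
{
	assert(w != NULL);
	if (current_bytes <= w->max_seen)
		return false;
	w->max_seen = current_bytes;
	return true;
}
```

Calling this on every function entry reports only new records, so the event rate drops quickly once the workload has reached its typical depth.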
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-25 17:36 ` Kent Overstreet
@ 2025-05-25 17:47 ` David Laight
2025-05-25 18:10 ` Kent Overstreet
2025-05-25 19:25 ` Steven Rostedt
1 sibling, 1 reply; 14+ messages in thread
From: David Laight @ 2025-05-25 17:47 UTC (permalink / raw)
To: Kent Overstreet
Cc: Arnd Bergmann, Naresh Kamboju, linux-bcache, open list,
lkft-triage, Linux Regressions, Dan Carpenter, Anders Roxell,
Steven Rostedt
On Sun, 25 May 2025 13:36:16 -0400
Kent Overstreet <kent.overstreet@linux.dev> wrote:
> +cc Steve
...
> > I've a cunning plan to do a proper static analysis of stack usage.
> > It is a 'simple' matter of getting objtool to output all calls with
> > the stack offset.
> > Indirect calls need the function hashes from fine-ibt, but also need
> > clang to support 'hash seeds' to disambiguate all the void (*)(void *)
> > functions.
> > That'll first barf at all recursion, and then, I expect, show a massive
> > stack use inside snprintf() in some error path.
>
> I suspect recursion will make the results you get with that approach
> useless.
Recursion is an issue, but the kernel really doesn't support recursion.
So you actually want to know the possible recursion loops anyway.
I suspect (hope) most will be the 'recurses only once' type.
If not they need some other bound.
> We already have "trace max stack", but that only checks at process exit,
> so it doesn't tell you much.
>
> We could do better with tracing - just inject a trampoline that checks
> the current stack usage against the maximum stack usage we've seen, and
> emits a trace event with a stack trace if it's greater.
Both of those only tell you the stack you've used.
The static analysis will show you the stack 'you might use'.
Which is really much more important.
I did this for an embedded system a long time ago.
The outcome was that we didn't have enough memory to allocate the
'worst case' stacks!
David
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-25 17:47 ` David Laight
@ 2025-05-25 18:10 ` Kent Overstreet
0 siblings, 0 replies; 14+ messages in thread
From: Kent Overstreet @ 2025-05-25 18:10 UTC (permalink / raw)
To: David Laight
Cc: Arnd Bergmann, Naresh Kamboju, linux-bcache, open list,
lkft-triage, Linux Regressions, Dan Carpenter, Anders Roxell,
Steven Rostedt
On Sun, May 25, 2025 at 06:47:57PM +0100, David Laight wrote:
> On Sun, 25 May 2025 13:36:16 -0400
> Kent Overstreet <kent.overstreet@linux.dev> wrote:
>
> > +cc Steve
> ...
> > > I've a cunning plan to do a proper static analysis of stack usage.
> > > It is a 'simple' matter of getting objtool to output all calls with
> > > the stack offset.
> > > Indirect calls need the function hashes from fine-ibt, but also need
> > > clang to support 'hash seeds' to disambiguate all the void (*)(void *)
> > > functions.
> > > That'll first barf at all recursion, and then, I expect, show a massive
> > > stack use inside snprintf() in some error path.
> >
> > I suspect recursion will make the results you get with that approach
> > useless.
>
> Recursion is an issue, but the kernel really doesn't support recursion.
> So you actually want to know the possible recursion loops anyway.
> I suspect (hope) most will be the 'recurses only once' type.
> If not they need some other bound.
Recursion is a fact of life when you get different subsystems
interacting in unpredictable ways.
You can be in one filesystem, and then end up in a fault handler (gup(),
or a simple copy to/from user), and then end up in a completely
different filesystem - and then you call into the block layer, or
networking if it's NFS.
Static analysis might get you some useful data within a subsystem, but
it won't tell you much about the kernel as a whole as people are
actually running it.
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-25 17:36 ` Kent Overstreet
2025-05-25 17:47 ` David Laight
@ 2025-05-25 19:25 ` Steven Rostedt
2025-05-25 20:04 ` Kent Overstreet
1 sibling, 1 reply; 14+ messages in thread
From: Steven Rostedt @ 2025-05-25 19:25 UTC (permalink / raw)
To: Kent Overstreet
Cc: David Laight, Arnd Bergmann, Naresh Kamboju, linux-bcache,
open list, lkft-triage, Linux Regressions, Dan Carpenter,
Anders Roxell
On Sun, 25 May 2025 13:36:16 -0400
Kent Overstreet <kent.overstreet@linux.dev> wrote:
> We already have "trace max stack", but that only checks at process exit,
> so it doesn't tell you much.
Nope, it traces the stack at every function call, but it misses the leaf
functions and also doesn't check interrupts as they may use a different
stack.
>
> We could do better with tracing - just inject a trampoline that checks
> the current stack usage against the maximum stack usage we've seen, and
> emits a trace event with a stack trace if it's greater.
>
> (and now Steve's going to tell us he's already done this :)
Close ;-)
# echo 1 > /proc/sys/kernel/stack_tracer_enabled
Wait.
# cat /sys/kernel/tracing/stack_trace
        Depth    Size   Location    (33 entries)
        -----    ----   --------
  0)     8360      48   __msecs_to_jiffies+0x9/0x30
  1)     8312     104   update_group_capacity+0x95/0x970
  2)     8208     520   update_sd_lb_stats.constprop.0+0x278/0x2f40
  3)     7688     416   sched_balance_find_src_group+0x96/0xe30
  4)     7272     512   sched_balance_rq+0x53f/0x2fe0
  5)     6760     344   sched_balance_newidle+0x6c1/0x1310
  6)     6416      80   pick_next_task_fair+0x55/0xe60
  7)     6336     328   __schedule+0x8a5/0x33d0
  8)     6008      32   schedule+0xe2/0x3b0
  9)     5976      32   io_schedule+0x8f/0xf0
 10)     5944     264   rq_qos_wait+0x12a/0x200
 11)     5680     144   wbt_wait+0x159/0x260
 12)     5536      40   __rq_qos_throttle+0x50/0x90
 13)     5496     320   blk_mq_submit_bio+0x70b/0x1ff0
 14)     5176     240   __submit_bio+0x1b3/0x600
 15)     4936     248   submit_bio_noacct_nocheck+0x546/0xca0
 16)     4688     144   ext4_bio_write_folio+0x69d/0x1870
 17)     4544      64   mpage_submit_folio+0x14c/0x2b0
 18)     4480      96   mpage_process_page_bufs+0x392/0x7a0
 19)     4384     632   mpage_prepare_extent_to_map+0xa5b/0x1080
 20)     3752     496   ext4_do_writepages+0x8af/0x2ee0
 21)     3256     304   ext4_writepages+0x26f/0x5c0
 22)     2952     344   do_writepages+0x183/0x7c0
 23)     2608     152   __writeback_single_inode+0x114/0xb00
 24)     2456     744   writeback_sb_inodes+0x52b/0xdf0
 25)     1712     168   __writeback_inodes_wb+0xf4/0x270
 26)     1544     312   wb_writeback+0x547/0x800
 27)     1232     328   wb_workfn+0x7b1/0xbc0
 28)      904     352   process_one_work+0x85a/0x1450
 29)      552     176   worker_thread+0x5b7/0xf80
 30)      376     168   kthread+0x371/0x720
 31)      208      32   ret_from_fork+0x34/0x70
 32)      176     176   ret_from_fork_asm+0x1a/0x30
The code that does this is in kernel/trace/trace_stack.c
It simply attaches to the function tracer and at every function checks the
current stack size.
Hmm, I need to update this because today we even pass the stack pointer via
the ftrace_regs if the arch supports it. Using that would allow me to get
rid of the hack:
static void check_stack(unsigned long ip, unsigned long *stack)
{
[..]
	this_size = ((unsigned long)stack) & (THREAD_SIZE-1);
	this_size = THREAD_SIZE - this_size;
[..]

static void
stack_trace_call(unsigned long ip, unsigned long parent_ip,
		 struct ftrace_ops *op, struct ftrace_regs *fregs)
{
	unsigned long stack;
[..]
	check_stack(ip, &stack);
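The masking arithmetic can be tried out in userspace. The following is only a sketch of the calculation, not the kernel code: stack_bytes_used is an invented name, and it assumes a downward-growing stack whose region is aligned to a power-of-two thread_size, which is what lets the low bits of an address on the stack serve as an offset into the region.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes of stack in use, given an address on the stack (sp) and the size
 * of the stack region. Assumes the region is aligned to thread_size and
 * that the stack grows downwards from the top of the region.
 */
static inline uintptr_t stack_bytes_used(uintptr_t sp, uintptr_t thread_size)
{
	uintptr_t offset;

	/* thread_size must be a non-zero power of two for the mask to work */
	assert(thread_size != 0 && (thread_size & (thread_size - 1)) == 0);

	offset = sp & (thread_size - 1);	/* distance from the region's low end */
	return thread_size - offset;		/* distance from the region's top */
}
```

With a 0x4000-byte region starting at 0x10000 and an address of 0x13f00, the offset is 0x3f00 and the result is 0x100, meaning 256 bytes are in use.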
-- Steve
* Re: riscv gcc-13 allyesconfig error the frame size of 2064 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
2025-05-25 19:25 ` Steven Rostedt
@ 2025-05-25 20:04 ` Kent Overstreet
0 siblings, 0 replies; 14+ messages in thread
From: Kent Overstreet @ 2025-05-25 20:04 UTC (permalink / raw)
To: Steven Rostedt
Cc: David Laight, Arnd Bergmann, Naresh Kamboju, linux-bcache,
open list, lkft-triage, Linux Regressions, Dan Carpenter,
Anders Roxell
On Sun, May 25, 2025 at 03:25:02PM -0400, Steven Rostedt wrote:
> On Sun, 25 May 2025 13:36:16 -0400
> Kent Overstreet <kent.overstreet@linux.dev> wrote:
>
> > We already have "trace max stack", but that only checks at process exit,
> > so it doesn't tell you much.
>
> Nope, it traces the stack at every function call, but it misses the leaf
> functions and also doesn't check interrupts as they may use a different
> stack.
I was thinking of DEBUG_STACK_USAGE :)
> > We could do better with tracing - just inject a trampoline that checks
> > the current stack usage against the maximum stack usage we've seen, and
> > emits a trace event with a stack trace if it's greater.
> >
> > (and now Steve's going to tell us he's already done this :)
>
> Close ;-)
>
> # echo 1 > /proc/sys/kernel/stack_tracer_enabled
>
> Wait.
>
> # cat /sys/kernel/tracing/stack_trace
> Depth Size Location (33 entries)
> ----- ---- --------
> 0) 8360 48 __msecs_to_jiffies+0x9/0x30
> 1) 8312 104 update_group_capacity+0x95/0x970
> 2) 8208 520 update_sd_lb_stats.constprop.0+0x278/0x2f40
> 3) 7688 416 sched_balance_find_src_group+0x96/0xe30
> 4) 7272 512 sched_balance_rq+0x53f/0x2fe0
> 5) 6760 344 sched_balance_newidle+0x6c1/0x1310
> 6) 6416 80 pick_next_task_fair+0x55/0xe60
> 7) 6336 328 __schedule+0x8a5/0x33d0
> 8) 6008 32 schedule+0xe2/0x3b0
> 9) 5976 32 io_schedule+0x8f/0xf0
> 10) 5944 264 rq_qos_wait+0x12a/0x200
> 11) 5680 144 wbt_wait+0x159/0x260
> 12) 5536 40 __rq_qos_throttle+0x50/0x90
> 13) 5496 320 blk_mq_submit_bio+0x70b/0x1ff0
> 14) 5176 240 __submit_bio+0x1b3/0x600
> 15) 4936 248 submit_bio_noacct_nocheck+0x546/0xca0
> 16) 4688 144 ext4_bio_write_folio+0x69d/0x1870
> 17) 4544 64 mpage_submit_folio+0x14c/0x2b0
> 18) 4480 96 mpage_process_page_bufs+0x392/0x7a0
> 19) 4384 632 mpage_prepare_extent_to_map+0xa5b/0x1080
> 20) 3752 496 ext4_do_writepages+0x8af/0x2ee0
> 21) 3256 304 ext4_writepages+0x26f/0x5c0
> 22) 2952 344 do_writepages+0x183/0x7c0
> 23) 2608 152 __writeback_single_inode+0x114/0xb00
> 24) 2456 744 writeback_sb_inodes+0x52b/0xdf0
> 25) 1712 168 __writeback_inodes_wb+0xf4/0x270
> 26) 1544 312 wb_writeback+0x547/0x800
> 27) 1232 328 wb_workfn+0x7b1/0xbc0
> 28) 904 352 process_one_work+0x85a/0x1450
> 29) 552 176 worker_thread+0x5b7/0xf80
> 30) 376 168 kthread+0x371/0x720
> 31) 208 32 ret_from_fork+0x34/0x70
> 32) 176 176 ret_from_fork_asm+0x1a/0x30
Nice! This is exactly what I was looking for :)
Depth Size Location (48 entries)
----- ---- --------
0) 7728 48 __update_load_avg_se+0x9/0x440
1) 7680 80 update_load_avg+0x25f/0x2b0
2) 7600 56 set_next_task_fair+0x232/0x290
3) 7544 48 pick_next_task_fair+0xcf/0x1a0
4) 7496 120 __schedule+0x284/0xe80
5) 7376 16 preempt_schedule_irq+0x33/0x50
6) 7360 136 asm_common_interrupt+0x26/0x40
7) 7224 48 get_symbol_offset+0x43/0x70
8) 7176 56 kallsyms_lookup_buildid+0x55/0xf0
9) 7120 88 __sprint_symbol.isra.0+0x48/0xf0
10) 7032 720 symbol_string+0xf1/0x120
11) 6312 120 vsnprintf+0x3dc/0x5d0
12) 6192 128 bch2_prt_printf+0x57/0x140
13) 6064 64 bch2_prt_task_backtrace+0x71/0xc0
14) 6000 40 print_cycle+0x71/0xa0
15) 5960 104 trace_would_deadlock+0xb6/0x150
16) 5856 128 break_cycle+0xfe/0x260
17) 5728 368 bch2_check_for_deadlock+0x35f/0x5f0
18) 5360 96 six_lock_slowpath.isra.0+0x204/0x4c0
19) 5264 96 __bch2_btree_node_get+0x384/0x5b0
20) 5168 336 bch2_btree_path_traverse_one+0x7a5/0xd60
21) 4832 232 bch2_btree_iter_peek_slot+0x104/0x7f0
22) 4600 216 btree_key_cache_fill+0xcf/0x1a0
23) 4384 72 bch2_btree_path_traverse_cached+0x2bd/0x310
24) 4312 336 bch2_btree_path_traverse_one+0x705/0xd60
25) 3976 232 bch2_btree_iter_peek_slot+0x104/0x7f0
26) 3744 424 bch2_check_discard_freespace_key+0x172/0x5e0
27) 3320 224 bch2_bucket_alloc_freelist+0x422/0x610
28) 3096 88 bch2_bucket_alloc_trans+0x1f3/0x3a0
29) 3008 168 bch2_bucket_alloc_set_trans+0xf1/0x360
30) 2840 184 __open_bucket_add_buckets+0x40b/0x660
31) 2656 40 open_bucket_add_buckets+0x72/0xf0
32) 2616 280 bch2_alloc_sectors_start_trans+0x76d/0xd00
33) 2336 424 __bch2_write+0x1d1/0x11d0
34) 1912 168 __bch2_writepage+0x3b2/0x790
35) 1744 72 write_cache_pages+0x5c/0xa0
36) 1672 176 bch2_writepages+0x67/0xc0
37) 1496 184 do_writepages+0xcc/0x240
38) 1312 64 __writeback_single_inode+0x41/0x320
39) 1248 456 writeback_sb_inodes+0x216/0x4e0
40) 792 64 __writeback_inodes_wb+0x4c/0xe0
41) 728 168 wb_writeback+0x19c/0x310
42) 560 136 wb_workfn+0x2a4/0x400
43) 424 64 process_one_work+0x18c/0x330
44) 360 72 worker_thread+0x252/0x3a0
45) 288 80 kthread+0xf9/0x210
46) 208 32 ret_from_fork+0x31/0x50
47) 176 176 ret_from_fork_asm+0x11/0x20
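
The "check against the maximum seen so far" idea quoted earlier in this message can be sketched as a small high-water-mark tracker. This is an illustration only, not how trace_stack.c is written: struct stack_hwm and stack_hwm_update() are hypothetical names, and the sketch is single-threaded, leaving out the locking and per-CPU handling a real tracer would need.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tracker: remembers the deepest stack usage seen so far. */
struct stack_hwm {
	size_t max_seen;	/* bytes */
};

/*
 * Record one observation of stack depth. Returns true when it is a new
 * maximum, which is the point where a tracer would save a stack trace.
 */
static bool stack_hwm_update(struct stack_hwm *hwm, size_t depth)
{
	if (depth <= hwm->max_seen)
		return false;

	hwm->max_seen = depth;
	return true;
}
```

Feeding it the depths 904, 552 and 8360 in that order would report a new maximum for the first and third values only.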