* [PATCH] bpf: test_run: reduce kernel stack usage
@ 2026-05-15 11:25 Arnd Bergmann
2026-05-15 14:47 ` Alexei Starovoitov
0 siblings, 1 reply; 5+ messages in thread
From: Arnd Bergmann @ 2026-05-15 11:25 UTC (permalink / raw)
To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Eduard Zingerman, Kumar Kartikeya Dwivedi, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jesper Dangaard Brouer,
John Fastabend, Martin KaFai Lau,
Toke Høiland-Jørgensen
Cc: Arnd Bergmann, Martin KaFai Lau, Song Liu, Yonghong Song,
Jiri Olsa, Simon Horman, Stanislav Fomichev, bpf, netdev,
linux-kernel
From: Arnd Bergmann <arnd@arndb.de>
The xdp_test_data structure is really too large to put on the stack
and results in one of the largest stack frames in the kernel:
net/bpf/test_run.c: In function 'bpf_test_run_xdp_live':
net/bpf/test_run.c:387:1: error: the frame size of 1608 bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
Reduce this using dynamic allocation, which avoids around 1KB of
stack usage.
Fixes: b530e9e1063e ("bpf: Add "live packet" mode for XDP in BPF_PROG_RUN")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
Found while build-testing s390 with gcc-16; I had not seen this on
other architectures before.
---
net/bpf/test_run.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index c9aea7052ba7..763891df02be 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -362,27 +362,31 @@ static int bpf_test_run_xdp_live(struct bpf_prog *prog, struct xdp_buff *ctx,
 			 u32 repeat, u32 batch_size, u32 *time)
 {
-	struct xdp_test_data xdp = { .batch_size = batch_size };
+	struct xdp_test_data *xdp __free(kfree) = kzalloc_obj(*xdp);
 	struct bpf_test_timer t = {};
 	int ret;
 
+	if (!xdp)
+		return -ENOMEM;
+
 	if (!repeat)
 		repeat = 1;
 
-	ret = xdp_test_run_setup(&xdp, ctx);
+	xdp->batch_size = batch_size;
+	ret = xdp_test_run_setup(xdp, ctx);
 	if (ret)
 		return ret;
 
 	bpf_test_timer_enter(&t);
 	do {
-		xdp.frame_cnt = 0;
-		ret = xdp_test_run_batch(&xdp, prog, repeat - t.i);
+		xdp->frame_cnt = 0;
+		ret = xdp_test_run_batch(xdp, prog, repeat - t.i);
 		if (unlikely(ret < 0))
 			break;
-	} while (bpf_test_timer_continue(&t, xdp.frame_cnt, repeat, &ret, time));
+	} while (bpf_test_timer_continue(&t, xdp->frame_cnt, repeat, &ret, time));
 	bpf_test_timer_leave(&t);
 
-	xdp_test_run_teardown(&xdp);
+	xdp_test_run_teardown(xdp);
 
 	return ret;
 }
--
2.39.5
^ permalink raw reply related	[flat|nested] 5+ messages in thread
* Re: [PATCH] bpf: test_run: reduce kernel stack usage
2026-05-15 11:25 [PATCH] bpf: test_run: reduce kernel stack usage Arnd Bergmann
@ 2026-05-15 14:47 ` Alexei Starovoitov
2026-05-15 15:15 ` Arnd Bergmann
0 siblings, 1 reply; 5+ messages in thread
From: Alexei Starovoitov @ 2026-05-15 14:47 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Eduard Zingerman, Kumar Kartikeya Dwivedi, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jesper Dangaard Brouer,
John Fastabend, Martin KaFai Lau,
Toke Høiland-Jørgensen, Arnd Bergmann, Martin KaFai Lau,
Song Liu, Yonghong Song, Jiri Olsa, Simon Horman,
Stanislav Fomichev, bpf, Network Development, LKML
On Fri, May 15, 2026 at 4:31 AM Arnd Bergmann <arnd@kernel.org> wrote:
>
> From: Arnd Bergmann <arnd@arndb.de>
>
> The xdp_test_data structure is really too large to put on the stack
> and results in one of the largest stack frames in the kernel:
>
> net/bpf/test_run.c: In function 'bpf_test_run_xdp_live':
> net/bpf/test_run.c:387:1: error: the frame size of 1608 bytes is larger than 1536 bytes [-Werror=frame-larger-than=]
>
> Reduce this using dynamic allocation, which avoids around 1KB of
> stack usage.
1k?
pahole -C xdp_test_data
/* size: 192, cachelines: 3, members: 9 */
/* sum members: 120, holes: 1, sum holes: 56 */
/* padding: 16 */
/* paddings: 1, sum paddings: 36 */
what s390 doing to make it huge?
Probably better to rearrange the field and fix the root cause.
pw-bot: cr
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] bpf: test_run: reduce kernel stack usage
2026-05-15 14:47 ` Alexei Starovoitov
@ 2026-05-15 15:15 ` Arnd Bergmann
2026-05-15 16:13 ` Alexei Starovoitov
2026-05-15 18:38 ` David Laight
0 siblings, 2 replies; 5+ messages in thread
From: Arnd Bergmann @ 2026-05-15 15:15 UTC (permalink / raw)
To: Alexei Starovoitov, Arnd Bergmann
Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Eduard Zingerman, Kumar Kartikeya Dwivedi, David S. Miller,
Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jesper Dangaard Brouer,
John Fastabend, Martin KaFai Lau,
Toke Høiland-Jørgensen, Martin KaFai Lau, Song Liu,
Yonghong Song, Jiri Olsa, Simon Horman, Stanislav Fomichev, bpf,
Netdev, LKML
On Fri, May 15, 2026, at 16:47, Alexei Starovoitov wrote:
> On Fri, May 15, 2026 at 4:31 AM Arnd Bergmann <arnd@kernel.org> wrote:
>
> 1k?
> pahole -C xdp_test_data
> /* size: 192, cachelines: 3, members: 9 */
> /* sum members: 120, holes: 1, sum holes: 56 */
> /* padding: 16 */
> /* paddings: 1, sum paddings: 36 */
>
> what s390 doing to make it huge?
I think it's a combination of cacheline alignment (256 byte
lines) that leads to padding before 'rxq' and at the end of
xdp_test_data, as well as CONFIG_KASAN_STACK that increases
the stack usage further.
> Probably better to rearrange the field and fix the root cause.
It looks like this change reduces the stack usage by 256
bytes, and keeps it under the warning limit for all the
configurations I was testing:
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -98,8 +98,8 @@ struct xdp_page_head {
 };
 
 struct xdp_test_data {
-	struct xdp_buff *orig_ctx;
 	struct xdp_rxq_info rxq;
+	struct xdp_buff *orig_ctx;
 	struct net_device *dev;
 	struct page_pool *pp;
 	struct xdp_frame **frames;
I still get 1368 bytes stack frame in one config (with KASAN_STACK)
here, compared to 392 with my patch. In another config without
KASAN, the mainline version has 1344, the rearranged xdp_test_data
has 1088 bytes and the dynamic allocation variant uses 200.
The one-line change is probably enough to stay off my radar since
it stays below the arbitrary warning limit I test with to find
regressions, but 1368 bytes is still a lot, so there is a good
chance someone else will hit this again.
Another alternative would be to allocate xdp_test_data statically
instead of on the stack.
Let me know which variant you prefer.
Arnd
^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH] bpf: test_run: reduce kernel stack usage
2026-05-15 15:15 ` Arnd Bergmann
@ 2026-05-15 16:13 ` Alexei Starovoitov
2026-05-15 18:38 ` David Laight
1 sibling, 0 replies; 5+ messages in thread
From: Alexei Starovoitov @ 2026-05-15 16:13 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Arnd Bergmann, Alexei Starovoitov, Daniel Borkmann,
Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Jesper Dangaard Brouer, John Fastabend, Martin KaFai Lau,
Toke Høiland-Jørgensen, Martin KaFai Lau, Song Liu,
Yonghong Song, Jiri Olsa, Simon Horman, Stanislav Fomichev, bpf,
Netdev, LKML
On Fri, May 15, 2026 at 8:16 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, May 15, 2026, at 16:47, Alexei Starovoitov wrote:
> > On Fri, May 15, 2026 at 4:31 AM Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > 1k?
> > pahole -C xdp_test_data
> > /* size: 192, cachelines: 3, members: 9 */
> > /* sum members: 120, holes: 1, sum holes: 56 */
> > /* padding: 16 */
> > /* paddings: 1, sum paddings: 36 */
> >
> > what s390 doing to make it huge?
>
> I think it's a combination of cacheline alignment (256 byte
> lines) that leads to padding before 'rxq' and at the end of
> xdp_test_data, as well as CONFIG_KASAN_STACK that increases
> the stack usage further.
>
> > Probably better to rearrange the field and fix the root cause.
>
> It looks like this change reduces the stack usage by 256
> bytes, and keeps it under the warning limit for all the
> configurations I was testing:
>
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -98,8 +98,8 @@ struct xdp_page_head {
>  };
> 
>  struct xdp_test_data {
> -	struct xdp_buff *orig_ctx;
>  	struct xdp_rxq_info rxq;
> +	struct xdp_buff *orig_ctx;
>  	struct net_device *dev;
>  	struct page_pool *pp;
>  	struct xdp_frame **frames;
yeah. this is much better.
> Another alternative would be to allocate xdp_test_data statically
> instead of on the stack.
iirc test_run can execute it in parallel, so static won't work.
^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH] bpf: test_run: reduce kernel stack usage
2026-05-15 15:15 ` Arnd Bergmann
2026-05-15 16:13 ` Alexei Starovoitov
@ 2026-05-15 18:38 ` David Laight
1 sibling, 0 replies; 5+ messages in thread
From: David Laight @ 2026-05-15 18:38 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Alexei Starovoitov, Arnd Bergmann, Alexei Starovoitov,
Daniel Borkmann, Andrii Nakryiko, Eduard Zingerman,
Kumar Kartikeya Dwivedi, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Jesper Dangaard Brouer,
John Fastabend, Martin KaFai Lau,
Toke Høiland-Jørgensen, Martin KaFai Lau, Song Liu,
Yonghong Song, Jiri Olsa, Simon Horman, Stanislav Fomichev, bpf,
Netdev, LKML
On Fri, 15 May 2026 17:15:46 +0200
"Arnd Bergmann" <arnd@arndb.de> wrote:
> On Fri, May 15, 2026, at 16:47, Alexei Starovoitov wrote:
> > On Fri, May 15, 2026 at 4:31 AM Arnd Bergmann <arnd@kernel.org> wrote:
> >
> > 1k?
> > pahole -C xdp_test_data
> > /* size: 192, cachelines: 3, members: 9 */
> > /* sum members: 120, holes: 1, sum holes: 56 */
> > /* padding: 16 */
> > /* paddings: 1, sum paddings: 36 */
> >
> > what s390 doing to make it huge?
>
> I think it's a combination of cacheline alignment (256 byte
> lines) that leads to padding before 'rxq' and at the end of
> xdp_test_data, as well as CONFIG_KASAN_STACK that increases
> the stack usage further.
There will also be a 'double stack frame' in order to put an
aligned item on stack.
That could be slightly larger than a stack frame and probably doesn't
show up in the figures.
-- David
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2026-05-15 18:38 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-05-15 11:25 [PATCH] bpf: test_run: reduce kernel stack usage Arnd Bergmann
2026-05-15 14:47 ` Alexei Starovoitov
2026-05-15 15:15 ` Arnd Bergmann
2026-05-15 16:13 ` Alexei Starovoitov
2026-05-15 18:38 ` David Laight
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox