BPF List
* [PATCH bpf-next 1/2] bpf: Report maximum combined stack depth
@ 2026-05-12 17:19 Paul Chaignon
  2026-05-12 17:19 ` [PATCH bpf-next 2/2] selftests/bpf: Test reported max " Paul Chaignon
  2026-05-12 21:53 ` [PATCH bpf-next 1/2] bpf: Report maximum combined " Eduard Zingerman
  0 siblings, 2 replies; 4+ messages in thread
From: Paul Chaignon @ 2026-05-12 17:19 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi

We've hit the 512-byte limit on stack depth a few times in Cilium
recently. As a result, we started reporting in CI our current maximum
stack depth across all configurations for each BPF program.

Unfortunately, that is not trivial to compute in userspace. The
verifier reports the stack depths of individual subprogs at the end of
the logs. However, the maximum combined stack depth also depends on the
callgraph of those subprogs (the max combined stack depth is the height
of the callgraph weighted by per-subprog stack depths). We can compute
a callgraph in userspace from the loaded instructions, but it often
doesn't match the verifier's own callgraph because of dead code
elimination. Our current approach relies on dumping the BPF_LOG_LEVEL2
logs, but this feels overkill considering the verifier already has the
information we need.
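
To illustrate the "height of the weighted callgraph" idea, here is a
minimal userspace sketch. The struct subprog layout, MAX_SUBPROGS, and
max_combined_depth() are hypothetical names made up for this example,
not verifier code, and the sketch assumes an acyclic callgraph and
ignores the per-frame rounding the verifier applies.

```c
#include <stddef.h>

#define MAX_SUBPROGS 16 /* hypothetical bound for this sketch */

/* Hypothetical userspace model of a subprog: frame size and callees. */
struct subprog {
	unsigned int stack_depth;  /* bytes used by this subprog's frame */
	int callees[MAX_SUBPROGS]; /* indices of the subprogs it calls */
	size_t nr_callees;
};

/*
 * Return the largest sum of frame sizes along any call chain that
 * starts at subprog idx. Assumes the callgraph has no cycles, so the
 * recursion terminates.
 */
static unsigned int max_combined_depth(const struct subprog *progs, int idx)
{
	unsigned int deepest_callee = 0;
	size_t i;

	for (i = 0; i < progs[idx].nr_callees; i++) {
		unsigned int d;

		d = max_combined_depth(progs, progs[idx].callees[i]);
		if (d > deepest_callee)
			deepest_callee = d;
	}
	return progs[idx].stack_depth + deepest_callee;
}
```

For a main prog with a 16-byte frame calling two leaves of 256 and 64
bytes, max_combined_depth(progs, 0) gives 16 + 256 = 272. The hard part
in practice is not this traversal but obtaining a callgraph that
matches the verifier's after dead code elimination.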

The patch lets the verifier dump the maximum combined stack depth in
the logs, on the same line as the per-subprog stack depths:

    stack depth 16+256 max 272
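
On the consumer side, a CI job could extract the new value with
something like the sketch below. parse_max_stack_depth() is a
hypothetical helper written for this example; it only assumes the line
format shown above and uses nothing beyond the C standard library.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the value following " max " on a "stack depth ..." log line.
 * Returns 0 on success, -1 if the line does not have the expected
 * shape (for example, a log from a kernel without this patch).
 */
static int parse_max_stack_depth(const char *line, unsigned int *max)
{
	static const char prefix[] = "stack depth ";
	const char *p;

	if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
		return -1;

	p = strstr(line, " max ");
	if (!p)
		return -1;

	return sscanf(p, " max %u", max) == 1 ? 0 : -1;
}
```

Returning -1 when " max " is absent lets a caller fall back to its
existing method on older kernels rather than misreading the line.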

The per-subprog stack depths and the new max stack depth are not
directly comparable. The former is sometimes updated during fixups,
while the latter is not. As a result, even with a single subprog, we
may end up with two slightly different values. The aim of the new max
value is to be closest to what is actually enforced by the verifier.

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
---
 include/linux/bpf_verifier.h | 2 ++
 kernel/bpf/verifier.c        | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 976e2b2f40e8..d91843994c82 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -936,6 +936,8 @@ struct bpf_verifier_env {
 	u32 prev_insn_processed, insn_processed;
 	/* number of jmps, calls, exits analyzed so far */
 	u32 prev_jmps_processed, jmps_processed;
+	/* maximum combined stack depth */
+	u32 max_stack_depth;
 	/* total verification time */
 	u64 verification_time;
 	/* maximum number of verifier states kept in 'branching' instructions */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 11054ad89c14..896dbb4515d7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5045,6 +5045,8 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
 		}
 	} else {
 		depth += subprog_depth;
+		if (depth > env->max_stack_depth)
+			env->max_stack_depth = depth;
 		if (depth > MAX_BPF_STACK) {
 			total = 0;
 			for (tmp = idx; tmp >= 0; tmp = dinfo[tmp].caller)
@@ -5185,6 +5187,8 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
 	if (priv_stack_mode == PRIV_STACK_UNKNOWN)
 		priv_stack_mode = bpf_enable_priv_stack(env->prog);
 
+	env->max_stack_depth = env->subprog_info[0].stack_depth;
+
 	/* All async_cb subprogs use normal kernel stack. If a particular
 	 * subprog appears in both main prog and async_cb subtree, that
 	 * subprog will use normal kernel stack to avoid potential nesting.
@@ -18289,7 +18293,7 @@ static void print_verification_stats(struct bpf_verifier_env *env)
 		verbose(env, "stack depth %d", env->subprog_info[0].stack_depth);
 		for (i = 1; i < subprog_cnt; i++)
 			verbose(env, "+%d", env->subprog_info[i].stack_depth);
-		verbose(env, "\n");
+		verbose(env, " max %d\n", env->max_stack_depth);
 		verbose(env, "insns processed %d", env->subprog_info[0].insn_processed);
 		for (i = 1; i < subprog_cnt; i++)
 			if (bpf_subprog_is_global(env, i))
-- 
2.43.0



* [PATCH bpf-next 2/2] selftests/bpf: Test reported max stack depth
  2026-05-12 17:19 [PATCH bpf-next 1/2] bpf: Report maximum combined stack depth Paul Chaignon
@ 2026-05-12 17:19 ` Paul Chaignon
  2026-05-12 21:53 ` [PATCH bpf-next 1/2] bpf: Report maximum combined " Eduard Zingerman
  1 sibling, 0 replies; 4+ messages in thread
From: Paul Chaignon @ 2026-05-12 17:19 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Eduard Zingerman, Kumar Kartikeya Dwivedi

This patch tests the maximum stack depth reporting in verifier logs,
covering three special cases: fastcall, private stacks, and rounding up
to 16 bytes. The last case must be skipped when JIT compilation is
disabled, because the rounding is then to 32 bytes instead.
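
As a rough illustration of why the expected value depends on JIT, the
sketch below models the per-frame rounding. round_up_pow2() and
rounded_frame() are hypothetical helpers for this example, not the
verifier's functions, and the model is deliberately simplified. It also
assumes that the 8-byte and 256-byte frames are the ones on the deepest
call chain of the "8+0+256+0" test.

```c
#include <stdbool.h>

/* Round v up to the next multiple of align (a power of two). */
static unsigned int round_up_pow2(unsigned int v, unsigned int align)
{
	return (v + align - 1) & ~(align - 1);
}

/*
 * Simplified model of the rounding described above: 16-byte
 * granularity when the program is JITed, 32-byte otherwise.
 */
static unsigned int rounded_frame(unsigned int depth, bool jited)
{
	return round_up_pow2(depth, jited ? 16 : 32);
}
```

Under these assumptions, the JITed case gives 16 + 256 = 272, which
matches the expected "max 272", while the non-JITed case would give
32 + 256 = 288, hence the __load_if_JITed() guard.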

Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
---
 tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c  | 3 +--
 tools/testing/selftests/bpf/progs/verifier_private_stack.c | 3 +++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c b/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c
index 0d9e167555b5..8d7ff38e4c06 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bpf_fastcall.c
@@ -799,8 +799,7 @@ __naked int bpf_loop_interaction2(void)
 
 SEC("raw_tp")
 __arch_x86_64
-__log_level(4)
-__msg("stack depth 512+0")
+__log_level(4) __msg("stack depth 512+0 max 512")
 /* just to print xlated version when debugging */
 __xlated("r0 = &(void __percpu *)(r0)")
 __success
diff --git a/tools/testing/selftests/bpf/progs/verifier_private_stack.c b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
index 646e8ef82051..4167d3a09252 100644
--- a/tools/testing/selftests/bpf/progs/verifier_private_stack.c
+++ b/tools/testing/selftests/bpf/progs/verifier_private_stack.c
@@ -86,6 +86,7 @@ __naked static void cumulative_stack_depth_subprog(void)
 SEC("kprobe")
 __description("Private stack, subtree > MAX_BPF_STACK")
 __success
+__log_level(4) __msg("stack depth 512+32 max 512")
 __arch_x86_64
 /* private stack fp for the main prog */
 __jited("	movabsq	$0x{{.*}}, %r9")
@@ -324,6 +325,8 @@ int private_stack_async_callback_1(void)
 SEC("fentry/bpf_fentry_test9")
 __description("Private stack, async callback, potential nesting")
 __success __retval(0)
+__load_if_JITed()
+__log_level(4) __msg("stack depth 8+0+256+0 max 272")
 __arch_x86_64
 __jited("	subq	$0x100, %rsp")
 __arch_arm64
-- 
2.43.0



* Re: [PATCH bpf-next 1/2] bpf: Report maximum combined stack depth
  2026-05-12 17:19 [PATCH bpf-next 1/2] bpf: Report maximum combined stack depth Paul Chaignon
  2026-05-12 17:19 ` [PATCH bpf-next 2/2] selftests/bpf: Test reported max " Paul Chaignon
@ 2026-05-12 21:53 ` Eduard Zingerman
  2026-05-13 14:06   ` Paul Chaignon
  1 sibling, 1 reply; 4+ messages in thread
From: Eduard Zingerman @ 2026-05-12 21:53 UTC (permalink / raw)
  To: Paul Chaignon, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Kumar Kartikeya Dwivedi

On Tue, 2026-05-12 at 19:19 +0200, Paul Chaignon wrote:
> We've hit the 512-byte limit on stack depth a few times in Cilium
> recently. As a result, we started reporting in CI our current maximum
> stack depth across all configurations for each BPF program.
> 
> Unfortunately, that is not trivial to compute in userspace. The
> verifier reports the stack depths of individual subprogs at the end of
> the logs. However, the maximum combined stack depth also depends on the
> callgraph of those subprogs (the max combined stack depth is the height
> of the callgraph weighted by per-subprog stack depths). We can compute
> a callgraph in userspace from the loaded instructions, but it often
> doesn't match the verifier's own callgraph because of dead code
> elimination. Our current approach relies on dumping the BPF_LOG_LEVEL2
> logs, but this feels overkill considering the verifier already has the
> information we need.
> 
> The patch lets the verifier dump the maximum combined stack depth in
> the logs, on the same line as the per-subprog stack depths:
> 
>     stack depth 16+256 max 272
> 
> The per-subprog stack depths and the new max stack depth are not
> directly comparable. The former is sometimes updated during fixups,
> while the latter is not. As a result, even with a single subprog, we
> may end up with two slightly different values. The aim of the new max
> value is to be closest to what is actually enforced by the verifier.
> 
> Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com>
> ---
>  include/linux/bpf_verifier.h | 2 ++
>  kernel/bpf/verifier.c        | 6 +++++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 976e2b2f40e8..d91843994c82 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -936,6 +936,8 @@ struct bpf_verifier_env {
>  	u32 prev_insn_processed, insn_processed;
>  	/* number of jmps, calls, exits analyzed so far */
>  	u32 prev_jmps_processed, jmps_processed;
> +	/* maximum combined stack depth */
> +	u32 max_stack_depth;
>  	/* total verification time */
>  	u64 verification_time;
>  	/* maximum number of verifier states kept in 'branching' instructions */
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 11054ad89c14..896dbb4515d7 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5045,6 +5045,8 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
>  		}
>  	} else {
>  		depth += subprog_depth;
> +		if (depth > env->max_stack_depth)
> +			env->max_stack_depth = depth;
>  		if (depth > MAX_BPF_STACK) {
>  			total = 0;
>  			for (tmp = idx; tmp >= 0; tmp = dinfo[tmp].caller)
> @@ -5185,6 +5187,8 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
>  	if (priv_stack_mode == PRIV_STACK_UNKNOWN)
>  		priv_stack_mode = bpf_enable_priv_stack(env->prog);
>  
> +	env->max_stack_depth = env->subprog_info[0].stack_depth;
> +

I think this line is redundant: the loop below calls
check_max_stack_depth_subprog() for the main subprogram anyway.
Additionally, it does not round the value the same way
check_max_stack_depth_subprog() does. Also note that if the main
subprogram uses a private stack, its depth is omitted from the
cumulative depth computation.

>  	/* All async_cb subprogs use normal kernel stack. If a particular
>  	 * subprog appears in both main prog and async_cb subtree, that
>  	 * subprog will use normal kernel stack to avoid potential nesting.
> @@ -18289,7 +18293,7 @@ static void print_verification_stats(struct bpf_verifier_env *env)
>  		verbose(env, "stack depth %d", env->subprog_info[0].stack_depth);
>  		for (i = 1; i < subprog_cnt; i++)
>  			verbose(env, "+%d", env->subprog_info[i].stack_depth);
> -		verbose(env, "\n");
> +		verbose(env, " max %d\n", env->max_stack_depth);
>  		verbose(env, "insns processed %d", env->subprog_info[0].insn_processed);
>  		for (i = 1; i < subprog_cnt; i++)
>  			if (bpf_subprog_is_global(env, i))

Maybe also add a veristat metric for this value?


* Re: [PATCH bpf-next 1/2] bpf: Report maximum combined stack depth
  2026-05-12 21:53 ` [PATCH bpf-next 1/2] bpf: Report maximum combined " Eduard Zingerman
@ 2026-05-13 14:06   ` Paul Chaignon
  0 siblings, 0 replies; 4+ messages in thread
From: Paul Chaignon @ 2026-05-13 14:06 UTC (permalink / raw)
  To: Eduard Zingerman
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Kumar Kartikeya Dwivedi

On Tue, May 12, 2026 at 02:53:33PM -0700, Eduard Zingerman wrote:
> On Tue, 2026-05-12 at 19:19 +0200, Paul Chaignon wrote:

[...]

> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 11054ad89c14..896dbb4515d7 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5045,6 +5045,8 @@ static int check_max_stack_depth_subprog(struct bpf_verifier_env *env, int idx,
> >  		}
> >  	} else {
> >  		depth += subprog_depth;
> > +		if (depth > env->max_stack_depth)
> > +			env->max_stack_depth = depth;
> >  		if (depth > MAX_BPF_STACK) {
> >  			total = 0;
> >  			for (tmp = idx; tmp >= 0; tmp = dinfo[tmp].caller)
> > @@ -5185,6 +5187,8 @@ static int check_max_stack_depth(struct bpf_verifier_env *env)
> >  	if (priv_stack_mode == PRIV_STACK_UNKNOWN)
> >  		priv_stack_mode = bpf_enable_priv_stack(env->prog);
> >  
> > +	env->max_stack_depth = env->subprog_info[0].stack_depth;
> > +
> 
> I think this line is redundant: the loop below calls
> check_max_stack_depth_subprog() for the main subprogram anyway.
> Additionally, it does not round the value the same way
> check_max_stack_depth_subprog() does. Also note that if the main
> subprogram uses a private stack, its depth is omitted from the
> cumulative depth computation.

Yep, you're right. I had misread the loop below. I also need to update
env->max_stack_depth in the private-stack case in
check_max_stack_depth_subprog(). I'll add a selftest to cover that in
v2.

> 
> >  	/* All async_cb subprogs use normal kernel stack. If a particular
> >  	 * subprog appears in both main prog and async_cb subtree, that
> >  	 * subprog will use normal kernel stack to avoid potential nesting.
> > @@ -18289,7 +18293,7 @@ static void print_verification_stats(struct bpf_verifier_env *env)
> >  		verbose(env, "stack depth %d", env->subprog_info[0].stack_depth);
> >  		for (i = 1; i < subprog_cnt; i++)
> >  			verbose(env, "+%d", env->subprog_info[i].stack_depth);
> > -		verbose(env, "\n");
> > +		verbose(env, " max %d\n", env->max_stack_depth);
> >  		verbose(env, "insns processed %d", env->subprog_info[0].insn_processed);
> >  		for (i = 1; i < subprog_cnt; i++)
> >  			if (bpf_subprog_is_global(env, i))
> 
> Maybe also add a veristat metric for this value?

Ack, makes sense.


