* [PATCH bpf] net/sched: cls_bpf: prevent unbounded recursion in offload rollback
@ 2026-05-22 2:58 Jiayuan Chen
2026-05-22 3:13 ` sashiko-bot
2026-05-25 18:03 ` Jakub Kicinski
0 siblings, 2 replies; 4+ messages in thread
From: Jiayuan Chen @ 2026-05-22 2:58 UTC (permalink / raw)
To: bpf, netdev, kuba, daniel, martin.lau
Cc: Jiayuan Chen, Alexei Starovoitov, Andrii Nakryiko,
Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu,
Yonghong Song, Jiri Olsa, John Fastabend, Stanislav Fomichev,
Jamal Hadi Salim, Jiri Pirko, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, linux-kernel
Quan Sun reported [1] a stack overflow in cls_bpf_offload_cmd().
Reproducer on netdevsim: add a skip_sw cls_bpf filter, set the
bpf_tc_accept debugfs knob to 0, then `tc filter replace`. The replace
calls tc_setup_cb_replace() which fails. cls_bpf_offload_cmd() then
swaps prog/oldprog and recursively calls itself to roll back. But
bpf_tc_accept=0 makes the rollback fail too, which triggers yet another
rollback frame with the same arguments, and so on until the stack is
exhausted.
bpf_tc_accept is just a convenient knob for the reproducer. Any driver
whose tc_setup_cb_replace() fails twice in a row can hit the same loop,
so this is not a netdevsim-only issue.
Two ways to fix it:
1) Have the rollback call tc_setup_cb_add() on oldprog instead of
re-entering cls_bpf_offload_cmd().
2) Mark the rollback frame with a flag and skip a second-level
rollback from inside it.
Go with (2). It is the smaller change and keeps the original behaviour:
the rollback still goes through tc_setup_cb_replace(), so the driver
gets one real chance to restore its state. If that attempt also fails,
we just return the original error instead of recursing.
[1]: https://lore.kernel.org/bpf/ce5a6005-3c5e-4696-9e05-eba9461dc860@std.uestc.edu.cn/T/#u
Fixes: 102740bd9436 ("cls_bpf: fix offload assumptions after callback conversion")
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
---
net/sched/cls_bpf.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 9a346b6221b35..d71fbf7cb407e 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -142,7 +142,8 @@ static bool cls_bpf_is_ebpf(const struct cls_bpf_prog *prog)
static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
struct cls_bpf_prog *oldprog,
- struct netlink_ext_ack *extack)
+ struct netlink_ext_ack *extack,
+ bool is_rollback)
{
struct tcf_block *block = tp->chain->block;
struct tc_cls_bpf_offload cls_bpf = {};
@@ -176,8 +177,8 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
skip_sw, &oldprog->gen_flags,
&oldprog->in_hw_count, true);
- if (prog && err) {
- cls_bpf_offload_cmd(tp, oldprog, prog, extack);
+ if (prog && err && !is_rollback) {
+ cls_bpf_offload_cmd(tp, oldprog, prog, extack, true);
return err;
}
@@ -208,7 +209,7 @@ static int cls_bpf_offload(struct tcf_proto *tp, struct cls_bpf_prog *prog,
if (!prog && !oldprog)
return 0;
- return cls_bpf_offload_cmd(tp, prog, oldprog, extack);
+ return cls_bpf_offload_cmd(tp, prog, oldprog, extack, false);
}
static void cls_bpf_stop_offload(struct tcf_proto *tp,
@@ -217,7 +218,7 @@ static void cls_bpf_stop_offload(struct tcf_proto *tp,
{
int err;
- err = cls_bpf_offload_cmd(tp, NULL, prog, extack);
+ err = cls_bpf_offload_cmd(tp, NULL, prog, extack, false);
if (err)
pr_err("Stopping hardware offload failed: %d\n", err);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH bpf] net/sched: cls_bpf: prevent unbounded recursion in offload rollback
2026-05-22 2:58 [PATCH bpf] net/sched: cls_bpf: prevent unbounded recursion in offload rollback Jiayuan Chen
@ 2026-05-22 3:13 ` sashiko-bot
2026-05-25 18:03 ` Jakub Kicinski
1 sibling, 0 replies; 4+ messages in thread
From: sashiko-bot @ 2026-05-22 3:13 UTC (permalink / raw)
To: Jiayuan Chen; +Cc: bpf
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
- [Low] The function silently swallows error codes during a rollback, returning success (0) instead of the actual error.
--
commit ed592986d69e913d97df2ff4e0394a1bcd8686ac
Author: Jiayuan Chen <jiayuan.chen@linux.dev>
net/sched: cls_bpf: prevent unbounded recursion in offload rollback
This commit adds an is_rollback flag to cls_bpf_offload_cmd() to prevent
infinite recursion when a driver's tc_setup_cb_replace callback repeatedly
fails during a rollback operation.
Link: https://lore.kernel.org/bpf/ce5a6005-3c5e-4696-9e05-eba9461dc860@std.uestc.edu.cn/T/#u
> diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
> index 9a346b6221b35..d71fbf7cb407e 100644
> --- a/net/sched/cls_bpf.c
> +++ b/net/sched/cls_bpf.c
[ ... ]
> @@ -176,8 +177,8 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
> skip_sw, &oldprog->gen_flags,
> &oldprog->in_hw_count, true);
>
> - if (prog && err) {
> - cls_bpf_offload_cmd(tp, oldprog, prog, extack);
> + if (prog && err && !is_rollback) {
> + cls_bpf_offload_cmd(tp, oldprog, prog, extack, true);
> return err;
> }
>
[Severity: Low]
Does this silently swallow the error code and return success during a rollback?
If err is non-zero but is_rollback is true, the entire if condition evaluates
to false. The function then falls through and returns 0 at the end instead of
returning the actual error.
While the only caller that sets is_rollback to true ignores the return value,
this still drops the error code. Should the recursion guard only wrap the
nested call so that the error is always returned?
if (prog && err) {
if (!is_rollback)
cls_bpf_offload_cmd(tp, oldprog, prog, extack, true);
return err;
}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260522025854.341647-1-jiayuan.chen@linux.dev?part=1
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH bpf] net/sched: cls_bpf: prevent unbounded recursion in offload rollback
2026-05-22 2:58 [PATCH bpf] net/sched: cls_bpf: prevent unbounded recursion in offload rollback Jiayuan Chen
2026-05-22 3:13 ` sashiko-bot
@ 2026-05-25 18:03 ` Jakub Kicinski
2026-05-25 18:56 ` Alexei Starovoitov
1 sibling, 1 reply; 4+ messages in thread
From: Jakub Kicinski @ 2026-05-25 18:03 UTC (permalink / raw)
To: Jiayuan Chen
Cc: bpf, netdev, daniel, martin.lau, Alexei Starovoitov,
Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
Stanislav Fomichev, Jamal Hadi Salim, Jiri Pirko, David S. Miller,
Eric Dumazet, Paolo Abeni, Simon Horman, linux-kernel
On Fri, 22 May 2026 10:58:53 +0800 Jiayuan Chen wrote:
> - if (prog && err) {
> - cls_bpf_offload_cmd(tp, oldprog, prog, extack);
> + if (prog && err && !is_rollback) {
> + cls_bpf_offload_cmd(tp, oldprog, prog, extack, true);
> return err;
> }
Doesn't matter in practice but for code clarity we should only gate
the call to cls_bpf_offload_cmd() not the entire error path.
With that:
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
I'd take this via -next.
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH bpf] net/sched: cls_bpf: prevent unbounded recursion in offload rollback
2026-05-25 18:03 ` Jakub Kicinski
@ 2026-05-25 18:56 ` Alexei Starovoitov
0 siblings, 0 replies; 4+ messages in thread
From: Alexei Starovoitov @ 2026-05-25 18:56 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Jiayuan Chen, bpf, Network Development, Daniel Borkmann,
Martin KaFai Lau, Alexei Starovoitov, Andrii Nakryiko,
Eduard Zingerman, Kumar Kartikeya Dwivedi, Song Liu,
Yonghong Song, Jiri Olsa, John Fastabend, Stanislav Fomichev,
Jamal Hadi Salim, Jiri Pirko, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, LKML
On Mon, May 25, 2026 at 11:03 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 22 May 2026 10:58:53 +0800 Jiayuan Chen wrote:
> > - if (prog && err) {
> > - cls_bpf_offload_cmd(tp, oldprog, prog, extack);
> > + if (prog && err && !is_rollback) {
> > + cls_bpf_offload_cmd(tp, oldprog, prog, extack, true);
> > return err;
> > }
>
> Doesn't matter in practice but for code clarity we should only gate
> the call to cls_bpf_offload_cmd() not the entire error path.
>
> With that:
>
> Reviewed-by: Jakub Kicinski <kuba@kernel.org>
>
> I'd take this via -next.
feel free to take it into net-next directly.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-05-25 18:57 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-22 2:58 [PATCH bpf] net/sched: cls_bpf: prevent unbounded recursion in offload rollback Jiayuan Chen
2026-05-22 3:13 ` sashiko-bot
2026-05-25 18:03 ` Jakub Kicinski
2026-05-25 18:56 ` Alexei Starovoitov
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.