public inbox for bpf@vger.kernel.org
* [RFC PATCH bpf-next v3 0/2] bpf, x64: Fix tailcall infinite loop
@ 2023-08-25 14:52 Leon Hwang
  2023-08-25 14:52 ` [RFC PATCH bpf-next v3 1/2] " Leon Hwang
  2023-08-25 14:52 ` [RFC PATCH bpf-next v3 2/2] selftests/bpf: Add testcases for tailcall infinite loop fixing Leon Hwang
  0 siblings, 2 replies; 8+ messages in thread
From: Leon Hwang @ 2023-08-25 14:52 UTC (permalink / raw)
  To: ast, daniel, andrii, maciej.fijalkowski; +Cc: song, hffilwlqm, bpf

This patch series fixes a tailcall infinite loop.

Since commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and tailcall
handling in JIT"), tail calls on x64 have worked better than before.

Since commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF subprograms
for x64 JIT"), tail calls are able to run in BPF subprograms on x64.

Since commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF program
to other BPF programs"), a BPF program is able to trace other BPF programs.

What happens when all three are combined?

1. FENTRY/FEXIT is attached to a BPF subprogram.
2. A tailcall runs in that subprogram.
3. The tailcall calls itself.

The result is a tailcall infinite loop, and the loop halts the machine.
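
The three steps above can be sketched as a pair of BPF programs. This is a
hypothetical illustration in the style of the selftests added by patch 2/2,
not the actual selftest code: the section names, map layout, and attach
target are my assumptions, and it needs libbpf plus a load-time
bpf_program__set_attach_target() call to point the fentry prog at the
entry prog's subprogram.

```c
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

struct {
	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
	__uint(max_entries, 1);
	__uint(key_size, sizeof(__u32));
	__uint(value_size, sizeof(__u32));
} jmp_table SEC(".maps");

/* Step 2: a tailcall runs in a BPF subprogram. */
static __noinline int subprog(struct __sk_buff *skb)
{
	/* Step 3: userspace stores the entry prog in slot 0, so the
	 * tailcall re-enters this very program.
	 */
	bpf_tail_call_static(skb, &jmp_table, 0);
	return 0;
}

SEC("tc")
int entry(struct __sk_buff *skb)
{
	return subprog(skb);
}

/* Step 1: FENTRY on the BPF subprogram. The attach target must be set
 * to entry's subprog at load time; the trampoline installed for it is
 * where tail_call_cnt gets lost, producing the infinite loop.
 */
SEC("fentry/subprog")
int BPF_PROG(fentry_subprog)
{
	return 0;
}

char _license[] SEC("license") = "GPL";
```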

In the tail call context, tail_call_cnt is propagated between BPF
subprograms via the stack and the rax register. The trampolines must do
the same.

How did I discover the bug?

Since commit 7f6e4312e15a5c37 ("bpf: Limit caller's stack depth 256 for
subprogs with tailcalls"), the total stack size is limited to around 8KiB.
So I wrote some BPF progs to validate the stack consumption: tailcalls
running in bpf2bpf, with FENTRY/FEXIT tracing on bpf2bpf[1].

While doing so, I accidentally created a tailcall loop, and the loop halted
my VM. Without the loop, the BPF progs would consume over 8KiB of stack,
yet the stack overflow alone did not halt my VM.

With bpf_printk(), I confirmed that the tailcall count limit did not work
as expected. I then read the code and fixed it.

Unfortunately, I have only fixed it on x64, not on the other arches. As a
result, CI tests fail because this bug has not been fixed on s390x.

Help with the other arches is requested.

v2 -> v3:
  * Suggestions from Alexei:
    * Fix comment style.
    * Remove FIXME comment.
    * Remove arch_prepare_bpf_trampoline() function comment update.
    * Remove the unnecessary 'tgt_prog->aux->func[subprog]->is_func' check.

[1]: https://github.com/Asphaltt/learn-by-example/tree/main/ebpf/tailcall-stackoverflow

Leon Hwang (2):
  bpf, x64: Fix tailcall infinite loop
  selftests/bpf: Add testcases for tailcall infinite loop fixing

 arch/x86/net/bpf_jit_comp.c                   |  32 ++-
 include/linux/bpf.h                           |   5 +
 kernel/bpf/trampoline.c                       |   4 +-
 kernel/bpf/verifier.c                         |  30 ++-
 .../selftests/bpf/prog_tests/tailcalls.c      | 194 +++++++++++++++++-
 .../bpf/progs/tailcall_bpf2bpf_fentry.c       |  18 ++
 .../bpf/progs/tailcall_bpf2bpf_fexit.c        |  18 ++
 7 files changed, 285 insertions(+), 16 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_fentry.c
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_fexit.c


base-commit: 9930e4af4b509bcf6f060b09b16884f26102d110
-- 
2.41.0



Thread overview: 8+ messages
2023-08-25 14:52 [RFC PATCH bpf-next v3 0/2] bpf, x64: Fix tailcall infinite loop Leon Hwang
2023-08-25 14:52 ` [RFC PATCH bpf-next v3 1/2] " Leon Hwang
2023-08-25 17:58   ` Maciej Fijalkowski
2023-08-26  4:03     ` Leon Hwang
2023-08-30 22:49       ` Maciej Fijalkowski
2023-08-31 13:12         ` Leon Hwang
2023-08-31 14:44           ` Leon Hwang
2023-08-25 14:52 ` [RFC PATCH bpf-next v3 2/2] selftests/bpf: Add testcases for tailcall infinite loop fixing Leon Hwang
