netdev.vger.kernel.org archive mirror
* [RFC PATCH bpf-next 0/2] bpf, x64: Fix tailcall infinite loop bug
@ 2023-08-14 13:41 Leon Hwang
  2023-08-14 13:41 ` [RFC PATCH bpf-next 1/2] " Leon Hwang
  2023-08-14 13:41 ` [RFC PATCH bpf-next 2/2] selftests/bpf: Add testcases for tailcall infinite loop bug fixing Leon Hwang
  0 siblings, 2 replies; 10+ messages in thread
From: Leon Hwang @ 2023-08-14 13:41 UTC
  To: bpf
  Cc: ast, daniel, andrii, martin.lau, song, yonghong.song,
	john.fastabend, kpsingh, sdf, haoluo, jolsa, x86, tglx, mingo, bp,
	dave.hansen, hpa, mykolal, shuah, davem, dsahern, hffilwlqm,
	tangyeechou, kernel-patches-bot, maciej.fijalkowski, netdev,
	linux-kernel, linux-kselftest

Since commit ebf7d1f508a73871 ("bpf, x64: rework pro/epilogue and tailcall
handling in JIT"), tailcalls on x64 work better than before.

Since commit e411901c0b775a3a ("bpf: allow for tailcalls in BPF subprograms
for x64 JIT"), tailcalls can run in BPF subprograms on x64.

Since commit 5b92a28aae4dd0f8 ("bpf: Support attaching tracing BPF program
to other BPF programs"), a BPF program can trace other BPF programs.

How about combining them all together?

1. FENTRY/FEXIT on a BPF subprogram.
2. A tailcall runs in the BPF subprogram.
3. The tailcall calls itself.

The result is an infinite tailcall loop, and the loop halts the machine.

In tail call context, tail_call_cnt is propagated between BPF subprograms
through the stack and the RAX register. This series propagates it the same
way through FENTRY/FEXIT trampolines.
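
To illustrate why the propagation matters, here is a minimal user-space
model in plain C. It is not kernel or JIT code. It assumes that, without
the fix, the subprogram behind a trampoline sees a count of 0 on every
entry; that assumption, the run_chain() helper, and SAFETY_CAP are
hypothetical and exist only for this sketch. MAX_TAIL_CALL_CNT is set to
33 to mirror the kernel limit.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TAIL_CALL_CNT 33	/* mirrors the kernel's tail call limit */
#define SAFETY_CAP 1000		/* hypothetical guard so the model terminates */

/*
 * Model of: entry prog -> (trampoline) -> subprog -> tail call -> entry prog.
 * Returns how many tail calls were performed before the chain stopped.
 */
int run_chain(bool trampoline_propagates_cnt)
{
	int tail_call_cnt = 0;	/* initialised once by the entry program */
	int performed = 0;

	while (performed < SAFETY_CAP) {
		/*
		 * Assumption: a trampoline that does not carry the count
		 * hands the subprogram a fresh 0 every time.
		 */
		int seen = trampoline_propagates_cnt ? tail_call_cnt : 0;

		/* The subprogram refuses to tail call once at the limit. */
		if (seen >= MAX_TAIL_CALL_CNT)
			break;

		tail_call_cnt = seen + 1;
		assert(tail_call_cnt <= MAX_TAIL_CALL_CNT);
		performed++;
		/* The tail call target is the entry program, so loop again. */
	}
	return performed;
}
```

In this model, run_chain(true) stops after MAX_TAIL_CALL_CNT tail calls,
while run_chain(false) only stops at the artificial SAFETY_CAP, which
stands in for the infinite loop.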

How did I discover the bug?

Since commit 7f6e4312e15a5c37 ("bpf: Limit caller's stack depth 256 for
subprogs with tailcalls"), the total stack size is limited to around 8KiB.
To validate the stack consumption, I wrote some BPF programs that run
tailcalls in bpf2bpf and attach FENTRY/FEXIT tracing to bpf2bpf[1].

While doing so, I accidentally created a tailcall loop, and the loop halted
my VM. Without the loop, the BPF programs consume over 8KiB of stack, but
that _stack-overflow_ did not halt my VM.

With bpf_printk(), I confirmed that the tailcall count limit did not work
as expected. I then read the code and fixed it.

Unfortunately, I have only fixed it on x64, not on other arches. As a
result, CI tests failed because this bug hasn't been fixed on s390x.

Any help would be appreciated.

[1]: https://github.com/Asphaltt/learn-by-example/tree/main/ebpf/tailcall-stackoverflow

Leon Hwang (2):
  bpf, x64: Fix tailcall infinite loop bug
  selftests/bpf: Add testcases for tailcall infinite loop bug fixing

 arch/x86/net/bpf_jit_comp.c                   |  23 ++-
 include/linux/bpf.h                           |   6 +
 kernel/bpf/trampoline.c                       |   5 +-
 kernel/bpf/verifier.c                         |   9 +-
 .../selftests/bpf/prog_tests/tailcalls.c      | 194 +++++++++++++++++-
 .../bpf/progs/tailcall_bpf2bpf_fentry.c       |  18 ++
 .../bpf/progs/tailcall_bpf2bpf_fexit.c        |  18 ++
 7 files changed, 264 insertions(+), 9 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_fentry.c
 create mode 100644 tools/testing/selftests/bpf/progs/tailcall_bpf2bpf_fexit.c


base-commit: 9930e4af4b509bcf6f060b09b16884f26102d110
-- 
2.41.0



Thread overview: 10+ messages
2023-08-14 13:41 [RFC PATCH bpf-next 0/2] bpf, x64: Fix tailcall infinite loop bug Leon Hwang
2023-08-14 13:41 ` [RFC PATCH bpf-next 1/2] " Leon Hwang
2023-08-15  0:52   ` Eduard Zingerman
2023-08-15  3:01     ` Leon Hwang
2023-08-15 14:35       ` Eduard Zingerman
2023-08-17 22:31   ` Alexei Starovoitov
2023-08-18  2:10     ` Leon Hwang
2023-08-18 19:59       ` Alexei Starovoitov
2023-08-19  3:38         ` Leon Hwang
2023-08-14 13:41 ` [RFC PATCH bpf-next 2/2] selftests/bpf: Add testcases for tailcall infinite loop bug fixing Leon Hwang
