From: Kumar Kartikeya Dwivedi <memxor@gmail.com>
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov <ast@kernel.org>,
Andrii Nakryiko <andrii@kernel.org>,
Daniel Borkmann <daniel@iogearbox.net>,
Martin KaFai Lau <martin.lau@kernel.org>,
Eduard Zingerman <eddyz87@gmail.com>, Tejun Heo <tj@kernel.org>,
Emil Tsalapatis <emil@etsalapatis.com>,
Barret Rhoden <brho@google.com>, Josh Don <joshdon@google.com>,
Dohyun Kim <dohyunkim@google.com>,
kkd@meta.com, kernel-team@meta.com
Subject: [PATCH bpf-next v2 0/2] Timed may_goto
Date: Mon, 3 Mar 2025 16:32:37 -0800 [thread overview]
Message-ID: <20250304003239.2390751-1-memxor@gmail.com> (raw)
This series replaces the current implementation of cond_break, which
uses the may_goto instruction, and counts 8 million iterations per stack
frame, with an implementation based on sampling time locally on the CPU.
This is done to permit a longer time for a given loop per-program
invocation. The accounting is still done per-stack frame, but the count
is used to instead amortize the cost of the logic to sample and check
the time spent since the start.
This is needed for expressing more complicated algorithms (spin locks,
waiting loops, etc.) in BPF programs without false positive expiration
of the loop. For instance, the plan is to make use of this for
implementing spin locks for BPF arena [0].
For the loop as follows:
for (int i = 0;; i++) {}
Testing on a bare-metal Sapphire Rapids Intel server yields the following
table (taking an average of 25 runs).
+-----------------------------+--------------+--------------+------------------+
| Loop type | Iterations | Time (ms) | Time/iter (ns) |
+-----------------------------|--------------+--------------+------------------+
| may_goto | 8388608 | 3 | 0.36 |
| timed_may_goto (count=65535)| 589674932 | 250 | 0.42 |
| bpf_for | 8388608 | 10 | 1.19 |
+-----------------------------+--------------+--------------+------------------+
Here, count is used to amortize the time sampling and checking logic.
Obviously, this is the limit of an empty loop. Given the complexity of
the loop body, the time spent in the loop can be longer. Cancellations
will address the task of imposing an upper bound on program runtime.
For now, the implementation only supports x86.
[0]: https://lore.kernel.org/bpf/20250118162238.2621311-1-memxor@gmail.com
Changelog:
----------
v1 -> v2
v1: https://lore.kernel.org/bpf/20250302201348.940234-1-memxor@gmail.com
* Address comments from Alexei
* Use kernel comment style for new code.
* Remove p->count == 0 check in bpf_check_timed_may_goto.
* Add comments on AX as argument/retval calling convention.
* Add comments describing how the counting logic works.
* Use BPF_EMIT_CALL instead of open-coding instruction encoding.
* Change if ax != 1 goto pc+X condition to if ax != 0 goto pc+X.
Kumar Kartikeya Dwivedi (2):
bpf: Add verifier support for timed may_goto
bpf, x86: Add x86 JIT support for timed may_goto
arch/x86/net/Makefile | 2 +-
arch/x86/net/bpf_jit_comp.c | 5 ++
arch/x86/net/bpf_timed_may_goto.S | 52 ++++++++++++++
include/linux/bpf.h | 1 +
include/linux/filter.h | 8 +++
kernel/bpf/core.c | 32 +++++++++
kernel/bpf/verifier.c | 70 ++++++++++++++++---
.../bpf/progs/verifier_bpf_fastcall.c | 58 ++++++++++++---
.../selftests/bpf/progs/verifier_may_goto_1.c | 34 ++++++++-
9 files changed, 241 insertions(+), 21 deletions(-)
create mode 100644 arch/x86/net/bpf_timed_may_goto.S
base-commit: 7586e2169c77a444d235a98ac858272d3dcec16e
--
2.43.5
next reply other threads:[~2025-03-04 0:32 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-04 0:32 Kumar Kartikeya Dwivedi [this message]
2025-03-04 0:32 ` [PATCH bpf-next v2 1/2] bpf: Add verifier support for timed may_goto Kumar Kartikeya Dwivedi
2025-03-04 2:04 ` Alexei Starovoitov
2025-03-04 0:32 ` [PATCH bpf-next v2 2/2] bpf, x86: Add x86 JIT " Kumar Kartikeya Dwivedi
2025-03-04 2:10 ` [PATCH bpf-next v2 0/2] Timed may_goto patchwork-bot+netdevbpf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250304003239.2390751-1-memxor@gmail.com \
--to=memxor@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=brho@google.com \
--cc=daniel@iogearbox.net \
--cc=dohyunkim@google.com \
--cc=eddyz87@gmail.com \
--cc=emil@etsalapatis.com \
--cc=joshdon@google.com \
--cc=kernel-team@meta.com \
--cc=kkd@meta.com \
--cc=martin.lau@kernel.org \
--cc=tj@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox