From: "Alexei Starovoitov" <alexei.starovoitov@gmail.com>
To: "Seunghyeon Lee" <hyeon3125@gmail.com>,
"Alexei Starovoitov" <ast@kernel.org>,
"Daniel Borkmann" <daniel@iogearbox.net>
Cc: <bpf@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
"Song Liu" <song@kernel.org>,
"Andrii Nakryiko" <andrii@kernel.org>
Subject: Re: [PATCH RFC] bpf: add DAG fast-path in verifier to skip redundant state pruning
Date: Fri, 29 May 2026 18:23:26 -0700 [thread overview]
Message-ID: <DIVM9YXYJV5N.2TEM951JWKD95@gmail.com> (raw)
In-Reply-To: <178004391665.3522.4865582628003357086@gmail.com>
On Fri May 29, 2026 at 1:38 AM PDT, Seunghyeon Lee wrote:
> bpf: add DAG fast-path in verifier to skip redundant state pruning
>
> The BPF verifier's state-equivalence loop (is_state_visited /
> states_equal) dominates load time for programs with acyclic control
> flow graphs (DAGs). In a DAG, every instruction is reached via exactly
> one path: no state is ever revisited, making state-pruning comparisons
> mathematically vacuous.
>
> This RFC proposes three components:
>
> 1. compute_dag_topo() -- replaces a simple is_acyclic_dag() bool.
> Uses Kahn's algorithm (O(V+E), O(V) space, no recursion) to detect
> back-edges AND preserve the topological visit order for use in (2).
> Handles BPF_JMP32, BPF_LD_IMM64 wide instructions, and falls back
> conservatively on BPF-to-BPF subprogram calls.
>
> 2. do_check_dag() -- fast-path verifier for confirmed-DAG programs.
> Critical correctness property: maintains a per-instruction state
> table (states[insn_cnt]) and, at every join point (in_degree > 1),
> calls merge_verifier_state() to conservatively merge all incoming
> paths BEFORE processing the instruction. This ensures the verifier
> never sees a register as initialized when any predecessor path left
> it NOT_INIT.
>
> Skips is_state_visited() / states_equal() entirely; provably safe
> because topological order guarantees each instruction is processed
> exactly once.
>
> 3. Full fallback: if compute_dag_topo() returns NULL (cyclic program,
> subprog, or allocation failure), or if do_check_dag() returns an
> error, the unmodified do_check() runs unchanged.
>
> Register-state correctness at join points
> -----------------------------------------
> When paths A and B converge at instruction N, the verifier must hold
> the least precise state that is safe for both paths.
>
> Path A arrives at insn N: R2 = scalar [0, 5]
> Path B arrives at insn N: R2 = NOT_INIT
> merge_verifier_state() result: R2 = NOT_INIT
> -> verifier correctly rejects any use of R2 at N
>
> Without this merge (a flaw in naive linear-scan approaches), the
> verifier would see only the state from whichever path was processed
> last in memory order, potentially accepting a program that
> dereferences an uninitialized register at runtime.
>
> RFC design question:
> --------------------
> do_check_dag()'s inner per-instruction verification step requires
> calling into do_check()'s inner loop logic, which is not currently
> factored as a callable sub-function. We propose two integration
> options and request maintainer guidance:
>
> Option A: Extract a verify_one_insn() helper from do_check()'s
> inner loop; do_check_dag() calls it per topological step.
Specializing the verifier for DAG only case is not interesting.
Almost all programs now contain loops.
But the approach to use do_check_insn() as a transfer function
in the data flow analysis is totally viable.
I've been prototyping such transfer + join verifier for some time.
Will share soon. What merge_verifier_state() does is a key.
next prev parent reply other threads:[~2026-05-30 1:23 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-29 8:38 [PATCH RFC] bpf: add DAG fast-path in verifier to skip redundant state pruning Seunghyeon Lee
2026-05-30 1:23 ` Alexei Starovoitov [this message]
2026-05-30 2:12 ` Seunghyeon Lee
2026-06-24 17:38 ` L0
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DIVM9YXYJV5N.2TEM951JWKD95@gmail.com \
--to=alexei.starovoitov@gmail.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=hyeon3125@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=song@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.