From: Shung-Hsi Yu <shung-hsi.yu@suse.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, Mohamed Mahmoud <mmahmoud@redhat.com>
Subject: Re: Hitting verifier backtracking bug on 6.5.5 kernel
Date: Tue, 17 Oct 2023 23:26:20 +0800
Message-ID: <ZS6nnJRuI22tgI4D@u94a>
In-Reply-To: <87il75v74m.fsf@toke.dk>
On Tue, Oct 17, 2023 at 01:08:25PM +0200, Toke Høiland-Jørgensen wrote:
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Mon, Oct 16, 2023 at 12:37 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> >>
> >> > On Thu, Oct 12, 2023 at 1:25 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >> >>
> >> >> Hi Andrii
> >> >>
> >> >> Mohamed ran into what appears to be a verifier bug related to your
> >> >> commit:
> >> >>
> >> >> fde2a3882bd0 ("bpf: support precision propagation in the presence of subprogs")
> >> >>
> >> >> So I figured you'd be the person to ask about this :)
> >> >>
> >> >> The issue appears on a vanilla 6.5 kernel (on both 6.5.6 on Fedora 38,
> >> >> and 6.5.5 on my Arch machine):
> >> >>
> >> >> INFO[0000] Verifier error: load program: bad address:
> >> >> 1861: frame2: R1_w=fp-160 R2_w=pkt_end(off=0,imm=0) R3=scalar(umin=17,umax=255,var_off=(0x0; 0xff)) R4_w=fp-96 R6_w=fp-96 R7_w=pkt(off=34,r=34,imm=0) R10=fp0
> >> >> ; switch (protocol) {
> >> >> 1861: (15) if r3 == 0x11 goto pc+22 1884: frame2: R1_w=fp-160 R2_w=pkt_end(off=0,imm=0) R3=17 R4_w=fp-96 R6_w=fp-96 R7_w=pkt(off=34,r=34,imm=0) R10=fp0
> >> >> ; if ((void *)udp + sizeof(*udp) <= data_end) {
> >> >> 1884: (bf) r3 = r7 ; frame2: R3_w=pkt(off=34,r=34,imm=0) R7_w=pkt(off=34,r=34,imm=0)
> >> >> 1885: (07) r3 += 8 ; frame2: R3_w=pkt(off=42,r=34,imm=0)
> >> >> ; if ((void *)udp + sizeof(*udp) <= data_end) {
> >> >> 1886: (2d) if r3 > r2 goto pc+23 ; frame2: R2_w=pkt_end(off=0,imm=0) R3_w=pkt(off=42,r=42,imm=0)
> >> >> ; id->src_port = bpf_ntohs(udp->source);
> >> >> 1887: (69) r2 = *(u16 *)(r7 +0) ; frame2: R2_w=scalar(umax=65535,var_off=(0x0; 0xffff)) R7_w=pkt(off=34,r=42,imm=0)
> >> >> 1888: (bf) r3 = r2 ; frame2: R2_w=scalar(id=103,umax=65535,var_off=(0x0; 0xffff)) R3_w=scalar(id=103,umax=65535,var_off=(0x0; 0xffff))
> >> >> 1889: (dc) r3 = be16 r3 ; frame2: R3_w=scalar()
> >> >> ; id->src_port = bpf_ntohs(udp->source);
> >> >> 1890: (73) *(u8 *)(r1 +47) = r3 ; frame2: R1_w=fp-160 R3_w=scalar()
> >> >> ; id->src_port = bpf_ntohs(udp->source);
> >> >> 1891: (dc) r2 = be64 r2 ; frame2: R2_w=scalar()
> >> >> ; id->src_port = bpf_ntohs(udp->source);
> >> >> 1892: (77) r2 >>= 56 ; frame2: R2_w=scalar(umax=255,var_off=(0x0; 0xff))
> >> >> 1893: (73) *(u8 *)(r1 +48) = r2
> >> >> BUG regs 1
> >> >> processed 5121 insns (limit 1000000) max_states_per_insn 4 total_states 92 peak_states 90 mark_read 20
> >> >> (truncated) component=ebpf.FlowFetcher
> >> >>
> >> >> Dmesg says:
> >> >>
> >> >> [252431.093126] verifier backtracking bug
> >> >> [252431.093129] WARNING: CPU: 3 PID: 302245 at kernel/bpf/verifier.c:3533 __mark_chain_precision+0xe83/0x1090
> >> >>
> >> >>
> >> >> The splat appears when trying to run the netobserv-ebpf-agent. Steps to
> >> >> reproduce:
> >> >>
> >> >> git clone https://github.com/netobserv/netobserv-ebpf-agent
> >> >> cd netobserv-ebpf-agent && make compile
> >> >> sudo FLOWS_TARGET_HOST=127.0.0.1 FLOWS_TARGET_PORT=9999 ./bin/netobserv-ebpf-agent
> >> >>
> >> >> (It needs a 'make generate' before the compile to recompile the BPF
> >> >> program itself, but that requires the Cilium bpf2go program to be
> >> >> installed and there's a binary version checked into the tree so that is
> >> >> not strictly necessary to reproduce the splat).
> >> >>
> >> >> That project uses the Cilium Go eBPF loader. Interestingly, loading the
> >> >> same program using tc (with libbpf 1.2.2) works just fine:
> >> >>
> >> >> ip link add type veth
> >> >> tc qdisc add dev veth0 clsact
> >> >> tc filter add dev veth0 egress bpf direct-action obj pkg/ebpf/bpf_bpfel.o sec tc_egress
> >> >>
> >> >> So maybe there is some massaging of the object file that libbpf is doing
> >> >> but the Go library isn't, that prevents this bug from triggering? I'm
> >> >> only guessing here, I don't really know exactly what the Go library is
> >> >> doing under the hood.
> >> >>
> >> >> Anyway, I guess this is a kernel bug in any case since that WARN() is
> >> >> there; could you please take a look?
> >> >>
> >> >
> >> > Yes, I tried. Unfortunately I can't build netobserv-ebpf-agent on my
> >> > dev machine and can't run it. I tried to load bpf_bpfel.o through
> >> > veristat, but unfortunately it is not libbpf-compatible.
> >> >
> >> > Is there some way to get a full verifier log for the failure above?
> >> > with log_level 2, if possible? If you can share it through Github Gist
> >> > or something like that, I'd really appreciate it. Thanks!
> >>
> >> Sure, here you go:
> >> https://gist.github.com/tohojo/31173d2bb07262a21393f76d9a45132d
> >
> > Thanks, this is very useful. And it's pretty clear what happens from
> > last few lines:
> >
> > mark_precise: frame2: regs=r2 stack= before 1890: (dc) r2 = be64 r2
> > mark_precise: frame2: regs=r0,r2 stack= before 1889: (73) *(u8
> > *)(r1 +47) = r3
> >
> > See how we add r0 to the regs set, while there is no r0 involved in
> > `r2 = be64 r2`? I think it's just a missing case of handling BPF_END
> > (and perhaps BPF_NEG as well) instructions in backtrack_insn(). Should
> > be a trivial fix, though ideally we should also add some test for this
> > as well.
>
> Sounds good, thank you for looking into it! Let me know if you need me
> to test a patch :)

Patch based on Andrii's analysis.

Given that both BPF_END and BPF_NEG always operate on dst_reg itself,
and that bt_is_reg_set(bt, dreg) has already been checked, I believe we
can just return with no further action.
---
kernel/bpf/verifier.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9cdba4ce23d2..7e396288aaf0 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -3418,7 +3418,9 @@ static int backtrack_insn(struct bpf_verifier_env *env, int idx, int subseq_idx,
 	if (class == BPF_ALU || class == BPF_ALU64) {
 		if (!bt_is_reg_set(bt, dreg))
 			return 0;
-		if (opcode == BPF_MOV) {
+		if (opcode == BPF_END || opcode == BPF_NEG) {
+			return 0;
+		} else if (opcode == BPF_MOV) {
 			if (BPF_SRC(insn->code) == BPF_X) {
 				/* dreg = sreg
 				 * dreg needs precision after this insn
--
2.42.0
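
For anyone who wants to see the effect outside the kernel, below is a toy
userspace model of the ALU part of the backtracking step. It is only a
sketch of my understanding, not the verifier code: struct toy_insn,
toy_backtrack_alu() and the handle_unary switch are made-up names, the
precision set is reduced to a plain bitmask, and only the opcode values
are taken from the BPF uapi headers. The assumption it illustrates is
that opcode 0xdc (`be64`) has the BPF_X bit set, so without a dedicated
case it falls into the generic "dreg op= sreg" branch and marks src_reg,
which is 0 for this instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcode fields, same values as in the BPF uapi headers. */
#define BPF_X   0x08	/* source bit; for BPF_END it selects "to big-endian" */
#define BPF_NEG 0x80
#define BPF_MOV 0xb0
#define BPF_END 0xd0

#define TOY_OP(code)  ((code) & 0xf0)
#define TOY_SRC(code) ((code) & 0x08)

/* Hypothetical, stripped-down instruction: just what the model needs. */
struct toy_insn {
	uint8_t code;
	uint8_t dst_reg;
	uint8_t src_reg;
};

static void toy_set_reg(uint32_t *regs, unsigned int reg)
{
	assert(reg <= 10);	/* r0..r10 */
	*regs |= 1u << reg;
}

static void toy_clear_reg(uint32_t *regs, unsigned int reg)
{
	assert(reg <= 10);
	*regs &= ~(1u << reg);
}

static bool toy_is_reg_set(uint32_t regs, unsigned int reg)
{
	return regs & (1u << reg);
}

/*
 * One backward step over an ALU instruction. 'regs' is the set of
 * registers needing precision after the instruction; the return value
 * is the set needing precision before it. handle_unary selects whether
 * BPF_END/BPF_NEG get the early return proposed in the patch.
 */
static uint32_t toy_backtrack_alu(uint32_t regs, struct toy_insn insn,
				  bool handle_unary)
{
	uint8_t opcode = TOY_OP(insn.code);

	if (!toy_is_reg_set(regs, insn.dst_reg))
		return regs;

	if (handle_unary && (opcode == BPF_END || opcode == BPF_NEG))
		return regs;	/* dreg = op(dreg): set is unchanged */

	if (opcode == BPF_MOV) {
		/* dreg = sreg or dreg = imm */
		toy_clear_reg(&regs, insn.dst_reg);
		if (TOY_SRC(insn.code) == BPF_X)
			toy_set_reg(&regs, insn.src_reg);
		return regs;
	}

	/* dreg op= sreg: both need precision before this insn */
	if (TOY_SRC(insn.code) == BPF_X)
		toy_set_reg(&regs, insn.src_reg);
	return regs;
}
```

Feeding it `r2 = be64 r2` (code 0xdc, dst_reg 2, src_reg 0) with only r2
in the set gives back r0,r2 when handle_unary is false, matching the
mark_precise output quoted above, and just r2 when it is true.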