* [PATCH] bpf: bpf_dbg: fix off-by-one in cmd_select and pcap_next_pkt
@ 2026-04-28 10:01 Hasan Basbunar
2026-04-29 8:44 ` [PATCH v2] bpf: bpf_dbg: fix off-by-one in cmd_select Hasan Basbunar
0 siblings, 1 reply; 4+ messages in thread
From: Hasan Basbunar @ 2026-04-28 10:01 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, linux-kernel,
Hasan Basbunar
bpf_dbg's interactive 'select <N>' command, documented in the file
header ("select 3 (run etc will start from the 3rd packet in the pcap)")
to use 1-based packet indexing, advances the pcap cursor one packet too
many. The loop in cmd_select():

	pcap_reset_pkt();	/* cursor on packet 1 */
	for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
		/* noop */;

calls pcap_next_pkt() N times to reach packet N, but pcap_next_pkt()
validates the packet at the cursor and then advances past it. After
N calls the cursor is on packet N+1, so 'select 3' positions on
packet 4, 'select 4' on packet 5, etc. To land on packet N the loop
must advance the cursor only N-1 times.
A second off-by-one in pcap_next_pkt() rejects the last packet of any
pcap whose mapped size equals the sum of its packets exactly (the
common case — pcap files have no trailer):

	if (pcap_ptr_va_curr + sizeof(*hdr) + hdr->caplen -
	    pcap_ptr_va_start >= pcap_map_size)
		return false;

When the current packet ends exactly at the mmap boundary, the
expression equals pcap_map_size and the >= check rejects a fully
in-bounds packet. The same off-by-one is present in the earlier
header-fits check on the same function. Both should compare with >.
Combined effect: 'select N' on a pcap of N packets always reports
"no packet #N available!". For a 1-packet pcap, 'select 1' reports
the only packet as unavailable.
Reproduction (deterministic, no kernel needed): build bpf_dbg from
the unmodified tree, synthesize a pcap with N>=1 packets each with a
distinct payload byte, and drive 'select K / step 1 / quit'. Before
this fix, 'select 1' shows packet 2's payload; 'select N' shows the
"no packet" error. After this fix, 'select K' shows packet K for
all K in 1..N, and 'select N+1' correctly errors.
Cloudflare's downstream mirror at github.com/cloudflare/bpftools
carries the same defect.
Fixes: fd981e3c321a ("filter: bpf_dbg: add minimal bpf debugger")
Signed-off-by: Hasan Basbunar <basbunarhasan@gmail.com>
---
 tools/bpf/bpf_dbg.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
index 00e560a17baf..f21576dc2326 100644
--- a/tools/bpf/bpf_dbg.c
+++ b/tools/bpf/bpf_dbg.c
@@ -923,12 +923,12 @@ static bool pcap_next_pkt(void)
 	struct pcap_pkthdr *hdr = pcap_curr_pkt();
 
 	if (pcap_ptr_va_curr + sizeof(*hdr) -
-	    pcap_ptr_va_start >= pcap_map_size)
+	    pcap_ptr_va_start > pcap_map_size)
 		return false;
 	if (hdr->caplen == 0 || hdr->len == 0 || hdr->caplen > hdr->len)
 		return false;
 	if (pcap_ptr_va_curr + sizeof(*hdr) + hdr->caplen -
-	    pcap_ptr_va_start >= pcap_map_size)
+	    pcap_ptr_va_start > pcap_map_size)
 		return false;
 
 	pcap_ptr_va_curr += (sizeof(*hdr) + hdr->caplen);
@@ -1141,7 +1141,7 @@ static int cmd_select(char *num)
 	pcap_reset_pkt();
 	bpf_reset();
 
-	for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
+	for (i = 1; i < which && (have_next = pcap_next_pkt()); i++)
 		/* noop */;
 	if (!have_next || pcap_curr_pkt() == NULL) {
 		rl_printf("no packet #%u available!\n", which);
--
2.53.0

^ permalink raw reply related [flat|nested] 4+ messages in thread

* [PATCH v2] bpf: bpf_dbg: fix off-by-one in cmd_select
2026-04-28 10:01 [PATCH] bpf: bpf_dbg: fix off-by-one in cmd_select and pcap_next_pkt Hasan Basbunar
@ 2026-04-29 8:44 ` Hasan Basbunar
2026-04-29 12:35 ` [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, " Hasan Basbunar
0 siblings, 1 reply; 4+ messages in thread
From: Hasan Basbunar @ 2026-04-29 8:44 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, linux-kernel,
Hasan Basbunar

bpf_dbg's interactive 'select <N>' command, documented in the file
header ("select 3 (run etc will start from the 3rd packet in the pcap)")
to use 1-based packet indexing, advances the pcap cursor one packet too
many. The loop in cmd_select():

	pcap_reset_pkt();	/* cursor on packet 1 */
	for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
		/* noop */;

calls pcap_next_pkt() N times to reach packet N, but pcap_next_pkt()
validates the packet at the cursor and then advances past it. After
N calls the cursor is on packet N+1, so 'select 3' positions on
packet 4, 'select 4' on packet 5, etc. To land on packet N the loop
must advance the cursor only N-1 times.

Reproduction (deterministic, no kernel needed): build bpf_dbg from
the unmodified tree, synthesize a pcap with N>=2 packets each with a
distinct payload byte, and drive 'select 1 / step 1 / quit'. Before
this fix, 'select 1' shows packet 2's payload. After this fix,
'select K' shows packet K for all K in 1..N, and 'select N+1'
correctly errors with "no packet #N+1 available!".

Cloudflare's downstream mirror at github.com/cloudflare/bpftools
carries the same defect.

Fixes: fd981e3c321a ("filter: bpf_dbg: add minimal bpf debugger")
Signed-off-by: Hasan Basbunar <basbunarhasan@gmail.com>
---
Changes in v2:
- Drop the pcap_next_pkt() boundary change (>= -> >). As correctly
  pointed out by Sashiko AI on the v1 thread, that change was wrong:
  when the last packet body ends exactly at the mmap boundary (the
  common case for pcap files with no trailer), the relaxed check let
  pcap_next_pkt() advance the cursor to pcap_ptr_va_start +
  pcap_map_size and return true. The cmd_run() do/while loop then
  re-entered its body, called pcap_curr_pkt() at end-of-mmap, and
  bpf_run_all() dereferenced hdr->caplen / hdr->len out of bounds.
  The original >= comparison is correct: when the body ends at the
  boundary it returns false without advancing, and the loop exits
  cleanly. The cmd_select() 1-based fix below is sufficient and
  self-contained; pcap_next_pkt() is left untouched.
- v1: https://lore.kernel.org/bpf/20260428100109.56572-1-basbunarhasan@gmail.com/

 tools/bpf/bpf_dbg.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
index 00e560a17baf..4895602ab37d 100644
--- a/tools/bpf/bpf_dbg.c
+++ b/tools/bpf/bpf_dbg.c
@@ -1141,7 +1141,7 @@ static int cmd_select(char *num)
 	pcap_reset_pkt();
 	bpf_reset();
 
-	for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
+	for (i = 1; i < which && (have_next = pcap_next_pkt()); i++)
 		/* noop */;
 	if (!have_next || pcap_curr_pkt() == NULL) {
 		rl_printf("no packet #%u available!\n", which);
--
2.53.0

^ permalink raw reply related [flat|nested] 4+ messages in thread

* [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, fix off-by-one in cmd_select
2026-04-29 8:44 ` [PATCH v2] bpf: bpf_dbg: fix off-by-one in cmd_select Hasan Basbunar
@ 2026-04-29 12:35 ` Hasan Basbunar
2026-04-29 13:13 ` bot+bpf-ci
0 siblings, 1 reply; 4+ messages in thread
From: Hasan Basbunar @ 2026-04-29 12:35 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, linux-kernel,
Hasan Basbunar

bpf_dbg's interactive 'select <N>' command, documented in the file
header ("select 3 (run etc will start from the 3rd packet in the pcap)")
to use 1-based packet indexing, advances the pcap cursor one packet too
many. The loop in cmd_select():

	pcap_reset_pkt();	/* cursor on packet 1 */
	for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
		/* noop */;

calls pcap_next_pkt() N times to reach packet N, but pcap_next_pkt()
validates the packet at the cursor and then advances past it. After
N calls the cursor is on packet N+1, so 'select 3' positions on
packet 4, 'select 4' on packet 5, etc.

Simply changing the loop init to 'i = 1' (so it advances N-1 times)
fixes the user-visible symptom but leaves the final landed-on packet
unvalidated, and combined with pcap_next_pkt()'s '>=' boundary checks,
mis-handles the boundary cases on the last and just-past-the-last
packet. As pointed out by the Sashiko AI review on v1 and v2, this
surfaces in two ways:

1. On a perfect pcap (no trailing bytes after the last packet),
   pcap_next_pkt()'s '>= pcap_map_size' rejects packets whose body
   ends exactly at the file boundary, so 'select N' on an N-packet
   file errors as "no packet #N available" even though the packet is
   fully in-bounds.

2. On a truncated pcap (filehdr + a few stray bytes that happen to
   pass try_load_pcap()'s 'pcap_map_size > sizeof(filehdr)' guard but
   not enough to contain a full pkthdr), 'select 1' returns CMD_OK
   without ever validating the header, and a subsequent 'step' or
   'run' dereferences pcap_curr_pkt()->caplen past the mapped region.

Fix all three issues by splitting pcap_next_pkt() into a pure
validator (pcap_curr_pkt_valid()) and a validate-advance-validate
combinator. The boundary check now uses '>' instead of '>=', so a
packet whose body ends exactly at pcap_map_size is correctly accepted.
pcap_next_pkt() returns true only when both the current packet was
valid and, after advancing, the new cursor position is also valid.
This means the do-while in cmd_run() exits cleanly after the last
packet (no past-end dereference), and cmd_select() can call
pcap_curr_pkt_valid() after the loop to bounds-check the final packet.

Reproduction (deterministic, no kernel needed): build bpf_dbg from
the unmodified tree, synthesize a pcap with N>=2 packets each with a
distinct payload byte, and drive 'select 1 / step 1 / quit'. Before
this fix, 'select 1' shows packet 2's payload. After this fix,
'select K' shows packet K for all K in 1..N, 'select N+1' correctly
errors with "no packet #N+1 available!", and 'select 1' on a pcap
truncated to filehdr + 1 byte also correctly errors.

Cloudflare's downstream mirror at github.com/cloudflare/bpftools
carries the same defect.

Fixes: fd981e3c321a ("filter: bpf_dbg: add minimal bpf debugger")
Signed-off-by: Hasan Basbunar <basbunarhasan@gmail.com>
---
Changes in v3:
- Split pcap_next_pkt() into pcap_curr_pkt_valid() (pure validator)
  and pcap_next_pkt() (validate-current, advance, validate-new).
- Boundary check now uses '>' instead of '>='; a packet whose body
  ends exactly at pcap_map_size is correctly accepted.
- cmd_select() validates the final landed-on packet via
  pcap_curr_pkt_valid() instead of the dead
  `pcap_curr_pkt() == NULL` check.
- Empirically verified in a clean Debian container (gcc -Wall -O0)
  against:
  * 5-packet pcap, select K for K in 1..6 (5 successes + 1 error on
    K=6, payload byte matches K per the file header docs);
  * 1-packet pcap, select 1 (succeeds), select 2 (errors);
  * truncated pcap (filehdr + 1 byte), select 1 errors cleanly
    without dereferencing past the mapped region;
  * `run` after `select 3` on a 5-packet pcap processes exactly 3
    packets and exits cleanly without past-end deref.
- Addresses both review concerns raised by Sashiko AI on v1 and v2.
- v1: https://lore.kernel.org/bpf/20260428100109.56572-1-basbunarhasan@gmail.com/
  v2: https://lore.kernel.org/bpf/20260429084441.22089-1-basbunarhasan@gmail.com/

 tools/bpf/bpf_dbg.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
index 4895602ab37d..db12d2f8fb73 100644
--- a/tools/bpf/bpf_dbg.c
+++ b/tools/bpf/bpf_dbg.c
@@ -918,21 +918,30 @@ static struct pcap_pkthdr *pcap_curr_pkt(void)
 	return (void *) pcap_ptr_va_curr;
 }
 
-static bool pcap_next_pkt(void)
+static bool pcap_curr_pkt_valid(void)
 {
 	struct pcap_pkthdr *hdr = pcap_curr_pkt();
 
 	if (pcap_ptr_va_curr + sizeof(*hdr) -
-	    pcap_ptr_va_start >= pcap_map_size)
+	    pcap_ptr_va_start > pcap_map_size)
 		return false;
 	if (hdr->caplen == 0 || hdr->len == 0 || hdr->caplen > hdr->len)
 		return false;
 	if (pcap_ptr_va_curr + sizeof(*hdr) + hdr->caplen -
-	    pcap_ptr_va_start >= pcap_map_size)
+	    pcap_ptr_va_start > pcap_map_size)
 		return false;
+	return true;
+}
 
+static bool pcap_next_pkt(void)
+{
+	struct pcap_pkthdr *hdr;
+
+	if (!pcap_curr_pkt_valid())
+		return false;
+	hdr = pcap_curr_pkt();
 	pcap_ptr_va_curr += (sizeof(*hdr) + hdr->caplen);
-	return true;
+	return pcap_curr_pkt_valid();
 }
 
 static void pcap_reset_pkt(void)
@@ -1143,7 +1152,7 @@ static int cmd_select(char *num)
 
 	for (i = 1; i < which && (have_next = pcap_next_pkt()); i++)
 		/* noop */;
-	if (!have_next || pcap_curr_pkt() == NULL) {
+	if (!have_next || !pcap_curr_pkt_valid()) {
 		rl_printf("no packet #%u available!\n", which);
 		pcap_reset_pkt();
 		return CMD_ERR;
--
2.53.0

^ permalink raw reply related [flat|nested] 4+ messages in thread

* Re: [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, fix off-by-one in cmd_select
2026-04-29 12:35 ` [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, " Hasan Basbunar
@ 2026-04-29 13:13 ` bot+bpf-ci
0 siblings, 0 replies; 4+ messages in thread
From: bot+bpf-ci @ 2026-04-29 13:13 UTC (permalink / raw)
To: basbunarhasan, daniel
Cc: ast, andrii, bpf, linux-kernel, basbunarhasan, ast, andrii, daniel,
martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai

[-- Attachment #1: Type: text/plain, Size: 3728 bytes --]

> diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
> --- a/tools/bpf/bpf_dbg.c
> +++ b/tools/bpf/bpf_dbg.c
> @@ -918,21 +918,30 @@ static struct pcap_pkthdr *pcap_curr_pkt(void)
>  	return (void *) pcap_ptr_va_curr;
>  }
>
> -static bool pcap_next_pkt(void)
> +static bool pcap_curr_pkt_valid(void)
>  {
>  	struct pcap_pkthdr *hdr = pcap_curr_pkt();
>
>  	if (pcap_ptr_va_curr + sizeof(*hdr) -
> -	    pcap_ptr_va_start >= pcap_map_size)
> +	    pcap_ptr_va_start > pcap_map_size)
>  		return false;
>  	if (hdr->caplen == 0 || hdr->len == 0 || hdr->caplen > hdr->len)
>  		return false;
>  	if (pcap_ptr_va_curr + sizeof(*hdr) + hdr->caplen -
> -	    pcap_ptr_va_start >= pcap_map_size)
> +	    pcap_ptr_va_start > pcap_map_size)
>  		return false;
> +	return true;
> +}
>
> +static bool pcap_next_pkt(void)
> +{
> +	struct pcap_pkthdr *hdr;
> +
> +	if (!pcap_curr_pkt_valid())
> +		return false;
> +	hdr = pcap_curr_pkt();
>  	pcap_ptr_va_curr += (sizeof(*hdr) + hdr->caplen);
> -	return true;
> +	return pcap_curr_pkt_valid();
>  }
>
>  static void pcap_reset_pkt(void)
> @@ -1143,7 +1152,7 @@ static int cmd_select(char *num)
>
>  	for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
>  		/* noop */;
> -	if (!have_next || pcap_curr_pkt() == NULL) {
> +	if (!have_next || !pcap_curr_pkt_valid()) {

The commit message asserts that after this fix, 'select K shows packet
K for all K in 1..N' and 'select N+1 correctly errors with no packet
#N+1 available!'. Does the actual code achieve this behavior given that
the loop still initializes i to 0?

With the new pcap_next_pkt() semantics (validate-current, advance,
validate-new), trace through 'select 1' on a file with >=2 packets:

  Iteration 1 (i=0):
    pcap_next_pkt() validates pkt1 (ok)
    advances cursor to pkt2
    validates pkt2 (ok)
    returns true
    i becomes 1
  Loop exit (i=1, i<1 is false)
  Cursor is now on pkt2
  pcap_curr_pkt_valid() returns true
  Function returns CMD_OK

The user requested packet 1, but subsequent 'step' or 'run' operates on
pkt2. This reproduces the exact defect the commit claims to fix:
'Before this fix, select 1 shows packet 2's payload.'

Similarly, trace 'select 3' on a file with exactly 3 packets:

  Iteration 1 (i=0): advances pkt1 -> pkt2, validates pkt2 (ok), i=1
  Iteration 2 (i=1): advances pkt2 -> pkt3, validates pkt3 (ok), i=2
  Iteration 3 (i=2): pkt3 valid, advances past pkt3 (cursor at EOF),
                     post-advance pcap_curr_pkt_valid() is false,
                     pcap_next_pkt() returns false, have_next=false
  Loop exits with have_next=false
  Code hits 'no packet #3 available!' even though packet 3 exists

The commit message itself identifies the correct fix: "Simply changing
the loop init to 'i = 1' (so it advances N-1 times) fixes the
user-visible symptom". That change was not applied in the diff. The
loop still uses 'i = 0', which causes it to advance N times instead of
N-1 times, landing on packet N+1 when the user requested packet N.

The v3 patch discussion on lore.kernel.org/bpf addressed this by
changing the loop from 'for (i = 0; i < which ...)' to
'for (i = 1; i < which ...)' to properly implement 1-based indexing.
The current code has the pcap_curr_pkt_valid() refactoring but is
missing the loop initialization fix.

Reference: https://lore.kernel.org/bpf/20260429123543.61559-1-basbunarhasan@gmail.com/

>  		rl_printf("no packet #%u available!\n", which);
>  		pcap_reset_pkt();
>  		return CMD_ERR;

---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25109638113

^ permalink raw reply [flat|nested] 4+ messages in thread

end of thread, other threads:[~2026-04-29 13:13 UTC | newest]

Thread overview: 4+ messages
2026-04-28 10:01 [PATCH] bpf: bpf_dbg: fix off-by-one in cmd_select and pcap_next_pkt Hasan Basbunar
2026-04-29 8:44 ` [PATCH v2] bpf: bpf_dbg: fix off-by-one in cmd_select Hasan Basbunar
2026-04-29 12:35 ` [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, " Hasan Basbunar
2026-04-29 13:13 ` bot+bpf-ci