* [PATCH] bpf: bpf_dbg: fix off-by-one in cmd_select and pcap_next_pkt
@ 2026-04-28 10:01 Hasan Basbunar
2026-04-29 8:44 ` [PATCH v2] bpf: bpf_dbg: fix off-by-one in cmd_select Hasan Basbunar
0 siblings, 1 reply; 4+ messages in thread
From: Hasan Basbunar @ 2026-04-28 10:01 UTC
To: Daniel Borkmann
Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, linux-kernel,
Hasan Basbunar
bpf_dbg's interactive 'select <N>' command, documented in the file
header ("select 3 (run etc will start from the 3rd packet in the pcap)")
to use 1-based packet indexing, advances the pcap cursor one packet too
many. The loop in cmd_select():
pcap_reset_pkt(); /* cursor on packet 1 */
for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
/* noop */;
calls pcap_next_pkt() N times to reach packet N, but pcap_next_pkt()
validates the packet at the cursor and then advances past it. After
N calls the cursor is on packet N+1, so 'select 3' positions on
packet 4, 'select 4' on packet 5, etc. To land on packet N the loop
must advance the cursor only N-1 times.
A second off-by-one in pcap_next_pkt() rejects the last packet of any
pcap whose mapped size equals the sum of its packets exactly (the
common case — pcap files have no trailer):
if (pcap_ptr_va_curr + sizeof(*hdr) + hdr->caplen -
pcap_ptr_va_start >= pcap_map_size)
return false;
When the current packet ends exactly at the mmap boundary, the
expression equals pcap_map_size and the >= check rejects a fully
in-bounds packet. The same off-by-one is present in the earlier
header-fits check in the same function. Both should compare with >.
Combined effect: 'select N' on a pcap of N packets always reports
"no packet #N available!". For a 1-packet pcap, 'select 1' reports
the only packet as unavailable.
Reproduction (deterministic, no kernel needed): build bpf_dbg from
the unmodified tree, synthesize a pcap with N>=1 packets each with a
distinct payload byte, and drive 'select K / step 1 / quit'. Before
this fix, 'select 1' shows packet 2's payload; 'select N' shows the
"no packet" error. After this fix, 'select K' shows packet K for
all K in 1..N, and 'select N+1' correctly errors.
Cloudflare's downstream mirror at github.com/cloudflare/bpftools
carries the same defect.
Fixes: fd981e3c321a ("filter: bpf_dbg: add minimal bpf debugger")
Signed-off-by: Hasan Basbunar <basbunarhasan@gmail.com>
---
tools/bpf/bpf_dbg.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
index 00e560a17baf..f21576dc2326 100644
--- a/tools/bpf/bpf_dbg.c
+++ b/tools/bpf/bpf_dbg.c
@@ -923,12 +923,12 @@ static bool pcap_next_pkt(void)
struct pcap_pkthdr *hdr = pcap_curr_pkt();
if (pcap_ptr_va_curr + sizeof(*hdr) -
- pcap_ptr_va_start >= pcap_map_size)
+ pcap_ptr_va_start > pcap_map_size)
return false;
if (hdr->caplen == 0 || hdr->len == 0 || hdr->caplen > hdr->len)
return false;
if (pcap_ptr_va_curr + sizeof(*hdr) + hdr->caplen -
- pcap_ptr_va_start >= pcap_map_size)
+ pcap_ptr_va_start > pcap_map_size)
return false;
pcap_ptr_va_curr += (sizeof(*hdr) + hdr->caplen);
@@ -1141,7 +1141,7 @@ static int cmd_select(char *num)
pcap_reset_pkt();
bpf_reset();
- for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
+ for (i = 1; i < which && (have_next = pcap_next_pkt()); i++)
/* noop */;
if (!have_next || pcap_curr_pkt() == NULL) {
rl_printf("no packet #%u available!\n", which);
--
2.53.0
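The cmd_select() off-by-one described in this message can be reproduced
outside bpf_dbg with a small model. The following is only a sketch under
stated assumptions, not the bpf_dbg code: struct pkthdr, map, cur,
build(), next_pkt() and select_payload() are hypothetical names, the pcap
file header is left out (offset 0 is packet 1), every packet carries one
payload byte, and have_next is assumed to start out true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct pcap_pkthdr (timestamp omitted). */
struct pkthdr { uint32_t caplen, len; };

#define MAX_PKTS 8
#define REC_SIZE (sizeof(struct pkthdr) + 1)	/* header + 1 payload byte */

static unsigned char map[MAX_PKTS * REC_SIZE];
static size_t map_size;	/* models pcap_map_size */
static size_t cur;	/* models pcap_ptr_va_curr - pcap_ptr_va_start */

/* Fill the map with n packets; packet k carries the single payload byte k. */
static void build(unsigned n)
{
	assert(n <= MAX_PKTS);
	for (unsigned k = 0; k < n; k++) {
		struct pkthdr h = { 1, 1 };

		memcpy(map + k * REC_SIZE, &h, sizeof(h));
		map[k * REC_SIZE + sizeof(h)] = (unsigned char)(k + 1);
	}
	map_size = n * REC_SIZE;
}

/* Original semantics: validate the packet at the cursor, then step past it. */
static bool next_pkt(void)
{
	struct pkthdr h;

	if (cur + sizeof(h) >= map_size)
		return false;
	memcpy(&h, map + cur, sizeof(h));
	if (h.caplen == 0 || h.len == 0 || h.caplen > h.len)
		return false;
	if (cur + sizeof(h) + h.caplen >= map_size)
		return false;
	cur += sizeof(h) + h.caplen;
	return true;
}

/*
 * Returns the payload byte of the packet the cursor lands on, or -1 on
 * error. loop_start is 0 for the original loop and 1 for the patched one.
 */
static int select_payload(unsigned which, unsigned loop_start)
{
	bool have_next = true;	/* assumed initial value */
	unsigned i;

	cur = 0;		/* models pcap_reset_pkt() */
	for (i = loop_start; i < which && (have_next = next_pkt()); i++)
		/* noop */;
	if (!have_next || cur + REC_SIZE > map_size)	/* sketch-only guard */
		return -1;
	return map[cur + sizeof(struct pkthdr)];
}
```

After build(3), select_payload(1, 0) yields payload byte 2 and
select_payload(3, 0) yields -1, while select_payload(1, 1) and
select_payload(3, 1) yield 1 and 3. In this model the loop start alone
accounts for both symptoms on a trailer-less file.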
* [PATCH v2] bpf: bpf_dbg: fix off-by-one in cmd_select
2026-04-28 10:01 [PATCH] bpf: bpf_dbg: fix off-by-one in cmd_select and pcap_next_pkt Hasan Basbunar
@ 2026-04-29 8:44 ` Hasan Basbunar
2026-04-29 12:35 ` [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, " Hasan Basbunar
0 siblings, 1 reply; 4+ messages in thread
From: Hasan Basbunar @ 2026-04-29 8:44 UTC
To: Daniel Borkmann
Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, linux-kernel,
Hasan Basbunar
bpf_dbg's interactive 'select <N>' command, documented in the file
header ("select 3 (run etc will start from the 3rd packet in the
pcap)") to use 1-based packet indexing, advances the pcap cursor one
packet too many. The loop in cmd_select():
pcap_reset_pkt(); /* cursor on packet 1 */
for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
/* noop */;
calls pcap_next_pkt() N times to reach packet N, but pcap_next_pkt()
validates the packet at the cursor and then advances past it. After
N calls the cursor is on packet N+1, so 'select 3' positions on
packet 4, 'select 4' on packet 5, etc. To land on packet N the loop
must advance the cursor only N-1 times.
Reproduction (deterministic, no kernel needed): build bpf_dbg from
the unmodified tree, synthesize a pcap with N>=2 packets each with
a distinct payload byte, and drive 'select 1 / step 1 / quit'.
Before this fix, 'select 1' shows packet 2's payload. After this
fix, 'select K' shows packet K for all K in 1..N, and 'select N+1'
correctly errors with "no packet #N+1 available!".
Cloudflare's downstream mirror at github.com/cloudflare/bpftools
carries the same defect.
Fixes: fd981e3c321a ("filter: bpf_dbg: add minimal bpf debugger")
Signed-off-by: Hasan Basbunar <basbunarhasan@gmail.com>
---
Changes in v2:
- Drop the pcap_next_pkt() boundary change (>= -> >). As correctly
pointed out by Sashiko AI on the v1 thread, that change was wrong:
when the last packet body ends exactly at the mmap boundary (the
common case for pcap files with no trailer), the relaxed check let
pcap_next_pkt() advance the cursor to pcap_ptr_va_start +
pcap_map_size and return true. The cmd_run() do/while loop then
re-entered its body, called pcap_curr_pkt() at end-of-mmap, and
bpf_run_all() dereferenced hdr->caplen / hdr->len out of bounds.
The original >= comparison is correct: when the body ends at the
boundary it returns false without advancing, and the loop exits
cleanly. The cmd_select() 1-based fix below is sufficient and
self-contained; pcap_next_pkt() is left untouched.
- v1: https://lore.kernel.org/bpf/20260428100109.56572-1-basbunarhasan@gmail.com/
tools/bpf/bpf_dbg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
index 00e560a17baf..4895602ab37d 100644
--- a/tools/bpf/bpf_dbg.c
+++ b/tools/bpf/bpf_dbg.c
@@ -1141,7 +1141,7 @@ static int cmd_select(char *num)
pcap_reset_pkt();
bpf_reset();
- for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
+ for (i = 1; i < which && (have_next = pcap_next_pkt()); i++)
/* noop */;
if (!have_next || pcap_curr_pkt() == NULL) {
rl_printf("no packet #%u available!\n", which);
--
2.53.0
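The v2 changelog's argument against relaxing '>=' to '>' while keeping
validate-then-advance semantics can be checked with a small model. This is
a sketch under assumptions rather than the real code path: the names
(struct pkthdr, map, cur, next_pkt(), run_all(), struct run_result) are
hypothetical, the file header is omitted, and run_all() only imitates the
shape of a do/while that reads the current header before asking for the
next packet, as the changelog describes for cmd_run().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct pcap_pkthdr (timestamp omitted). */
struct pkthdr { uint32_t caplen, len; };

#define N_PKTS   2
#define REC_SIZE (sizeof(struct pkthdr) + 1)	/* header + 1 payload byte */

static unsigned char map[N_PKTS * REC_SIZE];	/* no trailing bytes */
static const size_t map_size = sizeof(map);
static size_t cur;

static void build(void)
{
	for (unsigned k = 0; k < N_PKTS; k++) {
		struct pkthdr h = { 1, 1 };

		memcpy(map + k * REC_SIZE, &h, sizeof(h));
		map[k * REC_SIZE + sizeof(h)] = (unsigned char)(k + 1);
	}
}

/* Validate, then advance. relaxed selects '>' (v1) over '>=' (original). */
static bool next_pkt(bool relaxed)
{
	struct pkthdr h;
	size_t end = cur + sizeof(h);

	if (relaxed ? end > map_size : end >= map_size)
		return false;
	memcpy(&h, map + cur, sizeof(h));
	if (h.caplen == 0 || h.len == 0 || h.caplen > h.len)
		return false;
	end += h.caplen;
	if (relaxed ? end > map_size : end >= map_size)
		return false;
	cur = end;
	return true;
}

struct run_result { unsigned passes; bool past_end; };

/* Counts loop bodies executed and flags a header read past the map. */
static struct run_result run_all(bool relaxed)
{
	struct run_result r = { 0, false };

	build();
	cur = 0;
	do {
		assert(cur <= map_size);
		if (cur + sizeof(struct pkthdr) > map_size) {
			r.past_end = true;	/* real code would read hdr->caplen here */
			break;
		}
		r.passes++;
	} while (next_pkt(relaxed));
	return r;
}
```

With the original '>=' the loop body runs twice and stops cleanly. With
the relaxed '>' the second next_pkt() call returns true with the cursor
at map_size, so a third loop body is entered with no header left to read.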
* [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, fix off-by-one in cmd_select
2026-04-29 8:44 ` [PATCH v2] bpf: bpf_dbg: fix off-by-one in cmd_select Hasan Basbunar
@ 2026-04-29 12:35 ` Hasan Basbunar
2026-04-29 13:13 ` bot+bpf-ci
0 siblings, 1 reply; 4+ messages in thread
From: Hasan Basbunar @ 2026-04-29 12:35 UTC
To: Daniel Borkmann
Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, linux-kernel,
Hasan Basbunar
bpf_dbg's interactive 'select <N>' command, documented in the file
header ("select 3 (run etc will start from the 3rd packet in the
pcap)") to use 1-based packet indexing, advances the pcap cursor one
packet too many. The loop in cmd_select():
pcap_reset_pkt(); /* cursor on packet 1 */
for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
/* noop */;
calls pcap_next_pkt() N times to reach packet N, but pcap_next_pkt()
validates the packet at the cursor and then advances past it. After
N calls the cursor is on packet N+1, so 'select 3' positions on
packet 4, 'select 4' on packet 5, etc.
Simply changing the loop init to 'i = 1' (so it advances N-1 times)
fixes the user-visible symptom but leaves the final landed-on packet
unvalidated and, combined with pcap_next_pkt()'s '>=' boundary
checks, mishandles the boundary cases at the last packet and just
past it. As pointed out by the Sashiko AI review on v1 and v2, this
surfaces in two ways:
1. On a perfect pcap (no trailing bytes after the last packet),
pcap_next_pkt()'s '>= pcap_map_size' rejects packets whose body
ends exactly at the file boundary, so 'select N' on an N-packet
file errors as "no packet #N available" even though the packet
is fully in-bounds.
2. On a truncated pcap (filehdr + a few stray bytes that happen to
pass try_load_pcap()'s 'pcap_map_size > sizeof(filehdr)' guard
but not enough to contain a full pkthdr), 'select 1' returns
CMD_OK without ever validating the header, and a subsequent
'step' or 'run' dereferences pcap_curr_pkt()->caplen past the
mapped region.
Fix all three issues by splitting pcap_next_pkt() into a pure
validator (pcap_curr_pkt_valid()) and a validate-advance-validate
combinator. The boundary check now uses '>' instead of '>=', so a
packet whose body ends exactly at pcap_map_size is correctly accepted.
pcap_next_pkt() returns true only when both the current packet was
valid and, after advancing, the new cursor position is also valid.
This means the do-while in cmd_run() exits cleanly after the last
packet (no past-end dereference), and cmd_select() can call
pcap_curr_pkt_valid() after the loop to bounds-check the final
packet.
Reproduction (deterministic, no kernel needed): build bpf_dbg from
the unmodified tree, synthesize a pcap with N>=2 packets each with a
distinct payload byte, and drive 'select 1 / step 1 / quit'. Before
this fix, 'select 1' shows packet 2's payload. After this fix,
'select K' shows packet K for all K in 1..N, 'select N+1' correctly
errors with "no packet #N+1 available!", and 'select 1' on a pcap
truncated to filehdr + 1 byte also correctly errors.
Cloudflare's downstream mirror at github.com/cloudflare/bpftools
carries the same defect.
Fixes: fd981e3c321a ("filter: bpf_dbg: add minimal bpf debugger")
Signed-off-by: Hasan Basbunar <basbunarhasan@gmail.com>
---
Changes in v3:
- Split pcap_next_pkt() into pcap_curr_pkt_valid() (pure validator)
and pcap_next_pkt() (validate-current, advance, validate-new).
- Boundary check now uses '>' instead of '>='; a packet whose body
ends exactly at pcap_map_size is correctly accepted.
- cmd_select() validates the final landed-on packet via
pcap_curr_pkt_valid() instead of the dead
`pcap_curr_pkt() == NULL` check.
- Empirically verified in a clean Debian container (gcc -Wall -O0)
against:
* 5-packet pcap, select K for K in 1..6 (5 successes + 1 error
on K=6, payload byte matches K per the file header docs);
* 1-packet pcap, select 1 (succeeds), select 2 (errors);
* truncated pcap (filehdr + 1 byte), select 1 errors cleanly
without dereferencing past the mapped region;
* `run` after `select 3` on a 5-packet pcap processes exactly
3 packets and exits cleanly without past-end deref.
- Addresses both review concerns raised by Sashiko AI on v1 and v2.
- v1: https://lore.kernel.org/bpf/20260428100109.56572-1-basbunarhasan@gmail.com/
v2: https://lore.kernel.org/bpf/20260429084441.22089-1-basbunarhasan@gmail.com/
tools/bpf/bpf_dbg.c | 19 ++++++++++++++-----
1 file changed, 14 insertions(+), 5 deletions(-)
diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
index 4895602ab37d..db12d2f8fb73 100644
--- a/tools/bpf/bpf_dbg.c
+++ b/tools/bpf/bpf_dbg.c
@@ -918,21 +918,30 @@ static struct pcap_pkthdr *pcap_curr_pkt(void)
return (void *) pcap_ptr_va_curr;
}
-static bool pcap_next_pkt(void)
+static bool pcap_curr_pkt_valid(void)
{
struct pcap_pkthdr *hdr = pcap_curr_pkt();
if (pcap_ptr_va_curr + sizeof(*hdr) -
- pcap_ptr_va_start >= pcap_map_size)
+ pcap_ptr_va_start > pcap_map_size)
return false;
if (hdr->caplen == 0 || hdr->len == 0 || hdr->caplen > hdr->len)
return false;
if (pcap_ptr_va_curr + sizeof(*hdr) + hdr->caplen -
- pcap_ptr_va_start >= pcap_map_size)
+ pcap_ptr_va_start > pcap_map_size)
return false;
+ return true;
+}
+
+static bool pcap_next_pkt(void)
+{
+ struct pcap_pkthdr *hdr;
+ if (!pcap_curr_pkt_valid())
+ return false;
+ hdr = pcap_curr_pkt();
pcap_ptr_va_curr += (sizeof(*hdr) + hdr->caplen);
- return true;
+ return pcap_curr_pkt_valid();
}
static void pcap_reset_pkt(void)
@@ -1143,7 +1152,7 @@ static int cmd_select(char *num)
for (i = 1; i < which && (have_next = pcap_next_pkt()); i++)
/* noop */;
- if (!have_next || pcap_curr_pkt() == NULL) {
+ if (!have_next || !pcap_curr_pkt_valid()) {
rl_printf("no packet #%u available!\n", which);
pcap_reset_pkt();
return CMD_ERR;
--
2.53.0
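The split described in this message (a pure validator plus a
validate, advance, validate step) can be exercised on a synthetic buffer.
The following is a sketch under assumptions, not the patched bpf_dbg:
struct pkthdr, map, cur, build(), curr_pkt_valid(), next_pkt(),
select_payload() and run_count() are hypothetical names, the pcap file
header is omitted, each packet has one payload byte, and have_next is
assumed to start out true.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for struct pcap_pkthdr (timestamp omitted). */
struct pkthdr { uint32_t caplen, len; };

#define MAX_PKTS 8
#define REC_SIZE (sizeof(struct pkthdr) + 1)	/* header + 1 payload byte */

static unsigned char map[MAX_PKTS * REC_SIZE + 1];
static size_t map_size, cur;

/* n one-byte packets (payload of packet k is k) plus stray trailing bytes. */
static void build(unsigned n, size_t stray)
{
	assert(n <= MAX_PKTS && stray <= 1);
	memset(map, 0, sizeof(map));
	for (unsigned k = 0; k < n; k++) {
		struct pkthdr h = { 1, 1 };

		memcpy(map + k * REC_SIZE, &h, sizeof(h));
		map[k * REC_SIZE + sizeof(h)] = (unsigned char)(k + 1);
	}
	map_size = n * REC_SIZE + stray;
}

/* Pure validator: header and body must fit, '>' accepts an exact fit. */
static bool curr_pkt_valid(void)
{
	struct pkthdr h;

	if (cur + sizeof(h) > map_size)
		return false;
	memcpy(&h, map + cur, sizeof(h));
	if (h.caplen == 0 || h.len == 0 || h.caplen > h.len)
		return false;
	return cur + sizeof(h) + h.caplen <= map_size;
}

/* Validate the current packet, advance, validate the new position. */
static bool next_pkt(void)
{
	struct pkthdr h;

	if (!curr_pkt_valid())
		return false;
	memcpy(&h, map + cur, sizeof(h));
	cur += sizeof(h) + h.caplen;
	return curr_pkt_valid();
}

/* Payload byte of packet `which` (1-based), or -1 if unavailable. */
static int select_payload(unsigned which)
{
	bool have_next = true;	/* assumed initial value */
	unsigned i;

	cur = 0;		/* models pcap_reset_pkt() */
	for (i = 1; i < which && (have_next = next_pkt()); i++)
		/* noop */;
	if (!have_next || !curr_pkt_valid())
		return -1;
	return map[cur + sizeof(struct pkthdr)];
}

/* Packets a run-style do/while would process from the current cursor. */
static unsigned run_count(void)
{
	unsigned passes = 0;

	do {
		passes++;
	} while (next_pkt());
	return passes;
}
```

In this model, build(5, 0) followed by select_payload(K) returns K for K
in 1..5 and -1 for K = 6, run_count() after select_payload(3) returns 3,
and build(0, 1) (a single stray byte) makes select_payload(1) return -1
without reading a header.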
* Re: [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, fix off-by-one in cmd_select
2026-04-29 12:35 ` [PATCH v3] bpf: bpf_dbg: split pcap_next_pkt() validation/advance, " Hasan Basbunar
@ 2026-04-29 13:13 ` bot+bpf-ci
0 siblings, 0 replies; 4+ messages in thread
From: bot+bpf-ci @ 2026-04-29 13:13 UTC
To: basbunarhasan, daniel
Cc: ast, andrii, bpf, linux-kernel, basbunarhasan, ast, andrii,
daniel, martin.lau, eddyz87, yonghong.song, clm, ihor.solodrai
> diff --git a/tools/bpf/bpf_dbg.c b/tools/bpf/bpf_dbg.c
> --- a/tools/bpf/bpf_dbg.c
> +++ b/tools/bpf/bpf_dbg.c
> @@ -918,21 +918,30 @@ static struct pcap_pkthdr *pcap_curr_pkt(void)
> return (void *) pcap_ptr_va_curr;
> }
>
> -static bool pcap_next_pkt(void)
> +static bool pcap_curr_pkt_valid(void)
> {
> struct pcap_pkthdr *hdr = pcap_curr_pkt();
>
> if (pcap_ptr_va_curr + sizeof(*hdr) -
> - pcap_ptr_va_start >= pcap_map_size)
> + pcap_ptr_va_start > pcap_map_size)
> return false;
> if (hdr->caplen == 0 || hdr->len == 0 || hdr->caplen > hdr->len)
> return false;
> if (pcap_ptr_va_curr + sizeof(*hdr) + hdr->caplen -
> - pcap_ptr_va_start >= pcap_map_size)
> + pcap_ptr_va_start > pcap_map_size)
> return false;
> + return true;
> +}
>
> +static bool pcap_next_pkt(void)
> +{
> + struct pcap_pkthdr *hdr;
> +
> + if (!pcap_curr_pkt_valid())
> + return false;
> + hdr = pcap_curr_pkt();
> pcap_ptr_va_curr += (sizeof(*hdr) + hdr->caplen);
> - return true;
> + return pcap_curr_pkt_valid();
> }
>
> static void pcap_reset_pkt(void)
> @@ -1143,7 +1152,7 @@ static int cmd_select(char *num)
>
> for (i = 0; i < which && (have_next = pcap_next_pkt()); i++)
> /* noop */;
> - if (!have_next || pcap_curr_pkt() == NULL) {
> + if (!have_next || !pcap_curr_pkt_valid()) {
The commit message asserts that after this fix, 'select K shows packet K
for all K in 1..N' and 'select N+1 correctly errors with no packet #N+1
available!'. Does the actual code achieve this behavior given that the
loop still initializes i to 0?
With the new pcap_next_pkt() semantics (validate-current, advance,
validate-new), trace through 'select 1' on a file with >=2 packets:
Iteration 1 (i=0):
pcap_next_pkt() validates pkt1 (ok)
advances cursor to pkt2
validates pkt2 (ok)
returns true
i becomes 1
Loop exit (i=1, i<1 is false)
Cursor is now on pkt2
pcap_curr_pkt_valid() returns true
Function returns CMD_OK
The user requested packet 1, but subsequent 'step' or 'run' operates on
pkt2. This reproduces the exact defect the commit claims to fix: 'Before
this fix, select 1 shows packet 2's payload.'
Similarly, trace 'select 3' on a file with exactly 3 packets:
Iteration 1 (i=0): advances pkt1 -> pkt2, validates pkt2 (ok), i=1
Iteration 2 (i=1): advances pkt2 -> pkt3, validates pkt3 (ok), i=2
Iteration 3 (i=2): pkt3 valid, advances past pkt3 (cursor at EOF),
post-advance pcap_curr_pkt_valid() is false,
pcap_next_pkt() returns false, have_next=false
Loop exits with have_next=false
Code hits 'no packet #3 available!' even though packet 3 exists
The commit message itself identifies the correct fix: "Simply changing the
loop init to 'i = 1' (so it advances N-1 times) fixes the user-visible
symptom". That change was not applied in the diff. The loop still uses
'i = 0', which causes it to advance N times instead of N-1 times, landing
on packet N+1 when the user requested packet N.
The v3 patch discussion on lore.kernel.org/bpf addressed this by changing
the loop from 'for (i = 0; i < which ...)' to 'for (i = 1; i < which ...)'
to properly implement 1-based indexing. The current code has the
pcap_curr_pkt_valid() refactoring but is missing the loop initialization
fix.
Reference: https://lore.kernel.org/bpf/20260429123543.61559-1-basbunarhasan@gmail.com/
> rl_printf("no packet #%u available!\n", which);
> pcap_reset_pkt();
> return CMD_ERR;
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/25109638113
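The two traces in this review can be replayed with an index-level model.
This is only a sketch: struct model, curr_valid(), next_pkt() and
select_pkt() are hypothetical names, the cursor is a 1-based packet
number rather than a pointer, and position n_pkts + 1 stands for
end-of-file. Note that the context line of the v3 diff as posted shows
'i = 1' (v3 is based on v2), whereas the hunk quoted in this review shows
'i = 0', so the loop start is a parameter here and both cases can be
compared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: cursor is a 1-based packet number. */
struct model { unsigned n_pkts; unsigned cursor; };

static bool curr_valid(const struct model *m)
{
	return m->cursor >= 1 && m->cursor <= m->n_pkts;
}

/* v3 semantics: validate current, advance, validate the new position. */
static bool next_pkt(struct model *m)
{
	if (!curr_valid(m))
		return false;
	m->cursor++;
	return curr_valid(m);
}

/* Returns the packet number landed on, or 0 for "no packet available". */
static unsigned select_pkt(struct model *m, unsigned which, unsigned loop_start)
{
	bool have_next = true;	/* assumed initial value */
	unsigned i;

	assert(which >= 1);	/* 1-based indexing */
	m->cursor = 1;		/* models pcap_reset_pkt() */
	for (i = loop_start; i < which && (have_next = next_pkt(m)); i++)
		/* noop */;
	if (!have_next || !curr_valid(m))
		return 0;
	return m->cursor;
}
```

For a 3-packet model, select_pkt(&m, 1, 0) lands on packet 2 and
select_pkt(&m, 3, 0) reports failure, matching the traces above. With
loop_start = 1 the same calls land on packets 1 and 3, and a request for
packet 4 fails.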