[PATCH v3] bpf, sockmap: keep sk_msg copy state in sync

Netdev List

* [PATCH v3] bpf, sockmap: keep sk_msg copy state in sync
@ 2026-05-20 10:27 Zhang Cen
  2026-05-20 11:09 ` bot+bpf-ci
  2026-05-20 16:00 ` John Fastabend
  0 siblings, 2 replies; 4+ messages in thread
From: Zhang Cen @ 2026-05-20 10:27 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, John Fastabend, Stanislav Fomichev,
	Jakub Sitnicki, David S . Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Simon Horman
  Cc: bpf, netdev, linux-kernel, zerocling0077, 2045gemini, Zhang Cen,
	stable

SK_MSG uses msg->sg.copy as per-scatterlist-entry provenance. Entries
with this bit set are copied before data/data_end are exposed to SK_MSG
BPF programs for direct packet access.

bpf_msg_pull_data(), bpf_msg_push_data(), and bpf_msg_pop_data()
rewrite the sk_msg scatterlist ring by collapsing, splitting, and
shifting entries. These operations move msg->sg.data[] entries, but the
parallel copy bitmap can be left behind on the old slot. A copied entry
can then return to msg->sg.start with its copy bit clear and be exposed
as directly writable packet data.

This corruption path requires an attached SK_MSG BPF program that calls
the mutating helpers; ordinary sockmap/TLS traffic that never runs
push/pop/pull helper sequences is not affected.

Keep msg->sg.copy synchronized with scatterlist entry moves, preserve
the copy bit when an entry is split, clear it when a helper replaces an
entry with a private page, and clear slots vacated by pull-data
compaction.

Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
Fixes: 6fff607e2f14 ("bpf: sk_msg program helper bpf_msg_push_data")
Fixes: 7246d8ed4dcc ("bpf: helper to pop data from messages")
Cc: stable@vger.kernel.org
Co-developed-by: Han Guidong <2045gemini@gmail.com>
Signed-off-by: Han Guidong <2045gemini@gmail.com>
Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
---
v3:
- Refactor copy-bit helpers per John Fastabend's review: encapsulate scatterlist element moves alongside their copy state into a unified helper, and streamline bit-clearing operations.
- Clarify in commit log that only programs using push/pop/pull are affected.
- Note: The additional edge cases reported by Sashiko-bot and the addition of BPF selftests will be addressed in a separate follow-up patch series to expedite this core fix.

v2:
Address Sashiko-bot's feedback by clearing msg->sg.copy for every entry consumed by bpf_msg_pull_data() before compacting the scatterlist ring, preventing stale copy bits on collapsed tail entries.

v1:
While researching recent page cache bugs, we discovered this bug.
We confirmed it allows overwriting the page cache of read-only files
via splice(). We haven't attempted to write an exploit, but the
corruption primitive is verified. PoC available upon request.
Recommend fixing ASAP.
---

 net/core/filter.c | 88 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 83 insertions(+), 5 deletions(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 78b548158fb05..1be8fc750a1a1 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2650,6 +2650,38 @@ static void sk_msg_reset_curr(struct sk_msg *msg)
 	}
 }
 
+static bool sk_msg_elem_is_copy(const struct sk_msg *msg, u32 i)
+{
+	return test_bit(i, msg->sg.copy);
+}
+
+static void sk_msg_clear_elem_copy(struct sk_msg *msg, u32 i)
+{
+	__clear_bit(i, msg->sg.copy);
+}
+
+static void sk_msg_set_elem_copy(struct sk_msg *msg, u32 i)
+{
+	__set_bit(i, msg->sg.copy);
+}
+
+static void sk_msg_clear_copy_range(struct sk_msg *msg, u32 start, u32 end)
+{
+	while (start != end) {
+		sk_msg_clear_elem_copy(msg, start);
+		sk_msg_iter_var_next(start);
+	}
+}
+
+static void sk_msg_sg_move(struct sk_msg *msg, u32 dst, u32 src)
+{
+	msg->sg.data[dst] = msg->sg.data[src];
+	if (sk_msg_elem_is_copy(msg, src))
+		sk_msg_set_elem_copy(msg, dst);
+	else
+		sk_msg_clear_elem_copy(msg, dst);
+}
+
 static const struct bpf_func_proto bpf_msg_cork_bytes_proto = {
 	.func           = bpf_msg_cork_bytes,
 	.gpl_only       = false,
@@ -2688,7 +2720,7 @@ BPF_CALL_4(bpf_msg_pull_data, struct sk_msg *, msg, u32, start,
 	 * account for the headroom.
 	 */
 	bytes_sg_total = start - offset + bytes;
-	if (!test_bit(i, msg->sg.copy) && bytes_sg_total <= len)
+	if (!sk_msg_elem_is_copy(msg, i) && bytes_sg_total <= len)
 		goto out;
 
 	/* At this point we need to linearize multiple scatterlist
@@ -2734,6 +2766,7 @@ BPF_CALL_4(bpf_msg_pull_data, struct sk_msg *, msg, u32, start,
 	} while (i != last_sge);
 
 	sg_set_page(&msg->sg.data[first_sge], page, copy, 0);
+	sk_msg_clear_elem_copy(msg, first_sge);
 
 	/* To repair sg ring we need to shift entries. If we only
 	 * had a single entry though we can just replace it and
@@ -2743,8 +2776,14 @@ BPF_CALL_4(bpf_msg_pull_data, struct sk_msg *, msg, u32, start,
 	shift = last_sge > first_sge ?
 		last_sge - first_sge - 1 :
 		NR_MSG_FRAG_IDS - first_sge + last_sge - 1;
-	if (!shift)
+	if (!shift) {
+		sk_msg_clear_elem_copy(msg, msg->sg.end);
 		goto out;
+	}
+
+	i = first_sge;
+	sk_msg_iter_var_next(i);
+	sk_msg_clear_copy_range(msg, i, last_sge);
 
 	i = first_sge;
 	sk_msg_iter_var_next(i);
@@ -2758,16 +2797,18 @@ BPF_CALL_4(bpf_msg_pull_data, struct sk_msg *, msg, u32, start,
 		if (move_from == msg->sg.end)
 			break;
 
-		msg->sg.data[i] = msg->sg.data[move_from];
+		sk_msg_sg_move(msg, i, move_from);
 		msg->sg.data[move_from].length = 0;
 		msg->sg.data[move_from].page_link = 0;
 		msg->sg.data[move_from].offset = 0;
+		sk_msg_clear_elem_copy(msg, move_from);
 		sk_msg_iter_var_next(i);
 	} while (1);
 
 	msg->sg.end = msg->sg.end - shift > msg->sg.end ?
 		      msg->sg.end - shift + NR_MSG_FRAG_IDS :
 		      msg->sg.end - shift;
+	sk_msg_clear_elem_copy(msg, msg->sg.end);
 out:
 	sk_msg_reset_curr(msg);
 	msg->data = sg_virt(&msg->sg.data[first_sge]) + start - offset;
@@ -2790,6 +2831,8 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
 {
 	struct scatterlist sge, nsge, nnsge, rsge = {0}, *psge;
 	u32 new, i = 0, l = 0, space, copy = 0, offset = 0;
+	bool sge_copy = false, nsge_copy = false, nnsge_copy = false;
+	bool rsge_copy = false;
 	u8 *raw, *to, *from;
 	struct page *page;
 
@@ -2862,6 +2905,7 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
 			sk_msg_iter_var_prev(i);
 		psge = sk_msg_elem(msg, i);
 		rsge = sk_msg_elem_cpy(msg, i);
+		rsge_copy = sk_msg_elem_is_copy(msg, i);
 
 		psge->length = start - offset;
 		rsge.length -= psge->length;
@@ -2887,23 +2931,34 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
 	/* Shift one or two slots as needed */
 	sge = sk_msg_elem_cpy(msg, new);
 	sg_unmark_end(&sge);
+	sge_copy = sk_msg_elem_is_copy(msg, new);
 
 	nsge = sk_msg_elem_cpy(msg, i);
+	nsge_copy = sk_msg_elem_is_copy(msg, i);
 	if (rsge.length) {
 		sk_msg_iter_var_next(i);
 		nnsge = sk_msg_elem_cpy(msg, i);
+		nnsge_copy = sk_msg_elem_is_copy(msg, i);
 		sk_msg_iter_next(msg, end);
 	}
 
 	while (i != msg->sg.end) {
 		msg->sg.data[i] = sge;
+		if (sge_copy)
+			sk_msg_set_elem_copy(msg, i);
+		else
+			sk_msg_clear_elem_copy(msg, i);
 		sge = nsge;
+		sge_copy = nsge_copy;
 		sk_msg_iter_var_next(i);
 		if (rsge.length) {
 			nsge = nnsge;
+			nsge_copy = nnsge_copy;
 			nnsge = sk_msg_elem_cpy(msg, i);
+			nnsge_copy = sk_msg_elem_is_copy(msg, i);
 		} else {
 			nsge = sk_msg_elem_cpy(msg, i);
+			nsge_copy = sk_msg_elem_is_copy(msg, i);
 		}
 	}
 
@@ -2911,13 +2966,18 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
 	/* Place newly allocated data buffer */
 	sk_mem_charge(msg->sk, len);
 	msg->sg.size += len;
-	__clear_bit(new, msg->sg.copy);
+	sk_msg_clear_elem_copy(msg, new);
 	sg_set_page(&msg->sg.data[new], page, len + copy, 0);
 	if (rsge.length) {
 		get_page(sg_page(&rsge));
 		sk_msg_iter_var_next(new);
 		msg->sg.data[new] = rsge;
+		if (rsge_copy)
+			sk_msg_set_elem_copy(msg, new);
+		else
+			sk_msg_clear_elem_copy(msg, new);
 	}
+	sk_msg_clear_elem_copy(msg, msg->sg.end);
 
 	sk_msg_reset_curr(msg);
 	sk_msg_compute_data_pointers(msg);
@@ -2943,27 +3003,38 @@ static void sk_msg_shift_left(struct sk_msg *msg, int i)
 	do {
 		prev = i;
 		sk_msg_iter_var_next(i);
-		msg->sg.data[prev] = msg->sg.data[i];
+		sk_msg_sg_move(msg, prev, i);
 	} while (i != msg->sg.end);
 
 	sk_msg_iter_prev(msg, end);
+	sk_msg_clear_elem_copy(msg, msg->sg.end);
 }
 
 static void sk_msg_shift_right(struct sk_msg *msg, int i)
 {
 	struct scatterlist tmp, sge;
+	bool tmp_copy, sge_copy;
 
 	sk_msg_iter_next(msg, end);
 	sge = sk_msg_elem_cpy(msg, i);
+	sge_copy = sk_msg_elem_is_copy(msg, i);
 	sk_msg_iter_var_next(i);
 	tmp = sk_msg_elem_cpy(msg, i);
+	tmp_copy = sk_msg_elem_is_copy(msg, i);
 
 	while (i != msg->sg.end) {
 		msg->sg.data[i] = sge;
+		if (sge_copy)
+			sk_msg_set_elem_copy(msg, i);
+		else
+			sk_msg_clear_elem_copy(msg, i);
 		sk_msg_iter_var_next(i);
 		sge = tmp;
+		sge_copy = tmp_copy;
 		tmp = sk_msg_elem_cpy(msg, i);
+		tmp_copy = sk_msg_elem_is_copy(msg, i);
 	}
+	sk_msg_clear_elem_copy(msg, msg->sg.end);
 }
 
 BPF_CALL_4(bpf_msg_pop_data, struct sk_msg *, msg, u32, start,
@@ -3020,8 +3091,10 @@ BPF_CALL_4(bpf_msg_pop_data, struct sk_msg *, msg, u32, start,
 	 */
 	if (start != offset) {
 		struct scatterlist *nsge, *sge = sk_msg_elem(msg, i);
+		u32 sge_idx = i;
 		int a = start - offset;
 		int b = sge->length - pop - a;
+		bool sge_copy = sk_msg_elem_is_copy(msg, sge_idx);
 
 		sk_msg_iter_var_next(i);
 
@@ -3034,6 +3107,10 @@ BPF_CALL_4(bpf_msg_pop_data, struct sk_msg *, msg, u32, start,
 				sg_set_page(nsge,
 					    sg_page(sge),
 					    b, sge->offset + pop + a);
+				if (sge_copy)
+					sk_msg_set_elem_copy(msg, i);
+				else
+					sk_msg_clear_elem_copy(msg, i);
 			} else {
 				struct page *page, *orig;
 				u8 *to, *from;
@@ -3050,6 +3127,7 @@ BPF_CALL_4(bpf_msg_pop_data, struct sk_msg *, msg, u32, start,
 				memcpy(to, from, a);
 				memcpy(to + a, from + a + pop, b);
 				sg_set_page(sge, page, a + b, 0);
+				sk_msg_clear_elem_copy(msg, sge_idx);
 				put_page(orig);
 			}
 			pop = 0;
-- 
2.43.0
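
The stale-bit problem described in the commit log can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `toy_msg`, `toy_elem`, `NR_SLOTS`, and the `toy_*` functions are hypothetical stand-ins, not kernel definitions, and the bitmap is reduced to a single 32-bit word. It contrasts a move that leaves the copy bit on the old slot with one that carries it along, which is the behavior the patch's sk_msg_sg_move() aims for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_SLOTS 8u  /* hypothetical ring size, not the kernel's value */

struct toy_elem { uint32_t length; };

struct toy_msg {
	struct toy_elem data[NR_SLOTS];
	uint32_t copy;  /* bit i set => data[i] must be copied before exposure */
};

static bool toy_is_copy(const struct toy_msg *m, uint32_t i)
{
	return (m->copy >> i) & 1u;
}

/* Unsynchronized move: only the entry moves, the bit stays on the old slot. */
static void toy_move_unsynced(struct toy_msg *m, uint32_t dst, uint32_t src)
{
	assert(dst < NR_SLOTS && src < NR_SLOTS);
	m->data[dst] = m->data[src];
}

/* Synchronized move: the destination inherits the source's copy state. */
static void toy_move_synced(struct toy_msg *m, uint32_t dst, uint32_t src)
{
	assert(dst < NR_SLOTS && src < NR_SLOTS);
	m->data[dst] = m->data[src];
	if (toy_is_copy(m, src))
		m->copy |= 1u << dst;
	else
		m->copy &= ~(1u << dst);
}
```

In this model, moving a copy-marked entry from slot 3 to slot 2 with toy_move_unsynced() leaves slot 2 looking like private, directly writable data, while toy_move_synced() keeps it marked.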


* Re: [PATCH v3] bpf, sockmap: keep sk_msg copy state in sync
  2026-05-20 10:27 [PATCH v3] bpf, sockmap: keep sk_msg copy state in sync Zhang Cen
@ 2026-05-20 11:09 ` bot+bpf-ci
  2026-05-20 16:59   ` John Fastabend
  2026-05-20 16:00 ` John Fastabend
  1 sibling, 1 reply; 4+ messages in thread
From: bot+bpf-ci @ 2026-05-20 11:09 UTC
  To: rollkingzzc, ast, daniel, andrii, martin.lau, john.fastabend, sdf,
	jakub, davem, edumazet, kuba, pabeni, horms
  Cc: bpf, netdev, linux-kernel, zerocling0077, 2045gemini, rollkingzzc,
	stable, ast, andrii, daniel, martin.lau, eddyz87, yonghong.song,
	clm, ihor.solodrai


> diff --git a/net/core/filter.c b/net/core/filter.c
> index 9590877b0714..223d78708486 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c

[ ... ]

> @@ -2794,6 +2835,8 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>  {
>  	struct scatterlist sge, nsge, nnsge, rsge = {0}, *psge;
>  	u32 new, i = 0, l = 0, space, copy = 0, offset = 0;
> +	bool sge_copy = false, nsge_copy = false, nnsge_copy = false;
> +	bool rsge_copy = false;
>  	u8 *raw, *to, *from;
>  	struct page *page;
>
> @@ -2866,6 +2909,7 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>  			sk_msg_iter_var_prev(i);
>  		psge = sk_msg_elem(msg, i);
>  		rsge = sk_msg_elem_cpy(msg, i);
> +		rsge_copy = sk_msg_elem_is_copy(msg, i);
>
>  		psge->length = start - offset;
>  		rsge.length -= psge->length;

Could copy + len overflow and cause an out-of-bounds write? In
bpf_msg_push_data(), the len parameter is an unbounded u32 value passed
from a bpf program. When calculating the allocation size, copy + len can
overflow a 32-bit integer:

    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
                       get_order(copy + len));

If copy + len wraps around, alloc_pages() will allocate an undersized
buffer. Later, the destination pointer for shifting tail data is
calculated as:

    to = raw + front + len

Due to len being a large value, front + len wraps around and results in a
large positive 32-bit offset (zero-extended to 64-bit). The subsequent
memcpy(to, from, back) then performs a controlled out-of-bounds write well
past the allocated page.
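
The wraparound concern can be shown with a userspace sketch of an overflow-safe size check. It is not the kernel's fix, and the thread does not settle on one; `toy_push_len_ok` is a hypothetical helper name, and `uint32_t` stands in for the kernel's u32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns true and stores copy + len in *total
 * only when the sum fits in 32 bits; returns false if it would wrap.
 */
static bool toy_push_len_ok(uint32_t copy, uint32_t len, uint32_t *total)
{
	assert(total != NULL);
	if (len > UINT32_MAX - copy)
		return false;  /* copy + len would wrap around */
	*total = copy + len;
	return true;
}
```

Without such a check, copy = 4096 and len = UINT32_MAX - 10 sum to 4085 in 32-bit arithmetic, so an allocation sized from that sum would be far smaller than the requested len.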

Does bpf_msg_push_data() leak uninitialized kernel memory? The function
allocates a new memory page:

    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
                       get_order(copy + len));

Since the allocation does not include the __GFP_ZERO flag, the new len
bytes inserted into the message are never explicitly zeroed or initialized.
If the bpf program does not manually overwrite these bytes and the message
is transmitted over a socket, this could leak uninitialized kernel memory
to the network or userspace.

Could the ring buffer state become corrupted when data is appended exactly
at the end of the message and the ring buffer is full?

If data is appended exactly at the end of the message (start == offset +
l), the loop finishes with i pointing to the unused slot msg->sg.end. If
the ring buffer is full (!space), the coalescing fallback logic reads:

    copy = msg->sg.data[i].length

Because i points to an unused slot, its length is 0, making copy == 0, and
the fallback coalescing block is bypassed. The function then
unconditionally calls:

    sk_msg_iter_next(msg, end)

which increments msg->sg.end. Since the ring buffer was already full,
advancing msg->sg.end causes it to wrap around and equal msg->sg.start.
This could corrupt the ring buffer state and overwrite the first element
without calling put_page(), leading to page leaks.
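
The full-ring scenario can be sketched with a simplified userspace ring. The names (`toy_ring`, `TOY_NR_IDS`, `toy_*`) are hypothetical, and the model assumes one slot is always kept spare so that end == start means empty. Under that assumption, advancing end on a full ring makes the ring look empty, which matches the state described above.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_IDS 5u  /* hypothetical: 4 usable slots plus 1 spare */

struct toy_ring { uint32_t start, end; };

static uint32_t toy_next(uint32_t i)
{
	assert(i < TOY_NR_IDS);
	return (i + 1u) % TOY_NR_IDS;
}

/* Number of occupied slots between start and end, with wraparound. */
static uint32_t toy_used(const struct toy_ring *r)
{
	return r->end >= r->start ? r->end - r->start
				  : r->end + (TOY_NR_IDS - r->start);
}

static int toy_full(const struct toy_ring *r)
{
	return toy_used(r) == TOY_NR_IDS - 1u;
}

/* Guarded advance: refuses to move end onto start when the ring is full. */
static int toy_advance_end(struct toy_ring *r)
{
	if (toy_full(r))
		return -1;
	r->end = toy_next(r->end);
	return 0;
}
```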

[ ... ]

> @@ -2891,23 +2935,34 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>  	/* Shift one or two slots as needed */
>  	sge = sk_msg_elem_cpy(msg, new);
>  	sg_unmark_end(&sge);
> +	sge_copy = sk_msg_elem_is_copy(msg, new);
>
>  	nsge = sk_msg_elem_cpy(msg, i);
> +	nsge_copy = sk_msg_elem_is_copy(msg, i);
>  	if (rsge.length) {
>  		sk_msg_iter_var_next(i);
>  		nnsge = sk_msg_elem_cpy(msg, i);
> +		nnsge_copy = sk_msg_elem_is_copy(msg, i);
>  		sk_msg_iter_next(msg, end);
>  	}

Should rsge.offset be adjusted after splitting? When rsge is created
from a split entry, the code does not include:

    rsge.offset += start - offset;

to correctly adjust the offset. The rsge points to the tail portion of
the split entry but retains the original offset, which appears incorrect.
This could cause the residual entry to point to the wrong data within
the page.
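
The split arithmetic can be checked with a small userspace sketch. `toy_sge` and `toy_split` are hypothetical names and the struct carries only an offset and a length. The point is that after a split, the tail must begin where the head ends.

```c
#include <assert.h>
#include <stdint.h>

struct toy_sge { uint32_t offset, length; };

/*
 * Split *head at `cut` bytes. The head keeps the first `cut` bytes;
 * the returned tail covers the remainder of the same page region.
 */
static struct toy_sge toy_split(struct toy_sge *head, uint32_t cut)
{
	struct toy_sge tail = *head;

	assert(cut <= head->length);
	head->length = cut;
	tail.length -= cut;
	tail.offset += cut;  /* the adjustment the review asks about */
	return tail;
}
```

Dropping the `tail.offset += cut` line would leave the tail pointing at the same starting byte as the head, so both entries would describe overlapping data.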

[ ... ]

---
AI reviewed your patch. Please fix the bug or reply by email explaining why it is not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md

CI run summary: https://github.com/kernel-patches/bpf/actions/runs/26156967283


* Re: [PATCH v3] bpf, sockmap: keep sk_msg copy state in sync
  2026-05-20 10:27 [PATCH v3] bpf, sockmap: keep sk_msg copy state in sync Zhang Cen
  2026-05-20 11:09 ` bot+bpf-ci
@ 2026-05-20 16:00 ` John Fastabend
  1 sibling, 0 replies; 4+ messages in thread
From: John Fastabend @ 2026-05-20 16:00 UTC
  To: Zhang Cen
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Stanislav Fomichev, Jakub Sitnicki,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, bpf, netdev, linux-kernel, zerocling0077,
	2045gemini, stable

On Wed, May 20, 2026 at 06:27:15PM +0800, Zhang Cen wrote:
>SK_MSG uses msg->sg.copy as per-scatterlist-entry provenance. Entries
>with this bit set are copied before data/data_end are exposed to SK_MSG
>BPF programs for direct packet access.
>
>bpf_msg_pull_data(), bpf_msg_push_data(), and bpf_msg_pop_data()
>rewrite the sk_msg scatterlist ring by collapsing, splitting, and
>shifting entries. These operations move msg->sg.data[] entries, but the
>parallel copy bitmap can be left behind on the old slot. A copied entry
>can then return to msg->sg.start with its copy bit clear and be exposed
>as directly writable packet data.
>
>This corruption path requires an attached SK_MSG BPF program that calls
>the mutating helpers; ordinary sockmap/TLS traffic that never runs
>push/pop/pull helper sequences is not affected.
>
>Keep msg->sg.copy synchronized with scatterlist entry moves, preserve
>the copy bit when an entry is split, clear it when a helper replaces an
>entry with a private page, and clear slots vacated by pull-data
>compaction.
>
>Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
>Fixes: 6fff607e2f14 ("bpf: sk_msg program helper bpf_msg_push_data")
>Fixes: 7246d8ed4dcc ("bpf: helper to pop data from messages")
>Cc: stable@vger.kernel.org
>Co-developed-by: Han Guidong <2045gemini@gmail.com>
>Signed-off-by: Han Guidong <2045gemini@gmail.com>
>Signed-off-by: Zhang Cen <rollkingzzc@gmail.com>
>---

The bot reports are smaller fixups that we can add on top of this.

Reviewed-by: John Fastabend <john.fastabend@gmail.com>


* Re: [PATCH v3] bpf, sockmap: keep sk_msg copy state in sync
  2026-05-20 11:09 ` bot+bpf-ci
@ 2026-05-20 16:59   ` John Fastabend
  0 siblings, 0 replies; 4+ messages in thread
From: John Fastabend @ 2026-05-20 16:59 UTC
  To: bot+bpf-ci
  Cc: rollkingzzc, ast, daniel, andrii, martin.lau, sdf, jakub, davem,
	edumazet, kuba, pabeni, horms, bpf, netdev, linux-kernel,
	zerocling0077, 2045gemini, stable, martin.lau, eddyz87,
	yonghong.song, clm, ihor.solodrai

On Wed, May 20, 2026 at 11:09:36AM +0000, bot+bpf-ci@kernel.org wrote:
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index 9590877b0714..223d78708486 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>
>[ ... ]
>
>> @@ -2794,6 +2835,8 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>>  {
>>  	struct scatterlist sge, nsge, nnsge, rsge = {0}, *psge;
>>  	u32 new, i = 0, l = 0, space, copy = 0, offset = 0;
>> +	bool sge_copy = false, nsge_copy = false, nnsge_copy = false;
>> +	bool rsge_copy = false;
>>  	u8 *raw, *to, *from;
>>  	struct page *page;
>>
>> @@ -2866,6 +2909,7 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>>  			sk_msg_iter_var_prev(i);
>>  		psge = sk_msg_elem(msg, i);
>>  		rsge = sk_msg_elem_cpy(msg, i);
>> +		rsge_copy = sk_msg_elem_is_copy(msg, i);
>>
>>  		psge->length = start - offset;
>>  		rsge.length -= psge->length;

Zhang,

Did you want to push two smallish patches to fix these two? I have had
similar patches on my test system for a few days so they LGTM. These
two fixes

>
>    page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC |
>                       __GFP_COMP | __GFP_ZERO,
>                       get_order(copy + len));
>

and

>
>    rsge.offset += start - offset;
>

bot must be reading reviews because I called that out in v2.

[...]

>
>Could the ring buffer state become corrupted when data is appended exactly
>at the end of the message and the ring buffer is full?
>
>If data is appended exactly at the end of the message (start == offset +
>l), the loop finishes with i pointing to the unused slot msg->sg.end. If
>the ring buffer is full (!space), the coalescing fallback logic reads:
>
>    copy = msg->sg.data[i].length
>
>Because i points to an unused slot, its length is 0, making copy == 0, and
>the fallback coalescing block is bypassed. The function then
>unconditionally calls:
>
>    sk_msg_iter_next(msg, end)
>
>which increments msg->sg.end. Since the ring buffer was already full,
>advancing msg->sg.end causes it to wrap around and equal msg->sg.start.
>This could corrupt the ring buffer state and overwrite the first element
>without calling put_page(), leading to page leaks.

^^^

This one I haven't looked into yet. Let me know if you have time to get
to the above two issues this week; it would be great to get a v1 out at
least.

Thanks,
John

