public inbox for bpf@vger.kernel.org
* [PATCH bpf] bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data()
@ 2026-04-24 19:16 Weiming Shi
  2026-04-25  2:55 ` Jiayuan Chen
  2026-04-25 19:19 ` sashiko-bot
  0 siblings, 2 replies; 3+ messages in thread
From: Weiming Shi @ 2026-04-24 19:16 UTC (permalink / raw)
  To: Martin KaFai Lau, Daniel Borkmann, Alexei Starovoitov,
	Andrii Nakryiko, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	David S . Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni
  Cc: John Fastabend, Stanislav Fomichev, Song Liu, Yonghong Song,
	Jiri Olsa, Simon Horman, bpf, netdev, Xiang Mei, Weiming Shi,
	Xinyu Ma

When the scatterlist ring is full or nearly full, bpf_msg_push_data()
enters a copy fallback path and computes copy + len for the page
allocation size. Since len comes from BPF with arg3_type = ARG_ANYTHING
and both are u32, a crafted len can wrap the sum to a small value,
causing an undersized allocation followed by an out-of-bounds memcpy.

 BUG: unable to handle page fault for address: ffffed104089a402
 Oops: Oops: 0000 [#1] SMP KASAN NOPTI
 Call Trace:
  __asan_memcpy (mm/kasan/shadow.c:105)
  bpf_msg_push_data (net/core/filter.c:2852 net/core/filter.c:2788)
  bpf_prog_9ed8b5711920a7d7+0x2e/0x36
  sk_psock_msg_verdict (net/core/skmsg.c:934)
  tcp_bpf_sendmsg (net/ipv4/tcp_bpf.c:421 net/ipv4/tcp_bpf.c:584)
  __sys_sendto (net/socket.c:2206)
  do_syscall_64 (arch/x86/entry/syscall_64.c:94)
  entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

Add an overflow check before the allocation.

Link: https://lore.kernel.org/all/20260424155913.A19FDC19425@smtp.kernel.org
Fixes: 6fff607e2f14 ("bpf: sk_msg program helper bpf_msg_push_data")
Tested-by: Xiang Mei <xmei5@asu.edu>
Tested-by: Xinyu Ma <mmmxny@gmail.com>
Signed-off-by: Weiming Shi <bestswngs@gmail.com>
---
 net/core/filter.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/core/filter.c b/net/core/filter.c
index bc96c18df4e0..4f5a00ade2d3 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2820,6 +2820,9 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
 	if (!space || (space == 1 && start != offset))
 		copy = msg->sg.data[i].length;
 
+	if (unlikely(copy + len < copy))
+		return -EINVAL;
+
 	page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
 			   get_order(copy + len));
 	if (unlikely(!page))
-- 
2.43.0
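
For illustration only, the wraparound described in the commit message can
be reproduced in userspace. This is a minimal sketch, assuming uint32_t as
a stand-in for the kernel's u32; push_alloc_size(), push_size_check() and
push_size_example() are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Size the fallback path would request; u32 addition wraps modulo 2^32. */
static inline uint32_t push_alloc_size(uint32_t copy, uint32_t len)
{
	return (uint32_t)(copy + len);
}

/*
 * Same test as the patch: an unsigned sum that is smaller than one of its
 * operands has wrapped. Returns 0 if the size is usable, -EINVAL otherwise.
 */
static inline int push_size_check(uint32_t copy, uint32_t len)
{
	if ((uint32_t)(copy + len) < copy)
		return -EINVAL;
	return 0;
}

/* Worked example: 4096 + 0xfffff001 = 0x100000001, which truncates to 1. */
static inline void push_size_example(void)
{
	assert(push_alloc_size(4096, 0xfffff001u) == 1);
	assert(push_size_check(4096, 0xfffff001u) == -EINVAL);
	assert(push_size_check(4096, 100) == 0);
}
```

With copy = 4096 and len = 0xfffff001 the requested size collapses to a
single byte, which is the undersized allocation the check rejects.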



* Re: [PATCH bpf] bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data()
  2026-04-24 19:16 [PATCH bpf] bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data() Weiming Shi
@ 2026-04-25  2:55 ` Jiayuan Chen
  2026-04-25 19:19 ` sashiko-bot
  1 sibling, 0 replies; 3+ messages in thread
From: Jiayuan Chen @ 2026-04-25  2:55 UTC (permalink / raw)
  To: Weiming Shi, Martin KaFai Lau, Daniel Borkmann,
	Alexei Starovoitov, Andrii Nakryiko, Eduard Zingerman,
	Kumar Kartikeya Dwivedi, David S . Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: John Fastabend, Stanislav Fomichev, Song Liu, Yonghong Song,
	Jiri Olsa, Simon Horman, bpf, netdev, Xiang Mei, Xinyu Ma


On 4/25/26 3:16 AM, Weiming Shi wrote:
> When the scatterlist ring is full or nearly full, bpf_msg_push_data()
> enters a copy fallback path and computes copy + len for the page
> allocation size. Since len comes from BPF with arg3_type = ARG_ANYTHING
> and both are u32, a crafted len can wrap the sum to a small value,
> causing an undersized allocation followed by an out-of-bounds memcpy.
>
>   BUG: unable to handle page fault for address: ffffed104089a402
>   Oops: Oops: 0000 [#1] SMP KASAN NOPTI
>   Call Trace:
>    __asan_memcpy (mm/kasan/shadow.c:105)
>    bpf_msg_push_data (net/core/filter.c:2852 net/core/filter.c:2788)
>    bpf_prog_9ed8b5711920a7d7+0x2e/0x36
>    sk_psock_msg_verdict (net/core/skmsg.c:934)
>    tcp_bpf_sendmsg (net/ipv4/tcp_bpf.c:421 net/ipv4/tcp_bpf.c:584)
>    __sys_sendto (net/socket.c:2206)
>    do_syscall_64 (arch/x86/entry/syscall_64.c:94)
>    entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
>
> Add an overflow check before the allocation.
>
> Link: https://lore.kernel.org/all/20260424155913.A19FDC19425@smtp.kernel.org
> Fixes: 6fff607e2f14 ("bpf: sk_msg program helper bpf_msg_push_data")
> Tested-by: Xiang Mei <xmei5@asu.edu>
> Tested-by: Xinyu Ma <mmmxny@gmail.com>
> Signed-off-by: Weiming Shi <bestswngs@gmail.com>


Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev>




* Re: [PATCH bpf] bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data()
  2026-04-24 19:16 [PATCH bpf] bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data() Weiming Shi
  2026-04-25  2:55 ` Jiayuan Chen
@ 2026-04-25 19:19 ` sashiko-bot
  1 sibling, 0 replies; 3+ messages in thread
From: sashiko-bot @ 2026-04-25 19:19 UTC (permalink / raw)
  To: Weiming Shi; +Cc: bpf

Sashiko AI review found 5 potential issue(s):
- [Critical] Heap buffer overflow when appending to a full ring buffer
- [High] Out-of-bounds physical memory read due to incorrect rsge.offset calculation
- [High] Information leak due to uninitialized kernel memory in pushed data
- [High] State corruption because msg->sg.copy bitmap is not shifted
- [Medium] Memory limit bypass via unaccounted sk_mem_charge
--

commit f82297413325a738aacdfa77689e82663d5e3cc8
Author: Weiming Shi <bestswngs@gmail.com>

    bpf, sockmap: reject overflowing copy + len in bpf_msg_push_data()

    When the scatterlist ring is full or nearly full, bpf_msg_push_data()
    enters a copy fallback path and computes copy + len for the page
    allocation size. Since len comes from BPF with arg3_type = ARG_ANYTHING
    and both are u32, a crafted len can wrap the sum to a small value,
    causing an undersized allocation followed by an out-of-bounds memcpy.

    Link: https://lore.kernel.org/all/20260424155913.A19FDC19425@smtp.kernel.org

> diff --git a/net/core/filter.c b/net/core/filter.c
> index 5fa9189eb772b2..76965cc7ea2f22 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -2820,6 +2820,9 @@ BPF_CALL_4(bpf_msg_push_data, struct sk_msg *, msg, u32, start,
>  	if (!space || (space == 1 && start != offset))
>  		copy = msg->sg.data[i].length;

When appending data to the very end of a full ring buffer (space == 0 and
start == offset + l), the search loop terminates with i == msg->sg.end.
Does this assignment read a garbage length from the empty slot at
msg->sg.end? Since sk_msg_shift_left() does not clear slots when shifting
elements, this slot could contain the length of a previously used element.

>  
> +	if (unlikely(copy + len < copy))
> +		return -EINVAL;
> +
>  	page = alloc_pages(__GFP_NOWARN | GFP_ATOMIC | __GFP_COMP,
>  			   get_order(copy + len));

Since this page allocation does not use __GFP_ZERO, can this leak
uninitialized kernel memory? If copy == 0, the entire allocated len bytes
are added to the scatterlist payload without initialization. If copy > 0,
the existing data is copied to the front and back of the new page, but
the len bytes in the middle are left untouched.

Also, if copy read a garbage value from the empty slot above, can this
lead to a heap buffer overflow? Later in this function, i is correctly
stepped back to msg->sg.end - 1, and front is calculated as start - offset.
If the true length of the last element (front) exceeds the garbage length
read into copy, would the memcpy(raw, from, front) overflow the allocated
page?
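
The layout in question can be modelled in userspace. The sketch below is
an assumption-laden illustration, not the kernel code: calloc() stands in
for an allocation with __GFP_ZERO, and build_pushed_buf() and
build_pushed_buf_example() are hypothetical names. It shows the
[front | gap | back] arrangement with the len-byte gap zeroed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Userspace model of the fallback layout [front | len-byte gap | back].
 * calloc() stands in for a zeroing allocation, so the gap reads as zeroes
 * rather than stale heap contents. 'from' holds 'copy' bytes, of which the
 * first 'front' precede the insertion point. Returns NULL on bad input or
 * allocation failure; the caller frees the result.
 */
static inline uint8_t *build_pushed_buf(const uint8_t *from, uint32_t copy,
					 uint32_t front, uint32_t len)
{
	uint8_t *raw;

	if (front > copy || (uint32_t)(copy + len) < copy || copy + len == 0)
		return NULL;

	raw = calloc((size_t)copy + len, 1);
	if (!raw)
		return NULL;

	memcpy(raw, from, front);				/* front half */
	memcpy(raw + front + len, from + front, copy - front);	/* back half */
	return raw;
}

/* Example: push 4 bytes into "ABCDEFGH" after the third byte. */
static inline void build_pushed_buf_example(void)
{
	static const uint8_t expect[12] = { 'A', 'B', 'C', 0, 0, 0, 0,
					    'D', 'E', 'F', 'G', 'H' };
	uint8_t *raw = build_pushed_buf((const uint8_t *)"ABCDEFGH", 8, 3, 4);

	assert(raw && memcmp(raw, expect, sizeof(expect)) == 0);
	free(raw);
}
```

The model also rejects front > copy, which is the condition the garbage
length question above is concerned with.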

[ ... ]

Can this result in an out-of-bounds physical memory read? When inserting
data into the middle of a scatterlist element, the back half (rsge) is
advanced by start:

	if (start - offset) {
		if (i == msg->sg.end)
			sk_msg_iter_var_prev(i);
		psge = sk_msg_elem(msg, i);
		rsge = sk_msg_elem_cpy(msg, i);

		psge->length = start - offset;
		rsge.length -= psge->length;
		rsge.offset += start;

Since start is the absolute byte offset from the beginning of the entire
message, should this instead add the relative offset within the element
(start - offset, which equals psge->length)? Adding start advances
rsge.offset by an extra offset bytes, which might point past valid
physical page boundaries when the packet is transmitted.
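
A worked example of the arithmetic, as a userspace sketch. struct
sge_model, split_relative() and split_absolute() are hypothetical
stand-ins for the scatterlist element and the split logic; msg_off plays
the role of offset in the quoted code, and the 4096-byte element size is
an assumption chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct scatterlist: a window into one page. */
struct sge_model {
	uint32_t offset;	/* byte offset of the window within its page */
	uint32_t length;	/* bytes in the window */
};

/*
 * Split *elem at message position 'start'. 'msg_off' is the position of
 * the first byte of *elem within the message. The front part stays in
 * *elem; the back part is returned.
 */
static inline struct sge_model split_relative(struct sge_model *elem,
					       uint32_t msg_off, uint32_t start)
{
	struct sge_model back = *elem;
	uint32_t front = start - msg_off;

	assert(start >= msg_off && front <= elem->length);
	elem->length = front;
	back.length -= front;
	back.offset += front;	/* advance by the in-element distance */
	return back;
}

/* Same split, but advancing by the absolute 'start' as in the quoted code. */
static inline struct sge_model split_absolute(struct sge_model *elem,
					       uint32_t msg_off, uint32_t start)
{
	struct sge_model back = *elem;
	uint32_t front = start - msg_off;

	assert(start >= msg_off && front <= elem->length);
	elem->length = front;
	back.length -= front;
	back.offset += start;	/* over-advances by msg_off bytes */
	return back;
}
```

Take a 4096-byte element that begins at message position 4096 and split it
at start = 4196. The relative variant yields a back half covering bytes
100..4095 of the page; the absolute variant yields offset 4196 and an end
of 8192, well past the 4096 bytes the element originally described.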


When inserting a new element, the existing scatterlist elements are
shifted to the right:

	while (i != msg->sg.end) {
		msg->sg.data[i] = sge;
		sge = nsge;
		sk_msg_iter_var_next(i);
		...

Does this shifting loop need to correspondingly shift the msg->sg.copy
bitmap?

Without updating the bitmap, shifted elements will inherit the copy state
of the previous elements at those indices. If a read-only or zero-copy
element is shifted into a position where the copy bit is cleared, could
a BPF program be erroneously granted direct write access to it via
msg->data?
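
One way to picture the question is a toy model in which the per-slot copy
bits move together with the elements. Everything here is hypothetical:
struct ring_model is a linear array rather than the kernel's ring, and
model_insert() is not a kernel function. It only shows what "shifting the
bitmap correspondingly" would mean.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SLOTS 8

struct ring_model {
	uint32_t len[MODEL_SLOTS];	/* stand-in for msg->sg.data[] */
	uint32_t copy_bits;		/* bit n set: slot n is marked copy */
	unsigned int used;		/* slots in use, stored at 0..used-1 */
};

/* Insert an element at 'pos', moving later slots and their bits right. */
static inline int model_insert(struct ring_model *r, unsigned int pos,
			       uint32_t new_len, int new_copy_bit)
{
	uint32_t below, low, high;
	unsigned int n;

	if (pos > r->used || r->used == MODEL_SLOTS)
		return -1;

	for (n = r->used; n > pos; n--)
		r->len[n] = r->len[n - 1];
	r->len[pos] = new_len;
	r->used++;

	below = (1u << pos) - 1;
	low = r->copy_bits & below;	/* bits below pos stay put */
	high = r->copy_bits & ~below;	/* bits at or above pos move up */
	r->copy_bits = low | (high << 1) |
		       ((uint32_t)(new_copy_bit != 0) << pos);
	return 0;
}

/* Example: the copy bit of the 30-byte element follows it from slot 2 to 3. */
static inline void model_insert_example(void)
{
	struct ring_model r = { { 10, 20, 30 }, 1u << 2, 3 };

	assert(model_insert(&r, 1, 99, 0) == 0);
	assert(r.len[3] == 30 && r.copy_bits == (1u << 3));
}
```

If only the len[] loop ran, bit 2 would stay set and would now describe
the 20-byte element, while the 30-byte element in slot 3 would appear
writable.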


Is it possible to bypass memory limits here?

place_new:
	/* Place newly allocated data buffer */
	sk_mem_charge(msg->sk, len);
	msg->sg.size += len;

Since sk_mem_charge() only subtracts the memory size from
sk->sk_forward_alloc without validating limits, does this need a prior
call to sk_wmem_schedule() to verify and allocate memory against the
socket limits? Without it, could a BPF program repeatedly push data and
drive sk->sk_forward_alloc deeply negative?
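
The concern can be sketched with a toy accounting model. The names below
(struct sock_model, model_charge(), model_schedule()) are hypothetical,
and the budget field is an invented simplification; the kernel's actual
limit checks are more involved. The sketch only contrasts charging with
and without a prior reservation step.

```c
#include <assert.h>
#include <stdbool.h>

struct sock_model {
	int forward_alloc;	/* bytes reserved but not yet charged */
	int budget;		/* bytes the model may still reserve */
};

/* Charge unconditionally, as the review describes sk_mem_charge(). */
static inline void model_charge(struct sock_model *sk, int size)
{
	assert(size >= 0);
	sk->forward_alloc -= size;	/* no limit check */
}

/* Reserve enough for 'size' bytes first, failing once the budget is spent. */
static inline bool model_schedule(struct sock_model *sk, int size)
{
	int delta = size - sk->forward_alloc;

	assert(size >= 0);
	if (delta <= 0)
		return true;		/* already covered by the reservation */
	if (delta > sk->budget)
		return false;		/* would exceed the limit */
	sk->budget -= delta;
	sk->forward_alloc += delta;
	return true;
}
```

Starting from a budget of 100 bytes, two unguarded 60-byte charges leave
forward_alloc at -120. With a reservation before each charge, the first
push succeeds and the second is refused because only 40 bytes remain.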

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260424191602.1522411-3-bestswngs@gmail.com?part=1

