* [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
@ 2024-11-30 13:38 Levi Zim via B4 Relay
2024-11-30 13:38 ` [PATCH net 1/2] skmsg: return copied bytes in sk_msg_memcopy_from_iter Levi Zim via B4 Relay
` (4 more replies)
0 siblings, 5 replies; 18+ messages in thread
From: Levi Zim via B4 Relay @ 2024-11-30 13:38 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern
Cc: netdev, bpf, linux-kernel, Levi Zim
I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
test_sockmap.c triggers a kernel NULL pointer dereference:
BUG: kernel NULL pointer dereference, address: 0000000000000008
? __die_body+0x6e/0xb0
? __die+0x8b/0xa0
? page_fault_oops+0x358/0x3c0
? local_clock+0x19/0x30
? lock_release+0x11b/0x440
? kernelmode_fixup_or_oops+0x54/0x60
? __bad_area_nosemaphore+0x4f/0x210
? mmap_read_unlock+0x13/0x30
? bad_area_nosemaphore+0x16/0x20
? do_user_addr_fault+0x6fd/0x740
? prb_read_valid+0x1d/0x30
? exc_page_fault+0x55/0xd0
? asm_exc_page_fault+0x2b/0x30
? splice_to_socket+0x52e/0x630
? shmem_file_splice_read+0x2b1/0x310
direct_splice_actor+0x47/0x70
splice_direct_to_actor+0x133/0x300
? do_splice_direct+0x90/0x90
do_splice_direct+0x64/0x90
? __ia32_sys_tee+0x30/0x30
do_sendfile+0x214/0x300
__se_sys_sendfile64+0x8e/0xb0
__x64_sys_sendfile64+0x25/0x30
x64_sys_call+0xb82/0x2840
do_syscall_64+0x75/0x110
entry_SYSCALL_64_after_hwframe+0x4b/0x53
This is caused by tcp_bpf_sendmsg() returning a value (12289) larger than
size (8192), which causes the while loop in splice_to_socket() to release
an uninitialized pipe buffer.
The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
copies all requested bytes on success, but it may copy only part of
them.
This series changes sk_msg_memcopy_from_iter() to return the number of bytes
copied on success, and tcp_bpf_sendmsg() to use that count instead of
assuming all bytes get copied.
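The failure mode described above can be illustrated with a small userspace
sketch. It is only a model, not the kernel's splice_to_socket() code: the
struct demo_buf type and demo_consume() are hypothetical names, and the loop
merely counts how often it would try to release a slot that was never filled
once the reported byte count exceeds what was actually queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a pipe buffer slot, not struct pipe_buffer. */
struct demo_buf {
	size_t len;	/* bytes still queued in this slot */
	bool filled;	/* false models a slot that was never initialized */
};

/*
 * Consume `sent` bytes from the slots in order and return how many
 * never-filled slots the loop tried to release. In the kernel the first
 * such release would be the crash; here it is only counted.
 */
int demo_consume(struct demo_buf *bufs, int nbufs, size_t sent)
{
	int tail = 0, bad_releases = 0;

	assert(bufs != NULL && nbufs > 0);

	while (sent > 0 && tail < nbufs) {
		struct demo_buf *buf = &bufs[tail];
		size_t seg = sent < buf->len ? sent : buf->len;

		buf->len -= seg;
		sent -= seg;

		if (buf->len == 0) {
			if (!buf->filled)
				bad_releases++;	/* release of an empty slot */
			tail++;
		}
	}
	return bad_releases;
}
```

With two filled 4096-byte slots, a reported count of 8192 releases only the
filled slots, while 12289 walks on into slots that hold no data.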
Signed-off-by: Levi Zim <rsworktech@outlook.com>
---
Levi Zim (2):
skmsg: return copied bytes in sk_msg_memcopy_from_iter
tcp_bpf: fix copied value in tcp_bpf_sendmsg
net/core/skmsg.c | 5 +++--
net/ipv4/tcp_bpf.c | 8 ++++----
2 files changed, 7 insertions(+), 6 deletions(-)
---
base-commit: f1cd565ce57760923d5e0fbd9e9914b415c0620a
change-id: 20241130-tcp-bpf-sendmsg-ff3c9d84e693
Best regards,
--
Levi Zim <rsworktech@outlook.com>
* [PATCH net 1/2] skmsg: return copied bytes in sk_msg_memcopy_from_iter
2024-11-30 13:38 [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim via B4 Relay
@ 2024-11-30 13:38 ` Levi Zim via B4 Relay
2024-11-30 13:38 ` [PATCH net 2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg Levi Zim via B4 Relay
` (3 subsequent siblings)
4 siblings, 0 replies; 18+ messages in thread
From: Levi Zim via B4 Relay @ 2024-11-30 13:38 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern
Cc: netdev, bpf, linux-kernel, Levi Zim
From: Levi Zim <rsworktech@outlook.com>
Previously, sk_msg_memcopy_from_iter() returned the number of bytes copied
by the last copy_from_iter{,_nocache}() call on success.
This commit changes it to return the total number of bytes copied on
success.
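The difference between "bytes from the last call" and "total bytes" can be
sketched in userspace as follows. This is a simplified model under stated
assumptions, not the kernel function: struct demo_seg and
demo_copy_to_segs() are hypothetical, memcpy() stands in for
copy_from_iter(), and the destination is a plain array of fixed-capacity
segments rather than a scatterlist.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical destination segment, standing in for a scatterlist entry. */
struct demo_seg {
	char *data;
	size_t cap;
};

/*
 * Copy up to `bytes` from src into successive segments.
 * Returns the total number of bytes copied, or -ENOSPC if nothing
 * could be copied at all.
 */
int demo_copy_to_segs(struct demo_seg *segs, int nsegs,
		      const char *src, size_t bytes)
{
	int ret = -ENOSPC;	/* stays negative if no chunk was copied */
	size_t copied = 0;

	assert(src != NULL);

	for (int i = 0; i < nsegs && bytes > 0; i++) {
		size_t copy = bytes < segs[i].cap ? bytes : segs[i].cap;

		memcpy(segs[i].data, src + copied, copy);
		ret = (int)copy;	/* size of the most recent chunk only */
		bytes -= copy;
		copied += copy;
	}

	/* Returning ret here would report just the last chunk. */
	return ret < 0 ? ret : (int)copied;
}
```

Note that when the segments run out before `bytes` reaches zero, the call
still succeeds, and only the returned total tells the caller that the copy
was partial.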
Signed-off-by: Levi Zim <rsworktech@outlook.com>
---
net/core/skmsg.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index e90fbab703b2db1da49068b5a53338ce7ff99087..a65c2e64645863b80ddd94c464d86fcc587b5c04 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -369,8 +369,8 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from,
struct sk_msg *msg, u32 bytes)
{
int ret = -ENOSPC, i = msg->sg.curr;
+ u32 copy, buf_size, copied = 0;
struct scatterlist *sge;
- u32 copy, buf_size;
void *to;
do {
@@ -397,6 +397,7 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from,
goto out;
}
bytes -= copy;
+ copied += copy;
if (!bytes)
break;
msg->sg.copybreak = 0;
@@ -404,7 +405,7 @@ int sk_msg_memcopy_from_iter(struct sock *sk, struct iov_iter *from,
} while (i != msg->sg.end);
out:
msg->sg.curr = i;
- return ret;
+ return (ret < 0) ? ret : copied;
}
EXPORT_SYMBOL_GPL(sk_msg_memcopy_from_iter);
--
2.47.1
* [PATCH net 2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg
2024-11-30 13:38 [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim via B4 Relay
2024-11-30 13:38 ` [PATCH net 1/2] skmsg: return copied bytes in sk_msg_memcopy_from_iter Levi Zim via B4 Relay
@ 2024-11-30 13:38 ` Levi Zim via B4 Relay
2024-12-09 7:02 ` John Fastabend
2024-12-01 1:42 ` [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim
` (2 subsequent siblings)
4 siblings, 1 reply; 18+ messages in thread
From: Levi Zim via B4 Relay @ 2024-11-30 13:38 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern
Cc: netdev, bpf, linux-kernel, Levi Zim
From: Levi Zim <rsworktech@outlook.com>
bpf kselftest sockhash::test_txmsg_cork_hangs in test_sockmap.c triggers a
kernel NULL pointer dereference:
BUG: kernel NULL pointer dereference, address: 0000000000000008
? __die_body+0x6e/0xb0
? __die+0x8b/0xa0
? page_fault_oops+0x358/0x3c0
? local_clock+0x19/0x30
? lock_release+0x11b/0x440
? kernelmode_fixup_or_oops+0x54/0x60
? __bad_area_nosemaphore+0x4f/0x210
? mmap_read_unlock+0x13/0x30
? bad_area_nosemaphore+0x16/0x20
? do_user_addr_fault+0x6fd/0x740
? prb_read_valid+0x1d/0x30
? exc_page_fault+0x55/0xd0
? asm_exc_page_fault+0x2b/0x30
? splice_to_socket+0x52e/0x630
? shmem_file_splice_read+0x2b1/0x310
direct_splice_actor+0x47/0x70
splice_direct_to_actor+0x133/0x300
? do_splice_direct+0x90/0x90
do_splice_direct+0x64/0x90
? __ia32_sys_tee+0x30/0x30
do_sendfile+0x214/0x300
__se_sys_sendfile64+0x8e/0xb0
__x64_sys_sendfile64+0x25/0x30
x64_sys_call+0xb82/0x2840
do_syscall_64+0x75/0x110
entry_SYSCALL_64_after_hwframe+0x4b/0x53
This is caused by tcp_bpf_sendmsg() returning a value (12289) larger than
size (8192), which causes the while loop in splice_to_socket() to release
an uninitialized pipe buffer.
The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
copies all requested bytes on success, but it may copy only part of
them.
This commit changes it to use the number of bytes actually copied.
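The accounting change can be shown with a minimal userspace sketch. All
names here (demo_copy_some, demo_sent_assumed, demo_sent_actual) are
hypothetical, and the copier is a toy that accepts at most `room` bytes; it
only contrasts trusting the requested size with trusting the returned count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copier: accepts at most `room` bytes, reports how many. */
int demo_copy_some(size_t room, size_t want)
{
	return (int)(want < room ? want : room);
}

/* Old accounting: assume every requested byte was copied. */
size_t demo_sent_assumed(size_t room, size_t want)
{
	int err = demo_copy_some(room, want);

	assert(err >= 0);
	return want;		/* mirrors "copied += copy" */
}

/* New accounting: count only what the copier reported. */
size_t demo_sent_actual(size_t room, size_t want)
{
	int ret = demo_copy_some(room, want);

	assert(ret >= 0);
	return (size_t)ret;	/* mirrors "copied += ret" */
}
```

The two functions agree whenever the copier has room for the whole request
and diverge only on a partial copy, which is the case the commit message
describes.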
Signed-off-by: Levi Zim <rsworktech@outlook.com>
---
net/ipv4/tcp_bpf.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 370993c03d31363c0f82a003d9e5b0ca3bbed721..8e46c4d618cbbff0d120fe4cd917624e5d5cae15 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -496,7 +496,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
{
struct sk_msg tmp, *msg_tx = NULL;
- int copied = 0, err = 0;
+ int copied = 0, err = 0, ret = 0;
struct sk_psock *psock;
long timeo;
int flags;
@@ -539,14 +539,14 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
copy = msg_tx->sg.size - osize;
}
- err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_tx,
+ ret = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_tx,
copy);
- if (err < 0) {
+ if (ret < 0) {
sk_msg_trim(sk, msg_tx, osize);
goto out_err;
}
- copied += copy;
+ copied += ret;
if (psock->cork_bytes) {
if (size > psock->cork_bytes)
psock->cork_bytes = 0;
--
2.47.1
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-11-30 13:38 [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim via B4 Relay
2024-11-30 13:38 ` [PATCH net 1/2] skmsg: return copied bytes in sk_msg_memcopy_from_iter Levi Zim via B4 Relay
2024-11-30 13:38 ` [PATCH net 2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg Levi Zim via B4 Relay
@ 2024-12-01 1:42 ` Levi Zim
2024-12-04 1:01 ` Cong Wang
2024-12-02 23:04 ` Jakub Kicinski
2024-12-20 22:20 ` patchwork-bot+netdevbpf
4 siblings, 1 reply; 18+ messages in thread
From: Levi Zim @ 2024-12-01 1:42 UTC (permalink / raw)
To: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern
Cc: netdev, bpf, linux-kernel
On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
> test_sockmap.c triggers a kernel NULL pointer dereference:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000008
> ? __die_body+0x6e/0xb0
> ? __die+0x8b/0xa0
> ? page_fault_oops+0x358/0x3c0
> ? local_clock+0x19/0x30
> ? lock_release+0x11b/0x440
> ? kernelmode_fixup_or_oops+0x54/0x60
> ? __bad_area_nosemaphore+0x4f/0x210
> ? mmap_read_unlock+0x13/0x30
> ? bad_area_nosemaphore+0x16/0x20
> ? do_user_addr_fault+0x6fd/0x740
> ? prb_read_valid+0x1d/0x30
> ? exc_page_fault+0x55/0xd0
> ? asm_exc_page_fault+0x2b/0x30
> ? splice_to_socket+0x52e/0x630
> ? shmem_file_splice_read+0x2b1/0x310
> direct_splice_actor+0x47/0x70
> splice_direct_to_actor+0x133/0x300
> ? do_splice_direct+0x90/0x90
> do_splice_direct+0x64/0x90
> ? __ia32_sys_tee+0x30/0x30
> do_sendfile+0x214/0x300
> __se_sys_sendfile64+0x8e/0xb0
> __x64_sys_sendfile64+0x25/0x30
> x64_sys_call+0xb82/0x2840
> do_syscall_64+0x75/0x110
> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>
> This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than
> size(8192), which causes the while loop in splice_to_socket() to release
> an uninitialized pipe buf.
>
> The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
> will copy all bytes upon success but it actually might only copy part of
> it.
I am not sure which Fixes tag I should use. Git blame leads me to a
refactoring commit, and I am not familiar with this part of the code base.
Any suggestions?
>
> This series change sk_msg_memcopy_from_iter() to return copied bytes on
> success and tcp_bpf_sendmsg() to use the real copied bytes instead of
> assuming all bytes gets copied.
>
> Signed-off-by: Levi Zim <rsworktech@outlook.com>
> ---
> Levi Zim (2):
> skmsg: return copied bytes in sk_msg_memcopy_from_iter
> tcp_bpf: fix copied value in tcp_bpf_sendmsg
>
> net/core/skmsg.c | 5 +++--
> net/ipv4/tcp_bpf.c | 8 ++++----
> 2 files changed, 7 insertions(+), 6 deletions(-)
> ---
> base-commit: f1cd565ce57760923d5e0fbd9e9914b415c0620a
> change-id: 20241130-tcp-bpf-sendmsg-ff3c9d84e693
>
> Best regards,
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-11-30 13:38 [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim via B4 Relay
` (2 preceding siblings ...)
2024-12-01 1:42 ` [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim
@ 2024-12-02 23:04 ` Jakub Kicinski
2024-12-03 6:42 ` Levi Zim
2024-12-20 22:20 ` patchwork-bot+netdevbpf
4 siblings, 1 reply; 18+ messages in thread
From: Jakub Kicinski @ 2024-12-02 23:04 UTC (permalink / raw)
To: Levi Zim via B4 Relay
Cc: rsworktech, John Fastabend, Jakub Sitnicki, David S. Miller,
Eric Dumazet, Paolo Abeni, Simon Horman, David Ahern, netdev, bpf,
linux-kernel
On Sat, 30 Nov 2024 21:38:21 +0800 Levi Zim via B4 Relay wrote:
> net/core/skmsg.c | 5 +++--
> net/ipv4/tcp_bpf.c | 8 ++++----
Haven't looked at the code, but these files are BPF related.
I'll reassign the patch to BPF maintainers, and please use "PATCH bpf"
instead of "PATCH net" for next revisions.
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-02 23:04 ` Jakub Kicinski
@ 2024-12-03 6:42 ` Levi Zim
0 siblings, 0 replies; 18+ messages in thread
From: Levi Zim @ 2024-12-03 6:42 UTC (permalink / raw)
To: Jakub Kicinski
Cc: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
Paolo Abeni, Simon Horman, David Ahern, netdev, bpf, linux-kernel
On 2024-12-03 07:04, Jakub Kicinski wrote:
> On Sat, 30 Nov 2024 21:38:21 +0800 Levi Zim via B4 Relay wrote:
>> net/core/skmsg.c | 5 +++--
>> net/ipv4/tcp_bpf.c | 8 ++++----
> Haven't looked at the code, but these files are BPF related.
> I'll reassign the patch to BPF maintainers, and please use "PATCH bpf"
> instead of "PATCH net" for next revisions.
Sorry for sending the patch with the wrong prefix. I will use the bpf prefix
for the next revisions.
I am just getting started with bpf development in the kernel.
Initially I thought about using the bpf prefix, but all the files I
touched are under net, which confused me about which prefix to use.
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-01 1:42 ` [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim
@ 2024-12-04 1:01 ` Cong Wang
2024-12-04 6:49 ` Levi Zim
0 siblings, 1 reply; 18+ messages in thread
From: Cong Wang @ 2024-12-04 1:01 UTC (permalink / raw)
To: Levi Zim
Cc: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern, netdev,
bpf, linux-kernel
On Sun, Dec 01, 2024 at 09:42:08AM +0800, Levi Zim wrote:
> On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
> > I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
> > test_sockmap.c triggers a kernel NULL pointer dereference:
Interesting, I also ran this test recently and I didn't see such a
crash.
> >
> > BUG: kernel NULL pointer dereference, address: 0000000000000008
> > ? __die_body+0x6e/0xb0
> > ? __die+0x8b/0xa0
> > ? page_fault_oops+0x358/0x3c0
> > ? local_clock+0x19/0x30
> > ? lock_release+0x11b/0x440
> > ? kernelmode_fixup_or_oops+0x54/0x60
> > ? __bad_area_nosemaphore+0x4f/0x210
> > ? mmap_read_unlock+0x13/0x30
> > ? bad_area_nosemaphore+0x16/0x20
> > ? do_user_addr_fault+0x6fd/0x740
> > ? prb_read_valid+0x1d/0x30
> > ? exc_page_fault+0x55/0xd0
> > ? asm_exc_page_fault+0x2b/0x30
> > ? splice_to_socket+0x52e/0x630
> > ? shmem_file_splice_read+0x2b1/0x310
> > direct_splice_actor+0x47/0x70
> > splice_direct_to_actor+0x133/0x300
> > ? do_splice_direct+0x90/0x90
> > do_splice_direct+0x64/0x90
> > ? __ia32_sys_tee+0x30/0x30
> > do_sendfile+0x214/0x300
> > __se_sys_sendfile64+0x8e/0xb0
> > __x64_sys_sendfile64+0x25/0x30
> > x64_sys_call+0xb82/0x2840
> > do_syscall_64+0x75/0x110
> > entry_SYSCALL_64_after_hwframe+0x4b/0x53
> >
> > This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than
> > size(8192), which causes the while loop in splice_to_socket() to release
> > an uninitialized pipe buf.
> >
> > The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
> > will copy all bytes upon success but it actually might only copy part of
> > it.
> I am not sure what Fixes tag I should put. Git blame leads me to a refactor
> commit
> and I am not familiar with this part of code base. Any suggestions?
I think it is the following commit which introduced memcopy_from_iter()
(which was renamed to sk_msg_memcopy_from_iter() later):
commit 4f738adba30a7cfc006f605707e7aee847ffefa0
Author: John Fastabend <john.fastabend@gmail.com>
Date: Sun Mar 18 12:57:10 2018 -0700
bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
Please double check.
Thanks.
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-04 1:01 ` Cong Wang
@ 2024-12-04 6:49 ` Levi Zim
2024-12-17 15:43 ` Björn Töpel
0 siblings, 1 reply; 18+ messages in thread
From: Levi Zim @ 2024-12-04 6:49 UTC (permalink / raw)
To: Cong Wang
Cc: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern, netdev,
bpf, linux-kernel
On 2024-12-04 09:01, Cong Wang wrote:
> On Sun, Dec 01, 2024 at 09:42:08AM +0800, Levi Zim wrote:
>> On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
>>> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
>>> test_sockmap.c triggers a kernel NULL pointer dereference:
> Interesting, I also ran this test recently and I didn't see such a
> crash.
I am also curious why other people or the CI didn't hit this crash.
I just did a search and found only one mention of this bug:
https://lore.kernel.org/bpf/20241020110345.1468595-1-zijianzhang@bytedance.com/
Personally, when trying to run test_sockmap on the Arch Linux 6.12.1 kernel,
I get an RCU stall instead of this NPE:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: Tasks blocked on level-0 rcu_node (CPUs 0-11): P3378
rcu: (detected by 0, t=18002 jiffies, g=9525, q=28619 ncpus=12)
task:test_sockmap state:R running task stack:0 pid:3378 tgid:3378 ppid:1168 flags:0x00004006
Call Trace:
<TASK>
? __schedule+0x3b8/0x12b0
? get_page_from_freelist+0x366/0x1730
? sysvec_apic_timer_interrupt+0xe/0x90
? asm_sysvec_apic_timer_interrupt+0x1a/0x20
? bpf_msg_pop_data+0x41e/0x690
? mem_cgroup_charge_skmem+0x40/0x60
? bpf_prog_1fca1a523ce93f38_bpf_prog4+0x23d/0x248
? sk_psock_msg_verdict+0x99/0x1e0
? tcp_bpf_sendmsg+0x42d/0x9f0
? sock_sendmsg+0x109/0x130
? splice_to_socket+0x359/0x4f0
? shmem_file_splice_read+0x2cd/0x300
? direct_splice_actor+0x51/0x130
? splice_direct_to_actor+0xf0/0x260
? __pfx_direct_splice_actor+0x10/0x10
? do_splice_direct+0x77/0xc0
? __pfx_direct_file_splice_eof+0x10/0x10
? do_sendfile+0x382/0x440
? __x64_sys_sendfile64+0xb3/0xd0
? do_syscall_64+0x82/0x190
? find_next_iomem_res+0xbe/0x130
? __pfx_pagerange_is_ram_callback+0x10/0x10
? walk_system_ram_range+0xa6/0x100
? __pte_offset_map+0x1b/0x180
? __pte_offset_map_lock+0x9e/0x130
? set_ptes.isra.0+0x41/0x90
? insert_pfn+0xba/0x210
? vmf_insert_pfn_prot+0x85/0xd0
? __do_fault+0x30/0x170
? do_fault+0x303/0x4c0
? __handle_mm_fault+0x7c2/0xfa0
? shmem_file_write_iter+0x5b/0x90
? __count_memcg_events+0x53/0xf0
? count_memcg_events.constprop.0+0x1a/0x30
? handle_mm_fault+0x1bb/0x2c0
? do_user_addr_fault+0x17f/0x620
? clear_bhb_loop+0x25/0x80
? clear_bhb_loop+0x25/0x80
? clear_bhb_loop+0x25/0x80
? entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
>>> BUG: kernel NULL pointer dereference, address: 0000000000000008
>>> ? __die_body+0x6e/0xb0
>>> ? __die+0x8b/0xa0
>>> ? page_fault_oops+0x358/0x3c0
>>> ? local_clock+0x19/0x30
>>> ? lock_release+0x11b/0x440
>>> ? kernelmode_fixup_or_oops+0x54/0x60
>>> ? __bad_area_nosemaphore+0x4f/0x210
>>> ? mmap_read_unlock+0x13/0x30
>>> ? bad_area_nosemaphore+0x16/0x20
>>> ? do_user_addr_fault+0x6fd/0x740
>>> ? prb_read_valid+0x1d/0x30
>>> ? exc_page_fault+0x55/0xd0
>>> ? asm_exc_page_fault+0x2b/0x30
>>> ? splice_to_socket+0x52e/0x630
>>> ? shmem_file_splice_read+0x2b1/0x310
>>> direct_splice_actor+0x47/0x70
>>> splice_direct_to_actor+0x133/0x300
>>> ? do_splice_direct+0x90/0x90
>>> do_splice_direct+0x64/0x90
>>> ? __ia32_sys_tee+0x30/0x30
>>> do_sendfile+0x214/0x300
>>> __se_sys_sendfile64+0x8e/0xb0
>>> __x64_sys_sendfile64+0x25/0x30
>>> x64_sys_call+0xb82/0x2840
>>> do_syscall_64+0x75/0x110
>>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>>
>>> This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than
>>> size(8192), which causes the while loop in splice_to_socket() to release
>>> an uninitialized pipe buf.
>>>
>>> The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
>>> will copy all bytes upon success but it actually might only copy part of
>>> it.
>> I am not sure what Fixes tag I should put. Git blame leads me to a refactor
>> commit
>> and I am not familiar with this part of code base. Any suggestions?
> I think it is the following commit which introduced memcopy_from_iter()
> (which was renamed to sk_msg_memcopy_from_iter() later):
>
> commit 4f738adba30a7cfc006f605707e7aee847ffefa0
> Author: John Fastabend <john.fastabend@gmail.com>
> Date: Sun Mar 18 12:57:10 2018 -0700
>
> bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data
>
> Please double check.
>
> Thanks.
Thanks for your help. I will double check it.
* RE: [PATCH net 2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg
2024-11-30 13:38 ` [PATCH net 2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg Levi Zim via B4 Relay
@ 2024-12-09 7:02 ` John Fastabend
2024-12-09 11:56 ` Levi Zim
0 siblings, 1 reply; 18+ messages in thread
From: John Fastabend @ 2024-12-09 7:02 UTC (permalink / raw)
To: Levi Zim via B4 Relay, John Fastabend, Jakub Sitnicki,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, David Ahern
Cc: netdev, bpf, linux-kernel, Levi Zim
Levi Zim via B4 Relay wrote:
> From: Levi Zim <rsworktech@outlook.com>
>
> bpf kselftest sockhash::test_txmsg_cork_hangs in test_sockmap.c triggers a
> kernel NULL pointer dereference:
Is it just the cork test that causes the issue?
>
> BUG: kernel NULL pointer dereference, address: 0000000000000008
> ? __die_body+0x6e/0xb0
> ? __die+0x8b/0xa0
> ? page_fault_oops+0x358/0x3c0
> ? local_clock+0x19/0x30
> ? lock_release+0x11b/0x440
> ? kernelmode_fixup_or_oops+0x54/0x60
> ? __bad_area_nosemaphore+0x4f/0x210
> ? mmap_read_unlock+0x13/0x30
> ? bad_area_nosemaphore+0x16/0x20
> ? do_user_addr_fault+0x6fd/0x740
> ? prb_read_valid+0x1d/0x30
> ? exc_page_fault+0x55/0xd0
> ? asm_exc_page_fault+0x2b/0x30
> ? splice_to_socket+0x52e/0x630
> ? shmem_file_splice_read+0x2b1/0x310
> direct_splice_actor+0x47/0x70
> splice_direct_to_actor+0x133/0x300
> ? do_splice_direct+0x90/0x90
> do_splice_direct+0x64/0x90
> ? __ia32_sys_tee+0x30/0x30
> do_sendfile+0x214/0x300
> __se_sys_sendfile64+0x8e/0xb0
> __x64_sys_sendfile64+0x25/0x30
> x64_sys_call+0xb82/0x2840
> do_syscall_64+0x75/0x110
> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>
> This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than
> size (8192), which causes the while loop in splice_to_socket() to release
> an uninitialized pipe buf.
>
> The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
> will copy all bytes upon success but it actually might only copy part of
> it.
The intent was to ensure we allocate a buffer large enough to fit the
data. I guess the cork + send here is not allocating enough bytes?
>
> This commit changes it to use the real copied bytes.
>
> Signed-off-by: Levi Zim <rsworktech@outlook.com>
> ---
> net/ipv4/tcp_bpf.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
> index 370993c03d31363c0f82a003d9e5b0ca3bbed721..8e46c4d618cbbff0d120fe4cd917624e5d5cae15 100644
> --- a/net/ipv4/tcp_bpf.c
> +++ b/net/ipv4/tcp_bpf.c
> @@ -496,7 +496,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
> static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> {
> struct sk_msg tmp, *msg_tx = NULL;
> - int copied = 0, err = 0;
> + int copied = 0, err = 0, ret = 0;
> struct sk_psock *psock;
> long timeo;
> int flags;
> @@ -539,14 +539,14 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> copy = msg_tx->sg.size - osize;
> }
>
> - err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_tx,
> + ret = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_tx,
> copy);
> - if (err < 0) {
> + if (ret < 0) {
> sk_msg_trim(sk, msg_tx, osize);
> goto out_err;
> }
>
> - copied += copy;
> + copied += ret;
> if (psock->cork_bytes) {
> if (size > psock->cork_bytes)
> psock->cork_bytes = 0;
>
> --
> 2.47.1
>
>
* Re: [PATCH net 2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg
2024-12-09 7:02 ` John Fastabend
@ 2024-12-09 11:56 ` Levi Zim
2024-12-10 6:14 ` John Fastabend
0 siblings, 1 reply; 18+ messages in thread
From: Levi Zim @ 2024-12-09 11:56 UTC (permalink / raw)
To: John Fastabend, Levi Zim via B4 Relay, Jakub Sitnicki,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, David Ahern
Cc: netdev, bpf, linux-kernel
On 2024-12-09 15:02, John Fastabend wrote:
> Levi Zim via B4 Relay wrote:
>> From: Levi Zim <rsworktech@outlook.com>
>>
>> bpf kselftest sockhash::test_txmsg_cork_hangs in test_sockmap.c triggers a
>> kernel NULL pointer dereference:
> Is it just the cork test that causes issue?
Yes. More specifically, only "sockhash::test_txmsg_cork_hangs" but not
"sockmap::test_txmsg_cork_hangs".
>
>> BUG: kernel NULL pointer dereference, address: 0000000000000008
>> ? __die_body+0x6e/0xb0
>> ? __die+0x8b/0xa0
>> ? page_fault_oops+0x358/0x3c0
>> ? local_clock+0x19/0x30
>> ? lock_release+0x11b/0x440
>> ? kernelmode_fixup_or_oops+0x54/0x60
>> ? __bad_area_nosemaphore+0x4f/0x210
>> ? mmap_read_unlock+0x13/0x30
>> ? bad_area_nosemaphore+0x16/0x20
>> ? do_user_addr_fault+0x6fd/0x740
>> ? prb_read_valid+0x1d/0x30
>> ? exc_page_fault+0x55/0xd0
>> ? asm_exc_page_fault+0x2b/0x30
>> ? splice_to_socket+0x52e/0x630
>> ? shmem_file_splice_read+0x2b1/0x310
>> direct_splice_actor+0x47/0x70
>> splice_direct_to_actor+0x133/0x300
>> ? do_splice_direct+0x90/0x90
>> do_splice_direct+0x64/0x90
>> ? __ia32_sys_tee+0x30/0x30
>> do_sendfile+0x214/0x300
>> __se_sys_sendfile64+0x8e/0xb0
>> __x64_sys_sendfile64+0x25/0x30
>> x64_sys_call+0xb82/0x2840
>> do_syscall_64+0x75/0x110
>> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>>
>> This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than
>> size (8192), which causes the while loop in splice_to_socket() to release
>> an uninitialized pipe buf.
>>
>> The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
>> will copy all bytes upon success but it actually might only copy part of
>> it.
> The intent was to ensure we allocate a buffer large enough to fit the
> data. I guess the cork + send here is not allocating enough bytes?
I am not familiar enough with this part of the code, or with tcp and bpf
in general. I just hit this bug when trying to run the bpf kselftests and
then decided to debug it.
From my perspective the buffer (8192) is large enough to hold the data (8192),
but tcp_bpf_sendmsg returns 12289, which is a little surprising to me.
Could you elaborate on why 8192 bytes are not enough? Thanks!
>> This commit changes it to use the real copied bytes.
>>
>> Signed-off-by: Levi Zim <rsworktech@outlook.com>
>> ---
>> net/ipv4/tcp_bpf.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
>> index 370993c03d31363c0f82a003d9e5b0ca3bbed721..8e46c4d618cbbff0d120fe4cd917624e5d5cae15 100644
>> --- a/net/ipv4/tcp_bpf.c
>> +++ b/net/ipv4/tcp_bpf.c
>> @@ -496,7 +496,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
>> static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>> {
>> struct sk_msg tmp, *msg_tx = NULL;
>> - int copied = 0, err = 0;
>> + int copied = 0, err = 0, ret = 0;
>> struct sk_psock *psock;
>> long timeo;
>> int flags;
>> @@ -539,14 +539,14 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
>> copy = msg_tx->sg.size - osize;
>> }
>>
>> - err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_tx,
>> + ret = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, msg_tx,
>> copy);
>> - if (err < 0) {
>> + if (ret < 0) {
>> sk_msg_trim(sk, msg_tx, osize);
>> goto out_err;
>> }
>>
>> - copied += copy;
>> + copied += ret;
>> if (psock->cork_bytes) {
>> if (size > psock->cork_bytes)
>> psock->cork_bytes = 0;
>>
>> --
>> 2.47.1
>>
>>
>
* Re: [PATCH net 2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg
2024-12-09 11:56 ` Levi Zim
@ 2024-12-10 6:14 ` John Fastabend
0 siblings, 0 replies; 18+ messages in thread
From: John Fastabend @ 2024-12-10 6:14 UTC (permalink / raw)
To: Levi Zim, John Fastabend, Levi Zim via B4 Relay, Jakub Sitnicki,
David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, David Ahern
Cc: netdev, bpf, linux-kernel
Levi Zim wrote:
> On 2024-12-09 15:02, John Fastabend wrote:
> > Levi Zim via B4 Relay wrote:
> >> From: Levi Zim <rsworktech@outlook.com>
> >>
> >> bpf kselftest sockhash::test_txmsg_cork_hangs in test_sockmap.c triggers a
> >> kernel NULL pointer dereference:
> > Is it just the cork test that causes issue?
> Yes. More specifically only "sockhash::test_txmsg_cork_hangs" but not
> "sockmap::test_txmsg_cork_hangs"
> >
> >> BUG: kernel NULL pointer dereference, address: 0000000000000008
> >> ? __die_body+0x6e/0xb0
> >> ? __die+0x8b/0xa0
> >> ? page_fault_oops+0x358/0x3c0
> >> ? local_clock+0x19/0x30
> >> ? lock_release+0x11b/0x440
> >> ? kernelmode_fixup_or_oops+0x54/0x60
> >> ? __bad_area_nosemaphore+0x4f/0x210
> >> ? mmap_read_unlock+0x13/0x30
> >> ? bad_area_nosemaphore+0x16/0x20
> >> ? do_user_addr_fault+0x6fd/0x740
> >> ? prb_read_valid+0x1d/0x30
> >> ? exc_page_fault+0x55/0xd0
> >> ? asm_exc_page_fault+0x2b/0x30
> >> ? splice_to_socket+0x52e/0x630
> >> ? shmem_file_splice_read+0x2b1/0x310
> >> direct_splice_actor+0x47/0x70
> >> splice_direct_to_actor+0x133/0x300
> >> ? do_splice_direct+0x90/0x90
> >> do_splice_direct+0x64/0x90
> >> ? __ia32_sys_tee+0x30/0x30
> >> do_sendfile+0x214/0x300
> >> __se_sys_sendfile64+0x8e/0xb0
> >> __x64_sys_sendfile64+0x25/0x30
> >> x64_sys_call+0xb82/0x2840
> >> do_syscall_64+0x75/0x110
> >> entry_SYSCALL_64_after_hwframe+0x4b/0x53
> >>
> >> This is caused by tcp_bpf_sendmsg() returning a larger value(12289) than
> >> size (8192), which causes the while loop in splice_to_socket() to release
> >> an uninitialized pipe buf.
> >>
> >> The underlying cause is that this code assumes sk_msg_memcopy_from_iter()
> >> will copy all bytes upon success but it actually might only copy part of
> >> it.
> > The intent was to ensure we allocate a buffer large enough to fit the
> > data. I guess the cork + send here is not allocating enough bytes?
> I am not familiar enough with neither this part of code nor tcp with bpf
> in general and just
> hit this bug when trying to run the bpf kselftests. Then I decided to
> debug it.
>
> In my perspective the buffer(8192) is large enough to hold the data(8192),
> but tcp_bpf_sendmsg returns 12289 which is a little surprising for me.
>
> Could you further elaborate why 8192 bytes are not enough? Thanks!
>
There is some bug in the buffer allocation sizing that is happening
because of cork'd data. The cork logic is used to hold extra bytes
in the buffer until N bytes have been received.
I'm not really opposed to the fix here, but it would be good to understand
how it got here. I have some time tomorrow and can look a bit more.
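The cork behaviour described above (hold bytes until N have been received)
can be modeled roughly as below. This is a toy userspace sketch and not the
sk_psock implementation: struct demo_cork and demo_cork_push() are
hypothetical names, and the model tracks only byte counts, not the buffers
themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cork state: release nothing until `target` bytes are held. */
struct demo_cork {
	size_t target;	/* N: bytes required before data is released */
	size_t held;	/* bytes currently being held back */
};

/*
 * Add `n` newly received bytes. Returns 0 while the cork is still
 * filling, or the full held amount once the target has been reached.
 */
size_t demo_cork_push(struct demo_cork *c, size_t n)
{
	size_t out;

	assert(c != NULL);

	c->held += n;
	if (c->held < c->target)
		return 0;

	out = c->held;
	c->held = 0;
	return out;
}
```

For a target of 8192, a first push of 4096 releases nothing and a second
push of 4096 releases all 8192 held bytes at once.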
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-04 6:49 ` Levi Zim
@ 2024-12-17 15:43 ` Björn Töpel
2024-12-19 9:17 ` Björn Töpel
0 siblings, 1 reply; 18+ messages in thread
From: Björn Töpel @ 2024-12-17 15:43 UTC (permalink / raw)
To: Levi Zim, Cong Wang
Cc: John Fastabend, Jakub Sitnicki, David S. Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Simon Horman, David Ahern, netdev,
bpf, linux-kernel
Levi Zim <rsworktech@outlook.com> writes:
> On 2024-12-04 09:01, Cong Wang wrote:
>> On Sun, Dec 01, 2024 at 09:42:08AM +0800, Levi Zim wrote:
>>> On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
>>>> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
>>>> test_sockmap.c triggers a kernel NULL pointer dereference:
>> Interesting, I also ran this test recently and I didn't see such a
>> crash.
>
> I am also curious about why other people or the CI didn't hit such crash.
FWIW, I'm hitting it on RISC-V:
| Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000008
| Oops [#1]
| Modules linked in: sch_fq_codel drm fuse drm_panel_orientation_quirks backlight
| CPU: 7 UID: 0 PID: 732 Comm: test_sockmap Not tainted 6.13.0-rc3-00017-gf44d154d6e3d #1
| Hardware name: riscv-virtio qemu/qemu, BIOS 2025.01-rc3-00042-gacab6e78aca7 01/01/2025
| epc : splice_to_socket+0x376/0x49a
| ra : splice_to_socket+0x37c/0x49a
| epc : ffffffff803d9ffc ra : ffffffff803da002 sp : ff20000001c3b8b0
| gp : ffffffff827aefa8 tp : ff60000083450040 t0 : ff6000008a12d001
| t1 : 0000100100001001 t2 : 0000000000000000 s0 : ff20000001c3bae0
| s1 : ffffffffffffefff a0 : ff6000008245e200 a1 : ff60000087dd0450
| a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
| a5 : 0000000000000000 a6 : ff20000001c3b450 a7 : ff6000008a12c004
| s2 : 000000000000000f s3 : ff6000008245e2d0 s4 : ff6000008245e280
| s5 : 0000000000000000 s6 : 0000000000000002 s7 : 0000000000001001
| s8 : 0000000000003001 s9 : 0000000000000002 s10: 0000000000000002
| s11: ff6000008245e200 t3 : ffffffff8001e78c t4 : 0000000000000000
| t5 : 0000000000000000 t6 : ff6000008869f230
| status: 0000000200000120 badaddr: 0000000000000008 cause: 000000000000000d
| [<ffffffff803d9ffc>] splice_to_socket+0x376/0x49a
| [<ffffffff803d8bc0>] direct_splice_actor+0x44/0x216
| [<ffffffff803d8532>] splice_direct_to_actor+0xb6/0x1e8
| [<ffffffff803d8780>] do_splice_direct+0x70/0xa2
| [<ffffffff80392e40>] do_sendfile+0x26e/0x2d4
| [<ffffffff803939d4>] __riscv_sys_sendfile64+0xf2/0x10e
| [<ffffffff80fdfb64>] do_trap_ecall_u+0x1f8/0x26c
| [<ffffffff80fedaee>] _new_vmalloc_restore_context_a0+0xc6/0xd2
| Code: c5d8 9e35 c590 8bb3 40db eb01 6998 b823 0005 856e (6718) 2d05
| ---[ end trace 0000000000000000 ]---
| Kernel panic - not syncing: Fatal exception
| SMP: stopping secondary CPUs
| ---[ end Kernel panic - not syncing: Fatal exception ]---
This is commit f44d154d6e3d ("Merge tag 'soc-fixes-6.13' of
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc").
(Yet to bisect!)
Björn
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-17 15:43 ` Björn Töpel
@ 2024-12-19 9:17 ` Björn Töpel
2024-12-20 7:56 ` John Fastabend
0 siblings, 1 reply; 18+ messages in thread
From: Björn Töpel @ 2024-12-19 9:17 UTC (permalink / raw)
To: Levi Zim, Cong Wang, John Fastabend
Cc: Jakub Sitnicki, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, netdev, bpf, linux-kernel
Björn Töpel <bjorn@kernel.org> writes:
> Levi Zim <rsworktech@outlook.com> writes:
>
>> On 2024-12-04 09:01, Cong Wang wrote:
>>> On Sun, Dec 01, 2024 at 09:42:08AM +0800, Levi Zim wrote:
>>>> On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
>>>>> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
>>>>> test_sockmap.c triggers a kernel NULL pointer dereference:
>>> Interesting, I also ran this test recently and I didn't see such a
>>> crash.
>>
>> I am also curious about why other people or the CI didn't hit such crash.
>
> FWIW, I'm hitting it on RISC-V:
>
> | Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000008
> | Oops [#1]
> | Modules linked in: sch_fq_codel drm fuse drm_panel_orientation_quirks backlight
> | CPU: 7 UID: 0 PID: 732 Comm: test_sockmap Not tainted 6.13.0-rc3-00017-gf44d154d6e3d #1
> | Hardware name: riscv-virtio qemu/qemu, BIOS 2025.01-rc3-00042-gacab6e78aca7 01/01/2025
> | epc : splice_to_socket+0x376/0x49a
> | ra : splice_to_socket+0x37c/0x49a
> | epc : ffffffff803d9ffc ra : ffffffff803da002 sp : ff20000001c3b8b0
> | gp : ffffffff827aefa8 tp : ff60000083450040 t0 : ff6000008a12d001
> | t1 : 0000100100001001 t2 : 0000000000000000 s0 : ff20000001c3bae0
> | s1 : ffffffffffffefff a0 : ff6000008245e200 a1 : ff60000087dd0450
> | a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
> | a5 : 0000000000000000 a6 : ff20000001c3b450 a7 : ff6000008a12c004
> | s2 : 000000000000000f s3 : ff6000008245e2d0 s4 : ff6000008245e280
> | s5 : 0000000000000000 s6 : 0000000000000002 s7 : 0000000000001001
> | s8 : 0000000000003001 s9 : 0000000000000002 s10: 0000000000000002
> | s11: ff6000008245e200 t3 : ffffffff8001e78c t4 : 0000000000000000
> | t5 : 0000000000000000 t6 : ff6000008869f230
> | status: 0000000200000120 badaddr: 0000000000000008 cause: 000000000000000d
> | [<ffffffff803d9ffc>] splice_to_socket+0x376/0x49a
> | [<ffffffff803d8bc0>] direct_splice_actor+0x44/0x216
> | [<ffffffff803d8532>] splice_direct_to_actor+0xb6/0x1e8
> | [<ffffffff803d8780>] do_splice_direct+0x70/0xa2
> | [<ffffffff80392e40>] do_sendfile+0x26e/0x2d4
> | [<ffffffff803939d4>] __riscv_sys_sendfile64+0xf2/0x10e
> | [<ffffffff80fdfb64>] do_trap_ecall_u+0x1f8/0x26c
> | [<ffffffff80fedaee>] _new_vmalloc_restore_context_a0+0xc6/0xd2
> | Code: c5d8 9e35 c590 8bb3 40db eb01 6998 b823 0005 856e (6718) 2d05
> | ---[ end trace 0000000000000000 ]---
> | Kernel panic - not syncing: Fatal exception
> | SMP: stopping secondary CPUs
> | ---[ end Kernel panic - not syncing: Fatal exception ]---
>
> This is commit f44d154d6e3d ("Merge tag 'soc-fixes-6.13' of
> git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc").
>
> (Yet to bisect!)
Took the series for a run, and it does solve the crash, but I'm getting
additional failures:
| [TEST 298]: (512, 1, 3, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Invalid argument
| rx thread exited with err 1.
| FAILED
| [TEST 299]: (100, 1, 5, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Invalid argument
| rx thread exited with err 1.
| FAILED
| [TEST 300]: (2, 32, 8192, sendpage, pass,pop (4096,8192),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Bad message
| rx thread exited with err 1.
| FAILED
| ...
| #42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
| ...
| [TEST 308]: (2, 32, 8192, sendpage, pass,pop (5,21),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Bad message
| rx thread exited with err 1.
| FAILED
| [TEST 309]: (2, 32, 8192, sendpage, pass,pop (1,11),ktls,): socket(peer2) kTLS enabled
| socket(client1) kTLS enabled
| recv failed(): Bad message
| rx thread exited with err 1.
| FAILED
| ...
| #43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-19 9:17 ` Björn Töpel
@ 2024-12-20 7:56 ` John Fastabend
2024-12-20 9:00 ` Levi Zim
2024-12-20 9:03 ` Björn Töpel
0 siblings, 2 replies; 18+ messages in thread
From: John Fastabend @ 2024-12-20 7:56 UTC (permalink / raw)
To: Björn Töpel, Levi Zim, Cong Wang, John Fastabend
Cc: Jakub Sitnicki, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, netdev, bpf, linux-kernel
Björn Töpel wrote:
> Björn Töpel <bjorn@kernel.org> writes:
>
> > Levi Zim <rsworktech@outlook.com> writes:
> >
> >> On 2024-12-04 09:01, Cong Wang wrote:
> >>> On Sun, Dec 01, 2024 at 09:42:08AM +0800, Levi Zim wrote:
> >>>> On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
> >>>>> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
> >>>>> test_sockmap.c triggers a kernel NULL pointer dereference:
> >>> Interesting, I also ran this test recently and I didn't see such a
> >>> crash.
> >>
> >> I am also curious about why other people or the CI didn't hit such crash.
> >
> > FWIW, I'm hitting it on RISC-V:
> >
> > | Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000008
> > | Oops [#1]
> > | Modules linked in: sch_fq_codel drm fuse drm_panel_orientation_quirks backlight
> > | CPU: 7 UID: 0 PID: 732 Comm: test_sockmap Not tainted 6.13.0-rc3-00017-gf44d154d6e3d #1
> > | Hardware name: riscv-virtio qemu/qemu, BIOS 2025.01-rc3-00042-gacab6e78aca7 01/01/2025
> > | epc : splice_to_socket+0x376/0x49a
> > | ra : splice_to_socket+0x37c/0x49a
> > | epc : ffffffff803d9ffc ra : ffffffff803da002 sp : ff20000001c3b8b0
> > | gp : ffffffff827aefa8 tp : ff60000083450040 t0 : ff6000008a12d001
> > | t1 : 0000100100001001 t2 : 0000000000000000 s0 : ff20000001c3bae0
> > | s1 : ffffffffffffefff a0 : ff6000008245e200 a1 : ff60000087dd0450
> > | a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
> > | a5 : 0000000000000000 a6 : ff20000001c3b450 a7 : ff6000008a12c004
> > | s2 : 000000000000000f s3 : ff6000008245e2d0 s4 : ff6000008245e280
> > | s5 : 0000000000000000 s6 : 0000000000000002 s7 : 0000000000001001
> > | s8 : 0000000000003001 s9 : 0000000000000002 s10: 0000000000000002
> > | s11: ff6000008245e200 t3 : ffffffff8001e78c t4 : 0000000000000000
> > | t5 : 0000000000000000 t6 : ff6000008869f230
> > | status: 0000000200000120 badaddr: 0000000000000008 cause: 000000000000000d
> > | [<ffffffff803d9ffc>] splice_to_socket+0x376/0x49a
> > | [<ffffffff803d8bc0>] direct_splice_actor+0x44/0x216
> > | [<ffffffff803d8532>] splice_direct_to_actor+0xb6/0x1e8
> > | [<ffffffff803d8780>] do_splice_direct+0x70/0xa2
> > | [<ffffffff80392e40>] do_sendfile+0x26e/0x2d4
> > | [<ffffffff803939d4>] __riscv_sys_sendfile64+0xf2/0x10e
> > | [<ffffffff80fdfb64>] do_trap_ecall_u+0x1f8/0x26c
> > | [<ffffffff80fedaee>] _new_vmalloc_restore_context_a0+0xc6/0xd2
> > | Code: c5d8 9e35 c590 8bb3 40db eb01 6998 b823 0005 856e (6718) 2d05
> > | ---[ end trace 0000000000000000 ]---
> > | Kernel panic - not syncing: Fatal exception
> > | SMP: stopping secondary CPUs
> > | ---[ end Kernel panic - not syncing: Fatal exception ]---
> >
> > This is commit f44d154d6e3d ("Merge tag 'soc-fixes-6.13' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc").
> >
> > (Yet to bisect!)
>
> Took the series for a run, and it does solve crash, but I'm getting
> additional failures:
Hi Bjorn,
Thanks! I'm guessing those tests were failing even without the patch
though, right?
Thanks,
John
>
> | [TEST 298]: (512, 1, 3, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
> | socket(client1) kTLS enabled
> | recv failed(): Invalid argument
> | rx thread exited with err 1.
> | FAILED
> | [TEST 299]: (100, 1, 5, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
> | socket(client1) kTLS enabled
> | recv failed(): Invalid argument
> | rx thread exited with err 1.
> | FAILED
> | [TEST 300]: (2, 32, 8192, sendpage, pass,pop (4096,8192),ktls,): socket(peer2) kTLS enabled
> | socket(client1) kTLS enabled
> | recv failed(): Bad message
> | rx thread exited with err 1.
> | FAILED
> | ...
> | #42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
> | ...
> | [TEST 308]: (2, 32, 8192, sendpage, pass,pop (5,21),ktls,): socket(peer2) kTLS enabled
> | socket(client1) kTLS enabled
> | recv failed(): Bad message
> | rx thread exited with err 1.
> | FAILED
> | [TEST 309]: (2, 32, 8192, sendpage, pass,pop (1,11),ktls,): socket(peer2) kTLS enabled
> | socket(client1) kTLS enabled
> | recv failed(): Bad message
> | rx thread exited with err 1.
> | FAILED
> | ...
> | #43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
>
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-20 7:56 ` John Fastabend
@ 2024-12-20 9:00 ` Levi Zim
2024-12-20 9:03 ` Björn Töpel
1 sibling, 0 replies; 18+ messages in thread
From: Levi Zim @ 2024-12-20 9:00 UTC (permalink / raw)
To: John Fastabend, Björn Töpel, Cong Wang
Cc: Jakub Sitnicki, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, netdev, bpf, linux-kernel
On 2024-12-20 15:56, John Fastabend wrote:
> Björn Töpel wrote:
>> Björn Töpel <bjorn@kernel.org> writes:
>>
>>> Levi Zim <rsworktech@outlook.com> writes:
>>>
>>>> On 2024-12-04 09:01, Cong Wang wrote:
>>>>> On Sun, Dec 01, 2024 at 09:42:08AM +0800, Levi Zim wrote:
>>>>>> On 2024-11-30 21:38, Levi Zim via B4 Relay wrote:
>>>>>>> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
>>>>>>> test_sockmap.c triggers a kernel NULL pointer dereference:
>>>>> Interesting, I also ran this test recently and I didn't see such a
>>>>> crash.
>>>> I am also curious about why other people or the CI didn't hit such crash.
>>> FWIW, I'm hitting it on RISC-V:
>>>
>>> | Unable to handle kernel access to user memory without uaccess routines at virtual address 0000000000000008
>>> | Oops [#1]
>>> | Modules linked in: sch_fq_codel drm fuse drm_panel_orientation_quirks backlight
>>> | CPU: 7 UID: 0 PID: 732 Comm: test_sockmap Not tainted 6.13.0-rc3-00017-gf44d154d6e3d #1
>>> | Hardware name: riscv-virtio qemu/qemu, BIOS 2025.01-rc3-00042-gacab6e78aca7 01/01/2025
>>> | epc : splice_to_socket+0x376/0x49a
>>> | ra : splice_to_socket+0x37c/0x49a
>>> | epc : ffffffff803d9ffc ra : ffffffff803da002 sp : ff20000001c3b8b0
>>> | gp : ffffffff827aefa8 tp : ff60000083450040 t0 : ff6000008a12d001
>>> | t1 : 0000100100001001 t2 : 0000000000000000 s0 : ff20000001c3bae0
>>> | s1 : ffffffffffffefff a0 : ff6000008245e200 a1 : ff60000087dd0450
>>> | a2 : 0000000000000000 a3 : 0000000000000000 a4 : 0000000000000000
>>> | a5 : 0000000000000000 a6 : ff20000001c3b450 a7 : ff6000008a12c004
>>> | s2 : 000000000000000f s3 : ff6000008245e2d0 s4 : ff6000008245e280
>>> | s5 : 0000000000000000 s6 : 0000000000000002 s7 : 0000000000001001
>>> | s8 : 0000000000003001 s9 : 0000000000000002 s10: 0000000000000002
>>> | s11: ff6000008245e200 t3 : ffffffff8001e78c t4 : 0000000000000000
>>> | t5 : 0000000000000000 t6 : ff6000008869f230
>>> | status: 0000000200000120 badaddr: 0000000000000008 cause: 000000000000000d
>>> | [<ffffffff803d9ffc>] splice_to_socket+0x376/0x49a
>>> | [<ffffffff803d8bc0>] direct_splice_actor+0x44/0x216
>>> | [<ffffffff803d8532>] splice_direct_to_actor+0xb6/0x1e8
>>> | [<ffffffff803d8780>] do_splice_direct+0x70/0xa2
>>> | [<ffffffff80392e40>] do_sendfile+0x26e/0x2d4
>>> | [<ffffffff803939d4>] __riscv_sys_sendfile64+0xf2/0x10e
>>> | [<ffffffff80fdfb64>] do_trap_ecall_u+0x1f8/0x26c
>>> | [<ffffffff80fedaee>] _new_vmalloc_restore_context_a0+0xc6/0xd2
>>> | Code: c5d8 9e35 c590 8bb3 40db eb01 6998 b823 0005 856e (6718) 2d05
>>> | ---[ end trace 0000000000000000 ]---
>>> | Kernel panic - not syncing: Fatal exception
>>> | SMP: stopping secondary CPUs
>>> | ---[ end Kernel panic - not syncing: Fatal exception ]---
>>>
>>> This is commit f44d154d6e3d ("Merge tag 'soc-fixes-6.13' of
>>> git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc").
>>>
>>> (Yet to bisect!)
>> Took the series for a run, and it does solve crash, but I'm getting
>> additional failures:
> Hi Bjorn,
>
> Thanks! I'm guessing those tests were failing even without the patch
> though right?
IIRC those kTLS tests were also failing when I manually commented out the
cork hangs test that crashes the kernel.
>
> Thanks,
> John
>
>> | [TEST 298]: (512, 1, 3, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Invalid argument
>> | rx thread exited with err 1.
>> | FAILED
>> | [TEST 299]: (100, 1, 5, sendpage, pass,pop (1,3),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Invalid argument
>> | rx thread exited with err 1.
>> | FAILED
>> | [TEST 300]: (2, 32, 8192, sendpage, pass,pop (4096,8192),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Bad message
>> | rx thread exited with err 1.
>> | FAILED
>> | ...
>> | #42/ 9 sockhash:ktls:txmsg test pop-data:FAIL
>> | ...
>> | [TEST 308]: (2, 32, 8192, sendpage, pass,pop (5,21),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Bad message
>> | rx thread exited with err 1.
>> | FAILED
>> | [TEST 309]: (2, 32, 8192, sendpage, pass,pop (1,11),ktls,): socket(peer2) kTLS enabled
>> | socket(client1) kTLS enabled
>> | recv failed(): Bad message
>> | rx thread exited with err 1.
>> | FAILED
>> | ...
>> | #43/ 6 sockhash:ktls:txmsg test push/pop data:FAIL
>>
>
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-20 7:56 ` John Fastabend
2024-12-20 9:00 ` Levi Zim
@ 2024-12-20 9:03 ` Björn Töpel
2024-12-20 16:56 ` John Fastabend
1 sibling, 1 reply; 18+ messages in thread
From: Björn Töpel @ 2024-12-20 9:03 UTC (permalink / raw)
To: John Fastabend, Levi Zim, Cong Wang, John Fastabend
Cc: Jakub Sitnicki, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, netdev, bpf, linux-kernel
John Fastabend <john.fastabend@gmail.com> writes:
>> Took the series for a run, and it does solve crash, but I'm getting
>> additional failures:
>
> Thanks! I'm guessing those tests were failing even without the patch
> though right?
Correct.
test_sockmap did however pass the full suite in 6.12. So, something
changed with the addition of [1].
I guess:
Tested-by: Björn Töpel <bjorn@kernel.org>
can be added to this series, and not crashing is nice, but it would be
interesting to know how it got there.
Björn
[1] https://lore.kernel.org/all/20241106222520.527076-1-zijianzhang@bytedance.com/
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-12-20 9:03 ` Björn Töpel
@ 2024-12-20 16:56 ` John Fastabend
0 siblings, 0 replies; 18+ messages in thread
From: John Fastabend @ 2024-12-20 16:56 UTC (permalink / raw)
To: Björn Töpel, John Fastabend, Levi Zim, Cong Wang,
John Fastabend
Cc: Jakub Sitnicki, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Simon Horman, David Ahern, netdev, bpf, linux-kernel
Björn Töpel wrote:
> John Fastabend <john.fastabend@gmail.com> writes:
>
> >> Took the series for a run, and it does solve crash, but I'm getting
> >> additional failures:
> >
> > Thanks! I'm guessing those tests were failing even without the patch
> > though right?
>
> Correct.
Thanks Bjorn!
>
> test_sockmap did however pass the full suite in 6.12. So, something
> changed with the addition of [1].
Agree with finding this. I also have a compliance test failing in one
of our nginx/apache/bpf test suites so might be related. I'll dig into
it.
>
> I guess:
>
> Tested-by: Björn Töpel <bjorn@kernel.org>
Since I would have stacked this on top of any fix for the underlying bug
anyway, I think let's apply this now.
Reviewed-by: John Fastabend <john.fastabend@gmail.com>
>
> can be added to this series, and not crashing is nice, but it would be
> interesting to know how it got there.
>
>
> Björn
>
>
> [1] https://lore.kernel.org/all/20241106222520.527076-1-zijianzhang@bytedance.com/
>
* Re: [PATCH net 0/2] Fix NPE discovered by running bpf kselftest
2024-11-30 13:38 [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim via B4 Relay
` (3 preceding siblings ...)
2024-12-02 23:04 ` Jakub Kicinski
@ 2024-12-20 22:20 ` patchwork-bot+netdevbpf
4 siblings, 0 replies; 18+ messages in thread
From: patchwork-bot+netdevbpf @ 2024-12-20 22:20 UTC (permalink / raw)
To: Levi Zim via B4 Relay
Cc: john.fastabend, jakub, davem, edumazet, kuba, pabeni, horms,
dsahern, netdev, bpf, linux-kernel, rsworktech
Hello:
This series was applied to bpf/bpf.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:
On Sat, 30 Nov 2024 21:38:21 +0800 you wrote:
> I found that bpf kselftest sockhash::test_txmsg_cork_hangs in
> test_sockmap.c triggers a kernel NULL pointer dereference:
>
> BUG: kernel NULL pointer dereference, address: 0000000000000008
> ? __die_body+0x6e/0xb0
> ? __die+0x8b/0xa0
> ? page_fault_oops+0x358/0x3c0
> ? local_clock+0x19/0x30
> ? lock_release+0x11b/0x440
> ? kernelmode_fixup_or_oops+0x54/0x60
> ? __bad_area_nosemaphore+0x4f/0x210
> ? mmap_read_unlock+0x13/0x30
> ? bad_area_nosemaphore+0x16/0x20
> ? do_user_addr_fault+0x6fd/0x740
> ? prb_read_valid+0x1d/0x30
> ? exc_page_fault+0x55/0xd0
> ? asm_exc_page_fault+0x2b/0x30
> ? splice_to_socket+0x52e/0x630
> ? shmem_file_splice_read+0x2b1/0x310
> direct_splice_actor+0x47/0x70
> splice_direct_to_actor+0x133/0x300
> ? do_splice_direct+0x90/0x90
> do_splice_direct+0x64/0x90
> ? __ia32_sys_tee+0x30/0x30
> do_sendfile+0x214/0x300
> __se_sys_sendfile64+0x8e/0xb0
> __x64_sys_sendfile64+0x25/0x30
> x64_sys_call+0xb82/0x2840
> do_syscall_64+0x75/0x110
> entry_SYSCALL_64_after_hwframe+0x4b/0x53
>
> [...]
Here is the summary with links:
- [net,1/2] skmsg: return copied bytes in sk_msg_memcopy_from_iter
https://git.kernel.org/bpf/bpf/c/fdf478d236dc
- [net,2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg
https://git.kernel.org/bpf/bpf/c/5153a75ef34b
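As a rough illustration of the accounting pattern the two patch titles
describe (a copy helper that returns the bytes it actually copied, and a
caller that uses that value), here is a small user-space sketch. It is
not the kernel code: model_copy(), model_send_assumed() and
model_send_actual() are hypothetical names, and the "room" limit is an
assumed stand-in for whatever makes a copy come up short.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical copy helper: copies at most `room` bytes out of the
 * `want` bytes requested and returns the number actually copied.
 */
size_t model_copy(char *dst, size_t room, const char *src, size_t want)
{
	size_t n = want < room ? want : room;

	memcpy(dst, src, n);
	return n;
}

/* Problematic accounting: assumes every requested byte was copied. */
size_t model_send_assumed(char *dst, size_t room, const char *src,
			  size_t want)
{
	(void)model_copy(dst, room, src, want);
	return want;	/* may exceed what was really copied */
}

/* Safer accounting: report only what the helper says it copied. */
size_t model_send_actual(char *dst, size_t room, const char *src,
			 size_t want)
{
	return model_copy(dst, room, src, want);
}
```

When the destination has room for fewer bytes than requested, the first
variant over-reports and the second reports the short count, which is the
distinction the series is about.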
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Thread overview: 18+ messages
2024-11-30 13:38 [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim via B4 Relay
2024-11-30 13:38 ` [PATCH net 1/2] skmsg: return copied bytes in sk_msg_memcopy_from_iter Levi Zim via B4 Relay
2024-11-30 13:38 ` [PATCH net 2/2] tcp_bpf: fix copied value in tcp_bpf_sendmsg Levi Zim via B4 Relay
2024-12-09 7:02 ` John Fastabend
2024-12-09 11:56 ` Levi Zim
2024-12-10 6:14 ` John Fastabend
2024-12-01 1:42 ` [PATCH net 0/2] Fix NPE discovered by running bpf kselftest Levi Zim
2024-12-04 1:01 ` Cong Wang
2024-12-04 6:49 ` Levi Zim
2024-12-17 15:43 ` Björn Töpel
2024-12-19 9:17 ` Björn Töpel
2024-12-20 7:56 ` John Fastabend
2024-12-20 9:00 ` Levi Zim
2024-12-20 9:03 ` Björn Töpel
2024-12-20 16:56 ` John Fastabend
2024-12-02 23:04 ` Jakub Kicinski
2024-12-03 6:42 ` Levi Zim
2024-12-20 22:20 ` patchwork-bot+netdevbpf