netdev.vger.kernel.org archive mirror
* [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge
@ 2022-03-04  8:11 Wang Yufen
  2022-03-04  8:11 ` [PATCH bpf-next v3 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
                   ` (5 more replies)
  0 siblings, 6 replies; 11+ messages in thread
From: Wang Yufen @ 2022-03-04  8:11 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb, davem, bpf
  Cc: edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, Wang Yufen

This patchset fixes memleaks and incorrect memory charge/uncharge; these
issues trigger the following warnings:

WARNING: CPU: 0 PID: 9202 at net/core/stream.c:205 sk_stream_kill_queues+0xc8/0xe0
Call Trace:
 <IRQ>
 inet_csk_destroy_sock+0x55/0x110
 tcp_rcv_state_process+0xe5f/0xe90
 ? sk_filter_trim_cap+0x10d/0x230
 ? tcp_v4_do_rcv+0x161/0x250
 tcp_v4_do_rcv+0x161/0x250
 tcp_v4_rcv+0xc3a/0xce0
 ip_protocol_deliver_rcu+0x3d/0x230
 ip_local_deliver_finish+0x54/0x60
 ip_local_deliver+0xfd/0x110
 ? ip_protocol_deliver_rcu+0x230/0x230
 ip_rcv+0xd6/0x100
 ? ip_local_deliver+0x110/0x110
 __netif_receive_skb_one_core+0x85/0xa0
 process_backlog+0xa4/0x160
 __napi_poll+0x29/0x1b0
 net_rx_action+0x287/0x300
 __do_softirq+0xff/0x2fc
 do_softirq+0x79/0x90
 </IRQ>

WARNING: CPU: 0 PID: 531 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x175/0x1b0
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 ? process_one_work+0x3c0/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Changes since v2:
-Move the sk_msg_trim() logic into sk_msg_alloc() when -ENOMEM occurs, as
Cong Wang suggested.

Changes since v1:
-Update the commit message of patch #2; the error path comes from -ENOMEM,
not -ENOSPC.
-Simply return an error code when psock is NULL, as John Fastabend
suggested.

Wang Yufen (4):
  bpf, sockmap: Fix memleak in sk_psock_queue_msg
  bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
  bpf, sockmap: Fix more uncharged while msg has more_data
  bpf, sockmap: Fix double uncharge the mem of sk_msg

 include/linux/skmsg.h | 13 ++++---------
 net/core/skmsg.c      | 17 +++++++++++++----
 net/ipv4/tcp_bpf.c    | 14 ++++++++------
 3 files changed, 25 insertions(+), 19 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH bpf-next v3 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg
  2022-03-04  8:11 [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
@ 2022-03-04  8:11 ` Wang Yufen
  2022-03-04  8:11 ` [PATCH bpf-next v3 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full Wang Yufen
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Wang Yufen @ 2022-03-04  8:11 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb, davem, bpf
  Cc: edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, Wang Yufen

If tcp_bpf_sendmsg() is running during a tear-down operation, we may enqueue
data on the ingress msg queue while the tear-down path is trying to free it.

 sk1 (redirect sk2)                         sk2
 -------------------                      ---------------
tcp_bpf_sendmsg()
 tcp_bpf_send_verdict()
  tcp_bpf_sendmsg_redir()
   bpf_tcp_ingress()
                                          sock_map_close()
                                           lock_sock()
    lock_sock() ... blocking
                                           sk_psock_stop
                                            sk_psock_clear_state(psock, SK_PSOCK_TX_ENABLED);
                                           release_sock(sk);
    lock_sock()
    sk_mem_charge()
    get_page()
    sk_psock_queue_msg()
     sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED);
      drop_sk_msg()
    release_sock()

In drop_sk_msg(), the msg has memory charged to the sk by sk_mem_charge()
and holds sg page references that need to be put. To fix this, use
sk_msg_free() and then kfree() the msg.

This issue can trigger the following warnings:
WARNING: CPU: 0 PID: 9202 at net/core/stream.c:205 sk_stream_kill_queues+0xc8/0xe0
Call Trace:
 <IRQ>
 inet_csk_destroy_sock+0x55/0x110
 tcp_rcv_state_process+0xe5f/0xe90
 ? sk_filter_trim_cap+0x10d/0x230
 ? tcp_v4_do_rcv+0x161/0x250
 tcp_v4_do_rcv+0x161/0x250
 tcp_v4_rcv+0xc3a/0xce0
 ip_protocol_deliver_rcu+0x3d/0x230
 ip_local_deliver_finish+0x54/0x60
 ip_local_deliver+0xfd/0x110
 ? ip_protocol_deliver_rcu+0x230/0x230
 ip_rcv+0xd6/0x100
 ? ip_local_deliver+0x110/0x110
 __netif_receive_skb_one_core+0x85/0xa0
 process_backlog+0xa4/0x160
 __napi_poll+0x29/0x1b0
 net_rx_action+0x287/0x300
 __do_softirq+0xff/0x2fc
 do_softirq+0x79/0x90
 </IRQ>

WARNING: CPU: 0 PID: 531 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x175/0x1b0
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 ? process_one_work+0x3c0/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 9635720b7c88 ("bpf, sockmap: Fix memleak on ingress msg enqueue")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 include/linux/skmsg.h | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index fdb5375f0562..c5a2d6f50f25 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -304,21 +304,16 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
 	kfree_skb(skb);
 }
 
-static inline void drop_sk_msg(struct sk_psock *psock, struct sk_msg *msg)
-{
-	if (msg->skb)
-		sock_drop(psock->sk, msg->skb);
-	kfree(msg);
-}
-
 static inline void sk_psock_queue_msg(struct sk_psock *psock,
 				      struct sk_msg *msg)
 {
 	spin_lock_bh(&psock->ingress_lock);
 	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
 		list_add_tail(&msg->list, &psock->ingress_msg);
-	else
-		drop_sk_msg(psock, msg);
+	else {
+		sk_msg_free(psock->sk, msg);
+		kfree(msg);
+	}
 	spin_unlock_bh(&psock->ingress_lock);
 }
 
-- 
2.25.1



* [PATCH bpf-next v3 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
  2022-03-04  8:11 [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
  2022-03-04  8:11 ` [PATCH bpf-next v3 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
@ 2022-03-04  8:11 ` Wang Yufen
  2022-03-11 21:55   ` John Fastabend
  2022-03-04  8:11 ` [PATCH bpf-next v3 3/4] bpf, sockmap: Fix more uncharged while msg has more_data Wang Yufen
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 11+ messages in thread
From: Wang Yufen @ 2022-03-04  8:11 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb, davem, bpf
  Cc: edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, Wang Yufen

If tcp_bpf_sendmsg() is running while the sk msg is full and sk_msg_alloc()
returns -ENOMEM, tcp_bpf_sendmsg() goes to wait_for_memory. If partial
memory has been allocated by sk_msg_alloc(), i.e. msg_tx->sg.size is
greater than osize after sk_msg_alloc(), a memleak occurs. To fix this, use
sk_msg_trim() to release the allocated memory, then goto wait_for_memory.

Other call paths of sk_msg_alloc() have a similar issue, such as
tls_sw_sendmsg(), so handle the sk_msg_trim() logic inside sk_msg_alloc(),
as Cong Wang suggested.

This issue can trigger the following warnings:
WARNING: CPU: 3 PID: 7950 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
Call Trace:
 <TASK>
 inet_csk_destroy_sock+0x55/0x110
 __tcp_close+0x279/0x470
 tcp_close+0x1f/0x60
 inet_release+0x3f/0x80
 __sock_release+0x3d/0xb0
 sock_close+0x11/0x20
 __fput+0x92/0x250
 task_work_run+0x6a/0xa0
 do_exit+0x33b/0xb60
 do_group_exit+0x2f/0xa0
 get_signal+0xb6/0x950
 arch_do_signal_or_restart+0xac/0x2a0
 exit_to_user_mode_prepare+0xa9/0x200
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x46/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
 </TASK>

WARNING: CPU: 3 PID: 2094 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 kthread+0xe6/0x110
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 net/core/skmsg.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index 8eb671c827f9..38ee73f453f8 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -27,6 +27,7 @@ int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len,
 		 int elem_first_coalesce)
 {
 	struct page_frag *pfrag = sk_page_frag(sk);
+	u32 osize = msg->sg.size;
 	int ret = 0;
 
 	len -= msg->sg.size;
@@ -35,13 +36,17 @@ int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len,
 		u32 orig_offset;
 		int use, i;
 
-		if (!sk_page_frag_refill(sk, pfrag))
-			return -ENOMEM;
+		if (!sk_page_frag_refill(sk, pfrag)) {
+			ret = -ENOMEM;
+			goto msg_trim;
+		}
 
 		orig_offset = pfrag->offset;
 		use = min_t(int, len, pfrag->size - orig_offset);
-		if (!sk_wmem_schedule(sk, use))
-			return -ENOMEM;
+		if (!sk_wmem_schedule(sk, use)) {
+			ret = -ENOMEM;
+			goto msg_trim;
+		}
 
 		i = msg->sg.end;
 		sk_msg_iter_var_prev(i);
@@ -71,6 +76,10 @@ int sk_msg_alloc(struct sock *sk, struct sk_msg *msg, int len,
 	}
 
 	return ret;
+
+msg_trim:
+	sk_msg_trim(sk, msg, osize);
+	return ret;
 }
 EXPORT_SYMBOL_GPL(sk_msg_alloc);
 
-- 
2.25.1



* [PATCH bpf-next v3 3/4] bpf, sockmap: Fix more uncharged while msg has more_data
  2022-03-04  8:11 [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
  2022-03-04  8:11 ` [PATCH bpf-next v3 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
  2022-03-04  8:11 ` [PATCH bpf-next v3 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full Wang Yufen
@ 2022-03-04  8:11 ` Wang Yufen
  2022-03-04  8:11 ` [PATCH bpf-next v3 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg Wang Yufen
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 11+ messages in thread
From: Wang Yufen @ 2022-03-04  8:11 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb, davem, bpf
  Cc: edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, Wang Yufen

In tcp_bpf_send_verdict(), if msg has more data after
tcp_bpf_sendmsg_redir():

tcp_bpf_send_verdict()
 tosend = msg->sg.size  //msg->sg.size = 22220
 case __SK_REDIRECT:
  sk_msg_return()  //uncharge msg->sg.size (22220) to sk->sk_forward_alloc
  tcp_bpf_sendmsg_redir() //after tcp_bpf_sendmsg_redir, msg->sg.size=11000
 goto more_data;
 tosend = msg->sg.size  //msg->sg.size = 11000
 case __SK_REDIRECT:
  sk_msg_return()  //uncharged msg->sg.size(11000) to sk->sk_forward_alloc

The remaining msg->sg.size (11000) is thus uncharged twice. To fix this,
charge back the remaining msg->sg.size before the goto more_data.

This issue can trigger the following warnings:
WARNING: CPU: 0 PID: 9860 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
Call Trace:
 <TASK>
 inet_csk_destroy_sock+0x55/0x110
 __tcp_close+0x279/0x470
 tcp_close+0x1f/0x60
 inet_release+0x3f/0x80
 __sock_release+0x3d/0xb0
 sock_close+0x11/0x20
 __fput+0x92/0x250
 task_work_run+0x6a/0xa0
 do_exit+0x33b/0xb60
 do_group_exit+0x2f/0xa0
 get_signal+0xb6/0x950
 arch_do_signal_or_restart+0xac/0x2a0
 ? vfs_write+0x237/0x290
 exit_to_user_mode_prepare+0xa9/0x200
 syscall_exit_to_user_mode+0x12/0x30
 do_syscall_64+0x46/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xae
 </TASK>

WARNING: CPU: 0 PID: 2136 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
---
 net/ipv4/tcp_bpf.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 9b9b02052fd3..304800c60427 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -335,7 +335,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 			cork = true;
 			psock->cork = NULL;
 		}
-		sk_msg_return(sk, msg, tosend);
+		sk_msg_return(sk, msg, msg->sg.size);
 		release_sock(sk);
 
 		ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
@@ -375,8 +375,11 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 		}
 		if (msg &&
 		    msg->sg.data[msg->sg.start].page_link &&
-		    msg->sg.data[msg->sg.start].length)
+		    msg->sg.data[msg->sg.start].length) {
+			if (eval == __SK_REDIRECT)
+				sk_mem_charge(sk, msg->sg.size);
 			goto more_data;
+		}
 	}
 	return ret;
 }
-- 
2.25.1



* [PATCH bpf-next v3 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg
  2022-03-04  8:11 [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
                   ` (2 preceding siblings ...)
  2022-03-04  8:11 ` [PATCH bpf-next v3 3/4] bpf, sockmap: Fix more uncharged while msg has more_data Wang Yufen
@ 2022-03-04  8:11 ` Wang Yufen
  2022-03-11 21:51   ` John Fastabend
  2022-03-15 15:50 ` [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge patchwork-bot+netdevbpf
  2022-10-27 21:10 ` Jakub Sitnicki
  5 siblings, 1 reply; 11+ messages in thread
From: Wang Yufen @ 2022-03-04  8:11 UTC (permalink / raw)
  To: john.fastabend, daniel, jakub, lmb, davem, bpf
  Cc: edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, Wang Yufen

If tcp_bpf_sendmsg() is running during a tear-down operation, psock may be
freed.

tcp_bpf_sendmsg()
 tcp_bpf_send_verdict()
  sk_msg_return()
  tcp_bpf_sendmsg_redir()
   unlikely(!psock))
     sk_msg_free()

The mem of the msg has already been uncharged in tcp_bpf_send_verdict() by
sk_msg_return(), and would be uncharged again by sk_msg_free(). When psock
is NULL, we can simply return an error code; this then triggers
sk_msg_free_nocharge() in the error path of __SK_REDIRECT and has the side
effect of throwing an error up to user space. This is a slight change in
behavior on the user side, but it looks the same as an error from a failed
redirect on the socket.

This issue can trigger the following warning:
WARNING: CPU: 0 PID: 2136 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
Call Trace:
 <TASK>
 __sk_destruct+0x24/0x1f0
 sk_psock_destroy+0x19b/0x1c0
 process_one_work+0x1b3/0x3c0
 worker_thread+0x30/0x350
 ? process_one_work+0x3c0/0x3c0
 kthread+0xe6/0x110
 ? kthread_complete_and_exit+0x20/0x20
 ret_from_fork+0x22/0x30
 </TASK>

Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
Signed-off-by: Wang Yufen <wangyufen@huawei.com>
---
 net/ipv4/tcp_bpf.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 304800c60427..1cdcb4df0eb7 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -138,10 +138,9 @@ int tcp_bpf_sendmsg_redir(struct sock *sk, struct sk_msg *msg,
 	struct sk_psock *psock = sk_psock_get(sk);
 	int ret;
 
-	if (unlikely(!psock)) {
-		sk_msg_free(sk, msg);
-		return 0;
-	}
+	if (unlikely(!psock))
+		return -EPIPE;
+
 	ret = ingress ? bpf_tcp_ingress(sk, psock, msg, bytes, flags) :
 			tcp_bpf_push_locked(sk, msg, bytes, flags, false);
 	sk_psock_put(sk, psock);
-- 
2.25.1



* RE: [PATCH bpf-next v3 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg
  2022-03-04  8:11 ` [PATCH bpf-next v3 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg Wang Yufen
@ 2022-03-11 21:51   ` John Fastabend
  0 siblings, 0 replies; 11+ messages in thread
From: John Fastabend @ 2022-03-11 21:51 UTC (permalink / raw)
  To: Wang Yufen, john.fastabend, daniel, jakub, lmb, davem, bpf
  Cc: edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, Wang Yufen

Wang Yufen wrote:
> If tcp_bpf_sendmsg is running during a tear down operation, psock may be
> freed.
> 
> tcp_bpf_sendmsg()
>  tcp_bpf_send_verdict()
>   sk_msg_return()
>   tcp_bpf_sendmsg_redir()
>    unlikely(!psock))
>      sk_msg_free()
> 
> The mem of the msg has already been uncharged in tcp_bpf_send_verdict() by
> sk_msg_return(), and would be uncharged again by sk_msg_free(). When psock
> is NULL, we can simply return an error code; this then triggers
> sk_msg_free_nocharge() in the error path of __SK_REDIRECT and has the side
> effect of throwing an error up to user space. This is a slight change in
> behavior on the user side, but it looks the same as an error from a failed
> redirect on the socket.
> 
> This issue can cause the following info:
> WARNING: CPU: 0 PID: 2136 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
> Call Trace:
>  <TASK>
>  __sk_destruct+0x24/0x1f0
>  sk_psock_destroy+0x19b/0x1c0
>  process_one_work+0x1b3/0x3c0
>  worker_thread+0x30/0x350
>  ? process_one_work+0x3c0/0x3c0
>  kthread+0xe6/0x110
>  ? kthread_complete_and_exit+0x20/0x20
>  ret_from_fork+0x22/0x30
>  </TASK>
> 
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
> ---
>  net/ipv4/tcp_bpf.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)

Thanks John!

Acked-by: John Fastabend <john.fastabend@gmail.com>


* RE: [PATCH bpf-next v3 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
  2022-03-04  8:11 ` [PATCH bpf-next v3 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full Wang Yufen
@ 2022-03-11 21:55   ` John Fastabend
  0 siblings, 0 replies; 11+ messages in thread
From: John Fastabend @ 2022-03-11 21:55 UTC (permalink / raw)
  To: Wang Yufen, john.fastabend, daniel, jakub, lmb, davem, bpf
  Cc: edumazet, yoshfuji, dsahern, kuba, ast, andrii, kafai,
	songliubraving, yhs, kpsingh, netdev, Wang Yufen

Wang Yufen wrote:
> If tcp_bpf_sendmsg() is running while the sk msg is full and sk_msg_alloc()
> returns -ENOMEM, tcp_bpf_sendmsg() goes to wait_for_memory. If partial
> memory has been allocated by sk_msg_alloc(), i.e. msg_tx->sg.size is
> greater than osize after sk_msg_alloc(), a memleak occurs. To fix this, use
> sk_msg_trim() to release the allocated memory, then goto wait_for_memory.
> 
> Other call paths of sk_msg_alloc() have a similar issue, such as
> tls_sw_sendmsg(), so handle the sk_msg_trim() logic inside sk_msg_alloc(),
> as Cong Wang suggested.
> 
> This issue can cause the following info:
> WARNING: CPU: 3 PID: 7950 at net/core/stream.c:208 sk_stream_kill_queues+0xd4/0x1a0
> Call Trace:
>  <TASK>
>  inet_csk_destroy_sock+0x55/0x110
>  __tcp_close+0x279/0x470
>  tcp_close+0x1f/0x60
>  inet_release+0x3f/0x80
>  __sock_release+0x3d/0xb0
>  sock_close+0x11/0x20
>  __fput+0x92/0x250
>  task_work_run+0x6a/0xa0
>  do_exit+0x33b/0xb60
>  do_group_exit+0x2f/0xa0
>  get_signal+0xb6/0x950
>  arch_do_signal_or_restart+0xac/0x2a0
>  exit_to_user_mode_prepare+0xa9/0x200
>  syscall_exit_to_user_mode+0x12/0x30
>  do_syscall_64+0x46/0x80
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
>  </TASK>
> 
> WARNING: CPU: 3 PID: 2094 at net/ipv4/af_inet.c:155 inet_sock_destruct+0x13c/0x260
> Call Trace:
>  <TASK>
>  __sk_destruct+0x24/0x1f0
>  sk_psock_destroy+0x19b/0x1c0
>  process_one_work+0x1b3/0x3c0
>  kthread+0xe6/0x110
>  ret_from_fork+0x22/0x30
>  </TASK>
> 
> Fixes: 604326b41a6f ("bpf, sockmap: convert to generic sk_msg interface")
> Signed-off-by: Wang Yufen <wangyufen@huawei.com>
> Acked-by: John Fastabend <john.fastabend@gmail.com>
> ---

Still LGTM, thanks! Next time, drop the ack if you rewrite the patch
this much. Appreciate the fixes and the helpful commit messages, though.


* Re: [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge
  2022-03-04  8:11 [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
                   ` (3 preceding siblings ...)
  2022-03-04  8:11 ` [PATCH bpf-next v3 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg Wang Yufen
@ 2022-03-15 15:50 ` patchwork-bot+netdevbpf
  2022-10-27 21:10 ` Jakub Sitnicki
  5 siblings, 0 replies; 11+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-03-15 15:50 UTC (permalink / raw)
  To: wangyufen
  Cc: john.fastabend, daniel, jakub, lmb, davem, bpf, edumazet,
	yoshfuji, dsahern, kuba, ast, andrii, kafai, songliubraving, yhs,
	kpsingh, netdev

Hello:

This series was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Fri, 4 Mar 2022 16:11:41 +0800 you wrote:
> This patchset fixes memleaks and incorrect charge/uncharge memory, these
> issues cause the following info:
> 
> WARNING: CPU: 0 PID: 9202 at net/core/stream.c:205 sk_stream_kill_queues+0xc8/0xe0
> Call Trace:
>  <IRQ>
>  inet_csk_destroy_sock+0x55/0x110
>  tcp_rcv_state_process+0xe5f/0xe90
>  ? sk_filter_trim_cap+0x10d/0x230
>  ? tcp_v4_do_rcv+0x161/0x250
>  tcp_v4_do_rcv+0x161/0x250
>  tcp_v4_rcv+0xc3a/0xce0
>  ip_protocol_deliver_rcu+0x3d/0x230
>  ip_local_deliver_finish+0x54/0x60
>  ip_local_deliver+0xfd/0x110
>  ? ip_protocol_deliver_rcu+0x230/0x230
>  ip_rcv+0xd6/0x100
>  ? ip_local_deliver+0x110/0x110
>  __netif_receive_skb_one_core+0x85/0xa0
>  process_backlog+0xa4/0x160
>  __napi_poll+0x29/0x1b0
>  net_rx_action+0x287/0x300
>  __do_softirq+0xff/0x2fc
>  do_softirq+0x79/0x90
>  </IRQ>
> 
> [...]

Here is the summary with links:
  - [bpf-next,v3,1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg
    https://git.kernel.org/bpf/bpf-next/c/938d3480b92f
  - [bpf-next,v3,2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full
    https://git.kernel.org/bpf/bpf-next/c/9c34e38c4a87
  - [bpf-next,v3,3/4] bpf, sockmap: Fix more uncharged while msg has more_data
    https://git.kernel.org/bpf/bpf-next/c/84472b436e76
  - [bpf-next,v3,4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg
    https://git.kernel.org/bpf/bpf-next/c/2486ab434b2c

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge
  2022-03-04  8:11 [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
                   ` (4 preceding siblings ...)
  2022-03-15 15:50 ` [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge patchwork-bot+netdevbpf
@ 2022-10-27 21:10 ` Jakub Sitnicki
  2022-10-28  1:41   ` wangyufen
  2022-10-28  5:20   ` wangyufen
  5 siblings, 2 replies; 11+ messages in thread
From: Jakub Sitnicki @ 2022-10-27 21:10 UTC (permalink / raw)
  To: Wang Yufen, john.fastabend
  Cc: daniel, lmb, davem, bpf, edumazet, yoshfuji, dsahern, kuba, ast,
	andrii, kafai, songliubraving, yhs, kpsingh, netdev

Wang, John,

On Fri, Mar 04, 2022 at 04:11 PM +08, Wang Yufen wrote:
> This patchset fixes memleaks and incorrect charge/uncharge memory, these
> issues cause the following info:
>
> WARNING: CPU: 0 PID: 9202 at net/core/stream.c:205 sk_stream_kill_queues+0xc8/0xe0
> Call Trace:
>  <IRQ>
>  inet_csk_destroy_sock+0x55/0x110
>  tcp_rcv_state_process+0xe5f/0xe90
>  ? sk_filter_trim_cap+0x10d/0x230
>  ? tcp_v4_do_rcv+0x161/0x250
>  tcp_v4_do_rcv+0x161/0x250
>  tcp_v4_rcv+0xc3a/0xce0
>  ip_protocol_deliver_rcu+0x3d/0x230
>  ip_local_deliver_finish+0x54/0x60
>  ip_local_deliver+0xfd/0x110
>  ? ip_protocol_deliver_rcu+0x230/0x230
>  ip_rcv+0xd6/0x100
>  ? ip_local_deliver+0x110/0x110
>  __netif_receive_skb_one_core+0x85/0xa0
>  process_backlog+0xa4/0x160
>  __napi_poll+0x29/0x1b0
>  net_rx_action+0x287/0x300
>  __do_softirq+0xff/0x2fc
>  do_softirq+0x79/0x90
>  </IRQ>
>
> WARNING: CPU: 0 PID: 531 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x175/0x1b0
> Call Trace:
>  <TASK>
>  __sk_destruct+0x24/0x1f0
>  sk_psock_destroy+0x19b/0x1c0
>  process_one_work+0x1b3/0x3c0
>  ? process_one_work+0x3c0/0x3c0
>  worker_thread+0x30/0x350
>  ? process_one_work+0x3c0/0x3c0
>  kthread+0xe6/0x110
>  ? kthread_complete_and_exit+0x20/0x20
>  ret_from_fork+0x22/0x30
>  </TASK>
>
> Changes since v2:
> -Move sk_msg_trim() logic into sk_msg_alloc while -ENOMEM occurs, as
> Cong Wang suggested.
>
> Changes since v1:
> -Update the commit message of patch #2, the error path is from ENOMEM not
> the ENOSPC.
> -Simply returning an error code when psock is null, as John Fastabend
> suggested.

This is going to sound a bit weird, but I'm actually observing the
mentioned warnings with these patches applied, when running
`test_sockmap` selftests. Without the patch set - all is well.

Bisect went like so:

broken  [bpf-next] 31af1aa09fb9 ("Merge branch 'bpf: Fixes for kprobe multi on kernel modules'")
...
broken  2486ab434b2c ("bpf, sockmap: Fix double uncharge the mem of sk_msg")
broken  84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has more_data")
working 9c34e38c4a87 ("bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full")
working 938d3480b92f ("bpf, sockmap: Fix memleak in sk_psock_queue_msg")
working [base] d3b351f65bf4 ("selftests/bpf: Fix a clang compilation error for send_signal.c")

At 84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has
more_data") I'm seeing:

bash-5.1# uname -r
5.17.0-rc6-01987-g84472b436e76
bash-5.1# cd tools/testing/selftests/bpf/
bash-5.1# ./test_sockmap
# 1/ 6  sockmap::txmsg test passthrough:OK
# 2/ 6  sockmap::txmsg test redirect:OK
# 3/ 6  sockmap::txmsg test drop:OK
# 4/ 6  sockmap::txmsg test ingress redirect:OK
# 5/ 7  sockmap::txmsg test skb:OK
# 6/ 8  sockmap::txmsg test apply:OK
# 7/12  sockmap::txmsg test cork:OK
[   16.399282] ------------[ cut here ]------------
[   16.400465] WARNING: CPU: 2 PID: 197 at net/core/stream.c:205 sk_stream_kill_queues+0xd3/0xf0
[   16.402718] Modules linked in:
[   16.403504] CPU: 2 PID: 197 Comm: test_sockmap Not tainted 5.17.0-rc6-01987-g84472b436e76 #76
[   16.404276] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
[   16.404800] RIP: 0010:sk_stream_kill_queues+0xd3/0xf0
[   16.405110] Code: 29 5b 5d 31 c0 89 c2 89 c6 89 c7 c3 48 89 df e8 63 db fe ff 8b 83 78 02 00 00 8b b3 30 02 00 00 85 c0 74 d9 0f 0b 85 f6 74 d7 <0f> 0b 5b 5d 31 c0 89 c2 89 c6 89 c7 c3 0f 0b eb 92 66 66 2e 0f 1f
[   16.406226] RSP: 0018:ffffc90000423d48 EFLAGS: 00010206
[   16.406545] RAX: 0000000000000000 RBX: ffff888104208000 RCX: 0000000000000000
[   16.407117] RDX: 0000000000000000 RSI: 0000000000000fc0 RDI: ffff8881042081b8
[   16.407571] RBP: ffff8881042081b8 R08: 0000000000000000 R09: 0000000000000000
[   16.407995] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888104208000
[   16.408413] R13: ffff888102763000 R14: ffff888101b528e0 R15: ffff888104208130
[   16.408837] FS:  00007f3728652000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
[   16.409318] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.409666] CR2: 00007f3728651f78 CR3: 0000000100872005 CR4: 0000000000370ee0
[   16.410098] Call Trace:
[   16.410248]  <TASK>
[   16.410378]  inet_csk_destroy_sock+0x55/0x110
[   16.410647]  tcp_rcv_state_process+0xd28/0x1380
[   16.410934]  ? tcp_v4_do_rcv+0x77/0x2c0
[   16.411166]  tcp_v4_do_rcv+0x77/0x2c0
[   16.411388]  __release_sock+0x106/0x130
[   16.411626]  __tcp_close+0x1a7/0x4e0
[   16.411844]  tcp_close+0x20/0x70
[   16.412045]  inet_release+0x3c/0x80
[   16.412257]  __sock_release+0x3a/0xb0
[   16.412509]  sock_close+0x14/0x20
[   16.412770]  __fput+0xa3/0x260
[   16.413022]  task_work_run+0x59/0xb0
[   16.413286]  exit_to_user_mode_prepare+0x1b3/0x1c0
[   16.413665]  syscall_exit_to_user_mode+0x19/0x50
[   16.414020]  do_syscall_64+0x48/0x90
[   16.414285]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   16.414659] RIP: 0033:0x7f3728791eb7
[   16.414916] Code: ff e8 7d e2 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 43 cd f5 ff
[   16.416286] RSP: 002b:00007ffe6c91bb98 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[   16.416841] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f3728791eb7
[   16.417296] RDX: 0000000000000090 RSI: 00007ffe6c91baf0 RDI: 0000000000000019
[   16.417723] RBP: 00007ffe6c91bbe0 R08: 00007ffe6c91baf0 R09: 00007ffe6c91baf0
[   16.418178] R10: 00007ffe6c91baf0 R11: 0000000000000246 R12: 00007ffe6c91beb8
[   16.418669] R13: 000000000040de7f R14: 0000000000471de8 R15: 00007f37288ec000
[   16.419148]  </TASK>
[   16.419283] irq event stamp: 135687
[   16.419492] hardirqs last  enabled at (135695): [<ffffffff810ee6a2>] __up_console_sem+0x52/0x60
[   16.420065] hardirqs last disabled at (135712): [<ffffffff810ee687>] __up_console_sem+0x37/0x60
[   16.420606] softirqs last  enabled at (135710): [<ffffffff81078fe1>] irq_exit_rcu+0xe1/0x130
[   16.421201] softirqs last disabled at (135703): [<ffffffff81078fe1>] irq_exit_rcu+0xe1/0x130
[   16.421839] ---[ end trace 0000000000000000 ]---
[   16.422195] ------------[ cut here ]------------
[   16.422550] WARNING: CPU: 2 PID: 197 at net/ipv4/af_inet.c:154 inet_sock_destruct+0x198/0x1d0
[   16.423142] Modules linked in:
[   16.423327] CPU: 2 PID: 197 Comm: test_sockmap Tainted: G        W         5.17.0-rc6-01987-g84472b436e76 #76
[   16.423908] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1.fc35 04/01/2014
[   16.424436] RIP: 0010:inet_sock_destruct+0x198/0x1d0
[   16.424737] Code: ff 49 8b bc 24 68 02 00 00 e8 b4 65 e9 ff 49 8b bc 24 88 00 00 00 5b 41 5c e9 a4 65 e9 ff 41 8b 84 24 30 02 00 00 85 c0 74 ca <0f> 0b eb c6 4c 89 e7 e8 bc 41 e6 ff e9 4d ff ff ff 0f 0b 41 8b 84
[   16.425858] RSP: 0018:ffffc90000423e40 EFLAGS: 00010206
[   16.426172] RAX: 0000000000000fc0 RBX: ffff888104208160 RCX: 0000000000000000
[   16.426593] RDX: 0000000000000000 RSI: 0000000000000fc0 RDI: ffff888104208160
[   16.427024] RBP: ffff888104208000 R08: 0000000000000000 R09: 0000000000000000
[   16.427485] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888104208000
[   16.428016] R13: ffff8881001ce8e0 R14: ffff888103170c30 R15: 0000000000000000
[   16.428561] FS:  00007f3728652000(0000) GS:ffff88813bd00000(0000) knlGS:0000000000000000
[   16.429207] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.429638] CR2: 00007f3728651f78 CR3: 0000000100872005 CR4: 0000000000370ee0
[   16.430178] Call Trace:
[   16.430366]  <TASK>
[   16.430524]  __sk_destruct+0x23/0x220
[   16.430753]  inet_release+0x3c/0x80
[   16.430969]  __sock_release+0x3a/0xb0
[   16.431189]  sock_close+0x14/0x20
[   16.431388]  __fput+0xa3/0x260
[   16.431578]  task_work_run+0x59/0xb0
[   16.431793]  exit_to_user_mode_prepare+0x1b3/0x1c0
[   16.432085]  syscall_exit_to_user_mode+0x19/0x50
[   16.432359]  do_syscall_64+0x48/0x90
[   16.432578]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[   16.432877] RIP: 0033:0x7f3728791eb7
[   16.433099] Code: ff e8 7d e2 01 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 43 cd f5 ff
[   16.434190] RSP: 002b:00007ffe6c91bb98 EFLAGS: 00000246 ORIG_RAX: 0000000000000003
[   16.434637] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f3728791eb7
[   16.435058] RDX: 0000000000000090 RSI: 00007ffe6c91baf0 RDI: 0000000000000019
[   16.435476] RBP: 00007ffe6c91bbe0 R08: 00007ffe6c91baf0 R09: 00007ffe6c91baf0
[   16.435893] R10: 00007ffe6c91baf0 R11: 0000000000000246 R12: 00007ffe6c91beb8
[   16.436315] R13: 000000000040de7f R14: 0000000000471de8 R15: 00007f37288ec000
[   16.436741]  </TASK>
[   16.436876] irq event stamp: 136127
[   16.437089] hardirqs last  enabled at (136137): [<ffffffff810ee6a2>] __up_console_sem+0x52/0x60
[   16.437599] hardirqs last disabled at (136144): [<ffffffff810ee687>] __up_console_sem+0x37/0x60
[   16.438115] softirqs last  enabled at (135830): [<ffffffff81078fe1>] irq_exit_rcu+0xe1/0x130
[   16.438641] softirqs last disabled at (135817): [<ffffffff81078fe1>] irq_exit_rcu+0xe1/0x130
[   16.439277] ---[ end trace 0000000000000000 ]---
# 8/ 3  sockmap::txmsg test hanging corks:OK
# 9/11  sockmap::txmsg test push_data:OK
#10/17  sockmap::txmsg test pull-data:OK
#11/ 9  sockmap::txmsg test pop-data:OK
#12/ 1  sockmap::txmsg test push/pop data:OK
#13/ 1  sockmap::txmsg test ingress parser:OK
#14/ 1  sockmap::txmsg test ingress parser2:OK
#15/ 6 sockhash::txmsg test passthrough:OK
#16/ 6 sockhash::txmsg test redirect:OK
#17/ 6 sockhash::txmsg test drop:OK
#18/ 6 sockhash::txmsg test ingress redirect:OK
#19/ 7 sockhash::txmsg test skb:OK
#20/ 8 sockhash::txmsg test apply:OK
#21/12 sockhash::txmsg test cork:OK
#22/ 3 sockhash::txmsg test hanging corks:OK
#23/11 sockhash::txmsg test push_data:OK
#24/17 sockhash::txmsg test pull-data:OK
#25/ 9 sockhash::txmsg test pop-data:OK
#26/ 1 sockhash::txmsg test push/pop data:OK
#27/ 1 sockhash::txmsg test ingress parser:OK
#28/ 1 sockhash::txmsg test ingress parser2:OK
#29/ 6 sockhash:ktls:txmsg test passthrough:OK
#30/ 6 sockhash:ktls:txmsg test redirect:OK
#31/ 6 sockhash:ktls:txmsg test drop:OK
#32/ 6 sockhash:ktls:txmsg test ingress redirect:OK
#33/ 7 sockhash:ktls:txmsg test skb:OK
#34/ 8 sockhash:ktls:txmsg test apply:OK
#35/12 sockhash:ktls:txmsg test cork:OK
#36/ 3 sockhash:ktls:txmsg test hanging corks:OK
#37/11 sockhash:ktls:txmsg test push_data:OK
#38/17 sockhash:ktls:txmsg test pull-data:OK
#39/ 9 sockhash:ktls:txmsg test pop-data:OK
#40/ 1 sockhash:ktls:txmsg test push/pop data:OK
#41/ 1 sockhash:ktls:txmsg test ingress parser:OK
#42/ 0 sockhash:ktls:txmsg test ingress parser2:OK
Pass: 42 Fail: 0
bash-5.1#

No idea why yet. I will need to dig into what is happening.

Wang, can you please take a look as well?

Thanks,
Jakub


* Re: [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge
  2022-10-27 21:10 ` Jakub Sitnicki
@ 2022-10-28  1:41   ` wangyufen
  2022-10-28  5:20   ` wangyufen
  1 sibling, 0 replies; 11+ messages in thread
From: wangyufen @ 2022-10-28  1:41 UTC (permalink / raw)
  To: Jakub Sitnicki, john.fastabend
  Cc: daniel, lmb, davem, bpf, edumazet, yoshfuji, dsahern, kuba, ast,
	andrii, kafai, songliubraving, yhs, kpsingh, netdev


On 2022/10/28 5:10, Jakub Sitnicki wrote:
> Wang, John,
>
> On Fri, Mar 04, 2022 at 04:11 PM +08, Wang Yufen wrote:
>> [...patchset description snipped...]
> This is going to sound a bit weird, but I'm actually observing the
> mentioned warnings with these patches applied, when running
> `test_sockmap` selftests. Without the patch set - all is well.
>
> Bisect went like so:
>
> broken  [bpf-next] 31af1aa09fb9 ("Merge branch 'bpf: Fixes for kprobe multi on kernel modules'")
> ...
> broken  2486ab434b2c ("bpf, sockmap: Fix double uncharge the mem of sk_msg")
> broken  84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has more_data")
> working 9c34e38c4a87 ("bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full")
> working 938d3480b92f ("bpf, sockmap: Fix memleak in sk_psock_queue_msg")
> working [base] d3b351f65bf4 ("selftests/bpf: Fix a clang compilation error for send_signal.c")
>
> At 84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has
> more_data") I'm seeing:
>
> bash-5.1# uname -r
> 5.17.0-rc6-01987-g84472b436e76
> bash-5.1# cd tools/testing/selftests/bpf/
> bash-5.1# ./test_sockmap
> [... test_sockmap output and kernel warning traces snipped ...]
>
> No idea why yet. I will need to dig into what is happening.
>
> Wang, can you please take a look as well?


OK, I will check.

Thanks,
Wang

>
> Thanks,
> Jakub
>


* Re: [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge
  2022-10-27 21:10 ` Jakub Sitnicki
  2022-10-28  1:41   ` wangyufen
@ 2022-10-28  5:20   ` wangyufen
  1 sibling, 0 replies; 11+ messages in thread
From: wangyufen @ 2022-10-28  5:20 UTC (permalink / raw)
  To: Jakub Sitnicki, john.fastabend
  Cc: daniel, lmb, davem, bpf, edumazet, yoshfuji, dsahern, kuba, ast,
	andrii, kafai, songliubraving, yhs, kpsingh, netdev


On 2022/10/28 5:10, Jakub Sitnicki wrote:
> Wang, John,
>
> On Fri, Mar 04, 2022 at 04:11 PM +08, Wang Yufen wrote:
>> [...patchset description snipped...]
> This is going to sound a bit weird, but I'm actually observing the
> mentioned warnings with these patches applied, when running
> `test_sockmap` selftests. Without the patch set - all is well.
>
> Bisect went like so:
>
> broken  [bpf-next] 31af1aa09fb9 ("Merge branch 'bpf: Fixes for kprobe multi on kernel modules'")
> ...
> broken  2486ab434b2c ("bpf, sockmap: Fix double uncharge the mem of sk_msg")
> broken  84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has more_data")
> working 9c34e38c4a87 ("bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full")
> working 938d3480b92f ("bpf, sockmap: Fix memleak in sk_psock_queue_msg")
> working [base] d3b351f65bf4 ("selftests/bpf: Fix a clang compilation error for send_signal.c")
>
> At 84472b436e76 ("bpf, sockmap: Fix more uncharged while msg has
> more_data") I'm seeing:
>
> [... test_sockmap output and kernel warning traces snipped ...]
>
> No idea why yet. I will need to dig into what is happening.
>
> Wang, can you please take a look as well?

Sorry for the breakage. In commit 84472b436e76 ("bpf, sockmap: Fix more
uncharged while msg has more_data"), I bluntly replaced tosend with
msg->sg.size, which breaks the case where apply_bytes is set and smaller
than the data to send, i.e. the
if (msg->apply_bytes && msg->apply_bytes < send) path.

The following change should fix the issue:

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index a1626afe87a1..38d47357dd68 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -278,7 +278,7 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
  {
         bool cork = false, enospc = sk_msg_full(msg);
         struct sock *sk_redir;
-       u32 tosend, delta = 0;
+       u32 tosend, origsize, sent, delta = 0;
        u32 eval = __SK_NONE;
        int ret;
 
@@ -333,10 +333,12 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
                        cork = true;
                        psock->cork = NULL;
                }
-               sk_msg_return(sk, msg, msg->sg.size);
+               sk_msg_return(sk, msg, tosend);
                release_sock(sk);
 
+               origsize = msg->sg.size;
                ret = tcp_bpf_sendmsg_redir(sk_redir, msg, tosend, flags);
+               sent = origsize - msg->sg.size;
 
                if (eval == __SK_REDIRECT)
                        sock_put(sk_redir);
@@ -374,8 +376,8 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
                if (msg &&
                    msg->sg.data[msg->sg.start].page_link &&
                    msg->sg.data[msg->sg.start].length) {
-                       if (eval == __SK_REDIRECT)
-                               sk_mem_charge(sk, msg->sg.size);
+                       if (eval == __SK_REDIRECT && tosend > sent)
+                               sk_mem_charge(sk, tosend - sent);
                         goto more_data;
                 }
         }

>
> Thanks,
> Jakub
>


Thread overview: 11+ messages
2022-03-04  8:11 [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge Wang Yufen
2022-03-04  8:11 ` [PATCH bpf-next v3 1/4] bpf, sockmap: Fix memleak in sk_psock_queue_msg Wang Yufen
2022-03-04  8:11 ` [PATCH bpf-next v3 2/4] bpf, sockmap: Fix memleak in tcp_bpf_sendmsg while sk msg is full Wang Yufen
2022-03-11 21:55   ` John Fastabend
2022-03-04  8:11 ` [PATCH bpf-next v3 3/4] bpf, sockmap: Fix more uncharged while msg has more_data Wang Yufen
2022-03-04  8:11 ` [PATCH bpf-next v3 4/4] bpf, sockmap: Fix double uncharge the mem of sk_msg Wang Yufen
2022-03-11 21:51   ` John Fastabend
2022-03-15 15:50 ` [PATCH bpf-next v3 0/4] bpf, sockmap: Fix memleaks and issues of mem charge/uncharge patchwork-bot+netdevbpf
2022-10-27 21:10 ` Jakub Sitnicki
2022-10-28  1:41   ` wangyufen
2022-10-28  5:20   ` wangyufen
