netdev.vger.kernel.org archive mirror
* [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling for ingress redirection
@ 2024-10-17  0:57 zijianzhang
  2024-10-17  0:57 ` [PATCH bpf 1/2] tcp_bpf: charge receive socket buffer in bpf_tcp_ingress() zijianzhang
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: zijianzhang @ 2024-10-17  0:57 UTC (permalink / raw)
  To: bpf
  Cc: john.fastabend, jakub, davem, edumazet, kuba, pabeni, dsahern,
	netdev, cong.wang, zijianzhang

From: Zijian Zhang <zijianzhang@bytedance.com>

We should call sk_rmem_schedule() instead of sk_wmem_schedule() in
bpf_tcp_ingress(). We also need to update sk_rmem_alloc accordingly to
account for the receive memory.

Cong Wang (1):
  tcp_bpf: charge receive socket buffer in bpf_tcp_ingress()

Zijian Zhang (1):
  tcp_bpf: add sk_rmem_alloc related logic for ingress redirection

 include/linux/skmsg.h | 11 ++++++++---
 include/net/sock.h    | 10 ++++++++--
 net/core/skmsg.c      |  6 +++++-
 net/ipv4/tcp_bpf.c    |  6 ++++--
 4 files changed, 25 insertions(+), 8 deletions(-)

-- 
2.20.1



* [PATCH bpf 1/2] tcp_bpf: charge receive socket buffer in bpf_tcp_ingress()
  2024-10-17  0:57 [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling for ingress redirection zijianzhang
@ 2024-10-17  0:57 ` zijianzhang
  2024-10-17  0:57 ` [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for ingress redirection zijianzhang
  2024-11-08  3:58 ` [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling " Cong Wang
  2 siblings, 0 replies; 7+ messages in thread
From: zijianzhang @ 2024-10-17  0:57 UTC (permalink / raw)
  To: bpf
  Cc: john.fastabend, jakub, davem, edumazet, kuba, pabeni, dsahern,
	netdev, cong.wang, zijianzhang

From: Cong Wang <cong.wang@bytedance.com>

When bpf_tcp_ingress() is called, the skmsg is being redirected to the
ingress of the destination socket. Therefore, we should charge its
receive socket buffer rather than the send socket buffer.

Because sk_rmem_schedule() tests the pfmemalloc flag of the skb,
introduce a wrapper that can be called for an skmsg.

Signed-off-by: Cong Wang <cong.wang@bytedance.com>
---
 include/net/sock.h | 10 ++++++++--
 net/ipv4/tcp_bpf.c |  2 +-
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index c58ca8dd561b..4e796b1a92d2 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1519,7 +1519,7 @@ static inline bool sk_wmem_schedule(struct sock *sk, int size)
 }
 
 static inline bool
-sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
+__sk_rmem_schedule(struct sock *sk, int size, bool pfmemalloc)
 {
 	int delta;
 
@@ -1527,7 +1527,13 @@ sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
 		return true;
 	delta = size - sk->sk_forward_alloc;
 	return delta <= 0 || __sk_mem_schedule(sk, delta, SK_MEM_RECV) ||
-		skb_pfmemalloc(skb);
+	       pfmemalloc;
+}
+
+static inline bool
+sk_rmem_schedule(struct sock *sk, struct sk_buff *skb, int size)
+{
+	return __sk_rmem_schedule(sk, size, skb_pfmemalloc(skb));
 }
 
 static inline int sk_unused_reserved_mem(const struct sock *sk)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index e7658c5d6b79..48c412744f77 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -49,7 +49,7 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 		sge = sk_msg_elem(msg, i);
 		size = (apply && apply_bytes < sge->length) ?
 			apply_bytes : sge->length;
-		if (!sk_wmem_schedule(sk, size)) {
+		if (!__sk_rmem_schedule(sk, size, false)) {
 			if (!copied)
 				ret = -ENOMEM;
 			break;
-- 
2.20.1



* [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for ingress redirection
  2024-10-17  0:57 [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling for ingress redirection zijianzhang
  2024-10-17  0:57 ` [PATCH bpf 1/2] tcp_bpf: charge receive socket buffer in bpf_tcp_ingress() zijianzhang
@ 2024-10-17  0:57 ` zijianzhang
  2024-11-08  4:04   ` Cong Wang
  2024-11-08  3:58 ` [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling " Cong Wang
  2 siblings, 1 reply; 7+ messages in thread
From: zijianzhang @ 2024-10-17  0:57 UTC (permalink / raw)
  To: bpf
  Cc: john.fastabend, jakub, davem, edumazet, kuba, pabeni, dsahern,
	netdev, cong.wang, zijianzhang

From: Zijian Zhang <zijianzhang@bytedance.com>

Although bpf_tcp_ingress() calls sk_rmem_schedule() and adds the sk_msg
to the ingress_msg queue of sk_redir, it does not update sk_rmem_alloc.
As a result, apart from the global memory limit, the receive memory of
sk_redir is effectively unlimited.

Thus, add the sk_rmem_alloc accounting needed to bound the receive buffer.

Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
---
 include/linux/skmsg.h | 11 ++++++++---
 net/core/skmsg.c      |  6 +++++-
 net/ipv4/tcp_bpf.c    |  4 +++-
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
index d9b03e0746e7..2cbe0c22a32f 100644
--- a/include/linux/skmsg.h
+++ b/include/linux/skmsg.h
@@ -317,17 +317,22 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
 	kfree_skb(skb);
 }
 
-static inline void sk_psock_queue_msg(struct sk_psock *psock,
+static inline bool sk_psock_queue_msg(struct sk_psock *psock,
 				      struct sk_msg *msg)
 {
+	bool ret;
+
 	spin_lock_bh(&psock->ingress_lock);
-	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
+	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
 		list_add_tail(&msg->list, &psock->ingress_msg);
-	else {
+		ret = true;
+	} else {
 		sk_msg_free(psock->sk, msg);
 		kfree(msg);
+		ret = false;
 	}
 	spin_unlock_bh(&psock->ingress_lock);
+	return ret;
 }
 
 static inline struct sk_msg *sk_psock_dequeue_msg(struct sk_psock *psock)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index b1dcbd3be89e..110ee0abcfe0 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -445,8 +445,10 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
 			if (likely(!peek)) {
 				sge->offset += copy;
 				sge->length -= copy;
-				if (!msg_rx->skb)
+				if (!msg_rx->skb) {
 					sk_mem_uncharge(sk, copy);
+					atomic_sub(copy, &sk->sk_rmem_alloc);
+				}
 				msg_rx->sg.size -= copy;
 
 				if (!sge->length) {
@@ -772,6 +774,8 @@ static void __sk_psock_purge_ingress_msg(struct sk_psock *psock)
 
 	list_for_each_entry_safe(msg, tmp, &psock->ingress_msg, list) {
 		list_del(&msg->list);
+		if (!msg->skb)
+			atomic_sub(msg->sg.size, &psock->sk->sk_rmem_alloc);
 		sk_msg_free(psock->sk, msg);
 		kfree(msg);
 	}
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 48c412744f77..39155bec746f 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -56,6 +56,7 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 		}
 
 		sk_mem_charge(sk, size);
+		atomic_add(size, &sk->sk_rmem_alloc);
 		sk_msg_xfer(tmp, msg, i, size);
 		copied += size;
 		if (sge->length)
@@ -74,7 +75,8 @@ static int bpf_tcp_ingress(struct sock *sk, struct sk_psock *psock,
 
 	if (!ret) {
 		msg->sg.start = i;
-		sk_psock_queue_msg(psock, tmp);
+		if (!sk_psock_queue_msg(psock, tmp))
+			atomic_sub(copied, &sk->sk_rmem_alloc);
 		sk_psock_data_ready(sk, psock);
 	} else {
 		sk_msg_free(sk, tmp);
-- 
2.20.1



* Re: [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling for ingress redirection
  2024-10-17  0:57 [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling for ingress redirection zijianzhang
  2024-10-17  0:57 ` [PATCH bpf 1/2] tcp_bpf: charge receive socket buffer in bpf_tcp_ingress() zijianzhang
  2024-10-17  0:57 ` [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for ingress redirection zijianzhang
@ 2024-11-08  3:58 ` Cong Wang
  2 siblings, 0 replies; 7+ messages in thread
From: Cong Wang @ 2024-11-08  3:58 UTC (permalink / raw)
  To: zijianzhang
  Cc: bpf, john.fastabend, jakub, davem, edumazet, kuba, pabeni,
	dsahern, netdev, cong.wang

On Thu, Oct 17, 2024 at 12:57:40AM +0000, zijianzhang@bytedance.com wrote:
> From: Zijian Zhang <zijianzhang@bytedance.com>
> 
> We should do sk_rmem_schedule instead of sk_wmem_schedule in function
> bpf_tcp_ingress. We also need to update sk_rmem_alloc accordingly to
> account for the rmem.
> 

Is it possible to add a test case for this? I think it would make the
change easier to validate and help convince people to accept it.

Thanks!


* Re: [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for ingress redirection
  2024-10-17  0:57 ` [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for ingress redirection zijianzhang
@ 2024-11-08  4:04   ` Cong Wang
  2024-11-08 18:28     ` [External] " Zijian Zhang
  0 siblings, 1 reply; 7+ messages in thread
From: Cong Wang @ 2024-11-08  4:04 UTC (permalink / raw)
  To: zijianzhang
  Cc: bpf, john.fastabend, jakub, davem, edumazet, kuba, pabeni,
	dsahern, netdev, cong.wang

On Thu, Oct 17, 2024 at 12:57:42AM +0000, zijianzhang@bytedance.com wrote:
> From: Zijian Zhang <zijianzhang@bytedance.com>
> 
> Although we sk_rmem_schedule and add sk_msg to the ingress_msg of sk_redir
> in bpf_tcp_ingress, we do not update sk_rmem_alloc. As a result, except
> for the global memory limit, the rmem of sk_redir is nearly unlimited.
> 
> Thus, add sk_rmem_alloc related logic to limit the recv buffer.
> 
> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
> ---
>  include/linux/skmsg.h | 11 ++++++++---
>  net/core/skmsg.c      |  6 +++++-
>  net/ipv4/tcp_bpf.c    |  4 +++-
>  3 files changed, 16 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> index d9b03e0746e7..2cbe0c22a32f 100644
> --- a/include/linux/skmsg.h
> +++ b/include/linux/skmsg.h
> @@ -317,17 +317,22 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
>  	kfree_skb(skb);
>  }
>  
> -static inline void sk_psock_queue_msg(struct sk_psock *psock,
> +static inline bool sk_psock_queue_msg(struct sk_psock *psock,
>  				      struct sk_msg *msg)
>  {
> +	bool ret;
> +
>  	spin_lock_bh(&psock->ingress_lock);
> -	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
> +	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
>  		list_add_tail(&msg->list, &psock->ingress_msg);
> -	else {
> +		ret = true;
> +	} else {
>  		sk_msg_free(psock->sk, msg);
>  		kfree(msg);
> +		ret = false;
>  	}
>  	spin_unlock_bh(&psock->ingress_lock);
> +	return ret;
>  }
>  
>  static inline struct sk_msg *sk_psock_dequeue_msg(struct sk_psock *psock)
> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> index b1dcbd3be89e..110ee0abcfe0 100644
> --- a/net/core/skmsg.c
> +++ b/net/core/skmsg.c
> @@ -445,8 +445,10 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
>  			if (likely(!peek)) {
>  				sge->offset += copy;
>  				sge->length -= copy;
> -				if (!msg_rx->skb)
> +				if (!msg_rx->skb) {
>  					sk_mem_uncharge(sk, copy);
> +					atomic_sub(copy, &sk->sk_rmem_alloc);
> +				}
>  				msg_rx->sg.size -= copy;
>  
>  				if (!sge->length) {
> @@ -772,6 +774,8 @@ static void __sk_psock_purge_ingress_msg(struct sk_psock *psock)
>  
>  	list_for_each_entry_safe(msg, tmp, &psock->ingress_msg, list) {
>  		list_del(&msg->list);
> +		if (!msg->skb)
> +			atomic_sub(msg->sg.size, &psock->sk->sk_rmem_alloc);
>  		sk_msg_free(psock->sk, msg);

Why not calling this atomic_sub() in sk_msg_free_elem()?

Thanks.


* Re: [External] Re: [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for ingress redirection
  2024-11-08  4:04   ` Cong Wang
@ 2024-11-08 18:28     ` Zijian Zhang
  2024-11-21  6:07       ` John Fastabend
  0 siblings, 1 reply; 7+ messages in thread
From: Zijian Zhang @ 2024-11-08 18:28 UTC (permalink / raw)
  To: Cong Wang
  Cc: bpf, john.fastabend, jakub, davem, edumazet, kuba, pabeni,
	dsahern, netdev, cong.wang


On 11/7/24 8:04 PM, Cong Wang wrote:
> On Thu, Oct 17, 2024 at 12:57:42AM +0000, zijianzhang@bytedance.com wrote:
>> From: Zijian Zhang <zijianzhang@bytedance.com>
>>
>> Although we sk_rmem_schedule and add sk_msg to the ingress_msg of sk_redir
>> in bpf_tcp_ingress, we do not update sk_rmem_alloc. As a result, except
>> for the global memory limit, the rmem of sk_redir is nearly unlimited.
>>
>> Thus, add sk_rmem_alloc related logic to limit the recv buffer.
>>
>> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
>> ---
>>   include/linux/skmsg.h | 11 ++++++++---
>>   net/core/skmsg.c      |  6 +++++-
>>   net/ipv4/tcp_bpf.c    |  4 +++-
>>   3 files changed, 16 insertions(+), 5 deletions(-)
>>
>> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
>> index d9b03e0746e7..2cbe0c22a32f 100644
>> --- a/include/linux/skmsg.h
>> +++ b/include/linux/skmsg.h
>> @@ -317,17 +317,22 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
>>   	kfree_skb(skb);
>>   }
>>   
>> -static inline void sk_psock_queue_msg(struct sk_psock *psock,
>> +static inline bool sk_psock_queue_msg(struct sk_psock *psock,
>>   				      struct sk_msg *msg)
>>   {
>> +	bool ret;
>> +
>>   	spin_lock_bh(&psock->ingress_lock);
>> -	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
>> +	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
>>   		list_add_tail(&msg->list, &psock->ingress_msg);
>> -	else {
>> +		ret = true;
>> +	} else {
>>   		sk_msg_free(psock->sk, msg);
>>   		kfree(msg);
>> +		ret = false;
>>   	}
>>   	spin_unlock_bh(&psock->ingress_lock);
>> +	return ret;
>>   }
>>   
>>   static inline struct sk_msg *sk_psock_dequeue_msg(struct sk_psock *psock)
>> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
>> index b1dcbd3be89e..110ee0abcfe0 100644
>> --- a/net/core/skmsg.c
>> +++ b/net/core/skmsg.c
>> @@ -445,8 +445,10 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
>>   			if (likely(!peek)) {
>>   				sge->offset += copy;
>>   				sge->length -= copy;
>> -				if (!msg_rx->skb)
>> +				if (!msg_rx->skb) {
>>   					sk_mem_uncharge(sk, copy);
>> +					atomic_sub(copy, &sk->sk_rmem_alloc);
>> +				}
>>   				msg_rx->sg.size -= copy;
>>   
>>   				if (!sge->length) {
>> @@ -772,6 +774,8 @@ static void __sk_psock_purge_ingress_msg(struct sk_psock *psock)
>>   
>>   	list_for_each_entry_safe(msg, tmp, &psock->ingress_msg, list) {
>>   		list_del(&msg->list);
>> +		if (!msg->skb)
>> +			atomic_sub(msg->sg.size, &psock->sk->sk_rmem_alloc);
>>   		sk_msg_free(psock->sk, msg);
> 
> Why not calling this atomic_sub() in sk_msg_free_elem()?
> 
> Thanks.

sk_msg_free_elem(), called by sk_msg_free() or sk_msg_free_no_charge(),
is invoked in multiple locations, including the TX, RX, and error paths.

We should call atomic_sub() on sk->sk_rmem_alloc only for sk_msgs that
were atomic_add()ed before; in other words, only for sk_msgs on
ingress_msg.

As for the "!msg->skb" check here, it makes sure the sk_msg does not
come from sk_psock_skb_ingress_enqueue(), because those sk_msgs' rmem
accounting has already been handled by skb_set_owner_r() in
sk_psock_skb_ingress().




* Re: [External] Re: [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for ingress redirection
  2024-11-08 18:28     ` [External] " Zijian Zhang
@ 2024-11-21  6:07       ` John Fastabend
  0 siblings, 0 replies; 7+ messages in thread
From: John Fastabend @ 2024-11-21  6:07 UTC (permalink / raw)
  To: Zijian Zhang, Cong Wang
  Cc: bpf, john.fastabend, jakub, davem, edumazet, kuba, pabeni,
	dsahern, netdev, cong.wang

Zijian Zhang wrote:
> 
> On 11/7/24 8:04 PM, Cong Wang wrote:
> > On Thu, Oct 17, 2024 at 12:57:42AM +0000, zijianzhang@bytedance.com wrote:
> >> From: Zijian Zhang <zijianzhang@bytedance.com>
> >>
> >> Although we sk_rmem_schedule and add sk_msg to the ingress_msg of sk_redir
> >> in bpf_tcp_ingress, we do not update sk_rmem_alloc. As a result, except
> >> for the global memory limit, the rmem of sk_redir is nearly unlimited.
> >>
> >> Thus, add sk_rmem_alloc related logic to limit the recv buffer.
> >>
> >> Signed-off-by: Zijian Zhang <zijianzhang@bytedance.com>
> >> ---
> >>   include/linux/skmsg.h | 11 ++++++++---
> >>   net/core/skmsg.c      |  6 +++++-
> >>   net/ipv4/tcp_bpf.c    |  4 +++-
> >>   3 files changed, 16 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/include/linux/skmsg.h b/include/linux/skmsg.h
> >> index d9b03e0746e7..2cbe0c22a32f 100644
> >> --- a/include/linux/skmsg.h
> >> +++ b/include/linux/skmsg.h
> >> @@ -317,17 +317,22 @@ static inline void sock_drop(struct sock *sk, struct sk_buff *skb)
> >>   	kfree_skb(skb);
> >>   }
> >>   
> >> -static inline void sk_psock_queue_msg(struct sk_psock *psock,
> >> +static inline bool sk_psock_queue_msg(struct sk_psock *psock,
> >>   				      struct sk_msg *msg)
> >>   {
> >> +	bool ret;
> >> +
> >>   	spin_lock_bh(&psock->ingress_lock);
> >> -	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED))
> >> +	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
> >>   		list_add_tail(&msg->list, &psock->ingress_msg);
> >> -	else {
> >> +		ret = true;
> >> +	} else {
> >>   		sk_msg_free(psock->sk, msg);
> >>   		kfree(msg);
> >> +		ret = false;
> >>   	}
> >>   	spin_unlock_bh(&psock->ingress_lock);
> >> +	return ret;
> >>   }
> >>   
> >>   static inline struct sk_msg *sk_psock_dequeue_msg(struct sk_psock *psock)
> >> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
> >> index b1dcbd3be89e..110ee0abcfe0 100644
> >> --- a/net/core/skmsg.c
> >> +++ b/net/core/skmsg.c
> >> @@ -445,8 +445,10 @@ int sk_msg_recvmsg(struct sock *sk, struct sk_psock *psock, struct msghdr *msg,
> >>   			if (likely(!peek)) {
> >>   				sge->offset += copy;
> >>   				sge->length -= copy;
> >> -				if (!msg_rx->skb)
> >> +				if (!msg_rx->skb) {
> >>   					sk_mem_uncharge(sk, copy);
> >> +					atomic_sub(copy, &sk->sk_rmem_alloc);
> >> +				}
> >>   				msg_rx->sg.size -= copy;
> >>   
> >>   				if (!sge->length) {
> >> @@ -772,6 +774,8 @@ static void __sk_psock_purge_ingress_msg(struct sk_psock *psock)
> >>   
> >>   	list_for_each_entry_safe(msg, tmp, &psock->ingress_msg, list) {
> >>   		list_del(&msg->list);
> >> +		if (!msg->skb)
> >> +			atomic_sub(msg->sg.size, &psock->sk->sk_rmem_alloc);
> >>   		sk_msg_free(psock->sk, msg);
> > 
> > Why not calling this atomic_sub() in sk_msg_free_elem()?
> > 
> > Thanks.
> 
> sk_msg_free_elem called by sk_msg_free or sk_msg_free_no_charge will
> be invoked in multiple locations including TX/RX/Error and etc.
> 
> We should call atomic_sub(&sk->sk_rmem_alloc) for sk_msgs that have
> been atomic_add before. In other words, we need to call atomic_sub
> only for sk_msgs in ingress_msg.
> 
> As for "!msg->skb" check here, I want to make sure the sk_msg is not
> from function sk_psock_skb_ingress_enqueue, because these sk_msgs'
> rmem accounting has already handled by skb_set_owner_r in function
> sk_psock_skb_ingress.
> 

Assuming I read the above correctly, this is only an issue when
redirecting to the ingress of a socket? The other path, where we do a
SK_PASS directly from the verdict to hit the ingress of the current
socket, is OK because all the accounting is already done through the
skb. Basically, that is what the explanation of !msg->skb above is
describing?

Can we add this to the description so we don't forget it and/or have
to look it up on the mailing list in the future.

Thanks,
John


end of thread, other threads:[~2024-11-21  6:07 UTC | newest]

Thread overview: 7+ messages
2024-10-17  0:57 [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling for ingress redirection zijianzhang
2024-10-17  0:57 ` [PATCH bpf 1/2] tcp_bpf: charge receive socket buffer in bpf_tcp_ingress() zijianzhang
2024-10-17  0:57 ` [PATCH bpf 2/2] tcp_bpf: add sk_rmem_alloc related logic for ingress redirection zijianzhang
2024-11-08  4:04   ` Cong Wang
2024-11-08 18:28     ` [External] " Zijian Zhang
2024-11-21  6:07       ` John Fastabend
2024-11-08  3:58 ` [PATCH bpf 0/2] tcp_bpf: update the rmem scheduling " Cong Wang
