public inbox for bpf@vger.kernel.org
 help / color / mirror / Atom feed
* [issue]: sockmap restrain send if receiver block
@ 2024-06-17  2:01 郑国勇
  2024-06-17 17:07 ` John Fastabend
  0 siblings, 1 reply; 6+ messages in thread
From: 郑国勇 @ 2024-06-17  2:01 UTC (permalink / raw)
  To: john.fastabend, jakub, bpf

hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive.

sender can always send msg but not aware the receiver psock ingress_msg queue size.  In mortally case, when receiver not receive again due to the application bug, 

sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system.

my question is:  is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block.

thanks very much.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: [issue]: sockmap restrain send if receiver block
  2024-06-17  2:01 [issue]: sockmap restrain send if receiver block 郑国勇
@ 2024-06-17 17:07 ` John Fastabend
  2024-06-18  7:47   ` zhengguoyong
  2024-06-18  8:08   ` zhengguoyong
  0 siblings, 2 replies; 6+ messages in thread
From: John Fastabend @ 2024-06-17 17:07 UTC (permalink / raw)
  To: 郑国勇, john.fastabend, jakub, bpf

郑国勇 wrote:
> hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive.
> 

Whats the protocol? The TCP case tcp_bpf_sendmsg() is checking
sk_stream_memory_free() and will do sk_stream_wait_memory() if under
memory pressure. This should handle sending case with lots of data
queued up on the sk.

On the redirect ingress case we do this,

  sk_psock_handle_skb()
    sk_psock_skb_ingress()
      sk_psock_create_ingress_msg()

There sk_psock_create_ingress_msg() should check the rcvbuf of the
receiving socket and shouldn't create a msg if its under memory pressure.
If its an ingress_self case we do a skb_set_owner_r which should (?) push
back on the memory side through sk_mem_charge().

Seems like I'm missing some case then if we are hitting this. What protocol
and what is the BPF program? Is it a sender redirect? I guess more details
might make it obvious to me.


> sender can always send msg but not aware the receiver psock ingress_msg queue size.  In mortally case, when receiver not receive again due to the application bug, 
> 
> sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system.
> 
> my question is:  is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block.

The sender shouldn't be able to have more outstanding data than the
socket memory allows. After the redirect the skb/msg should be
charged to the receiving socket though. Agree sk_sendbuf should
limit sender. Maybe the test is not TCP protocol and we missed
adding the limits to UDP/AF_UNIX/etc?

> 
> thanks very much.
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [issue]: sockmap restrain send if receiver block
  2024-06-17 17:07 ` John Fastabend
@ 2024-06-18  7:47   ` zhengguoyong
  2024-06-18  8:08   ` zhengguoyong
  1 sibling, 0 replies; 6+ messages in thread
From: zhengguoyong @ 2024-06-18  7:47 UTC (permalink / raw)
  To: 【外部账号】 John Fastabend,
	jakub, bpf

thanks for reply.

i mean the sk_msg with TCP protocol.  in this case, sender  use sk_stream_memory_free()

to check if memory is free. and in __sk_stream_memory_free(), if sk->sk_wmem_queued

is bigger then sk->sk_sndbuf or sk notsent_bytes(tp->write_seq - tp->snd_nxt) is too

bigger then __sk_stream_memory_free() will return false and do sk_stream_wait_memory().

but in sk_msg mode, tcp_bpf_sendmsg() whill not create skb structure and not use

seq to recording sending info,so sk->sk_wmem_queued is not changed in

tcp_bpf_sendmsg() path, and __sk_stream_memory_free() will always return true.

In bpf_tcp_ingress() it will copy the sender msg and charge it, and in tcp_bpf_recvmsg(),

it will uncharge the msg after sk_msg_recvmsg() receive it from psock ingress_msg queue,

if receiver is not to read again due to application bug, but sender continuous send,

then the receiver psock ingress_msg queue will continuous increase and cannot be uncharge

until tcp socket memory is not enough in the fllowing path.

tcp_bpf_sendmsg

tcp_bpf_send_verdict

tcp_bpf_sendmsg_redir

bpf_tcp_ingress

sk_wmem_schedule

so if a sk_msg type sockmap receiver is block, then it may consume all the tcp socket memory

and influence other tcp stream,  can we have a better way to limit it ?  

just like tcp use sk_sendbuf to limit per tcp stream?


thanks.


在 2024/6/18 1:07, 【外部账号】 John Fastabend 写道:
> 郑国勇 wrote:
>> hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive.
>>
> Whats the protocol? The TCP case tcp_bpf_sendmsg() is checking
> sk_stream_memory_free() and will do sk_stream_wait_memory() if under
> memory pressure. This should handle sending case with lots of data
> queued up on the sk.
>
> On the redirect ingress case we do this,
>
>   sk_psock_handle_skb()
>     sk_psock_skb_ingress()
>       sk_psock_create_ingress_msg()
>
> There sk_psock_create_ingress_msg() should check the rcvbuf of the
> receiving socket and shouldn't create a msg if its under memory pressure.
> If its an ingress_self case we do a skb_set_owner_r which should (?) push
> back on the memory side through sk_mem_charge().
>
> Seems like I'm missing some case then if we are hitting this. What protocol
> and what is the BPF program? Is it a sender redirect? I guess more details
> might make it obvious to me.
>
>
>> sender can always send msg but not aware the receiver psock ingress_msg queue size.  In mortally case, when receiver not receive again due to the application bug, 
>>
>> sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system.
>>
>> my question is:  is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block. 
> The sender shouldn't be able to have more outstanding data than the
> socket memory allows. After the redirect the skb/msg should be
> charged to the receiving socket though. Agree sk_sendbuf should
> limit sender. Maybe the test is not TCP protocol and we missed
> adding the limits to UDP/AF_UNIX/etc?
>
>> thanks very much.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [issue]: sockmap restrain send if receiver block
  2024-06-17 17:07 ` John Fastabend
  2024-06-18  7:47   ` zhengguoyong
@ 2024-06-18  8:08   ` zhengguoyong
  2024-06-21  6:44     ` zhengguoyong
  2024-06-25 19:51     ` John Fastabend
  1 sibling, 2 replies; 6+ messages in thread
From: zhengguoyong @ 2024-06-18  8:08 UTC (permalink / raw)
  To: 【外部账号】 John Fastabend,
	jakub, bpf

thanks for reply.

i mean the sk_msg with TCP protocol. in this case, sender use sk_stream_memory_free()
to check if memory is free. and in __sk_stream_memory_free(), if
sk->sk_wmem_queued is bigger then sk->sk_sndbuf or sk
notsent_bytes(tp->write_seq - tp->snd_nxt) is too bigger then
__sk_stream_memory_free() will return false and do sk_stream_wait_memory().

but in sk_msg mode, tcp_bpf_sendmsg() will not create skb structure and not use seq to
recording sending info,so sk->sk_wmem_queued is not changed in tcp_bpf_sendmsg() path,
and __sk_stream_memory_free() will always return true.

in bpf_tcp_ingress() will copy the sender msg and charge it, and in
tcp_bpf_recvmsg(), it will uncharge the msg after sk_msg_recvmsg()
receive it from psock ingress_msg queue, and if receiver is not to read again
due to application bug, and sender continuous send, then the receiver
psock ingress_msg queue will continuous increase and cannot be uncharged
until tcp socket memory is not enough in the fllowing path.

    tcp_bpf_sendmsg
        tcp_bpf_send_verdict
            tcp_bpf_sendmsg_redir
                bpf_tcp_ingress
                    sk_wmem_schedule

so if a sk_msg type sockmap receiver is block, then it may consume all the
tcp socket memory and influence other tcp stream,
can we limit per sockmap tcp stream link sk->sk_sndbuf ?

thanks.

在 2024/6/18 1:07, 【外部账号】 John Fastabend 写道:
> 郑国勇 wrote:
>> hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive.
>>
> Whats the protocol? The TCP case tcp_bpf_sendmsg() is checking
> sk_stream_memory_free() and will do sk_stream_wait_memory() if under
> memory pressure. This should handle sending case with lots of data
> queued up on the sk.
>
> On the redirect ingress case we do this,
>
>   sk_psock_handle_skb()
>     sk_psock_skb_ingress()
>       sk_psock_create_ingress_msg()
>
> There sk_psock_create_ingress_msg() should check the rcvbuf of the
> receiving socket and shouldn't create a msg if its under memory pressure.
> If its an ingress_self case we do a skb_set_owner_r which should (?) push
> back on the memory side through sk_mem_charge().
>
> Seems like I'm missing some case then if we are hitting this. What protocol
> and what is the BPF program? Is it a sender redirect? I guess more details
> might make it obvious to me.
>
>
>> sender can always send msg but not aware the receiver psock ingress_msg queue size.  In mortally case, when receiver not receive again due to the application bug, 
>>
>> sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system.
>>
>> my question is:  is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block.
> The sender shouldn't be able to have more outstanding data than the
> socket memory allows. After the redirect the skb/msg should be
> charged to the receiving socket though. Agree sk_sendbuf should
> limit sender. Maybe the test is not TCP protocol and we missed
> adding the limits to UDP/AF_UNIX/etc?
>
>> thanks very much.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [issue]: sockmap restrain send if receiver block
  2024-06-18  8:08   ` zhengguoyong
@ 2024-06-21  6:44     ` zhengguoyong
  2024-06-25 19:51     ` John Fastabend
  1 sibling, 0 replies; 6+ messages in thread
From: zhengguoyong @ 2024-06-21  6:44 UTC (permalink / raw)
  To: 【外部账号】 John Fastabend,
	jakub, bpf

is it too heavy use tcp socket memory(defined in /proc/sys/net/ipv4/tcp_mem)
to control sender not send too much data when we run sk_msg redirect sockmap
whit TCP protocol?

look forward for your reply.

thanks  so much.

在 2024/6/18 16:08, zhengguoyong 写道:
> thanks for reply.
>
> i mean the sk_msg with TCP protocol. in this case, sender use sk_stream_memory_free()
> to check if memory is free. and in __sk_stream_memory_free(), if
> sk->sk_wmem_queued is bigger then sk->sk_sndbuf or sk
> notsent_bytes(tp->write_seq - tp->snd_nxt) is too bigger then
> __sk_stream_memory_free() will return false and do sk_stream_wait_memory().
>
> but in sk_msg mode, tcp_bpf_sendmsg() will not create skb structure and not use seq to
> recording sending info,so sk->sk_wmem_queued is not changed in tcp_bpf_sendmsg() path,
> and __sk_stream_memory_free() will always return true.
>
> in bpf_tcp_ingress() will copy the sender msg and charge it, and in
> tcp_bpf_recvmsg(), it will uncharge the msg after sk_msg_recvmsg()
> receive it from psock ingress_msg queue, and if receiver is not to read again
> due to application bug, and sender continuous send, then the receiver
> psock ingress_msg queue will continuous increase and cannot be uncharged
> until tcp socket memory is not enough in the fllowing path.
>
>     tcp_bpf_sendmsg
>         tcp_bpf_send_verdict
>             tcp_bpf_sendmsg_redir
>                 bpf_tcp_ingress
>                     sk_wmem_schedule
>
> so if a sk_msg type sockmap receiver is block, then it may consume all the
> tcp socket memory and influence other tcp stream,
> can we limit per sockmap tcp stream link sk->sk_sndbuf ?
>
> thanks.
>
> 在 2024/6/18 1:07, 【外部账号】 John Fastabend 写道:
>> 郑国勇 wrote:
>>> hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive.
>>>
>> Whats the protocol? The TCP case tcp_bpf_sendmsg() is checking
>> sk_stream_memory_free() and will do sk_stream_wait_memory() if under
>> memory pressure. This should handle sending case with lots of data
>> queued up on the sk.
>>
>> On the redirect ingress case we do this,
>>
>>   sk_psock_handle_skb()
>>     sk_psock_skb_ingress()
>>       sk_psock_create_ingress_msg()
>>
>> There sk_psock_create_ingress_msg() should check the rcvbuf of the
>> receiving socket and shouldn't create a msg if its under memory pressure.
>> If its an ingress_self case we do a skb_set_owner_r which should (?) push
>> back on the memory side through sk_mem_charge().
>>
>> Seems like I'm missing some case then if we are hitting this. What protocol
>> and what is the BPF program? Is it a sender redirect? I guess more details
>> might make it obvious to me.
>>
>>
>>> sender can always send msg but not aware the receiver psock ingress_msg queue size.  In mortally case, when receiver not receive again due to the application bug, 
>>>
>>> sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system.
>>>
>>> my question is:  is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block.
>> The sender shouldn't be able to have more outstanding data than the
>> socket memory allows. After the redirect the skb/msg should be
>> charged to the receiving socket though. Agree sk_sendbuf should
>> limit sender. Maybe the test is not TCP protocol and we missed
>> adding the limits to UDP/AF_UNIX/etc?
>>
>>> thanks very much.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [issue]: sockmap restrain send if receiver block
  2024-06-18  8:08   ` zhengguoyong
  2024-06-21  6:44     ` zhengguoyong
@ 2024-06-25 19:51     ` John Fastabend
  1 sibling, 0 replies; 6+ messages in thread
From: John Fastabend @ 2024-06-25 19:51 UTC (permalink / raw)
  To: zhengguoyong,
	【外部账号】 John Fastabend,
	jakub, bpf

zhengguoyong wrote:
> thanks for reply.
> 
> i mean the sk_msg with TCP protocol. in this case, sender use sk_stream_memory_free()
> to check if memory is free. and in __sk_stream_memory_free(), if
> sk->sk_wmem_queued is bigger then sk->sk_sndbuf or sk
> notsent_bytes(tp->write_seq - tp->snd_nxt) is too bigger then
> __sk_stream_memory_free() will return false and do sk_stream_wait_memory().
> 
> but in sk_msg mode, tcp_bpf_sendmsg() will not create skb structure and not use seq to
> recording sending info,so sk->sk_wmem_queued is not changed in tcp_bpf_sendmsg() path,
> and __sk_stream_memory_free() will always return true.
> 
> in bpf_tcp_ingress() will copy the sender msg and charge it, and in
> tcp_bpf_recvmsg(), it will uncharge the msg after sk_msg_recvmsg()
> receive it from psock ingress_msg queue, and if receiver is not to read again
> due to application bug, and sender continuous send, then the receiver
> psock ingress_msg queue will continuous increase and cannot be uncharged
> until tcp socket memory is not enough in the fllowing path.
> 
>     tcp_bpf_sendmsg
>         tcp_bpf_send_verdict
>             tcp_bpf_sendmsg_redir
>                 bpf_tcp_ingress
>                     sk_wmem_schedule

Shouldn't the sk_wmem_schedule push back through __sk_mem_schedule() and
cause us to return a ENOMEM here? The ENOMEM then will free the msg and
return ENOMEM all the way back through tcp_bpf_sendmsg and to the user
eventually. 

That sk_wmem_schedule() for ingress should be sk_rmem_schedule()?

I think the fix is to ensure that if we try to bpf_tcp_ingress onto a
receive socket that it pushes back if its memory limits are reached.

> 
> so if a sk_msg type sockmap receiver is block, then it may consume all the
> tcp socket memory and influence other tcp stream,
> can we limit per sockmap tcp stream link sk->sk_sndbuf ?
> 
> thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2024-06-25 19:51 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-06-17  2:01 [issue]: sockmap restrain send if receiver block 郑国勇
2024-06-17 17:07 ` John Fastabend
2024-06-18  7:47   ` zhengguoyong
2024-06-18  8:08   ` zhengguoyong
2024-06-21  6:44     ` zhengguoyong
2024-06-25 19:51     ` John Fastabend

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox