* [issue]: sockmap restrain send if receiver block @ 2024-06-17 2:01 郑国勇 2024-06-17 17:07 ` John Fastabend 0 siblings, 1 reply; 6+ messages in thread From: 郑国勇 @ 2024-06-17 2:01 UTC (permalink / raw) To: john.fastabend, jakub, bpf hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive. sender can always send msg but not aware the receiver psock ingress_msg queue size. In mortally case, when receiver not receive again due to the application bug, sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system. my question is: is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block. thanks very much. ^ permalink raw reply [flat|nested] 6+ messages in thread
* RE: [issue]: sockmap restrain send if receiver block 2024-06-17 2:01 [issue]: sockmap restrain send if receiver block 郑国勇 @ 2024-06-17 17:07 ` John Fastabend 2024-06-18 7:47 ` zhengguoyong 2024-06-18 8:08 ` zhengguoyong 0 siblings, 2 replies; 6+ messages in thread From: John Fastabend @ 2024-06-17 17:07 UTC (permalink / raw) To: 郑国勇, john.fastabend, jakub, bpf 郑国勇 wrote: > hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive. > Whats the protocol? The TCP case tcp_bpf_sendmsg() is checking sk_stream_memory_free() and will do sk_stream_wait_memory() if under memory pressure. This should handle sending case with lots of data queued up on the sk. On the redirect ingress case we do this, sk_psock_handle_skb() sk_psock_skb_ingress() sk_psock_create_ingress_msg() There sk_psock_create_ingress_msg() should check the rcvbuf of the receiving socket and shouldn't create a msg if its under memory pressure. If its an ingress_self case we do a skb_set_owner_r which should (?) push back on the memory side through sk_mem_charge(). Seems like I'm missing some case then if we are hitting this. What protocol and what is the BPF program? Is it a sender redirect? I guess more details might make it obvious to me. > sender can always send msg but not aware the receiver psock ingress_msg queue size. In mortally case, when receiver not receive again due to the application bug, > > sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system. > > my question is: is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block. The sender shouldn't be able to have more outstanding data than the socket memory allows. After the redirect the skb/msg should be charged to the receiving socket though. Agree sk_sendbuf should limit sender. Maybe the test is not TCP protocol and we missed adding the limits to UDP/AF_UNIX/etc? > > thanks very much. > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [issue]: sockmap restrain send if receiver block 2024-06-17 17:07 ` John Fastabend @ 2024-06-18 7:47 ` zhengguoyong 2024-06-18 8:08 ` zhengguoyong 1 sibling, 0 replies; 6+ messages in thread From: zhengguoyong @ 2024-06-18 7:47 UTC (permalink / raw) To: 【外部账号】 John Fastabend, jakub, bpf thanks for reply. i mean the sk_msg with TCP protocol. in this case, sender use sk_stream_memory_free() to check if memory is free. and in __sk_stream_memory_free(), if sk->sk_wmem_queued is bigger then sk->sk_sndbuf or sk notsent_bytes(tp->write_seq - tp->snd_nxt) is too bigger then __sk_stream_memory_free() will return false and do sk_stream_wait_memory(). but in sk_msg mode, tcp_bpf_sendmsg() whill not create skb structure and not use seq to recording sending info,so sk->sk_wmem_queued is not changed in tcp_bpf_sendmsg() path, and __sk_stream_memory_free() will always return true. In bpf_tcp_ingress() it will copy the sender msg and charge it, and in tcp_bpf_recvmsg(), it will uncharge the msg after sk_msg_recvmsg() receive it from psock ingress_msg queue, if receiver is not to read again due to application bug, but sender continuous send, then the receiver psock ingress_msg queue will continuous increase and cannot be uncharge until tcp socket memory is not enough in the fllowing path. tcp_bpf_sendmsg tcp_bpf_send_verdict tcp_bpf_sendmsg_redir bpf_tcp_ingress sk_wmem_schedule so if a sk_msg type sockmap receiver is block, then it may consume all the tcp socket memory and influence other tcp stream, can we have a better way to limit it ? just like tcp use sk_sendbuf to limit per tcp stream? thanks. 在 2024/6/18 1:07, 【外部账号】 John Fastabend 写道: > 郑国勇 wrote: >> hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive. >> > Whats the protocol? The TCP case tcp_bpf_sendmsg() is checking > sk_stream_memory_free() and will do sk_stream_wait_memory() if under > memory pressure. This should handle sending case with lots of data > queued up on the sk. > > On the redirect ingress case we do this, > > sk_psock_handle_skb() > sk_psock_skb_ingress() > sk_psock_create_ingress_msg() > > There sk_psock_create_ingress_msg() should check the rcvbuf of the > receiving socket and shouldn't create a msg if its under memory pressure. > If its an ingress_self case we do a skb_set_owner_r which should (?) push > back on the memory side through sk_mem_charge(). > > Seems like I'm missing some case then if we are hitting this. What protocol > and what is the BPF program? Is it a sender redirect? I guess more details > might make it obvious to me. > > >> sender can always send msg but not aware the receiver psock ingress_msg queue size. In mortally case, when receiver not receive again due to the application bug, >> >> sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system. >> >> my question is: is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block. > The sender shouldn't be able to have more outstanding data than the > socket memory allows. After the redirect the skb/msg should be > charged to the receiving socket though. Agree sk_sendbuf should > limit sender. Maybe the test is not TCP protocol and we missed > adding the limits to UDP/AF_UNIX/etc? > >> thanks very much. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [issue]: sockmap restrain send if receiver block 2024-06-17 17:07 ` John Fastabend 2024-06-18 7:47 ` zhengguoyong @ 2024-06-18 8:08 ` zhengguoyong 2024-06-21 6:44 ` zhengguoyong 2024-06-25 19:51 ` John Fastabend 1 sibling, 2 replies; 6+ messages in thread From: zhengguoyong @ 2024-06-18 8:08 UTC (permalink / raw) To: 【外部账号】 John Fastabend, jakub, bpf thanks for reply. i mean the sk_msg with TCP protocol. in this case, sender use sk_stream_memory_free() to check if memory is free. and in __sk_stream_memory_free(), if sk->sk_wmem_queued is bigger then sk->sk_sndbuf or sk notsent_bytes(tp->write_seq - tp->snd_nxt) is too bigger then __sk_stream_memory_free() will return false and do sk_stream_wait_memory(). but in sk_msg mode, tcp_bpf_sendmsg() will not create skb structure and not use seq to recording sending info,so sk->sk_wmem_queued is not changed in tcp_bpf_sendmsg() path, and __sk_stream_memory_free() will always return true. in bpf_tcp_ingress() will copy the sender msg and charge it, and in tcp_bpf_recvmsg(), it will uncharge the msg after sk_msg_recvmsg() receive it from psock ingress_msg queue, and if receiver is not to read again due to application bug, and sender continuous send, then the receiver psock ingress_msg queue will continuous increase and cannot be uncharged until tcp socket memory is not enough in the fllowing path. tcp_bpf_sendmsg tcp_bpf_send_verdict tcp_bpf_sendmsg_redir bpf_tcp_ingress sk_wmem_schedule so if a sk_msg type sockmap receiver is block, then it may consume all the tcp socket memory and influence other tcp stream, can we limit per sockmap tcp stream link sk->sk_sndbuf ? thanks. 在 2024/6/18 1:07, 【外部账号】 John Fastabend 写道: > 郑国勇 wrote: >> hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive. >> > Whats the protocol? The TCP case tcp_bpf_sendmsg() is checking > sk_stream_memory_free() and will do sk_stream_wait_memory() if under > memory pressure. This should handle sending case with lots of data > queued up on the sk. > > On the redirect ingress case we do this, > > sk_psock_handle_skb() > sk_psock_skb_ingress() > sk_psock_create_ingress_msg() > > There sk_psock_create_ingress_msg() should check the rcvbuf of the > receiving socket and shouldn't create a msg if its under memory pressure. > If its an ingress_self case we do a skb_set_owner_r which should (?) push > back on the memory side through sk_mem_charge(). > > Seems like I'm missing some case then if we are hitting this. What protocol > and what is the BPF program? Is it a sender redirect? I guess more details > might make it obvious to me. > > >> sender can always send msg but not aware the receiver psock ingress_msg queue size. In mortally case, when receiver not receive again due to the application bug, >> >> sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system. >> >> my question is: is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block. > The sender shouldn't be able to have more outstanding data than the > socket memory allows. After the redirect the skb/msg should be > charged to the receiving socket though. Agree sk_sendbuf should > limit sender. Maybe the test is not TCP protocol and we missed > adding the limits to UDP/AF_UNIX/etc? > >> thanks very much. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [issue]: sockmap restrain send if receiver block 2024-06-18 8:08 ` zhengguoyong @ 2024-06-21 6:44 ` zhengguoyong 2024-06-25 19:51 ` John Fastabend 1 sibling, 0 replies; 6+ messages in thread From: zhengguoyong @ 2024-06-21 6:44 UTC (permalink / raw) To: 【外部账号】 John Fastabend, jakub, bpf is it too heavy use tcp socket memory(defined in /proc/sys/net/ipv4/tcp_mem) to control sender not send too much data when we run sk_msg redirect sockmap whit TCP protocol? look forward for your reply. thanks so much. 在 2024/6/18 16:08, zhengguoyong 写道: > thanks for reply. > > i mean the sk_msg with TCP protocol. in this case, sender use sk_stream_memory_free() > to check if memory is free. and in __sk_stream_memory_free(), if > sk->sk_wmem_queued is bigger then sk->sk_sndbuf or sk > notsent_bytes(tp->write_seq - tp->snd_nxt) is too bigger then > __sk_stream_memory_free() will return false and do sk_stream_wait_memory(). > > but in sk_msg mode, tcp_bpf_sendmsg() will not create skb structure and not use seq to > recording sending info,so sk->sk_wmem_queued is not changed in tcp_bpf_sendmsg() path, > and __sk_stream_memory_free() will always return true. > > in bpf_tcp_ingress() will copy the sender msg and charge it, and in > tcp_bpf_recvmsg(), it will uncharge the msg after sk_msg_recvmsg() > receive it from psock ingress_msg queue, and if receiver is not to read again > due to application bug, and sender continuous send, then the receiver > psock ingress_msg queue will continuous increase and cannot be uncharged > until tcp socket memory is not enough in the fllowing path. > > tcp_bpf_sendmsg > tcp_bpf_send_verdict > tcp_bpf_sendmsg_redir > bpf_tcp_ingress > sk_wmem_schedule > > so if a sk_msg type sockmap receiver is block, then it may consume all the > tcp socket memory and influence other tcp stream, > can we limit per sockmap tcp stream link sk->sk_sndbuf ? > > thanks. > > 在 2024/6/18 1:07, 【外部账号】 John Fastabend 写道: >> 郑国勇 wrote: >>> hi, In sockmap case, when sender send msg, In function sk_psock_queue_msg(), it will put the msg into the receiver psock ingress_msg queue, and wakeup receiver to receive. >>> >> Whats the protocol? The TCP case tcp_bpf_sendmsg() is checking >> sk_stream_memory_free() and will do sk_stream_wait_memory() if under >> memory pressure. This should handle sending case with lots of data >> queued up on the sk. >> >> On the redirect ingress case we do this, >> >> sk_psock_handle_skb() >> sk_psock_skb_ingress() >> sk_psock_create_ingress_msg() >> >> There sk_psock_create_ingress_msg() should check the rcvbuf of the >> receiving socket and shouldn't create a msg if its under memory pressure. >> If its an ingress_self case we do a skb_set_owner_r which should (?) push >> back on the memory side through sk_mem_charge(). >> >> Seems like I'm missing some case then if we are hitting this. What protocol >> and what is the BPF program? Is it a sender redirect? I guess more details >> might make it obvious to me. >> >> >>> sender can always send msg but not aware the receiver psock ingress_msg queue size. In mortally case, when receiver not receive again due to the application bug, >>> >>> sender can contiunous send msg unti system memory not enough. If this happen, it will influence the whole system. >>> >>> my question is: is there a better solution for this case? just like tcp use sk_sendbuf to limit the sender to send agagin if receiver is block. >> The sender shouldn't be able to have more outstanding data than the >> socket memory allows. After the redirect the skb/msg should be >> charged to the receiving socket though. Agree sk_sendbuf should >> limit sender. Maybe the test is not TCP protocol and we missed >> adding the limits to UDP/AF_UNIX/etc? >> >>> thanks very much. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [issue]: sockmap restrain send if receiver block 2024-06-18 8:08 ` zhengguoyong 2024-06-21 6:44 ` zhengguoyong @ 2024-06-25 19:51 ` John Fastabend 1 sibling, 0 replies; 6+ messages in thread From: John Fastabend @ 2024-06-25 19:51 UTC (permalink / raw) To: zhengguoyong, 【外部账号】 John Fastabend, jakub, bpf zhengguoyong wrote: > thanks for reply. > > i mean the sk_msg with TCP protocol. in this case, sender use sk_stream_memory_free() > to check if memory is free. and in __sk_stream_memory_free(), if > sk->sk_wmem_queued is bigger then sk->sk_sndbuf or sk > notsent_bytes(tp->write_seq - tp->snd_nxt) is too bigger then > __sk_stream_memory_free() will return false and do sk_stream_wait_memory(). > > but in sk_msg mode, tcp_bpf_sendmsg() will not create skb structure and not use seq to > recording sending info,so sk->sk_wmem_queued is not changed in tcp_bpf_sendmsg() path, > and __sk_stream_memory_free() will always return true. > > in bpf_tcp_ingress() will copy the sender msg and charge it, and in > tcp_bpf_recvmsg(), it will uncharge the msg after sk_msg_recvmsg() > receive it from psock ingress_msg queue, and if receiver is not to read again > due to application bug, and sender continuous send, then the receiver > psock ingress_msg queue will continuous increase and cannot be uncharged > until tcp socket memory is not enough in the fllowing path. > > tcp_bpf_sendmsg > tcp_bpf_send_verdict > tcp_bpf_sendmsg_redir > bpf_tcp_ingress > sk_wmem_schedule Shouldn't the sk_wmem_schedule push back through __sk_mem_schedule() and cause us to return a ENOMEM here? The ENOMEM then will free the msg and return ENOMEM all the way back through tcp_bpf_sendmsg and to the user eventually. That sk_wmem_schedule() for ingress should be sk_rmem_schedule()? I think the fix is to ensure that if we try to bpf_tcp_ingress onto a receive socket that it pushes back if its memory limits are reached. > > so if a sk_msg type sockmap receiver is block, then it may consume all the > tcp socket memory and influence other tcp stream, > can we limit per sockmap tcp stream link sk->sk_sndbuf ? > > thanks. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-06-25 19:51 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2024-06-17 2:01 [issue]: sockmap restrain send if receiver block 郑国勇 2024-06-17 17:07 ` John Fastabend 2024-06-18 7:47 ` zhengguoyong 2024-06-18 8:08 ` zhengguoyong 2024-06-21 6:44 ` zhengguoyong 2024-06-25 19:51 ` John Fastabend
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox