* make sendmsg/recvmsg process multiple messages at once
From: Menglong Dong @ 2021-02-01 12:41 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev

Hello, guys!

I am thinking about making sendmsg/recvmsg process multiple messages
at once, which could reduce the number of system calls.

Take UDP receive as an example: we could copy multiple skbs into
msg_iov and make sure that every iovec contains one UDP packet.

Is this a good idea? It looks clumsy compared to the upcoming
io_uring-based zerocopy work, but maybe it can help...

Regards
Menglong Dong
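
For illustration, a minimal userspace sketch of the receive pattern the
proposal wants to batch: with the current API, each UDP datagram costs one
recvmsg() syscall. The helper name, buffer size, and error handling below
are illustrative only, not part of the proposal.

	#include <sys/socket.h>
	#include <sys/uio.h>
	#include <netinet/in.h>
	#include <stdio.h>

	static void rx_loop(int fd)
	{
		char buf[2048];
		struct sockaddr_in src;

		for (;;) {
			struct iovec iov = {
				.iov_base = buf,
				.iov_len  = sizeof(buf),
			};
			struct msghdr msg = {
				.msg_name    = &src,
				.msg_namelen = sizeof(src),
				.msg_iov     = &iov,
				.msg_iovlen  = 1,
			};
			/* One kernel entry/exit per datagram: this per-syscall
			 * cost is what a batched interface would amortize. */
			ssize_t n = recvmsg(fd, &msg, 0);

			if (n < 0)
				break;
			printf("got %zd bytes\n", n);
		}
	}
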
* Re: make sendmsg/recvmsg process multiple messages at once
From: Jakub Kicinski @ 2021-02-02 4:07 UTC (permalink / raw)
To: Menglong Dong, Willem de Bruijn, Paolo Abeni; +Cc: netdev

On Mon, 1 Feb 2021 20:41:45 +0800 Menglong Dong wrote:
> Hello, guys!
>
> I am thinking about making sendmsg/recvmsg process multiple messages
> at once, which could reduce the number of system calls.
>
> Take UDP receive as an example: we could copy multiple skbs into
> msg_iov and make sure that every iovec contains one UDP packet.
>
> Is this a good idea? It looks clumsy compared to the upcoming
> io_uring-based zerocopy work, but maybe it can help...

Let me refer you to Willem and Paolo :)
* Re: make sendmsg/recvmsg process multiple messages at once
From: Paolo Abeni @ 2021-02-02 10:18 UTC (permalink / raw)
To: Jakub Kicinski, Menglong Dong, Willem de Bruijn; +Cc: netdev

On Mon, 2021-02-01 at 20:07 -0800, Jakub Kicinski wrote:
> On Mon, 1 Feb 2021 20:41:45 +0800 Menglong Dong wrote:
> > I am thinking about making sendmsg/recvmsg process multiple messages
> > at once, which could reduce the number of system calls.
> >
> > Take UDP receive as an example: we could copy multiple skbs into
> > msg_iov and make sure that every iovec contains one UDP packet.
> >
> > Is this a good idea? It looks clumsy compared to the upcoming
> > io_uring-based zerocopy work, but maybe it can help...

Indeed, since the introduction of some security vulnerability
mitigations, syscall overhead is significant, and amortizing it with
bulk operations gives very measurable performance gains.

Potentially, bulk operations also reduce the retpoline overhead, but
AFAICS all the indirect calls in the relevant code paths have already
been mitigated with the indirect call wrappers.

Note that you can already process several packets with a single syscall
using sendmmsg()/recvmmsg(). Both have issues with error reporting and
timeouts, and IIRC they still don't amortize the overhead introduced
e.g. by CONFIG_HARDENED_USERCOPY.

Additionally, recvmmsg()/sendmmsg() are not cache-friendly. As noted by
Eric a long time ago:

https://marc.info/?l=linux-netdev&m=148010858826712&w=2

perf tests in the lab with recvmmsg()/sendmmsg() can look great, but
the gain with real workloads is much smaller. You could try fine-tuning
the bulk size (mmsg nr) for your workload and H/W. Likely a burst size
above 8 is a no-go.

For the TX path there is already a better option - for some specific
workloads - using UDP_SEGMENT.

In the RX path, for bulk transfers, you could try enabling UDP_GRO.

As far as I can see, the idea you are proposing would be quite similar
to recvmmsg(), with the possible additional benefit of bulk dequeue
from the UDP receive queue. Note that this latter optimization, since
commit 2276f58ac5890, will give very little performance gain.

In the TX path there is no lock at all for the uncorking case, so the
performance gain would come only from the bulk syscall.

You will probably also need to cope with cmsg and msg_name, so overall
I don't see much difference from recvmmsg()/sendmmsg() - did I misread
something?

Thanks!

Paolo
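
For illustration, a minimal sketch of the recvmmsg() batching mentioned
above, assuming a UDP socket that is already bound; the burst of 8 follows
the sizing advice above, and the helper name and buffer sizes are arbitrary.

	#define _GNU_SOURCE		/* recvmmsg() is a GNU extension in glibc */
	#include <sys/socket.h>
	#include <sys/uio.h>
	#include <netinet/in.h>
	#include <string.h>
	#include <stdio.h>

	#define BURST	8		/* modest burst, per the advice above */
	#define PKT_SZ	2048

	static void rx_burst(int fd)
	{
		static char bufs[BURST][PKT_SZ];
		struct sockaddr_in srcs[BURST];
		struct iovec iov[BURST];
		struct mmsghdr msgs[BURST];
		int i, n;

		memset(msgs, 0, sizeof(msgs));
		for (i = 0; i < BURST; i++) {
			iov[i].iov_base            = bufs[i];
			iov[i].iov_len             = PKT_SZ;
			msgs[i].msg_hdr.msg_name    = &srcs[i];
			msgs[i].msg_hdr.msg_namelen = sizeof(srcs[i]);
			msgs[i].msg_hdr.msg_iov     = &iov[i];
			msgs[i].msg_hdr.msg_iovlen  = 1;
		}

		/* One syscall dequeues up to BURST datagrams; the kernel
		 * fills msg_len with the length of each received datagram. */
		n = recvmmsg(fd, msgs, BURST, 0, NULL);
		for (i = 0; i < n; i++)
			printf("msg %d: %u bytes\n", i, msgs[i].msg_len);
	}
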
* Re: make sendmsg/recvmsg process multiple messages at once
From: Menglong Dong @ 2021-02-02 13:39 UTC (permalink / raw)
To: Paolo Abeni; +Cc: Jakub Kicinski, Willem de Bruijn, netdev

On Tue, Feb 2, 2021 at 6:18 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2021-02-01 at 20:07 -0800, Jakub Kicinski wrote:
> [...]
>
> https://marc.info/?l=linux-netdev&m=148010858826712&w=2
>
> perf tests in the lab with recvmmsg()/sendmmsg() can look great, but
> the gain with real workloads is much smaller. You could try fine-tuning
> the bulk size (mmsg nr) for your workload and H/W. Likely a burst size
> above 8 is a no-go.
>
> For the TX path there is already a better option - for some specific
> workloads - using UDP_SEGMENT.
>
> In the RX path, for bulk transfers, you could try enabling UDP_GRO.
>
> As far as I can see, the idea you are proposing would be quite similar
> to recvmmsg(), with the possible additional benefit of bulk dequeue
> from the UDP receive queue. Note that this latter optimization, since
> commit 2276f58ac5890, will give very little performance gain.
>
> In the TX path there is no lock at all for the uncorking case, so the
> performance gain would come only from the bulk syscall.
>
> You will probably also need to cope with cmsg and msg_name, so overall
> I don't see much difference from recvmmsg()/sendmmsg() - did I misread
> something?
>
> Thanks!
>
> Paolo

Wow, thanks for your professional explanation. In fact,
recvmmsg()/sendmmsg() is exactly what I want. (Haha... I didn't know
they existed before, so embarrassing ~)

By the way, I think my idea does have some benefit (maybe). For
example, it doesn't need to call __sys_recvmsg() for every skb, which
avoids an unnecessary code path. And it doesn't need to prepare a
msghdr for every skb, which could reduce memory copying between kernel
and user space. As for msg_name, we could dequeue skbs that have the
same source until we meet a different one, in one cycle.

However, given that recvmmsg()/sendmmsg() already exist, my idea seems
unnecessary~~~

Thanks~

Menglong Dong
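
For illustration, a hedged sketch of the UDP_SEGMENT (GSO) TX option quoted
above, assuming a connected UDP socket on a reasonably recent kernel; the
helper name and the 1400-byte segment size are illustrative values, not
recommendations.

	#include <sys/types.h>
	#include <sys/socket.h>
	#include <netinet/in.h>

	#ifndef UDP_SEGMENT
	#define UDP_SEGMENT	103	/* from include/uapi/linux/udp.h, kernel >= 4.18 */
	#endif

	/* Send one large buffer; the stack splits it into gso_size-byte UDP
	 * datagrams (the last one may be shorter), so the syscall and copy
	 * setup costs are paid once per batch instead of once per packet. */
	static ssize_t tx_gso(int fd, const void *buf, size_t len)
	{
		int gso_size = 1400;	/* illustrative payload bytes per segment */

		if (setsockopt(fd, IPPROTO_UDP, UDP_SEGMENT,
			       &gso_size, sizeof(gso_size)) < 0)
			return -1;

		return send(fd, buf, len, 0);
	}
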
* RE: make sendmsg/recvmsg process multiple messages at once
From: David Laight @ 2021-02-02 15:19 UTC (permalink / raw)
To: 'Paolo Abeni', Jakub Kicinski, Menglong Dong, Willem de Bruijn
Cc: netdev

From: Paolo Abeni
> Sent: 02 February 2021 10:19
...
> Note that you can already process several packets with a single syscall
> using sendmmsg()/recvmmsg(). Both have issues with error reporting and
> timeouts, and IIRC they still don't amortize the overhead introduced
> e.g. by CONFIG_HARDENED_USERCOPY.

Both CONFIG_HARDENED_USERCOPY and the extra user copies needed even for
sendmsg() over send() are definitely measurable. I've run tests using
_copy_from_user() for many of the copies. Even the cost of reading in a
single iov[] for the buffer affects things.

My last attempt at speeding up writev("/dev/null", iov, 10) fell into
the rabbit hole of io_uring (again). But the partial changes gave a few
% improvement.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)