From: David Howells <dhowells@redhat.com>
To: netdev@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
Matthew Wilcox <willy@infradead.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Christoph Hellwig <hch@infradead.org>,
Jens Axboe <axboe@kernel.dk>, Jeff Layton <jlayton@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Chuck Lever III <chuck.lever@oracle.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: [PATCH net-next v4 00/20] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES), part 1
Date: Wed, 5 Apr 2023 17:53:19 +0100
Message-ID: <20230405165339.3468808-1-dhowells@redhat.com>
Here's the first tranche of patches towards providing a MSG_SPLICE_PAGES
internal sendmsg flag that is intended to replace the ->sendpage() op with
calls to sendmsg(). MSG_SPLICE_PAGES is a hint that tells the protocol that
it should splice the pages supplied if it can and copy them if not.
This will allow splice to pass multiple pages in a single call and allow
certain parts of higher protocols (e.g. sunrpc, iwarp) to pass an entire
message in one go rather than having to send it piecemeal. This should
also make it easier to handle the splicing of multipage folios.
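As a rough userspace illustration of the path being reworked: splice()
moves file data into a pipe and then from the pipe into a socket, and it is
that second hop that lands in the socket's send path. The sketch below is
not taken from samples/net/. The function names splice_file_to_socket() and
splice_demo() are made up for illustration, and an AF_UNIX socketpair is
assumed so that no peer process is needed.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Move up to 'len' bytes from file_fd to sock_fd via an anonymous pipe.
 * Returns the number of bytes handed to the socket, or -1 on error.
 * (Illustrative helper, not part of the patch set.)
 */
ssize_t splice_file_to_socket(int file_fd, int sock_fd, size_t len)
{
	int pfd[2];
	ssize_t total = 0;

	if (pipe(pfd) < 0)
		return -1;

	while ((size_t)total < len) {
		/* Stage 1: file -> pipe. */
		ssize_t in = splice(file_fd, NULL, pfd[1], NULL,
				    len - (size_t)total, 0);
		if (in < 0) {
			total = -1;
			break;
		}
		if (in == 0)	/* EOF on the file */
			break;

		/* Stage 2: pipe -> socket; drain what stage 1 queued. */
		while (in > 0) {
			ssize_t out = splice(pfd[0], NULL, sock_fd, NULL,
					     (size_t)in, 0);
			if (out <= 0) {
				total = -1;
				goto done;
			}
			in -= out;
			total += out;
		}
	}
done:
	close(pfd[0]);
	close(pfd[1]);
	return total;
}

/*
 * Hypothetical self-check: write a pattern to a temp file, splice it into
 * one end of a socketpair and read it back from the other end.
 * Returns the number of bytes that arrived intact, or -1 on any failure.
 * 'len' is assumed to fit in the socket buffer, as nothing reads
 * concurrently.
 */
ssize_t splice_demo(size_t len)
{
	char path[] = "/tmp/splice-demo-XXXXXX";
	char *src = malloc(len), *dst = malloc(len);
	int sv[2] = { -1, -1 }, fd = -1;
	ssize_t ret = -1;
	size_t got = 0, i;

	assert(len > 0);
	if (!src || !dst)
		goto out;
	for (i = 0; i < len; i++)
		src[i] = (char)(i % 251);

	fd = mkstemp(path);
	if (fd < 0)
		goto out;
	unlink(path);
	if (write(fd, src, len) != (ssize_t)len || lseek(fd, 0, SEEK_SET) != 0)
		goto out;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		goto out;
	if (splice_file_to_socket(fd, sv[0], len) != (ssize_t)len)
		goto out;

	while (got < len) {
		ssize_t n = read(sv[1], dst + got, len - got);
		if (n <= 0)
			goto out;
		got += (size_t)n;
	}
	if (memcmp(src, dst, len) == 0)
		ret = (ssize_t)got;
out:
	if (fd >= 0)
		close(fd);
	if (sv[0] >= 0)
		close(sv[0]);
	if (sv[1] >= 0)
		close(sv[1]);
	free(src);
	free(dst);
	return ret;
}
```

Calling splice_demo(16384) from a small main() should be enough to push a
few pages through the pipe-to-socket hop on a kernel with or without these
patches; the userspace interface is not meant to change.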
This set consists of the following parts:
(1) Provide a set of sample programs in samples/net/ that can be used to
drive splice() and sendfile() with TCP/TCP6, UDP/UDP6, TLS over
TCP/TCP6, UNIX and ALG hash/skcipher sockets for testing.
(2) Define the MSG_SPLICE_PAGES flag and prevent sys_sendmsg() from being
able to set it.
(3) Overhaul the page_frag_alloc_align() allocator:
(a) Split it out from mm/page_alloc.c into its own file,
mm/page_frag_alloc.c.
(b) Make it use multipage folios rather than compound pages.
(c) Give it per-cpu buckets to allocate from so no locking is
required.
(d) The netdev_alloc_cache and the napi fragment cache are then cast
in terms of this and some private allocators are removed.
I'm not sure that the existing allocator is 100% thread safe.
(4) Implement MSG_SPLICE_PAGES support in TCP.
(5) Make do_tcp_sendpages() just wrap sendmsg() and then fold it into its
various callers.
(6) Implement MSG_SPLICE_PAGES support in IP and make udp_sendpage() just
a wrapper around sendmsg().
(7) Implement MSG_SPLICE_PAGES support in IP6/UDP6.
(8) Implement MSG_SPLICE_PAGES support in AF_UNIX.
(9) Make AF_UNIX copy unspliceable pages.
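For anyone wanting to poke at the AF_UNIX side from userspace without
building the samples, a minimal sketch along the following lines may do. It
is not one of the programs in samples/net/. The names sendfile_all() and
sendfile_demo() are hypothetical, and the payload is assumed to be small
enough to fit in the socket buffer since nothing reads concurrently.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send the first 'len' bytes of in_fd to sock_fd with sendfile().
 * Returns the number of bytes sent, or -1 on error.
 * (Illustrative helper, not part of the patch set.)
 */
ssize_t sendfile_all(int sock_fd, int in_fd, size_t len)
{
	off_t off = 0;	/* sendfile() advances this as it sends */

	while ((size_t)off < len) {
		ssize_t n = sendfile(sock_fd, in_fd, &off, len - (size_t)off);

		if (n < 0)
			return -1;
		if (n == 0)	/* hit EOF before 'len' bytes */
			break;
	}
	return (ssize_t)off;
}

/*
 * Hypothetical self-check: fill a temp file with a pattern, sendfile() it
 * into one end of an AF_UNIX stream socketpair and read it back from the
 * other. Returns the number of bytes that arrived intact, or -1.
 */
ssize_t sendfile_demo(size_t len)
{
	char path[] = "/tmp/sendfile-demo-XXXXXX";
	char *src = malloc(len), *dst = malloc(len);
	int sv[2] = { -1, -1 }, fd = -1;
	ssize_t ret = -1;
	size_t got = 0, i;

	assert(len > 0);
	if (!src || !dst)
		goto out;
	for (i = 0; i < len; i++)
		src[i] = (char)(i % 253);

	fd = mkstemp(path);
	if (fd < 0)
		goto out;
	unlink(path);
	if (write(fd, src, len) != (ssize_t)len)
		goto out;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		goto out;
	if (sendfile_all(sv[0], fd, len) != (ssize_t)len)
		goto out;

	while (got < len) {
		ssize_t n = read(sv[1], dst + got, len - got);

		if (n <= 0)
			goto out;
		got += (size_t)n;
	}
	if (memcmp(src, dst, len) == 0)
		ret = (ssize_t)got;
out:
	if (fd >= 0)
		close(fd);
	if (sv[0] >= 0)
		close(sv[0]);
	if (sv[1] >= 0)
		close(sv[1]);
	free(src);
	free(dst);
	return ret;
}
```

sendfile_demo(8192) returning 8192 indicates the data came through
unchanged; swapping the socketpair for a connected TCP or UDP socket would
exercise the other protocols listed above in the same way.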
I've pushed the patches here also:
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=sendpage-1
The follow-on patches are on branch iov-sendpage on the same tree.
David
Changes
=======
ver #4)
- Added some sample socket-I/O programs into samples/net/.
- Fixed a missing page-get in AF_KCM.
- Initialised the sgtable and marked the end in AF_ALG when calling
netfs_extract_iter_to_sg().
- Added a destructor func for page frag caches prior to generalising them
and making them per-cpu.
ver #3)
- Dropped the iterator-of-iterators patch.
- Only expunge MSG_SPLICE_PAGES in sys_send[m]msg, not sys_recv[m]msg.
- Split MSG_SPLICE_PAGES code in __ip_append_data() out into helper
functions.
- Implement MSG_SPLICE_PAGES support in __ip6_append_data() using the
above helper functions.
- Rename 'xlength' to 'initial_length'.
- Minimise the changes to sunrpc for the moment.
- Don't give -EOPNOTSUPP if NETIF_F_SG not available, just copy instead.
- Implemented MSG_SPLICE_PAGES support in the TLS, Chelsio-TLS and AF_KCM
code.
ver #2)
- Overhauled the page_frag_alloc() allocator: large folios and per-cpu.
- Got rid of my own zerocopy allocator.
- Use iov_iter_extract_pages() rather than poking in iter->bvec.
- Made page splicing fall back to page copying on a page-by-page basis.
- Made splice_to_socket() pass 16 pipe buffers at a time.
- Made AF_ALG/hash use finup/digest where possible in sendmsg.
- Added an iterator-of-iterators, ITER_ITERLIST.
- Made sunrpc use the iterator-of-iterators.
- Converted more drivers.
Link: https://lore.kernel.org/r/20230316152618.711970-1-dhowells@redhat.com/ # v1
Link: https://lore.kernel.org/r/20230329141354.516864-1-dhowells@redhat.com/ # v2
Link: https://lore.kernel.org/r/20230331160914.1608208-1-dhowells@redhat.com/ # v3
David Howells (20):
net: Add samples for network I/O and splicing
net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
mm: Move the page fragment allocator from page_alloc.c into its own
file
mm: Make the page_frag_cache allocator use multipage folios
mm: Make the page_frag_cache allocator use per-cpu
tcp: Support MSG_SPLICE_PAGES
tcp: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data
tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES
tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around
tcp_sendmsg
espintcp: Inline do_tcp_sendpages()
tls: Inline do_tcp_sendpages()
siw: Inline do_tcp_sendpages()
tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked()
udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES
ip: Remove ip_append_page()
ip, udp: Support MSG_SPLICE_PAGES
ip, udp: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data
ip6, udp6: Support MSG_SPLICE_PAGES
af_unix: Support MSG_SPLICE_PAGES
af_unix: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data
drivers/infiniband/sw/siw/siw_qp_tx.c | 17 +-
drivers/net/ethernet/mediatek/mtk_wed_wo.c | 19 +-
drivers/net/ethernet/mediatek/mtk_wed_wo.h | 2 -
drivers/nvme/host/tcp.c | 19 +-
drivers/nvme/target/tcp.c | 22 +-
include/linux/gfp.h | 17 +-
include/linux/mm_types.h | 13 +-
include/linux/socket.h | 3 +
include/net/ip.h | 3 +-
include/net/tcp.h | 2 -
include/net/tls.h | 2 +-
mm/Makefile | 2 +-
mm/page_alloc.c | 126 ----------
mm/page_frag_alloc.c | 201 ++++++++++++++++
net/core/skbuff.c | 32 +--
net/ipv4/ip_output.c | 202 ++++++----------
net/ipv4/tcp.c | 260 ++++++++-------------
net/ipv4/tcp_bpf.c | 20 +-
net/ipv4/udp.c | 50 +---
net/ipv6/ip6_output.c | 12 +
net/socket.c | 2 +
net/tls/tls_main.c | 24 +-
net/unix/af_unix.c | 115 +++++++--
net/xfrm/espintcp.c | 10 +-
samples/Kconfig | 6 +
samples/Makefile | 1 +
samples/net/Makefile | 13 ++
samples/net/alg-encrypt.c | 201 ++++++++++++++++
samples/net/alg-hash.c | 143 ++++++++++++
samples/net/splice-out.c | 142 +++++++++++
samples/net/tcp-send.c | 154 ++++++++++++
samples/net/tcp-sink.c | 76 ++++++
samples/net/tls-send.c | 176 ++++++++++++++
samples/net/tls-sink.c | 98 ++++++++
samples/net/udp-send.c | 151 ++++++++++++
samples/net/udp-sink.c | 82 +++++++
samples/net/unix-send.c | 147 ++++++++++++
samples/net/unix-sink.c | 51 ++++
38 files changed, 2017 insertions(+), 599 deletions(-)
create mode 100644 mm/page_frag_alloc.c
create mode 100644 samples/net/Makefile
create mode 100644 samples/net/alg-encrypt.c
create mode 100644 samples/net/alg-hash.c
create mode 100644 samples/net/splice-out.c
create mode 100644 samples/net/tcp-send.c
create mode 100644 samples/net/tcp-sink.c
create mode 100644 samples/net/tls-send.c
create mode 100644 samples/net/tls-sink.c
create mode 100644 samples/net/udp-send.c
create mode 100644 samples/net/udp-sink.c
create mode 100644 samples/net/unix-send.c
create mode 100644 samples/net/unix-sink.c
Thread overview: 24+ messages
2023-04-05 16:53 David Howells [this message]
2023-04-05 16:53 ` [PATCH net-next v4 01/20] net: Add samples for network I/O and splicing David Howells
2023-04-05 16:53 ` [PATCH net-next v4 02/20] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
2023-04-05 16:53 ` [PATCH net-next v4 03/20] mm: Move the page fragment allocator from page_alloc.c into its own file David Howells
2023-04-05 16:53 ` [PATCH net-next v4 04/20] mm: Make the page_frag_cache allocator use multipage folios David Howells
2023-04-05 16:53 ` [PATCH net-next v4 05/20] mm: Make the page_frag_cache allocator use per-cpu David Howells
2023-04-05 16:53 ` [PATCH net-next v4 06/20] tcp: Support MSG_SPLICE_PAGES David Howells
2023-04-05 16:53 ` [PATCH net-next v4 07/20] tcp: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data David Howells
2023-04-05 16:53 ` [PATCH net-next v4 08/20] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
2023-04-05 16:53 ` [PATCH net-next v4 09/20] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg David Howells
2023-04-05 16:53 ` [PATCH net-next v4 10/20] espintcp: Inline do_tcp_sendpages() David Howells
2023-04-05 16:53 ` [PATCH net-next v4 11/20] tls: " David Howells
2023-04-05 16:53 ` [PATCH net-next v4 12/20] siw: " David Howells
2023-04-05 16:53 ` [PATCH net-next v4 13/20] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() David Howells
2023-04-05 16:53 ` [PATCH net-next v4 14/20] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-04-05 16:53 ` [PATCH net-next v4 15/20] ip: Remove ip_append_page() David Howells
2023-04-05 16:53 ` [PATCH net-next v4 16/20] ip, udp: Support MSG_SPLICE_PAGES David Howells
2023-04-05 16:53 ` [PATCH net-next v4 17/20] ip, udp: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data David Howells
2023-04-05 16:53 ` [PATCH net-next v4 18/20] ip6, udp6: Support MSG_SPLICE_PAGES David Howells
2023-04-05 16:53 ` [PATCH net-next v4 19/20] af_unix: " David Howells
2023-04-05 16:53 ` [PATCH net-next v4 20/20] af_unix: Make sendmsg(MSG_SPLICE_PAGES) copy unspliceable data David Howells
2023-04-06 2:19 ` [PATCH net-next v4 00/20] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES), part 1 Jakub Kicinski
2023-04-06 9:12 ` David Howells
2023-04-06 15:03 ` Jakub Kicinski