From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
To: David Howells <dhowells@redhat.com>,
Matthew Wilcox <willy@infradead.org>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>
Cc: David Howells <dhowells@redhat.com>,
Al Viro <viro@zeniv.linux.org.uk>,
Christoph Hellwig <hch@infradead.org>,
Jens Axboe <axboe@kernel.dk>, Jeff Layton <jlayton@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Linus Torvalds <torvalds@linux-foundation.org>,
netdev@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: RE: [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES
Date: Thu, 16 Mar 2023 14:37:01 -0400 [thread overview]
Message-ID: <641361cd8d704_33b0cc20823@willemb.c.googlers.com.notmuch> (raw)
In-Reply-To: <20230316152618.711970-4-dhowells@redhat.com>
David Howells wrote:
> Make TCP's sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
> spliced from the source iterator if possible (the iterator must be
> ITER_BVEC and the pages must be spliceable).
>
> This allows ->sendpage() to be replaced by something that can handle
> multiple multipage folios in a single transaction.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Eric Dumazet <edumazet@google.com>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: netdev@vger.kernel.org
> ---
> net/ipv4/tcp.c | 59 +++++++++++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 53 insertions(+), 6 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 288693981b00..77c0c69208a5 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1220,7 +1220,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> int flags, err, copied = 0;
> int mss_now = 0, size_goal, copied_syn = 0;
> int process_backlog = 0;
> - bool zc = false;
> + int zc = 0;
> long timeo;
>
> flags = msg->msg_flags;
> @@ -1231,17 +1231,24 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> if (msg->msg_ubuf) {
> uarg = msg->msg_ubuf;
> net_zcopy_get(uarg);
> - zc = sk->sk_route_caps & NETIF_F_SG;
> + if (sk->sk_route_caps & NETIF_F_SG)
> + zc = 1;
> } else if (sock_flag(sk, SOCK_ZEROCOPY)) {
> uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
> if (!uarg) {
> err = -ENOBUFS;
> goto out_err;
> }
> - zc = sk->sk_route_caps & NETIF_F_SG;
> - if (!zc)
> + if (sk->sk_route_caps & NETIF_F_SG)
> + zc = 1;
> + else
> uarg_to_msgzc(uarg)->zerocopy = 0;
> }
> + } else if (unlikely(flags & MSG_SPLICE_PAGES) && size) {
> + if (!iov_iter_is_bvec(&msg->msg_iter))
> + return -EINVAL;
> + if (sk->sk_route_caps & NETIF_F_SG)
> + zc = 2;
> }
The commit message mentions MSG_SPLICE_PAGES as an internal flag.
It can be passed from userspace. The code anticipates that and checks
preconditions.
A side effect is that legacy applications that may already be setting
this bit in the flags now start failing. Most socket types are
historically permissive and simply ignore undefined flags.
With MSG_ZEROCOPY we chose to be extra cautious and added
SOCK_ZEROCOPY, only testing the MSG_ZEROCOPY bit if this socket option
is explicitly enabled. Perhaps more cautious than necessary, but FYI.
next prev parent reply other threads:[~2023-03-16 18:37 UTC|newest]
Thread overview: 55+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
2023-03-16 15:25 ` [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
2023-03-16 17:28 ` Matthew Wilcox
2023-03-16 18:00 ` David Howells
2023-03-16 15:25 ` [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES David Howells
2023-03-16 18:37 ` Willem de Bruijn [this message]
2023-03-16 18:44 ` David Howells
2023-03-16 19:00 ` Willem de Bruijn
2023-03-21 0:38 ` David Howells
2023-03-21 14:22 ` Willem de Bruijn
2023-03-16 15:25 ` [RFC PATCH 04/28] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:25 ` [RFC PATCH 05/28] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg David Howells
2023-03-16 15:25 ` [RFC PATCH 06/28] espintcp: Inline do_tcp_sendpages() David Howells
2023-03-16 15:25 ` [RFC PATCH 07/28] tls: " David Howells
2023-03-16 15:25 ` [RFC PATCH 08/28] siw: " David Howells
2023-03-16 15:25 ` [RFC PATCH 09/28] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() David Howells
2023-03-16 15:26 ` [RFC PATCH 10/28] ip, udp: Support MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 11/28] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 12/28] af_unix: Support MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 13/28] crypto: af_alg: Indent the loop in af_alg_sendmsg() David Howells
2023-03-16 15:26 ` [RFC PATCH 14/28] crypto: af_alg: Support MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 15/28] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 16/28] splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() David Howells
2023-03-16 15:26 ` [RFC PATCH 17/28] Remove file->f_op->sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit David Howells
2023-03-16 15:26 ` [RFC PATCH 19/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 20/28] iscsi: " David Howells
2023-03-16 15:26 ` [RFC PATCH 21/28] tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES) David Howells
2023-03-16 15:26 ` [RFC PATCH 22/28] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock() David Howells
2023-03-16 15:26 ` [RFC PATCH 23/28] algif: Remove hash_sendpage*() David Howells
2023-03-17 2:40 ` Herbert Xu
2023-03-24 16:47 ` David Howells
2023-03-25 6:00 ` Herbert Xu
2023-03-25 7:44 ` David Howells
2023-03-25 9:21 ` Herbert Xu
2023-03-16 15:26 ` [RFC PATCH 24/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() David Howells
2023-03-16 15:26 ` [RFC PATCH 25/28] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 26/28] dlm: " David Howells
2023-03-16 15:26 ` [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage David Howells
2023-03-16 16:17 ` Trond Myklebust
2023-03-16 17:10 ` Chuck Lever III
2023-03-16 17:28 ` David Howells
2023-03-16 17:41 ` Chuck Lever III
2023-03-16 21:21 ` David Howells
2023-03-17 15:29 ` Chuck Lever III
2023-03-16 16:24 ` David Howells
2023-03-16 18:06 ` David Howells
2023-03-16 19:01 ` Trond Myklebust
2023-03-22 13:10 ` David Howells
2023-03-22 18:15 ` [RFC PATCH] iov_iter: Add an iterator-of-iterators David Howells
2023-03-22 18:47 ` Trond Myklebust
2023-03-22 18:49 ` Matthew Wilcox
2023-03-16 15:26 ` [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) David Howells
2023-03-16 15:57 ` Marc Kleine-Budde
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=641361cd8d704_33b0cc20823@willemb.c.googlers.com.notmuch \
--to=willemdebruijn.kernel@gmail.com \
--cc=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=davem@davemloft.net \
--cc=dhowells@redhat.com \
--cc=edumazet@google.com \
--cc=hch@infradead.org \
--cc=jlayton@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).