From: Pavel Begunkov <asml.silence@gmail.com>
To: Stefan Metzmacher <metze@samba.org>,
io-uring@vger.kernel.org, axboe@kernel.dk
Cc: Jakub Kicinski <kuba@kernel.org>, netdev@vger.kernel.org
Subject: Re: [PATCH 5/5] io_uring/notif: let userspace know how effective the zero copy usage was
Date: Sat, 17 Sep 2022 10:22:08 +0100
Message-ID: <5f4059ca-cec6-e44a-ac61-b9c034b1be77@gmail.com>
In-Reply-To: <76cdd53f618e2793e1ec298c837bb17c3b9f12ee.1663363798.git.metze@samba.org>

On 9/16/22 22:36, Stefan Metzmacher wrote:
> The 2nd cqe for IORING_OP_SEND_ZC has IORING_CQE_F_NOTIF set in cqe->flags,
> and it will now have the number of successfully completed
> io_uring_tx_zerocopy_callback() callbacks in the lower 31 bits
> of cqe->res; the high bit (0x80000000) is set when
> io_uring_tx_zerocopy_callback() was called with success=false.

It has a couple of problems, and because that "simplify uapi"
patch is transitional, it doesn't go well with what I'm queuing
for 6.1. Let's hold it for a while.

> If cqe->res is still 0, zero copy wasn't used at all.
>
> These values give userspace a chance to adjust its strategy
> for choosing IORING_OP_SEND_ZC or IORING_OP_SEND. And it's a bit
> richer than just a simple SO_EE_CODE_ZEROCOPY_COPIED indication.
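
For reference, a minimal userspace sketch of how the proposed cqe->res
layout might be decoded, assuming the encoding is exactly as described
in the commit message above (lower 31 bits as the success count, high
bit as the "copied at least once" flag). The macro, struct and function
names are hypothetical and are not part of any uapi header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the masks mirror the layout described above. */
#define ZC_NOTIF_COPIED_FLAG 0x80000000u /* high bit: a callback saw success=false */
#define ZC_NOTIF_COUNT_MASK  0x7fffffffu /* lower 31 bits: successful callbacks */

struct zc_notif_info {
	uint32_t zc_count; /* callbacks that completed with success=true */
	bool copied;       /* at least one callback reported success=false */
};

/* Split the cqe->res of a notification cqe into its two fields. */
static struct zc_notif_info zc_notif_decode(int32_t res)
{
	uint32_t raw = (uint32_t)res;
	struct zc_notif_info info = {
		.zc_count = raw & ZC_NOTIF_COUNT_MASK,
		.copied = (raw & ZC_NOTIF_COPIED_FLAG) != 0,
	};

	return info;
}

/* Per the description above, res == 0 means zero copy wasn't used at all. */
static bool zc_notif_unused(int32_t res)
{
	return res == 0;
}
```

An application would only apply this to the cqe that has
IORING_CQE_F_NOTIF set, and could fall back to IORING_OP_SEND for a
socket when the copied flag keeps showing up.
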
>
> Fixes: b48c312be05e8 ("io_uring/net: simplify zerocopy send user API")
> Fixes: eb315a7d1396b ("tcp: support externally provided ubufs")
> Fixes: 1fd3ae8c906c0 ("ipv6/udp: support externally provided ubufs")
> Fixes: c445f31b3cfaa ("ipv4/udp: support externally provided ubufs")
> Signed-off-by: Stefan Metzmacher <metze@samba.org>
> Cc: Pavel Begunkov <asml.silence@gmail.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: io-uring@vger.kernel.org
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: netdev@vger.kernel.org
> ---
> io_uring/notif.c | 18 ++++++++++++++++++
> net/ipv4/ip_output.c | 3 ++-
> net/ipv4/tcp.c | 2 ++
> net/ipv6/ip6_output.c | 3 ++-
> 4 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/io_uring/notif.c b/io_uring/notif.c
> index e37c6569d82e..b07d2a049931 100644
> --- a/io_uring/notif.c
> +++ b/io_uring/notif.c
> @@ -28,7 +28,24 @@ static void io_uring_tx_zerocopy_callback(struct sk_buff *skb,
> struct io_notif_data *nd = container_of(uarg, struct io_notif_data, uarg);
> struct io_kiocb *notif = cmd_to_io_kiocb(nd);
>
> + uarg->zerocopy = uarg->zerocopy & success;
> +
> + if (success && notif->cqe.res < S32_MAX)
> + notif->cqe.res++;
> +
> if (refcount_dec_and_test(&uarg->refcnt)) {
> + /*
> + * If we hit at least one case that
> + * was not able to use zero copy,
> + * we set the high bit 0x80000000
> + * so that notif->cqe.res < 0 means it was
> + * copied at least once.
> + *
> + * The other 31 bits are the success count.
> + */
> + if (!uarg->zerocopy)
> + notif->cqe.res |= S32_MIN;
> +
> notif->io_task_work.func = __io_notif_complete_tw;
> io_req_task_work_add(notif);
> }
> @@ -53,6 +70,7 @@ struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx)
>
> nd = io_notif_to_data(notif);
> nd->account_pages = 0;
> + nd->uarg.zerocopy = 1;
> nd->uarg.flags = SKBFL_ZEROCOPY_FRAG | SKBFL_DONT_ORPHAN;
> nd->uarg.callback = io_uring_tx_zerocopy_callback;
> refcount_set(&nd->uarg.refcnt, 1);
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index d7bd1daf022b..4bdea7a4b2f7 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1032,7 +1032,8 @@ static int __ip_append_data(struct sock *sk,
> paged = true;
> zc = true;
> uarg = msg->msg_ubuf;
> - }
> + } else
> + msg->msg_ubuf->zerocopy = 0;
> } else if (sock_flag(sk, SOCK_ZEROCOPY)) {
> uarg = msg_zerocopy_realloc(sk, length, skb_zcopy(skb));
> if (!uarg)
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 970e9a2cca4a..27a22d470741 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1231,6 +1231,8 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
> uarg = msg->msg_ubuf;
> net_zcopy_get(uarg);
> zc = sk->sk_route_caps & NETIF_F_SG;
> + if (!zc)
> + uarg->zerocopy = 0;
> } else if (sock_flag(sk, SOCK_ZEROCOPY)) {
> uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
> if (!uarg) {
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index f152e51242cb..d85036e91cf7 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1556,7 +1556,8 @@ static int __ip6_append_data(struct sock *sk,
> paged = true;
> zc = true;
> uarg = msg->msg_ubuf;
> - }
> + } else
> + msg->msg_ubuf->zerocopy = 0;
> } else if (sock_flag(sk, SOCK_ZEROCOPY)) {
> uarg = msg_zerocopy_realloc(sk, length, skb_zcopy(skb));
> if (!uarg)
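
To make the accounting in the io_uring/notif.c hunk easier to follow,
here is a small userspace model of it. This is only a sketch: struct
notif_model and its helpers are hypothetical stand-ins for
notif->cqe.res, uarg->zerocopy and uarg->refcnt, and it does not model
the places where the tcp/udp paths clear uarg->zerocopy directly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the notif state touched by the patch. */
struct notif_model {
	int32_t res;       /* models notif->cqe.res */
	bool zerocopy;     /* models uarg->zerocopy, starts out true */
	unsigned int refs; /* models uarg->refcnt */
};

static void notif_model_init(struct notif_model *n, unsigned int refs)
{
	n->res = 0;
	n->zerocopy = true;
	n->refs = refs;
}

/*
 * Models one io_uring_tx_zerocopy_callback() invocation.
 * Returns true once the last reference is dropped and res is final.
 */
static bool notif_model_callback(struct notif_model *n, bool success)
{
	/* A single failed callback clears the flag for good. */
	n->zerocopy = n->zerocopy && success;

	/* Count successes, saturating so the sign bit stays free. */
	if (success && n->res < INT32_MAX)
		n->res++;

	if (--n->refs != 0)
		return false;

	/* Last reference: fold the "copied" state into the high bit. */
	if (!n->zerocopy)
		n->res |= INT32_MIN;

	return true;
}
```

With three references and callbacks reporting success, failure and
success in that order, the final res has the high bit set and a count
of 2.
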
--
Pavel Begunkov