From: Peter Xu <peterx@redhat.com>
To: Tejus GK <tejus.gk@nutanix.com>
Cc: qemu-devel@nongnu.org, "Daniel P. Berrangé" <berrange@redhat.com>
Subject: Re: [PATCH v3 1/1] io: invert the return semantics of qio_channel_flush
Date: Wed, 25 Mar 2026 11:34:43 -0400 [thread overview]
Message-ID: <acQAk_bdbb1rSmw_@x1.local> (raw)
In-Reply-To: <20260320083950.262119-1-tejus.gk@nutanix.com>
On Fri, Mar 20, 2026 at 08:39:41AM +0000, Tejus GK wrote:
> With the kernel's zerocopy notification mechanism, the caller can
> determine whether
> * All syscalls successfully used zero copy
> * At least one syscall failed to use zero copy
>
> But, as of now QEMU's IO channel flush function semantics are like
> * 1 => all syscalls failed to use zero copy
> * 0 => at least one syscall successfully used zero copy
>
> This is not aligned with what the kernel reports, and ends up reporting
> false negatives for cases like when there's just a single successful
> zerocopy amongst a collection of deferred zero-copies during a flush.
>
> Fix this by inverting the return semantics of the IO flush function.
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Tejus GK <tejus.gk@nutanix.com>
Looks good here, thanks.
One trivial thing to mention is the stat exported in QAPI will still be
some "misterious counters":
# @dirty-sync-missed-zero-copy: Number of times dirty RAM
# synchronization could not avoid copying dirty pages. This is
# between 0 and @dirty-sync-count * @multifd-channels.
# (since 7.1)
It could be a bool too if to match with the current concept of "whether any
fallback happened" and if we want to strictly avoid any magical counter
(currently the counter is neither per IO or per request), but personally
I'm fine either way. I don't think anyone will seriously consume this
counter, so maybe no reason to bother at all.
PS: above "between 0 and @dirty-sync-count * @multifd-channels" is wrong, I
believe it came from multifd_ram_flush_and_sync() invoking flush for each
channels, but we may not always invoke it per-dirty-sync. But since
there's no good way to define this counter, I don't have a good way to fix
it either if without converting it to a bool.
> ---
> include/io/channel-socket.h | 6 +-----
> include/io/channel.h | 4 ++--
> io/channel-socket.c | 16 ++++++++--------
> 3 files changed, 11 insertions(+), 15 deletions(-)
>
> diff --git a/include/io/channel-socket.h b/include/io/channel-socket.h
> index a1ef3136ea..b07cd61477 100644
> --- a/include/io/channel-socket.h
> +++ b/include/io/channel-socket.h
> @@ -50,11 +50,7 @@ struct QIOChannelSocket {
> ssize_t zero_copy_queued;
> ssize_t zero_copy_sent;
> bool blocking;
> - /**
> - * This flag indicates whether any new data was successfully sent with
> - * zerocopy since the last qio_channel_socket_flush() call.
> - */
> - bool new_zero_copy_sent_success;
> + bool zero_copy_fallback;
> };
>
>
> diff --git a/include/io/channel.h b/include/io/channel.h
> index 1b02350437..c357da5460 100644
> --- a/include/io/channel.h
> +++ b/include/io/channel.h
> @@ -1014,8 +1014,8 @@ int coroutine_mixed_fn qio_channel_writev_full_all(QIOChannel *ioc,
> * If not implemented, acts as a no-op, and returns 0.
> *
> * Returns -1 if any error is found,
> - * 1 if every send failed to use zero copy.
> - * 0 otherwise.
> + * 1 if at least one send failed to use zero copy.
> + * 0 if every send successfully used zero copy.
> */
>
> int qio_channel_flush(QIOChannel *ioc,
> diff --git a/io/channel-socket.c b/io/channel-socket.c
> index 3053b35ad8..ea2ec84108 100644
> --- a/io/channel-socket.c
> +++ b/io/channel-socket.c
> @@ -72,7 +72,7 @@ qio_channel_socket_new(void)
> sioc->zero_copy_queued = 0;
> sioc->zero_copy_sent = 0;
> sioc->blocking = false;
> - sioc->new_zero_copy_sent_success = false;
> + sioc->zero_copy_fallback = false;
>
> ioc = QIO_CHANNEL(sioc);
> qio_channel_set_feature(ioc, QIO_CHANNEL_FEATURE_SHUTDOWN);
> @@ -880,9 +880,9 @@ static int qio_channel_socket_flush_internal(QIOChannel *ioc,
> /* No errors, count successfully finished sendmsg()*/
> sioc->zero_copy_sent += serr->ee_data - serr->ee_info + 1;
>
> - /* If any sendmsg() succeeded using zero copy, mark zerocopy success */
> - if (serr->ee_code != SO_EE_CODE_ZEROCOPY_COPIED) {
> - sioc->new_zero_copy_sent_success = true;
> + if (serr->ee_code == SO_EE_CODE_ZEROCOPY_COPIED) {
> + /* If any sendmsg() fell back to a copy, mark fallback as true */
> + sioc->zero_copy_fallback = true;
> }
> }
>
> @@ -900,12 +900,12 @@ static int qio_channel_socket_flush(QIOChannel *ioc,
> return ret;
> }
>
> - if (sioc->new_zero_copy_sent_success) {
> - sioc->new_zero_copy_sent_success = false;
> - return 0;
> + if (sioc->zero_copy_fallback) {
> + sioc->zero_copy_fallback = false;
> + return 1;
> }
>
> - return 1;
> + return 0;
> }
>
> #endif /* QEMU_MSG_ZEROCOPY */
> --
> 2.43.7
>
--
Peter Xu
prev parent reply other threads:[~2026-03-25 15:35 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-20 8:39 [PATCH v3 1/1] io: invert the return semantics of qio_channel_flush Tejus GK
2026-03-25 15:34 ` Peter Xu [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=acQAk_bdbb1rSmw_@x1.local \
--to=peterx@redhat.com \
--cc=berrange@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=tejus.gk@nutanix.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox