From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Tejus GK <tejus.gk@nutanix.com>
Cc: Peter Xu <peterx@redhat.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Fabiano Rosas <farosas@suse.de>, Eric Blake <eblake@redhat.com>,
	Markus Armbruster <armbru@redhat.com>
Subject: Re: [PATCH v2 1/1] io: make zerocopy fallback accounting more accurate
Date: Mon, 9 Mar 2026 17:51:29 +0000
Message-ID: <aa8IoQOZHzQvacSh@redhat.com>
In-Reply-To: <0DF1A5F6-E20D-4A3F-9285-9205E87DE641@nutanix.com>

On Mon, Mar 09, 2026 at 05:42:08PM +0000, Tejus GK wrote:
> 
> 
> > On 9 Mar 2026, at 10:47 PM, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > On Mon, Mar 09, 2026 at 12:59:44PM -0400, Peter Xu wrote:
> >> On Mon, Mar 09, 2026 at 04:48:37PM +0000, Daniel P. Berrangé wrote:
> >>>> @@ -881,8 +881,8 @@ static int qio_channel_socket_flush_internal(QIOChannel *ioc,
> >>>>         sioc->zero_copy_sent += serr->ee_data - serr->ee_info + 1;
> >>>> 
> >>>>         /* If any sendmsg() succeeded using zero copy, mark zerocopy success */
> >>>> -        if (serr->ee_code != SO_EE_CODE_ZEROCOPY_COPIED) {
> >>>> -            sioc->new_zero_copy_sent_success = true;
> >>>> +        if (serr->ee_code == SO_EE_CODE_ZEROCOPY_COPIED) {
> >>>> +            sioc->zero_copy_fallback++;
> >>> 
> >>> ...this is counting the number of MSG_ERRQUEUE items, which is not
> >>> the same as the number of IO requests. That's why we only used it
> >>> as a boolean marker originally, rather than making it a counter.
> >> 
> >> Would the logic still work, and work better than before?  Say, it's a
> >> counter of "messages" rather than "IOs" then.
> > 
> > IIUC it is a counter of processing notifications which is not directly
> > correlated to any action by QEMU - neither bytes nor syscalls.
> 
> Please correct me if I'm wrong about this, but doesn't each notification
> describe what happened to an individual IO?

If userspace hasn't read a queued notification yet, the kernel will
merge new notifications with the existing queued one.

The line above your change

  serr->ee_data - serr->ee_info + 1;

records how many notifications were merged, so we know how many
syscalls were processed.

If ee_code is SO_EE_CODE_ZEROCOPY_COPIED though, it means at least
one syscall resulted in a copy, but that doesn't imply that *all*
syscalls resulted in a copy.

AFAICT, it could be that 1 out of 1000 syscalls resulted in a copy,
or that 1000 out of 1000 resulted in a copy. We don't know.

IIUC the kernel's merging of notifications appears lossy wrt this
information. It could be partially mitigated by flushing
notifications really, really frequently, but that feels like it would
have its own downsides.
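
To make the ambiguity concrete, here is a toy model in C. It is only a
sketch: struct toy_notification, toy_count() and toy_merge() are made-up
names, not QEMU or kernel code. It assumes, as described above, that a
merged notification carries one id range plus a single "copied" flag that
ends up set if at least one of the merged sendmsg() calls fell back to
copying. How the kernel actually derives ee_code for a merged notification
is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the fields of struct sock_extended_err
 * discussed in this thread. Not the real kernel definition.
 */
struct toy_notification {
    uint32_t lo;   /* plays the role of ee_info: first sendmsg() id */
    uint32_t hi;   /* plays the role of ee_data: last sendmsg() id */
    bool copied;   /* plays the role of ee_code == SO_EE_CODE_ZEROCOPY_COPIED */
};

/* Number of sendmsg() calls covered: ee_data - ee_info + 1. */
static uint32_t toy_count(const struct toy_notification *n)
{
    return n->hi - n->lo + 1;
}

/*
 * Build the single notification userspace would read if nsends
 * completions were merged before being read. copied_flags[i] says
 * whether send i fell back to a copy.
 *
 * Assumption: the merged flag is the logical OR of the per-send flags.
 */
static struct toy_notification toy_merge(const bool *copied_flags,
                                         uint32_t nsends)
{
    struct toy_notification n;
    uint32_t i;

    assert(nsends >= 1);
    n.lo = 0;
    n.hi = nsends - 1;
    n.copied = false;
    for (i = 0; i < nsends; i++) {
        n.copied = n.copied || copied_flags[i];
    }
    return n;
}
```

Merging 1000 sends where only the first one was copied, and merging 1000
sends where all of them were copied, gives the same result in this model:
toy_count() is 1000 and copied is set. Nothing in the merged notification
distinguishes the two cases, so incrementing a counter once per
notification would not tell us how many sends fell back.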


> >> The problem with the old code was that we may report fallback=0 even if
> >> a fallback did happen, as we mask that fact as long as one zerocopy
> >> happened in the whole batch between two flushes.  So it seems this (even if
> >> the counter is not per-IO) is still better.
> > 
> > Better for what purpose though ?
> > 
> > If we enabled zero-copy, it is useful to know if /something/ managed
> > to benefit from zero-copy. i.e. if it /always/ fails to zero-copy then
> > we can diagnose that the NIC driver isn't capable of it, or there
> > is some other limitation.  If something manages to zero-copy, then
> > we know the feature is functionally working.
> > 
> > What will we do with a count of notifications?
> 
> I was wondering if it can be useful for debugging live migration issues
> where zerocopy is enabled. For instance, let's say a zerocopy write failed
> due to the socket error queue being full. Now it could be either due to the
> out-of-order processing we had seen before
> (https://github.com/qemu/qemu/commit/84005f4a2b8745e5934f955c045a0b4311cd0992)
> or due to it getting filled up because some copies were deferred. For the
> latter, this stat can be worthwhile as a debugging stat.



With regards,
Daniel
-- 
|: https://berrange.com       ~~        https://hachyderm.io/@berrange :|
|: https://libvirt.org          ~~          https://entangle-photo.org :|
|: https://pixelfed.art/berrange   ~~    https://fstop138.berrange.com :|


