qemu-devel.nongnu.org archive mirror
From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: Peter Xu <peterx@redhat.com>,
	Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>,
	Dhruv Choudhary <dhruv.choudhary@nutanix.com>,
	qemu-devel@nongnu.org
Subject: Re: [PATCH] Improve error propagation via return path
Date: Wed, 22 Oct 2025 09:32:23 +0100
Message-ID: <aPiWl39eLOfBJQ1n@redhat.com>
In-Reply-To: <87tszst2so.fsf@suse.de>

On Tue, Oct 21, 2025 at 05:31:19PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Tue, Oct 21, 2025 at 05:54:09PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> >> On 21.10.25 17:34, Peter Xu wrote:
> >> > On Tue, Oct 21, 2025 at 07:52:53AM +0000, Dhruv Choudhary wrote:
> >> > > Use the return-path thread to send error details from the
> >> > > destination to the source on a migration failure. Management
> >> > > applications can then query the source QEMU for errors, as
> >> > > the single source of truth, making failures easy to trace.
> >> > > 
> >> > > Signed-off-by: Dhruv Choudhary <dhruv.choudhary@nutanix.com>
> >> > 
> >> > +Vladimir, Dan
> >> > 
> >> > IIUC we may still need to know whether the src QEMU supports this message
> >> > or not.
> >> > 
> >> > OTOH, we have introduced exit-on-error since 9.1:
> >> > 
> >> > # @exit-on-error: Exit on incoming migration failure.  Default true.
> >> > #     When set to false, the failure triggers a :qapi:event:`MIGRATION`
> >> > #     event, and error details could be retrieved with `query-migrate`.
> >> > #     (since 9.1)
> >> > 
> >> > This patch is going the other way.  That feature suggests the mgmt query
> >> > the error from dest directly.
> >> > 
> >> > We should stick with one plan rather than doing both.
> >> > 
> >> 
> >> Why?
> >> 
> >> exit-on-error=false is good anyway: when QMP connection is established, the
> >> management of target QEMU process is the same: we do call qmp commands to
> >> add devices, etc. We get QMP events. Actually, exiting is unexpected, better
> >> to fit into QMP protocol, continuing to send events and wait for qmp quit
> >> to exit.
> >> 
> >> Passing error back to the source simply improves error message on source,
> >> which otherwise is often confusing.
> >> 
> >> Using both, we of course see the same error in two places. But we do
> >> have two QEMU processes, which are both handled by on-host management
> >> services. We should correctly report the error on both sides anyway.
> >> 
> >> Improving error messages on source is just an improvement, which makes
> >> current behavior better (with or without exit-on-error=false).
> >> 
> >> Removing exit-on-error=false semantics (with or without passing errors back)
> >> would be a step backward, to violating the QMP protocol with unexpected exits.
> >
> > I didn't mean to propose removing exit-on-error; what I meant is that
> > with it, this patch doesn't look helpful.
> >
> > Has libvirt been integrated with exit-on-error?  If so, IMHO we don't need
> > this patch anymore.  To me it's not an improvement when combined with
> > exit-on-error, because duplicating the error from dest to src makes it
> > harder to know where the error happened.
> 
> Yeah, this does introduce some complexity of the "whose error is this?"
> kind. I can imagine future users of migrate_has_error() having to handle
> the error differently whether it came from this machine or the remote
> one. Maybe with current code there's no issue, but we need to think from
> a design perspective. Another point is whether the source machine is
> always prepared to see an error that has nothing to do with its own
> operation as it usually gets to know about a destination error only when
> TCP connections start to fail.
> 
> That said, from a usability perspective, I'm in favor of having the
> source machine be able to inform the user about the destination
> machine's error. It goes in the direction of relying less on the
> management layer, which we already agree might be a good idea.

Should we necessarily assume that the target machine's error is the "best"
error message?  Failures can result in errors being raised on both the
source and dest, and it is not clear-cut which side will have the root
cause error, and which will just have a side effect error. If we pass
the target error back to the source, we need to ensure that we don't
replace a better error that the source already has.
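
As a rough sketch of the "don't replace" concern, assuming the simplest
possible policy (the first recorded error wins, and its origin is kept
alongside it). All of the names here (MigErrorSlot, ErrOrigin,
mig_error_set_if_unset) are hypothetical and made up for illustration;
this is not QEMU's actual error handling code, and "first" is not
necessarily the same as "best":

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical types for illustration only; not QEMU's real structures. */
typedef enum {
    ERR_ORIGIN_NONE,    /* no error recorded yet */
    ERR_ORIGIN_LOCAL,   /* raised by this (source) QEMU */
    ERR_ORIGIN_REMOTE   /* received from the destination over the return path */
} ErrOrigin;

typedef struct {
    ErrOrigin origin;
    char msg[256];
} MigErrorSlot;

/*
 * Record an error only if the slot is still empty, so that an error
 * arriving later (e.g. from the remote side) cannot overwrite one that
 * was already recorded. Returns 1 if the error was stored, 0 if an
 * earlier error was kept instead.
 */
static int mig_error_set_if_unset(MigErrorSlot *slot, ErrOrigin origin,
                                  const char *msg)
{
    assert(slot != NULL && msg != NULL);

    if (slot->origin != ERR_ORIGIN_NONE) {
        return 0;
    }
    slot->origin = origin;
    /* snprintf always NUL-terminates and truncates overly long messages */
    snprintf(slot->msg, sizeof(slot->msg), "%s", msg);
    return 1;
}
```

With this, a remote error that arrives after the source has already
recorded its own failure is dropped, and the stored origin at least
answers "whose error is this?". It does nothing to decide which of the
two errors is the root cause.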

Allowing use of 'query-migrate' to fetch errors on both the source and
dest means mgmt apps have both errors available, but that does then
mean the mgmt app needs to decide which error is "best".

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


