From: Peter Xu <peterx@redhat.com>
To: Fabiano Rosas <farosas@suse.de>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [PATCH 2/6] migration: Kick postcopy threads on cancel
Date: Thu, 5 Dec 2024 10:03:58 -0500 [thread overview]
Message-ID: <Z1HA3hk8XSnOFv6s@x1n> (raw)
In-Reply-To: <87bjxqi7ya.fsf@suse.de>
On Thu, Dec 05, 2024 at 10:18:53AM -0300, Fabiano Rosas wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
>
> > On Wed, Dec 04, 2024 at 03:51:27PM -0500, Peter Xu wrote:
> >> On Wed, Dec 04, 2024 at 08:02:31PM +0000, Daniel P. Berrangé wrote:
> >> > I would say the difference is like a graceful shutdown vs pulling the
> >> > power plug in a bare metal machine
> >> >
> >> > 'cancel' is intended to be graceful. It should leave you with a functional
> >> > QEMU (or refuse to run if unsafe).
> >> >
> >> > 'yank' is intended to be forceful, letting you get out of bad situations
> >> > that would otherwise require you to kill the entire QEMU process, but
> >> > still with possible associated risk data loss to the QEMU backends.
> >> >
> >> > They have overlap, but are none the less different.
> >>
> >> The question is more about whether yank should be used at all for
> >> migration only, not about the rest instances.
> >>
> >> My answer is yank should never be used for migration, because
> >> "migrate_cancel" also unplugs the power plug.. It's not anything more
> >> enforced. It's only doing less always.
> >>
> >> E.g. migration_yank_iochannel() is exactly what we do with
> >> qmp_migrate_cancel() in the first place, only that migrate_cancel only does
> >> it on the main channel (on both qemufiles even if ioc is one), however it
> >> should be suffice, and behave the same way, as strong as "yank".
> >
> > I recall at the time the yank stuff was introduced, one of the scenarios
> > they were concerned about was related to locks held by QEMU code. eg that
> > there are scenarios where migrate_cancel may not be processed promptly
> > enough due to being stalled on mutexes held by other concurrently running
> > threads. Now I would expect any such long duration stalls on migration
> > mutexes to be bugs, but the intent of yank is to give a recovery mechanism
> > that can workaround such bugs. The yank QMP command only interacts with
> > its own local mutexes.
>
> Ok, so that could only mean a thread stuck in recv() while holding the
> BQL. I don't think we have any other locks which would stop
> migrate_cancel from making progress or other stall situations that could
> be helped by a shutdown(). Note that most of locks around qemu_file were
> a late addition. I don't think that scenario is possible today. I'll
> have to do some tests.
And if that is a real difference, I'd think whether we should simply make
migrate_cancel be oob-capable too.. IOW, I still think it'll be good to
stick with always one API to cancel a migration, no matter which it is. If
we want to move over to yank then I think we should move all migrate_cancel
operations into yank and deprecate "migrate_cancel', but that sounds
overkill.
There's only one thing that might not be oob-compatible there so far, which
is bdrv_activate_all(). But I plan to remove it very soon (so that disks
will be activated in the migration thread instead, just like failure cases).
>
> On that note, how is yank supposed to be accessed? I don't see support
> in libvirt. Is there a way to hook into QMP after the fact somehow?
>
--
Peter Xu
next prev parent reply other threads:[~2024-12-05 15:04 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-02 22:01 [PATCH 0/6] migration: Fix issues during qmp_migrate_cancel Fabiano Rosas
2024-12-02 22:01 ` [PATCH 1/6] tests/qtest/migration: Introduce migration_test_add_suffix Fabiano Rosas
2024-12-04 18:44 ` Peter Xu
2024-12-02 22:01 ` [PATCH 2/6] migration: Kick postcopy threads on cancel Fabiano Rosas
2024-12-04 18:39 ` Peter Xu
2024-12-04 19:02 ` Fabiano Rosas
2024-12-04 19:39 ` Peter Xu
2024-12-04 20:02 ` Daniel P. Berrangé
2024-12-04 20:40 ` Fabiano Rosas
2024-12-04 20:59 ` Peter Xu
2024-12-04 20:51 ` Peter Xu
2024-12-04 21:01 ` Fabiano Rosas
2024-12-04 21:11 ` Peter Xu
2024-12-05 10:52 ` Daniel P. Berrangé
2024-12-05 13:18 ` Fabiano Rosas
2024-12-05 15:03 ` Peter Xu [this message]
2024-12-05 15:44 ` Daniel P. Berrangé
2024-12-05 15:40 ` Daniel P. Berrangé
2024-12-02 22:01 ` [PATCH 3/6] migration: Fix postcopy listen thread exit Fabiano Rosas
2024-12-02 22:01 ` [PATCH 4/6] migration: Make sure postcopy recovery doesn't hang when cancelling Fabiano Rosas
2024-12-02 22:01 ` [PATCH 5/6] migration: Fix hang after error in destination setup phase Fabiano Rosas
2024-12-04 18:44 ` Peter Xu
2024-12-04 19:05 ` Fabiano Rosas
2024-12-02 22:01 ` [PATCH 6/6] tests/qtest/migration: Add a cancel test Fabiano Rosas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z1HA3hk8XSnOFv6s@x1n \
--to=peterx@redhat.com \
--cc=berrange@redhat.com \
--cc=farosas@suse.de \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.