* [PATCH 0/4] migration: Introduce POSTCOPY_DEVICE state
@ 2025-09-15 11:59 Juraj Marcin
  2025-09-15 11:59 ` [PATCH 1/4] migration: Do not try to start VM if disk activation fails Juraj Marcin
                   ` (3 more replies)
  0 siblings, 4 replies; 33+ messages in thread
From: Juraj Marcin @ 2025-09-15 11:59 UTC (permalink / raw)
  To: qemu-devel
  Cc: Juraj Marcin, Jiri Denemark, Peter Xu, Dr. David Alan Gilbert,
	Fabiano Rosas

This series is a continuation of the following RFC series and its
discussion [1].

[1]: https://lore.kernel.org/all/20250807114922.1013286-1-jmarcin@redhat.com/

This series takes a different approach to source side recoverability
than the original RFC series: it uses the existing PING/PONG message
types. Although this approach has some theoretical race conditions, the
discussion concluded that in practice there is only a very slim chance,
if any, of them occurring. On the other hand, this approach requires no
changes to the migration protocol or to the destination side QEMU
instance in order to work.

In preparation for introducing the new state, this series contains a
few changes.

First, it includes a patch suggested by Peter, which checks block
device activation when the source side tries to resume after a failed
migration, and does not try to start the VM if the activation fails.
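
The idea behind this check can be sketched as a toy Python model. This
is not QEMU code: `ToyVM`, `activate_disks()` and
`resume_after_failed_migration()` are hypothetical names chosen for
illustration, and the real change is made in QEMU's C sources.

```python
class ToyVM:
    """Toy stand-in for the source VM; hypothetical, not a QEMU API."""

    def __init__(self, disks_activate_ok):
        self.disks_activate_ok = disks_activate_ok
        self.running = False

    def activate_disks(self):
        # Assumption: True means the block devices were re-activated.
        return self.disks_activate_ok

    def start(self):
        self.running = True


def resume_after_failed_migration(vm):
    """Resume the source VM only if its disks could be activated."""
    if not vm.activate_disks():
        # Disk activation failed, so do not try to start the VM.
        return False
    vm.start()
    return True
```

In this model a failed activation leaves the VM stopped and reports the
failure to the caller instead of starting the guest anyway.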

Next, it refactors cleanup and error handling on the destination side.
This change is not strictly necessary for the feature to work. Without
this patch, if loading the device state failed, the destination QEMU
would either exit with an error exit code from the listen thread, or it might
crash if the main thread does some cleanup before the listen thread
exits the process. However, the source side can recover regardless of
how the destination side fails.

Finally, the last patch contains the main feature, the POSTCOPY_DEVICE
state. Compared to the approach discussed in the RFC, it uses a new PING
message with a custom PING number. The reason is that the PING 3 message
is currently sent only when postcopy-ram is active, but there might be
postcopy scenarios where this isn't true. The destination side can
respond to this new PING message without any changes.
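
A rough sketch of how such a PING/PONG handshake could gate the new
state, again as a toy Python model rather than QEMU code. Everything in
it is an assumption made for illustration: the `DEVICE_PING` value, the
class and function names, and the rule that the source leaves
POSTCOPY_DEVICE only once the matching PONG arrives and can resume the
guest itself only before that point.

```python
from enum import Enum, auto


class State(Enum):
    POSTCOPY_DEVICE = auto()
    POSTCOPY_ACTIVE = auto()
    FAILED = auto()


# Hypothetical value; the series only says the PING number is custom.
DEVICE_PING = 100


class ToySource:
    """Toy model of the source side; not the QEMU implementation."""

    def __init__(self):
        self.state = State.POSTCOPY_DEVICE
        self.sent = []

    def send_device_state(self):
        # Queue the device state, followed by the custom PING.
        self.sent.append(("DEVICE_STATE", None))
        self.sent.append(("PING", DEVICE_PING))

    def on_pong(self, number):
        # Only the PONG matching the custom PING ends POSTCOPY_DEVICE.
        if self.state is State.POSTCOPY_DEVICE and number == DEVICE_PING:
            self.state = State.POSTCOPY_ACTIVE

    def on_channel_error(self):
        """Return True if the source may resume the guest (assumption)."""
        if self.state is State.POSTCOPY_DEVICE:
            self.state = State.FAILED
            return True
        return False


def toy_destination_reply(message):
    """An unmodified destination echoes any PING number back as a PONG."""
    kind, number = message
    if kind == "PING":
        return ("PONG", number)
    return None
```

Because the toy destination echoes whatever PING number it receives, it
needs no knowledge of the new state, which mirrors the point made above
that the destination side does not have to change.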

As this change introduces a new migration state, I have also tested it
with libvirt. Apart from a warning about an unknown migration state
received in an event, migration finishes without any issues.

Juraj Marcin (3):
  migration: Accept MigrationStatus in migration_has_failed()
  migration: Refactor incoming cleanup into migration_incoming_finish()
  migration: Introduce POSTCOPY_DEVICE state

Peter Xu (1):
  migration: Do not try to start VM if disk activation fails

 migration/migration.c                 | 124 +++++++++++++++++---------
 migration/migration.h                 |   3 +-
 migration/multifd.c                   |   2 +-
 migration/savevm.c                    |  48 ++++------
 migration/savevm.h                    |   2 +
 migration/trace-events                |   1 +
 qapi/migration.json                   |   8 +-
 tests/qtest/migration/precopy-tests.c |   3 +-
 8 files changed, 112 insertions(+), 79 deletions(-)

-- 
2.51.0




end of thread, other threads:[~2025-10-02 13:15 UTC | newest]

Thread overview: 33+ messages
2025-09-15 11:59 [PATCH 0/4] migration: Introduce POSTCOPY_DEVICE state Juraj Marcin
2025-09-15 11:59 ` [PATCH 1/4] migration: Do not try to start VM if disk activation fails Juraj Marcin
2025-09-19 16:12   ` Fabiano Rosas
2025-09-15 11:59 ` [PATCH 2/4] migration: Accept MigrationStatus in migration_has_failed() Juraj Marcin
2025-09-19 14:57   ` Peter Xu
2025-09-22 11:26     ` Juraj Marcin
2025-09-15 11:59 ` [PATCH 3/4] migration: Refactor incoming cleanup into migration_incoming_finish() Juraj Marcin
2025-09-19 15:53   ` Peter Xu
2025-09-19 16:46   ` Fabiano Rosas
2025-09-22 12:58     ` Juraj Marcin
2025-09-22 15:51       ` Peter Xu
2025-09-22 17:40         ` Fabiano Rosas
2025-09-22 17:48           ` Peter Xu
2025-09-23 14:58         ` Juraj Marcin
2025-09-23 16:17           ` Peter Xu
2025-09-15 11:59 ` [PATCH 4/4] migration: Introduce POSTCOPY_DEVICE state Juraj Marcin
2025-09-19 16:58   ` Peter Xu
2025-09-19 17:50     ` Peter Xu
2025-09-22 13:34       ` Juraj Marcin
2025-09-22 16:16         ` Peter Xu
2025-09-23 14:23           ` Juraj Marcin
2025-09-25 11:54   ` Jiří Denemark
2025-09-25 18:22     ` Peter Xu
2025-09-30  7:53       ` Jiří Denemark
2025-09-30 20:04         ` Peter Xu
2025-10-01  8:43           ` Jiří Denemark
2025-10-01 11:05             ` Dr. David Alan Gilbert
2025-10-01 14:26               ` Jiří Denemark
2025-10-01 15:53                 ` Dr. David Alan Gilbert
2025-10-01 15:10               ` Daniel P. Berrangé
2025-10-02 12:17                 ` Jiří Denemark
2025-10-02 13:12                   ` Dr. David Alan Gilbert
2025-10-01 10:09           ` Juraj Marcin
