From: Peter Xu <peterx@redhat.com>
To: qemu-devel@nongnu.org
Cc: peterx@redhat.com, Juraj Marcin <jmarcin@redhat.com>,
	Julia Suvorova <jusual@redhat.com>,
	Prasad Pandit <ppandit@redhat.com>,
	Fabiano Rosas <farosas@suse.de>
Subject: [PATCH 00/16] migration: Switchover phase refactoring
Date: Tue, 14 Jan 2025 18:07:30 -0500
Message-ID: <20250114230746.3268797-1-peterx@redhat.com>

CI: https://gitlab.com/peterx/qemu/-/pipelines/1625266692
    (note: there is a warning on the Rust parts, but it should be unrelated)

This series refactors the migration switchover path quite a bit.  I started
this work to measure the JSON writer overhead, but then decided to clean up
the switchover path in general while at it, as I have wanted to do that for
a long time.

A few major things I tried to do:

  - About the JSON writer

    Currently, precopy migration dumps a chunk of data called the VM
    description (QEMU_VM_VMDESCRIPTION) for debugging purposes.  That is a
    JSON blob describing all the vmstates dumped in the migration stream.
    QEMU has a machine property, suppress-vmdesc, that decides whether the
    migration stream includes that JSON chunk.

    Postcopy does not have such a JSON dump, because postcopy is a live
    session and normally can't be debugged at the stream level (e.g. as a
    streamed file).

    A small problem is that we don't yet know how many CPU cycles it takes
    to construct and dump these JSONs, even though they're only for
    debugging.  Also, QEMU still constructs these JSONs when they will not
    be sent (with suppress-vmdesc=on, and also for postcopy).

    This series has a few patches that make sure the JSON blob won't be
    constructed when it is not needed (either postcopy, or
    suppress-vmdesc=on).  I tried to measure the downtime difference with
    and without these changes, and the time QEMU takes to construct and
    dump the JSON blob was not measurable.  So I suppose having it
    unconditionally is OK.  That said, let's still have these changes so
    that we avoid JSON operations when they are not needed.

  - DEVICE migration state

    QEMU has a very special DEVICE migration state that only happens with
    precopy, and only when the pause-before-switchover capability is
    enabled.  Because of that special case we can't merge the precopy and
    postcopy code at switchover start, as the state machines differ.

    However, after checking the history and also checking with libvirt
    developers, this seems unnecessary.  So one patch makes the DEVICE
    state the "switchover" phase for both precopy and postcopy,
    unconditionally.  That makes the state machine much simpler for both
    modes, and nothing is expected to break with it (but please still
    speak up if anyone knows or suspects that something will, or could,
    break).

  - General cleanups and fixes

    Most of the remaining changes are assorted cleanups and fixes in the
    switchover path.

    E.g., postcopy_start() has some code that isn't easy to read due to
    special flags scattered around, mostly around the two calls to
    qemu_savevm_state_complete_precopy().  This series removes most of
    that special treatment.

    In the past we may have done some things twice in postcopy switchover
    (e.g. I believe we sync the CPU states twice, though only with
    postcopy); these should now all be sorted out.

    Quite a few other things can hopefully be discussed and justified
    separately in each patch.  After these cleanups, we will have a
    unified entry point for precopy/postcopy switchover.
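
For illustration only (this is not code from the series), the DEVICE state
change described above can be sketched as a tiny state machine.  All names
below are hypothetical, the states are a simplified subset of the real
migration states (e.g. the pre-switchover pause is left out), and the exact
transitions are an assumption based on the description in this letter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified subset of the migration states. */
typedef enum {
    SKETCH_ACTIVE,          /* iterating, guest still running on source */
    SKETCH_DEVICE,          /* switchover phase: dumping device state */
    SKETCH_POSTCOPY_ACTIVE, /* guest running on destination, pages pulled */
    SKETCH_COMPLETED,
} SketchState;

/*
 * Old rule as described in the cover letter: DEVICE is only entered for
 * precopy, and only when pause-before-switchover is enabled.
 */
bool sketch_enters_device_old(bool postcopy, bool pause_before_switchover)
{
    return !postcopy && pause_before_switchover;
}

/*
 * Assumed new rule: DEVICE is always the switchover phase, so precopy and
 * postcopy share the first transition and only diverge afterwards.
 */
SketchState sketch_next_state(SketchState cur, bool postcopy)
{
    switch (cur) {
    case SKETCH_ACTIVE:
        return SKETCH_DEVICE;
    case SKETCH_DEVICE:
        return postcopy ? SKETCH_POSTCOPY_ACTIVE : SKETCH_COMPLETED;
    case SKETCH_POSTCOPY_ACTIVE:
        return SKETCH_COMPLETED;
    case SKETCH_COMPLETED:
        return SKETCH_COMPLETED;
    }
    assert(!"unreachable: unknown state");
    return cur;
}
```

With the old rule the set of visited states depends on both the mode and
the capability; with the assumed new rule, ACTIVE always moves to DEVICE,
which is what allows a single entry point for the switchover.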

Initially I thought this could optimize the downtime slightly, but after
some tests it turns out there's no measurable difference, at least in my
current setup.  So let's take this as a cleanup series, at least for now; I
hope the changes still make sense.  Comments welcome.
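
As a rough sketch of the JSON writer gating mentioned above (again not the
actual QEMU code; the struct and function names are made up for this
example), the decision of whether to construct the VM description at all
could look like this:

```c
#include <stdbool.h>

/* Hypothetical inputs to the decision; not a real QEMU structure. */
typedef struct {
    bool suppress_vmdesc; /* machine property suppress-vmdesc=on */
    bool in_postcopy;     /* this switchover is for a postcopy migration */
} SketchSwitchover;

/*
 * Returns true only when the JSON description will really be sent, so the
 * caller can skip creating the JSON writer entirely in the other cases.
 */
bool sketch_should_build_vmdesc(const SketchSwitchover *s)
{
    if (s->suppress_vmdesc) {
        return false; /* user asked for no description in the stream */
    }
    if (s->in_postcopy) {
        return false; /* postcopy never sends the description */
    }
    return true;
}
```

The point of checking this before the device state is saved, rather than
just before the blob is written, is that no JSON work is done at all when
the result would be thrown away.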

Thanks,

Peter Xu (16):
  migration: Remove postcopy implications in should_send_vmdesc()
  migration: Do not construct JSON description if suppressed
  migration: Optimize postcopy on downtime by avoiding JSON writer
  migration: Avoid two src-downtime-end tracepoints for postcopy
  migration: Drop inactivate_disk param in qemu_savevm_state_complete*
  migration: Synchronize all CPU states only for non-iterable dump
  migration: Adjust postcopy bandwidth during switchover
  migration: Adjust locking in migration_maybe_pause()
  migration: Drop cached migration state in migration_maybe_pause()
  migration: Take BQL slightly longer in postcopy_start()
  migration: Notify COMPLETE once for postcopy
  migration: Unwrap qemu_savevm_state_complete_precopy() in postcopy
  migration: Cleanup qemu_savevm_state_complete_precopy()
  migration: Always set DEVICE state
  migration: Merge precopy/postcopy on switchover start
  migration: Trivial cleanup on JSON writer of vmstate_save()

 qapi/migration.json         |   7 +-
 migration/migration.h       |   1 +
 migration/savevm.h          |   6 +-
 migration/migration.c       | 209 +++++++++++++++++++++++-------------
 migration/savevm.c          | 116 ++++++++------------
 migration/vmstate.c         |   6 +-
 tests/qtest/libqos/libqos.c |   3 +-
 migration/trace-events      |   2 +-
 tests/qemu-iotests/194.out  |   1 +
 tests/qemu-iotests/203.out  |   1 +
 tests/qemu-iotests/234.out  |   2 +
 tests/qemu-iotests/262.out  |   1 +
 tests/qemu-iotests/280.out  |   1 +
 13 files changed, 200 insertions(+), 156 deletions(-)

-- 
2.47.0




Thread overview: 21+ messages
2025-01-14 23:07 Peter Xu [this message]
2025-01-14 23:07 ` [PATCH 01/16] migration: Remove postcopy implications in should_send_vmdesc() Peter Xu
2025-01-14 23:07 ` [PATCH 02/16] migration: Do not construct JSON description if suppressed Peter Xu
2025-01-14 23:07 ` [PATCH 03/16] migration: Optimize postcopy on downtime by avoiding JSON writer Peter Xu
2025-01-14 23:07 ` [PATCH 04/16] migration: Avoid two src-downtime-end tracepoints for postcopy Peter Xu
2025-01-14 23:07 ` [PATCH 05/16] migration: Drop inactivate_disk param in qemu_savevm_state_complete* Peter Xu
2025-01-14 23:07 ` [PATCH 06/16] migration: Synchronize all CPU states only for non-iterable dump Peter Xu
2025-01-14 23:07 ` [PATCH 07/16] migration: Adjust postcopy bandwidth during switchover Peter Xu
2025-01-14 23:07 ` [PATCH 08/16] migration: Adjust locking in migration_maybe_pause() Peter Xu
2025-01-14 23:07 ` [PATCH 09/16] migration: Drop cached migration state " Peter Xu
2025-01-14 23:07 ` [PATCH 10/16] migration: Take BQL slightly longer in postcopy_start() Peter Xu
2025-01-14 23:07 ` [PATCH 11/16] migration: Notify COMPLETE once for postcopy Peter Xu
2025-01-14 23:07 ` [PATCH 12/16] migration: Unwrap qemu_savevm_state_complete_precopy() in postcopy Peter Xu
2025-01-14 23:07 ` [PATCH 13/16] migration: Cleanup qemu_savevm_state_complete_precopy() Peter Xu
2025-01-14 23:07 ` [PATCH 14/16] migration: Always set DEVICE state Peter Xu
2025-01-14 23:07 ` [PATCH 15/16] migration: Merge precopy/postcopy on switchover start Peter Xu
2025-01-14 23:07 ` [PATCH 16/16] migration: Trivial cleanup on JSON writer of vmstate_save() Peter Xu
2025-01-15  9:12 ` [PATCH 00/16] migration: Switchover phase refactoring Jiri Denemark
2025-01-15 12:55   ` Peter Xu
2025-01-15 16:13 ` Juraj Marcin
2025-01-15 16:49 ` Fabiano Rosas
