From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
Thomas Huth <thuth@redhat.com>,
qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH v6 0/6] migration/postcopy: Sync faulted addresses after network recovered
Date: Mon, 26 Oct 2020 13:57:56 +0000
Message-ID: <20201026135756.GC28658@work-vm>
In-Reply-To: <20201021212721.440373-1-peterx@redhat.com>
* Peter Xu (peterx@redhat.com) wrote:
> v6:
> - fix page mask to use ramblock psize [Dave]
>
> v5:
> - added one test patch for easier debugging for migration-test
> - added one fix patch [1] for another postcopy race
> - fixed a bug that could trigger when host/guest page size differs
>
> v4:
> - use "void */ulong" instead of "uint64_t" where proper in patch 3/4 [Dave]
>
> v3:
> - fix build on 32bit hosts & rebase
> - remove r-bs for the last 2 patches for Dave due to the changes
>
> v2:
> - add r-bs for Dave
> - add patch "migration: Properly destroy variables on incoming side" as patch 1
> - destroy page_request_mutex in migration_incoming_state_destroy() too [Dave]
> - use WITH_QEMU_LOCK_GUARD in two places where we can [Dave]
>
> We've seen the guest hang on the destination VM under some conditions after a
> postcopy recovery. The hang resolves itself after a few minutes, however.
>
> The problem is that after a postcopy recovery, the prioritized postcopy queue
> on the source VM is lost. So every thread that faulted before the recovery
> stays halted until the page it is waiting for happens to be copied by the
> background precopy migration stream.
>
> The solution is to refresh this information after a postcopy recovery as well.
> To achieve this, we maintain a list of faulted addresses on the destination
> node, so that we can resend the list when necessary. This work is done in
> patches 2-5.
>
> With that, the last thing we need to do is to send this extra information to
> the source VM after recovery. Luckily, this synchronization can be "emulated"
> by sending a batch of page requests (although these pages have been sent
> previously!) to the source VM, just as when we get a page fault. Even the 1st
> version of the postcopy code handles duplicated pages well, so this fix does
> not need a new capability bit, and it works smoothly with old QEMUs when we
> migrate from them to the new QEMUs.
>
> Please review, thanks.
Queued
Dave
>
> Peter Xu (6):
> migration: Pass incoming state into qemu_ufd_copy_ioctl()
> migration: Introduce migrate_send_rp_message_req_pages()
> migration: Maintain postcopy faulted addresses
> migration: Sync requested pages after postcopy recovery
> migration/postcopy: Release fd before going into 'postcopy-pause'
> migration-test: Only hide error if !QTEST_LOG
>
> migration/migration.c | 55 ++++++++++++++++++++++++++++++----
> migration/migration.h | 21 ++++++++++++-
> migration/postcopy-ram.c | 25 ++++++++++++----
> migration/savevm.c | 57 ++++++++++++++++++++++++++++++++++++
> migration/trace-events | 3 ++
> tests/qtest/migration-test.c | 6 +++-
> 6 files changed, 154 insertions(+), 13 deletions(-)
>
> --
> 2.26.2
>
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
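
The approach described in the cover letter (remember which addresses have
faulted on the destination, then replay them as ordinary page requests once the
channel is recovered) can be illustrated with a small model. This is only a
sketch in Python, not the QEMU code: the class `Destination`, its methods, and
the single fixed page size are all assumptions made for illustration (the v6
changelog notes that the series itself uses the ramblock page size), and the
list `sent` stands in for the return path to the source.

```python
from dataclasses import dataclass, field


@dataclass
class Destination:
    """Hypothetical model of the destination side of a postcopy migration."""
    page_size: int = 4096  # assumption: one page size for all of RAM
    # Page-aligned addresses that have faulted and are not yet filled.
    pending: set = field(default_factory=set)
    # Page requests "sent" to the source; a list stands in for the channel.
    sent: list = field(default_factory=list)

    def _align(self, addr):
        # Round down to the start of the containing page.
        return addr & ~(self.page_size - 1)

    def on_fault(self, addr):
        # Record the faulted page and request it, once per page.
        page = self._align(addr)
        if page not in self.pending:
            self.pending.add(page)
            self.sent.append(page)

    def on_page_received(self, addr):
        # The page arrived, so nothing is waiting on it any more.
        self.pending.discard(self._align(addr))

    def on_recovered(self):
        # After recovery, replay every outstanding request. A request for a
        # page the source already sent is harmless, because the source is
        # expected to tolerate duplicates.
        resent = sorted(self.pending)
        self.sent.extend(resent)
        return resent


dst = Destination()
dst.on_fault(0x1234)
dst.on_fault(0x1ff0)          # same page as 0x1234, not requested twice
dst.on_fault(0x5008)
dst.on_page_received(0x1000)  # first page arrives before the network breaks
print([hex(p) for p in dst.on_recovered()])  # prints ['0x5000']
```

Because the replay reuses the existing page-request path, the model needs no
new message type, which mirrors the cover letter's point that no new capability
bit is required.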
Thread overview: 10+ messages
2020-10-21 21:27 [PATCH v6 0/6] migration/postcopy: Sync faulted addresses after network recovered Peter Xu
2020-10-21 21:27 ` [PATCH v6 1/6] migration: Pass incoming state into qemu_ufd_copy_ioctl() Peter Xu
2020-10-21 21:27 ` [PATCH v6 2/6] migration: Introduce migrate_send_rp_message_req_pages() Peter Xu
2020-10-21 21:27 ` [PATCH v6 3/6] migration: Maintain postcopy faulted addresses Peter Xu
2020-10-22 16:46 ` Dr. David Alan Gilbert
2020-10-21 21:27 ` [PATCH v6 4/6] migration: Sync requested pages after postcopy recovery Peter Xu
2020-10-21 21:27 ` [PATCH v6 5/6] migration/postcopy: Release fd before going into 'postcopy-pause' Peter Xu
2020-10-21 21:27 ` [PATCH v6 6/6] migration-test: Only hide error if !QTEST_LOG Peter Xu
2020-10-22 5:34 ` Thomas Huth
2020-10-26 13:57 ` Dr. David Alan Gilbert [this message]