From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Manish Mishra <manish.mishra@nutanix.com>,
Juan Quintela <quintela@redhat.com>,
ani@anisinha.ca,
Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
"Daniel P . Berrange" <berrange@redhat.com>
Subject: Re: [PATCH 12/14] migration: Send requested page directly in rp-return thread
Date: Thu, 6 Oct 2022 18:51:30 +0100 [thread overview]
Message-ID: <Yz8VoviZmNeSAgWu@work-vm> (raw)
In-Reply-To: <20220920225225.49105-1-peterx@redhat.com>
* Peter Xu (peterx@redhat.com) wrote:
> With all the facilities ready, send the requested page directly in the
> rp-return thread rather than queuing it in the request queue, if and only
> if postcopy preempt is enabled. This is possible because preempt mode uses
> a separate channel for sending urgent pages. The only shared data is the
> bitmap, and it's protected by the bitmap_mutex.
>
> Note that since we're moving the ownership of the urgent channel from the
> migration thread to the rp thread, the rp thread also becomes responsible
> for managing the qemufile, e.g. properly closing it when migration is
> paused. For this, let migration_release_from_dst_file() cover shutdown of
> the urgent channel too, renaming it migration_release_dst_files() to
> better show what it does.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
Yes, getting a bit complex, but I think OK.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> migration/migration.c | 35 +++++++------
> migration/ram.c | 112 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 131 insertions(+), 16 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 0eacc0c99b..fae8fd378b 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2845,8 +2845,11 @@ static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value)
> return 0;
> }
>
> -/* Release ms->rp_state.from_dst_file in a safe way */
> -static void migration_release_from_dst_file(MigrationState *ms)
> +/*
> + * Release ms->rp_state.from_dst_file (and postcopy_qemufile_src if
> + * existed) in a safe way.
> + */
> +static void migration_release_dst_files(MigrationState *ms)
> {
> QEMUFile *file;
>
> @@ -2859,6 +2862,18 @@ static void migration_release_from_dst_file(MigrationState *ms)
> ms->rp_state.from_dst_file = NULL;
> }
>
> + /*
> + * Do the same to the postcopy fast path socket too, if there is one.
> + * No locking needed because this qemufile should only be managed by
> + * the return path thread.
> + */
> + if (ms->postcopy_qemufile_src) {
> + migration_ioc_unregister_yank_from_file(ms->postcopy_qemufile_src);
> + qemu_file_shutdown(ms->postcopy_qemufile_src);
> + qemu_fclose(ms->postcopy_qemufile_src);
> + ms->postcopy_qemufile_src = NULL;
> + }
> +
> qemu_fclose(file);
> }
>
> @@ -3003,7 +3018,7 @@ out:
> * Maybe there is something we can do: it looks like a
> * network down issue, and we pause for a recovery.
> */
> - migration_release_from_dst_file(ms);
> + migration_release_dst_files(ms);
> rp = NULL;
> if (postcopy_pause_return_path_thread(ms)) {
> /*
> @@ -3021,7 +3036,7 @@ out:
> }
>
> trace_source_return_path_thread_end();
> - migration_release_from_dst_file(ms);
> + migration_release_dst_files(ms);
> rcu_unregister_thread();
> return NULL;
> }
> @@ -3544,18 +3559,6 @@ static MigThrError postcopy_pause(MigrationState *s)
> qemu_file_shutdown(file);
> qemu_fclose(file);
>
> - /*
> - * Do the same to postcopy fast path socket too if there is. No
> - * locking needed because no racer as long as we do this before setting
> - * status to paused.
> - */
> - if (s->postcopy_qemufile_src) {
> - migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src);
> - qemu_file_shutdown(s->postcopy_qemufile_src);
> - qemu_fclose(s->postcopy_qemufile_src);
> - s->postcopy_qemufile_src = NULL;
> - }
> -
> migrate_set_state(&s->state, s->state,
> MIGRATION_STATUS_POSTCOPY_PAUSED);
>
> diff --git a/migration/ram.c b/migration/ram.c
> index fdcb61a2c8..fd301d793c 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -539,6 +539,8 @@ static QemuThread *decompress_threads;
> static QemuMutex decomp_done_lock;
> static QemuCond decomp_done_cond;
>
> +static int ram_save_host_page_urgent(PageSearchStatus *pss);
> +
> static bool do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock *block,
> ram_addr_t offset, uint8_t *source_buf);
>
> @@ -553,6 +555,16 @@ static void pss_init(PageSearchStatus *pss, RAMBlock *rb, ram_addr_t page)
> pss->complete_round = false;
> }
>
> +/*
> + * Check whether two PSSs are actively sending the same page. Return true
> + * if it is, false otherwise.
> + */
> +static bool pss_overlap(PageSearchStatus *pss1, PageSearchStatus *pss2)
> +{
> + return pss1->host_page_sending && pss2->host_page_sending &&
> + (pss1->host_page_start == pss2->host_page_start);
> +}
> +
> static void *do_data_compress(void *opaque)
> {
> CompressParam *param = opaque;
> @@ -2253,6 +2265,57 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len)
> return -1;
> }
>
> + /*
> + * When postcopy preempt is enabled, send back the page directly in
> + * the rp-return thread.
> + */
> + if (postcopy_preempt_active()) {
> + ram_addr_t page_start = start >> TARGET_PAGE_BITS;
> + size_t page_size = qemu_ram_pagesize(ramblock);
> + PageSearchStatus *pss = &ram_state->pss[RAM_CHANNEL_POSTCOPY];
> + int ret = 0;
> +
> + qemu_mutex_lock(&rs->bitmap_mutex);
> +
> + pss_init(pss, ramblock, page_start);
> + /*
> + * Always use the preempt channel, and make sure it's there. It's
> + * safe to access without a lock: while the rp thread is running,
> + * we should be the only one operating on this qemufile.
> + */
> + pss->pss_channel = migrate_get_current()->postcopy_qemufile_src;
> + pss->postcopy_requested = true;
> + assert(pss->pss_channel);
> +
> + /*
> + * The length must be a multiple of the host page size. Just
> + * assert; if something is wrong we're mostly split-brain anyway.
> + */
> + assert(len % page_size == 0);
> + while (len) {
> + if (ram_save_host_page_urgent(pss)) {
> + error_report("%s: ram_save_host_page_urgent() failed: "
> + "ramblock=%s, start_addr=0x"RAM_ADDR_FMT,
> + __func__, ramblock->idstr, start);
> + ret = -1;
> + break;
> + }
> + /*
> + * NOTE: after ram_save_host_page_urgent() succeeded, pss->page
> + * will automatically be moved and point to the next host page
> + * we're going to send, so no need to update here.
> + *
> + * Normally QEMU never sends more than one host page per
> + * request, so the loop should logically run only once; keep
> + * it anyway for consistency.
> + */
> + len -= page_size;
> + }
> + qemu_mutex_unlock(&rs->bitmap_mutex);
> +
> + return ret;
> + }
> +
> struct RAMSrcPageRequest *new_entry =
> g_new0(struct RAMSrcPageRequest, 1);
> new_entry->rb = ramblock;
> @@ -2531,6 +2594,55 @@ static void pss_host_page_finish(PageSearchStatus *pss)
> pss->host_page_start = pss->host_page_end = 0;
> }
>
> +/*
> + * Send an urgent host page specified by `pss'. Must be called with
> + * bitmap_mutex held.
> + *
> + * Returns 0 if saving the host page succeeded, -1 otherwise.
> + */
> +static int ram_save_host_page_urgent(PageSearchStatus *pss)
> +{
> + bool page_dirty, sent = false;
> + RAMState *rs = ram_state;
> + int ret = 0;
> +
> + trace_postcopy_preempt_send_host_page(pss->block->idstr, pss->page);
> + pss_host_page_prepare(pss);
> +
> + /*
> + * If precopy is sending the same page, let precopy finish it;
> + * otherwise we could send the same page on two channels and the
> + * destination would receive a complete copy on neither.
> + */
> + if (pss_overlap(pss, &ram_state->pss[RAM_CHANNEL_PRECOPY])) {
> + trace_postcopy_preempt_hit(pss->block->idstr,
> + pss->page << TARGET_PAGE_BITS);
> + return 0;
> + }
> +
> + do {
> + page_dirty = migration_bitmap_clear_dirty(rs, pss->block, pss->page);
> +
> + if (page_dirty) {
> + /* Be strict about the return code; it must be 1 (one page sent). */
> + if (ram_save_target_page(rs, pss) != 1) {
> + error_report_once("%s: ram_save_target_page failed", __func__);
> + ret = -1;
> + goto out;
> + }
> + sent = true;
> + }
> + pss_find_next_dirty(pss);
> + } while (pss_within_range(pss));
> +out:
> + pss_host_page_finish(pss);
> + /* For urgent requests, flush immediately if sent */
> + if (sent) {
> + qemu_fflush(pss->pss_channel);
> + }
> + return ret;
> +}
> +
> /**
> * ram_save_host_page: save a whole host page
> *
> --
> 2.32.0
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK