From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org,
Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
Juan Quintela <quintela@redhat.com>,
Manish Mishra <manish.mishra@nutanix.com>,
"Daniel P . Berrange" <berrange@redhat.com>
Subject: Re: [PATCH v6 10/13] migration: Respect postcopy request order in preemption mode
Date: Mon, 23 May 2022 11:56:14 +0100 [thread overview]
Message-ID: <YotoTrRaTIaQdVR4@work-vm> (raw)
In-Reply-To: <20220517195730.32312-11-peterx@redhat.com>
* Peter Xu (peterx@redhat.com) wrote:
> With preemption mode on, when we see a postcopy request that was requesting
> for exactly the page that we have preempted before (so we've partially sent
> the page already via PRECOPY channel and it got preempted by another
> postcopy request), currently we drop the request so that after all the
> other postcopy requests are serviced then we'll go back to precopy stream
> and start to handle that.
>
> We dropped the request because we can't send it via postcopy channel since
> the precopy channel already contains partial of the data, and we can only
> send a huge page via one channel as a whole. We can't split a huge page
> into two channels.
>
> That's a very corner case and that works, but there's a change on the order
> of postcopy requests that we handle since we're postponing this (unlucky)
> postcopy request to be later than the other queued postcopy requests. The
> problem is there's a possibility that when the guest was very busy, the
> postcopy queue can be always non-empty, it means this dropped request will
> never be handled until the end of postcopy migration. So, there's a chance
> that there's one dest QEMU vcpu thread waiting for a page fault for an
> extremely long time just because it's unluckily accessing the specific page
> that was preempted before.
>
> The worst case time it needs can be as long as the whole postcopy migration
> procedure. It's extremely unlikely to happen, but when it happens it's not
> good.
>
> The root cause of this problem is because we treat pss->postcopy_requested
> variable as with two meanings bound together, as the variable shows:
>
> 1. Whether this page request is urgent, and,
> 2. Which channel we should use for this page request.
>
> With the old code, when we set postcopy_requested it means either both (1)
> and (2) are true, or both (1) and (2) are false. We can never have (1)
> and (2) to have different values.
>
> However it doesn't necessarily need to be like that. It's very legal that
> there's one request that has (1) very high urgency, but (2) we'd like to
> use the precopy channel. Just like the corner case we were discussing
> above.
>
> To differenciate the two meanings better, introduce a new field called
> postcopy_target_channel, showing which channel we should use for this page
> request, so as to cover the old meaning (2) only. Then we leave the
> postcopy_requested variable to stand only for meaning (1), which is the
> urgency of this page request.
>
> With this change, we can easily boost priority of a preempted precopy page
> as long as we know that page is also requested as a postcopy page. So with
> the new approach in get_queued_page() instead of dropping that request, we
> send it right away with the precopy channel so we get back the ordering of
> the page faults just like how they're requested on dest.
>
> Alongside, I touched up find_dirty_block() to only set the postcopy fields
> in the pss section if we're going through a postcopy migration. That's a
> very light optimization and shouldn't affect much.
>
> Reported-by: manish.mishra@nutanix.com
> Signed-off-by: Peter Xu <peterx@redhat.com>
So I think this is OK; getting a bit complicated!
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
> migration/ram.c | 69 +++++++++++++++++++++++++++++++++++++++----------
> 1 file changed, 55 insertions(+), 14 deletions(-)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 9d76db8491..fdcd9984fa 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -441,8 +441,28 @@ struct PageSearchStatus {
> unsigned long page;
> /* Set once we wrap around */
> bool complete_round;
> - /* Whether current page is explicitly requested by postcopy */
> + /*
> + * [POSTCOPY-ONLY] Whether current page is explicitly requested by
> + * postcopy. When set, the request is "urgent" because the dest QEMU
> + * threads are waiting for us.
> + */
> bool postcopy_requested;
> + /*
> + * [POSTCOPY-ONLY] The target channel to use to send current page.
> + *
> + * Note: This may _not_ match with the value in postcopy_requested
> + * above. Let's imagine the case where the postcopy request is exactly
> + * the page that we're sending in progress during precopy. In this case
> + * we'll have postcopy_requested set to true but the target channel
> + * will be the precopy channel (so that we don't split brain on that
> + * specific page since the precopy channel already contains partial of
> + * that page data).
> + *
> + * Besides that specific use case, postcopy_target_channel should
> + * always be equal to postcopy_requested, because by default we send
> + * postcopy pages via postcopy preempt channel.
> + */
> + bool postcopy_target_channel;
> };
> typedef struct PageSearchStatus PageSearchStatus;
>
> @@ -496,6 +516,9 @@ static QemuCond decomp_done_cond;
> static bool do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock *block,
> ram_addr_t offset, uint8_t *source_buf);
>
> +static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss,
> + bool postcopy_requested);
> +
> static void *do_data_compress(void *opaque)
> {
> CompressParam *param = opaque;
> @@ -1516,8 +1539,14 @@ retry:
> */
> static bool find_dirty_block(RAMState *rs, PageSearchStatus *pss, bool *again)
> {
> - /* This is not a postcopy requested page */
> - pss->postcopy_requested = false;
> + if (migration_in_postcopy()) {
> + /*
> + * This is not a postcopy requested page, mark it "not urgent", and
> + * use precopy channel to send it.
> + */
> + pss->postcopy_requested = false;
> + pss->postcopy_target_channel = RAM_CHANNEL_PRECOPY;
> + }
Do you need the 'if' here?
Dave
> pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
> if (pss->complete_round && pss->block == rs->last_seen_block &&
> @@ -2038,15 +2067,20 @@ static bool get_queued_page(RAMState *rs, PageSearchStatus *pss)
> RAMBlock *block;
> ram_addr_t offset;
>
> -again:
> block = unqueue_page(rs, &offset);
>
> if (block) {
> /* See comment above postcopy_preempted_contains() */
> if (postcopy_preempted_contains(rs, block, offset)) {
> trace_postcopy_preempt_hit(block->idstr, offset);
> - /* This request is dropped */
> - goto again;
> + /*
> + * If what we preempted previously was exactly what we're
> + * requesting right now, restore the preempted precopy
> + * immediately, boosting its priority as it's requested by
> + * postcopy.
> + */
> + postcopy_preempt_restore(rs, pss, true);
> + return true;
> }
> } else {
> /*
> @@ -2070,7 +2104,9 @@ again:
> * really rare.
> */
> pss->complete_round = false;
> + /* Mark it an urgent request, meanwhile using POSTCOPY channel */
> pss->postcopy_requested = true;
> + pss->postcopy_target_channel = RAM_CHANNEL_POSTCOPY;
> }
>
> return !!block;
> @@ -2324,7 +2360,8 @@ static bool postcopy_preempt_triggered(RAMState *rs)
> return rs->postcopy_preempt_state.preempted;
> }
>
> -static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss)
> +static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss,
> + bool postcopy_requested)
> {
> PostcopyPreemptState *state = &rs->postcopy_preempt_state;
>
> @@ -2332,8 +2369,15 @@ static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss)
>
> pss->block = state->ram_block;
> pss->page = state->ram_page;
> - /* This is not a postcopy request but restoring previous precopy */
> - pss->postcopy_requested = false;
> +
> + /* Whether this is a postcopy request? */
> + pss->postcopy_requested = postcopy_requested;
> + /*
> + * When restoring a preempted page, the old data resides in PRECOPY
> + * slow channel, even if postcopy_requested is set. So always use
> + * PRECOPY channel here.
> + */
> + pss->postcopy_target_channel = RAM_CHANNEL_PRECOPY;
>
> trace_postcopy_preempt_restored(pss->block->idstr, pss->page);
>
> @@ -2344,12 +2388,9 @@ static void postcopy_preempt_restore(RAMState *rs, PageSearchStatus *pss)
> static void postcopy_preempt_choose_channel(RAMState *rs, PageSearchStatus *pss)
> {
> MigrationState *s = migrate_get_current();
> - unsigned int channel;
> + unsigned int channel = pss->postcopy_target_channel;
> QEMUFile *next;
>
> - channel = pss->postcopy_requested ?
> - RAM_CHANNEL_POSTCOPY : RAM_CHANNEL_PRECOPY;
> -
> if (channel != rs->postcopy_channel) {
> if (channel == RAM_CHANNEL_PRECOPY) {
> next = s->to_dst_file;
> @@ -2505,7 +2546,7 @@ static int ram_find_and_save_block(RAMState *rs)
> * preempted precopy. Otherwise find the next dirty bit.
> */
> if (postcopy_preempt_triggered(rs)) {
> - postcopy_preempt_restore(rs, &pss);
> + postcopy_preempt_restore(rs, &pss, false);
> found = true;
> } else {
> /* priority queue empty, so just search for something dirty */
> --
> 2.32.0
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2022-05-23 11:04 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-05-17 19:57 [PATCH v6 00/13] migration: Postcopy Preemption Peter Xu
2022-05-17 19:57 ` [PATCH v6 01/13] migration: Add postcopy-preempt capability Peter Xu
2022-05-17 19:57 ` [PATCH v6 02/13] migration: Postcopy preemption preparation on channel creation Peter Xu
2022-05-17 19:57 ` [PATCH v6 03/13] migration: Postcopy preemption enablement Peter Xu
2022-05-17 19:57 ` [PATCH v6 04/13] migration: Postcopy recover with preempt enabled Peter Xu
2022-05-17 19:57 ` [PATCH v6 05/13] migration: Create the postcopy preempt channel asynchronously Peter Xu
2022-05-17 19:57 ` [PATCH v6 06/13] migration: Add property x-postcopy-preempt-break-huge Peter Xu
2022-05-17 19:57 ` [PATCH v6 07/13] migration: Add helpers to detect TLS capability Peter Xu
2022-05-18 8:57 ` Daniel P. Berrangé
2022-05-18 13:04 ` Peter Xu
2022-05-17 19:57 ` [PATCH v6 08/13] migration: Export tls-[creds|hostname|authz] params to cmdline too Peter Xu
2022-05-18 14:05 ` Daniel P. Berrangé
2022-05-17 19:57 ` [PATCH v6 09/13] migration: Enable TLS for preempt channel Peter Xu
2022-05-18 14:07 ` Daniel P. Berrangé
2022-05-17 19:57 ` [PATCH v6 10/13] migration: Respect postcopy request order in preemption mode Peter Xu
2022-05-23 10:56 ` Dr. David Alan Gilbert [this message]
2022-05-23 17:18 ` manish.mishra
2022-05-24 18:42 ` Peter Xu
2022-05-17 19:57 ` [PATCH v6 11/13] tests: Add postcopy tls migration test Peter Xu
2022-05-19 9:45 ` Dr. David Alan Gilbert
2022-05-19 10:11 ` Daniel P. Berrangé
2022-05-24 21:06 ` Peter Xu
2022-05-17 19:57 ` [PATCH v6 12/13] tests: Add postcopy tls recovery " Peter Xu
2022-05-19 9:34 ` Dr. David Alan Gilbert
2022-05-17 19:57 ` [PATCH v6 13/13] tests: Add postcopy preempt tests Peter Xu
2022-05-19 8:58 ` Dr. David Alan Gilbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YotoTrRaTIaQdVR4@work-vm \
--to=dgilbert@redhat.com \
--cc=berrange@redhat.com \
--cc=lsoaresp@redhat.com \
--cc=manish.mishra@nutanix.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).