qemu-devel.nongnu.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: Wei Yang <richardw.yang@linux.intel.com>
Cc: qemu-devel@nongnu.org,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
	quintela@redhat.com
Subject: Re: [PATCH 3/3] migration/postcopy: handle POSTCOPY_INCOMING_RUNNING corner case properly
Date: Wed, 9 Oct 2019 12:12:25 +0800
Message-ID: <20191009041225.GF10750@xz-x1>
In-Reply-To: <20191009010204.GC26203@richard>

On Wed, Oct 09, 2019 at 09:02:04AM +0800, Wei Yang wrote:
> On Tue, Oct 08, 2019 at 05:40:46PM +0100, Dr. David Alan Gilbert wrote:
> >* Wei Yang (richardw.yang@linux.intel.com) wrote:
> >> Currently, we blindly set PostcopyState to RUNNING, even if we find that
> >> the previous state is not LISTENING. This leads to a corner case.
> >> 
> >> First let's look at the code flow:
> >> 
> >> qemu_loadvm_state_main()
> >>     ret = loadvm_process_command()
> >>         loadvm_postcopy_handle_run()
> >>             return -1;
> >>     if (ret < 0) {
> >>         if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING)
> >>             ...
> >>     }
> >> 
> >> From the above snippet, the corner case is that
> >> loadvm_postcopy_handle_run() always sets the state to RUNNING and only
> >> then checks the previous state. If the previous state is not LISTENING,
> >> it returns -1. But at this moment, PostcopyState has already been set
> >> to RUNNING.
> >> 
> >> Then ret is checked in qemu_loadvm_state_main(); when it is -1,
> >> PostcopyState is checked. The current logic pauses postcopy and retries
> >> if PostcopyState is RUNNING. This is not what we expect, because
> >> postcopy is not active yet.
> >> 
> >> This patch makes sure the state is set to RUNNING only if the previous
> >> state is LISTENING, by introducing an old_state parameter in
> >> postcopy_state_set(). The new state is set only when the current state
> >> equals old_state.
> >> 
> >> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> >
> >OK, it's a shame to use a pointer there, but it works.
> 
> You mean the second parameter of postcopy_state_set()?
> 
> I don't have a better idea. Or we could introduce a new state,
> POSTCOPY_INCOMING_NOCHECK. Would you prefer that?

Maybe simply fix loadvm_postcopy_handle_run() to set the state after
the POSTCOPY_INCOMING_LISTENING check?
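
A minimal sketch of that reordering, written as a self-contained model
rather than the actual QEMU code. The enum is cut down to the states
that matter here, postcopy_state_set() is modelled as a plain
(non-atomic) exchange that returns the previous state, and the names
handle_run_set_first(), handle_run_check_first() and demo() are
hypothetical, introduced only for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-in for the incoming postcopy states (assumption). */
typedef enum {
    POSTCOPY_INCOMING_NONE,
    POSTCOPY_INCOMING_ADVISE,
    POSTCOPY_INCOMING_LISTENING,
    POSTCOPY_INCOMING_RUNNING,
} PostcopyState;

static PostcopyState incoming_postcopy_state = POSTCOPY_INCOMING_NONE;

PostcopyState postcopy_state_get(void)
{
    return incoming_postcopy_state;
}

/* Model: set the new state unconditionally and return the old one. */
PostcopyState postcopy_state_set(PostcopyState new_state)
{
    PostcopyState old = incoming_postcopy_state;
    incoming_postcopy_state = new_state;
    return old;
}

/* Ordering described in the patch: set RUNNING first, check afterwards. */
int handle_run_set_first(void)
{
    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
    if (ps != POSTCOPY_INCOMING_LISTENING) {
        fprintf(stderr, "RUN in wrong postcopy state (%d)\n", (int)ps);
        return -1;              /* state is already RUNNING here */
    }
    return 0;
}

/* Suggested ordering: check for LISTENING first, set RUNNING afterwards. */
int handle_run_check_first(void)
{
    PostcopyState ps = postcopy_state_get();
    if (ps != POSTCOPY_INCOMING_LISTENING) {
        fprintf(stderr, "RUN in wrong postcopy state (%d)\n", (int)ps);
        return -1;              /* state is left untouched */
    }
    postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
    return 0;
}

/* Hypothetical walk-through of the corner case with both orderings. */
void demo(void)
{
    postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
    assert(handle_run_set_first() == -1);
    assert(postcopy_state_get() == POSTCOPY_INCOMING_RUNNING);

    postcopy_state_set(POSTCOPY_INCOMING_ADVISE);
    assert(handle_run_check_first() == -1);
    assert(postcopy_state_get() == POSTCOPY_INCOMING_ADVISE);
}
```

With the check done first, a failure leaves the state as it was, so the
"ret < 0 and state is RUNNING" test in qemu_loadvm_state_main() would
not mistake it for an interrupted postcopy. Note that the check and the
set are two separate steps here; this sketch assumes nothing else
changes the state in between.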

> 
> >Note, something else; using '-1' as the return value and checking for it
> >is something we do a lot; but in this case it's an example of an error
> >we could never recover from so it never makes sense to try and recover.
> >We should probably look at different types of error.

It is true that we might hang on some real errors, but IMHO quitting
QEMU would be no better if we're in postcopy...

(What I have in mind here is that, even if postcopy failed, we might
 sometimes still have a chance to recover a full VM by dumping both the
 src and dst instances of the mid-postcopy VM and manually merging them
 by black magic, in very extreme cases.)

Regards,

-- 
Peter Xu

