qemu-devel.nongnu.org archive mirror
From: Peter Xu <peterx@redhat.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Richard Henderson <rth@twiddle.net>,
	Igor Mammedov <imammedo@redhat.com>
Subject: Re: [PATCH RFC 4/5] cpu: Allow cpu_synchronize_all_post_init() to take an errp
Date: Thu, 9 Jun 2022 17:02:29 -0400
Message-ID: <YqJf5WlSyK2o2xJg@xz-m1.local>
In-Reply-To: <YqDW2AZDb3buF9YQ@work-vm>

On Wed, Jun 08, 2022 at 06:05:28PM +0100, Dr. David Alan Gilbert wrote:
> > @@ -2005,7 +2005,17 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
> >      /* TODO we should move all of this lot into postcopy_ram.c or a shared code
> >       * in migration.c
> >       */
> > -    cpu_synchronize_all_post_init();
> > +    cpu_synchronize_all_post_init(&local_err);
> > +    if (local_err) {
> > +        /*
> > +         * TODO: a better way to do this is to tell the src that we cannot
> > +         * run the VM here so hopefully we can keep the VM running on src
> > +         * and immediately halt the switch-over.  But that needs work.
> 
> Yes, I think it is possible; unlike some of the later errors in the same
> function, in this case we know no disks/network/etc have been touched,
> so we should be able to recover.
> I wonder if we can move the postcopy_state_set(POSTCOPY_INCOMING_RUNNING)
> out of loadvm_postcopy_handle_run to after this point.
> 
> We've already got the return path, so we should be able to signal the
> failure unless we're very unlucky.

Right.  It's just that the new ACK will need a change to the return path
protocol, because none of the existing messages can carry that kind of
information.

One idea is to reuse MIG_RP_MSG_RESUME_ACK.  So far it has only been used by
postcopy recovery, for the final handshake, and always with payload=1 (which
is defined as MIGRATION_RESUME_ACK_VALUE).  We could fill in the payload
with some !1 value to tell the source that we NACK the migration, and then
src fails the migration as soon as possible?

That even seems to be compatible with the scenario of an old qemu migrating
to a new qemu, because when the old qemu sees a MIG_RP_MSG_RESUME_ACK
message with a !1 payload, it'll mark the rp bad:

  if (migrate_handle_rp_resume_ack(ms, tmp32)) {
      mark_source_rp_bad(ms);
      goto out;
  }

  static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value)
  {
      trace_source_return_path_thread_resume_ack(value);
  
      if (value != MIGRATION_RESUME_ACK_VALUE) {
          error_report("%s: illegal resume_ack value %"PRIu32,
                       __func__, value);
          return -1;
      }
      ...
  }
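
To make the idea a bit more concrete, here is a rough standalone sketch of
how a new source could tell the cases apart.  It is not QEMU code:
MIGRATION_RESUME_NACK_VALUE, its value of 2, the rp_ack_result enum and both
function names are made up for illustration; only MIGRATION_RESUME_ACK_VALUE
(1) comes from the existing code.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Existing value: the only payload the source accepts today. */
#define MIGRATION_RESUME_ACK_VALUE  1
/* Hypothetical: not defined in QEMU, chosen here only for illustration. */
#define MIGRATION_RESUME_NACK_VALUE 2

/* Hypothetical result type for the sketch. */
enum rp_ack_result {
    RP_ACK_OK,        /* destination acknowledged, carry on */
    RP_ACK_DEST_NACK, /* destination refused to run the VM */
    RP_ACK_ILLEGAL,   /* unknown payload, treat the rp as bad */
};

/* How a new source might classify the payload. */
static enum rp_ack_result classify_resume_ack(uint32_t value)
{
    switch (value) {
    case MIGRATION_RESUME_ACK_VALUE:
        return RP_ACK_OK;
    case MIGRATION_RESUME_NACK_VALUE:
        return RP_ACK_DEST_NACK;
    default:
        return RP_ACK_ILLEGAL;
    }
}

/* Models the existing check: anything but the ACK value is an error. */
static int old_source_handle_resume_ack(uint32_t value)
{
    if (value != MIGRATION_RESUME_ACK_VALUE) {
        fprintf(stderr, "illegal resume_ack value %" PRIu32 "\n", value);
        return -1;
    }
    return 0;
}
```

With this, an old source sees the NACK payload as an illegal value and fails
through the existing path, while a new source could recognize it and fail
the migration deliberately.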

If it looks generally good, I can try with such a change in v2.

Thanks,

-- 
Peter Xu


