qemu-devel.nongnu.org archive mirror
From: David Gibson <david@gibson.dropbear.id.au>
To: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
Cc: aik@au1.ibm.com, qemu-devel@nongnu.org, groug@kaod.org,
	paulus@ozlabs.org, qemu-ppc@nongnu.org
Subject: Re: [Qemu-devel] [PATCH v9 6/6] migration: Include migration support for machine check handling
Date: Thu, 6 Jun 2019 13:06:14 +1000
Message-ID: <20190606030614.GK10319@umbus.fritz.box>
In-Reply-To: <155910845769.13149.8097972239187020170.stgit@aravinda>

On Wed, May 29, 2019 at 11:10:57AM +0530, Aravinda Prasad wrote:
> This patch adds migration support for machine check
> handling. Specifically, it blocks VM migration
> requests until machine check error handling is
> complete, because (i) these errors are specific to the
> source hardware and are irrelevant on the target hardware,
> and (ii) these errors cause data corruption and should
> be handled before migration.
> 
> Signed-off-by: Aravinda Prasad <aravinda@linux.vnet.ibm.com>
> ---
>  hw/ppc/spapr.c         |   20 ++++++++++++++++++++
>  hw/ppc/spapr_events.c  |   17 +++++++++++++++++
>  hw/ppc/spapr_rtas.c    |    4 ++++
>  include/hw/ppc/spapr.h |    2 ++
>  4 files changed, 43 insertions(+)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index e8a77636..31c4850 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2104,6 +2104,25 @@ static const VMStateDescription vmstate_spapr_dtb = {
>      },
>  };
>  
> +static bool spapr_fwnmi_needed(void *opaque)
> +{
> +    SpaprMachineState *spapr = (SpaprMachineState *)opaque;
> +
> +    return (spapr->guest_machine_check_addr == -1) ? 0 : 1;

Since we're introducing a PAPR capability to enable this, it would
actually be better to check that here, rather than the runtime state.
That leads to fewer cases and easier-to-understand semantics for the
migration stream.
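
To make the difference concrete, here is a minimal, self-contained
sketch. It does not use QEMU's types: MachineModel, fwnmi_cap_enabled
and both predicate names are hypothetical stand-ins, and the real
capability name and accessor are deliberately not assumed here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant parts of SpaprMachineState. */
typedef struct MachineModel {
    bool fwnmi_cap_enabled;            /* fixed by the machine configuration */
    uint64_t guest_machine_check_addr; /* (uint64_t)-1 until the guest registers */
} MachineModel;

/* As posted: the answer changes once the guest registers a handler. */
static bool fwnmi_needed_by_runtime_state(const MachineModel *m)
{
    return m->guest_machine_check_addr != (uint64_t)-1;
}

/* As suggested: the answer depends only on the configured capability. */
static bool fwnmi_needed_by_capability(const MachineModel *m)
{
    return m->fwnmi_cap_enabled;
}
```

With the capability-based predicate, whether the subsection appears in
the stream is determined by configuration alone, so the same machine
setup always produces the same stream layout regardless of what the
guest has done so far.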

> +}
> +
> +static const VMStateDescription vmstate_spapr_machine_check = {
> +    .name = "spapr_machine_check",
> +    .version_id = 1,
> +    .minimum_version_id = 1,
> +    .needed = spapr_fwnmi_needed,
> +    .fields = (VMStateField[]) {
> +        VMSTATE_UINT64(guest_machine_check_addr, SpaprMachineState),
> +        VMSTATE_INT32(mc_status, SpaprMachineState),
> +        VMSTATE_END_OF_LIST()
> +    },
> +};
> +
>  static const VMStateDescription vmstate_spapr = {
>      .name = "spapr",
>      .version_id = 3,
> @@ -2137,6 +2156,7 @@ static const VMStateDescription vmstate_spapr = {
>          &vmstate_spapr_dtb,
>          &vmstate_spapr_cap_large_decr,
>          &vmstate_spapr_cap_ccf_assist,
> +        &vmstate_spapr_machine_check,
>          NULL
>      }
>  };
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 573c0b7..35e21e4 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -41,6 +41,7 @@
>  #include "qemu/bcd.h"
>  #include "hw/ppc/spapr_ovec.h"
>  #include <libfdt.h>
> +#include "migration/blocker.h"
>  
>  #define RTAS_LOG_VERSION_MASK                   0xff000000
>  #define   RTAS_LOG_VERSION_6                    0x06000000
> @@ -855,6 +856,22 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
>  void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>  {
>      SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
> +    int ret;
> +    Error *local_err = NULL;
> +
> +    error_setg(&spapr->fwnmi_migration_blocker,
> +            "Live migration not supported during machine check handling");
> +    ret = migrate_add_blocker(spapr->fwnmi_migration_blocker, &local_err);
> +    if (ret < 0) {
> +        /*
> +         * We don't want to abort, so let the migration continue. In a
> +         * rare case, the machine check handler will run on the target
> +         * hardware. Though this is not preferable, it is better than aborting
> +         * the migration or killing the VM.
> +         */
> +        error_free(spapr->fwnmi_migration_blocker);

You should set fwnmi_migration_blocker to NULL here as well.

As mentioned on an earlier iteration, the migration blocker is the
same every time.  Couldn't you just create it once and free it at final
teardown, rather than recreating it for every NMI?
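
A rough sketch of that lifecycle, as a standalone model rather than
QEMU code: Blocker, MachineModel, machine_init(), mce_begin(),
mce_end() and machine_teardown() are all hypothetical names, and the
add_ok flag merely simulates whether installing the blocker
(migrate_add_blocker() in the patch) succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not QEMU's real types or functions. */
typedef struct Blocker {
    const char *reason;
} Blocker;

typedef struct MachineModel {
    Blocker *fwnmi_blocker;   /* created once at init, freed at teardown */
    bool blocker_installed;   /* true only while an MCE is being handled */
} MachineModel;

/* Allocate the blocker a single time; returns false on allocation failure. */
static bool machine_init(MachineModel *m)
{
    m->blocker_installed = false;
    m->fwnmi_blocker = malloc(sizeof(*m->fwnmi_blocker));
    if (m->fwnmi_blocker == NULL) {
        return false;
    }
    m->fwnmi_blocker->reason =
        "Live migration not supported during machine check handling";
    return true;
}

/* MCE delivery: try to install the existing blocker, never reallocate it. */
static void mce_begin(MachineModel *m, bool add_ok)
{
    assert(m->fwnmi_blocker != NULL);
    m->blocker_installed = add_ok;
}

/* Interlock: uninstall only if the install succeeded; keep the object. */
static void mce_end(MachineModel *m)
{
    if (m->blocker_installed) {
        m->blocker_installed = false;
    }
}

/* Final teardown: free once and clear the pointer so it cannot dangle. */
static void machine_teardown(MachineModel *m)
{
    free(m->fwnmi_blocker);
    m->fwnmi_blocker = NULL;
}
```

In this shape a failed install leaves nothing to free on the error
path, so there is no dangling pointer for the interlock path to trip
over later.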

> +        warn_report_err(local_err);
> +    }
>  
>      while (spapr->mc_status != -1) {
>          /*
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 91a7ab9..c849223 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -50,6 +50,7 @@
>  #include "target/ppc/mmu-hash64.h"
>  #include "target/ppc/mmu-book3s-v3.h"
>  #include "kvm_ppc.h"
> +#include "migration/blocker.h"
>  
>  static void rtas_display_character(PowerPCCPU *cpu, SpaprMachineState *spapr,
>                                     uint32_t token, uint32_t nargs,
> @@ -404,6 +405,9 @@ static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
>          spapr->mc_status = -1;
>          qemu_cond_signal(&spapr->mc_delivery_cond);
>          rtas_st(rets, 0, RTAS_OUT_SUCCESS);
> +        migrate_del_blocker(spapr->fwnmi_migration_blocker);
> +        error_free(spapr->fwnmi_migration_blocker);
> +        spapr->fwnmi_migration_blocker = NULL;
>      }
>  }
>  
> diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> index bd75d4b..6c0cfd8 100644
> --- a/include/hw/ppc/spapr.h
> +++ b/include/hw/ppc/spapr.h
> @@ -214,6 +214,8 @@ struct SpaprMachineState {
>      SpaprCapabilities def, eff, mig;
>  
>      unsigned gpu_numa_id;
> +
> +    Error *fwnmi_migration_blocker;
>  };
>  
>  #define H_SUCCESS         0
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
