qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] FWNMI follow up patches
@ 2020-03-17  5:02 Nicholas Piggin
  2020-03-17  5:02 ` [PATCH 1/5] ppc/spapr: KVM FWNMI should not be enabled until guest requests it Nicholas Piggin
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-17  5:02 UTC (permalink / raw)
  To: qemu-ppc
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Nicholas Piggin, Greg Kurz, Cédric Le Goater, Ganesh Goudar,
	David Gibson

Here's a bunch of other patches remaining  after the last round,
for some less critical issues. Take these before or after the 5.0
freeze as you like.

Patch 1 is the main thing I think should be considered as a fix:
without it, non-FWNMI guests under KVM see behaviour change with
the FWNMI feature. I kept it out of the "must have" round because
Linux has long been FWNMI capable (I don't know state of FreeBSD
though), and because I have not tested under KVM with hardware
MCE injection.

2-3 are hopefully quite harmless comments and messages.

Patch 4 helps the guest stay up under some QoS corner cases.
Lastly is a machine check injection monitor command which helps
test things, it may not be ready for merge but it's useful for
the series.

Patch 5 is monitor command to inject MCEs, it's a bit janky
but it works to test qemu and guests.

Nicholas Piggin (5):
  ppc/spapr: KVM FWNMI should not be enabled until guest requests it
  ppc/spapr: Improve FWNMI machine check delivery corner case comments
  ppc/spapr: Add FWNMI machine check delivery warnings
  ppc/spapr: Don't kill the guest if a recovered FWNMI machine check
    delivery fails
  target/ppc: Implement simple monitor mce injection

 hmp-commands.hx        | 20 +++++++++++++++++++-
 hw/ppc/spapr.c         | 42 ++++++++++++++++++++++++++++++++++++++++++
 hw/ppc/spapr_caps.c    |  5 +++--
 hw/ppc/spapr_events.c  | 40 +++++++++++++++++++++++++++++-----------
 hw/ppc/spapr_rtas.c    | 11 +++++++++++
 include/hw/ppc/spapr.h |  3 +++
 target/ppc/cpu.h       |  3 +++
 target/ppc/kvm.c       |  7 +++++++
 target/ppc/kvm_ppc.h   |  6 ++++++
 target/ppc/monitor.c   | 26 ++++++++++++++++++++++++++
 10 files changed, 149 insertions(+), 14 deletions(-)

-- 
2.23.0



^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/5] ppc/spapr: KVM FWNMI should not be enabled until guest requests it
  2020-03-17  5:02 [PATCH 0/5] FWNMI follow up patches Nicholas Piggin
@ 2020-03-17  5:02 ` Nicholas Piggin
  2020-03-17 11:02   ` Greg Kurz
  2020-03-17  5:02 ` [PATCH 2/5] ppc/spapr: Improve FWNMI machine check delivery corner case comments Nicholas Piggin
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-17  5:02 UTC (permalink / raw)
  To: qemu-ppc
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Nicholas Piggin, Greg Kurz, Cédric Le Goater, Ganesh Goudar,
	David Gibson

The KVM FWNMI capability should be enabled with the "ibm,nmi-register"
rtas call. Although MCEs from KVM will be delivered as architected
interrupts to the guest before "ibm,nmi-register" is called, KVM has
different behaviour depending on whether the guest has enabled FWNMI
(it attempts to do more recovery on behalf of a non-FWNMI guest).

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/spapr_caps.c  | 5 +++--
 hw/ppc/spapr_rtas.c  | 7 +++++++
 target/ppc/kvm.c     | 7 +++++++
 target/ppc/kvm_ppc.h | 6 ++++++
 4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
index 679ae7959f..eb5521d0c2 100644
--- a/hw/ppc/spapr_caps.c
+++ b/hw/ppc/spapr_caps.c
@@ -517,9 +517,10 @@ static void cap_fwnmi_apply(SpaprMachineState *spapr, uint8_t val,
     }
 
     if (kvm_enabled()) {
-        if (kvmppc_set_fwnmi() < 0) {
+        if (!kvmppc_get_fwnmi()) {
             error_setg(errp, "Firmware Assisted Non-Maskable Interrupts(FWNMI) "
-                             "not supported by KVM");
+                             "not supported by KVM, "
+                             "try appending -machine cap-fwnmi=off");
         }
     }
 }
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 9fb8c8632a..29abe66d01 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -437,6 +437,13 @@ static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
         return;
     }
 
+    if (kvm_enabled()) {
+        if (kvmppc_set_fwnmi() < 0) {
+            rtas_st(rets, 0, RTAS_OUT_NOT_SUPPORTED);
+            return;
+        }
+    }
+
     spapr->fwnmi_system_reset_addr = sreset_addr;
     spapr->fwnmi_machine_check_addr = mce_addr;
 
diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 597f72be1b..03d0667e8f 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -88,6 +88,7 @@ static int cap_ppc_safe_indirect_branch;
 static int cap_ppc_count_cache_flush_assist;
 static int cap_ppc_nested_kvm_hv;
 static int cap_large_decr;
+static int cap_fwnmi;
 
 static uint32_t debug_inst_opcode;
 
@@ -136,6 +137,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
     kvmppc_get_cpu_characteristics(s);
     cap_ppc_nested_kvm_hv = kvm_vm_check_extension(s, KVM_CAP_PPC_NESTED_HV);
     cap_large_decr = kvmppc_get_dec_bits();
+    cap_fwnmi = kvm_vm_check_extension(s, KVM_CAP_PPC_FWNMI);
     /*
      * Note: setting it to false because there is not such capability
      * in KVM at this moment.
@@ -2064,6 +2066,11 @@ void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
     }
 }
 
+bool kvmppc_get_fwnmi(void)
+{
+    return cap_fwnmi;
+}
+
 int kvmppc_set_fwnmi(void)
 {
     PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
index 332fa0aa1c..fcaf745516 100644
--- a/target/ppc/kvm_ppc.h
+++ b/target/ppc/kvm_ppc.h
@@ -27,6 +27,7 @@ void kvmppc_enable_h_page_init(void);
 void kvmppc_set_papr(PowerPCCPU *cpu);
 int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr);
 void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy);
+bool kvmppc_get_fwnmi(void);
 int kvmppc_set_fwnmi(void);
 int kvmppc_smt_threads(void);
 void kvmppc_error_append_smt_possible_hint(Error *const *errp);
@@ -163,6 +164,11 @@ static inline void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
 {
 }
 
+static inline bool kvmppc_get_fwnmi(void)
+{
+    return false;
+}
+
 static inline int kvmppc_set_fwnmi(void)
 {
     return -1;
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/5] ppc/spapr: Improve FWNMI machine check delivery corner case comments
  2020-03-17  5:02 [PATCH 0/5] FWNMI follow up patches Nicholas Piggin
  2020-03-17  5:02 ` [PATCH 1/5] ppc/spapr: KVM FWNMI should not be enabled until guest requests it Nicholas Piggin
@ 2020-03-17  5:02 ` Nicholas Piggin
  2020-03-17 12:16   ` Greg Kurz
  2020-03-17  5:02 ` [PATCH 3/5] ppc/spapr: Add FWNMI machine check delivery warnings Nicholas Piggin
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-17  5:02 UTC (permalink / raw)
  To: qemu-ppc
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Nicholas Piggin, Greg Kurz, Cédric Le Goater, Ganesh Goudar,
	David Gibson

Some of the conditions are not as clearly documented as they could be.
Also the non-FWNMI case does not need a large comment.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/spapr_events.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 323fcef4aa..05337f0671 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -834,17 +834,13 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
     Error *local_err = NULL;
 
     if (spapr->fwnmi_machine_check_addr == -1) {
-        /*
-         * This implies that we have hit a machine check either when the
-         * guest has not registered FWNMI (i.e., "ibm,nmi-register" not
-         * called) or between system reset and "ibm,nmi-register".
-         * Fall back to the old machine check behavior in such cases.
-         */
+        /* Non-FWNMI case, deliver it like an architected CPU interrupt. */
         cs->exception_index = POWERPC_EXCP_MCHECK;
         ppc_cpu_do_interrupt(cs);
         return;
     }
 
+    /* Wait for FWNMI interlock. */
     while (spapr->fwnmi_machine_check_interlock != -1) {
         /*
          * Check whether the same CPU got machine check error
@@ -856,8 +852,13 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
             return;
         }
         qemu_cond_wait_iothread(&spapr->fwnmi_machine_check_interlock_cond);
-        /* Meanwhile if the system is reset, then just return */
         if (spapr->fwnmi_machine_check_addr == -1) {
+            /*
+             * If the machine was reset while waiting for the interlock,
+             * abort the delivery. The machine check applies to a context
+             * that no longer exists, so it wouldn't make sense to deliver
+             * it now.
+             */
             return;
         }
     }
@@ -868,7 +869,9 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
          * We don't want to abort so we let the migration to continue.
          * In a rare case, the machine check handler will run on the target.
          * Though this is not preferable, it is better than aborting
-         * the migration or killing the VM.
+         * the migration or killing the VM. It is okay to call
+         * migrate_del_blocker on a blocker that was not added (which the
+         * nmi-interlock handler would do when it's called after this).
          */
         warn_report("Received a fwnmi while migration was in progress");
     }
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/5] ppc/spapr: Add FWNMI machine check delivery warnings
  2020-03-17  5:02 [PATCH 0/5] FWNMI follow up patches Nicholas Piggin
  2020-03-17  5:02 ` [PATCH 1/5] ppc/spapr: KVM FWNMI should not be enabled until guest requests it Nicholas Piggin
  2020-03-17  5:02 ` [PATCH 2/5] ppc/spapr: Improve FWNMI machine check delivery corner case comments Nicholas Piggin
@ 2020-03-17  5:02 ` Nicholas Piggin
  2020-03-17 12:20   ` Greg Kurz
  2020-03-17  5:02 ` [PATCH 4/5] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails Nicholas Piggin
  2020-03-17  5:02 ` [PATCH 5/5] target/ppc: Implement simple monitor mce injection Nicholas Piggin
  4 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-17  5:02 UTC (permalink / raw)
  To: qemu-ppc
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Nicholas Piggin, Greg Kurz, Cédric Le Goater, Ganesh Goudar,
	David Gibson

Add some messages which explain problems and guest misbehaviour that
may be difficult to diagnose in rare cases of machine checks.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/spapr_events.c | 4 ++++
 hw/ppc/spapr_rtas.c   | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 05337f0671..d35151eeb0 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -807,6 +807,8 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
     /* get rtas addr from fdt */
     rtas_addr = spapr_get_rtas_addr();
     if (!rtas_addr) {
+        warn_report("FWNMI: Unable to deliver machine check to guest: "
+                    "rtas_addr not found.");
         qemu_system_guest_panicked(NULL);
         g_free(ext_elog);
         return;
@@ -848,6 +850,8 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
          * that CPU called "ibm,nmi-interlock")
          */
         if (spapr->fwnmi_machine_check_interlock == cpu->vcpu_id) {
+            warn_report("FWNMI: Unable to deliver machine check to guest: "
+                        "nested machine check.");
             qemu_system_guest_panicked(NULL);
             return;
         }
diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 29abe66d01..12cd09701c 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -462,6 +462,10 @@ static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
     }
 
     if (spapr->fwnmi_machine_check_addr == -1) {
+        qemu_log_mask(LOG_GUEST_ERROR,
+                      "FWNMI: ibm,nmi-interlock RTAS called with FWNMI not "
+                       "registered.\n");
+
         /* NMI register not called */
         rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
         return;
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/5] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails
  2020-03-17  5:02 [PATCH 0/5] FWNMI follow up patches Nicholas Piggin
                   ` (2 preceding siblings ...)
  2020-03-17  5:02 ` [PATCH 3/5] ppc/spapr: Add FWNMI machine check delivery warnings Nicholas Piggin
@ 2020-03-17  5:02 ` Nicholas Piggin
  2020-03-17 16:57   ` Greg Kurz
  2020-03-17  5:02 ` [PATCH 5/5] target/ppc: Implement simple monitor mce injection Nicholas Piggin
  4 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-17  5:02 UTC (permalink / raw)
  To: qemu-ppc
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Nicholas Piggin, Greg Kurz, Cédric Le Goater, Ganesh Goudar,
	David Gibson

Try to be tolerant of errors if the machine check had been recovered
by the host.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/spapr_events.c | 25 ++++++++++++++++++-------
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index d35151eeb0..3f524cb0ca 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -807,13 +807,20 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
     /* get rtas addr from fdt */
     rtas_addr = spapr_get_rtas_addr();
     if (!rtas_addr) {
-        warn_report("FWNMI: Unable to deliver machine check to guest: "
-                    "rtas_addr not found.");
-        qemu_system_guest_panicked(NULL);
+        if (!recovered) {
+            warn_report("FWNMI: Unable to deliver machine check to guest: "
+                        "rtas_addr not found.");
+            qemu_system_guest_panicked(NULL);
+        } else {
+            warn_report("FWNMI: Unable to deliver machine check to guest: "
+                        "rtas_addr not found. Machine check recovered.");
+        }
         g_free(ext_elog);
         return;
     }
 
+    spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
+
     stq_be_phys(&address_space_memory, rtas_addr + RTAS_ERROR_LOG_OFFSET,
                 env->gpr[3]);
     cpu_physical_memory_write(rtas_addr + RTAS_ERROR_LOG_OFFSET +
@@ -850,9 +857,14 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
          * that CPU called "ibm,nmi-interlock")
          */
         if (spapr->fwnmi_machine_check_interlock == cpu->vcpu_id) {
-            warn_report("FWNMI: Unable to deliver machine check to guest: "
-                        "nested machine check.");
-            qemu_system_guest_panicked(NULL);
+            if (!recovered) {
+                warn_report("FWNMI: Unable to deliver machine check to guest: "
+                            "nested machine check.");
+                qemu_system_guest_panicked(NULL);
+            } else {
+                warn_report("FWNMI: Unable to deliver machine check to guest: "
+                            "nested machine check. Machine check recovered.");
+            }
             return;
         }
         qemu_cond_wait_iothread(&spapr->fwnmi_machine_check_interlock_cond);
@@ -880,7 +892,6 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
         warn_report("Received a fwnmi while migration was in progress");
     }
 
-    spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
     spapr_mce_dispatch_elog(cpu, recovered);
 }
 
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 5/5] target/ppc: Implement simple monitor mce injection
  2020-03-17  5:02 [PATCH 0/5] FWNMI follow up patches Nicholas Piggin
                   ` (3 preceding siblings ...)
  2020-03-17  5:02 ` [PATCH 4/5] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails Nicholas Piggin
@ 2020-03-17  5:02 ` Nicholas Piggin
  4 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-17  5:02 UTC (permalink / raw)
  To: qemu-ppc
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Nicholas Piggin, Greg Kurz, Cédric Le Goater, Ganesh Goudar,
	David Gibson

This enables the mce monitor command for ppc, and adds a spapr
facility to inject machine check exception to a CPU by setting
low level registers.

  (qemu) mce 0 0x200000 0x80 0xdeadbeef 1

    Disabling lock debugging due to kernel taint
    MCE: CPU0: machine check (Severe) Host SLB Multihit [Recovered]
    MCE: CPU0: PID: 495 Comm: a NIP: [0000000130ee07c8]
    MCE: CPU0: Initiator CPU
    MCE: CPU0: Unknown

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hmp-commands.hx        | 20 +++++++++++++++++++-
 hw/ppc/spapr.c         | 42 ++++++++++++++++++++++++++++++++++++++++++
 include/hw/ppc/spapr.h |  3 +++
 target/ppc/cpu.h       |  3 +++
 target/ppc/monitor.c   | 26 ++++++++++++++++++++++++++
 5 files changed, 93 insertions(+), 1 deletion(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index 7f0f3974ad..4a9089b431 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1581,12 +1581,30 @@ ERST
         .cmd        = hmp_mce,
     },
 
-#endif
 SRST
 ``mce`` *cpu* *bank* *status* *mcgstatus* *addr* *misc*
   Inject an MCE on the given CPU (x86 only).
 ERST
 
+#endif
+
+#if defined(TARGET_PPC)
+
+    {
+        .name       = "mce",
+        .args_type  = "cpu_index:i,srr1_mask:l,dsisr:i,dar:l,recovered:i",
+        .params     = "cpu srr1_mask dsisr dar recovered",
+        .help       = "inject a MCE on the given CPU",
+        .cmd        = hmp_mce,
+    },
+
+SRST
+``mce`` *cpu* *srr1_mask* *dsisr* *dar* *recovered*
+  Inject an MCE on the given CPU (PPC only).
+ERST
+
+#endif
+
     {
         .name       = "getfd",
         .args_type  = "fdname:s",
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 78e649f47d..d83245c438 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -3454,6 +3454,47 @@ static void spapr_nmi(NMIState *n, int cpu_index, Error **errp)
     }
 }
 
+typedef struct MCEInjectionParams {
+    uint64_t srr1_mask;
+    uint32_t dsisr;
+    uint64_t dar;
+    bool recovered;
+} MCEInjectionParams;
+
+static void spapr_do_mce_on_cpu(CPUState *cs, run_on_cpu_data data)
+{
+    MCEInjectionParams *params = data.host_ptr;
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    CPUPPCState *env = &cpu->env;
+    uint64_t srr1_mce_bits = PPC_BITMASK(42,45) | PPC_BIT(36); /* POWER9 bits */
+
+    cpu_synchronize_state(cs);
+
+    env->spr[SPR_SRR0] = env->nip;
+    env->spr[SPR_SRR1] = (env->msr & ~srr1_mce_bits) |
+                         (params->srr1_mask & srr1_mce_bits);
+    if (params->dsisr) {
+        env->spr[SPR_DSISR] = params->dsisr;
+        env->spr[SPR_DAR] = params->dar;
+    }
+
+    spapr_mce_req_event(cpu, params->recovered);
+}
+
+static void spapr_cpu_mce_inject(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
+                                 uint64_t srr1_mask, uint32_t dsisr,
+                                 uint64_t dar, bool recovered)
+{
+    CPUState *cs = CPU(cpu);
+    MCEInjectionParams params = {
+        .srr1_mask = srr1_mask,
+        .dsisr = dsisr,
+        .dar = dar,
+        .recovered = recovered,
+    };
+    run_on_cpu(cs, spapr_do_mce_on_cpu, RUN_ON_CPU_HOST_PTR(&params));
+}
+
 int spapr_lmb_dt_populate(SpaprDrc *drc, SpaprMachineState *spapr,
                           void *fdt, int *fdt_start_offset, Error **errp)
 {
@@ -4556,6 +4597,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
     vhc->encode_hpt_for_kvm_pr = spapr_encode_hpt_for_kvm_pr;
     vhc->cpu_exec_enter = spapr_cpu_exec_enter;
     vhc->cpu_exec_exit = spapr_cpu_exec_exit;
+    vhc->cpu_mce_inject = spapr_cpu_mce_inject;
     xic->ics_get = spapr_ics_get;
     xic->ics_resend = spapr_ics_resend;
     xic->icp_get = spapr_icp_get;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 42d64a0368..72f86a2ee8 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -929,4 +929,7 @@ void spapr_check_pagesize(SpaprMachineState *spapr, hwaddr pagesize,
 
 void spapr_set_all_lpcrs(target_ulong value, target_ulong mask);
 hwaddr spapr_get_rtas_addr(void);
+
+void spapr_mce_inject(CPUState *cs, uint64_t srr1_mask, uint32_t dsisr,
+                      uint64_t dar, bool recovered);
 #endif /* HW_SPAPR_H */
diff --git a/target/ppc/cpu.h b/target/ppc/cpu.h
index f8c7d6f19c..ed8d2015bd 100644
--- a/target/ppc/cpu.h
+++ b/target/ppc/cpu.h
@@ -1189,6 +1189,9 @@ struct PPCVirtualHypervisorClass {
 #ifndef CONFIG_USER_ONLY
     void (*cpu_exec_enter)(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu);
     void (*cpu_exec_exit)(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu);
+    void (*cpu_mce_inject)(PPCVirtualHypervisor *vhyp, PowerPCCPU *cpu,
+                           uint64_t srr1_mask, uint32_t dsisr, uint64_t dar,
+                           bool recovered);
 #endif
 };
 
diff --git a/target/ppc/monitor.c b/target/ppc/monitor.c
index a5a177d717..ec997ce673 100644
--- a/target/ppc/monitor.c
+++ b/target/ppc/monitor.c
@@ -28,6 +28,7 @@
 #include "qemu/ctype.h"
 #include "monitor/hmp-target.h"
 #include "monitor/hmp.h"
+#include "qapi/qmp/qdict.h"
 
 static target_long monitor_get_ccr(const struct MonitorDef *md, int val)
 {
@@ -72,6 +73,31 @@ void hmp_info_tlb(Monitor *mon, const QDict *qdict)
     dump_mmu(env1);
 }
 
+void hmp_mce(Monitor *mon, const QDict *qdict)
+{
+    CPUState *cs;
+    int cpu_index = qdict_get_int(qdict, "cpu_index");
+    uint64_t srr1_mask = qdict_get_int(qdict, "srr1_mask");
+    uint32_t dsisr = qdict_get_int(qdict, "dsisr");
+    uint64_t dar = qdict_get_int(qdict, "dar");
+    bool recovered = qdict_get_int(qdict, "recovered");
+
+    cs = qemu_get_cpu(cpu_index);
+
+    if (cs != NULL) {
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+
+        if (cpu->vhyp) {
+            PPCVirtualHypervisorClass *vhc =
+                PPC_VIRTUAL_HYPERVISOR_GET_CLASS(cpu->vhyp);
+            if (vhc->cpu_mce_inject) {
+                vhc->cpu_mce_inject(cpu->vhyp, cpu,
+                                    srr1_mask, dsisr, dar, recovered);
+            }
+        }
+    }
+}
+
 const MonitorDef monitor_defs[] = {
     { "fpscr", offsetof(CPUPPCState, fpscr) },
     /* Next instruction pointer */
-- 
2.23.0



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] ppc/spapr: KVM FWNMI should not be enabled until guest requests it
  2020-03-17  5:02 ` [PATCH 1/5] ppc/spapr: KVM FWNMI should not be enabled until guest requests it Nicholas Piggin
@ 2020-03-17 11:02   ` Greg Kurz
  2020-03-19  2:27     ` Nicholas Piggin
  0 siblings, 1 reply; 13+ messages in thread
From: Greg Kurz @ 2020-03-17 11:02 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Cédric Le Goater, Ganesh Goudar, qemu-ppc, David Gibson

On Tue, 17 Mar 2020 15:02:11 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> The KVM FWNMI capability should be enabled with the "ibm,nmi-register"
> rtas call. Although MCEs from KVM will be delivered as architected
> interrupts to the guest before "ibm,nmi-register" is called, KVM has
> different behaviour depending on whether the guest has enabled FWNMI
> (it attempts to do more recovery on behalf of a non-FWNMI guest).
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  hw/ppc/spapr_caps.c  | 5 +++--
>  hw/ppc/spapr_rtas.c  | 7 +++++++
>  target/ppc/kvm.c     | 7 +++++++
>  target/ppc/kvm_ppc.h | 6 ++++++
>  4 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
> index 679ae7959f..eb5521d0c2 100644
> --- a/hw/ppc/spapr_caps.c
> +++ b/hw/ppc/spapr_caps.c
> @@ -517,9 +517,10 @@ static void cap_fwnmi_apply(SpaprMachineState *spapr, uint8_t val,
>      }
>  
>      if (kvm_enabled()) {
> -        if (kvmppc_set_fwnmi() < 0) {
> +        if (!kvmppc_get_fwnmi()) {
>              error_setg(errp, "Firmware Assisted Non-Maskable Interrupts(FWNMI) "
> -                             "not supported by KVM");
> +                             "not supported by KVM, "
> +                             "try appending -machine cap-fwnmi=off");

It is usually preferred to keep error message strings on one
line for easier grepping. Also hints should be specified with
error_append_hint() because they are treated differently by
QMP (ie. not printed).

Something like:

        if (!kvmppc_get_fwnmi()) {
            error_setg(errp,
       "Firmware Assisted Non-Maskable Interrupts(FWNMI) not supported by KVM");
            error_append_hint(errp, "Try appending -machine cap-fwnmi=off\n");
        }

Note that the current error handling code has an issue that
prevents hints to be printed when errp == &error_fatal, which
is exactly what spapr_caps_apply() does. Since this affects
a lot of locations in the code base, there's an on-going
effort to fix that globally:

https://patchwork.ozlabs.org/project/qemu-devel/list/?series=163907

I don't know if this will make it for 5.0, but in any case I
think you should call error_append_hint() in this patch anyway
and the code will just work at some later point.

Rest looks good.

>          }
>      }
>  }
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 9fb8c8632a..29abe66d01 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -437,6 +437,13 @@ static void rtas_ibm_nmi_register(PowerPCCPU *cpu,
>          return;
>      }
>  
> +    if (kvm_enabled()) {
> +        if (kvmppc_set_fwnmi() < 0) {
> +            rtas_st(rets, 0, RTAS_OUT_NOT_SUPPORTED);
> +            return;
> +        }
> +    }
> +
>      spapr->fwnmi_system_reset_addr = sreset_addr;
>      spapr->fwnmi_machine_check_addr = mce_addr;
>  
> diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
> index 597f72be1b..03d0667e8f 100644
> --- a/target/ppc/kvm.c
> +++ b/target/ppc/kvm.c
> @@ -88,6 +88,7 @@ static int cap_ppc_safe_indirect_branch;
>  static int cap_ppc_count_cache_flush_assist;
>  static int cap_ppc_nested_kvm_hv;
>  static int cap_large_decr;
> +static int cap_fwnmi;
>  
>  static uint32_t debug_inst_opcode;
>  
> @@ -136,6 +137,7 @@ int kvm_arch_init(MachineState *ms, KVMState *s)
>      kvmppc_get_cpu_characteristics(s);
>      cap_ppc_nested_kvm_hv = kvm_vm_check_extension(s, KVM_CAP_PPC_NESTED_HV);
>      cap_large_decr = kvmppc_get_dec_bits();
> +    cap_fwnmi = kvm_vm_check_extension(s, KVM_CAP_PPC_FWNMI);
>      /*
>       * Note: setting it to false because there is not such capability
>       * in KVM at this moment.
> @@ -2064,6 +2066,11 @@ void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
>      }
>  }
>  
> +bool kvmppc_get_fwnmi(void)
> +{
> +    return cap_fwnmi;
> +}
> +
>  int kvmppc_set_fwnmi(void)
>  {
>      PowerPCCPU *cpu = POWERPC_CPU(first_cpu);
> diff --git a/target/ppc/kvm_ppc.h b/target/ppc/kvm_ppc.h
> index 332fa0aa1c..fcaf745516 100644
> --- a/target/ppc/kvm_ppc.h
> +++ b/target/ppc/kvm_ppc.h
> @@ -27,6 +27,7 @@ void kvmppc_enable_h_page_init(void);
>  void kvmppc_set_papr(PowerPCCPU *cpu);
>  int kvmppc_set_compat(PowerPCCPU *cpu, uint32_t compat_pvr);
>  void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy);
> +bool kvmppc_get_fwnmi(void);
>  int kvmppc_set_fwnmi(void);
>  int kvmppc_smt_threads(void);
>  void kvmppc_error_append_smt_possible_hint(Error *const *errp);
> @@ -163,6 +164,11 @@ static inline void kvmppc_set_mpic_proxy(PowerPCCPU *cpu, int mpic_proxy)
>  {
>  }
>  
> +static inline bool kvmppc_get_fwnmi(void)
> +{
> +    return false;
> +}
> +
>  static inline int kvmppc_set_fwnmi(void)
>  {
>      return -1;



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/5] ppc/spapr: Improve FWNMI machine check delivery corner case comments
  2020-03-17  5:02 ` [PATCH 2/5] ppc/spapr: Improve FWNMI machine check delivery corner case comments Nicholas Piggin
@ 2020-03-17 12:16   ` Greg Kurz
  0 siblings, 0 replies; 13+ messages in thread
From: Greg Kurz @ 2020-03-17 12:16 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Cédric Le Goater, Ganesh Goudar, qemu-ppc, David Gibson

On Tue, 17 Mar 2020 15:02:12 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> Some of the conditions are not as clearly documented as they could be.
> Also the non-FWNMI case does not need a large comment.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---

LGTM

Reviewed-by: Greg Kurz <groug@kaod.org>

>  hw/ppc/spapr_events.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 323fcef4aa..05337f0671 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -834,17 +834,13 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>      Error *local_err = NULL;
>  
>      if (spapr->fwnmi_machine_check_addr == -1) {
> -        /*
> -         * This implies that we have hit a machine check either when the
> -         * guest has not registered FWNMI (i.e., "ibm,nmi-register" not
> -         * called) or between system reset and "ibm,nmi-register".
> -         * Fall back to the old machine check behavior in such cases.
> -         */
> +        /* Non-FWNMI case, deliver it like an architected CPU interrupt. */
>          cs->exception_index = POWERPC_EXCP_MCHECK;
>          ppc_cpu_do_interrupt(cs);
>          return;
>      }
>  
> +    /* Wait for FWNMI interlock. */
>      while (spapr->fwnmi_machine_check_interlock != -1) {
>          /*
>           * Check whether the same CPU got machine check error
> @@ -856,8 +852,13 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>              return;
>          }
>          qemu_cond_wait_iothread(&spapr->fwnmi_machine_check_interlock_cond);
> -        /* Meanwhile if the system is reset, then just return */
>          if (spapr->fwnmi_machine_check_addr == -1) {
> +            /*
> +             * If the machine was reset while waiting for the interlock,
> +             * abort the delivery. The machine check applies to a context
> +             * that no longer exists, so it wouldn't make sense to deliver
> +             * it now.
> +             */
>              return;
>          }
>      }
> @@ -868,7 +869,9 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>           * We don't want to abort so we let the migration to continue.
>           * In a rare case, the machine check handler will run on the target.
>           * Though this is not preferable, it is better than aborting
> -         * the migration or killing the VM.
> +         * the migration or killing the VM. It is okay to call
> +         * migrate_del_blocker on a blocker that was not added (which the
> +         * nmi-interlock handler would do when it's called after this).
>           */
>          warn_report("Received a fwnmi while migration was in progress");
>      }



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/5] ppc/spapr: Add FWNMI machine check delivery warnings
  2020-03-17  5:02 ` [PATCH 3/5] ppc/spapr: Add FWNMI machine check delivery warnings Nicholas Piggin
@ 2020-03-17 12:20   ` Greg Kurz
  2020-03-19  2:28     ` Nicholas Piggin
  0 siblings, 1 reply; 13+ messages in thread
From: Greg Kurz @ 2020-03-17 12:20 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Aravinda Prasad, Alexey Kardashevskiy, qemu-devel,
	Cédric Le Goater, Ganesh Goudar, qemu-ppc, David Gibson

On Tue, 17 Mar 2020 15:02:13 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> Add some messages which explain problems and guest misbehaviour that
> may be difficult to diagnose in rare cases of machine checks.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  hw/ppc/spapr_events.c | 4 ++++
>  hw/ppc/spapr_rtas.c   | 4 ++++
>  2 files changed, 8 insertions(+)
> 
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index 05337f0671..d35151eeb0 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -807,6 +807,8 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
>      /* get rtas addr from fdt */
>      rtas_addr = spapr_get_rtas_addr();
>      if (!rtas_addr) {
> +        warn_report("FWNMI: Unable to deliver machine check to guest: "
> +                    "rtas_addr not found.");

Why a warning and not an error ?

Also maybe change the string to fit on one line ?

>          qemu_system_guest_panicked(NULL);
>          g_free(ext_elog);
>          return;
> @@ -848,6 +850,8 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>           * that CPU called "ibm,nmi-interlock")
>           */
>          if (spapr->fwnmi_machine_check_interlock == cpu->vcpu_id) {
> +            warn_report("FWNMI: Unable to deliver machine check to guest: "
> +                        "nested machine check.");

Ditto.

>              qemu_system_guest_panicked(NULL);
>              return;
>          }
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 29abe66d01..12cd09701c 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -462,6 +462,10 @@ static void rtas_ibm_nmi_interlock(PowerPCCPU *cpu,
>      }
>  
>      if (spapr->fwnmi_machine_check_addr == -1) {
> +        qemu_log_mask(LOG_GUEST_ERROR,
> +                      "FWNMI: ibm,nmi-interlock RTAS called with FWNMI not "
> +                       "registered.\n");

whitespace damage here   ^ best fixed by making the message a one liner.

> +
>          /* NMI register not called */
>          rtas_st(rets, 0, RTAS_OUT_PARAM_ERROR);
>          return;



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails
  2020-03-17  5:02 ` [PATCH 4/5] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails Nicholas Piggin
@ 2020-03-17 16:57   ` Greg Kurz
  2020-03-19  2:29     ` Nicholas Piggin
  0 siblings, 1 reply; 13+ messages in thread
From: Greg Kurz @ 2020-03-17 16:57 UTC (permalink / raw)
  To: Nicholas Piggin
  Cc: Aravinda Prasad, Alexey Kardashevskiy, Mahesh Salgaonkar,
	qemu-devel, Cédric Le Goater, Ganesh Goudar, qemu-ppc,
	David Gibson

On Tue, 17 Mar 2020 15:02:14 +1000
Nicholas Piggin <npiggin@gmail.com> wrote:

> Try to be tolerant of errors if the machine check had been recovered
> by the host.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---

Same comment as previous patch on multi-line error strings and
warn_report() in the !recovered case.

>  hw/ppc/spapr_events.c | 25 ++++++++++++++++++-------
>  1 file changed, 18 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
> index d35151eeb0..3f524cb0ca 100644
> --- a/hw/ppc/spapr_events.c
> +++ b/hw/ppc/spapr_events.c
> @@ -807,13 +807,20 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
>      /* get rtas addr from fdt */
>      rtas_addr = spapr_get_rtas_addr();
>      if (!rtas_addr) {
> -        warn_report("FWNMI: Unable to deliver machine check to guest: "
> -                    "rtas_addr not found.");
> -        qemu_system_guest_panicked(NULL);
> +        if (!recovered) {
> +            warn_report("FWNMI: Unable to deliver machine check to guest: "
> +                        "rtas_addr not found.");
> +            qemu_system_guest_panicked(NULL);
> +        } else {
> +            warn_report("FWNMI: Unable to deliver machine check to guest: "
> +                        "rtas_addr not found. Machine check recovered.");
> +        }
>          g_free(ext_elog);
>          return;
>      }
>  
> +    spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
> +

I don't understand this change.

>      stq_be_phys(&address_space_memory, rtas_addr + RTAS_ERROR_LOG_OFFSET,
>                  env->gpr[3]);
>      cpu_physical_memory_write(rtas_addr + RTAS_ERROR_LOG_OFFSET +
> @@ -850,9 +857,14 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>           * that CPU called "ibm,nmi-interlock")
>           */
>          if (spapr->fwnmi_machine_check_interlock == cpu->vcpu_id) {
> -            warn_report("FWNMI: Unable to deliver machine check to guest: "
> -                        "nested machine check.");
> -            qemu_system_guest_panicked(NULL);
> +            if (!recovered) {
> +                warn_report("FWNMI: Unable to deliver machine check to guest: "
> +                            "nested machine check.");
> +                qemu_system_guest_panicked(NULL);
> +            } else {
> +                warn_report("FWNMI: Unable to deliver machine check to guest: "
> +                            "nested machine check. Machine check recovered.");
> +            }
>              return;
>          }
>          qemu_cond_wait_iothread(&spapr->fwnmi_machine_check_interlock_cond);
> @@ -880,7 +892,6 @@ void spapr_mce_req_event(PowerPCCPU *cpu, bool recovered)
>          warn_report("Received a fwnmi while migration was in progress");
>      }
>  
> -    spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
>      spapr_mce_dispatch_elog(cpu, recovered);
>  }
>  



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/5] ppc/spapr: KVM FWNMI should not be enabled until guest requests it
  2020-03-17 11:02   ` Greg Kurz
@ 2020-03-19  2:27     ` Nicholas Piggin
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-19  2:27 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Aravinda Prasad, Alexey Kardashevskiy, Mahesh Salgaonkar,
	qemu-devel, Cédric Le,  Goater, Ganesh Goudar, qemu-ppc,
	David,  Gibson

Greg Kurz's on March 17, 2020 9:02 pm:
> On Tue, 17 Mar 2020 15:02:11 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
>> The KVM FWNMI capability should be enabled with the "ibm,nmi-register"
>> rtas call. Although MCEs from KVM will be delivered as architected
>> interrupts to the guest before "ibm,nmi-register" is called, KVM has
>> different behaviour depending on whether the guest has enabled FWNMI
>> (it attempts to do more recovery on behalf of a non-FWNMI guest).
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  hw/ppc/spapr_caps.c  | 5 +++--
>>  hw/ppc/spapr_rtas.c  | 7 +++++++
>>  target/ppc/kvm.c     | 7 +++++++
>>  target/ppc/kvm_ppc.h | 6 ++++++
>>  4 files changed, 23 insertions(+), 2 deletions(-)
>> 
>> diff --git a/hw/ppc/spapr_caps.c b/hw/ppc/spapr_caps.c
>> index 679ae7959f..eb5521d0c2 100644
>> --- a/hw/ppc/spapr_caps.c
>> +++ b/hw/ppc/spapr_caps.c
>> @@ -517,9 +517,10 @@ static void cap_fwnmi_apply(SpaprMachineState *spapr, uint8_t val,
>>      }
>>  
>>      if (kvm_enabled()) {
>> -        if (kvmppc_set_fwnmi() < 0) {
>> +        if (!kvmppc_get_fwnmi()) {
>>              error_setg(errp, "Firmware Assisted Non-Maskable Interrupts(FWNMI) "
>> -                             "not supported by KVM");
>> +                             "not supported by KVM, "
>> +                             "try appending -machine cap-fwnmi=off");
> 
> It is usually preferred to keep error message strings on one
> line for easier grepping. Also hints should be specified with
> error_append_hint() because they are treated differently by
> QMP (ie. not printed).
> 
> Something like:
> 
>         if (!kvmppc_get_fwnmi()) {
>             error_setg(errp,
>        "Firmware Assisted Non-Maskable Interrupts(FWNMI) not supported by KVM");
>             error_append_hint(errp, "Try appending -machine cap-fwnmi=off\n");
>         }

Hmm, okay.

> 
> Note that the current error handling code has an issue that
> prevents hints to be printed when errp == &error_fatal, which
> is exactly what spapr_caps_apply() does. Since this affects
> a lot of locations in the code base, there's an on-going
> effort to fix that globally:
> 
> https://patchwork.ozlabs.org/project/qemu-devel/list/?series=163907
> 
> I don't know if this will make it for 5.0, but in any case I
> think you should call error_append_hint() in this patch anyway
> and the code will just work at some later point.
> 
> Rest looks good.

Thanks will do,
Nick



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/5] ppc/spapr: Add FWNMI machine check delivery warnings
  2020-03-17 12:20   ` Greg Kurz
@ 2020-03-19  2:28     ` Nicholas Piggin
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-19  2:28 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Aravinda Prasad, Alexey Kardashevskiy, Mahesh Salgaonkar,
	qemu-devel, Cédric Le,  Goater, Ganesh Goudar, qemu-ppc,
	David,  Gibson

Greg Kurz's on March 17, 2020 10:20 pm:
> On Tue, 17 Mar 2020 15:02:13 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
>> Add some messages which explain problems and guest misbehaviour that
>> may be difficult to diagnose in rare cases of machine checks.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>  hw/ppc/spapr_events.c | 4 ++++
>>  hw/ppc/spapr_rtas.c   | 4 ++++
>>  2 files changed, 8 insertions(+)
>> 
>> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
>> index 05337f0671..d35151eeb0 100644
>> --- a/hw/ppc/spapr_events.c
>> +++ b/hw/ppc/spapr_events.c
>> @@ -807,6 +807,8 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
>>      /* get rtas addr from fdt */
>>      rtas_addr = spapr_get_rtas_addr();
>>      if (!rtas_addr) {
>> +        warn_report("FWNMI: Unable to deliver machine check to guest: "
>> +                    "rtas_addr not found.");
> 
> Why a warning and not an error ?
> 
> Also maybe change the string to fit on one line ?

Not sure, I guess it should be.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/5] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails
  2020-03-17 16:57   ` Greg Kurz
@ 2020-03-19  2:29     ` Nicholas Piggin
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2020-03-19  2:29 UTC (permalink / raw)
  To: Greg Kurz
  Cc: Aravinda Prasad, Alexey Kardashevskiy, Mahesh Salgaonkar,
	qemu-devel, Cédric Le,  Goater, Ganesh Goudar, qemu-ppc,
	David,  Gibson

Greg Kurz's on March 18, 2020 2:57 am:
> On Tue, 17 Mar 2020 15:02:14 +1000
> Nicholas Piggin <npiggin@gmail.com> wrote:
> 
>> Try to be tolerant of errors if the machine check had been recovered
>> by the host.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
> 
> Same comment as previous patch on multi-line error strings and
> warn_report() in the !recovered case.
> 
>>  hw/ppc/spapr_events.c | 25 ++++++++++++++++++-------
>>  1 file changed, 18 insertions(+), 7 deletions(-)
>> 
>> diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
>> index d35151eeb0..3f524cb0ca 100644
>> --- a/hw/ppc/spapr_events.c
>> +++ b/hw/ppc/spapr_events.c
>> @@ -807,13 +807,20 @@ static void spapr_mce_dispatch_elog(PowerPCCPU *cpu, bool recovered)
>>      /* get rtas addr from fdt */
>>      rtas_addr = spapr_get_rtas_addr();
>>      if (!rtas_addr) {
>> -        warn_report("FWNMI: Unable to deliver machine check to guest: "
>> -                    "rtas_addr not found.");
>> -        qemu_system_guest_panicked(NULL);
>> +        if (!recovered) {
>> +            warn_report("FWNMI: Unable to deliver machine check to guest: "
>> +                        "rtas_addr not found.");
>> +            qemu_system_guest_panicked(NULL);
>> +        } else {
>> +            warn_report("FWNMI: Unable to deliver machine check to guest: "
>> +                        "rtas_addr not found. Machine check recovered.");
>> +        }
>>          g_free(ext_elog);
>>          return;
>>      }
>>  
>> +    spapr->fwnmi_machine_check_interlock = cpu->vcpu_id;
>> +
> 
> I don't understand this change.

If we bail out without delivering the interrupt, we can't take the
interlock otherwise the guest can never release it.

Thanks,
Nick


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-03-19  2:32 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-03-17  5:02 [PATCH 0/5] FWNMI follow up patches Nicholas Piggin
2020-03-17  5:02 ` [PATCH 1/5] ppc/spapr: KVM FWNMI should not be enabled until guest requests it Nicholas Piggin
2020-03-17 11:02   ` Greg Kurz
2020-03-19  2:27     ` Nicholas Piggin
2020-03-17  5:02 ` [PATCH 2/5] ppc/spapr: Improve FWNMI machine check delivery corner case comments Nicholas Piggin
2020-03-17 12:16   ` Greg Kurz
2020-03-17  5:02 ` [PATCH 3/5] ppc/spapr: Add FWNMI machine check delivery warnings Nicholas Piggin
2020-03-17 12:20   ` Greg Kurz
2020-03-19  2:28     ` Nicholas Piggin
2020-03-17  5:02 ` [PATCH 4/5] ppc/spapr: Don't kill the guest if a recovered FWNMI machine check delivery fails Nicholas Piggin
2020-03-17 16:57   ` Greg Kurz
2020-03-19  2:29     ` Nicholas Piggin
2020-03-17  5:02 ` [PATCH 5/5] target/ppc: Implement simple monitor mce injection Nicholas Piggin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).