From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: "Cédric Le Goater" <clg@kaod.org>,
	qemu-ppc@nongnu.org, qemu-devel@nongnu.org,
	"David Gibson" <david@gibson.dropbear.id.au>,
	"Nikunj A Dadhania" <nikunj@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v2 2/4] spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
Date: Tue, 10 Oct 2017 10:08:47 +0200
Message-ID: <1507622927.25065.200.camel@kernel.crashing.org>
In-Reply-To: <20171009154930.29095-3-clg@kaod.org>

On Mon, 2017-10-09 at 17:49 +0200, Cédric Le Goater wrote:
> When a CPU is stopped with the 'stop-self' RTAS call, its state
> 'halted' is switched to 1 and, in this case, the MSR is not taken into
> account anymore in the cpu_has_work() routine. Only the pending
> hardware interrupts are checked with their LPCR:PECE* enablement bit.
> 
> If the DECR timer fires after 'stop-self' is called and before the CPU
> 'stop' state is reached, the nearly-dead CPU will have some work to do
> and the guest will crash. This case happens very frequently with the
> not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
> occasionally fired but after 'stop' state, so no work is to be done
> and the guest survives.
> 
> I suspect there is a race between the QEMU mainloop triggering the
> timers and the TCG CPU thread but I could not quite identify the root
> cause. To be safe, let's disable the decrementer interrupt in the LPCR
> when the CPU is halted and reenable it when the CPU is restarted.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

We should disable external interrupts and doorbells too, no? I.e., we
could in fact clear all of the PECE bits.
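
To make the suggestion concrete, here is a minimal, self-contained sketch
(not QEMU code). It assumes, as the commit message describes, that a halted
CPU only has work when a pending interrupt source also has its wake-up
enable bit set. The DemoCPU type, the demo_* helpers and the DEMO_PECE_*
bit values are all hypothetical placeholders; they do not reflect the real
LPCR layout or the real cpu_has_work() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions for illustration only; these are NOT the
 * real LPCR layout. */
#define DEMO_PECE_DOORBELL  (UINT64_C(1) << 0)
#define DEMO_PECE_HV_DBELL  (UINT64_C(1) << 1)
#define DEMO_PECE_EXTERNAL  (UINT64_C(1) << 2)
#define DEMO_PECE_DECR      (UINT64_C(1) << 3)
#define DEMO_PECE_OTHER     (UINT64_C(1) << 4)
#define DEMO_PECE_ALL       (DEMO_PECE_DOORBELL | DEMO_PECE_HV_DBELL | \
                             DEMO_PECE_EXTERNAL | DEMO_PECE_DECR |     \
                             DEMO_PECE_OTHER)

/* Hypothetical, stripped-down CPU state. */
typedef struct DemoCPU {
    uint64_t lpcr;     /* wake-up enable bits */
    uint64_t pending;  /* pending sources, same bit encoding as above */
    bool halted;
} DemoCPU;

/* Assumed wake-up rule for a halted CPU: the MSR is ignored and a
 * pending source counts only if its enable bit is set. */
static bool demo_wakeup_pending(const DemoCPU *cpu)
{
    return (cpu->pending & cpu->lpcr & DEMO_PECE_ALL) != 0;
}

/* stop-self variant that clears every wake-up enable, not just DECR. */
static void demo_stop_self(DemoCPU *cpu)
{
    cpu->lpcr &= ~DEMO_PECE_ALL;
    cpu->halted = true;
}

/* start-cpu counterpart: restore the chosen wake-up enables. */
static void demo_start_cpu(DemoCPU *cpu, uint64_t wake_mask)
{
    assert((wake_mask & ~DEMO_PECE_ALL) == 0); /* only PECE bits allowed */
    cpu->lpcr |= wake_mask;
    cpu->halted = false;
}
```

With only the DECR bit cleared, a pending external interrupt or doorbell
would still count as work in this model; with the whole mask cleared,
nothing can wake the stopped CPU until start-cpu restores the bits.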

> ---
> 
> Changes in v2:
> 
>  - used a new routine ppc_cpu_pvr_match() to discriminate CPU versions
>  - removed the LPCR:PECE* enablement bit when the CPU is initialized
>    if it is a secondary
> 
>  hw/ppc/spapr_rtas.c         | 20 ++++++++++++++++++++
>  target/ppc/translate_init.c | 19 +++++++++++++++++--
>  2 files changed, 37 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index cdf0b607a0a0..dfdbf1e2c6f8 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -46,6 +46,7 @@
>  #include "qemu/cutils.h"
>  #include "trace.h"
>  #include "hw/ppc/fdt.h"
> +#include "target/ppc/cpu-models.h"
>  
>  static void rtas_display_character(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>                                     uint32_t token, uint32_t nargs,
> @@ -174,6 +175,15 @@ static void rtas_start_cpu(PowerPCCPU *cpu_, sPAPRMachineState *spapr,
>          kvm_cpu_synchronize_state(cs);
>  
>          env->msr = (1ULL << MSR_SF) | (1ULL << MSR_ME);
> +
> +        /* Enable DECR interrupt */
> +        if (ppc_cpu_pvr_match(cpu, CPU_POWERPC_LOGICAL_3_00)) {
> +            env->spr[SPR_LPCR] |= LPCR_DEE;
> +        } else {
> +            /* P7 and P8 both have same bit for DECR */
> +            env->spr[SPR_LPCR] |= LPCR_P8_PECE3;
> +        }
> +
>          env->nip = start;
>          env->gpr[3] = r3;
>          cs->halted = 0;
> @@ -210,6 +220,16 @@ static void rtas_stop_self(PowerPCCPU *cpu, sPAPRMachineState *spapr,
>       * no need to bother with specific bits, we just clear it.
>       */
>      env->msr = 0;
> +
> +    /* Don't let the decrementer run on a CPU being stopped. This could
> +     * deliver an interrupt on a dying CPU and crash the guest.
> +     */
> +    if (ppc_cpu_pvr_match(cpu, CPU_POWERPC_LOGICAL_3_00)) {
> +        env->spr[SPR_LPCR] &= ~LPCR_DEE;
> +    } else {
> +        /* P7 and P8 both have same bit for DECR */
> +        env->spr[SPR_LPCR] &= ~LPCR_P8_PECE3;
> +    }
>  }
>  
>  static inline int sysparm_st(target_ulong addr, target_ulong len,
> diff --git a/target/ppc/translate_init.c b/target/ppc/translate_init.c
> index 0d6379fcc5b4..1a62159843e7 100644
> --- a/target/ppc/translate_init.c
> +++ b/target/ppc/translate_init.c
> @@ -8905,6 +8905,7 @@ void cpu_ppc_set_papr(PowerPCCPU *cpu, PPCVirtualHypervisor *vhyp)
>      CPUPPCState *env = &cpu->env;
>      ppc_spr_t *lpcr = &env->spr_cb[SPR_LPCR];
>      ppc_spr_t *amor = &env->spr_cb[SPR_AMOR];
> +    CPUState *cs = CPU(cpu);
>  
>      cpu->vhyp = vhyp;
>  
> @@ -8946,8 +8947,15 @@ void cpu_ppc_set_papr(PowerPCCPU *cpu, PPCVirtualHypervisor *vhyp)
>          } else {
>              lpcr->default_value &= ~(LPCR_UPRT | LPCR_GTSE);
>          }
> -        lpcr->default_value |= LPCR_PDEE | LPCR_HDEE | LPCR_EEE | LPCR_DEE |
> +        lpcr->default_value |= LPCR_PDEE | LPCR_HDEE | LPCR_EEE |
>                                 LPCR_OEE;
> +
> +        /* Only let the decrementer wake up the boot CPU. The RTAS
> +         * command start-cpu will enable it on secondaries.
> +         */
> +        if (cs == first_cpu) {
> +            lpcr->default_value |= LPCR_DEE;
> +        }
>          break;
>      default:
>          /* P7 and P8 has slightly different PECE bits, mostly because P8 adds
> @@ -8955,7 +8963,14 @@ void cpu_ppc_set_papr(PowerPCCPU *cpu, PPCVirtualHypervisor *vhyp)
>           * will work as expected for both implementations
>           */
>          lpcr->default_value |= LPCR_P8_PECE0 | LPCR_P8_PECE1 | LPCR_P8_PECE2 |
> -                               LPCR_P8_PECE3 | LPCR_P8_PECE4;
> +                               LPCR_P8_PECE4;
> +
> +        /* Only let the decrementer wake up the boot CPU. The RTAS
> +         * command start-cpu will enable it on secondaries.
> +         */
> +        if (cs == first_cpu) {
> +            lpcr->default_value |= LPCR_P8_PECE3;
> +        }
>      }
>  
>      /* We should be followed by a CPU reset but update the active value
