From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: qemu-devel@nongnu.org, david@gibson.dropbear.id.au, qemu-ppc@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] target/ppc: Fix system lockups caused by interrupt_request state corruption
Date: Tue, 21 Nov 2017 17:56:19 +0000
Message-ID: <87y3mzwm64.fsf@linaro.org>
In-Reply-To: <1511285538-22883-1-git-send-email-richard.purdie@linuxfoundation.org>


Richard Purdie <richard.purdie@linuxfoundation.org> writes:

> Occasionally in Linux guests on x86_64 we're seeing logs like:
>
> ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000004
>
> when they should read:
>
> ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000002
>
> The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls
> cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function
> just before the log message. Something is causing the HARD bit setting
> to get lost.
>
> The knock-on effect of losing that bit is that the decrementer timer
> interrupts don't get delivered, which causes the guest to sit idle in its
> idle handler and 'hang'.
>
> The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB.
>
> Rather than poking directly into cs->interrupt_request, that code needs to:
>
> a) hold BQL
> b) use the cpu_interrupt() helper
>
> This patch fixes the call sites to do this, fixing the hang.
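
For readers following along, here is a minimal sketch of the lost update
described above. It is not QEMU code: the SKETCH_* names and both functions
are made up for illustration, and the only values taken from this thread are
the bit masks (0x2 for HARD, 0x4 for EXITTB). The unlocked "|=" is written
out as a separate load and store so the interleaving is deterministic.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as quoted in the log lines above (illustrative names). */
#define SKETCH_INTERRUPT_HARD   0x2u
#define SKETCH_INTERRUPT_EXITTB 0x4u

/*
 * Unlocked "interrupt_request |= EXITTB", split into its load and store,
 * with another thread's HARD update landing in between the two.
 */
static uint32_t sketch_lost_update(uint32_t start)
{
    uint32_t interrupt_request = start;

    uint32_t stale = interrupt_request;                   /* vCPU thread: load */
    interrupt_request |= SKETCH_INTERRUPT_HARD;           /* other thread: set HARD */
    interrupt_request = stale | SKETCH_INTERRUPT_EXITTB;  /* vCPU thread: store */

    return interrupt_request;                             /* HARD has been overwritten */
}

/*
 * Same two updates, but serialized (as they would be if both sides held
 * the same lock): each read-modify-write completes before the next starts.
 */
static uint32_t sketch_serialized_update(uint32_t start)
{
    uint32_t interrupt_request = start;

    interrupt_request |= SKETCH_INTERRUPT_HARD;
    interrupt_request |= SKETCH_INTERRUPT_EXITTB;

    return interrupt_request;
}

/* Small self-check showing the two outcomes starting from an empty mask. */
static void sketch_self_check(void)
{
    assert(sketch_lost_update(0) == SKETCH_INTERRUPT_EXITTB);
    assert(sketch_serialized_update(0) ==
           (SKETCH_INTERRUPT_HARD | SKETCH_INTERRUPT_EXITTB));
}
```

Starting from an empty mask, the interleaved version ends with only the
EXITTB bit set, which matches the "req 00000004" symptom in the log.
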
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
>  target/ppc/excp_helper.c | 12 +++++++++---
>  target/ppc/helper_regs.h |  8 ++++++--
>  2 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index e6009e7..f175c21 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -207,7 +207,9 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
>                          "Entering checkstop state\n");
>              }
>              cs->halted = 1;
> -            cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> +            qemu_mutex_lock_iothread();
> +            cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> +            qemu_mutex_unlock_iothread();

Not directly related but I wonder why cs->halted is set here rather than
raising a CPU_INTERRUPT_HALT exception?

My worry with these locks is that I think powerpc_excp gets called from
both paths, i.e. with and without the BQL already held, which means
you'll see an assert fire when the BQL is double-locked.
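
One possible shape for avoiding that, sketched under assumptions: take the
lock only if the current thread does not already hold it. The code below is a
single-threaded model, not QEMU code; every model_* name is hypothetical. In
QEMU itself I would expect the check to be qemu_mutex_iothread_locked(), but
whether that is the right fix for powerpc_excp is a separate question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_INTERRUPT_EXITTB 0x4u

/* Hypothetical single-threaded stand-ins for the BQL and the request mask. */
static bool model_bql_held;
static uint32_t model_interrupt_request;

static bool model_bql_locked(void)
{
    return model_bql_held;
}

static void model_bql_lock(void)
{
    assert(!model_bql_held);    /* double-locking trips an assertion */
    model_bql_held = true;
}

static void model_bql_unlock(void)
{
    assert(model_bql_held);
    model_bql_held = false;
}

/* Model of a helper that must only be called with the lock held. */
static void model_cpu_interrupt(uint32_t mask)
{
    assert(model_bql_held);
    model_interrupt_request |= mask;
}

/*
 * Safe from either context: lock only when the caller has not already
 * done so, and leave the lock state exactly as it was found.
 */
static void model_raise_exittb(void)
{
    bool need_lock = !model_bql_locked();

    if (need_lock) {
        model_bql_lock();
    }
    model_cpu_interrupt(MODEL_INTERRUPT_EXITTB);
    if (need_lock) {
        model_bql_unlock();
    }
}
```

Calling model_raise_exittb() works whether or not the caller already holds
the model lock, whereas an unconditional model_bql_lock() would assert in the
second case.
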

>          }
>          if (env->msr_mask & MSR_HVB) {
>              /* ISA specifies HV, but can be delivered to guest with HV clear
> @@ -940,7 +942,9 @@ void helper_store_msr(CPUPPCState *env, target_ulong val)
>
>      if (excp != 0) {
>          CPUState *cs = CPU(ppc_env_get_cpu(env));
> -        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> +        qemu_mutex_lock_iothread();
> +        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> +        qemu_mutex_unlock_iothread();
>          raise_exception(env, excp);

This is fine as you only come here from TCG code.

>      }
>  }
> @@ -995,7 +999,9 @@ static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr)
>      /* No need to raise an exception here,
>       * as rfi is always the last insn of a TB
>       */
> -    cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> +    qemu_mutex_lock_iothread();
> +    cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> +    qemu_mutex_unlock_iothread();

I think this is fine - as again you come from TCG code.

>
>      /* Reset the reservation */
>      env->reserve_addr = -1;
> diff --git a/target/ppc/helper_regs.h b/target/ppc/helper_regs.h
> index 2627a70..13dd0b8 100644
> --- a/target/ppc/helper_regs.h
> +++ b/target/ppc/helper_regs.h
> @@ -114,11 +114,15 @@ static inline int hreg_store_msr(CPUPPCState *env, target_ulong value,
>      }
>      if (((value >> MSR_IR) & 1) != msr_ir ||
>          ((value >> MSR_DR) & 1) != msr_dr) {
> -        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> +        qemu_mutex_lock_iothread();
> +        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> +        qemu_mutex_unlock_iothread();
>      }
>      if ((env->mmu_model & POWERPC_MMU_BOOKE) &&
>          ((value >> MSR_GS) & 1) != msr_gs) {
> -        cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> +        qemu_mutex_lock_iothread();
> +        cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> +        qemu_mutex_unlock_iothread();
>      }
>      if (unlikely((env->flags & POWERPC_FLAG_TGPR) &&
>                   ((value ^ env->msr) & (1 << MSR_TGPR)))) {

And this looks good too.

--
Alex Bennée
