From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Purdie <richard.purdie@linuxfoundation.org>
Cc: qemu-devel@nongnu.org, david@gibson.dropbear.id.au, qemu-ppc@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] target/ppc: Fix system lockups caused by interrupt_request state corruption
Date: Tue, 21 Nov 2017 17:56:19 +0000
Message-ID: <87y3mzwm64.fsf@linaro.org>
In-Reply-To: <1511285538-22883-1-git-send-email-richard.purdie@linuxfoundation.org>

Richard Purdie <richard.purdie@linuxfoundation.org> writes:

> Occasionally in Linux guests on x86_64 we're seeing logs like:
>
> ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000004
>
> when they should read:
>
> ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000002
>
> The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls
> cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function
> just before the log message. Something is causing the HARD bit setting
> to get lost.
>
> The knock on effect of losing that bit is the decrementer timer interrupts
> don't get delivered which causes the guest to sit idle in its idle handler
> and 'hang'.
>
> The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB.
>
> Rather than poking directly into cs->interrupt_request, that code needs to:
>
> a) hold BQL
> b) use the cpu_interrupt() helper
>
> This patch fixes the call sites to do this, fixing the hang.
>
> Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
> ---
> target/ppc/excp_helper.c | 12 +++++++++---
> target/ppc/helper_regs.h | 8 ++++++--
> 2 files changed, 15 insertions(+), 5 deletions(-)
>
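To make the failure mode described in the commit message concrete, here is a minimal sketch that replays one bad interleaving by hand in a single thread. It is not QEMU code: `interrupt_request` below is a hypothetical stand-in for `cs->interrupt_request`, `replay_lost_update()` is an invented name, and the two bit values are the ones quoted in the log lines above. The point is only that an unlocked `|=` is a load followed by a store, so a HARD bit set in between is overwritten.

```c
#include <stdint.h>

/* Bit values as quoted in the commit message above. */
#define CPU_INTERRUPT_HARD   0x0002
#define CPU_INTERRUPT_EXITTB 0x0004

/* Hypothetical stand-in for the interrupt_request field of CPUState. */
static uint32_t interrupt_request;

/* Replay one bad interleaving step by step and return the final value. */
static uint32_t replay_lost_update(void)
{
    interrupt_request = 0;

    /* vCPU thread: first half of "interrupt_request |= EXITTB" (the load). */
    uint32_t stale = interrupt_request;

    /* Another thread raises HARD in between, e.g. for the decrementer. */
    interrupt_request |= CPU_INTERRUPT_HARD;

    /* vCPU thread: second half (the store) writes back the stale value. */
    interrupt_request = stale | CPU_INTERRUPT_EXITTB;

    return interrupt_request;
}
```

With this ordering `replay_lost_update()` returns 0x00000004, i.e. EXITTB alone, which matches the unexpected "req 00000004" in the log. Doing both updates under one lock rules this interleaving out.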
> diff --git a/target/ppc/excp_helper.c b/target/ppc/excp_helper.c
> index e6009e7..f175c21 100644
> --- a/target/ppc/excp_helper.c
> +++ b/target/ppc/excp_helper.c
> @@ -207,7 +207,9 @@ static inline void powerpc_excp(PowerPCCPU *cpu, int excp_model, int excp)
> "Entering checkstop state\n");
> }
> cs->halted = 1;
> - cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> + qemu_mutex_lock_iothread();
> + cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> + qemu_mutex_unlock_iothread();
Not directly related, but I wonder why cs->halted is set here rather than
raising CPU_INTERRUPT_HALT?

My worry with these locks is that I think powerpc_excp gets called both
from paths that already hold the BQL and from paths that don't. If so,
taking the lock unconditionally here means you'll see an assert fire
when the BQL is double-locked.
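One possible way around the double lock, sketched under assumptions, is to take the lock only when the caller does not already hold it. The block below is a self-contained mock rather than QEMU code: every `mock_*` name and `raise_exittb_any_context()` are hypothetical, and the "lock" is a plain flag so the control flow can be checked in a single thread. In QEMU itself the check would presumably be done with qemu_mutex_iothread_locked(), but whether that is the right fix for powerpc_excp is an open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPU_INTERRUPT_EXITTB 0x0004

/* Hypothetical mock of the BQL: a flag instead of a real mutex. */
static bool mock_bql_held;

static bool mock_bql_locked(void)
{
    return mock_bql_held;
}

static void mock_bql_lock(void)
{
    assert(!mock_bql_held);   /* models the assert on double-locking */
    mock_bql_held = true;
}

static void mock_bql_unlock(void)
{
    assert(mock_bql_held);
    mock_bql_held = false;
}

/* Hypothetical stand-in for cs->interrupt_request. */
static uint32_t mock_interrupt_request;

/* Mock helper: only valid with the lock held (an assumption of the mock). */
static void mock_cpu_interrupt(uint32_t mask)
{
    assert(mock_bql_locked());
    mock_interrupt_request |= mask;
}

/* Raise EXITTB whether or not the caller already holds the lock. */
static void raise_exittb_any_context(void)
{
    bool need_lock = !mock_bql_locked();

    if (need_lock) {
        mock_bql_lock();
    }
    mock_cpu_interrupt(CPU_INTERRUPT_EXITTB);
    if (need_lock) {
        mock_bql_unlock();
    }
}
```

Calling `raise_exittb_any_context()` with the mock lock released takes and drops it around the update; calling it with the lock already held leaves the lock state untouched, so neither path trips the double-lock assert.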
> }
> if (env->msr_mask & MSR_HVB) {
> /* ISA specifies HV, but can be delivered to guest with HV clear
> @@ -940,7 +942,9 @@ void helper_store_msr(CPUPPCState *env, target_ulong val)
>
> if (excp != 0) {
> CPUState *cs = CPU(ppc_env_get_cpu(env));
> - cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> + qemu_mutex_lock_iothread();
> + cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> + qemu_mutex_unlock_iothread();
> raise_exception(env, excp);
This is fine as you only come here from TCG code.
> }
> }
> @@ -995,7 +999,9 @@ static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr)
> /* No need to raise an exception here,
> * as rfi is always the last insn of a TB
> */
> - cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> + qemu_mutex_lock_iothread();
> + cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> + qemu_mutex_unlock_iothread();
I think this is fine, as again you only come here from TCG code.
>
> /* Reset the reservation */
> env->reserve_addr = -1;
> diff --git a/target/ppc/helper_regs.h b/target/ppc/helper_regs.h
> index 2627a70..13dd0b8 100644
> --- a/target/ppc/helper_regs.h
> +++ b/target/ppc/helper_regs.h
> @@ -114,11 +114,15 @@ static inline int hreg_store_msr(CPUPPCState *env, target_ulong value,
> }
> if (((value >> MSR_IR) & 1) != msr_ir ||
> ((value >> MSR_DR) & 1) != msr_dr) {
> - cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> + qemu_mutex_lock_iothread();
> + cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> + qemu_mutex_unlock_iothread();
> }
> if ((env->mmu_model & POWERPC_MMU_BOOKE) &&
> ((value >> MSR_GS) & 1) != msr_gs) {
> - cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
> + qemu_mutex_lock_iothread();
> + cpu_interrupt(cs, CPU_INTERRUPT_EXITTB);
> + qemu_mutex_unlock_iothread();
> }
> if (unlikely((env->flags & POWERPC_FLAG_TGPR) &&
> ((value ^ env->msr) & (1 << MSR_TGPR)))) {
And this looks good too.
--
Alex Bennée