From: Hugh Dickins <hughd@google.com>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@ozlabs.org, hughd@google.com, npiggin@gmail.com
Subject: Re: [PATCH] powerpc/64s: Fix unrecoverable SLB crashes due to preemption check
Date: Sun, 3 May 2020 00:10:10 -0700 (PDT) [thread overview]
Message-ID: <alpine.LSU.2.11.2005030008400.1557@eggly.anvils> (raw)
In-Reply-To: <20200502143316.929341-1-mpe@ellerman.id.au>
On Sun, 3 May 2020, Michael Ellerman wrote:
> Hugh reported that his trusty G5 crashed after a few hours under load
> with an "Unrecoverable exception 380".
>
> The crash is in interrupt_return() where we check lazy_irq_pending(),
> which calls get_paca() and with CONFIG_DEBUG_PREEMPT=y that goes to
> check_preemption_disabled() via debug_smp_processor_id().
>
> As Nick explained on the list:
>
> Problem is MSR[RI] is cleared here, ready to do the last few things
> for interrupt return where we're not allowed to take any other
> interrupts.
>
> SLB interrupts can happen just about anywhere aside from kernel
> text, global variables, and stack. When that hits, it appears to be
> unrecoverable due to RI=0.
>
> The problematic access is in preempt_count() which is:
>
> return READ_ONCE(current_thread_info()->preempt_count);
>
> Because of THREAD_INFO_IN_TASK, current_thread_info() just points to
> current, so the access is to somewhere in kernel memory, but not on
> the stack or in .data, which means it can cause an SLB miss. If we
> take an SLB miss with RI=0 it is fatal.
>
> The easiest solution is to add a version of lazy_irq_pending() that
> doesn't do the preemption check and call it from the interrupt return
> path.
>
> Fixes: 68b34588e202 ("powerpc/64/sycall: Implement syscall entry/exit logic in C")
> Reported-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Thank you, Michael and Nick: this has been running fine all day for me.
Hugh
> ---
> arch/powerpc/include/asm/hw_irq.h | 20 +++++++++++++++++++-
> arch/powerpc/kernel/syscall_64.c | 6 +++---
> 2 files changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/hw_irq.h b/arch/powerpc/include/asm/hw_irq.h
> index e0e71777961f..3a0db7b0b46e 100644
> --- a/arch/powerpc/include/asm/hw_irq.h
> +++ b/arch/powerpc/include/asm/hw_irq.h
> @@ -250,9 +250,27 @@ static inline bool arch_irqs_disabled(void)
> } \
> } while(0)
>
> +static inline bool __lazy_irq_pending(u8 irq_happened)
> +{
> + return !!(irq_happened & ~PACA_IRQ_HARD_DIS);
> +}
> +
> +/*
> + * Check if a lazy IRQ is pending. Should be called with IRQs hard disabled.
> + */
> static inline bool lazy_irq_pending(void)
> {
> - return !!(get_paca()->irq_happened & ~PACA_IRQ_HARD_DIS);
> + return __lazy_irq_pending(get_paca()->irq_happened);
> +}
> +
> +/*
> + * Check if a lazy IRQ is pending, with no debugging checks.
> + * Should be called with IRQs hard disabled.
> + * For use in RI disabled code or other constrained situations.
> + */
> +static inline bool lazy_irq_pending_nocheck(void)
> +{
> + return __lazy_irq_pending(local_paca->irq_happened);
> }
>
> /*
> diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
> index c74295a7765b..1fe94dd9de32 100644
> --- a/arch/powerpc/kernel/syscall_64.c
> +++ b/arch/powerpc/kernel/syscall_64.c
> @@ -189,7 +189,7 @@ notrace unsigned long syscall_exit_prepare(unsigned long r3,
>
> /* This pattern matches prep_irq_for_idle */
> __hard_EE_RI_disable();
> - if (unlikely(lazy_irq_pending())) {
> + if (unlikely(lazy_irq_pending_nocheck())) {
> __hard_RI_enable();
> trace_hardirqs_off();
> local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
> @@ -264,7 +264,7 @@ notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs, unsigned
>
> trace_hardirqs_on();
> __hard_EE_RI_disable();
> - if (unlikely(lazy_irq_pending())) {
> + if (unlikely(lazy_irq_pending_nocheck())) {
> __hard_RI_enable();
> trace_hardirqs_off();
> local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
> @@ -334,7 +334,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsign
>
> trace_hardirqs_on();
> __hard_EE_RI_disable();
> - if (unlikely(lazy_irq_pending())) {
> + if (unlikely(lazy_irq_pending_nocheck())) {
> __hard_RI_enable();
> irq_soft_mask_set(IRQS_ALL_DISABLED);
> trace_hardirqs_off();
> --
> 2.25.1
>
>
next prev parent reply other threads:[~2020-05-03 7:12 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-05-02 14:33 [PATCH] powerpc/64s: Fix unrecoverable SLB crashes due to preemption check Michael Ellerman
2020-05-03 7:10 ` Hugh Dickins [this message]
2020-05-04 10:53 ` Michael Ellerman
2020-05-13 12:43 ` Michael Ellerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.2005030008400.1557@eggly.anvils \
--to=hughd@google.com \
--cc=linuxppc-dev@ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=npiggin@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox