From: Nicholas Piggin <npiggin@gmail.com>
To: Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com>
Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Laurent Dufour <ldufour@linux.vnet.ibm.com>,
Michal Suchanek <msuchanek@suse.com>,
Michael Ellerman <mpe@ellerman.id.au>
Subject: Re: [PATCH v6 5/8] powerpc/pseries: flush SLB contents on SLB MCE errors.
Date: Wed, 1 Aug 2018 15:58:53 +1000
Message-ID: <20180801155853.3af801c2@roar.ozlabs.ibm.com>
In-Reply-To: <153072708065.29016.482194584457257883.stgit@jupiter.in.ibm.com>
On Wed, 04 Jul 2018 23:28:21 +0530
Mahesh J Salgaonkar <mahesh@linux.vnet.ibm.com> wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
>
> On pseries, as of today system crashes if we get a machine check
> exceptions due to SLB errors. These are soft errors and can be fixed by
> flushing the SLBs so the kernel can continue to function instead of
> system crash. We do this in real mode before turning on MMU. Otherwise
> we would run into nested machine checks. This patch now fetches the
> rtas error log in real mode and flushes the SLBs on SLB errors.
>
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/mmu-hash.h | 1
> arch/powerpc/include/asm/machdep.h | 1
> arch/powerpc/kernel/exceptions-64s.S | 42 +++++++++++++++++++++
> arch/powerpc/kernel/mce.c | 16 +++++++-
> arch/powerpc/mm/slb.c | 6 +++
> arch/powerpc/platforms/pseries/pseries.h | 1
> arch/powerpc/platforms/pseries/ras.c | 51 +++++++++++++++++++++++++
> arch/powerpc/platforms/pseries/setup.c | 1
> 8 files changed, 116 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> index 50ed64fba4ae..cc00a7088cf3 100644
> --- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
> @@ -487,6 +487,7 @@ extern void hpte_init_native(void);
>
> extern void slb_initialize(void);
> extern void slb_flush_and_rebolt(void);
> +extern void slb_flush_and_rebolt_realmode(void);
>
> extern void slb_vmalloc_update(void);
> extern void slb_set_size(u16 size);
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index ffe7c71e1132..fe447e0d4140 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -108,6 +108,7 @@ struct machdep_calls {
>
> /* Early exception handlers called in realmode */
> int (*hmi_exception_early)(struct pt_regs *regs);
> + int (*machine_check_early)(struct pt_regs *regs);
>
> /* Called during machine check exception to retrive fixup address. */
> bool (*mce_check_early_recovery)(struct pt_regs *regs);
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index f283958129f2..0038596b7906 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -332,6 +332,9 @@ TRAMP_REAL_BEGIN(machine_check_pSeries)
> machine_check_fwnmi:
> SET_SCRATCH0(r13) /* save r13 */
> EXCEPTION_PROLOG_0(PACA_EXMC)
> +BEGIN_FTR_SECTION
> + b machine_check_pSeries_early
> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
> machine_check_pSeries_0:
> EXCEPTION_PROLOG_1(PACA_EXMC, KVMTEST_PR, 0x200)
> /*
> @@ -343,6 +346,45 @@ machine_check_pSeries_0:
>
> TRAMP_KVM_SKIP(PACA_EXMC, 0x200)
>
> +TRAMP_REAL_BEGIN(machine_check_pSeries_early)
> +BEGIN_FTR_SECTION
> + EXCEPTION_PROLOG_1(PACA_EXMC, NOTEST, 0x200)
> + mr r10,r1 /* Save r1 */
> + ld r1,PACAMCEMERGSP(r13) /* Use MC emergency stack */
> + subi r1,r1,INT_FRAME_SIZE /* alloc stack frame */
> + mfspr r11,SPRN_SRR0 /* Save SRR0 */
> + mfspr r12,SPRN_SRR1 /* Save SRR1 */
> + EXCEPTION_PROLOG_COMMON_1()
> + EXCEPTION_PROLOG_COMMON_2(PACA_EXMC)
> + EXCEPTION_PROLOG_COMMON_3(0x200)
> + addi r3,r1,STACK_FRAME_OVERHEAD
> + BRANCH_LINK_TO_FAR(machine_check_early) /* Function call ABI */
> +
> + /* Move original SRR0 and SRR1 into the respective regs */
> + ld r9,_MSR(r1)
> + mtspr SPRN_SRR1,r9
> + ld r3,_NIP(r1)
> + mtspr SPRN_SRR0,r3
> + ld r9,_CTR(r1)
> + mtctr r9
> + ld r9,_XER(r1)
> + mtxer r9
> + ld r9,_LINK(r1)
> + mtlr r9
> + REST_GPR(0, r1)
> + REST_8GPRS(2, r1)
> + REST_GPR(10, r1)
> + ld r11,_CCR(r1)
> + mtcr r11
> + REST_GPR(11, r1)
> + REST_2GPRS(12, r1)
> + /* restore original r1. */
> + ld r1,GPR1(r1)
> + SET_SCRATCH0(r13) /* save r13 */
> + EXCEPTION_PROLOG_0(PACA_EXMC)
> + b machine_check_pSeries_0
> +END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
> +
> EXC_COMMON_BEGIN(machine_check_common)
> /*
> * Machine check is different because we use a different
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index efdd16a79075..221271c96a57 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -488,9 +488,21 @@ long machine_check_early(struct pt_regs *regs)
> {
> long handled = 0;
>
> - __this_cpu_inc(irq_stat.mce_exceptions);
> + /*
> + * For pSeries we count mce when we go into virtual mode machine
> + * check handler. Hence skip it. Also, We can't access per cpu
> + * variables in real mode for LPAR.
> + */
> + if (early_cpu_has_feature(CPU_FTR_HVMODE))
> + __this_cpu_inc(irq_stat.mce_exceptions);
>
> - if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
> + /*
> + * See if platform is capable of handling machine check.
> + * Otherwise fallthrough and allow CPU to handle this machine check.
> + */
> + if (ppc_md.machine_check_early)
> + handled = ppc_md.machine_check_early(regs);
> + else if (cur_cpu_spec && cur_cpu_spec->machine_check_early)
> handled = cur_cpu_spec->machine_check_early(regs);
> return handled;
> }

This looks fine to me after Michal's patch. Not sure if you want to
fold them together or add his immediately after this one in your series.

> diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
> index 66577cc66dc9..5b1813b98358 100644
> --- a/arch/powerpc/mm/slb.c
> +++ b/arch/powerpc/mm/slb.c
> @@ -145,6 +145,12 @@ void slb_flush_and_rebolt(void)
> get_paca()->slb_cache_ptr = 0;
> }
>
> +void slb_flush_and_rebolt_realmode(void)
> +{
> + __slb_flush_and_rebolt();
> + get_paca()->slb_cache_ptr = 0;
> +}

I think this should do something more like flush_and_reload_slb() from
the powernv machine check code. We are in real mode, so we should
invalidate all SLB entries.

As it happens, I also need very similar code (without the initial
invalidate) to implement the idle wakeup code in C, so IMO we should
move that function and its variants into mm/slb.c.

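
To make the suggestion concrete, here is a rough user-space model of the
shape I have in mind: one helper that reloads the bolted entries from a
shadow copy, and a variant that first invalidates everything. This is a
sketch only, not kernel code. All names (model_slb, model_slb_shadow,
model_slb_reload_bolted, and so on) and the sizes are hypothetical, and the
SLB is modelled as a plain array. The real code would use the slbia and
slbmte instructions and the SLB shadow buffer, none of which is modelled
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_SLB_ENTRIES 32	/* hypothetical SLB size for this model */
#define MODEL_SLB_BOLTED   3	/* hypothetical number of bolted entries */

struct model_slb_entry {
	uint64_t esid;
	uint64_t vsid;
	bool valid;
};

/* Stands in for the hardware SLB. */
struct model_slb {
	struct model_slb_entry entry[MODEL_SLB_ENTRIES];
};

/* Stands in for the shadow copy of the bolted entries. */
struct model_slb_shadow {
	uint32_t persistent;	/* number of entries in save_area in use */
	struct model_slb_entry save_area[MODEL_SLB_BOLTED];
};

/* Invalidate every entry, including entry 0. */
void model_slb_invalidate_all(struct model_slb *slb)
{
	memset(slb, 0, sizeof(*slb));
}

/*
 * Reload only the bolted entries from the shadow copy. This is the
 * variant without the initial invalidate.
 */
void model_slb_reload_bolted(struct model_slb *slb,
			     const struct model_slb_shadow *shadow)
{
	size_t i, n = shadow->persistent;

	/* Never trust the count beyond what the save area can hold. */
	if (n > MODEL_SLB_BOLTED)
		n = MODEL_SLB_BOLTED;
	assert(n <= MODEL_SLB_ENTRIES);

	for (i = 0; i < n; i++) {
		slb->entry[i] = shadow->save_area[i];
		slb->entry[i].valid = true;
	}
}

/* Machine check variant: drop everything, then restore bolted entries. */
void model_slb_flush_and_reload(struct model_slb *slb,
				const struct model_slb_shadow *shadow)
{
	model_slb_invalidate_all(slb);
	model_slb_reload_bolted(slb, shadow);
}

/* Helper for inspecting the model: count entries marked valid. */
size_t model_slb_count_valid(const struct model_slb *slb)
{
	size_t i, count = 0;

	for (i = 0; i < MODEL_SLB_ENTRIES; i++)
		if (slb->entry[i].valid)
			count++;
	return count;
}
```

In this model the machine check path would call
model_slb_flush_and_reload(), while a caller that already starts from an
empty SLB would call model_slb_reload_bolted() directly.
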
Thanks,
Nick