From: Guenter Roeck <linux@roeck-us.net>
To: Scott Wood <scottwood@freescale.com>
Cc: hongtao.jia@freescale.com, linux-kernel@vger.kernel.org,
Guenter Roeck <groeck@juniper.net>,
Paul Mackerras <paulus@samba.org>,
linuxppc-dev@lists.ozlabs.org,
Jojy G Varghese <jojyv@juniper.net>
Subject: Re: [PATCH] powerpc/fsl: Add support for pci(e) machine check exception on E500MC / E5500
Date: Mon, 29 Sep 2014 12:06:43 -0700
Message-ID: <20140929190643.GA20189@roeck-us.net>
In-Reply-To: <1412015766.13320.253.camel@snotra.buserror.net>
On Mon, Sep 29, 2014 at 01:36:06PM -0500, Scott Wood wrote:
> On Mon, 2014-09-29 at 09:48 -0700, Guenter Roeck wrote:
> > From: Jojy G Varghese <jojyv@juniper.net>
> >
> > For E500MC and E5500, a machine check exception in pci(e) memory space
> > crashes the kernel.
> >
> > Testing shows that the MCAR(U) register is zero on a MC exception for the
> > E5500 core. At the same time, the DEAR register has been found to hold
> > the address of the faulting load during an MC exception for this core.
> >
> > This fix changes the current behavior to fix up the result register
> > and instruction pointer in the case of a load operation on a faulty
> > PCI address.
> >
> > The changes are:
> > - Added the hook to pci machine check handling to the e500mc machine check
> > exception handler.
> > - For the E5500 core, load faulting address from SPRN_DEAR register.
> > As mentioned above, this is necessary because the E5500 core does not
> > report the fault address in the MCAR register.
> >
> > Cc: Scott Wood <scottwood@freescale.com>
> > Signed-off-by: Jojy G Varghese <jojyv@juniper.net>
> > [Guenter Roeck: updated description]
> > Signed-off-by: Guenter Roeck <groeck@juniper.net>
> > Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> > ---
> > arch/powerpc/kernel/traps.c | 3 ++-
> > arch/powerpc/sysdev/fsl_pci.c | 5 +++++
> > 2 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
> > index 0dc43f9..ecb709b 100644
> > --- a/arch/powerpc/kernel/traps.c
> > +++ b/arch/powerpc/kernel/traps.c
> > @@ -494,7 +494,8 @@ int machine_check_e500mc(struct pt_regs *regs)
> >  	int recoverable = 1;
> >  
> >  	if (reason & MCSR_LD) {
> > -		recoverable = fsl_rio_mcheck_exception(regs);
> > +		recoverable = fsl_rio_mcheck_exception(regs) ||
> > +			fsl_pci_mcheck_exception(regs);
> >  		if (recoverable == 1)
> >  			goto silent_out;
> >  	}
> > diff --git a/arch/powerpc/sysdev/fsl_pci.c b/arch/powerpc/sysdev/fsl_pci.c
> > index c507767..bdb956b 100644
> > --- a/arch/powerpc/sysdev/fsl_pci.c
> > +++ b/arch/powerpc/sysdev/fsl_pci.c
> > @@ -1021,6 +1021,11 @@ int fsl_pci_mcheck_exception(struct pt_regs *regs)
> >  #endif
> >  	addr += mfspr(SPRN_MCAR);
> >  
> > +#ifdef CONFIG_E5500_CPU
> > +	if (mfspr(SPRN_EPCR) & SPRN_EPCR_ICM)
> > +		addr = PFN_PHYS(vmalloc_to_pfn((void *)mfspr(SPRN_DEAR)));
> > +#endif
>
> Kconfig tells you what hardware is supported, not what hardware you're
> actually running on.
>
Hi Scott,

Good point. Jojy, I guess we'll have to check at runtime whether the CPU is
actually an E5500. Can you look into that?
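
For reference, such a runtime check could key off the processor version
register instead of Kconfig. The following is only a minimal standalone
sketch, not the proposed change: the PVR_VER() macro and the version values
are assumed to mirror arch/powerpc/include/asm/reg.h and should be verified
against the tree, and demo_cpu_is_e5500() is a hypothetical helper. In the
kernel the value would come from mfspr(SPRN_PVR) rather than a parameter.

```c
#include <stdint.h>

/* Assumed to mirror arch/powerpc/include/asm/reg.h; verify before use. */
#define PVR_VER(pvr)	(((pvr) >> 16) & 0xFFFF)
#define PVR_VER_E500MC	0x8023
#define PVR_VER_E5500	0x8024

/*
 * Hypothetical helper: returns 1 if the given PVR value identifies an
 * E5500 core, 0 otherwise. The upper 16 bits of the PVR hold the
 * version, the lower 16 bits the revision.
 */
static int demo_cpu_is_e5500(uint32_t pvr)
{
	return PVR_VER(pvr) == PVR_VER_E5500;
}
```

With something like this, the DEAR lookup would depend on the core the
kernel is actually running on, not on whether E5500 support was built in.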
> Jia Hongtao, do you know anything about this issue? Is there an
> erratum? What chips are affected by the erratum covered by
> <http://patchwork.ozlabs.org/patch/240239/>?
>
We already have and use the above patch(es) in our kernel. It works fine
for E500 (P2020), but does not address E5500 (P5020/P5040).
> Can we rely on DEAR or is this just a side effect of likely having taken
> a TLB miss for the address recently? Perhaps we should use the
> instruction emulation to determine the effective address instead.
>
> Guenter, is this patch intended to deal with an erratum or are you
> covering up legitimate errors?
>
Those are errors related to PCIe hotplug, and are seen with unexpected PCIe
device removals (triggered, for example, by removing power from a PCIe adapter).
The behavior we see on E5500 is quite similar to the behavior on E500:
If there is an error on a PCIe access and it is left unhandled, the CPU keeps
executing the same instruction over and over again and thus stalls. I don't
know if this is considered an erratum or expected behavior, but we have to be
able to handle that condition either way. Ultimately, we'll want to implement
PCIe error handlers for the affected drivers, but that will be a next
step.
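
To illustrate the kind of fix-up the patch description refers to (set the
result register, then step past the faulting load so it is not re-executed),
here is a minimal standalone sketch. It is not the kernel code: struct
demo_regs is a hypothetical stand-in for struct pt_regs,
demo_fixup_failed_load() is a made-up name, and only the lwz instruction
(primary opcode 32) is handled. Returning all-ones is an assumption chosen
to match what a read from an absent PCI device conventionally yields.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct pt_regs. */
struct demo_regs {
	uint64_t gpr[32];	/* general purpose registers */
	uint64_t nip;		/* address of the faulting instruction */
};

#define DEMO_OP_LWZ	32u	/* primary opcode of lwz (D-form load word) */

/*
 * If 'inst' is an lwz, pretend the load returned all-ones and advance
 * the instruction pointer past it. Returns 1 if the fault was fixed up,
 * 0 if the instruction is not one this sketch knows how to handle.
 */
static int demo_fixup_failed_load(struct demo_regs *regs, uint32_t inst)
{
	uint32_t op = inst >> 26;		/* bits 0-5: primary opcode */
	uint32_t rt = (inst >> 21) & 0x1f;	/* bits 6-10: target register */

	if (op != DEMO_OP_LWZ)
		return 0;

	regs->gpr[rt] = 0xffffffffu;	/* value the driver will see */
	regs->nip += 4;			/* skip the faulting instruction */
	return 1;
}
```

Anything the sketch does not recognize is reported as not recoverable, so
the caller would fall through to the normal machine check path.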
Please let me know if you have a better solution to address this problem.
Thanks,
Guenter