Re: powerpc/pci: [PATCH 1/1 V3] PCIE PHB reset

linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed

From: "Oliver O'Halloran" <oohall@gmail.com>
To: Michael Ellerman <mpe@ellerman.id.au>
Cc: Brian King <brking@linux.vnet.ibm.com>,
	Wen Xiong <wenxiong@linux.vnet.ibm.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	wenxiong@us.ibm.com
Subject: Re: powerpc/pci: [PATCH 1/1 V3] PCIE PHB reset
Date: Fri, 19 Jun 2020 16:09:15 +1000	[thread overview]
Message-ID: <CAOSf1CEusxHAvXOUJ=R8MV8z0rt8Hh0Ag33QmSDopH1FQrxEGg@mail.gmail.com> (raw)
In-Reply-To: <87ftaudx1x.fsf@mpe.ellerman.id.au>

On Wed, Jun 17, 2020 at 4:29 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> "Oliver O'Halloran" <oohall@gmail.com> writes:
> > On Tue, Jun 16, 2020 at 9:55 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
> >> wenxiong@linux.vnet.ibm.com writes:
> >> > From: Wen Xiong <wenxiong@linux.vnet.ibm.com>
> >> >
> >> > Several device drivers hit EEH(Extended Error handling) when triggering
> >> > kdump on Pseries PowerVM. This patch implemented a reset of the PHBs
> >> > in pci general code when triggering kdump.
> >>
> >> Actually it's in pseries specific PCI code, and the reset is done in the
> >> 2nd kernel as it boots, not when triggering the kdump.
> >>
> >> You're doing it as a:
> >>
> >>   machine_postcore_initcall(pseries, pseries_phb_reset);
> >>
> >> But we do the EEH initialisation in:
> >>
> >>   core_initcall_sync(eeh_init);
> >>
> >> Which happens first.
> >>
> >> So it seems to me that this should be called from pseries_eeh_init().
> >
> > This happens to use some of the same RTAS calls as EEH, but it's
> > entirely orthogonal to it.
>
> I don't agree. I mean it's literally calling EEH_RESET_FUNDAMENTAL etc.
> Those RTAS calls are all documented in the EEH section of PAPR.
>
> I guess you're saying it's orthogonal to the kernel handling an EEH and
> doing the recovery process etc, which I can kind of see.
>
> > Wedging the two together doesn't make any real sense IMO since this
> > should be usable even with !CONFIG_EEH.
>
> You can't turn CONFIG_EEH off for pseries or powernv.

Not yet :)

> And if you could this patch wouldn't compile because it uses EEH
> constants that are behind #ifdef CONFIG_EEH.

That's fixable.

> If you could turn CONFIG_EEH off it would presumably be because you were
> on a platform that didn't support EEH, in which case you wouldn't need
> this code.

I think there's an argument to be made for disabling EEH in some
situations. A lot of drivers do a pretty poor job of recovering in the
first place so it's conceivable that someone might want to disable it
in say, a kdump kernel. That said, the real reason is mostly for the
sake of code organisation. EEH is an optional platform feature but you
wouldn't know it looking at the implementation and I'd like to stop it
bleeding into odd places. Making it buildable without !CONFIG_EEH
would probably help.

> So IMO this is EEH code, and should be with the other EEH code and
> should be behind CONFIG_EEH.

*shrug*

I wanted it to follow the model of the powernv implementation of the
same feature which is done immediately after initialising the
pci_controller and independent of all of the EEH setup. Although,
looking at it again I see it calls pnv_eeh_phb_reset() which is in
eeh_powernv.c so I guess that's pretty similar to what you're
suggesting.

> That sounds like a good cleanup. I'm not concerned about conflicts
> within arch/powerpc, I can fix them up.
>
> >> > +             list_for_each_entry(phb, &hose_list, list_node) {
> >> > +                     config_addr = pseries_get_pdn_addr(phb);
> >> > +                     if (config_addr == -1)
> >> > +                             continue;
> >> > +
> >> > +                     ret = rtas_call(ibm_set_slot_reset, 4, 1, NULL,
> >> > +                             config_addr, BUID_HI(phb->buid),
> >> > +                             BUID_LO(phb->buid), EEH_RESET_FUNDAMENTAL);
> >> > +
> >> > +                     /* If fundamental-reset not supported, try hot-reset */
> >> > +                     if (ret == -8)
> >>
> >> Where does -8 come from?
> >
> > There's a comment right there.
>
> Yeah I guess. I was expecting it would map to some RTAS_ERROR_FOO value,
> but it's just literally -8 in PAPR.

Yeah, as far as I can tell the meaning of the return codes are
specific to each RTAS call, it's a bit bad.

     prev parent reply	other threads:[~2020-06-19  6:11 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-26 13:21 powerpc/pci: [PATCH 1/1 V3] PCIE PHB reset wenxiong
2020-05-29  2:56 ` Oliver O'Halloran
2020-05-29  3:17 ` Sam Bobroff
2020-06-16 11:56 ` Michael Ellerman
2020-06-17  0:45   ` Oliver O'Halloran
2020-06-17  6:30     ` Michael Ellerman
2020-06-19  6:09       ` Oliver O'Halloran [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOSf1CEusxHAvXOUJ=R8MV8z0rt8Hh0Ag33QmSDopH1FQrxEGg@mail.gmail.com' \
    --to=oohall@gmail.com \
    --cc=brking@linux.vnet.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mpe@ellerman.id.au \
    --cc=wenxiong@linux.vnet.ibm.com \
    --cc=wenxiong@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).