From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Sean V Kelley <sean.v.kelley@intel.com>, <rjw@rjwysocki.net>,
<bhelgaas@google.com>, <ashok.raj@intel.com>,
<tony.luck@intel.com>,
<sathyanarayanan.kuppuswamy@linux.intel.com>,
<linux-pci@vger.kernel.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH V2 4/9] PCI/AER: Extend AER error handling to RCECs
Date: Tue, 18 Aug 2020 10:01:50 +0100 [thread overview]
Message-ID: <20200818100150.00007f29@Huawei.com> (raw)
In-Reply-To: <20200817222433.GA1453800@bjorn-Precision-5520>
On Mon, 17 Aug 2020 17:24:33 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:
> On Mon, Aug 10, 2020 at 10:32:52AM +0100, Jonathan Cameron wrote:
> > On Fri, 7 Aug 2020 17:55:17 -0700
> > Sean V Kelley <sean.v.kelley@intel.com> wrote:
> > > On 7 Aug 2020, at 15:53, Bjorn Helgaas wrote:
> > > > On Tue, Aug 04, 2020 at 12:40:47PM -0700, Sean V Kelley wrote:
>
> > > >> if (!(pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT ||
> > > >> - pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM))
> > > >> + pci_pcie_type(dev) == PCI_EXP_TYPE_DOWNSTREAM ||
> > > >> + pci_pcie_type(dev) == PCI_EXP_TYPE_RC_END ||
> > > >> + pci_pcie_type(dev) == PCI_EXP_TYPE_RC_EC))
> > > >> dev = dev->bus->self;
>
> I'm not sure I understand this "if" statement. Previously (with no
> RCEC support), the possible ways I see to call pcie_do_recovery() are
> with:
>
> AER native: Root Port
> AER via APEI: Root Port or other PCIe device (ACPI v6.3, 18.3.2.5)
> DPC: Root Port or Switch Downstream Port
> EDR: Root Port or Switch Downstream Port
>
> I *guess* the reason we have this "if" statement is for the AER/APEI
> case? And the effect is that even if AER/APEI gives us an Endpoint,
> we back up and handle it as though we got it from the Downstream Port
> above it, i.e., we reset the Endpoint along with any other children of
> that Downstream Port?
>
> Then, IIUC, your patches add this case:
>
> AER native: Root Port or RCEC
> AER via APEI: Root Port, RCEC, or other PCIe device
>
> Just noodling here, but I wonder if this would be more understandable
> as something like:
>
> type = pci_pcie_type(dev);
> if (type == PCI_EXP_TYPE_ROOT_PORT ||
> type == PCI_EXP_TYPE_DOWNSTREAM ||
> type == PCI_EXP_TYPE_RC_EC)
> bridge = dev;
> else if (type == PCI_EXP_TYPE_RC_END)
> bridge = dev->rcec;
> else
> bridge = pci_upstream_bridge(dev);
>
> and then we could do:
>
> if (type == PCI_EXP_TYPE_RC_END)
> flr_on_rciep(dev);
> else
> reset_link(bridge);
>
> It's still awkward to have to deal with being supplied either
> endpoints or bridges. But I guess in the AER/APEI case, we aren't
> allowed to touch the error registers so maybe we can't avoid the
> awkwardness.
Agreed with your analysis with one exception. It isn't just that we
aren't allowed to touch the error registers, but also that they may
not even exist (i.e. there is no RCEC).
There are quite a lot of places where we have to then handle the
cases separately. For an RC_END in the APEI case we don't
have to have an RCEC as we should never be touching it or
any of its registers. We have platforms that do it this way
(obviously there is a hardware entity doing RCEC like stuff, but it is
not visible to the OS).
In these cases (bridge == NULL) and we can't call the bus_walk on it
to call the various desired resets on the RCiEP. We could
do something like
pci_walk_affected(bridge, dev, report_frozen, &status);
and if bridge is NULL, perform the reset just on dev.
Would that be clearer?
Thanks,
Jonathan
>
> > > >> status = reset_link(dev);
> > > >
> > > > reset_link() might be misnamed. IIUC "dev" is a bridge, and the point
> > > > is really to reset any devices below "dev." Whether we do that by
> > > > resetting link, DPC trigger, secondary bus reset, FLR, etc, is sort of
> > > > immaterial. Some of those methods might be applicable for RCiEPs.
> > > >
> > > > But you didn't add that name; I'm just trying to understand this
> > > > better.
> > >
> > > Yes, that’s a confusing term with the _link attached. It’s difficult
> > > to relate to the different resets that might be applicable. I was
> > > thinking about that when looking at the callback path via the
> > > “reset_link” of the RCiEP to the RCEC for the sole purpose of
> > > clearing the Root Port Error Status. It would be worth time to spend
> > > looking at better descriptive naming/methods.
> >
> > Agreed, this caused me some some confusion as well so more descriptive
> > naming would be good.
>
> Maybe something like reset_subordinate_devices()? Then it's clear
> that we pass a bridge and reset the devices *below* it. It's not
> quite as obvious for RCECs, since they aren't bridges and the RCiEPs
> aren't actually *subordinates*, but maybe it's still suggestive of the
> logical relationship?
>
> Bjorn
next prev parent reply other threads:[~2020-08-18 9:03 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-04 19:40 [PATCH V2 0/9] Add RCEC handling to PCI/AER Sean V Kelley
2020-08-04 19:40 ` [PATCH V2 1/9] pci_ids: Add class code and extended capability for RCEC Sean V Kelley
2020-08-05 3:01 ` Bjorn Helgaas
2020-08-04 19:40 ` [PATCH V2 2/9] PCI: Extend Root Port Driver to support RCEC Sean V Kelley
2020-08-05 17:43 ` Bjorn Helgaas
2020-08-05 18:14 ` Sean V Kelley
2020-08-05 17:45 ` Jonathan Cameron
2020-08-05 17:50 ` Sean V Kelley
2020-08-26 16:16 ` Kuppuswamy, Sathyanarayanan
2020-08-26 18:29 ` sean.v.kelley
2020-08-04 19:40 ` [PATCH V2 3/9] PCI/portdrv: Add pcie_walk_rcec() to walk RCiEPs associated with RCEC Sean V Kelley
2020-08-05 17:43 ` Bjorn Helgaas
2020-08-05 18:07 ` Sean V Kelley
2020-08-26 16:20 ` Kuppuswamy, Sathyanarayanan
2020-08-26 18:37 ` sean.v.kelley
2020-08-04 19:40 ` [PATCH V2 4/9] PCI/AER: Extend AER error handling to RCECs Sean V Kelley
2020-08-07 22:53 ` Bjorn Helgaas
2020-08-08 0:55 ` Sean V Kelley
2020-08-10 9:32 ` Jonathan Cameron
2020-08-17 22:24 ` Bjorn Helgaas
2020-08-18 9:01 ` Jonathan Cameron [this message]
2020-08-04 19:40 ` [PATCH V2 5/9] PCI/AER: Apply function level reset to RCiEP on fatal error Sean V Kelley
2020-08-04 19:40 ` [PATCH V2 6/9] PCI: Add 'rcec' field to pci_dev for associated RCiEPs Sean V Kelley
2020-08-05 17:40 ` Jonathan Cameron
2020-08-05 17:48 ` Sean V Kelley
2020-08-04 19:40 ` [PATCH V2 7/9] PCI/AER: Add RCEC AER handling Sean V Kelley
2020-08-05 17:49 ` Jonathan Cameron
2020-08-04 19:40 ` [PATCH V2 8/9] PCI/PME: Add RCEC PME handling Sean V Kelley
2020-08-05 17:51 ` Jonathan Cameron
2020-08-04 19:40 ` [PATCH V2 9/9] PCI/AER: Add RCEC AER error injection support Sean V Kelley
2020-08-05 17:54 ` Jonathan Cameron
2020-08-05 18:09 ` Sean V Kelley
2020-08-05 18:00 ` [PATCH V2 0/9] Add RCEC handling to PCI/AER Bjorn Helgaas
2020-08-05 18:12 ` Sean V Kelley
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200818100150.00007f29@Huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=ashok.raj@intel.com \
--cc=bhelgaas@google.com \
--cc=helgaas@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=rjw@rjwysocki.net \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=sean.v.kelley@intel.com \
--cc=tony.luck@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).