All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Shradha Todi" <shradha.t@samsung.com>
To: "'Manivannan Sadhasivam'" <manivannan.sadhasivam@linaro.org>,
	"'Jonathan Cameron'" <Jonathan.Cameron@Huawei.com>
Cc: <bp@alien8.de>, <tony.luck@intel.com>, <james.morse@arm.com>,
	<mchehab@kernel.org>, <rric@kernel.org>, <lpieralisi@kernel.org>,
	<kw@linux.com>, <robh@kernel.org>, <bhelgaas@google.com>,
	<jingoohan1@gmail.com>, <gustavo.pimentel@synopsys.com>,
	<josh@joshtriplett.org>, <lukas.bulwahn@gmail.com>,
	<hongxing.zhu@nxp.com>, <pankaj.dubey@samsung.com>,
	<linux-kernel@vger.kernel.org>, <linux-pci@vger.kernel.org>,
	<vidyas@nvidia.com>, <gost.dev@samsung.com>,
	<alim.akhtar@samsung.com>, <shiju.jose@huawei.com>,
	"'Terry	Bowman'" <Terry.Bowman@amd.com>
Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller
Date: Fri, 22 Mar 2024 16:51:16 +0530	[thread overview]
Message-ID: <14ef01da7c4b$15dbaac0$41930040$@samsung.com> (raw)
In-Reply-To: <20240322103935.GD3638@thinkpad>



> -----Original Message-----
> From: 'Manivannan Sadhasivam' <manivannan.sadhasivam@linaro.org>
> Sent: 22 March 2024 16:10
> To: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
> Cc: Shradha Todi <shradha.t@samsung.com>; bp@alien8.de;
> tony.luck@intel.com; james.morse@arm.com; mchehab@kernel.org;
> rric@kernel.org; lpieralisi@kernel.org; kw@linux.com; robh@kernel.org;
> bhelgaas@google.com; jingoohan1@gmail.com;
> gustavo.pimentel@synopsys.com; josh@joshtriplett.org;
> lukas.bulwahn@gmail.com; hongxing.zhu@nxp.com;
> pankaj.dubey@samsung.com; linux-kernel@vger.kernel.org; linux-
> pci@vger.kernel.org; vidyas@nvidia.com; gost.dev@samsung.com;
> alim.akhtar@samsung.com; shiju.jose@huawei.com; Terry Bowman
> <Terry.Bowman@amd.com>
> Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> controller
> 
> On Wed, Mar 20, 2024 at 10:01:44AM +0000, Jonathan Cameron wrote:
> > On Tue, 19 Mar 2024 22:03:15 +0530
> > 'Manivannan Sadhasivam' <manivannan.sadhasivam@linaro.org> wrote:
> >
> > > On Thu, Feb 22, 2024 at 04:30:47PM +0530, Shradha Todi wrote:
> > > > + Borislav, Tony, James, Mauro, Robert
> > > >
> > > > Hi All,
> > > >
> > > > Synopsys DesignWare PCIe controllers have a vendor specific
> > > > capability (which means that this set of registers are only
> > > > present in DesignWare controllers) to perform debug operations called
> "RASDES".
> > > > The functionalities provided by this extended capability are:
> > > >
> > > > 1. Debug: This has some debug related diagnostic features like
> > > > holding LTSSM in certain states, reading the status of lane
> > > > detection, checking if any PCIe lanes are broken (RX Valid) and so
> > > > on. It's a debug only feature used for diagnostic use-cases.
> > > >
> > > > 2. Error Injection: This is a way to inject certain errors in PCIe
> > > > like LCRC, ECRC, Bad TLPs and so on. Again, this is a debug
> > > > feature and generally not used in functional use-case.
> > > >
> > > > 3. Statistical counters: This has 3 parts
> > > >  - Error counters
> > > >  - Non error counters (covered as part of perf [1])
> > > >  - Time based analysis counters (covered as part of perf [1])
> > > >
> > > > Selective features of  the above functionality has been
> > > > implemented by vendor specific PCIe controller drivers
> > > > (pcie-tegra194.c) that use Synopsys DesignWare PCIe controllers.
> > > > In order to make it useful to all vendors using DWC controller, we
> > > > had proposed a common implementation in DWC PCIe controller
> > > > directory
> > > > (drivers/pci/controller/dwc/) and our original idea was based on
> > > > debugfs filesystem. v1 and v2 are mentioned in [2] and [3].
> > > >
> > > > We got a suggestion to implement this as part of EDAC framework
> > > > [3] and we looked into the same. But as far as I understood, what
> > > > I am trying to implement is a very specific feature (only valid for Synopsys
> DWC PCIe controllers).
> >
> > For error part there are (at least superficially) similar features in
> > the PCIe standard that we've started thinking about how to support.
> >
> > See Flit Logging Extended capablity (7.7.8 in PCIe Base Spec rev6.
> > That has the benefit that they are part of the standard so we can
> > support them directly in portdrv / EP drivers using some library code
> > in the PCI core.
> >
> 
> Sounds good. But v6 is a relatively new version and the DWC RAS predates that.
> So we still need to support it somehow (either in EDAC or in
> drivers/pci/controller/dwc).
> 

For all PCIe spec standard errors (All of those mentioned in 6.2 in PCIe Base Spec rev6 like AER,
FLIT logging, DPC, SR-IOV Baseline Error Handling), extending the EDAC / portdrv / EP drivers
seems like the best way. But this particular RAS-DES functionality exists only for Synopsys
DesignWare controller, hence I feel the *DWC controller driver* should extend this as it is a
controller specific feature. What would you suggest?
Unless there a plan to support vendor specific implementations through EDAC or
portdrv callbacks? Because writing a new pci driver would cause 2 driver binding problem.

> > There are other interconnect and PCI PMU drivers that log retries etc
> > which are also basically error counts. At least some of that is done through perf
> today.
> >
> 
> IMO all the RAS support should be exposed through EDAC, otherwise it defeats
> the purpose of the subsystem.
> 
> - Mani
> 

Yes, I see that lot of non-error based counters like memory packet tx/rx received,
nak dllp received, etc is done as a part of perf today. But I believe, the functionality I want to
implement has features like error injection which would not be suitable to be a part of perf.

> >
> > > > This doesn't seem to fit in very well with the EDAC framework and
> > > > we can hardly use any of the EDAC framework APIs. We tried
> > > > implementing a "pci_driver" but since a function driver will
> > > > already be running on the EP and portdrv on the root-complex, we
> > > > will not be able to bind 2 drivers to a single PCI device
> > > > (root-complex or endpoint). Ultimately, what I will be doing is
> > > > writing a platform driver with debugfs entries which will be present in EDAC
> directory instead of DWC directory.
> >
> > The addition of this type of functionality to pordrv is a long running question.
> > Everyone wants a solution, I believe some people are looking at it
> > (+CC Terry)
> >
> > Terry, another case for your long list.
> >
> > For the EP end, this should be fired up by the EP driver, whilst it
> > might be infrastructure used on a bunch of devices,  it is a feature
> > of that particular EP - so you'd want to provide any functionality in
> > a form that could be used by both the EP driver and a nice shiny new portdrv
> replacement.
> >
> > > >
> > > > Can  you please help us out by going through this thread [3] and
> > > > letting us know if our understanding is wrong at any point. If you
> > > > think it is a better idea to integrate this in the EDAC framework,
> > > > can you guide me as to how I can utilize the framework better?
> > > > Please let me know if you need any other information to conclude.
> > > >
> > > > [1]
> > > > https://lore.kernel.org/linux-pci/20231121013400.18367-1-xueshuai@
> > > > linux.alibaba.com/ [2]
> > > > https://lore.kernel.org/all/20210518174618.42089-1-shradha.t@samsu
> > > > ng.com/T/ [3]
> > > > https://lore.kernel.org/all/20231130115044.53512-1-shradha.t@samsu
> > > > ng.com/
> > > >
> > >
> > > Gentle ping for the EDAC maintainers.
> > >
> > > - Mani
> > >
> > > > Thanks,
> > > > Shradha
> > > >
> > > > > -----Original Message-----
> > > > > From: 'Manivannan Sadhasivam' <manivannan.sadhasivam@linaro.org>
> > > > > Sent: 16 February 2024 19:19
> > > > > To: Shradha Todi <shradha.t@samsung.com>
> > > > > Cc: lpieralisi@kernel.org; kw@linux.com; robh@kernel.org;
> > > > > bhelgaas@google.com; jingoohan1@gmail.com;
> > > > > gustavo.pimentel@synopsys.com; josh@joshtriplett.org;
> > > > > lukas.bulwahn@gmail.com; hongxing.zhu@nxp.com;
> > > > > pankaj.dubey@samsung.com; linux-kernel@vger.kernel.org; linux-
> > > > > pci@vger.kernel.org; vidyas@nvidia.com; gost.dev@samsung.com
> > > > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in
> > > > > PCIe DW controller
> > > > >
> > > > > On Thu, Feb 15, 2024 at 02:55:06PM +0530, Shradha Todi wrote:
> > > > > >
> > > > > >
> > > > >
> > > > > [...]
> > > > >
> > > > > > > For the error injection and counters, we already have the
> > > > > > > EDAC framework. So adding them in the DWC driver doesn't make
> sense to me.
> > > > > > >
> > > > > >
> > > > > > Sorry for late response, was going through the EDAC framework
> > > > > > to understand
> > > > > better how we can fit RAS DES support in it. Below are some
> > > > > technical challenges found so far:
> > > > > > 1: This debugfs framework proposed [1] can run on both side of
> > > > > > the link i.e. RC
> > > > > and EP as it will be a part of the link controller platform
> > > > > driver. Here for the EP side the assumption is that it has Linux
> > > > > running, which is primarily a use case for chip-to-chip
> > > > > communication.  After your suggestion to migrate to EDAC framework we
> studied and here are the findings:
> > > > > > - If we move to EDAC framework, we need to have RAS DES as a
> > > > > > pci_driver which will be binded based on vendor_id and
> > > > > > device_id. Our observation is that on EP side system we are
> > > > > > unable to bind two function driver (pci_driver), as
> > > > > > pci_endpoint_test function driver or some other chip-to-chip
> > > > > > function driver will already be bound. On the other hand, on
> > > > > > RC side we observed that if we have portdrv enabled in Linux
> > > > > > running on RC system, it gets bound to RC controller and then
> > > > > > it does not allow EDAC pci_driver to bind. So basically we see
> > > > > > a problem here, that we can't have two pci_driver binding to
> > > > > > same PCI device
> > > > > > 2: Another point is even though we use EDAC driver framework,
> > > > > > we may not be
> > > > > able to use any of EDAC framework APIs as they are mostly
> > > > > suitable for memory controller devices sitting on PCI BUS. We
> > > > > will end up using debugfs entries just via a pci_driver placed inside EDAC
> framework.
> > > > >
> > > > > Please wrap your replies to 80 characters.
> > > > >
> > > > > There is no need to bind the edac driver to VID:PID of the
> > > > > device. The edac driver can be a platform driver and you can
> > > > > instantiate the platform device from the DWC driver. This way,
> > > > > the PCI device can be assocaited with whatever driver, but still there can
> be a separate edac driver for handling errors.
> > > > >
> > > > > Regarding API limitation, you should ask the maintainer about
> > > > > the possibility of extending them.
> > > > >
> > > > > >
> > > > > > Please let me know if my understanding is wrong.
> > > > > >
> > > > > > > But first check with the perf driver author if they have any
> > > > > > > plans on adding the proposed functionality. If they do not
> > > > > > > have any plan or not working on it, then look into EDAC.
> > > > > > >
> > > > > > > - Mani
> > > > > > >
> > > > > >
> > > > > > Since we already worked and posted patches [1], [2], we will
> > > > > > continue to work
> > > > > on this and based on consent from community we will adopt to
> > > > > most suitable framework.
> > > > > > We see many subsystems like ethernet, usb, gpu, cxl having
> > > > > > debugfs files that
> > > > > give information about the current status of the running system
> > > > > and as of now based on our findings, we still feel there is no
> > > > > harm in having debugfs entry based support in DesignWare controller
> driver itself.
> > > > >
> > > > > There is no issue in exposing the debug information through
> > > > > debugfs, that's the sole purpose of the interface. But here, you
> > > > > are trying to add support for DWC RAS feature for which a dedicated
> framework already exists.
> > > > >
> > > > > And there will be more similar requests coming for vendor
> > > > > specific error protocols as well. So your investigation could benefit
> everyone.
> > > > >
> > > > > From your above investigation, looks like there are some
> > > > > shortcomings of the EDAC framework. So let's get that clarified
> > > > > by writing to the EDAC maintainers (keep us in CC). If the EDAC
> > > > > maintainer suggests you to add support for this feature in DWC driver
> itself citing some reasons, then no issues with me.
> > > > >
> > > > > - Mani
> > > > >
> > > > > --
> > > > > மணிவண்ணன் சதாசிவம்
> > > >
> > > >
> > >
> >
> 
> --
> மணிவண்ணன் சதாசிவம்



  reply	other threads:[~2024-03-22 11:37 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20231130115055epcas5p4e29befa80877be45dbee308846edc0ba@epcas5p4.samsung.com>
2023-11-30 11:50 ` [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller Shradha Todi
2023-11-30 11:50   ` [PATCH v2 1/3] PCI: dwc: Add support for vendor specific capability search Shradha Todi
2023-11-30 11:50   ` [PATCH v2 2/3] PCI: debugfs: Add support for RASDES framework in DWC Shradha Todi
2023-11-30 11:50   ` [PATCH v2 3/3] PCI: dwc: Create debugfs files in DWC driver Shradha Todi
2023-11-30 16:55   ` [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller Manivannan Sadhasivam
2023-12-04  8:40     ` Shradha Todi
2024-01-03  5:43     ` Shradha Todi
2024-01-04  5:50       ` 'Manivannan Sadhasivam'
2024-02-15  9:25         ` Shradha Todi
2024-02-16 13:49           ` 'Manivannan Sadhasivam'
2024-02-22 11:00             ` Shradha Todi
2024-03-19 16:33               ` 'Manivannan Sadhasivam'
2024-03-20 10:01                 ` Jonathan Cameron
2024-03-22 10:39                   ` 'Manivannan Sadhasivam'
2024-03-22 11:21                     ` Shradha Todi [this message]
2024-03-22 12:58                     ` Jonathan Cameron
2024-03-22 14:41                       ` 'Manivannan Sadhasivam'
2024-04-24 15:32                 ` 'Manivannan Sadhasivam'

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='14ef01da7c4b$15dbaac0$41930040$@samsung.com' \
    --to=shradha.t@samsung.com \
    --cc=Jonathan.Cameron@Huawei.com \
    --cc=Terry.Bowman@amd.com \
    --cc=alim.akhtar@samsung.com \
    --cc=bhelgaas@google.com \
    --cc=bp@alien8.de \
    --cc=gost.dev@samsung.com \
    --cc=gustavo.pimentel@synopsys.com \
    --cc=hongxing.zhu@nxp.com \
    --cc=james.morse@arm.com \
    --cc=jingoohan1@gmail.com \
    --cc=josh@joshtriplett.org \
    --cc=kw@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=lukas.bulwahn@gmail.com \
    --cc=manivannan.sadhasivam@linaro.org \
    --cc=mchehab@kernel.org \
    --cc=pankaj.dubey@samsung.com \
    --cc=robh@kernel.org \
    --cc=rric@kernel.org \
    --cc=shiju.jose@huawei.com \
    --cc=tony.luck@intel.com \
    --cc=vidyas@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.