All of lore.kernel.org
 help / color / mirror / Atom feed
From: fan <nifan.cxl@gmail.com>
To: Terry Bowman <Terry.Bowman@amd.com>
Cc: nifan.cxl@gmail.com, Dan Williams <dan.j.williams@intel.com>,
	ira.weiny@intel.com, dave@stgolabs.net, dave.jiang@intel.com,
	alison.schofield@intel.com, ming4.li@intel.com,
	vishal.l.verma@intel.com, jim.harris@samsung.com,
	ilpo.jarvinen@linux.intel.com, ardb@kernel.org,
	sathyanarayanan.kuppuswamy@linux.intel.com,
	linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
	Yazen.Ghannam@amd.com, Robert.Richter@amd.com,
	Bjorn Helgaas <bhelgaas@google.com>,
	linux-pci@vger.kernel.org
Subject: Re: [RFC PATCH 1/9] PCI/AER: Update AER driver to call root port and downstream port UCE handlers
Date: Wed, 10 Jul 2024 18:14:25 -0700	[thread overview]
Message-ID: <Zo8x8YvdHSgcjGcR@debian> (raw)
In-Reply-To: <8bf0d1c9-632e-458b-9b78-0faeea0472f8@amd.com>

On Wed, Jul 10, 2024 at 04:48:09PM -0500, Terry Bowman wrote:
> Hi Fan,
> 
> On 7/10/24 15:48, nifan.cxl@gmail.com wrote:
> > On Mon, Jun 24, 2024 at 12:56:29PM -0500, Terry Bowman wrote:
> >> Hi Dan,
> >>
> >> I added a response below.
> >>
> >> On 6/21/24 14:17, Dan Williams wrote:
> >>> Terry Bowman wrote:
> >>>> The AER service driver does not currently call a handler for AER
> >>>> uncorrectable errors (UCE) detected in root ports or downstream
> >>>> ports. This is not needed in most cases because common PCIe port
> >>>> functionality is handled by portdrv service drivers.
> >>>>
> >>>> CXL root ports include CXL specific RAS registers that need logging
> >>>> before starting do_recovery() in the UCE case.
> >>>>
> >>>> Update the AER service driver to call the UCE handler for root ports
> >>>> and downstream ports. These PCIe port devices are bound to the portdrv
> >>>> driver that includes a CE and UCE handler to be called.
> >>>>
> >>>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> >>>> Cc: Bjorn Helgaas <bhelgaas@google.com>
> >>>> Cc: linux-pci@vger.kernel.org
> >>>> ---
> >>>>  drivers/pci/pcie/err.c | 20 ++++++++++++++++++++
> >>>>  1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> >>>> index 705893b5f7b0..a4db474b2be5 100644
> >>>> --- a/drivers/pci/pcie/err.c
> >>>> +++ b/drivers/pci/pcie/err.c
> >>>> @@ -203,6 +203,26 @@ pci_ers_result_t pcie_do_recovery(struct pci_dev *dev,
> >>>>  	pci_ers_result_t status = PCI_ERS_RESULT_CAN_RECOVER;
> >>>>  	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
> >>>>  
> >>>> +	/*
> >>>> +	 * PCIe ports may include functionality beyond the standard
> >>>> +	 * extended port capabilities. This may present a need to log and
> >>>> +	 * handle errors not addressed in this driver. Examples are CXL
> >>>> +	 * root ports and CXL downstream switch ports using AER UIE to
> >>>> +	 * indicate CXL UCE RAS protocol errors.
> >>>> +	 */
> >>>> +	if (type == PCI_EXP_TYPE_ROOT_PORT ||
> >>>> +	    type == PCI_EXP_TYPE_DOWNSTREAM) {
> >>>> +		struct pci_driver *pdrv = dev->driver;
> >>>> +
> >>>> +		if (pdrv && pdrv->err_handler &&
> >>>> +		    pdrv->err_handler->error_detected) {
> >>>> +			const struct pci_error_handlers *err_handler;
> >>>> +
> >>>> +			err_handler = pdrv->err_handler;
> >>>> +			status = err_handler->error_detected(dev, state);
> >>>> +		}
> >>>> +	}
> >>>> +
> >>>
> >>> Would not a more appropriate place for this be pci_walk_bridge() where
> >>> the ->subordinate == NULL and these type-check cases are unified?
> >>
> >> It does. I can take a look at moving that.
> > 
> > Has that already been handled in pci_walk_bridge?
> > 
> > The function pci_walk_bridge() will call report_error_detected, where
> > the err handler will be called. 
> > https://elixir.bootlin.com/linux/v6.10-rc6/source/drivers/pci/pcie/err.c#L80
> > 
> > Fan
> > 
> 
> You would think so but the UCE handler was not called in my testing for the PCIe 
> ports (RP,USP,DSP). The pci_walk_bridge() function has 2 cases:
> - If there is a subordinate/secondary bus then the callback is called for
> those downstream devices but not the port itself.
> - If there is no subordinate/secondary bus then the callback is invoked for the 
> port itself.
> 
> The function header comment may explain it better:
> /**                                                                                                                                                                                                                
>  * pci_walk_bridge - walk bridges potentially AER affected                                                                                                                                                         
>  * @bridge:     bridge which may be a Port, an RCEC, or an RCiEP                                                                                                                                                   
>  * @cb:         callback to be called for each device found                                                                                                                                                        
>  * @userdata:   arbitrary pointer to be passed to callback                                                                                                                                                         
>  *                                                             
>  * If the device provided is a bridge, walk the subordinate bus, including                                                                                                                                         
>  * any bridged devices on buses under this bus.  Call the provided callback                                                                                                                                        
>  * on each device found.                                                                                                                                                                                           
>  *                                                                                                                                                                                                                 
>  * If the device provided has no subordinate bus, e.g., an RCEC or RCiEP,                                                                                                                                          
>  * call the callback on the device itself. 
>  */
> 
> Regards,
> Terry

OK, interesting.
Btw, what is the "state" passed to pcie_do_recovery(...state...)?

Fan

> 
> >>
> >> Regards,
> >> Terry

  reply	other threads:[~2024-07-11  1:14 UTC|newest]

Thread overview: 59+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-17 20:04 [RFC PATCH 0/9] Add RAS support for CXL root ports, CXL downstream switch ports, and CXL upstream switch ports Terry Bowman
2024-06-17 20:04 ` [RFC PATCH 1/9] PCI/AER: Update AER driver to call root port and downstream port UCE handlers Terry Bowman
2024-06-20 11:21   ` Jonathan Cameron
2024-06-24 14:58     ` Terry Bowman
2024-06-21 19:17   ` Dan Williams
2024-06-24 17:56     ` Terry Bowman
2024-07-10 20:48       ` nifan.cxl
2024-07-10 21:48         ` Terry Bowman
2024-07-11  1:14           ` fan [this message]
2024-08-19 18:35       ` Fan Ni
2024-06-17 20:04 ` [RFC PATCH 2/9] PCI/AER: Call AER CE handler before clearing AER CE status register Terry Bowman
2024-06-20 11:31   ` Jonathan Cameron
2024-06-24 15:08     ` Terry Bowman
2024-06-21 19:23   ` Dan Williams
2024-06-24 18:00     ` Terry Bowman
2024-06-17 20:04 ` [RFC PATCH 3/9] PCI/portdrv: Update portdrv with an atomic notifier for reporting AER internal errors Terry Bowman
2024-06-20 12:30   ` Jonathan Cameron
2024-06-24 15:22     ` Terry Bowman
2024-06-21 19:36   ` Dan Williams
2024-06-24 18:21     ` Terry Bowman
2024-06-24 21:46       ` Dan Williams
2024-06-25 14:41         ` Terry Bowman
2024-06-26  2:54   ` Li, Ming4
2024-06-26 13:39     ` Terry Bowman
2024-06-17 20:04 ` [RFC PATCH 4/9] cxl/pci: Map CXL PCIe ports' RAS registers Terry Bowman
2024-06-20 12:46   ` Jonathan Cameron
2024-06-24 15:51     ` Terry Bowman
2024-07-02 15:18       ` Jonathan Cameron
2024-06-26  3:39   ` Li, Ming4
2024-06-17 20:04 ` [RFC PATCH 5/9] cxl/pci: Update RAS handler interfaces to support CXL PCIe ports Terry Bowman
2024-06-20 12:49   ` Jonathan Cameron
2024-07-15 17:50   ` nifan.cxl
2024-06-17 20:04 ` [RFC PATCH 6/9] cxl/pci: Add trace logging for CXL PCIe port RAS errors Terry Bowman
2024-06-20 12:53   ` Jonathan Cameron
2024-06-24 15:53     ` Terry Bowman
2024-07-02 15:53       ` Jonathan Cameron
2024-06-17 20:04 ` [RFC PATCH 7/9] cxl/pci: Add atomic notifier callback for CXL PCIe port AER internal errors Terry Bowman
2024-06-20 13:09   ` Jonathan Cameron
2024-06-24 16:09     ` Terry Bowman
2024-07-02 15:58       ` Jonathan Cameron
2024-06-26  6:22   ` Li, Ming4
2024-06-26 13:51     ` Terry Bowman
2024-06-17 20:04 ` [RFC PATCH 8/9] PCI/AER: Export pci_aer_unmask_internal_errors() Terry Bowman
2024-06-19  7:09   ` Christoph Hellwig
2024-06-19 15:40     ` Terry Bowman
2024-06-20 13:11   ` Jonathan Cameron
2024-06-24 16:22     ` Terry Bowman
2024-07-10 21:47   ` Bjorn Helgaas
2024-06-17 20:04 ` [RFC PATCH 9/9] cxl/pci: Enable interrupts for CXL PCIe ports' AER internal errors Terry Bowman
2024-06-20 13:15   ` Jonathan Cameron
2024-06-24 16:46     ` Terry Bowman
2024-07-02 16:00       ` Jonathan Cameron
2024-06-21 19:04 ` [RFC PATCH 0/9] Add RAS support for CXL root ports, CXL downstream switch ports, and CXL upstream switch ports Dan Williams
2024-06-24 17:47   ` Terry Bowman
2024-06-24 20:51     ` Dan Williams
2024-06-25 14:29       ` Terry Bowman
2024-07-25 18:49 ` fan
2024-08-19 16:21   ` Terry Bowman
2024-08-19 18:17     ` Fan Ni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zo8x8YvdHSgcjGcR@debian \
    --to=nifan.cxl@gmail.com \
    --cc=Robert.Richter@amd.com \
    --cc=Terry.Bowman@amd.com \
    --cc=Yazen.Ghannam@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=ardb@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=ira.weiny@intel.com \
    --cc=jim.harris@samsung.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=ming4.li@intel.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.