From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Terry Bowman <terry.bowman@amd.com>
Cc: <dave@stgolabs.net>, <dave.jiang@intel.com>,
	<alison.schofield@intel.com>, <dan.j.williams@intel.com>,
	<bhelgaas@google.com>, <shiju.jose@huawei.com>,
	<ming.li@zohomail.com>, <Smita.KoralahalliChannabasappa@amd.com>,
	<rrichter@amd.com>, <dan.carpenter@linaro.org>,
	<PradeepVineshReddy.Kodamati@amd.com>, <lukas@wunner.de>,
	<Benjamin.Cheatham@amd.com>,
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	<linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-pci@vger.kernel.org>
Subject: Re: [PATCH v10 14/17] cxl/pci: Introduce CXL Endpoint protocol error handlers
Date: Fri, 27 Jun 2025 12:52:03 +0100
Message-ID: <20250627125203.00002564@huawei.com>
In-Reply-To: <20250626224252.1415009-15-terry.bowman@amd.com>

On Thu, 26 Jun 2025 17:42:49 -0500
Terry Bowman <terry.bowman@amd.com> wrote:

> CXL Endpoint protocol errors are currently handled using PCI error
> handlers. The CXL Endpoint requires CXL specific handling in the case of
> uncorrectable error (UCE) handling not provided by the PCI handlers.
> 
> Add CXL specific handlers for CXL Endpoints. Rename the existing
> cxl_error_handlers to be pci_error_handlers to more correctly indicate
> the error type and follow naming consistency.
> 
> The PCI handlers will be called if the CXL device is not trained for
> alternate protocol (CXL). Update the CXL Endpoint PCI handlers to call the
> CXL UCE handlers.
> 
> The existing EP UCE handler includes checks for various results. These are
> no longer needed because CXL UCE recovery will not be attempted. Implement
> cxl_handle_ras() to return PCI_ERS_RESULT_NONE or PCI_ERS_RESULT_PANIC. The
> CXL UCE handler is called by cxl_do_recovery() that acts on the return
> value. In the case of the PCI handler path, call panic() if the result is
> PCI_ERS_RESULT_PANIC.
> 
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

A few minor comments inline.

J
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 887b54cf3395..7209ffb5c2fe 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c


>  
> -	scoped_guard(device, dev) {
> -		if (!dev->driver) {
> +pci_ers_result_t cxl_error_detected(struct device *dev)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> +	struct device *cxlmd_dev = &cxlds->cxlmd->dev;
> +	pci_ers_result_t ue;
> +
> +	scoped_guard(device, cxlmd_dev) {
I think nothing much happens after this (unless something is introduced in
later patches, in which case ignore this comment).

So can you just use guard() and reduce the indent of the rest?
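
For illustration only, a minimal userspace sketch of the guard() suggestion
above. It does not use the kernel's cleanup.h; struct fake_dev, FAKE_GUARD()
and handle_error() are hypothetical stand-ins built on the GCC/Clang cleanup
attribute, which is the mechanism the kernel helpers rely on. The point is
only that a function-scope guard keeps the body at one indent level and lets
every path return directly.

```c
/* Hypothetical stand-in for a device and its lock (not a kernel type). */
struct fake_dev {
	int locked;	/* lock depth, so the caller can check it was dropped */
	int has_driver;	/* mimics the "is a driver bound" check */
};

static void fake_lock(struct fake_dev *d)   { d->locked++; }
static void fake_unlock(struct fake_dev *d) { d->locked--; }

/* Cleanup callback: runs when the guard variable goes out of scope. */
static void fake_guard_release(struct fake_dev **d)
{
	if (*d)
		fake_unlock(*d);
}

/* Rough analogue of guard(): take the lock now, drop it at scope exit. */
#define FAKE_GUARD(d) \
	struct fake_dev *guard_ \
		__attribute__((cleanup(fake_guard_release), unused)) = \
		(fake_lock(d), (d))

enum result { RESULT_NONE, RESULT_PANIC };

static enum result handle_error(struct fake_dev *d)
{
	FAKE_GUARD(d);	/* held until the function returns, no nested block */

	if (!d->has_driver)
		return RESULT_PANIC;	/* lock is released on this path too */

	return RESULT_NONE;
}
```

With a scoped block the same body sits one level deeper and the result has to
be carried out through a local variable, which is what the patch does with ue.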

> +
> +		if (!cxlmd_dev->driver) {
>  			dev_warn(&pdev->dev,
>  				 "%s: memdev disabled, abort error handling\n",
>  				 dev_name(dev));
> -			return PCI_ERS_RESULT_DISCONNECT;
> +			return PCI_ERS_RESULT_PANIC;
>  		}
>  
>  		if (cxlds->rcd)
> @@ -881,29 +888,23 @@ pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
>  		ue = cxl_handle_ras(&cxlds->cxlmd->dev, cxlds->serial, cxlds->regs.ras);

It is a little hard to tell from this code blob, but can you return here?

>  	}
>  
> -
> -	switch (state) {
> -	case pci_channel_io_normal:
> -		if (ue) {
> -			device_release_driver(dev);
> -			return PCI_ERS_RESULT_NEED_RESET;
> -		}
> -		return PCI_ERS_RESULT_CAN_RECOVER;
> -	case pci_channel_io_frozen:
> -		dev_warn(&pdev->dev,
> -			 "%s: frozen state error detected, disable CXL.mem\n",
> -			 dev_name(dev));
> -		device_release_driver(dev);
> -		return PCI_ERS_RESULT_NEED_RESET;
> -	case pci_channel_io_perm_failure:
> -		dev_warn(&pdev->dev,
> -			 "failure state error detected, request disconnect\n");
> -		return PCI_ERS_RESULT_DISCONNECT;
> -	}
> -	return PCI_ERS_RESULT_NEED_RESET;
> +	return ue;
>  }
>  EXPORT_SYMBOL_NS_GPL(cxl_error_detected, "CXL");
