All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Terry Bowman <terry.bowman@amd.com>
Cc: <dave@stgolabs.net>, <dave.jiang@intel.com>,
	<alison.schofield@intel.com>, <dan.j.williams@intel.com>,
	<bhelgaas@google.com>, <shiju.jose@huawei.com>,
	<ming.li@zohomail.com>, <Smita.KoralahalliChannabasappa@amd.com>,
	<rrichter@amd.com>, <dan.carpenter@linaro.org>,
	<PradeepVineshReddy.Kodamati@amd.com>, <lukas@wunner.de>,
	<Benjamin.Cheatham@amd.com>,
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	<linux-cxl@vger.kernel.org>, <alucerop@amd.com>,
	<ira.weiny@intel.com>, <linux-kernel@vger.kernel.org>,
	<linux-pci@vger.kernel.org>
Subject: Re: [PATCH v11 06/23] CXL/AER: Introduce rch_aer.c into AER driver for handling CXL RCH errors
Date: Wed, 10 Sep 2025 13:43:34 +0100	[thread overview]
Message-ID: <20250910134334.000062b5@huawei.com> (raw)
In-Reply-To: <20250827013539.903682-7-terry.bowman@amd.com>

On Tue, 26 Aug 2025 20:35:21 -0500
Terry Bowman <terry.bowman@amd.com> wrote:

> The restricted CXL Host (RCH) AER error handling logic currently resides
> in the AER driver file, drivers/pci/pcie/aer.c. CXL specific changes are
> conditionally compiled using #ifdefs.
> 
> Improve the AER driver maintainability by separating the RCH specific logic
> from the AER driver's core functionality and removing the ifdefs. Introduce
> drivers/pci/pcie/rch_aer.c for moving the RCH AER logic into.
> 
> Move the CXL logic into the new file but leave helper functions in aer.c
> for now as they will be moved in future patch for CXL virtual hierarchy
> handling.
> 
> 2 changes are required to maintain compilation after the move. Change
> cxl_rch_handle_error() & cxl_rch_enable_rcec() to be non-static inorder for
> accessing from the AER driver in aer.c.
> 
> Introduce CONFIG_CXL_RCH_RAS in cxl/Kconfig. Update pcie/pcie/Makefile to
> conditionally compile rch_aer.c file using CONFIG_CXL_RCH_RAS.
> 
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Hi Terry,

Sorry it took me so long to get back this.

Anyhow a few 'scope' of export questions inline.

Jonathan


> 
> ---
> Changes in v10->v11:
> - Remove changes in code-split and move to earlier, new patch
> - Add #include <linux/bitfield.h> to cxl_ras.c
> - Move cxl_rch_handle_error() & cxl_rch_enable_rcec() declarations from pci.h
> to aer.h, more localized.
> - Introduce CONFIG_CXL_RCH_RAS, includes Makefile changes, ras.c ifdef changes
> ---
>  drivers/cxl/Kconfig        |   9 +++-
>  drivers/cxl/core/ras.c     |   3 ++
>  drivers/pci/pci.h          |  20 +++++++
>  drivers/pci/pcie/Makefile  |   1 +
>  drivers/pci/pcie/aer.c     | 108 +++----------------------------------
>  drivers/pci/pcie/rch_aer.c |  99 ++++++++++++++++++++++++++++++++++
>  6 files changed, 138 insertions(+), 102 deletions(-)
>  create mode 100644 drivers/pci/pcie/rch_aer.c
> 
> diff --git a/drivers/cxl/Kconfig b/drivers/cxl/Kconfig
> index 1c7c8989fd8b..028201e24523 100644
> --- a/drivers/cxl/Kconfig
> +++ b/drivers/cxl/Kconfig
> @@ -235,5 +235,12 @@ config CXL_MCE
>  
>  config CXL_RAS
>  	def_bool y
> -	depends on ACPI_APEI_GHES && PCIEAER_CXL
> +	depends on ACPI_APEI_GHES && PCIEAER && CXL_PCI
> +
> +config CXL_RCH_RAS
> +	bool "CXL: Restricted CXL Host (RCH) protocol error handling"
> +	def_bool n

Isn't that the default anyway?  So probably drop that explicit default.


> +	depends on CXL_RAS
> +	help
> +	  RAS support for Restricted CXL Host (RCH) defined in CXL1.1.
>  endif

> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 12215ee72afb..c8a0c0ec0073 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h

> +#ifdef CONFIG_CXL_RAS
> +void pci_aer_unmask_internal_errors(struct pci_dev *dev);
> +bool cxl_error_is_native(struct pci_dev *dev);
> +bool is_internal_error(struct aer_err_info *info);
> +#else
> +static inline void pci_aer_unmask_internal_errors(struct pci_dev *dev) { }
> +static inline bool cxl_error_is_native(struct pci_dev *dev) { return false; }
> +static inline bool is_internal_error(struct aer_err_info *info) { return false; }

For me the ifdef makes sense for the cxl specific one, but not the other two
which I think are reasonable interfaces to expose more generally.

> +#endif
> +
>  #endif /* DRIVERS_PCI_H */
> diff --git a/drivers/pci/pcie/Makefile b/drivers/pci/pcie/Makefile
> index 173829aa02e6..07c299dbcdd7 100644
> --- a/drivers/pci/pcie/Makefile
> +++ b/drivers/pci/pcie/Makefile
> @@ -8,6 +8,7 @@ obj-$(CONFIG_PCIEPORTBUS)	+= pcieportdrv.o bwctrl.o
>  
>  obj-y				+= aspm.o
>  obj-$(CONFIG_PCIEAER)		+= aer.o err.o tlp.o
> +obj-$(CONFIG_CXL_RCH_RAS)	+= rch_aer.o
>  obj-$(CONFIG_PCIEAER_INJECT)	+= aer_inject.o
>  obj-$(CONFIG_PCIE_PME)		+= pme.o
>  obj-$(CONFIG_PCIE_DPC)		+= dpc.o
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 7fe9f883f5c5..29de7ee861f7 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1098,7 +1098,7 @@ static bool find_source_device(struct pci_dev *parent,
>   * Note: AER must be enabled and supported by the device which must be
>   * checked in advance, e.g. with pcie_aer_is_native().
>   */
> -static void pci_aer_unmask_internal_errors(struct pci_dev *dev)
> +void pci_aer_unmask_internal_errors(struct pci_dev *dev)
>  {
>  	int aer = dev->aer_cap;
>  	u32 mask;
> @@ -1111,119 +1111,25 @@ static void pci_aer_unmask_internal_errors(struct pci_dev *dev)
>  	mask &= ~PCI_ERR_COR_INTERNAL;
>  	pci_write_config_dword(dev, aer + PCI_ERR_COR_MASK, mask);
>  }
> +EXPORT_SYMBOL_NS_GPL(pci_aer_unmask_internal_errors, "CXL");
Why put this one in the CXL namespace? 

Maybe check the others as well. For instance is_internal_error()
doesn't feel CXL specific either.


>  
>  /**
>   * pci_aer_handle_error - handle logging error into an event log

> diff --git a/drivers/pci/pcie/rch_aer.c b/drivers/pci/pcie/rch_aer.c
> new file mode 100644
> index 000000000000..bfe071eebf67
> --- /dev/null
> +++ b/drivers/pci/pcie/rch_aer.c


> +static int cxl_rch_handle_error_iter(struct pci_dev *dev, void *data)
> +{
> +	struct aer_err_info *info = (struct aer_err_info *)data;
> +	const struct pci_error_handlers *err_handler;
> +
> +	if (!is_cxl_mem_dev(dev) || !cxl_error_is_native(dev))
> +		return 0;
> +
> +	/* Protect dev->driver */
> +	device_lock(&dev->dev);

Probably not one to bury in this patch (maybe you do it later in which
case ignore this) but given we are touching the code,
	guard(device)(&dev->dev);

to allow early returns and no need to have the goto.

> +
> +	err_handler = dev->driver ? dev->driver->err_handler : NULL;
> +	if (!err_handler)
> +		goto out;
> +
> +	if (info->severity == AER_CORRECTABLE) {
> +		if (err_handler->cor_error_detected)
> +			err_handler->cor_error_detected(dev);
> +	} else if (err_handler->error_detected) {
> +		if (info->severity == AER_NONFATAL)
> +			err_handler->error_detected(dev, pci_channel_io_normal);
> +		else if (info->severity == AER_FATAL)
> +			err_handler->error_detected(dev, pci_channel_io_frozen);
> +	}
> +out:
> +	device_unlock(&dev->dev);
> +	return 0;
> +}


  parent reply	other threads:[~2025-09-10 12:43 UTC|newest]

Thread overview: 89+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-27  1:35 [PATCH v11 00/23] Enable CXL PCIe Port Protocol Error handling and logging Terry Bowman
2025-08-27  1:35 ` [PATCH v11 01/23] cxl: Remove ifdef blocks of CONFIG_PCIEAER_CXL from core/pci.c Terry Bowman
2025-08-27  1:35 ` [PATCH v11 02/23] CXL/AER: Remove CONFIG_PCIEAER_CXL and replace with CONFIG_CXL_RAS Terry Bowman
2025-08-29 15:24   ` Jonathan Cameron
2025-08-29 18:16   ` Sathyanarayanan Kuppuswamy
2025-08-27  1:35 ` [PATCH v11 03/23] cxl/pci: Remove unnecessary CXL Endpoint handling helper functions Terry Bowman
2025-08-28 15:28   ` Dave Jiang
2025-08-27  1:35 ` [PATCH v11 04/23] cxl/pci: Remove unnecessary CXL RCH " Terry Bowman
2025-08-28  8:35   ` Alejandro Lucero Palau
2025-08-28 17:32   ` Dave Jiang
2025-08-27  1:35 ` [PATCH v11 05/23] cxl: Move CXL driver RCH error handling into CONFIG_CXL_RCH_RAS conditional block Terry Bowman
2025-08-28  8:57   ` Alejandro Lucero Palau
2025-09-10 16:43     ` Bowman, Terry
2025-08-29 15:33   ` Jonathan Cameron
2025-09-11 17:48     ` Bowman, Terry
2025-09-11 19:41       ` Dave Jiang
2025-09-15 13:32         ` Bowman, Terry
2025-08-27  1:35 ` [PATCH v11 06/23] CXL/AER: Introduce rch_aer.c into AER driver for handling CXL RCH errors Terry Bowman
2025-08-28 20:53   ` Dave Jiang
2025-08-29  8:39     ` Lukas Wunner
2025-09-10 17:01       ` Bowman, Terry
2025-09-10 17:26         ` Dave Jiang
2025-09-12 13:59       ` Bowman, Terry
2025-09-12 19:09         ` Lukas Wunner
2025-09-10 16:56     ` Bowman, Terry
2025-09-10 12:43   ` Jonathan Cameron [this message]
2025-08-27  1:35 ` [PATCH v11 07/23] CXL/PCI: Move CXL DVSEC definitions into uapi/linux/pci_regs.h Terry Bowman
2025-08-27 14:51   ` Lukas Wunner
2025-08-29 15:42     ` Jonathan Cameron
2025-08-29 15:47     ` Jonathan Cameron
2025-08-28 21:07   ` Dave Jiang
2025-09-10 18:11     ` Bowman, Terry
2025-09-10 20:06       ` Dave Jiang
2025-08-27  1:35 ` [PATCH v11 08/23] PCI/CXL: Introduce pcie_is_cxl() Terry Bowman
2025-08-28  8:18   ` Alejandro Lucero Palau
2025-09-10 16:24     ` Bowman, Terry
2025-09-11  3:48       ` Lukas Wunner
2025-09-10 13:10   ` Jonathan Cameron
2025-08-27  1:35 ` [PATCH v11 09/23] PCI/AER: Report CXL or PCIe bus error type in trace logging Terry Bowman
2025-08-27  7:37   ` Lukas Wunner
2025-09-10 15:26     ` Bowman, Terry
2025-09-10 15:33       ` Lukas Wunner
2025-09-11  5:07   ` Lukas Wunner
2025-08-27  1:35 ` [PATCH v11 10/23] CXL/AER: Update PCI class code check to use FIELD_GET() Terry Bowman
2025-08-29 16:03   ` Jonathan Cameron
2025-09-11 18:18     ` Bowman, Terry
2025-08-27  1:35 ` [PATCH v11 11/23] cxl/pci: Update RAS handler interfaces to also support CXL Ports Terry Bowman
2025-08-27  1:35 ` [PATCH v11 12/23] cxl/pci: Log message if RAS registers are unmapped Terry Bowman
2025-08-27  1:35 ` [PATCH v11 13/23] cxl/pci: Unify CXL trace logging for CXL Endpoints and CXL Ports Terry Bowman
2025-08-27 11:55   ` Shiju Jose
2025-08-29 16:06     ` Jonathan Cameron
2025-08-27  1:35 ` [PATCH v11 14/23] cxl/pci: Update cxl_handle_cor_ras() to return early if no RAS errors Terry Bowman
2025-08-27  1:35 ` [PATCH v11 15/23] cxl/pci: Map CXL Endpoint Port and CXL Switch Port RAS registers Terry Bowman
2025-08-28 23:05   ` Dave Jiang
2025-09-10 18:40     ` Bowman, Terry
2025-08-27  1:35 ` [PATCH v11 16/23] cxl/pci: Introduce CXL Endpoint protocol error handlers Terry Bowman
2025-08-27  7:48   ` Lukas Wunner
2025-09-10 13:23   ` Jonathan Cameron
2025-08-27  1:35 ` [PATCH v11 17/23] CXL/AER: Introduce cxl_aer.c into AER driver for forwarding CXL errors Terry Bowman
2025-08-27  7:56   ` Lukas Wunner
2025-08-27  1:35 ` [PATCH v11 18/23] PCI/AER: Dequeue forwarded CXL error Terry Bowman
2025-08-29  0:43   ` Dave Jiang
2025-08-29  7:10     ` Lukas Wunner
2025-09-16 15:18       ` Bowman, Terry
2025-09-11 14:33     ` Bowman, Terry
2025-09-11 15:41       ` Dave Jiang
2025-09-11 16:47         ` Bowman, Terry
2025-09-11 19:45           ` Dave Jiang
2025-09-10 13:29   ` Jonathan Cameron
2025-08-27  1:35 ` [PATCH v11 19/23] CXL/PCI: Introduce CXL Port protocol error handlers Terry Bowman
2025-08-30  0:17   ` Dave Jiang
2025-08-27  1:35 ` [PATCH v11 20/23] CXL/PCI: Export and rename merge_result() to pci_ers_merge_result() Terry Bowman
2025-08-27  8:04   ` Lukas Wunner
2025-09-10 15:57     ` Bowman, Terry
2025-09-11  3:44       ` Lukas Wunner
2025-08-27 12:19   ` kernel test robot
2025-08-27  1:35 ` [PATCH v11 21/23] CXL/PCI: Introduce CXL uncorrectable protocol error recovery Terry Bowman
2025-09-03 22:30   ` Dave Jiang
2025-09-11 19:19     ` Bowman, Terry
2025-09-11 19:48       ` Dave Jiang
2025-09-10 13:33   ` Jonathan Cameron
2025-08-27  1:35 ` [PATCH v11 22/23] CXL/PCI: Enable CXL protocol errors during CXL Port probe Terry Bowman
2025-09-03 23:23   ` Dave Jiang
2025-08-27  1:35 ` [PATCH v11 23/23] CXL/PCI: Disable CXL protocol error interrupts during CXL Port cleanup Terry Bowman
2025-09-10 15:07   ` Gregory Price
2025-09-11 20:19     ` Bowman, Terry
2025-08-29  0:07 ` [PATCH v11 00/23] Enable CXL PCIe Port Protocol Error handling and logging Dave Jiang
2025-09-23  3:29 ` Gregory Price
2025-09-23  9:21   ` Srinivasulu Thanneeru

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250910134334.000062b5@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=Benjamin.Cheatham@amd.com \
    --cc=PradeepVineshReddy.Kodamati@amd.com \
    --cc=Smita.KoralahalliChannabasappa@amd.com \
    --cc=alison.schofield@intel.com \
    --cc=alucerop@amd.com \
    --cc=bhelgaas@google.com \
    --cc=dan.carpenter@linaro.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=dave@stgolabs.net \
    --cc=ira.weiny@intel.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=ming.li@zohomail.com \
    --cc=rrichter@amd.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=shiju.jose@huawei.com \
    --cc=terry.bowman@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.