From: Alejandro Lucero Palau <alucerop@amd.com>
To: Terry Bowman <terry.bowman@amd.com>,
linux-cxl@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-pci@vger.kernel.org, nifan.cxl@gmail.com,
ming4.li@intel.com, dave@stgolabs.net,
jonathan.cameron@huawei.com, dave.jiang@intel.com,
alison.schofield@intel.com, vishal.l.verma@intel.com,
dan.j.williams@intel.com, bhelgaas@google.com,
mahesh@linux.ibm.com, ira.weiny@intel.com, oohall@gmail.com,
Benjamin.Cheatham@amd.com, rrichter@amd.com,
nathan.fontenot@amd.com, Smita.KoralahalliChannabasappa@amd.com,
lukas@wunner.de, PradeepVineshReddy.Kodamati@amd.com
Subject: Re: [PATCH v4 13/15] cxl/pci: Add trace logging for CXL PCIe Port RAS errors
Date: Thu, 12 Dec 2024 09:46:19 +0000
Message-ID: <fe67ae94-1c12-288e-07ed-4391fead9949@amd.com>
In-Reply-To: <20241211234002.3728674-14-terry.bowman@amd.com>
On 12/11/24 23:40, Terry Bowman wrote:
> The CXL drivers use kernel trace functions for logging endpoint and RCH
> Downstream Port RAS errors. Similar functionality is required for CXL Root
> Ports, CXL Downstream Switch Ports, and CXL Upstream Switch Ports.
>
> Introduce trace logging functions for both RAS correctable and
> uncorrectable errors specific to CXL PCIe Ports. Additionally, update
> the PCIe Port error handlers to invoke these new trace functions.
>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
Reviewed-by: Alejandro Lucero <alucerop@amd.com>
> ---
> drivers/cxl/core/pci.c | 16 ++++++++++----
> drivers/cxl/core/trace.h | 47 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 59 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 52afaedf5171..3294ad5ff28f 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -661,10 +661,14 @@ static void __cxl_handle_cor_ras(struct device *dev,
>
>  	addr = ras_base + CXL_RAS_CORRECTABLE_STATUS_OFFSET;
>  	status = readl(addr);
> -	if (status & CXL_RAS_CORRECTABLE_STATUS_MASK) {
> -		writel(status & CXL_RAS_CORRECTABLE_STATUS_MASK, addr);
> +	if (!(status & CXL_RAS_CORRECTABLE_STATUS_MASK))
> +		return;
> +	writel(status & CXL_RAS_CORRECTABLE_STATUS_MASK, addr);
> +
> +	if (is_cxl_memdev(dev))
>  		trace_cxl_aer_correctable_error(to_cxl_memdev(dev), status);
> -	}
> +	else
> +		trace_cxl_port_aer_correctable_error(dev, status);
>  }
>
>  static void cxl_handle_endpoint_cor_ras(struct cxl_dev_state *cxlds)
> @@ -720,7 +724,11 @@ static bool __cxl_handle_ras(struct device *dev, void __iomem *ras_base)
>  	}
>
>  	header_log_copy(ras_base, hl);
> -	trace_cxl_aer_uncorrectable_error(to_cxl_memdev(dev), status, fe, hl);
> +	if (is_cxl_memdev(dev))
> +		trace_cxl_aer_uncorrectable_error(to_cxl_memdev(dev), status, fe, hl);
> +	else
> +		trace_cxl_port_aer_uncorrectable_error(dev, status, fe, hl);
> +
>  	writel(status & CXL_RAS_UNCORRECTABLE_STATUS_MASK, addr);
>
>  	return true;
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index 8389a94adb1a..681e415ac8f5 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -48,6 +48,34 @@
>  	{ CXL_RAS_UC_IDE_RX_ERR, "IDE Rx Error" } \
>  )
>
> +TRACE_EVENT(cxl_port_aer_uncorrectable_error,
> +	TP_PROTO(struct device *dev, u32 status, u32 fe, u32 *hl),
> +	TP_ARGS(dev, status, fe, hl),
> +	TP_STRUCT__entry(
> +		__string(devname, dev_name(dev))
> +		__string(host, dev_name(dev->parent))
> +		__field(u32, status)
> +		__field(u32, first_error)
> +		__array(u32, header_log, CXL_HEADERLOG_SIZE_U32)
> +	),
> +	TP_fast_assign(
> +		__assign_str(devname);
> +		__assign_str(host);
> +		__entry->status = status;
> +		__entry->first_error = fe;
> +		/*
> +		 * Embed the 512B headerlog data for user app retrieval and
> +		 * parsing, but no need to print this in the trace buffer.
> +		 */
> +		memcpy(__entry->header_log, hl, CXL_HEADERLOG_SIZE);
> +	),
> +	TP_printk("device=%s host=%s status: '%s' first_error: '%s'",
> +		__get_str(devname), __get_str(host),
> +		show_uc_errs(__entry->status),
> +		show_uc_errs(__entry->first_error)
> +	)
> +);
> +
>  TRACE_EVENT(cxl_aer_uncorrectable_error,
>  	TP_PROTO(const struct cxl_memdev *cxlmd, u32 status, u32 fe, u32 *hl),
>  	TP_ARGS(cxlmd, status, fe, hl),
> @@ -96,6 +124,25 @@ TRACE_EVENT(cxl_aer_uncorrectable_error,
>  	{ CXL_RAS_CE_PHYS_LAYER_ERR, "Received Error From Physical Layer" } \
>  )
>
> +TRACE_EVENT(cxl_port_aer_correctable_error,
> +	TP_PROTO(struct device *dev, u32 status),
> +	TP_ARGS(dev, status),
> +	TP_STRUCT__entry(
> +		__string(devname, dev_name(dev))
> +		__string(host, dev_name(dev->parent))
> +		__field(u32, status)
> +	),
> +	TP_fast_assign(
> +		__assign_str(devname);
> +		__assign_str(host);
> +		__entry->status = status;
> +	),
> +	TP_printk("device=%s host=%s status='%s'",
> +		__get_str(devname), __get_str(host),
> +		show_ce_errs(__entry->status)
> +	)
> +);
> +
>  TRACE_EVENT(cxl_aer_correctable_error,
>  	TP_PROTO(const struct cxl_memdev *cxlmd, u32 status),
>  	TP_ARGS(cxlmd, status),