From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Terry Bowman <terry.bowman@amd.com>
Cc: <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-pci@vger.kernel.org>, <nifan.cxl@gmail.com>,
<dave@stgolabs.net>, <dave.jiang@intel.com>,
<alison.schofield@intel.com>, <vishal.l.verma@intel.com>,
<dan.j.williams@intel.com>, <bhelgaas@google.com>,
<mahesh@linux.ibm.com>, <ira.weiny@intel.com>, <oohall@gmail.com>,
<Benjamin.Cheatham@amd.com>, <rrichter@amd.com>,
<nathan.fontenot@amd.com>,
<Smita.KoralahalliChannabasappa@amd.com>, <lukas@wunner.de>,
<ming.li@zohomail.com>, <PradeepVineshReddy.Kodamati@amd.com>,
<alucerop@amd.com>
Subject: Re: [PATCH v5 14/16] cxl/pci: Add trace logging for CXL PCIe Port RAS errors
Date: Tue, 14 Jan 2025 11:49:27 +0000 [thread overview]
Message-ID: <20250114114927.000022ef@huawei.com> (raw)
In-Reply-To: <20250107143852.3692571-15-terry.bowman@amd.com>
On Tue, 7 Jan 2025 08:38:50 -0600
Terry Bowman <terry.bowman@amd.com> wrote:
> The CXL drivers use kernel trace functions for logging endpoint and
> Restricted CXL host (RCH) Downstream Port RAS errors. Similar functionality
> is required for CXL Root Ports, CXL Downstream Switch Ports, and CXL
> Upstream Switch Ports.
>
> Introduce trace logging functions for both RAS correctable and
> uncorrectable errors specific to CXL PCIe Ports. Additionally, update
> the CXL Port Protocol Error handlers to invoke these new trace functions.
>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Reviewed-by: Alejandro Lucero <alucerop@amd.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
An example print in commit message would help understand what the tracepoints
look like.
Few more things inline following on from earlier comments.
Jonathan
> ---
> drivers/cxl/core/pci.c | 17 +++++++++++----
> drivers/cxl/core/trace.h | 47 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 60 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index 411834f7efe0..3e87fe54a1a2 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -663,10 +663,15 @@ static void __cxl_handle_cor_ras(struct device *dev,
>
> addr = ras_base + CXL_RAS_CORRECTABLE_STATUS_OFFSET;
> status = readl(addr);
> - if (status & CXL_RAS_CORRECTABLE_STATUS_MASK) {
> - writel(status & CXL_RAS_CORRECTABLE_STATUS_MASK, addr);
> + if (!(status & CXL_RAS_CORRECTABLE_STATUS_MASK))
> + return;
> +
> + writel(status & CXL_RAS_CORRECTABLE_STATUS_MASK, addr);
> +
> + if (is_cxl_memdev(dev))
As below. Drag to earlier patch.
> trace_cxl_aer_correctable_error(to_cxl_memdev(dev), status);
> - }
> + else
and perhaps check it's a port mostly for documentation purposes.
> + trace_cxl_port_aer_correctable_error(dev, status);
> }
>
> static void cxl_handle_endpoint_cor_ras(struct cxl_dev_state *cxlds)
> @@ -724,7 +729,11 @@ static bool __cxl_handle_ras(struct device *dev, void __iomem *ras_base)
> }
>
> header_log_copy(ras_base, hl);
> - trace_cxl_aer_uncorrectable_error(to_cxl_memdev(dev), status, fe, hl);
> + if (is_cxl_memdev(dev))
As mentioned above, drag this if to the earlier patch.
> + trace_cxl_aer_uncorrectable_error(to_cxl_memdev(dev), status, fe, hl);
> + else
For documentation purposes mostly I'd be tempted to have an is_cxl_port() check
before calling the following.
> + trace_cxl_port_aer_uncorrectable_error(dev, status, fe, hl);
> +
> writel(status & CXL_RAS_UNCORRECTABLE_STATUS_MASK, addr);
>
> return true;
> diff --git a/drivers/cxl/core/trace.h b/drivers/cxl/core/trace.h
> index 8389a94adb1a..681e415ac8f5 100644
> --- a/drivers/cxl/core/trace.h
> +++ b/drivers/cxl/core/trace.h
> @@ -48,6 +48,34 @@
> { CXL_RAS_UC_IDE_RX_ERR, "IDE Rx Error" } \
> )
>
> +TRACE_EVENT(cxl_port_aer_uncorrectable_error,
> + TP_PROTO(struct device *dev, u32 status, u32 fe, u32 *hl),
> + TP_ARGS(dev, status, fe, hl),
> + TP_STRUCT__entry(
> + __string(devname, dev_name(dev))
> + __string(host, dev_name(dev->parent))
What is host in this case? Perhaps a comment.
> + __field(u32, status)
> + __field(u32, first_error)
> + __array(u32, header_log, CXL_HEADERLOG_SIZE_U32)
> + ),
> + TP_fast_assign(
> + __assign_str(devname);
> + __assign_str(host);
> + __entry->status = status;
> + __entry->first_error = fe;
> + /*
> + * Embed the 512B headerlog data for user app retrieval and
> + * parsing, but no need to print this in the trace buffer.
> + */
> + memcpy(__entry->header_log, hl, CXL_HEADERLOG_SIZE);
> + ),
> + TP_printk("device=%s host=%s status: '%s' first_error: '%s'",
> + __get_str(devname), __get_str(host),
> + show_uc_errs(__entry->status),
> + show_uc_errs(__entry->first_error)
> + )
> +);
> +
> TRACE_EVENT(cxl_aer_uncorrectable_error,
> TP_PROTO(const struct cxl_memdev *cxlmd, u32 status, u32 fe, u32 *hl),
> TP_ARGS(cxlmd, status, fe, hl),
> @@ -96,6 +124,25 @@ TRACE_EVENT(cxl_aer_uncorrectable_error,
> { CXL_RAS_CE_PHYS_LAYER_ERR, "Received Error From Physical Layer" } \
> )
>
> +TRACE_EVENT(cxl_port_aer_correctable_error,
> + TP_PROTO(struct device *dev, u32 status),
> + TP_ARGS(dev, status),
> + TP_STRUCT__entry(
> + __string(devname, dev_name(dev))
> + __string(host, dev_name(dev->parent))
> + __field(u32, status)
> + ),
> + TP_fast_assign(
> + __assign_str(devname);
> + __assign_str(host);
> + __entry->status = status;
> + ),
> + TP_printk("device=%s host=%s status='%s'",
> + __get_str(devname), __get_str(host),
> + show_ce_errs(__entry->status)
> + )
> +);
> +
> TRACE_EVENT(cxl_aer_correctable_error,
> TP_PROTO(const struct cxl_memdev *cxlmd, u32 status),
> TP_ARGS(cxlmd, status),
next prev parent reply other threads:[~2025-01-14 11:49 UTC|newest]
Thread overview: 96+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-07 14:38 [PATCH v5 0/16] Enable CXL PCIe port protocol error handling and logging Terry Bowman
2025-01-07 14:38 ` [PATCH v5 01/16] PCI/AER: Introduce 'struct cxl_err_handlers' and add to 'struct pci_driver' Terry Bowman
2025-01-13 23:45 ` Ira Weiny
2025-02-06 17:01 ` Gregory Price
2025-02-07 18:35 ` Bowman, Terry
2025-01-07 14:38 ` [PATCH v5 02/16] PCI/AER: Rename AER driver's interfaces to also indicate CXL PCIe Port support Terry Bowman
2025-01-13 23:45 ` Ira Weiny
2025-02-06 17:02 ` Gregory Price
2025-01-07 14:38 ` [PATCH v5 03/16] CXL/PCI: Introduce PCIe helper functions pcie_is_cxl() and pcie_is_cxl_port() Terry Bowman
2025-01-13 23:49 ` Ira Weiny
2025-01-14 15:19 ` Bowman, Terry
2025-01-14 23:33 ` Ira Weiny
2025-01-14 23:39 ` Bowman, Terry
2025-01-16 15:35 ` Ira Weiny
2025-01-15 10:03 ` Lukas Wunner
2025-01-07 14:38 ` [PATCH v5 04/16] PCI/AER: Modify AER driver logging to report CXL or PCIe bus error type Terry Bowman
2025-01-13 23:51 ` Ira Weiny
2025-02-06 18:18 ` Gregory Price
2025-02-07 18:50 ` Bowman, Terry
2025-01-07 14:38 ` [PATCH v5 05/16] PCI/AER: Add CXL PCIe Port correctable error support in AER service driver Terry Bowman
2025-01-14 6:54 ` Li Ming
2025-01-14 11:20 ` Jonathan Cameron
2025-01-14 20:10 ` Bowman, Terry
2025-01-14 19:29 ` Bowman, Terry
2025-01-15 1:18 ` Li Ming
2025-01-15 14:39 ` Bowman, Terry
2025-01-16 3:15 ` Li Ming
2025-02-05 3:46 ` Bowman, Terry
2025-02-05 13:58 ` Li Ming
2025-02-05 14:22 ` Bowman, Terry
2025-01-14 16:35 ` Ira Weiny
2025-02-06 18:33 ` Gregory Price
2025-02-07 17:54 ` Jonathan Cameron
2025-02-07 19:05 ` Bowman, Terry
2025-01-07 14:38 ` [PATCH v5 06/16] PCI/AER: Change AER driver to read UCE fatal status for all CXL PCIe Port devices Terry Bowman
2025-01-14 11:32 ` Jonathan Cameron
2025-01-14 20:44 ` Bowman, Terry
2025-01-28 20:25 ` Bowman, Terry
2025-01-29 18:04 ` Jonathan Cameron
2025-01-14 16:57 ` Ira Weiny
2025-01-07 14:38 ` [PATCH v5 07/16] PCI/AER: Add CXL PCIe Port uncorrectable error recovery in AER service driver Terry Bowman
2025-01-14 11:33 ` Jonathan Cameron
2025-01-14 20:28 ` Bowman, Terry
2025-01-15 11:37 ` Jonathan Cameron
2025-01-14 17:27 ` Ira Weiny
2025-01-07 14:38 ` [PATCH v5 08/16] cxl/pci: Map CXL PCIe Root Port and Downstream Switch Port RAS registers Terry Bowman
2025-01-14 21:37 ` Ira Weiny
2025-02-07 7:30 ` Gregory Price
2025-02-07 19:08 ` Bowman, Terry
2025-02-07 19:39 ` Gregory Price
2025-01-07 14:38 ` [PATCH v5 09/16] cxl/pci: Map CXL PCIe Upstream " Terry Bowman
2025-01-14 11:35 ` Jonathan Cameron
2025-01-14 15:24 ` Bowman, Terry
2025-01-14 22:02 ` Ira Weiny
2025-01-14 22:11 ` Bowman, Terry
2025-01-14 23:38 ` Ira Weiny
2025-01-14 23:49 ` Bowman, Terry
2025-01-15 11:40 ` Jonathan Cameron
2025-02-07 7:35 ` Gregory Price
2025-01-07 14:38 ` [PATCH v5 10/16] cxl/pci: Update RAS handler interfaces to also support CXL PCIe Ports Terry Bowman
2025-01-14 11:39 ` Jonathan Cameron
2025-01-14 22:20 ` Ira Weiny
2025-02-07 7:38 ` Gregory Price
2025-01-07 14:38 ` [PATCH v5 11/16] cxl/pci: Add log message for umnapped registers in existing RAS handlers Terry Bowman
2025-01-14 11:41 ` Jonathan Cameron
2025-01-14 22:21 ` Ira Weiny
2025-02-07 7:39 ` Gregory Price
2025-01-07 14:38 ` [PATCH v5 12/16] cxl/pci: Change find_cxl_port() to non-static Terry Bowman
2025-01-14 22:23 ` Ira Weiny
2025-02-07 7:45 ` Gregory Price
2025-01-07 14:38 ` [PATCH v5 13/16] cxl/pci: Add error handler for CXL PCIe Port RAS errors Terry Bowman
2025-01-14 11:46 ` Jonathan Cameron
2025-01-14 21:20 ` Bowman, Terry
2025-01-14 22:51 ` Ira Weiny
2025-01-14 23:10 ` Bowman, Terry
2025-01-14 23:42 ` Bowman, Terry
2025-02-07 8:01 ` Gregory Price
2025-02-07 19:23 ` Bowman, Terry
2025-02-07 19:41 ` Gregory Price
2025-02-07 21:04 ` Bowman, Terry
2025-01-07 14:38 ` [PATCH v5 14/16] cxl/pci: Add trace logging " Terry Bowman
2025-01-14 11:49 ` Jonathan Cameron [this message]
2025-01-14 20:56 ` Bowman, Terry
2025-01-15 11:42 ` Jonathan Cameron
2025-01-14 22:58 ` Ira Weiny
2025-01-07 14:38 ` [PATCH v5 15/16] cxl/pci: Add support to assign and clear pci_driver::cxl_err_handlers Terry Bowman
2025-01-14 11:51 ` Jonathan Cameron
2025-01-14 23:03 ` Ira Weiny
2025-02-07 8:08 ` Gregory Price
2025-01-07 14:38 ` [PATCH v5 16/16] PCI/AER: Enable internal errors for CXL Upstream and Downstream Switch Ports Terry Bowman
2025-01-14 23:26 ` Ira Weiny
2025-01-14 23:34 ` Bowman, Terry
2025-01-14 23:45 ` Ira Weiny
2025-01-15 0:09 ` Bowman, Terry
2025-01-15 0:20 ` Bowman, Terry
2025-01-16 21:42 ` Ira Weiny
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250114114927.000022ef@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=Benjamin.Cheatham@amd.com \
--cc=PradeepVineshReddy.Kodamati@amd.com \
--cc=Smita.KoralahalliChannabasappa@amd.com \
--cc=alison.schofield@intel.com \
--cc=alucerop@amd.com \
--cc=bhelgaas@google.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=mahesh@linux.ibm.com \
--cc=ming.li@zohomail.com \
--cc=nathan.fontenot@amd.com \
--cc=nifan.cxl@gmail.com \
--cc=oohall@gmail.com \
--cc=rrichter@amd.com \
--cc=terry.bowman@amd.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.