From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <linux-pci@vger.kernel.org>,
<dan.j.williams@intel.com>, <ira.weiny@intel.com>,
<vishal.l.verma@intel.com>, <alison.schofield@intel.com>,
<rostedt@goodmis.org>, <terry.bowman@amd.com>,
<bhelgaas@google.com>
Subject: Re: [PATCH v3 08/11] cxl/pci: add tracepoint events for CXL RAS
Date: Mon, 21 Nov 2022 11:37:43 +0000
Message-ID: <20221121113743.000075b6@Huawei.com>
In-Reply-To: <166879132997.674819.12112190531427523276.stgit@djiang5-desk3.ch.intel.com>
On Fri, 18 Nov 2022 10:08:49 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> Add tracepoint events for recording the CXL uncorrectable and correctable
> errors. For uncorrectable errors, there is additional data of 512B from
> the header log register (CXL spec rev3 8.2.4.16.7). The trace event
> takes a dynamic array that holds the entire Header Log data. If
> multiple errors are set in the status register, then the
> 'first error' field (CXL spec rev3 8.2.4.16.6) is read from the Error
> Capabilities and Control Register in order to determine the error.
>
> This implementation does not include CXL IDE Error details.
>
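As a side note on the 'first error' selection described above, here is a
minimal userspace sketch of how I read that logic. It is an illustration
only: pick_first_error(), count_bits() and FE_PTR_MASK are hypothetical
names, and the field position (low 6 bits of the control register) is an
assumption that should be checked against the spec section cited above.

```c
#include <stdint.h>

/* Assumption: the First Error Pointer sits in bits 5:0 of the control register. */
#define FE_PTR_MASK 0x3fu

/* Count the set bits in v (clears the lowest set bit each iteration). */
static unsigned int count_bits(uint32_t v)
{
	unsigned int n = 0;

	while (v) {
		v &= v - 1;
		n++;
	}
	return n;
}

/*
 * Return a one-bit mask naming the error to report.
 * With zero or one status bit set, the status itself is unambiguous.
 * With several bits set, use the pointer field to pick the first one.
 */
static uint32_t pick_first_error(uint32_t status, uint32_t cap_control)
{
	uint32_t ptr = cap_control & FE_PTR_MASK;

	if (count_bits(status) <= 1)
		return status;

	/* Guard against a pointer that does not fit in a 32-bit status word. */
	if (ptr >= 32)
		return 0;

	return UINT32_C(1) << ptr;
}
```

With status 0x11 and a pointer value of 4 this returns 0x10, which is the
kind of value show_uc_errs() can then turn into a single name.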
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
With the issues Steven raised tidied up, this looks good to me now.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> drivers/cxl/pci.c | 2 +
> include/trace/events/cxl.h | 110 ++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 112 insertions(+)
> create mode 100644 include/trace/events/cxl.h
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 9428f3e0d99b..0f36a5861a7b 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -13,6 +13,8 @@
> #include "cxlmem.h"
> #include "cxlpci.h"
> #include "cxl.h"
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/cxl.h>
>
> /**
> * DOC: cxl pci
> diff --git a/include/trace/events/cxl.h b/include/trace/events/cxl.h
> new file mode 100644
> index 000000000000..f8e95d977133
> --- /dev/null
> +++ b/include/trace/events/cxl.h
> @@ -0,0 +1,110 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM cxl
> +
> +#if !defined(_CXL_EVENTS_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _CXL_EVENTS_H
> +
> +#include <linux/tracepoint.h>
> +
> +#define CXL_HEADERLOG_SIZE SZ_512
> +#define CXL_HEADERLOG_SIZE_U32 (SZ_512 / sizeof(u32))
> +
> +#define CXL_RAS_UC_CACHE_DATA_PARITY BIT(0)
> +#define CXL_RAS_UC_CACHE_ADDR_PARITY BIT(1)
> +#define CXL_RAS_UC_CACHE_BE_PARITY BIT(2)
> +#define CXL_RAS_UC_CACHE_DATA_ECC BIT(3)
> +#define CXL_RAS_UC_MEM_DATA_PARITY BIT(4)
> +#define CXL_RAS_UC_MEM_ADDR_PARITY BIT(5)
> +#define CXL_RAS_UC_MEM_BE_PARITY BIT(6)
> +#define CXL_RAS_UC_MEM_DATA_ECC BIT(7)
> +#define CXL_RAS_UC_REINIT_THRESH BIT(8)
> +#define CXL_RAS_UC_RSVD_ENCODE BIT(9)
> +#define CXL_RAS_UC_POISON BIT(10)
> +#define CXL_RAS_UC_RECV_OVERFLOW BIT(11)
> +#define CXL_RAS_UC_INTERNAL_ERR BIT(14)
> +#define CXL_RAS_UC_IDE_TX_ERR BIT(15)
> +#define CXL_RAS_UC_IDE_RX_ERR BIT(16)
> +
> +#define show_uc_errs(status) __print_flags(status, " | ", \
> + { CXL_RAS_UC_CACHE_DATA_PARITY, "Cache Data Parity Error" }, \
> + { CXL_RAS_UC_CACHE_ADDR_PARITY, "Cache Address Parity Error" }, \
> + { CXL_RAS_UC_CACHE_BE_PARITY, "Cache Byte Enable Parity Error" }, \
> + { CXL_RAS_UC_CACHE_DATA_ECC, "Cache Data ECC Error" }, \
> + { CXL_RAS_UC_MEM_DATA_PARITY, "Memory Data Parity Error" }, \
> + { CXL_RAS_UC_MEM_ADDR_PARITY, "Memory Address Parity Error" }, \
> + { CXL_RAS_UC_MEM_BE_PARITY, "Memory Byte Enable Parity Error" }, \
> + { CXL_RAS_UC_MEM_DATA_ECC, "Memory Data ECC Error" }, \
> + { CXL_RAS_UC_REINIT_THRESH, "REINIT Threshold Hit" }, \
> + { CXL_RAS_UC_RSVD_ENCODE, "Received Unrecognized Encoding" }, \
> + { CXL_RAS_UC_POISON, "Received Poison From Peer" }, \
> + { CXL_RAS_UC_RECV_OVERFLOW, "Receiver Overflow" }, \
> + { CXL_RAS_UC_INTERNAL_ERR, "Component Specific Error" }, \
> + { CXL_RAS_UC_IDE_TX_ERR, "IDE Tx Error" }, \
> + { CXL_RAS_UC_IDE_RX_ERR, "IDE Rx Error" } \
> +)
> +
> +TRACE_EVENT(cxl_aer_uncorrectable_error,
> + TP_PROTO(const char *dev_name, u32 status, u32 fe, u32 *hl),
> + TP_ARGS(dev_name, status, fe, hl),
> + TP_STRUCT__entry(
> + __string(dev_name, dev_name)
> + __field(u32, status)
> + __field(u32, first_error)
> + __dynamic_array(u32, header_log, CXL_HEADERLOG_SIZE_U32)
> + ),
> + TP_fast_assign(
> + __assign_str(dev_name, dev_name);
> + __entry->status = status;
> + __entry->first_error = fe;
> + /*
> + * Embed the 512B headerlog data for user app retrieval and
> + * parsing, but no need to print this in the trace buffer.
> + */
> + memcpy(__get_dynamic_array(header_log), hl, CXL_HEADERLOG_SIZE);
> + ),
> + TP_printk("%s: status: '%s' first_error: '%s'",
> + __get_str(dev_name),
> + show_uc_errs(__entry->status),
> + show_uc_errs(__entry->first_error)
> + )
> +);
> +
> +#define CXL_RAS_CE_CACHE_DATA_ECC BIT(0)
> +#define CXL_RAS_CE_MEM_DATA_ECC BIT(1)
> +#define CXL_RAS_CE_CRC_THRESH BIT(2)
> +#define CXL_RAS_CE_CACHE_POISON BIT(3)
> +#define CXL_RAS_CE_MEM_POISON BIT(4)
> +#define CXL_RAS_CE_PHYS_LAYER_ERR BIT(5)
> +
> +#define show_ce_errs(status) __print_flags(status, " | ", \
> + { CXL_RAS_CE_CACHE_DATA_ECC, "Cache Data ECC Error" }, \
> + { CXL_RAS_CE_MEM_DATA_ECC, "Memory Data ECC Error" }, \
> + { CXL_RAS_CE_CRC_THRESH, "CRC Threshold Hit" }, \
> + { CXL_RAS_CE_CACHE_POISON, "Received Cache Poison From Peer" }, \
> + { CXL_RAS_CE_MEM_POISON, "Received Memory Poison From Peer" }, \
> + { CXL_RAS_CE_PHYS_LAYER_ERR, "Received Error From Physical Layer" } \
> +)
> +
> +TRACE_EVENT(cxl_aer_correctable_error,
> + TP_PROTO(const char *dev_name, u32 status),
> + TP_ARGS(dev_name, status),
> + TP_STRUCT__entry(
> + __string(dev_name, dev_name)
> + __field(u32, status)
> + ),
> + TP_fast_assign(
> + __assign_str(dev_name, dev_name);
> + __entry->status = status;
> + ),
> + TP_printk("%s: status: '%s'",
> + __get_str(dev_name), show_ce_errs(__entry->status)
> + )
> +);
> +
> +#endif /* _CXL_EVENTS_H */
> +
> +/* This part must be outside protection */
> +#undef TRACE_INCLUDE_FILE
> +#define TRACE_INCLUDE_FILE cxl
> +#include <trace/define_trace.h>
>
>
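For anyone consuming these events from userspace, a rough analogue of the
show_ce_errs() decoding may be a useful reference. The sketch below is
plain C, not kernel code: ce_names, append_name() and decode_ce_status()
are made-up names, the bit positions follow the CXL_RAS_CE_* defines in
this patch, and printing leftover bits as hex is my assumption about how
__print_flags() treats bits that have no name.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag_name {
	uint32_t mask;
	const char *name;
};

/* Bit positions mirror the CXL_RAS_CE_* defines in the patch. */
static const struct flag_name ce_names[] = {
	{ UINT32_C(1) << 0, "Cache Data ECC Error" },
	{ UINT32_C(1) << 1, "Memory Data ECC Error" },
	{ UINT32_C(1) << 2, "CRC Threshold Hit" },
	{ UINT32_C(1) << 3, "Received Cache Poison From Peer" },
	{ UINT32_C(1) << 4, "Received Memory Poison From Peer" },
	{ UINT32_C(1) << 5, "Received Error From Physical Layer" },
};

/* Append s to the NUL-terminated buf, adding " | " if buf is not empty. */
static void append_name(char *buf, size_t len, const char *s)
{
	size_t used = strlen(buf);

	if (used > 0)
		snprintf(buf + used, len - used, " | %s", s);
	else
		snprintf(buf, len, "%s", s);
}

/* Write the names of all set bits into buf, truncating if buf is too small. */
static void decode_ce_status(uint32_t status, char *buf, size_t len)
{
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(ce_names) / sizeof(ce_names[0]); i++) {
		if (!(status & ce_names[i].mask))
			continue;
		status &= ~ce_names[i].mask;
		append_name(buf, len, ce_names[i].name);
	}

	/* Any bits without a name are reported as a hex value. */
	if (status) {
		char hex[16];

		snprintf(hex, sizeof(hex), "0x%x", (unsigned int)status);
		append_name(buf, len, hex);
	}
}
```

A caller would pass the raw status word from the event record together
with a buffer of, say, 256 bytes, and print the resulting string.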