linux-trace-kernel.vger.kernel.org archive mirror
* [PATCH v10 0/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes
@ 2025-09-20  6:01 Shuai Xue
  2025-09-20  6:01 ` [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event Shuai Xue
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Shuai Xue @ 2025-09-20  6:01 UTC (permalink / raw)
  To: rostedt, lukas, linux-pci, linux-kernel, linux-edac,
	linux-trace-kernel, helgaas, ilpo.jarvinen, mattc,
	Jonathan.Cameron
  Cc: bhelgaas, tony.luck, bp, xueshuai, mhiramat, mathieu.desnoyers,
	oleg, naveen, davem, anil.s.keshavamurthy, mark.rutland, peterz,
	tianruidong

changes since v9:
- add documentation about PCI tracepoints per Bjorn
- create a dedicated drivers/pci/trace.c that always defines the PCI tracepoints per Steve
- move tracepoint callsite into __pcie_update_link_speed() per Lukas and Bjorn

changes since v8:
- rewrite commit log from Bjorn
- move pci_hp_event to a common place (include/trace/events/pci.h) per Ilpo
- rename hotplug event strings per Bjorn and Lukas
- add PCIe link tracepoint per Bjorn, Lukas, and Ilpo

Hotplug events are critical indicators for analyzing hardware health, and
surprise link downs can significantly impact system performance and reliability.
In addition, PCIe link speed degradation directly impacts system performance and
often indicates hardware issues such as faulty devices, physical layer problems,
or configuration errors.

This patch set adds PCI hotplug and PCIe link tracepoints to help analyze PCI
hotplug events and PCIe link speed degradation.

Shuai Xue (3):
  PCI: trace: Add a generic RAS tracepoint for hotplug event
  PCI: trace: Add a RAS tracepoint to monitor link speed changes
  Documentation: tracing: Add documentation about PCI tracepoints

 Documentation/trace/events-pci.rst |  74 ++++++++++++++++++
 drivers/pci/Makefile               |   2 +
 drivers/pci/hotplug/Makefile       |   3 +-
 drivers/pci/hotplug/pciehp_ctrl.c  |  31 ++++++--
 drivers/pci/hotplug/pciehp_hpc.c   |   3 +-
 drivers/pci/pci.c                  |   2 +-
 drivers/pci/pci.h                  |  22 +++++-
 drivers/pci/pcie/bwctrl.c          |   4 +-
 drivers/pci/probe.c                |   9 ++-
 drivers/pci/trace.c                |  11 +++
 include/linux/pci.h                |   1 +
 include/trace/events/pci.h         | 119 +++++++++++++++++++++++++++++
 include/uapi/linux/pci.h           |   7 ++
 13 files changed, 271 insertions(+), 17 deletions(-)
 create mode 100644 Documentation/trace/events-pci.rst
 create mode 100644 drivers/pci/trace.c
 create mode 100644 include/trace/events/pci.h

-- 
2.39.3



* [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event
  2025-09-20  6:01 [PATCH v10 0/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes Shuai Xue
@ 2025-09-20  6:01 ` Shuai Xue
  2025-09-22 13:10   ` Ilpo Järvinen
  2025-09-20  6:01 ` [PATCH v10 2/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes Shuai Xue
  2025-09-20  6:01 ` [PATCH v10 3/3] Documentation: tracing: Add documentation about PCI tracepoints Shuai Xue
  2 siblings, 1 reply; 13+ messages in thread
From: Shuai Xue @ 2025-09-20  6:01 UTC (permalink / raw)
  To: rostedt, lukas, linux-pci, linux-kernel, linux-edac,
	linux-trace-kernel, helgaas, ilpo.jarvinen, mattc,
	Jonathan.Cameron
  Cc: bhelgaas, tony.luck, bp, xueshuai, mhiramat, mathieu.desnoyers,
	oleg, naveen, davem, anil.s.keshavamurthy, mark.rutland, peterz,
	tianruidong

Hotplug events are critical indicators for analyzing hardware health,
and surprise link downs can significantly impact system performance and
reliability.

Define a new TRACE_SYSTEM named "pci" and add a generic RAS tracepoint
for hotplug events to help with health checks. Add enum pci_hotplug_event
to include/uapi/linux/pci.h so that applications like rasdaemon can
register tracepoint event handlers for it.
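
For readers unfamiliar with the EM()/EMe() list pattern used in the new
header, a minimal userspace sketch of the same idea follows. It is
illustrative only: HP_EVENT_LIST, hp_event_name() and hp_event_from_name()
are hypothetical names that do not appear in the patch, and the kernel
side relies on TRACE_DEFINE_ENUM() and __print_symbolic() rather than the
lookup loops shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same enumerators as enum pci_hotplug_event in include/uapi/linux/pci.h. */
enum pci_hotplug_event {
	PCI_HOTPLUG_LINK_UP,
	PCI_HOTPLUG_LINK_DOWN,
	PCI_HOTPLUG_CARD_PRESENT,
	PCI_HOTPLUG_CARD_NOT_PRESENT,
};

/* One list; what it expands to depends on how EM()/EMe() are defined. */
#define HP_EVENT_LIST							\
	EM(PCI_HOTPLUG_LINK_UP,			"LINK_UP")		\
	EM(PCI_HOTPLUG_LINK_DOWN,		"LINK_DOWN")		\
	EM(PCI_HOTPLUG_CARD_PRESENT,		"CARD_PRESENT")		\
	EMe(PCI_HOTPLUG_CARD_NOT_PRESENT,	"CARD_NOT_PRESENT")

struct hp_event_sym {
	int value;
	const char *name;
};

/* Expand the list into { value, name } pairs; EMe() omits the trailing comma. */
#define EM(a, b)	{ a, b },
#define EMe(a, b)	{ a, b }
static const struct hp_event_sym hp_event_syms[] = { HP_EVENT_LIST };
#undef EM
#undef EMe

#define HP_EVENT_COUNT (sizeof(hp_event_syms) / sizeof(hp_event_syms[0]))

/* Hypothetical helper: map an event value to its printed name, or NULL. */
static const char *hp_event_name(int event)
{
	size_t i;

	for (i = 0; i < HP_EVENT_COUNT; i++)
		if (hp_event_syms[i].value == event)
			return hp_event_syms[i].name;
	return NULL;
}

/* Hypothetical helper: map a printed name back to its value, or -1. */
static int hp_event_from_name(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < HP_EVENT_COUNT; i++)
		if (strcmp(hp_event_syms[i].name, name) == 0)
			return hp_event_syms[i].value;
	return -1;
}
```

Keeping the value/string pairs in a single list means a new event only
has to be added in one place, which is the reason the header uses this
pattern.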

The following output is generated when a device is hotplugged:

$ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
$ cat /sys/kernel/debug/tracing/trace_pipe
   irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event: 0000:00:02.0 slot:10, event:CARD_PRESENT

   irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event: 0000:00:02.0 slot:10, event:LINK_UP
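
As a rough illustration of how a userspace consumer might pick the
fields out of such a line, here is a minimal sketch. It assumes the text
format shown above; struct hp_record and parse_hp_line() are
hypothetical names, and a real tool such as rasdaemon would more likely
consume the binary event through libtraceevent than scan text.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding the three printed fields of pci_hp_event. */
struct hp_record {
	char port[32];
	char slot[32];
	char event[32];
};

/*
 * Hypothetical helper: parse one trace_pipe line.
 * Returns 0 on success, -1 if the line is not a pci_hp_event record.
 */
static int parse_hp_line(const char *line, struct hp_record *rec)
{
	static const char tag[] = "pci_hp_event: ";
	const char *p;

	assert(line != NULL && rec != NULL);

	/* Skip the task/CPU/timestamp prefix by locating the event name. */
	p = strstr(line, tag);
	if (!p)
		return -1;
	p += sizeof(tag) - 1;

	/* Payload format from TP_printk: "%s slot:%s, event:%s" */
	if (sscanf(p, "%31s slot:%31[^,], event:%31s",
		   rec->port, rec->slot, rec->event) != 3)
		return -1;

	return 0;
}
```

The field widths in the sscanf() format are one less than the buffer
sizes so that the terminating NUL always fits.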

Suggested-by: Lukas Wunner <lukas@wunner.de>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
 drivers/pci/Makefile              |  2 +
 drivers/pci/hotplug/Makefile      |  3 +-
 drivers/pci/hotplug/pciehp_ctrl.c | 31 ++++++++++++---
 drivers/pci/trace.c               | 11 ++++++
 include/trace/events/pci.h        | 63 +++++++++++++++++++++++++++++++
 include/uapi/linux/pci.h          |  7 ++++
 6 files changed, 110 insertions(+), 7 deletions(-)
 create mode 100644 drivers/pci/trace.c
 create mode 100644 include/trace/events/pci.h

diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index 67647f1880fb..bf389bc4dd3c 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -45,3 +45,5 @@ obj-y				+= controller/
 obj-y				+= switch/
 
 subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
+
+CFLAGS_trace.o := -I$(src)
diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
index 40aaf31fe338..d41f7050b072 100644
--- a/drivers/pci/hotplug/Makefile
+++ b/drivers/pci/hotplug/Makefile
@@ -65,7 +65,8 @@ rpadlpar_io-objs	:=	rpadlpar_core.o \
 pciehp-objs		:=	pciehp_core.o	\
 				pciehp_ctrl.o	\
 				pciehp_pci.o	\
-				pciehp_hpc.o
+				pciehp_hpc.o	\
+				../trace.o
 
 shpchp-objs		:=	shpchp_core.o	\
 				shpchp_ctrl.o	\
diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
index bcc938d4420f..7805f697a02c 100644
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -19,6 +19,7 @@
 #include <linux/types.h>
 #include <linux/pm_runtime.h>
 #include <linux/pci.h>
+#include <trace/events/pci.h>
 
 #include "../pci.h"
 #include "pciehp.h"
@@ -244,12 +245,20 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 	case ON_STATE:
 		ctrl->state = POWEROFF_STATE;
 		mutex_unlock(&ctrl->state_lock);
-		if (events & PCI_EXP_SLTSTA_DLLSC)
+		if (events & PCI_EXP_SLTSTA_DLLSC) {
 			ctrl_info(ctrl, "Slot(%s): Link Down\n",
 				  slot_name(ctrl));
-		if (events & PCI_EXP_SLTSTA_PDC)
+			trace_pci_hp_event(pci_name(ctrl->pcie->port),
+					   slot_name(ctrl),
+					   PCI_HOTPLUG_LINK_DOWN);
+		}
+		if (events & PCI_EXP_SLTSTA_PDC) {
 			ctrl_info(ctrl, "Slot(%s): Card not present\n",
 				  slot_name(ctrl));
+			trace_pci_hp_event(pci_name(ctrl->pcie->port),
+					   slot_name(ctrl),
+					   PCI_HOTPLUG_CARD_NOT_PRESENT);
+		}
 		pciehp_disable_slot(ctrl, SURPRISE_REMOVAL);
 		break;
 	default:
@@ -269,6 +278,9 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 					      INDICATOR_NOOP);
 			ctrl_info(ctrl, "Slot(%s): Card not present\n",
 				  slot_name(ctrl));
+			trace_pci_hp_event(pci_name(ctrl->pcie->port),
+					   slot_name(ctrl),
+					   PCI_HOTPLUG_CARD_NOT_PRESENT);
 		}
 		mutex_unlock(&ctrl->state_lock);
 		return;
@@ -281,12 +293,19 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 	case OFF_STATE:
 		ctrl->state = POWERON_STATE;
 		mutex_unlock(&ctrl->state_lock);
-		if (present)
+		if (present) {
 			ctrl_info(ctrl, "Slot(%s): Card present\n",
 				  slot_name(ctrl));
-		if (link_active)
-			ctrl_info(ctrl, "Slot(%s): Link Up\n",
-				  slot_name(ctrl));
+			trace_pci_hp_event(pci_name(ctrl->pcie->port),
+					   slot_name(ctrl),
+					   PCI_HOTPLUG_CARD_PRESENT);
+		}
+		if (link_active) {
+			ctrl_info(ctrl, "Slot(%s): Link Up\n", slot_name(ctrl));
+			trace_pci_hp_event(pci_name(ctrl->pcie->port),
+					   slot_name(ctrl),
+					   PCI_HOTPLUG_LINK_UP);
+		}
 		ctrl->request_result = pciehp_enable_slot(ctrl);
 		break;
 	default:
diff --git a/drivers/pci/trace.c b/drivers/pci/trace.c
new file mode 100644
index 000000000000..cf11abca8602
--- /dev/null
+++ b/drivers/pci/trace.c
@@ -0,0 +1,11 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Tracepoints for PCI system
+ *
+ * Copyright (C) 2025 Alibaba Corporation
+ */
+
+#include <linux/pci.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/pci.h>
diff --git a/include/trace/events/pci.h b/include/trace/events/pci.h
new file mode 100644
index 000000000000..208609492c06
--- /dev/null
+++ b/include/trace/events/pci.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM pci
+
+#if !defined(_TRACE_HW_EVENT_PCI_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HW_EVENT_PCI_H
+
+#include <linux/tracepoint.h>
+
+#define PCI_HOTPLUG_EVENT						\
+	EM(PCI_HOTPLUG_LINK_UP,			"LINK_UP")		\
+	EM(PCI_HOTPLUG_LINK_DOWN,		"LINK_DOWN")		\
+	EM(PCI_HOTPLUG_CARD_PRESENT,		"CARD_PRESENT")		\
+	EMe(PCI_HOTPLUG_CARD_NOT_PRESENT,	"CARD_NOT_PRESENT")
+
+/* Enums require being exported to userspace, for user tool parsing */
+#undef EM
+#undef EMe
+#define EM(a, b)	TRACE_DEFINE_ENUM(a);
+#define EMe(a, b)	TRACE_DEFINE_ENUM(a);
+
+PCI_HOTPLUG_EVENT
+
+/*
+ * Now redefine the EM() and EMe() macros to map the enums to the strings
+ * that will be printed in the output.
+ */
+#undef EM
+#undef EMe
+#define EM(a, b)	{a, b},
+#define EMe(a, b)	{a, b}
+
+TRACE_EVENT(pci_hp_event,
+
+	TP_PROTO(const char *port_name,
+		 const char *slot,
+		 const int event),
+
+	TP_ARGS(port_name, slot, event),
+
+	TP_STRUCT__entry(
+		__string(	port_name,	port_name	)
+		__string(	slot,		slot		)
+		__field(	int,		event	)
+	),
+
+	TP_fast_assign(
+		__assign_str(port_name);
+		__assign_str(slot);
+		__entry->event = event;
+	),
+
+	TP_printk("%s slot:%s, event:%s\n",
+		__get_str(port_name),
+		__get_str(slot),
+		__print_symbolic(__entry->event, PCI_HOTPLUG_EVENT)
+	)
+);
+
+#endif /* _TRACE_HW_EVENT_PCI_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/include/uapi/linux/pci.h b/include/uapi/linux/pci.h
index a769eefc5139..4f150028965d 100644
--- a/include/uapi/linux/pci.h
+++ b/include/uapi/linux/pci.h
@@ -39,4 +39,11 @@
 #define PCIIOC_MMAP_IS_MEM	(PCIIOC_BASE | 0x02)	/* Set mmap state to MEM space. */
 #define PCIIOC_WRITE_COMBINE	(PCIIOC_BASE | 0x03)	/* Enable/disable write-combining. */
 
+enum pci_hotplug_event {
+	PCI_HOTPLUG_LINK_UP,
+	PCI_HOTPLUG_LINK_DOWN,
+	PCI_HOTPLUG_CARD_PRESENT,
+	PCI_HOTPLUG_CARD_NOT_PRESENT,
+};
+
 #endif /* _UAPILINUX_PCI_H */
-- 
2.39.3



* [PATCH v10 2/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes
  2025-09-20  6:01 [PATCH v10 0/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes Shuai Xue
  2025-09-20  6:01 ` [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event Shuai Xue
@ 2025-09-20  6:01 ` Shuai Xue
  2025-09-22 13:06   ` Ilpo Järvinen
  2025-09-20  6:01 ` [PATCH v10 3/3] Documentation: tracing: Add documentation about PCI tracepoints Shuai Xue
  2 siblings, 1 reply; 13+ messages in thread
From: Shuai Xue @ 2025-09-20  6:01 UTC (permalink / raw)
  To: rostedt, lukas, linux-pci, linux-kernel, linux-edac,
	linux-trace-kernel, helgaas, ilpo.jarvinen, mattc,
	Jonathan.Cameron
  Cc: bhelgaas, tony.luck, bp, xueshuai, mhiramat, mathieu.desnoyers,
	oleg, naveen, davem, anil.s.keshavamurthy, mark.rutland, peterz,
	tianruidong

PCIe link speed degradation directly impacts system performance and
often indicates hardware issues such as faulty devices, physical layer
problems, or configuration errors.

To this end, add a RAS tracepoint to monitor link speed changes,
enabling proactive health checks and diagnostic analysis.

The following output is generated when a device is hotplugged:

$ echo 1 > /sys/kernel/debug/tracing/events/pci/pcie_link_event/enable
$ cat /sys/kernel/debug/tracing/trace_pipe
   irq/51-pciehp-88      [001] .....   381.545386: pcie_link_event: 0000:00:02.0 type:4, reason:4, cur_bus_speed:2.5 GT/s PCIe, max_bus_speed:16.0 GT/s PCIe, width:1, flit_mode:0, status:DLLLA
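
To make the width and status fields easier to interpret, the sketch
below decodes them from a raw Link Status value in userspace. The bit
definitions are copied from include/uapi/linux/pci_regs.h; lnksta_width()
and lnksta_flags() are hypothetical helpers, not functions added by this
patch, and the flag order simply follows LNKSTA_FLAGS in the tracepoint
header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Link Status register bits, as in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKSTA_NLW	0x03f0	/* Negotiated Link Width */
#define PCI_EXP_LNKSTA_LT	0x0800	/* Link Training */
#define PCI_EXP_LNKSTA_DLLLA	0x2000	/* Data Link Layer Link Active */
#define PCI_EXP_LNKSTA_LBMS	0x4000	/* Link Bandwidth Management Status */
#define PCI_EXP_LNKSTA_LABS	0x8000	/* Link Autonomous Bandwidth Status */

/* Hypothetical helper: extract the negotiated width (the NLW field). */
static unsigned int lnksta_width(uint16_t linksta)
{
	return (linksta & PCI_EXP_LNKSTA_NLW) >> 4;
}

/*
 * Hypothetical helper: render the status bits as "A|B|C", in the same
 * order as LNKSTA_FLAGS. Output is cut short if buf is too small.
 */
static void lnksta_flags(uint16_t linksta, char *buf, size_t len)
{
	static const struct {
		uint16_t bit;
		const char *name;
	} flags[] = {
		{ PCI_EXP_LNKSTA_LT,	"LT" },
		{ PCI_EXP_LNKSTA_DLLLA,	"DLLLA" },
		{ PCI_EXP_LNKSTA_LBMS,	"LBMS" },
		{ PCI_EXP_LNKSTA_LABS,	"LABS" },
	};
	size_t i, used = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(flags) / sizeof(flags[0]); i++) {
		int n;

		if (!(linksta & flags[i].bit))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* no room left for further flags */
		used += (size_t)n;
	}
}
```

For instance, a hypothetical register value of 0x2011 has NLW = 1 and
only DLLLA set, which would correspond to "width:1" and "status:DLLLA"
as in the sample line above.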

Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Suggested-by: Matthew W Carlis <mattc@purestorage.com>
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 drivers/pci/hotplug/pciehp_hpc.c |  3 +-
 drivers/pci/pci.c                |  2 +-
 drivers/pci/pci.h                | 22 +++++++++++--
 drivers/pci/pcie/bwctrl.c        |  4 +--
 drivers/pci/probe.c              |  9 +++--
 include/linux/pci.h              |  1 +
 include/trace/events/pci.h       | 56 ++++++++++++++++++++++++++++++++
 7 files changed, 87 insertions(+), 10 deletions(-)

diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
index bcc51b26d03d..ad5f28f6a8b1 100644
--- a/drivers/pci/hotplug/pciehp_hpc.c
+++ b/drivers/pci/hotplug/pciehp_hpc.c
@@ -320,7 +320,8 @@ int pciehp_check_link_status(struct controller *ctrl)
 	}
 
 	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &linksta2);
-	__pcie_update_link_speed(ctrl->pcie->port->subordinate, lnk_status, linksta2);
+	__pcie_update_link_speed(ctrl->pcie->port->subordinate, PCIE_HOTPLUG,
+				 lnk_status, linksta2);
 
 	if (!found) {
 		ctrl_info(ctrl, "Slot(%s): No device found\n",
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b0f4d98036cd..96755ffd3841 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4749,7 +4749,7 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt)
 	 * Link Speed.
 	 */
 	if (pdev->subordinate)
-		pcie_update_link_speed(pdev->subordinate);
+		pcie_update_link_speed(pdev->subordinate, PCIE_LINK_RETRAIN);
 
 	return rc;
 }
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index b8d364545e7d..422406a0695c 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -3,6 +3,7 @@
 #define DRIVERS_PCI_H
 
 #include <linux/pci.h>
+#include <trace/events/pci.h>
 
 struct pcie_tlp_log;
 
@@ -455,16 +456,31 @@ static inline int pcie_dev_speed_mbps(enum pci_bus_speed speed)
 }
 
 u8 pcie_get_supported_speeds(struct pci_dev *dev);
-const char *pci_speed_string(enum pci_bus_speed speed);
 void __pcie_print_link_status(struct pci_dev *dev, bool verbose);
 void pcie_report_downtraining(struct pci_dev *dev);
 
-static inline void __pcie_update_link_speed(struct pci_bus *bus, u16 linksta, u16 linksta2)
+enum pcie_link_change_reason {
+	PCIE_LINK_RETRAIN,
+	PCIE_ADD_BUS,
+	PCIE_BWCTRL_ENABLE,
+	PCIE_BWCTRL_IRQ,
+	PCIE_HOTPLUG
+};
+
+static inline void __pcie_update_link_speed(struct pci_bus *bus,
+					    enum pcie_link_change_reason reason,
+					    u16 linksta, u16 linksta2)
 {
 	bus->cur_bus_speed = pcie_link_speed[linksta & PCI_EXP_LNKSTA_CLS];
 	bus->flit_mode = (linksta2 & PCI_EXP_LNKSTA2_FLIT) ? 1 : 0;
+
+	trace_pcie_link_event(bus,
+			     reason,
+			     FIELD_GET(PCI_EXP_LNKSTA_NLW, linksta),
+			     linksta & PCI_EXP_LNKSTA_LINK_STATUS_MASK);
 }
-void pcie_update_link_speed(struct pci_bus *bus);
+
+void pcie_update_link_speed(struct pci_bus *bus, enum pcie_link_change_reason reason);
 
 /* Single Root I/O Virtualization */
 struct pci_sriov {
diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c
index 36f939f23d34..32f1b30ecb84 100644
--- a/drivers/pci/pcie/bwctrl.c
+++ b/drivers/pci/pcie/bwctrl.c
@@ -199,7 +199,7 @@ static void pcie_bwnotif_enable(struct pcie_device *srv)
 	 * Update after enabling notifications & clearing status bits ensures
 	 * link speed is up to date.
 	 */
-	pcie_update_link_speed(port->subordinate);
+	pcie_update_link_speed(port->subordinate, PCIE_BWCTRL_ENABLE);
 }
 
 static void pcie_bwnotif_disable(struct pci_dev *port)
@@ -234,7 +234,7 @@ static irqreturn_t pcie_bwnotif_irq(int irq, void *context)
 	 * speed (inside pcie_update_link_speed()) after LBMS has been
 	 * cleared to avoid missing link speed changes.
 	 */
-	pcie_update_link_speed(port->subordinate);
+	pcie_update_link_speed(port->subordinate, PCIE_BWCTRL_IRQ);
 
 	return IRQ_HANDLED;
 }
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index f41128f91ca7..c4cae2664156 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -21,6 +21,7 @@
 #include <linux/irqdomain.h>
 #include <linux/pm_runtime.h>
 #include <linux/bitfield.h>
+#include <trace/events/pci.h>
 #include "pci.h"
 
 #define CARDBUS_LATENCY_TIMER	176	/* secondary latency timer */
@@ -788,14 +789,16 @@ const char *pci_speed_string(enum pci_bus_speed speed)
 }
 EXPORT_SYMBOL_GPL(pci_speed_string);
 
-void pcie_update_link_speed(struct pci_bus *bus)
+void pcie_update_link_speed(struct pci_bus *bus,
+			    enum pcie_link_change_reason reason)
 {
 	struct pci_dev *bridge = bus->self;
 	u16 linksta, linksta2;
 
 	pcie_capability_read_word(bridge, PCI_EXP_LNKSTA, &linksta);
 	pcie_capability_read_word(bridge, PCI_EXP_LNKSTA2, &linksta2);
-	__pcie_update_link_speed(bus, linksta, linksta2);
+
+	__pcie_update_link_speed(bus, reason, linksta, linksta2);
 }
 EXPORT_SYMBOL_GPL(pcie_update_link_speed);
 
@@ -882,7 +885,7 @@ static void pci_set_bus_speed(struct pci_bus *bus)
 		pcie_capability_read_dword(bridge, PCI_EXP_LNKCAP, &linkcap);
 		bus->max_bus_speed = pcie_link_speed[linkcap & PCI_EXP_LNKCAP_SLS];
 
-		pcie_update_link_speed(bus);
+		pcie_update_link_speed(bus, PCIE_ADD_BUS);
 	}
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 59876de13860..edd8a61ec44e 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -305,6 +305,7 @@ enum pci_bus_speed {
 	PCI_SPEED_UNKNOWN		= 0xff,
 };
 
+const char *pci_speed_string(enum pci_bus_speed speed);
 enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
 enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
 
diff --git a/include/trace/events/pci.h b/include/trace/events/pci.h
index 208609492c06..78e651b95cb3 100644
--- a/include/trace/events/pci.h
+++ b/include/trace/events/pci.h
@@ -57,6 +57,62 @@ TRACE_EVENT(pci_hp_event,
 	)
 );
 
+#define PCI_EXP_LNKSTA_LINK_STATUS_MASK (PCI_EXP_LNKSTA_LBMS | \
+					 PCI_EXP_LNKSTA_LABS | \
+					 PCI_EXP_LNKSTA_LT | \
+					 PCI_EXP_LNKSTA_DLLLA)
+
+#define LNKSTA_FLAGS					\
+	{ PCI_EXP_LNKSTA_LT,	"LT"},			\
+	{ PCI_EXP_LNKSTA_DLLLA,	"DLLLA"},		\
+	{ PCI_EXP_LNKSTA_LBMS,	"LBMS"},		\
+	{ PCI_EXP_LNKSTA_LABS,	"LABS"}
+
+TRACE_EVENT(pcie_link_event,
+
+	TP_PROTO(struct pci_bus *bus,
+		  unsigned int reason,
+		  unsigned int width,
+		  unsigned int status
+		),
+
+	TP_ARGS(bus, reason, width, status),
+
+	TP_STRUCT__entry(
+		__string(	port_name,	pci_name(bus->self))
+		__field(	unsigned int,	type		)
+		__field(	unsigned int,	reason		)
+		__field(	unsigned int,	cur_bus_speed	)
+		__field(	unsigned int,	max_bus_speed	)
+		__field(	unsigned int,	width		)
+		__field(	unsigned int,	flit_mode	)
+		__field(	unsigned int,	link_status	)
+	),
+
+	TP_fast_assign(
+		__assign_str(port_name);
+		__entry->type			= pci_pcie_type(bus->self);
+		__entry->reason			= reason;
+		__entry->cur_bus_speed		= bus->cur_bus_speed;
+		__entry->max_bus_speed		= bus->max_bus_speed;
+		__entry->width			= width;
+		__entry->flit_mode		= bus->flit_mode;
+		__entry->link_status		= status;
+	),
+
+	TP_printk("%s type:%d, reason:%d, cur_bus_speed:%s, max_bus_speed:%s, width:%u, flit_mode:%u, status:%s\n",
+		__get_str(port_name),
+		__entry->type,
+		__entry->reason,
+		pci_speed_string(__entry->cur_bus_speed),
+		pci_speed_string(__entry->max_bus_speed),
+		__entry->width,
+		__entry->flit_mode,
+		__print_flags((unsigned long)__entry->link_status, "|",
+				LNKSTA_FLAGS)
+	)
+);
+
 #endif /* _TRACE_HW_EVENT_PCI_H */
 
 /* This part must be outside protection */
-- 
2.39.3



* [PATCH v10 3/3] Documentation: tracing: Add documentation about PCI tracepoints
  2025-09-20  6:01 [PATCH v10 0/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes Shuai Xue
  2025-09-20  6:01 ` [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event Shuai Xue
  2025-09-20  6:01 ` [PATCH v10 2/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes Shuai Xue
@ 2025-09-20  6:01 ` Shuai Xue
  2 siblings, 0 replies; 13+ messages in thread
From: Shuai Xue @ 2025-09-20  6:01 UTC (permalink / raw)
  To: rostedt, lukas, linux-pci, linux-kernel, linux-edac,
	linux-trace-kernel, helgaas, ilpo.jarvinen, mattc,
	Jonathan.Cameron
  Cc: bhelgaas, tony.luck, bp, xueshuai, mhiramat, mathieu.desnoyers,
	oleg, naveen, davem, anil.s.keshavamurthy, mark.rutland, peterz,
	tianruidong

The PCI tracing system provides tracepoints to monitor critical hardware
events that can impact system performance and reliability. Add
documentation about it.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
---
 Documentation/trace/events-pci.rst | 74 ++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)
 create mode 100644 Documentation/trace/events-pci.rst

diff --git a/Documentation/trace/events-pci.rst b/Documentation/trace/events-pci.rst
new file mode 100644
index 000000000000..500b27713224
--- /dev/null
+++ b/Documentation/trace/events-pci.rst
@@ -0,0 +1,74 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+Subsystem Trace Points: PCI
+===========================
+
+Overview
+========
+The PCI tracing system provides tracepoints to monitor critical hardware events
+that can impact system performance and reliability. These events normally show
+up here:
+
+	/sys/kernel/tracing/events/pci
+
+See include/trace/events/pci.h for the event definitions.
+
+Available Tracepoints
+=====================
+
+pci_hp_event
+------------
+
+Monitors PCI hotplug events including card insertion/removal and link
+state changes.
+::
+
+    pci_hp_event  "%s slot:%s, event:%s\n"
+
+**Event Types**:
+
+* ``LINK_UP`` - PCIe link established
+* ``LINK_DOWN`` - PCIe link lost
+* ``CARD_PRESENT`` - Card detected in slot
+* ``CARD_NOT_PRESENT`` - Card removed from slot
+
+**Example Usage**::
+
+    # Enable the tracepoint
+    echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
+
+    # Monitor events (the following output is generated when a device is hotplugged)
+    cat /sys/kernel/debug/tracing/trace_pipe
+       irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event: 0000:00:02.0 slot:10, event:CARD_PRESENT
+
+       irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event: 0000:00:02.0 slot:10, event:LINK_UP
+
+pcie_link_event
+---------------
+
+Monitors PCIe link speed changes and provides detailed link status information.
+::
+
+    pcie_link_event  "%s type:%d, reason:%d, cur_bus_speed:%s, max_bus_speed:%s, width:%u, flit_mode:%u, status:%s\n"
+
+**Parameters**:
+
+* ``type`` - PCIe device type (4=Root Port, etc.)
+* ``reason`` - Reason for link change:
+
+  - ``0`` - Link retrain
+  - ``1`` - Bus enumeration
+  - ``2`` - Bandwidth controller enable
+  - ``3`` - Bandwidth controller IRQ
+  - ``4`` - Hotplug event
+
+
+**Example Usage**::
+
+    # Enable the tracepoint
+    echo 1 > /sys/kernel/debug/tracing/events/pci/pcie_link_event/enable
+
+    # Monitor events (the following output is generated when a device is hotplugged)
+    cat /sys/kernel/debug/tracing/trace_pipe
+       irq/51-pciehp-88      [001] .....   381.545386: pcie_link_event: 0000:00:02.0 type:4, reason:4, cur_bus_speed:2.5 GT/s PCIe, max_bus_speed:16.0 GT/s PCIe, width:1, flit_mode:0, status:DLLLA
-- 
2.39.3



* Re: [PATCH v10 2/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes
  2025-09-20  6:01 ` [PATCH v10 2/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes Shuai Xue
@ 2025-09-22 13:06   ` Ilpo Järvinen
  2025-09-23  2:16     ` Shuai Xue
  0 siblings, 1 reply; 13+ messages in thread
From: Ilpo Järvinen @ 2025-09-22 13:06 UTC (permalink / raw)
  To: Shuai Xue
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong


On Sat, 20 Sep 2025, Shuai Xue wrote:

> PCIe link speed degradation directly impacts system performance and
> often indicates hardware issues such as faulty devices, physical layer
> problems, or configuration errors.
> 
> To this end, add a RAS tracepoint to monitor link speed changes,
> enabling proactive health checks and diagnostic analysis.
> 
> The following output is generated when a device is hotplugged:
> 
> $ echo 1 > /sys/kernel/debug/tracing/events/pci/pcie_link_event/enable
> $ cat /sys/kernel/debug/tracing/trace_pipe
>    irq/51-pciehp-88      [001] .....   381.545386: pcie_link_event: 0000:00:02.0 type:4, reason:4, cur_bus_speed:2.5 GT/s PCIe, max_bus_speed:16.0 GT/s PCIe, width:1, flit_mode:0, status:DLLLA
> 
> Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> Suggested-by: Matthew W Carlis <mattc@purestorage.com>
> Suggested-by: Lukas Wunner <lukas@wunner.de>
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> ---
>  drivers/pci/hotplug/pciehp_hpc.c |  3 +-
>  drivers/pci/pci.c                |  2 +-
>  drivers/pci/pci.h                | 22 +++++++++++--
>  drivers/pci/pcie/bwctrl.c        |  4 +--
>  drivers/pci/probe.c              |  9 +++--
>  include/linux/pci.h              |  1 +
>  include/trace/events/pci.h       | 56 ++++++++++++++++++++++++++++++++
>  7 files changed, 87 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
> index bcc51b26d03d..ad5f28f6a8b1 100644
> --- a/drivers/pci/hotplug/pciehp_hpc.c
> +++ b/drivers/pci/hotplug/pciehp_hpc.c
> @@ -320,7 +320,8 @@ int pciehp_check_link_status(struct controller *ctrl)
>  	}
>  
>  	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &linksta2);
> -	__pcie_update_link_speed(ctrl->pcie->port->subordinate, lnk_status, linksta2);
> +	__pcie_update_link_speed(ctrl->pcie->port->subordinate, PCIE_HOTPLUG,
> +				 lnk_status, linksta2);
>  
>  	if (!found) {
>  		ctrl_info(ctrl, "Slot(%s): No device found\n",
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b0f4d98036cd..96755ffd3841 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4749,7 +4749,7 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt)
>  	 * Link Speed.
>  	 */
>  	if (pdev->subordinate)
> -		pcie_update_link_speed(pdev->subordinate);
> +		pcie_update_link_speed(pdev->subordinate, PCIE_LINK_RETRAIN);
>  
>  	return rc;
>  }
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index b8d364545e7d..422406a0695c 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -3,6 +3,7 @@
>  #define DRIVERS_PCI_H
>  
>  #include <linux/pci.h>
> +#include <trace/events/pci.h>
>  
>  struct pcie_tlp_log;
>  
> @@ -455,16 +456,31 @@ static inline int pcie_dev_speed_mbps(enum pci_bus_speed speed)
>  }
>  
>  u8 pcie_get_supported_speeds(struct pci_dev *dev);
> -const char *pci_speed_string(enum pci_bus_speed speed);
>  void __pcie_print_link_status(struct pci_dev *dev, bool verbose);
>  void pcie_report_downtraining(struct pci_dev *dev);
>  
> -static inline void __pcie_update_link_speed(struct pci_bus *bus, u16 linksta, u16 linksta2)
> +enum pcie_link_change_reason {
> +	PCIE_LINK_RETRAIN,
> +	PCIE_ADD_BUS,
> +	PCIE_BWCTRL_ENABLE,
> +	PCIE_BWCTRL_IRQ,
> +	PCIE_HOTPLUG

Please use comma on any non-terminator entry so that adding to the list 
later will not mess up diffs.

> +};
> +
> +static inline void __pcie_update_link_speed(struct pci_bus *bus,
> +					    enum pcie_link_change_reason reason,
> +					    u16 linksta, u16 linksta2)
>  {
>  	bus->cur_bus_speed = pcie_link_speed[linksta & PCI_EXP_LNKSTA_CLS];
>  	bus->flit_mode = (linksta2 & PCI_EXP_LNKSTA2_FLIT) ? 1 : 0;
> +
> +	trace_pcie_link_event(bus,
> +			     reason,
> +			     FIELD_GET(PCI_EXP_LNKSTA_NLW, linksta),
> +			     linksta & PCI_EXP_LNKSTA_LINK_STATUS_MASK);
>  }
> -void pcie_update_link_speed(struct pci_bus *bus);
> +
> +void pcie_update_link_speed(struct pci_bus *bus, enum pcie_link_change_reason reason);
>  
>  /* Single Root I/O Virtualization */
>  struct pci_sriov {
> diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c
> index 36f939f23d34..32f1b30ecb84 100644
> --- a/drivers/pci/pcie/bwctrl.c
> +++ b/drivers/pci/pcie/bwctrl.c
> @@ -199,7 +199,7 @@ static void pcie_bwnotif_enable(struct pcie_device *srv)
>  	 * Update after enabling notifications & clearing status bits ensures
>  	 * link speed is up to date.
>  	 */
> -	pcie_update_link_speed(port->subordinate);
> +	pcie_update_link_speed(port->subordinate, PCIE_BWCTRL_ENABLE);
>  }
>  
>  static void pcie_bwnotif_disable(struct pci_dev *port)
> @@ -234,7 +234,7 @@ static irqreturn_t pcie_bwnotif_irq(int irq, void *context)
>  	 * speed (inside pcie_update_link_speed()) after LBMS has been
>  	 * cleared to avoid missing link speed changes.
>  	 */
> -	pcie_update_link_speed(port->subordinate);
> +	pcie_update_link_speed(port->subordinate, PCIE_BWCTRL_IRQ);
>  
>  	return IRQ_HANDLED;
>  }
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index f41128f91ca7..c4cae2664156 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -21,6 +21,7 @@
>  #include <linux/irqdomain.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/bitfield.h>
> +#include <trace/events/pci.h>
>  #include "pci.h"
>  
>  #define CARDBUS_LATENCY_TIMER	176	/* secondary latency timer */
> @@ -788,14 +789,16 @@ const char *pci_speed_string(enum pci_bus_speed speed)
>  }
>  EXPORT_SYMBOL_GPL(pci_speed_string);
>  
> -void pcie_update_link_speed(struct pci_bus *bus)
> +void pcie_update_link_speed(struct pci_bus *bus,
> +			    enum pcie_link_change_reason reason)
>  {
>  	struct pci_dev *bridge = bus->self;
>  	u16 linksta, linksta2;
>  
>  	pcie_capability_read_word(bridge, PCI_EXP_LNKSTA, &linksta);
>  	pcie_capability_read_word(bridge, PCI_EXP_LNKSTA2, &linksta2);
> -	__pcie_update_link_speed(bus, linksta, linksta2);
> +
> +	__pcie_update_link_speed(bus, reason, linksta, linksta2);
>  }
>  EXPORT_SYMBOL_GPL(pcie_update_link_speed);
>  
> @@ -882,7 +885,7 @@ static void pci_set_bus_speed(struct pci_bus *bus)
>  		pcie_capability_read_dword(bridge, PCI_EXP_LNKCAP, &linkcap);
>  		bus->max_bus_speed = pcie_link_speed[linkcap & PCI_EXP_LNKCAP_SLS];
>  
> -		pcie_update_link_speed(bus);
> +		pcie_update_link_speed(bus, PCIE_ADD_BUS);
>  	}
>  }
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 59876de13860..edd8a61ec44e 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -305,6 +305,7 @@ enum pci_bus_speed {
>  	PCI_SPEED_UNKNOWN		= 0xff,
>  };
>  
> +const char *pci_speed_string(enum pci_bus_speed speed);
>  enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
>  enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
>  
> diff --git a/include/trace/events/pci.h b/include/trace/events/pci.h
> index 208609492c06..78e651b95cb3 100644
> --- a/include/trace/events/pci.h
> +++ b/include/trace/events/pci.h
> @@ -57,6 +57,62 @@ TRACE_EVENT(pci_hp_event,
>  	)
>  );
>  
> +#define PCI_EXP_LNKSTA_LINK_STATUS_MASK (PCI_EXP_LNKSTA_LBMS | \
> +					 PCI_EXP_LNKSTA_LABS | \
> +					 PCI_EXP_LNKSTA_LT | \
> +					 PCI_EXP_LNKSTA_DLLLA)

This looks fragile because of the headers. I don't think there is anything
that pulls in these required defines within this header itself (so it only
works because the .c files include pci.h before it, so that the defines
from the uapi side are already included).

If it's allowed for these files, you should include uapi/linux/pci_regs.h.

> +
> +#define LNKSTA_FLAGS					\
> +	{ PCI_EXP_LNKSTA_LT,	"LT"},			\
> +	{ PCI_EXP_LNKSTA_DLLLA,	"DLLLA"},		\
> +	{ PCI_EXP_LNKSTA_LBMS,	"LBMS"},		\
> +	{ PCI_EXP_LNKSTA_LABS,	"LABS"}
> +
> +TRACE_EVENT(pcie_link_event,
> +
> +	TP_PROTO(struct pci_bus *bus,
> +		  unsigned int reason,
> +		  unsigned int width,
> +		  unsigned int status
> +		),
> +
> +	TP_ARGS(bus, reason, width, status),
> +
> +	TP_STRUCT__entry(
> +		__string(	port_name,	pci_name(bus->self))
> +		__field(	unsigned int,	type		)
> +		__field(	unsigned int,	reason		)
> +		__field(	unsigned int,	cur_bus_speed	)
> +		__field(	unsigned int,	max_bus_speed	)
> +		__field(	unsigned int,	width		)
> +		__field(	unsigned int,	flit_mode	)
> +		__field(	unsigned int,	link_status	)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(port_name);
> +		__entry->type			= pci_pcie_type(bus->self);
> +		__entry->reason			= reason;
> +		__entry->cur_bus_speed		= bus->cur_bus_speed;
> +		__entry->max_bus_speed		= bus->max_bus_speed;
> +		__entry->width			= width;
> +		__entry->flit_mode		= bus->flit_mode;
> +		__entry->link_status		= status;
> +	),
> +
> +	TP_printk("%s type:%d, reason:%d, cur_bus_speed:%s, max_bus_speed:%s, width:%u, flit_mode:%u, status:%s\n",
> +		__get_str(port_name),
> +		__entry->type,
> +		__entry->reason,
> +		pci_speed_string(__entry->cur_bus_speed),
> +		pci_speed_string(__entry->max_bus_speed),
> +		__entry->width,
> +		__entry->flit_mode,
> +		__print_flags((unsigned long)__entry->link_status, "|",
> +				LNKSTA_FLAGS)
> +	)
> +);
> +
>  #endif /* _TRACE_HW_EVENT_PCI_H */
>  
>  /* This part must be outside protection */
> 

-- 
 i.


* Re: [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event
  2025-09-20  6:01 ` [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event Shuai Xue
@ 2025-09-22 13:10   ` Ilpo Järvinen
  2025-09-23  2:08     ` Shuai Xue
  0 siblings, 1 reply; 13+ messages in thread
From: Ilpo Järvinen @ 2025-09-22 13:10 UTC (permalink / raw)
  To: Shuai Xue
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong

On Sat, 20 Sep 2025, Shuai Xue wrote:

> Hotplug events are critical indicators for analyzing hardware health,
> and surprise link downs can significantly impact system performance and
> reliability.
> 
> Define a new TRACING_SYSTEM named "pci", add a generic RAS tracepoint
> for hotplug event to help health checks. Add enum pci_hotplug_event in
> include/uapi/linux/pci.h so applications like rasdaemon can register
> tracepoint event handlers for it.
> 
> The following output is generated when a device is hotplugged:
> 
> $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
> $ cat /sys/kernel/debug/tracing/trace_pipe
>    irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event: 0000:00:02.0 slot:10, event:CARD_PRESENT
> 
>    irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event: 0000:00:02.0 slot:10, event:LINK_UP
> 
> Suggested-by: Lukas Wunner <lukas@wunner.de>
> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> Reviewed-by: Lukas Wunner <lukas@wunner.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
>  drivers/pci/Makefile              |  2 +
>  drivers/pci/hotplug/Makefile      |  3 +-
>  drivers/pci/hotplug/pciehp_ctrl.c | 31 ++++++++++++---
>  drivers/pci/trace.c               | 11 ++++++
>  include/trace/events/pci.h        | 63 +++++++++++++++++++++++++++++++
>  include/uapi/linux/pci.h          |  7 ++++
>  6 files changed, 110 insertions(+), 7 deletions(-)
>  create mode 100644 drivers/pci/trace.c
>  create mode 100644 include/trace/events/pci.h
> 
> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> index 67647f1880fb..bf389bc4dd3c 100644
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -45,3 +45,5 @@ obj-y				+= controller/
>  obj-y				+= switch/
>  
>  subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
> +
> +CFLAGS_trace.o := -I$(src)
> diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
> index 40aaf31fe338..d41f7050b072 100644
> --- a/drivers/pci/hotplug/Makefile
> +++ b/drivers/pci/hotplug/Makefile
> @@ -65,7 +65,8 @@ rpadlpar_io-objs	:=	rpadlpar_core.o \
>  pciehp-objs		:=	pciehp_core.o	\
>  				pciehp_ctrl.o	\
>  				pciehp_pci.o	\
> -				pciehp_hpc.o
> +				pciehp_hpc.o	\
> +				../trace.o

To make it useful for any PCI tracing, not just hotplug, this object file
should be added in drivers/pci/Makefile, not here.

>  shpchp-objs		:=	shpchp_core.o	\
>  				shpchp_ctrl.o	\
> diff --git a/drivers/pci/hotplug/pciehp_ctrl.c b/drivers/pci/hotplug/pciehp_ctrl.c
> index bcc938d4420f..7805f697a02c 100644
> --- a/drivers/pci/hotplug/pciehp_ctrl.c
> +++ b/drivers/pci/hotplug/pciehp_ctrl.c
> @@ -19,6 +19,7 @@
>  #include <linux/types.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/pci.h>
> +#include <trace/events/pci.h>
>  
>  #include "../pci.h"
>  #include "pciehp.h"
> @@ -244,12 +245,20 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  	case ON_STATE:
>  		ctrl->state = POWEROFF_STATE;
>  		mutex_unlock(&ctrl->state_lock);
> -		if (events & PCI_EXP_SLTSTA_DLLSC)
> +		if (events & PCI_EXP_SLTSTA_DLLSC) {
>  			ctrl_info(ctrl, "Slot(%s): Link Down\n",
>  				  slot_name(ctrl));
> -		if (events & PCI_EXP_SLTSTA_PDC)
> +			trace_pci_hp_event(pci_name(ctrl->pcie->port),
> +					   slot_name(ctrl),
> +					   PCI_HOTPLUG_LINK_DOWN);
> +		}
> +		if (events & PCI_EXP_SLTSTA_PDC) {
>  			ctrl_info(ctrl, "Slot(%s): Card not present\n",
>  				  slot_name(ctrl));
> +			trace_pci_hp_event(pci_name(ctrl->pcie->port),
> +					   slot_name(ctrl),
> +					   PCI_HOTPLUG_CARD_NOT_PRESENT);
> +		}
>  		pciehp_disable_slot(ctrl, SURPRISE_REMOVAL);
>  		break;
>  	default:
> @@ -269,6 +278,9 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  					      INDICATOR_NOOP);
>  			ctrl_info(ctrl, "Slot(%s): Card not present\n",
>  				  slot_name(ctrl));
> +			trace_pci_hp_event(pci_name(ctrl->pcie->port),
> +					   slot_name(ctrl),
> +					   PCI_HOTPLUG_CARD_NOT_PRESENT);
>  		}
>  		mutex_unlock(&ctrl->state_lock);
>  		return;
> @@ -281,12 +293,19 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
>  	case OFF_STATE:
>  		ctrl->state = POWERON_STATE;
>  		mutex_unlock(&ctrl->state_lock);
> -		if (present)
> +		if (present) {
>  			ctrl_info(ctrl, "Slot(%s): Card present\n",
>  				  slot_name(ctrl));
> -		if (link_active)
> -			ctrl_info(ctrl, "Slot(%s): Link Up\n",
> -				  slot_name(ctrl));
> +			trace_pci_hp_event(pci_name(ctrl->pcie->port),
> +					   slot_name(ctrl),
> +					   PCI_HOTPLUG_CARD_PRESENT);
> +		}
> +		if (link_active) {
> +			ctrl_info(ctrl, "Slot(%s): Link Up\n", slot_name(ctrl));
> +			trace_pci_hp_event(pci_name(ctrl->pcie->port),
> +					   slot_name(ctrl),
> +					   PCI_HOTPLUG_LINK_UP);
> +		}
>  		ctrl->request_result = pciehp_enable_slot(ctrl);
>  		break;
>  	default:
> diff --git a/drivers/pci/trace.c b/drivers/pci/trace.c
> new file mode 100644
> index 000000000000..cf11abca8602
> --- /dev/null
> +++ b/drivers/pci/trace.c
> @@ -0,0 +1,11 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Tracepoints for PCI system
> + *
> + * Copyright (C) 2025 Alibaba Corporation
> + */
> +
> +#include <linux/pci.h>
> +
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/pci.h>
> diff --git a/include/trace/events/pci.h b/include/trace/events/pci.h
> new file mode 100644
> index 000000000000..208609492c06
> --- /dev/null
> +++ b/include/trace/events/pci.h
> @@ -0,0 +1,63 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#undef TRACE_SYSTEM
> +#define TRACE_SYSTEM pci
> +
> +#if !defined(_TRACE_HW_EVENT_PCI_H) || defined(TRACE_HEADER_MULTI_READ)
> +#define _TRACE_HW_EVENT_PCI_H
> +
> +#include <linux/tracepoint.h>
> +
> +#define PCI_HOTPLUG_EVENT						\
> +	EM(PCI_HOTPLUG_LINK_UP,			"LINK_UP")		\
> +	EM(PCI_HOTPLUG_LINK_DOWN,		"LINK_DOWN")		\
> +	EM(PCI_HOTPLUG_CARD_PRESENT,		"CARD_PRESENT")		\
> +	EMe(PCI_HOTPLUG_CARD_NOT_PRESENT,	"CARD_NOT_PRESENT")
> +
> +/* Enums require being exported to userspace, for user tool parsing */
> +#undef EM
> +#undef EMe
> +#define EM(a, b)	TRACE_DEFINE_ENUM(a);
> +#define EMe(a, b)	TRACE_DEFINE_ENUM(a);
> +
> +PCI_HOTPLUG_EVENT
> +
> +/*
> + * Now redefine the EM() and EMe() macros to map the enums to the strings
> + * that will be printed in the output.
> + */
> +#undef EM
> +#undef EMe
> +#define EM(a, b)	{a, b},
> +#define EMe(a, b)	{a, b}
> +
> +TRACE_EVENT(pci_hp_event,
> +
> +	TP_PROTO(const char *port_name,
> +		 const char *slot,
> +		 const int event),
> +
> +	TP_ARGS(port_name, slot, event),
> +
> +	TP_STRUCT__entry(
> +		__string(	port_name,	port_name	)
> +		__string(	slot,		slot		)
> +		__field(	int,		event	)
> +	),
> +
> +	TP_fast_assign(
> +		__assign_str(port_name);
> +		__assign_str(slot);
> +		__entry->event = event;
> +	),
> +
> +	TP_printk("%s slot:%s, event:%s\n",
> +		__get_str(port_name),
> +		__get_str(slot),
> +		__print_symbolic(__entry->event, PCI_HOTPLUG_EVENT)
> +	)
> +);
> +
> +#endif /* _TRACE_HW_EVENT_PCI_H */
> +
> +/* This part must be outside protection */
> +#include <trace/define_trace.h>
> diff --git a/include/uapi/linux/pci.h b/include/uapi/linux/pci.h
> index a769eefc5139..4f150028965d 100644
> --- a/include/uapi/linux/pci.h
> +++ b/include/uapi/linux/pci.h
> @@ -39,4 +39,11 @@
>  #define PCIIOC_MMAP_IS_MEM	(PCIIOC_BASE | 0x02)	/* Set mmap state to MEM space. */
>  #define PCIIOC_WRITE_COMBINE	(PCIIOC_BASE | 0x03)	/* Enable/disable write-combining. */
>  
> +enum pci_hotplug_event {
> +	PCI_HOTPLUG_LINK_UP,
> +	PCI_HOTPLUG_LINK_DOWN,
> +	PCI_HOTPLUG_CARD_PRESENT,
> +	PCI_HOTPLUG_CARD_NOT_PRESENT,
> +};
> +
>  #endif /* _UAPILINUX_PCI_H */
> 

-- 
 i.



* Re: [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event
  2025-09-22 13:10   ` Ilpo Järvinen
@ 2025-09-23  2:08     ` Shuai Xue
  2025-09-23  6:46       ` Ilpo Järvinen
  0 siblings, 1 reply; 13+ messages in thread
From: Shuai Xue @ 2025-09-23  2:08 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong



On 2025/9/22 21:10, Ilpo Järvinen wrote:
> On Sat, 20 Sep 2025, Shuai Xue wrote:
> 
>> Hotplug events are critical indicators for analyzing hardware health,
>> and surprise link downs can significantly impact system performance and
>> reliability.
>>
>> Define a new TRACING_SYSTEM named "pci", add a generic RAS tracepoint
>> for hotplug event to help health checks. Add enum pci_hotplug_event in
>> include/uapi/linux/pci.h so applications like rasdaemon can register
>> tracepoint event handlers for it.
>>
>> The following output is generated when a device is hotplugged:
>>
>> $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
>> $ cat /sys/kernel/debug/tracing/trace_pipe
>>     irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event: 0000:00:02.0 slot:10, event:CARD_PRESENT
>>
>>     irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event: 0000:00:02.0 slot:10, event:LINK_UP
>>
>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>> Reviewed-by: Lukas Wunner <lukas@wunner.de>
>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>> ---
>>   drivers/pci/Makefile              |  2 +
>>   drivers/pci/hotplug/Makefile      |  3 +-
>>   drivers/pci/hotplug/pciehp_ctrl.c | 31 ++++++++++++---
>>   drivers/pci/trace.c               | 11 ++++++
>>   include/trace/events/pci.h        | 63 +++++++++++++++++++++++++++++++
>>   include/uapi/linux/pci.h          |  7 ++++
>>   6 files changed, 110 insertions(+), 7 deletions(-)
>>   create mode 100644 drivers/pci/trace.c
>>   create mode 100644 include/trace/events/pci.h
>>
>> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
>> index 67647f1880fb..bf389bc4dd3c 100644
>> --- a/drivers/pci/Makefile
>> +++ b/drivers/pci/Makefile
>> @@ -45,3 +45,5 @@ obj-y				+= controller/
>>   obj-y				+= switch/
>>   
>>   subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
>> +
>> +CFLAGS_trace.o := -I$(src)
>> diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
>> index 40aaf31fe338..d41f7050b072 100644
>> --- a/drivers/pci/hotplug/Makefile
>> +++ b/drivers/pci/hotplug/Makefile
>> @@ -65,7 +65,8 @@ rpadlpar_io-objs	:=	rpadlpar_core.o \
>>   pciehp-objs		:=	pciehp_core.o	\
>>   				pciehp_ctrl.o	\
>>   				pciehp_pci.o	\
>> -				pciehp_hpc.o
>> +				pciehp_hpc.o	\
>> +				../trace.o
> 
> To make it useful for any PCI tracing, not just hotplug, this object file
> should be added in drivers/pci/Makefile, not here.

Makes sense. How about adding it to the main CONFIG_PCI object list:

diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
index bf389bc4dd3c..d7f83d06351d 100644
--- a/drivers/pci/Makefile
+++ b/drivers/pci/Makefile
@@ -5,7 +5,7 @@
  obj-$(CONFIG_PCI)              += access.o bus.o probe.o host-bridge.o \
                                    remove.o pci.o pci-driver.o search.o \
                                    rom.o setup-res.o irq.o vpd.o \
-                                  setup-bus.o vc.o mmap.o devres.o
+                                  setup-bus.o vc.o mmap.o devres.o trace.o

  obj-$(CONFIG_PCI)              += msi/
  obj-$(CONFIG_PCI)              += pcie/

Thanks.

Best Regards,
Shuai


* Re: [PATCH v10 2/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes
  2025-09-22 13:06   ` Ilpo Järvinen
@ 2025-09-23  2:16     ` Shuai Xue
  0 siblings, 0 replies; 13+ messages in thread
From: Shuai Xue @ 2025-09-23  2:16 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong



On 2025/9/22 21:06, Ilpo Järvinen wrote:
> On Sat, 20 Sep 2025, Shuai Xue wrote:
> 
>> PCIe link speed degradation directly impacts system performance and
>> often indicates hardware issues such as faulty devices, physical layer
>> problems, or configuration errors.
>>
>> To this end, add a RAS tracepoint to monitor link speed changes,
>> enabling proactive health checks and diagnostic analysis.
>>
>> The following output is generated when a device is hotplugged:
>>
>> $ echo 1 > /sys/kernel/debug/tracing/events/pci/pcie_link_event/enable
>> $ cat /sys/kernel/debug/tracing/trace_pipe
>>     irq/51-pciehp-88      [001] .....   381.545386: pcie_link_event: 0000:00:02.0 type:4, reason:4, cur_bus_speed:2.5 GT/s PCIe, max_bus_speed:16.0 GT/s PCIe, width:1, flit_mode:0, status:DLLLA
>>
>> Suggested-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
>> Suggested-by: Matthew W Carlis <mattc@purestorage.com>
>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>> ---
>>   drivers/pci/hotplug/pciehp_hpc.c |  3 +-
>>   drivers/pci/pci.c                |  2 +-
>>   drivers/pci/pci.h                | 22 +++++++++++--
>>   drivers/pci/pcie/bwctrl.c        |  4 +--
>>   drivers/pci/probe.c              |  9 +++--
>>   include/linux/pci.h              |  1 +
>>   include/trace/events/pci.h       | 56 ++++++++++++++++++++++++++++++++
>>   7 files changed, 87 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c
>> index bcc51b26d03d..ad5f28f6a8b1 100644
>> --- a/drivers/pci/hotplug/pciehp_hpc.c
>> +++ b/drivers/pci/hotplug/pciehp_hpc.c
>> @@ -320,7 +320,8 @@ int pciehp_check_link_status(struct controller *ctrl)
>>   	}
>>   
>>   	pcie_capability_read_word(pdev, PCI_EXP_LNKSTA2, &linksta2);
>> -	__pcie_update_link_speed(ctrl->pcie->port->subordinate, lnk_status, linksta2);
>> +	__pcie_update_link_speed(ctrl->pcie->port->subordinate, PCIE_HOTPLUG,
>> +				 lnk_status, linksta2);
>>   
>>   	if (!found) {
>>   		ctrl_info(ctrl, "Slot(%s): No device found\n",
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index b0f4d98036cd..96755ffd3841 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4749,7 +4749,7 @@ int pcie_retrain_link(struct pci_dev *pdev, bool use_lt)
>>   	 * Link Speed.
>>   	 */
>>   	if (pdev->subordinate)
>> -		pcie_update_link_speed(pdev->subordinate);
>> +		pcie_update_link_speed(pdev->subordinate, PCIE_LINK_RETRAIN);
>>   
>>   	return rc;
>>   }
>> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
>> index b8d364545e7d..422406a0695c 100644
>> --- a/drivers/pci/pci.h
>> +++ b/drivers/pci/pci.h
>> @@ -3,6 +3,7 @@
>>   #define DRIVERS_PCI_H
>>   
>>   #include <linux/pci.h>
>> +#include <trace/events/pci.h>
>>   
>>   struct pcie_tlp_log;
>>   
>> @@ -455,16 +456,31 @@ static inline int pcie_dev_speed_mbps(enum pci_bus_speed speed)
>>   }
>>   
>>   u8 pcie_get_supported_speeds(struct pci_dev *dev);
>> -const char *pci_speed_string(enum pci_bus_speed speed);
>>   void __pcie_print_link_status(struct pci_dev *dev, bool verbose);
>>   void pcie_report_downtraining(struct pci_dev *dev);
>>   
>> -static inline void __pcie_update_link_speed(struct pci_bus *bus, u16 linksta, u16 linksta2)
>> +enum pcie_link_change_reason {
>> +	PCIE_LINK_RETRAIN,
>> +	PCIE_ADD_BUS,
>> +	PCIE_BWCTRL_ENABLE,
>> +	PCIE_BWCTRL_IRQ,
>> +	PCIE_HOTPLUG
> 
> Please use comma on any non-terminator entry so that adding to the list
> later will not mess up diffs.

Sure.
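
For reference, a minimal userspace sketch (not part of this patch) of the enum with the trailing comma applied. The enumerator names are taken from the quoted hunk; the pcie_link_reason_name() helper and its strings are hypothetical and only illustrate how the numeric "reason:4" in the sample trace output corresponds to PCIE_HOTPLUG.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Enumerators as in the quoted hunk, now with a comma after the last entry. */
enum pcie_link_change_reason {
	PCIE_LINK_RETRAIN,
	PCIE_ADD_BUS,
	PCIE_BWCTRL_ENABLE,
	PCIE_BWCTRL_IRQ,
	PCIE_HOTPLUG,
};

/* Hypothetical name table; the strings are illustrative, not from the patch. */
static const char * const pcie_link_reason_names[] = {
	[PCIE_LINK_RETRAIN]	= "LINK_RETRAIN",
	[PCIE_ADD_BUS]		= "ADD_BUS",
	[PCIE_BWCTRL_ENABLE]	= "BWCTRL_ENABLE",
	[PCIE_BWCTRL_IRQ]	= "BWCTRL_IRQ",
	[PCIE_HOTPLUG]		= "HOTPLUG",
};

/* Fail the build if an enumerator is appended without a matching name. */
static_assert(sizeof(pcie_link_reason_names) /
	      sizeof(pcie_link_reason_names[0]) == PCIE_HOTPLUG + 1,
	      "name table must cover every enumerator");

/* Hypothetical helper: map a raw reason value to a printable name. */
static const char *pcie_link_reason_name(unsigned int reason)
{
	size_t count = sizeof(pcie_link_reason_names) /
		       sizeof(pcie_link_reason_names[0]);

	if (reason >= count || !pcie_link_reason_names[reason])
		return "UNKNOWN";
	return pcie_link_reason_names[reason];
}
```

With the trailing comma in place, appending a sixth reason later touches only the new line in the diff.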

> 
>> +};
>> +
>> +static inline void __pcie_update_link_speed(struct pci_bus *bus,
>> +					    enum pcie_link_change_reason reason,
>> +					    u16 linksta, u16 linksta2)
>>   {
>>   	bus->cur_bus_speed = pcie_link_speed[linksta & PCI_EXP_LNKSTA_CLS];
>>   	bus->flit_mode = (linksta2 & PCI_EXP_LNKSTA2_FLIT) ? 1 : 0;
>> +
>> +	trace_pcie_link_event(bus,
>> +			     reason,
>> +			     FIELD_GET(PCI_EXP_LNKSTA_NLW, linksta),
>> +			     linksta & PCI_EXP_LNKSTA_LINK_STATUS_MASK);
>>   }
>> -void pcie_update_link_speed(struct pci_bus *bus);
>> +
>> +void pcie_update_link_speed(struct pci_bus *bus, enum pcie_link_change_reason reason);
>>   
>>   /* Single Root I/O Virtualization */
>>   struct pci_sriov {
>> diff --git a/drivers/pci/pcie/bwctrl.c b/drivers/pci/pcie/bwctrl.c
>> index 36f939f23d34..32f1b30ecb84 100644
>> --- a/drivers/pci/pcie/bwctrl.c
>> +++ b/drivers/pci/pcie/bwctrl.c
>> @@ -199,7 +199,7 @@ static void pcie_bwnotif_enable(struct pcie_device *srv)
>>   	 * Update after enabling notifications & clearing status bits ensures
>>   	 * link speed is up to date.
>>   	 */
>> -	pcie_update_link_speed(port->subordinate);
>> +	pcie_update_link_speed(port->subordinate, PCIE_BWCTRL_ENABLE);
>>   }
>>   
>>   static void pcie_bwnotif_disable(struct pci_dev *port)
>> @@ -234,7 +234,7 @@ static irqreturn_t pcie_bwnotif_irq(int irq, void *context)
>>   	 * speed (inside pcie_update_link_speed()) after LBMS has been
>>   	 * cleared to avoid missing link speed changes.
>>   	 */
>> -	pcie_update_link_speed(port->subordinate);
>> +	pcie_update_link_speed(port->subordinate, PCIE_BWCTRL_IRQ);
>>   
>>   	return IRQ_HANDLED;
>>   }
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index f41128f91ca7..c4cae2664156 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -21,6 +21,7 @@
>>   #include <linux/irqdomain.h>
>>   #include <linux/pm_runtime.h>
>>   #include <linux/bitfield.h>
>> +#include <trace/events/pci.h>
>>   #include "pci.h"
>>   
>>   #define CARDBUS_LATENCY_TIMER	176	/* secondary latency timer */
>> @@ -788,14 +789,16 @@ const char *pci_speed_string(enum pci_bus_speed speed)
>>   }
>>   EXPORT_SYMBOL_GPL(pci_speed_string);
>>   
>> -void pcie_update_link_speed(struct pci_bus *bus)
>> +void pcie_update_link_speed(struct pci_bus *bus,
>> +			    enum pcie_link_change_reason reason)
>>   {
>>   	struct pci_dev *bridge = bus->self;
>>   	u16 linksta, linksta2;
>>   
>>   	pcie_capability_read_word(bridge, PCI_EXP_LNKSTA, &linksta);
>>   	pcie_capability_read_word(bridge, PCI_EXP_LNKSTA2, &linksta2);
>> -	__pcie_update_link_speed(bus, linksta, linksta2);
>> +
>> +	__pcie_update_link_speed(bus, reason, linksta, linksta2);
>>   }
>>   EXPORT_SYMBOL_GPL(pcie_update_link_speed);
>>   
>> @@ -882,7 +885,7 @@ static void pci_set_bus_speed(struct pci_bus *bus)
>>   		pcie_capability_read_dword(bridge, PCI_EXP_LNKCAP, &linkcap);
>>   		bus->max_bus_speed = pcie_link_speed[linkcap & PCI_EXP_LNKCAP_SLS];
>>   
>> -		pcie_update_link_speed(bus);
>> +		pcie_update_link_speed(bus, PCIE_ADD_BUS);
>>   	}
>>   }
>>   
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 59876de13860..edd8a61ec44e 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -305,6 +305,7 @@ enum pci_bus_speed {
>>   	PCI_SPEED_UNKNOWN		= 0xff,
>>   };
>>   
>> +const char *pci_speed_string(enum pci_bus_speed speed);
>>   enum pci_bus_speed pcie_get_speed_cap(struct pci_dev *dev);
>>   enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev);
>>   
>> diff --git a/include/trace/events/pci.h b/include/trace/events/pci.h
>> index 208609492c06..78e651b95cb3 100644
>> --- a/include/trace/events/pci.h
>> +++ b/include/trace/events/pci.h
>> @@ -57,6 +57,62 @@ TRACE_EVENT(pci_hp_event,
>>   	)
>>   );
>>   
>> +#define PCI_EXP_LNKSTA_LINK_STATUS_MASK (PCI_EXP_LNKSTA_LBMS | \
>> +					 PCI_EXP_LNKSTA_LABS | \
>> +					 PCI_EXP_LNKSTA_LT | \
>> +					 PCI_EXP_LNKSTA_DLLLA)
> 
> This looks fragile with respect to header dependencies. I don't think
> anything in this header itself pulls in the required defines, so it only
> works because the .c files include pci.h before it, which brings in the
> defines from the uapi side.
> 
> If it's allowed for these files, you should include uapi/linux/pci_regs.h.
> 

Thanks for pointing out this dependency issue. You're absolutely right.

I'll add the explicit include for uapi/linux/pci_regs.h at the top of
the trace header file to ensure the PCI register macros are always
available, regardless of inclusion order.

Will fix this in the next version.
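
To make the mask under discussion concrete, here is a rough userspace sketch (not kernel code and not part of the patch) of what the status field ends up showing. The bit values are copied by hand from include/uapi/linux/pci_regs.h, which is exactly the dependency the explicit include would provide in the kernel header. The lnksta_decode() helper is hypothetical and only approximates what __print_flags() does with LNKSTA_FLAGS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Link Status bits, values mirrored from include/uapi/linux/pci_regs.h. */
#define PCI_EXP_LNKSTA_LT	0x0800	/* Link Training */
#define PCI_EXP_LNKSTA_DLLLA	0x2000	/* Data Link Layer Link Active */
#define PCI_EXP_LNKSTA_LBMS	0x4000	/* Link Bandwidth Management Status */
#define PCI_EXP_LNKSTA_LABS	0x8000	/* Link Autonomous Bandwidth Status */

/* Same composition as the mask in the quoted hunk. */
#define PCI_EXP_LNKSTA_LINK_STATUS_MASK (PCI_EXP_LNKSTA_LBMS | \
					 PCI_EXP_LNKSTA_LABS | \
					 PCI_EXP_LNKSTA_LT | \
					 PCI_EXP_LNKSTA_DLLLA)

struct flag_name {
	unsigned int mask;
	const char *name;
};

/* Same order as LNKSTA_FLAGS in the quoted hunk. */
static const struct flag_name lnksta_flags[] = {
	{ PCI_EXP_LNKSTA_LT,	"LT" },
	{ PCI_EXP_LNKSTA_DLLLA,	"DLLLA" },
	{ PCI_EXP_LNKSTA_LBMS,	"LBMS" },
	{ PCI_EXP_LNKSTA_LABS,	"LABS" },
};

/*
 * Hypothetical helper: mask a raw Link Status value and join the names of
 * the set bits with '|'. Stops early if the buffer is too small.
 */
static const char *lnksta_decode(unsigned int linksta, char *buf, size_t len)
{
	unsigned int status = linksta & PCI_EXP_LNKSTA_LINK_STATUS_MASK;
	size_t used = 0;
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	for (i = 0; i < sizeof(lnksta_flags) / sizeof(lnksta_flags[0]); i++) {
		int n;

		if (!(status & lnksta_flags[i].mask))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "|" : "", lnksta_flags[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output truncated, keep what fits */
		used += (size_t)n;
	}
	return buf;
}
```

A raw value of 0x2011 (speed field 1, width field 1, DLLLA set) decodes to "DLLLA", which matches the status shown in the sample trace line from the commit message.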

Thanks.
Shuai


* Re: [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event
  2025-09-23  2:08     ` Shuai Xue
@ 2025-09-23  6:46       ` Ilpo Järvinen
  2025-09-23  7:15         ` Shuai Xue
  0 siblings, 1 reply; 13+ messages in thread
From: Ilpo Järvinen @ 2025-09-23  6:46 UTC (permalink / raw)
  To: Shuai Xue
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong


On Tue, 23 Sep 2025, Shuai Xue wrote:

> 
> 
> On 2025/9/22 21:10, Ilpo Järvinen wrote:
> > On Sat, 20 Sep 2025, Shuai Xue wrote:
> > 
> > > Hotplug events are critical indicators for analyzing hardware health,
> > > and surprise link downs can significantly impact system performance and
> > > reliability.
> > > 
> > > Define a new TRACING_SYSTEM named "pci", add a generic RAS tracepoint
> > > for hotplug event to help health checks. Add enum pci_hotplug_event in
> > > include/uapi/linux/pci.h so applications like rasdaemon can register
> > > tracepoint event handlers for it.
> > > 
> > > The following output is generated when a device is hotplugged:
> > > 
> > > $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
> > > $ cat /sys/kernel/debug/tracing/trace_pipe
> > >     irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event:
> > > 0000:00:02.0 slot:10, event:CARD_PRESENT
> > > 
> > >     irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event:
> > > 0000:00:02.0 slot:10, event:LINK_UP
> > > 
> > > Suggested-by: Lukas Wunner <lukas@wunner.de>
> > > Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> > > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> > > Reviewed-by: Lukas Wunner <lukas@wunner.de>
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > ---
> > >   drivers/pci/Makefile              |  2 +
> > >   drivers/pci/hotplug/Makefile      |  3 +-
> > >   drivers/pci/hotplug/pciehp_ctrl.c | 31 ++++++++++++---
> > >   drivers/pci/trace.c               | 11 ++++++
> > >   include/trace/events/pci.h        | 63 +++++++++++++++++++++++++++++++
> > >   include/uapi/linux/pci.h          |  7 ++++
> > >   6 files changed, 110 insertions(+), 7 deletions(-)
> > >   create mode 100644 drivers/pci/trace.c
> > >   create mode 100644 include/trace/events/pci.h
> > > 
> > > diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> > > index 67647f1880fb..bf389bc4dd3c 100644
> > > --- a/drivers/pci/Makefile
> > > +++ b/drivers/pci/Makefile
> > > @@ -45,3 +45,5 @@ obj-y				+= controller/
> > >   obj-y				+= switch/
> > >     subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
> > > +
> > > +CFLAGS_trace.o := -I$(src)
> > > diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
> > > index 40aaf31fe338..d41f7050b072 100644
> > > --- a/drivers/pci/hotplug/Makefile
> > > +++ b/drivers/pci/hotplug/Makefile
> > > @@ -65,7 +65,8 @@ rpadlpar_io-objs	:=	rpadlpar_core.o \
> > >   pciehp-objs		:=	pciehp_core.o	\
> > >   				pciehp_ctrl.o	\
> > >   				pciehp_pci.o	\
> > > -				pciehp_hpc.o
> > > +				pciehp_hpc.o	\
> > > +				../trace.o
> > 
> > To make it useful for any PCI tracing, not just hotplug, this object file
> > should be added in drivers/pci/Makefile, not here.
> 
> Makes sense. How about adding it to the main CONFIG_PCI object list:
> 
> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> index bf389bc4dd3c..d7f83d06351d 100644
> --- a/drivers/pci/Makefile
> +++ b/drivers/pci/Makefile
> @@ -5,7 +5,7 @@
>  obj-$(CONFIG_PCI)              += access.o bus.o probe.o host-bridge.o \
>                                    remove.o pci.o pci-driver.o search.o \
>                                    rom.o setup-res.o irq.o vpd.o \
> -                                  setup-bus.o vc.o mmap.o devres.o
> +                                  setup-bus.o vc.o mmap.o devres.o trace.o
> 
>  obj-$(CONFIG_PCI)              += msi/
>  obj-$(CONFIG_PCI)              += pcie/

Yes, that's the right place to add it.


-- 
 i.


* Re: [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event
  2025-09-23  6:46       ` Ilpo Järvinen
@ 2025-09-23  7:15         ` Shuai Xue
  2025-09-23  7:20           ` Ilpo Järvinen
  0 siblings, 1 reply; 13+ messages in thread
From: Shuai Xue @ 2025-09-23  7:15 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong



On 2025/9/23 14:46, Ilpo Järvinen wrote:
> On Tue, 23 Sep 2025, Shuai Xue wrote:
> 
>>
>>
>> On 2025/9/22 21:10, Ilpo Järvinen wrote:
>>> On Sat, 20 Sep 2025, Shuai Xue wrote:
>>>
>>>> Hotplug events are critical indicators for analyzing hardware health,
>>>> and surprise link downs can significantly impact system performance and
>>>> reliability.
>>>>
>>>> Define a new TRACING_SYSTEM named "pci", add a generic RAS tracepoint
>>>> for hotplug event to help health checks. Add enum pci_hotplug_event in
>>>> include/uapi/linux/pci.h so applications like rasdaemon can register
>>>> tracepoint event handlers for it.
>>>>
>>>> The following output is generated when a device is hotplugged:
>>>>
>>>> $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
>>>> $ cat /sys/kernel/debug/tracing/trace_pipe
>>>>      irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event:
>>>> 0000:00:02.0 slot:10, event:CARD_PRESENT
>>>>
>>>>      irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event:
>>>> 0000:00:02.0 slot:10, event:LINK_UP
>>>>
>>>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>>>> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
>>>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>>>> Reviewed-by: Lukas Wunner <lukas@wunner.de>
>>>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>> ---
>>>>    drivers/pci/Makefile              |  2 +
>>>>    drivers/pci/hotplug/Makefile      |  3 +-
>>>>    drivers/pci/hotplug/pciehp_ctrl.c | 31 ++++++++++++---
>>>>    drivers/pci/trace.c               | 11 ++++++
>>>>    include/trace/events/pci.h        | 63 +++++++++++++++++++++++++++++++
>>>>    include/uapi/linux/pci.h          |  7 ++++
>>>>    6 files changed, 110 insertions(+), 7 deletions(-)
>>>>    create mode 100644 drivers/pci/trace.c
>>>>    create mode 100644 include/trace/events/pci.h
>>>>
>>>> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
>>>> index 67647f1880fb..bf389bc4dd3c 100644
>>>> --- a/drivers/pci/Makefile
>>>> +++ b/drivers/pci/Makefile
>>>> @@ -45,3 +45,5 @@ obj-y				+= controller/
>>>>    obj-y				+= switch/
>>>>      subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
>>>> +
>>>> +CFLAGS_trace.o := -I$(src)
>>>> diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
>>>> index 40aaf31fe338..d41f7050b072 100644
>>>> --- a/drivers/pci/hotplug/Makefile
>>>> +++ b/drivers/pci/hotplug/Makefile
>>>> @@ -65,7 +65,8 @@ rpadlpar_io-objs	:=	rpadlpar_core.o \
>>>>    pciehp-objs		:=	pciehp_core.o	\
>>>>    				pciehp_ctrl.o	\
>>>>    				pciehp_pci.o	\
>>>> -				pciehp_hpc.o
>>>> +				pciehp_hpc.o	\
>>>> +				../trace.o
>>>
>>> To make it useful for any PCI tracing, not just hotplug, this object file
>>> should be added in drivers/pci/Makefile, not here.
>>
>> Makes sense. How about adding it to the main CONFIG_PCI object list:
>>
>> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
>> index bf389bc4dd3c..d7f83d06351d 100644
>> --- a/drivers/pci/Makefile
>> +++ b/drivers/pci/Makefile
>> @@ -5,7 +5,7 @@
>>   obj-$(CONFIG_PCI)              += access.o bus.o probe.o host-bridge.o \
>>                                     remove.o pci.o pci-driver.o search.o \
>>                                     rom.o setup-res.o irq.o vpd.o \
>> -                                  setup-bus.o vc.o mmap.o devres.o
>> +                                  setup-bus.o vc.o mmap.o devres.o trace.o
>>
>>   obj-$(CONFIG_PCI)              += msi/
>>   obj-$(CONFIG_PCI)              += pcie/
> 
> Yes, that's the right place to add it.
> 

Thanks for confirming.
I will send a new version to fix it.

Thanks.
Shuai



* Re: [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event
  2025-09-23  7:15         ` Shuai Xue
@ 2025-09-23  7:20           ` Ilpo Järvinen
  2025-09-23  7:34             ` Ilpo Järvinen
  0 siblings, 1 reply; 13+ messages in thread
From: Ilpo Järvinen @ 2025-09-23  7:20 UTC (permalink / raw)
  To: Shuai Xue
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong


On Tue, 23 Sep 2025, Shuai Xue wrote:

> 
> 
> On 2025/9/23 14:46, Ilpo Järvinen wrote:
> > On Tue, 23 Sep 2025, Shuai Xue wrote:
> > 
> > > 
> > > 
> > > On 2025/9/22 21:10, Ilpo Järvinen wrote:
> > > > On Sat, 20 Sep 2025, Shuai Xue wrote:
> > > > 
> > > > > Hotplug events are critical indicators for analyzing hardware health,
> > > > > and surprise link downs can significantly impact system performance
> > > > > and
> > > > > reliability.
> > > > > 
> > > > > Define a new TRACING_SYSTEM named "pci", add a generic RAS tracepoint
> > > > > for hotplug event to help health checks. Add enum pci_hotplug_event in
> > > > > include/uapi/linux/pci.h so applications like rasdaemon can register
> > > > > tracepoint event handlers for it.
> > > > > 
> > > > > The following output is generated when a device is hotplugged:
> > > > > 
> > > > > $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
> > > > > $ cat /sys/kernel/debug/tracing/trace_pipe
> > > > >      irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event:
> > > > > 0000:00:02.0 slot:10, event:CARD_PRESENT
> > > > > 
> > > > >      irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event:
> > > > > 0000:00:02.0 slot:10, event:LINK_UP
> > > > > 
> > > > > Suggested-by: Lukas Wunner <lukas@wunner.de>
> > > > > Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> > > > > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> > > > > Reviewed-by: Lukas Wunner <lukas@wunner.de>
> > > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > > ---
> > > > >    drivers/pci/Makefile              |  2 +
> > > > >    drivers/pci/hotplug/Makefile      |  3 +-
> > > > >    drivers/pci/hotplug/pciehp_ctrl.c | 31 ++++++++++++---
> > > > >    drivers/pci/trace.c               | 11 ++++++
> > > > >    include/trace/events/pci.h        | 63
> > > > > +++++++++++++++++++++++++++++++
> > > > >    include/uapi/linux/pci.h          |  7 ++++
> > > > >    6 files changed, 110 insertions(+), 7 deletions(-)
> > > > >    create mode 100644 drivers/pci/trace.c
> > > > >    create mode 100644 include/trace/events/pci.h
> > > > > 
> > > > > diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> > > > > index 67647f1880fb..bf389bc4dd3c 100644
> > > > > --- a/drivers/pci/Makefile
> > > > > +++ b/drivers/pci/Makefile
> > > > > @@ -45,3 +45,5 @@ obj-y				+= controller/
> > > > >    obj-y				+= switch/
> > > > >      subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
> > > > > +
> > > > > +CFLAGS_trace.o := -I$(src)
> > > > > > diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
> > > > > index 40aaf31fe338..d41f7050b072 100644
> > > > > --- a/drivers/pci/hotplug/Makefile
> > > > > +++ b/drivers/pci/hotplug/Makefile
> > > > > @@ -65,7 +65,8 @@ rpadlpar_io-objs	:=	rpadlpar_core.o \
> > > > >    pciehp-objs		:=	pciehp_core.o	\
> > > > >    				pciehp_ctrl.o	\
> > > > >    				pciehp_pci.o	\
> > > > > -				pciehp_hpc.o
> > > > > +				pciehp_hpc.o	\
> > > > > +				../trace.o
> > > > 
> > > > > To make it useful for any PCI tracing, not just hotplug, this object file
> > > > > should be added in drivers/pci/Makefile, not here.
> > > > 
> > > > Makes sense. How about adding it to the main CONFIG_PCI objects:
> > > 
> > > diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> > > index bf389bc4dd3c..d7f83d06351d 100644
> > > --- a/drivers/pci/Makefile
> > > +++ b/drivers/pci/Makefile
> > > @@ -5,7 +5,7 @@
> > >   obj-$(CONFIG_PCI)              += access.o bus.o probe.o host-bridge.o \
> > >                                     remove.o pci.o pci-driver.o search.o \
> > >                                     rom.o setup-res.o irq.o vpd.o \
> > > -                                  setup-bus.o vc.o mmap.o devres.o
> > > +                                  setup-bus.o vc.o mmap.o devres.o
> > > trace.o
> > > 
> > >   obj-$(CONFIG_PCI)              += msi/
> > >   obj-$(CONFIG_PCI)              += pcie/
> > 
> > Yes, that's the right place to add it.
> > 
> 
> Thanks for confirming.
> I will send a new version to fix it.

I've actually now started to wonder whether it should be made to depend on
some tracing-related config (sending this out quickly in case you were just
waiting for my confirmation before sending the new version... I'm still
investigating what other subsystems do).

-- 
 i.


* Re: [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event
  2025-09-23  7:20           ` Ilpo Järvinen
@ 2025-09-23  7:34             ` Ilpo Järvinen
  2025-09-23  8:09               ` Shuai Xue
  0 siblings, 1 reply; 13+ messages in thread
From: Ilpo Järvinen @ 2025-09-23  7:34 UTC (permalink / raw)
  To: Shuai Xue
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong

[-- Attachment #1: Type: text/plain, Size: 4746 bytes --]

On Tue, 23 Sep 2025, Ilpo Järvinen wrote:

> On Tue, 23 Sep 2025, Shuai Xue wrote:
> 
> > 
> > 
> > On 2025/9/23 14:46, Ilpo Järvinen wrote:
> > > On Tue, 23 Sep 2025, Shuai Xue wrote:
> > > 
> > > > 
> > > > 
> > > > On 2025/9/22 21:10, Ilpo Järvinen wrote:
> > > > > On Sat, 20 Sep 2025, Shuai Xue wrote:
> > > > > 
> > > > > > Hotplug events are critical indicators for analyzing hardware health,
> > > > > > and surprise link downs can significantly impact system performance
> > > > > > and
> > > > > > reliability.
> > > > > > 
> > > > > > Define a new TRACING_SYSTEM named "pci", add a generic RAS tracepoint
> > > > > > for hotplug event to help health checks. Add enum pci_hotplug_event in
> > > > > > include/uapi/linux/pci.h so applications like rasdaemon can register
> > > > > > tracepoint event handlers for it.
> > > > > > 
> > > > > > The following output is generated when a device is hotplugged:
> > > > > > 
> > > > > > $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
> > > > > > $ cat /sys/kernel/debug/tracing/trace_pipe
> > > > > >      irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event:
> > > > > > 0000:00:02.0 slot:10, event:CARD_PRESENT
> > > > > > 
> > > > > >      irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event:
> > > > > > 0000:00:02.0 slot:10, event:LINK_UP
> > > > > > 
> > > > > > Suggested-by: Lukas Wunner <lukas@wunner.de>
> > > > > > Suggested-by: Steven Rostedt <rostedt@goodmis.org>
> > > > > > Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
> > > > > > Reviewed-by: Lukas Wunner <lukas@wunner.de>
> > > > > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > > > > ---
> > > > > >    drivers/pci/Makefile              |  2 +
> > > > > >    drivers/pci/hotplug/Makefile      |  3 +-
> > > > > >    drivers/pci/hotplug/pciehp_ctrl.c | 31 ++++++++++++---
> > > > > >    drivers/pci/trace.c               | 11 ++++++
> > > > > >    include/trace/events/pci.h        | 63
> > > > > > +++++++++++++++++++++++++++++++
> > > > > >    include/uapi/linux/pci.h          |  7 ++++
> > > > > >    6 files changed, 110 insertions(+), 7 deletions(-)
> > > > > >    create mode 100644 drivers/pci/trace.c
> > > > > >    create mode 100644 include/trace/events/pci.h
> > > > > > 
> > > > > > diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> > > > > > index 67647f1880fb..bf389bc4dd3c 100644
> > > > > > --- a/drivers/pci/Makefile
> > > > > > +++ b/drivers/pci/Makefile
> > > > > > @@ -45,3 +45,5 @@ obj-y				+= controller/
> > > > > >    obj-y				+= switch/
> > > > > >      subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
> > > > > > +
> > > > > > +CFLAGS_trace.o := -I$(src)
> > > > > > > diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
> > > > > > index 40aaf31fe338..d41f7050b072 100644
> > > > > > --- a/drivers/pci/hotplug/Makefile
> > > > > > +++ b/drivers/pci/hotplug/Makefile
> > > > > > @@ -65,7 +65,8 @@ rpadlpar_io-objs	:=	rpadlpar_core.o \
> > > > > >    pciehp-objs		:=	pciehp_core.o	\
> > > > > >    				pciehp_ctrl.o	\
> > > > > >    				pciehp_pci.o	\
> > > > > > -				pciehp_hpc.o
> > > > > > +				pciehp_hpc.o	\
> > > > > > +				../trace.o
> > > > > 
> > > > > > To make it useful for any PCI tracing, not just hotplug, this object file
> > > > > > should be added in drivers/pci/Makefile, not here.
> > > > > 
> > > > > Makes sense. How about adding it to the main CONFIG_PCI objects:
> > > > 
> > > > diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
> > > > index bf389bc4dd3c..d7f83d06351d 100644
> > > > --- a/drivers/pci/Makefile
> > > > +++ b/drivers/pci/Makefile
> > > > @@ -5,7 +5,7 @@
> > > >   obj-$(CONFIG_PCI)              += access.o bus.o probe.o host-bridge.o \
> > > >                                     remove.o pci.o pci-driver.o search.o \
> > > >                                     rom.o setup-res.o irq.o vpd.o \
> > > > -                                  setup-bus.o vc.o mmap.o devres.o
> > > > +                                  setup-bus.o vc.o mmap.o devres.o
> > > > trace.o
> > > > 
> > > >   obj-$(CONFIG_PCI)              += msi/
> > > >   obj-$(CONFIG_PCI)              += pcie/
> > > 
> > > Yes, that's the right place to add it.
> > > 
> > 
> > Thanks for confirming.
> > I will send a new version to fix it.
> 
> I've actually now started to wonder whether it should be made to depend on
> some tracing-related config (sending this out quickly in case you were just
> waiting for my confirmation before sending the new version... I'm still
> investigating what other subsystems do).

Probably this is what we actually want:

obj-$(CONFIG_TRACING)			+= trace.o

-- 
 i.


* Re: [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event
  2025-09-23  7:34             ` Ilpo Järvinen
@ 2025-09-23  8:09               ` Shuai Xue
  0 siblings, 0 replies; 13+ messages in thread
From: Shuai Xue @ 2025-09-23  8:09 UTC (permalink / raw)
  To: Ilpo Järvinen
  Cc: rostedt, Lukas Wunner, linux-pci, LKML, linux-edac,
	linux-trace-kernel, helgaas, mattc, Jonathan.Cameron, bhelgaas,
	tony.luck, bp, mhiramat, mathieu.desnoyers, oleg, naveen, davem,
	anil.s.keshavamurthy, mark.rutland, peterz, tianruidong



On 2025/9/23 15:34, Ilpo Järvinen wrote:
> On Tue, 23 Sep 2025, Ilpo Järvinen wrote:
> 
>> On Tue, 23 Sep 2025, Shuai Xue wrote:
>>
>>>
>>>
>>> On 2025/9/23 14:46, Ilpo Järvinen wrote:
>>>> On Tue, 23 Sep 2025, Shuai Xue wrote:
>>>>
>>>>>
>>>>>
>>>>> On 2025/9/22 21:10, Ilpo Järvinen wrote:
>>>>>> On Sat, 20 Sep 2025, Shuai Xue wrote:
>>>>>>
>>>>>>> Hotplug events are critical indicators for analyzing hardware health,
>>>>>>> and surprise link downs can significantly impact system performance
>>>>>>> and
>>>>>>> reliability.
>>>>>>>
>>>>>>> Define a new TRACING_SYSTEM named "pci", add a generic RAS tracepoint
>>>>>>> for hotplug event to help health checks. Add enum pci_hotplug_event in
>>>>>>> include/uapi/linux/pci.h so applications like rasdaemon can register
>>>>>>> tracepoint event handlers for it.
>>>>>>>
>>>>>>> The following output is generated when a device is hotplugged:
>>>>>>>
>>>>>>> $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
>>>>>>> $ cat /sys/kernel/debug/tracing/trace_pipe
>>>>>>>       irq/51-pciehp-88      [001] .....  1311.177459: pci_hp_event: 0000:00:02.0 slot:10, event:CARD_PRESENT
>>>>>>>
>>>>>>>       irq/51-pciehp-88      [001] .....  1311.177566: pci_hp_event: 0000:00:02.0 slot:10, event:LINK_UP
>>>>>>>
>>>>>>> Suggested-by: Lukas Wunner <lukas@wunner.de>
>>>>>>> Suggested-by: Steven Rostedt <rostedt@goodmis.org>
>>>>>>> Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
>>>>>>> Reviewed-by: Lukas Wunner <lukas@wunner.de>
>>>>>>> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
>>>>>>> ---
>>>>>>>     drivers/pci/Makefile              |  2 +
>>>>>>>     drivers/pci/hotplug/Makefile      |  3 +-
>>>>>>>     drivers/pci/hotplug/pciehp_ctrl.c | 31 ++++++++++++---
>>>>>>>     drivers/pci/trace.c               | 11 ++++++
>>>>>>>     include/trace/events/pci.h        | 63 +++++++++++++++++++++++++++++++
>>>>>>>     include/uapi/linux/pci.h          |  7 ++++
>>>>>>>     6 files changed, 110 insertions(+), 7 deletions(-)
>>>>>>>     create mode 100644 drivers/pci/trace.c
>>>>>>>     create mode 100644 include/trace/events/pci.h
>>>>>>>
>>>>>>> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
>>>>>>> index 67647f1880fb..bf389bc4dd3c 100644
>>>>>>> --- a/drivers/pci/Makefile
>>>>>>> +++ b/drivers/pci/Makefile
>>>>>>> @@ -45,3 +45,5 @@ obj-y				+= controller/
>>>>>>>     obj-y				+= switch/
>>>>>>>       subdir-ccflags-$(CONFIG_PCI_DEBUG) := -DDEBUG
>>>>>>> +
>>>>>>> +CFLAGS_trace.o := -I$(src)
>>>>>>> diff --git a/drivers/pci/hotplug/Makefile b/drivers/pci/hotplug/Makefile
>>>>>>> index 40aaf31fe338..d41f7050b072 100644
>>>>>>> --- a/drivers/pci/hotplug/Makefile
>>>>>>> +++ b/drivers/pci/hotplug/Makefile
>>>>>>> @@ -65,7 +65,8 @@ rpadlpar_io-objs	:=	rpadlpar_core.o \
>>>>>>>     pciehp-objs		:=	pciehp_core.o	\
>>>>>>>     				pciehp_ctrl.o	\
>>>>>>>     				pciehp_pci.o	\
>>>>>>> -				pciehp_hpc.o
>>>>>>> +				pciehp_hpc.o	\
>>>>>>> +				../trace.o
>>>>>>
>>>>>> To make it useful for any PCI tracing, not just hotplug, this object file
>>>>>> should be added in drivers/pci/Makefile, not here.
>>>>>
>>>>> Makes sense. How about adding it to the main CONFIG_PCI objects:
>>>>>
>>>>> diff --git a/drivers/pci/Makefile b/drivers/pci/Makefile
>>>>> index bf389bc4dd3c..d7f83d06351d 100644
>>>>> --- a/drivers/pci/Makefile
>>>>> +++ b/drivers/pci/Makefile
>>>>> @@ -5,7 +5,7 @@
>>>>>    obj-$(CONFIG_PCI)              += access.o bus.o probe.o host-bridge.o \
>>>>>                                      remove.o pci.o pci-driver.o search.o \
>>>>>                                      rom.o setup-res.o irq.o vpd.o \
>>>>> -                                  setup-bus.o vc.o mmap.o devres.o
>>>>> +                                  setup-bus.o vc.o mmap.o devres.o trace.o
>>>>>
>>>>>    obj-$(CONFIG_PCI)              += msi/
>>>>>    obj-$(CONFIG_PCI)              += pcie/
>>>>
>>>> Yes, that's the right place to add it.
>>>>
>>>
>>> Thanks for confirming.
>>> I will send a new version to fix it.
>>
>> I've actually now started to wonder whether it should be made to depend on
>> some tracing-related config (sending this out quickly in case you were just
>> waiting for my confirmation before sending the new version... I'm still
>> investigating what other subsystems do).
> 
> Probably this is what we actually want:
> 
> obj-$(CONFIG_TRACING)			+= trace.o
> 


Thanks for the input. Many subsystems, e.g. cxl and nvme, build their
trace.o under CONFIG_TRACING.

Will use CONFIG_TRACING :)
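
A minimal sketch of how that could look in drivers/pci/Makefile, assuming
the rest of the file stays as in this series (untested, and the exact
placement may differ in the next version):

```make
# drivers/pci/Makefile (sketch only, not the posted patch)
obj-$(CONFIG_PCI)              += access.o bus.o probe.o host-bridge.o \
                                  remove.o pci.o pci-driver.o search.o \
                                  rom.o setup-res.o irq.o vpd.o \
                                  setup-bus.o vc.o mmap.o devres.o

# Build the PCI tracepoint definitions only when tracing is configured,
# instead of listing trace.o among the CONFIG_PCI objects above.
obj-$(CONFIG_TRACING)          += trace.o

# Unchanged from patch 1/3: let trace.c find the local trace header.
CFLAGS_trace.o := -I$(src)
```

With this, the ../trace.o entry in drivers/pci/hotplug/Makefile would no
longer be needed.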

Thanks.
Shuai



Thread overview: 13+ messages
2025-09-20  6:01 [PATCH v10 0/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes Shuai Xue
2025-09-20  6:01 ` [PATCH v10 1/3] PCI: trace: Add a generic RAS tracepoint for hotplug event Shuai Xue
2025-09-22 13:10   ` Ilpo Järvinen
2025-09-23  2:08     ` Shuai Xue
2025-09-23  6:46       ` Ilpo Järvinen
2025-09-23  7:15         ` Shuai Xue
2025-09-23  7:20           ` Ilpo Järvinen
2025-09-23  7:34             ` Ilpo Järvinen
2025-09-23  8:09               ` Shuai Xue
2025-09-20  6:01 ` [PATCH v10 2/3] PCI: trace: Add a RAS tracepoint to monitor link speed changes Shuai Xue
2025-09-22 13:06   ` Ilpo Järvinen
2025-09-23  2:16     ` Shuai Xue
2025-09-20  6:01 ` [PATCH v10 3/3] Documentation: tracing: Add documentation about PCI tracepoints Shuai Xue
