From: Bjorn Helgaas <helgaas@kernel.org>
To: Shuai Xue <xueshuai@linux.alibaba.com>
Cc: rostedt@goodmis.org, lukas@wunner.de, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-edac@vger.kernel.org,
linux-trace-kernel@vger.kernel.org, bhelgaas@google.com,
tony.luck@intel.com, bp@alien8.de, mhiramat@kernel.org,
mathieu.desnoyers@efficios.com, oleg@redhat.com,
naveen@kernel.org, davem@davemloft.net,
anil.s.keshavamurthy@intel.com, mark.rutland@arm.com,
peterz@infradead.org, tianruidong@linux.alibaba.com,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
"Jonathan Cameron" <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v8] PCI: hotplug: Add a generic RAS tracepoint for hotplug event
Date: Wed, 16 Jul 2025 17:25:33 -0500
Message-ID: <20250716222533.GA2559636@bhelgaas>
In-Reply-To: <20250512013839.45960-1-xueshuai@linux.alibaba.com>

[+cc Ilpo, Jonathan (should have been included since the patch has his
Reviewed-by)]

Thanks for the ping; I noticed quite a bit of discussion but didn't
follow it myself, so didn't know it was basically all resolved.

On Mon, May 12, 2025 at 09:38:39AM +0800, Shuai Xue wrote:
> Hotplug events are critical indicators for analyzing hardware health,
> particularly in AI supercomputers where surprise link downs can
> significantly impact system performance and reliability.

I dropped the "particularly in AI supercomputers" part because I think
this is relevant in general.

> To this end, define a new TRACING_SYSTEM named pci, add a generic RAS
> tracepoint for hotplug event to help healthy check, and generate
> tracepoints for pcie hotplug event.

I'm not quite clear on the difference between "add generic RAS
tracepoint for hotplug event" and "generate tracepoints for pcie
hotplug event." Are these two different things?

I see the new TRACE_EVENT(pci_hp_event, ...) definition. Is that what
you mean by the "generic RAS tracepoint"?

And the five new trace_pci_hp_event() calls that use the TRACE_EVENT
are the "tracepoints for PCIe hotplug event"?

> Add enum pci_hotplug_event in
> include/uapi/linux/pci.h so applications like rasdaemon can register
> tracepoint event handlers for it.
>
> The output like below:
>
> $ echo 1 > /sys/kernel/debug/tracing/events/pci/pci_hp_event/enable
> $ cat /sys/kernel/debug/tracing/trace_pipe
> <...>-206 [001] ..... 40.373870: pci_hp_event: 0000:00:02.0 slot:10, event:Link Down
>
> <...>-206 [001] ..... 40.374871: pci_hp_event: 0000:00:02.0 slot:10, event:Card not present
> +#define PCI_HOTPLUG_EVENT \
> + EM(PCI_HOTPLUG_LINK_UP, "Link Up") \
> + EM(PCI_HOTPLUG_LINK_DOWN, "Link Down") \
> + EM(PCI_HOTPLUG_CARD_PRESENT, "Card present") \
> + EMe(PCI_HOTPLUG_CARD_NOT_PRESENT, "Card not present")

Running this:

  $ git grep -E "\<(EM|EMe)\("

I notice that these new events don't look like the others, which
mostly look like "word" or "event-type" or "VERB object".

I'm OK with this, but just giving you a chance to consider what will
be the least surprise to users and easiest for grep and shell
scripting.

I also noticed capitalization of "Up" and "Down", but not "present"
and "not present".

"Card" is only used occasionally and informally in the PCIe spec, and
not at all in the context of hotplug or Slot Status (Presence Detect
State refers to "adapter in the slot"), but it does match the pciehp
dmesg text, so it probably makes sense to use that.

Anyway, I applied this on pci/trace for v6.17. If there's anything
you want to tweak in the commit log or event text, we can still do
that.

https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git/commit/?h=trace

Bjorn