From: Bjorn Helgaas <helgaas@kernel.org>
To: Jon Pan-Doh <pandoh@google.com>
Cc: "Bjorn Helgaas" <bhelgaas@google.com>,
"Karolina Stolarek" <karolina.stolarek@oracle.com>,
linux-pci@vger.kernel.org,
"Martin Petersen" <martin.petersen@oracle.com>,
"Ben Fuller" <ben.fuller@oracle.com>,
"Drew Walton" <drewwalton@microsoft.com>,
"Anil Agrawal" <anilagrawal@meta.com>,
"Tony Luck" <tony.luck@intel.com>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
"Sathyanarayanan Kuppuswamy"
<sathyanarayanan.kuppuswamy@linux.intel.com>,
"Lukas Wunner" <lukas@wunner.de>,
"Jonathan Cameron" <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v2 1/8] PCI/AER: Remove aer_print_port_info
Date: Thu, 6 Mar 2025 18:02:20 -0600 [thread overview]
Message-ID: <20250307000220.GA382422@bhelgaas> (raw)
In-Reply-To: <CAMC_AXWVOtKh2r4kX6c7jtJwQaEE4KEQsH=uoB1OhczJ=8K2VA@mail.gmail.com>
On Wed, Mar 05, 2025 at 05:32:45PM -0800, Jon Pan-Doh wrote:
> > On Tue, Mar 04, 2025 at 05:04:21PM -0800, Jon Pan-Doh wrote:
> > > Would a log suffice in that case (i.e. when aer_get_device_error()
> > > returns 0)? Something along the lines of "{device} is not accessible
> > > while processing (un)correctable error"
>
> What are your thoughts on this? It adds the pcie port log in the
> edge case described (with no loss of info) and doesn't require
> changes to current ratelimit logic. Something like this (with more
> fields filled in of course):
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 21cdf590b25e..bdfc7e8d6f0f 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -1253,6 +1253,8 @@ static inline void
> aer_process_err_devices(struct aer_err_info *e_info)
> for (i = 0; i < e_info->error_dev_num && e_info->dev[i]; i++) {
> if (aer_get_device_error_info(e_info->dev[i], e_info))
> aer_print_error(e_info->dev[i], e_info);
> + else
> + pci_error(e_info->dev[i], "{device} is not
> accessible while processing (un)correctable error");
> }
> for (i = 0; i < e_info->error_dev_num && e_info->dev[i]; i++) {
> if (aer_get_device_error_info(e_info->dev[i], e_info))
Maybe, although I think consistency is very important, and we'll
always have Root Port info but won't always have Endpoint info. So
dropping the Root Port message seems possibly the wrong way around
when it's the Endpoint part that's "optional".
One thing I do like about the current messages is that they associate
information with the device that is the source of the information. I
remember finding this very confusing when I first looked at how AER
works.
E.g., the "pcieport ... Correctable error" message means the Root Port
received an ERR_COR and generated an interrupt, and the error class
and error source came from the Root Port AER Capability. Similarly,
the "e1000e ... error status" message contains information read from
the Endpoint AER Capability.
I do think the existing messages are WAY too verbose. I would love to
make them more concise, and I think the important endpoint info could
probably be squeezed into a single line, although obviously TLP header
logs would be too much for that.
Bjorn
next prev parent reply other threads:[~2025-03-07 0:02 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-14 2:35 [PATCH v2 0/8] Rate limit AER logs Jon Pan-Doh
2025-02-14 2:35 ` [PATCH v2 1/8] PCI/AER: Remove aer_print_port_info Jon Pan-Doh
2025-03-04 18:32 ` Bjorn Helgaas
2025-03-05 1:04 ` Jon Pan-Doh
2025-03-05 22:35 ` Bjorn Helgaas
2025-03-06 1:32 ` Jon Pan-Doh
2025-03-07 0:02 ` Bjorn Helgaas [this message]
2025-02-14 2:35 ` [PATCH v2 2/8] PCI/AER: Use the same log level for all messages Jon Pan-Doh
2025-02-17 11:25 ` Karolina Stolarek
2025-02-19 2:48 ` Jon Pan-Doh
2025-02-24 11:26 ` Karolina Stolarek
2025-02-28 22:25 ` Jon Pan-Doh
2025-03-04 18:59 ` Bjorn Helgaas
2025-03-07 12:04 ` Karolina Stolarek
2025-03-13 21:15 ` Jon Pan-Doh
2025-02-14 2:35 ` [PATCH v2 3/8] PCI/AER: Move AER stat collection out of __aer_print_error Jon Pan-Doh
2025-02-17 11:29 ` Karolina Stolarek
2025-02-19 2:48 ` Jon Pan-Doh
2025-02-24 11:26 ` Karolina Stolarek
2025-02-14 2:35 ` [PATCH v2 4/8] PCI/AER: Rename struct aer_stats to aer_report Jon Pan-Doh
2025-02-17 11:29 ` Karolina Stolarek
2025-02-19 2:49 ` Jon Pan-Doh
2025-02-14 2:35 ` [PATCH v2 5/8] PCI/AER: Introduce ratelimit for error logs Jon Pan-Doh
2025-02-17 11:29 ` Karolina Stolarek
2025-02-19 2:49 ` Jon Pan-Doh
2025-02-14 2:35 ` [PATCH v2 6/8] PCI/AER: Add ratelimits to PCI AER Documentation Jon Pan-Doh
2025-02-17 11:30 ` Karolina Stolarek
2025-02-14 2:35 ` [PATCH v2 7/8] PCI/AER: Add AER sysfs attributes for log ratelimits Jon Pan-Doh
2025-02-17 13:31 ` Karolina Stolarek
2025-02-19 2:50 ` Jon Pan-Doh
2025-02-24 11:28 ` Karolina Stolarek
2025-02-19 5:42 ` Jon Pan-Doh
2025-02-25 13:56 ` Karolina Stolarek
2025-02-28 22:28 ` Jon Pan-Doh
2025-03-03 8:31 ` Karolina Stolarek
2025-03-04 0:58 ` Jon Pan-Doh
2025-03-07 12:10 ` Karolina Stolarek
2025-03-07 19:41 ` Bjorn Helgaas
2025-02-14 2:35 ` [PATCH v2 8/8] PCI/AER: Update AER sysfs ABI filename Jon Pan-Doh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250307000220.GA382422@bhelgaas \
--to=helgaas@kernel.org \
--cc=Jonathan.Cameron@huawei.com \
--cc=anilagrawal@meta.com \
--cc=ben.fuller@oracle.com \
--cc=bhelgaas@google.com \
--cc=drewwalton@microsoft.com \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=karolina.stolarek@oracle.com \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=martin.petersen@oracle.com \
--cc=pandoh@google.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=tony.luck@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox