From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: linux-pci@vger.kernel.org, "Jon Pan-Doh" <pandoh@google.com>,
"Karolina Stolarek" <karolina.stolarek@oracle.com>,
"Weinan Liu" <wnliu@google.com>,
"Martin Petersen" <martin.petersen@oracle.com>,
"Ben Fuller" <ben.fuller@oracle.com>,
"Drew Walton" <drewwalton@microsoft.com>,
"Anil Agrawal" <anilagrawal@meta.com>,
"Tony Luck" <tony.luck@intel.com>,
"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
"Sathyanarayanan Kuppuswamy"
<sathyanarayanan.kuppuswamy@linux.intel.com>,
"Lukas Wunner" <lukas@wunner.de>,
"Sargun Dhillon" <sargun@meta.com>,
"Paul E . McKenney" <paulmck@kernel.org>,
"Mahesh J Salgaonkar" <mahesh@linux.ibm.com>,
"Oliver O'Halloran" <oohall@gmail.com>,
"Kai-Heng Feng" <kaihengf@nvidia.com>,
"Keith Busch" <kbusch@kernel.org>,
"Robert Richter" <rrichter@amd.com>,
"Terry Bowman" <terry.bowman@amd.com>,
"Shiju Jose" <shiju.jose@huawei.com>,
"Dave Jiang" <dave.jiang@intel.com>,
linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
"Bjorn Helgaas" <bhelgaas@google.com>
Subject: Re: [PATCH v7 15/17] PCI/AER: Ratelimit correctable and non-fatal error logging
Date: Wed, 21 May 2025 11:31:21 +0100 [thread overview]
Message-ID: <20250521113121.000067ce@huawei.com> (raw)
In-Reply-To: <20250520215047.1350603-16-helgaas@kernel.org>
On Tue, 20 May 2025 16:50:32 -0500
Bjorn Helgaas <helgaas@kernel.org> wrote:
> From: Jon Pan-Doh <pandoh@google.com>
>
> Spammy devices can flood kernel logs with AER errors and slow/stall
> execution. Add per-device ratelimits for AER correctable and non-fatal
> uncorrectable errors that use the kernel defaults (10 per 5s). Logging of
> fatal errors is not ratelimited.
See below. I'm not sure that logging of fatal error should affect the rate
for non fatal errors + the rate limit infrastructure kind of assumes
that you only call it if you are planning to respect it's decision.
Given overall aim is to restrict rates, maybe we don't care if we sometimes
throttle earlier that we might expect with a simpler separation of what
is being limited.
I don't mind strongly either way.
>
> There are two AER logging entry points:
>
> - aer_print_error() is used by DPC and native AER
>
> - pci_print_aer() is used by GHES and CXL
>
> The native AER aer_print_error() case includes a loop that may log details
> from multiple devices. This is ratelimited such that we log all the
> details we find if any of the devices has not hit the ratelimit. If no
> such device details are found, we still log the Error Source from the ERR_*
> Message, ratelimited by the Root Port or RCEC that received it.
>
> The DPC aer_print_error() case is not ratelimited, since this only happens
> for fatal errors.
>
> The CXL pci_print_aer() case is ratelimited by the Error Source device.
>
> The GHES pci_print_aer() case is via aer_recover_work_func(), which
> searches for the Error Source device. If the device is not found, there's
> no per-device ratelimit, so we use a system-wide ratelimit that covers all
> error types (correctable, non-fatal, and fatal).
>
> Sargun at Meta reported internally that a flood of AER errors causes RCU
> CPU stall warnings and CSD-lock warnings.
>
> Tested using aer-inject[1]. Sent 11 AER errors. Observed 10 errors logged
> while AER stats (cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) show
> true count of 11.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git
>
> [bhelgaas: commit log, factor out trace_aer_event() and aer_print_rp_info()
> changes to previous patches, collect single aer_err_info.ratelimit as union
> of ratelimits of all error source devices, don't ratelimit fatal errors,
> "aer_report" -> "aer_info"]
> Reported-by: Sargun Dhillon <sargun@meta.com>
> Signed-off-by: Jon Pan-Doh <pandoh@google.com>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
> drivers/pci/pci.h | 3 +-
> drivers/pci/pcie/aer.c | 66 ++++++++++++++++++++++++++++++++++++++----
> drivers/pci/pcie/dpc.c | 1 +
> 3 files changed, 64 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 705f9ef58acc..65c466279ade 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -593,7 +593,8 @@ struct aer_err_info {
> unsigned int id:16;
>
> unsigned int severity:2; /* 0:NONFATAL | 1:FATAL | 2:COR */
> - unsigned int __pad1:5;
> + unsigned int ratelimit:1; /* 0=skip, 1=print */
That naming is less than intuitive. Maybe expand it to ratelimit_print or
something like that.
> + unsigned int __pad1:4;
> unsigned int multi_error_valid:1;
>
> unsigned int first_error:5;
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 4f1bff0f000f..f9e684ac7878 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -815,8 +843,19 @@ EXPORT_SYMBOL_NS_GPL(pci_print_aer, "CXL");
> */
> static int add_error_device(struct aer_err_info *e_info, struct pci_dev *dev)
> {
> + /*
> + * Ratelimit AER log messages. "dev" is either the source
> + * identified by the root's Error Source ID or it has an unmasked
> + * error logged in its own AER Capability. If any of these devices
> + * has not reached its ratelimit, log messages for all of them.
> + * Messages are emitted when "e_info->ratelimit" is non-zero.
> + *
> + * Note that "e_info->ratelimit" was already initialized to 1 for the
> + * ERR_FATAL case.
> + */
> if (e_info->error_dev_num < AER_MAX_MULTI_ERR_DEVICES) {
> e_info->dev[e_info->error_dev_num] = pci_dev_get(dev);
> + e_info->ratelimit |= aer_ratelimit(dev, e_info->severity);
So this is a little odd. I think it works but there is code inside __ratelimit
that I think we should not be calling for that ERROR_FATAL case
(whether we should call lots of times for each device isn't obvious either
but maybe that is more valid).
In the event of it already being 1 due to ERROR_FATAL you will falsely trigger
a potential print from inside __ratelimit() if we were rate limited and no
longer are but only skipped FATAL prints. My concern is that function
is kind of assuming it's only called in cases where a rate limit decision
is being made and the implementation may change in future).
https://elixir.bootlin.com/linux/v6.14.7/source/lib/ratelimit.c#L56
Maybe,
if (!info->ratelimit)
e_info->ratelimit = aer_ratelimit(dev, e_info->severity);
is an alternative option.
That allows a multiplication factor on the rate as all device count for 1.
> e_info->error_dev_num++;
> return 0;
> }
> @@ -914,7 +953,7 @@ static int find_device_iter(struct pci_dev *dev, void *data)
> * e_info->error_dev_num and e_info->dev[], based on the given information.
> */
> static bool find_source_device(struct pci_dev *parent,
> - struct aer_err_info *e_info)
> + struct aer_err_info *e_info)
Unrelated change
> {
> struct pci_dev *dev = parent;
> int result;
> pci_aer_clear_fatal_status(pdev);
next prev parent reply other threads:[~2025-05-21 10:31 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-05-20 21:50 [PATCH v7 00/17] Rate limit AER logs Bjorn Helgaas
2025-05-20 21:50 ` [PATCH v7 01/17] PCI/DPC: Initialize aer_err_info before using it Bjorn Helgaas
2025-05-21 8:52 ` Jonathan Cameron
2025-05-21 19:18 ` Bjorn Helgaas
2025-05-21 10:06 ` Ilpo Järvinen
2025-05-20 21:50 ` [PATCH v7 02/17] PCI/DPC: Log Error Source ID only when valid Bjorn Helgaas
2025-05-21 9:00 ` Jonathan Cameron
2025-05-21 19:23 ` Bjorn Helgaas
2025-05-21 10:09 ` Ilpo Järvinen
2025-05-20 21:50 ` [PATCH v7 03/17] PCI/AER: Factor COR/UNCOR error handling out from aer_isr_one_error() Bjorn Helgaas
2025-05-20 22:26 ` Sathyanarayanan Kuppuswamy
2025-05-21 9:15 ` Jonathan Cameron
2025-05-21 10:12 ` Ilpo Järvinen
2025-05-20 21:50 ` [PATCH v7 04/17] PCI/AER: Consolidate Error Source ID logging in aer_isr_one_error_type() Bjorn Helgaas
2025-05-20 22:27 ` Sathyanarayanan Kuppuswamy
2025-05-21 9:20 ` Jonathan Cameron
2025-05-21 19:39 ` Bjorn Helgaas
2025-05-21 10:14 ` Ilpo Järvinen
2025-05-20 21:50 ` [PATCH v7 05/17] PCI/AER: Extract bus/dev/fn in aer_print_port_info() with PCI_BUS_NUM(), etc Bjorn Helgaas
2025-05-21 9:21 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 06/17] PCI/AER: Rename aer_print_port_info() to aer_print_source() Bjorn Helgaas
2025-05-21 9:22 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 07/17] PCI/AER: Move aer_print_source() earlier in file Bjorn Helgaas
2025-05-21 9:23 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 08/17] PCI/AER: Initialize aer_err_info before using it Bjorn Helgaas
2025-05-21 9:24 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 09/17] PCI/AER: Simplify pci_print_aer() Bjorn Helgaas
2025-05-20 22:29 ` Sathyanarayanan Kuppuswamy
2025-05-21 9:27 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 10/17] PCI/AER: Update statistics early in logging Bjorn Helgaas
2025-05-21 9:43 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 11/17] PCI/AER: Combine trace_aer_event() with statistics updates Bjorn Helgaas
2025-05-21 9:46 ` Jonathan Cameron
2025-05-21 17:12 ` Bjorn Helgaas
2025-05-20 21:50 ` [PATCH v7 12/17] PCI/AER: Check log level once and remember it Bjorn Helgaas
2025-05-21 9:52 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 13/17] PCI/AER: Make all pci_print_aer() log levels depend on error type Bjorn Helgaas
2025-05-21 9:56 ` Jonathan Cameron
2025-05-21 17:45 ` Bjorn Helgaas
2025-05-20 21:50 ` [PATCH v7 14/17] PCI/AER: Rename struct aer_stats to aer_info Bjorn Helgaas
2025-05-21 9:59 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 15/17] PCI/AER: Ratelimit correctable and non-fatal error logging Bjorn Helgaas
2025-05-20 22:33 ` Sathyanarayanan Kuppuswamy
2025-05-21 23:06 ` Bjorn Helgaas
2025-05-21 10:24 ` Ilpo Järvinen
2025-05-21 10:31 ` Jonathan Cameron [this message]
2025-05-21 22:54 ` Bjorn Helgaas
2025-05-22 11:17 ` Jonathan Cameron
2025-05-20 21:50 ` [PATCH v7 16/17] PCI/AER: Add ratelimits to PCI AER Documentation Bjorn Helgaas
2025-05-20 21:50 ` [PATCH v7 17/17] PCI/AER: Add sysfs attributes for log ratelimits Bjorn Helgaas
2025-05-20 22:35 ` Sathyanarayanan Kuppuswamy
2025-05-21 10:46 ` Jonathan Cameron
2025-05-21 11:05 ` Ilpo Järvinen
2025-05-21 22:59 ` Bjorn Helgaas
2025-05-22 11:21 ` Jonathan Cameron
2025-05-22 23:05 ` Bjorn Helgaas
2025-05-21 23:14 ` [PATCH v7 00/17] Rate limit AER logs Bjorn Helgaas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250521113121.000067ce@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=anilagrawal@meta.com \
--cc=ben.fuller@oracle.com \
--cc=bhelgaas@google.com \
--cc=dave.jiang@intel.com \
--cc=drewwalton@microsoft.com \
--cc=helgaas@kernel.org \
--cc=ilpo.jarvinen@linux.intel.com \
--cc=kaihengf@nvidia.com \
--cc=karolina.stolarek@oracle.com \
--cc=kbusch@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=lukas@wunner.de \
--cc=mahesh@linux.ibm.com \
--cc=martin.petersen@oracle.com \
--cc=oohall@gmail.com \
--cc=pandoh@google.com \
--cc=paulmck@kernel.org \
--cc=rrichter@amd.com \
--cc=sargun@meta.com \
--cc=sathyanarayanan.kuppuswamy@linux.intel.com \
--cc=shiju.jose@huawei.com \
--cc=terry.bowman@amd.com \
--cc=tony.luck@intel.com \
--cc=wnliu@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).