Linux PCI subsystem development
 help / color / mirror / Atom feed
From: Karolina Stolarek <karolina.stolarek@oracle.com>
To: Jon Pan-Doh <pandoh@google.com>, Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-pci@vger.kernel.org,
	"Martin Petersen" <martin.petersen@oracle.com>,
	"Ben Fuller" <ben.fuller@oracle.com>,
	"Drew Walton" <drewwalton@microsoft.com>,
	"Anil Agrawal" <anilagrawal@meta.com>,
	"Tony Luck" <tony.luck@intel.com>,
	"Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>,
	"Sathyanarayanan Kuppuswamy"
	<sathyanarayanan.kuppuswamy@linux.intel.com>,
	"Lukas Wunner" <lukas@wunner.de>,
	"Jonathan Cameron" <Jonathan.Cameron@huawei.com>
Subject: Re: [PATCH v2 7/8] PCI/AER: Add AER sysfs attributes for log ratelimits
Date: Mon, 17 Feb 2025 14:31:25 +0100	[thread overview]
Message-ID: <38451a01-4e95-44ff-922b-8fda725eb25b@oracle.com> (raw)
In-Reply-To: <20250214023543.992372-8-pandoh@google.com>

On 14/02/2025 03:35, Jon Pan-Doh wrote:
> Allow userspace to read/write log ratelimits per device. Create aer/ sysfs
> directory to store them and any future aer configs.

I don't think it's neccessary to keep ratelimits in a separate directory 
when we decided to we keep the rest of AER attributes at the dev level.

> 
> Tested using aer-inject[1]. Configured correctable log ratelimits to 5.
> Sent 6 AER errors. Observed 5 errors logged while AER stats
> (cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) shows 6.
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git
> 
> Signed-off-by: Jon Pan-Doh <pandoh@google.com>
> ---
>   .../testing/sysfs-bus-pci-devices-aer_stats   | 20 +++++++
>   Documentation/PCI/pcieaer-howto.rst           |  3 ++
>   drivers/pci/pci-sysfs.c                       |  1 +
>   drivers/pci/pci.h                             |  1 +
>   drivers/pci/pcie/aer.c                        | 52 +++++++++++++++++++
>   5 files changed, 77 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats b/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> index d1f67bb81d5d..c1221614c079 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> +++ b/Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats
> @@ -117,3 +117,23 @@ Date:		July 2018
>   KernelVersion:	4.19.0
>   Contact:	linux-pci@vger.kernel.org, rajatja@google.com
>   Description:	Total number of ERR_NONFATAL messages reported to rootport.
> +
> +PCIe AER ratelimits
> +-------------------
> +
> +These attributes show up under all the devices that are AER capable.
> +Provides configurable ratelimits of logs/irq per error type. Writing a

This sentence seems to refer to _one_ attribute and mentions IRQ 
ratelimiting, which is not a part of the series.

How about rephrasing this section to say that the attributes allow 
configuration of the rate at which AER errors are reported, with each of 
them dedicated to one error type (correctable or uncorrectable)? 
Something along these lines.

> +nonzero value changes the number of errors (burst) allowed per 5 second
> +window before ratelimiting. Reading gets the current ratelimits.
> +
> +What:		/sys/bus/pci/devices/<dev>/aer/ratelimit_cor_log
> +Date:		March 2025
> +KernelVersion:	6.15.0
> +Contact:	linux-pci@vger.kernel.org, pandoh@google.com
> +Description:	Log ratelimit for correctable errors.
> +
> +What:		/sys/bus/pci/devices/<dev>/aer/ratelimit_uncor_log
> +Date:		March 2025
> +KernelVersion:	6.15.0
> +Contact:	linux-pci@vger.kernel.org, pandoh@google.com
> +Description:	Log ratelimit for uncorrectable errors.
> diff --git a/Documentation/PCI/pcieaer-howto.rst b/Documentation/PCI/pcieaer-howto.rst
> index 167c0b277b62..ab5b0f232204 100644
> --- a/Documentation/PCI/pcieaer-howto.rst
> +++ b/Documentation/PCI/pcieaer-howto.rst
> @@ -93,6 +93,9 @@ spammy devices from flooding the console and stalling execution. Set the
>   default ratelimit to DEFAULT_RATELIMIT_BURST over
>   DEFAULT_RATELIMIT_INTERVAL (10 per 5 seconds).
>   
> +Ratelimits are exposed in the form of sysfs attributes and configurable.
> +See Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats.
> +
>   AER Statistics / Counters
>   -------------------------
>   
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index b46ce1a2c554..16de3093294e 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1801,6 +1801,7 @@ const struct attribute_group *pci_dev_attr_groups[] = {
>   	&pcie_dev_attr_group,
>   #ifdef CONFIG_PCIEAER
>   	&aer_stats_attr_group,
> +	&aer_attr_group,
>   #endif
>   #ifdef CONFIG_PCIEASPM
>   	&aspm_ctrl_attr_group,
> diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> index 26104aee06c0..26d30a99c48b 100644
> --- a/drivers/pci/pci.h
> +++ b/drivers/pci/pci.h
> @@ -887,6 +887,7 @@ void pci_no_aer(void);
>   void pci_aer_init(struct pci_dev *dev);
>   void pci_aer_exit(struct pci_dev *dev);
>   extern const struct attribute_group aer_stats_attr_group;
> +extern const struct attribute_group aer_attr_group;
>   void pci_aer_clear_fatal_status(struct pci_dev *dev);
>   int pci_aer_clear_status(struct pci_dev *dev);
>   int pci_aer_raw_clear_status(struct pci_dev *dev);
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index c5b5381e2930..1237faee6542 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -626,6 +626,58 @@ const struct attribute_group aer_stats_attr_group = {
>   	.is_visible = aer_stats_attrs_are_visible,
>   };
>   
> +#define aer_ratelimit_attr(name, ratelimit)				\
> +	static ssize_t							\
> +	name##_show(struct device *dev, struct device_attribute *attr,	\
> +		    char *buf)						\
> +{									\
> +	struct pci_dev *pdev = to_pci_dev(dev);				\
> +	return sysfs_emit(buf, "%u errors every %u seconds\n",		\

The convention is that sysfs files should provide one value per file. It 
won't be just people interacting with this interface, but scripts as 
well. Parsing such a string is a hassle. As we can only change the burst 
of the ratelimit, I'd simply emit pdev->aer_report->ratelimit.burst.

All the best,
Karolina

> +			  pdev->aer_report->ratelimit.burst,		\
> +			  pdev->aer_report->ratelimit.interval / HZ);	\
> +}									\
> +									\
> +	static ssize_t							\
> +	name##_store(struct device *dev, struct device_attribute *attr,	\
> +		     const char *buf, size_t count)			\
> +{									\
> +	struct pci_dev *pdev = to_pci_dev(dev);				\
> +	int burst;							\
> +									\
> +	if (kstrtoint(buf, 0, &burst) < 0)				\
> +		return -EINVAL;						\
> +									\
> +	pdev->aer_report->ratelimit.burst = burst;			\
> +	return count;							\
> +}									\
> +static DEVICE_ATTR_RW(name)
> +
> +aer_ratelimit_attr(ratelimit_cor_log, cor_log_ratelimit);
> +aer_ratelimit_attr(ratelimit_uncor_log, uncor_log_ratelimit);
> +
> +static struct attribute *aer_attrs[] __ro_after_init = {
> +	&dev_attr_ratelimit_cor_log.attr,
> +	&dev_attr_ratelimit_uncor_log.attr,
> +	NULL
> +};
> +
> +static umode_t aer_attrs_are_visible(struct kobject *kobj,
> +				     struct attribute *a, int n)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	if (!pdev->aer_report)
> +		return 0;
> +	return a->mode;
> +}
> +
> +const struct attribute_group aer_attr_group = {
> +	.name = "aer",
> +	.attrs = aer_attrs,
> +	.is_visible = aer_attrs_are_visible,
> +};
> +
>   void pci_dev_aer_stats_incr(struct pci_dev *pdev, struct aer_err_info *info)
>   {
>   	unsigned long status = info->status & ~info->mask;


  reply	other threads:[~2025-02-17 13:31 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-14  2:35 [PATCH v2 0/8] Rate limit AER logs Jon Pan-Doh
2025-02-14  2:35 ` [PATCH v2 1/8] PCI/AER: Remove aer_print_port_info Jon Pan-Doh
2025-03-04 18:32   ` Bjorn Helgaas
2025-03-05  1:04     ` Jon Pan-Doh
2025-03-05 22:35       ` Bjorn Helgaas
2025-03-06  1:32         ` Jon Pan-Doh
2025-03-07  0:02           ` Bjorn Helgaas
2025-02-14  2:35 ` [PATCH v2 2/8] PCI/AER: Use the same log level for all messages Jon Pan-Doh
2025-02-17 11:25   ` Karolina Stolarek
2025-02-19  2:48     ` Jon Pan-Doh
2025-02-24 11:26       ` Karolina Stolarek
2025-02-28 22:25         ` Jon Pan-Doh
2025-03-04 18:59   ` Bjorn Helgaas
2025-03-07 12:04     ` Karolina Stolarek
2025-03-13 21:15       ` Jon Pan-Doh
2025-02-14  2:35 ` [PATCH v2 3/8] PCI/AER: Move AER stat collection out of __aer_print_error Jon Pan-Doh
2025-02-17 11:29   ` Karolina Stolarek
2025-02-19  2:48     ` Jon Pan-Doh
2025-02-24 11:26       ` Karolina Stolarek
2025-02-14  2:35 ` [PATCH v2 4/8] PCI/AER: Rename struct aer_stats to aer_report Jon Pan-Doh
2025-02-17 11:29   ` Karolina Stolarek
2025-02-19  2:49     ` Jon Pan-Doh
2025-02-14  2:35 ` [PATCH v2 5/8] PCI/AER: Introduce ratelimit for error logs Jon Pan-Doh
2025-02-17 11:29   ` Karolina Stolarek
2025-02-19  2:49     ` Jon Pan-Doh
2025-02-14  2:35 ` [PATCH v2 6/8] PCI/AER: Add ratelimits to PCI AER Documentation Jon Pan-Doh
2025-02-17 11:30   ` Karolina Stolarek
2025-02-14  2:35 ` [PATCH v2 7/8] PCI/AER: Add AER sysfs attributes for log ratelimits Jon Pan-Doh
2025-02-17 13:31   ` Karolina Stolarek [this message]
2025-02-19  2:50     ` Jon Pan-Doh
2025-02-24 11:28       ` Karolina Stolarek
2025-02-19  5:42     ` Jon Pan-Doh
2025-02-25 13:56   ` Karolina Stolarek
2025-02-28 22:28     ` Jon Pan-Doh
2025-03-03  8:31       ` Karolina Stolarek
2025-03-04  0:58         ` Jon Pan-Doh
2025-03-07 12:10           ` Karolina Stolarek
2025-03-07 19:41             ` Bjorn Helgaas
2025-02-14  2:35 ` [PATCH v2 8/8] PCI/AER: Update AER sysfs ABI filename Jon Pan-Doh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=38451a01-4e95-44ff-922b-8fda725eb25b@oracle.com \
    --to=karolina.stolarek@oracle.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=anilagrawal@meta.com \
    --cc=ben.fuller@oracle.com \
    --cc=bhelgaas@google.com \
    --cc=drewwalton@microsoft.com \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=martin.petersen@oracle.com \
    --cc=pandoh@google.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=tony.luck@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox