From: Jon Pan-Doh <pandoh@google.com>
To: karolina.stolarek@oracle.com
Cc: ben.fuller@oracle.com, bhelgaas@google.com,
linux-pci@vger.kernel.org, martin.petersen@oracle.com
Subject: [PATCH RESEND 0/4] Rate limit reporting of Correctable Errors
Date: Tue, 14 Jan 2025 23:55:53 -0800
Message-ID: <20250115075553.3518103-1-pandoh@google.com>
In-Reply-To: <cover.1736341506.git.karolina.stolarek@oracle.com>
On Wed, 8 Jan 2025 13:55:30 +0000
Karolina Stolarek <karolina.stolarek@oracle.com> wrote:
> TL;DR
> ====
>
> We are getting multiple reports about excessive logging of Correctable
> Errors with no clear common root cause. As these errors are already
> corrected by hardware, it makes sense to limit them. Introduce
> a ratelimit state definition to pci_dev to control the number of
> messages reported by a Root Port within a specified time interval.
> The series adds other improvements in the area, as outlined in the
> Proposal section.
Hi Karolina,
This is a common impediment for many folks who want to enable AER. The
excessive logging stalls execution, making machines unusable. I've been
working on a solution[1] similar to yours (i.e. ratelimiting), with a few
differences:
- ratelimit uncorrectable errors
- ratelimit IRQs
- configure ratelimits from userspace (sysfs knobs)
I'm hoping we can collaborate on a solution (i.e. take the best parts of
both patch series).
Thanks,
Jon
[1] https://lore.kernel.org/linux-pci/20250115074301.3514927-1-pandoh@google.com/
Thread overview: 10+ messages
2025-01-08 13:55 [PATCH RESEND 0/4] Rate limit reporting of Correctable Errors Karolina Stolarek
2025-01-08 13:55 ` [PATCH RESEND 1/4] PCI/AER: Use the same log level for all messages Karolina Stolarek
2025-01-15 7:50 ` Jon Pan-Doh
2025-01-08 13:55 ` [PATCH RESEND 2/4] PCI/AER: Add Correctable Errors rate limiting Karolina Stolarek
2025-01-15 7:52 ` Jon Pan-Doh
2025-01-08 13:55 ` [PATCH RESEND 3/4] PCI/AER: Increase the rate limit interval after threshold Karolina Stolarek
2025-01-08 13:55 ` [PATCH RESEND 4/4] PCI: Add 'cor_err_reporting_enable' attribute Karolina Stolarek
2025-01-15 7:55 ` Jon Pan-Doh [this message]
2025-01-15 14:18 ` [PATCH RESEND 0/4] Rate limit reporting of Correctable Errors Karolina Stolarek
2025-01-17 2:51 ` Jon Pan-Doh