From: Grant Grundler <grundler@chromium.org>
To: Bjorn Helgaas <helgaas@kernel.org>
Cc: Grant Grundler <grundler@chromium.org>,
Rajat Jain <rajatja@chromium.org>,
Rajat Khandelwal <rajat.khandelwal@linux.intel.com>,
linux-pci@vger.kernel.org,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
linux-kernel@vger.kernel.org,
"Oliver O'Halloran" <oohall@gmail.com>,
Bjorn Helgaas <bhelgaas@google.com>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCHv2 pci-next 2/2] PCI/AER: Rate limit the reporting of the correctable errors
Date: Wed, 17 May 2023 14:02:43 -0700
Message-ID: <CANEJEGv8yxcYmrn4dsc0GCrcMGSFJNoJ=-VUvTjPLCVug+X29w@mail.gmail.com>
In-Reply-To: <ZGT6sTOtk+WY3aYt@bhelgaas>
On Wed, May 17, 2023 at 9:03 AM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Fri, Apr 07, 2023 at 04:46:03PM -0700, Grant Grundler wrote:
> > On Fri, Apr 7, 2023 at 12:46 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > On Fri, Apr 07, 2023 at 11:53:27AM -0700, Grant Grundler wrote:
> > > > > On Thu, Apr 6, 2023 at 12:50 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > > > > On Fri, Mar 17, 2023 at 10:51:09AM -0700, Grant Grundler wrote:
> > > > > > From: Rajat Khandelwal <rajat.khandelwal@linux.intel.com>
> > > > > >
> > > > > > There are many instances where correctable errors tend to inundate
> > > > > > the message buffer. We observe such instances during thunderbolt PCIe
> > > > > > tunneling.
> > > > ...
> > >
> > > > > > >         if (info->severity == AER_CORRECTABLE)
> > > > > > > -               pci_info(dev, " [%2d] %-22s%s\n", i, errmsg,
> > > > > > > -                       info->first_error == i ? " (First)" : "");
> > > > > > > +               pci_info_ratelimited(dev, " [%2d] %-22s%s\n", i, errmsg,
> > > > > > > +                       info->first_error == i ? " (First)" : "");
> > > > >
> > > > > I don't think this is going to reliably work the way we want. We have
> > > > > a bunch of pci_info_ratelimited() calls, and each caller has its own
> > > > > ratelimit_state data. Unless we call pci_info_ratelimited() exactly
> > > > > the same number of times for each error, the ratelimit counters will
> > > > > get out of sync and we'll end up printing fragments from error A mixed
> > > > > with fragments from error B.
> > > >
> > > > Ok - what I'm reading between the lines here is the output should be
> > > > > emitted in one step, not multiple pci_info_ratelimited() calls. If the
> > > > > code built an output string (using snprintf()), and then called
> > > > pci_info_ratelimited() exactly once at the bottom, would that be
> > > > sufficient?
> > > >
> > > > > I think we need to explicitly manage the ratelimiting ourselves,
> > > > > similar to print_hmi_event_info() or print_extlog_rcd(). Then we can
> > > > > have a *single* ratelimit_state, and we can check it once to determine
> > > > > whether to log this correctable error.
> > > >
> > > > Is the rate limiting per call location or per device? From above, I
> > > > understood rate limiting is "per call location". If the code only
> > > > has one call location, it should achieve the same goal, right?
> > >
> > > Rate-limiting is per call location, so yes, if we only have one call
> > > location, that would solve it. It would also have the nice property
> > > that all the output would be atomic so it wouldn't get mixed with
> > > other stuff, and it might encourage us to be a little less wordy in
> > > the output.
> > >
> >
> > +1 to all of those reasons. Especially reducing the number of lines output.
> >
> > I'm going to be out for the next week. If someone else (Rajat Khandelwal
> > maybe?) wants to rework this to use one call location it should be fairly
> > straightforward. If not, I'll tackle this when I'm back (in 2 weeks
> > essentially).
>
> Ping? Really hoping to merge this for v6.5.
Sorry - I forgot about this... I'll take a shot at it. Should have
something by this evening.
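
For reference, a minimal userspace sketch of the "one ratelimit state, one
output call" idea discussed above. This is not the patch and not kernel code:
struct rl_state, rl_allow(), format_report() and report_correctable() are
hypothetical names made up for illustration, and time is a plain tick counter.
In the kernel the equivalent would presumably be a single
DEFINE_RATELIMIT_STATE() checked once with __ratelimit(), as
print_hmi_event_info() does, but treat that mapping as an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's struct ratelimit_state. */
struct rl_state {
	long interval;	/* window length, in arbitrary ticks */
	int burst;	/* reports allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* reports emitted in the current window */
};

/* Return true if one more report may be emitted at time "now". */
static bool rl_allow(struct rl_state *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* Window expired: start a new one. */
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed >= rs->burst)
		return false;
	rs->printed++;
	return true;
}

/*
 * Build the whole report for one error into buf: one line per set bit in
 * status. Returns the length of the resulting string.
 */
static size_t format_report(char *buf, size_t len, unsigned int status,
			    int first_error)
{
	size_t off = 0;
	int i;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < 32; i++) {
		int n;

		if (!(status & (1u << i)))
			continue;
		n = snprintf(buf + off, len - off, " [%2d]%s\n", i,
			     first_error == i ? " (First)" : "");
		if (n < 0)
			break;
		if ((size_t)n >= len - off) {
			/* Output was truncated; buf is still NUL-terminated. */
			off = len - 1;
			break;
		}
		off += (size_t)n;
	}
	return off;
}

/*
 * One ratelimit check per error, made before any formatting. The report is
 * either produced whole or suppressed whole, so fragments of two different
 * errors can never interleave.
 */
static bool report_correctable(struct rl_state *rs, long now,
			       unsigned int status, int first_error,
			       char *out, size_t len)
{
	if (!rl_allow(rs, now))
		return false;
	format_report(out, len, status, first_error);
	return true;
}
```

A caller would hand the filled buffer to a single output call (fputs() here,
one pci_info() in the kernel). Because the counter lives in one place, it
cannot drift out of sync the way per-call-site counters can.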
cheers,
grant
>
> Bjorn