From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Christophe Leroy <christophe.leroy@csgroup.eu>,
Breno Leitao <leitao@debian.org>,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
Oliver O'Halloran <oohall@gmail.com>,
Bjorn Helgaas <bhelgaas@google.com>,
Jon Pan-Doh <pandoh@google.com>
Cc: linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org, kernel-team@meta.com,
stable@vger.kernel.org
Subject: Re: [PATCH RESEND] PCI/AER: Check for NULL aer_info before ratelimiting in pci_print_aer()
Date: Thu, 2 Oct 2025 11:10:34 -0700
Message-ID: <9c4e25e4-c6c7-4c56-ba0a-006b40e64d78@linux.intel.com>
In-Reply-To: <a63012d4-0c98-4022-8183-5a3488ca66e9@csgroup.eu>
On 10/2/25 03:06, Christophe Leroy wrote:
>
>
> On 29/09/2025 at 17:10, Sathyanarayanan Kuppuswamy wrote:
>>
>> On 9/29/25 2:15 AM, Breno Leitao wrote:
>>> Similarly to pci_dev_aer_stats_incr(), pci_print_aer() may be called
>>> when dev->aer_info is NULL. Add a NULL check to aer_ratelimit() so that
>>> it does not dereference a NULL aer_info pointer. In that case it returns
>>> 1, which means no rate limiting, given this is fatal.
>>>
>>> This prevents a kernel crash triggered by dereferencing a NULL pointer
>>> in aer_ratelimit(), ensuring safer handling of PCI devices that lack
>>> AER info. This change aligns pci_print_aer() with pci_dev_aer_stats_incr()
>>> which already performs this NULL check.
>>>
>>> Cc: stable@vger.kernel.org
>>> Fixes: a57f2bfb4a5863 ("PCI/AER: Ratelimit correctable and non-fatal error logging")
>>> Signed-off-by: Breno Leitao <leitao@debian.org>
>>> ---
>>> - This problem is still happening upstream, and unfortunately no action
>>> was taken in the previous discussion.
>>> - Link to previous post:
>>> https://lore.kernel.org/r/20250804-aer_crash_2-v1-1-fd06562c18a4@debian.org
>>> ---
>>
>> Although we haven't identified the path that triggers this issue, adding this check is harmless.
>
> Is it really harmless?
>
> The purpose of the function is to ratelimit logs. By returning 1 when dev->aer_info is NULL, it says: don't ratelimit. Doesn't that open the door to a denial of service by flooding the logs?

We skip rate limiting only when dev->aer_info is NULL, which happens for
devices without AER capability. In that case I think the trade-off is
reasonable: emitting extra log messages is better than crashing on a NULL
pointer dereference.

This is also consistent with other functions (for example, the stats
collection helpers) that already check aer_info before accessing it, so
extending the same safeguard here seems acceptable to me.
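
To make the trade-off concrete, here is a minimal user-space sketch of the
guard. It is not the kernel code: every mock_* name is hypothetical, and the
limiter is a simple burst counter with no time window, standing in for
__ratelimit() purely for illustration. It only shows that a device with
aer_info is throttled after its burst, while a device with a NULL aer_info
is never throttled and never dereferenced.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real layouts. */
struct mock_ratelimit {
	int burst;	/* messages allowed before suppression starts */
	int printed;	/* messages allowed so far */
};

struct mock_aer_info {
	struct mock_ratelimit nonfatal;
};

struct mock_dev {
	struct mock_aer_info *aer_info;	/* NULL: no AER state allocated */
};

/* Return 1 if the message may be printed, 0 if it must be suppressed. */
static int mock_ratelimit_ok(struct mock_ratelimit *rl)
{
	if (rl->printed >= rl->burst)
		return 0;
	rl->printed++;
	return 1;
}

/* Guarded lookup: a NULL aer_info means "print, do not rate limit". */
static int mock_aer_ratelimit(struct mock_dev *dev)
{
	if (!dev->aer_info)
		return 1;
	return mock_ratelimit_ok(&dev->aer_info->nonfatal);
}
```

With a burst of 2, the first two calls for a device that has aer_info
return 1 and later calls return 0. Every call for a device whose aer_info
is NULL returns 1, which is exactly the unthrottled behaviour being
questioned above.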
>
> Christophe
>
>>
>> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>>
>>
>>
>>> drivers/pci/pcie/aer.c | 3 +++
>>> 1 file changed, 3 insertions(+)
>>>
>>> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
>>> index e286c197d7167..55abc5e17b8b1 100644
>>> --- a/drivers/pci/pcie/aer.c
>>> +++ b/drivers/pci/pcie/aer.c
>>> @@ -786,6 +786,9 @@ static void pci_rootport_aer_stats_incr(struct pci_dev *pdev,
>>>  static int aer_ratelimit(struct pci_dev *dev, unsigned int severity)
>>>  {
>>> +	if (!dev->aer_info)
>>> +		return 1;
>>> +
>>>  	switch (severity) {
>>>  	case AER_NONFATAL:
>>>  		return __ratelimit(&dev->aer_info->nonfatal_ratelimit);
>>>
>>> ---
>>> base-commit: e5f0a698b34ed76002dc5cff3804a61c80233a7a
>>> change-id: 20250801-aer_crash_2-b21cc2ef0d00
>>>
>>> Best regards,
>>> --
>>> Breno Leitao <leitao@debian.org>
>>>
>
>
Thread overview: 8+ messages
2025-09-29 9:15 [PATCH RESEND] PCI/AER: Check for NULL aer_info before ratelimiting in pci_print_aer() Breno Leitao
2025-09-29 15:10 ` Sathyanarayanan Kuppuswamy
2025-10-02 10:06 ` Christophe Leroy
2025-10-02 18:10 ` Kuppuswamy Sathyanarayanan [this message]
2025-09-29 17:01 ` Christophe Leroy
2025-10-01 13:52 ` Breno Leitao
2025-10-01 21:36 ` Bjorn Helgaas
2025-10-02 9:10 ` Breno Leitao