Linux PCI subsystem development
 help / color / mirror / Atom feed
From: Shuai Xue <xueshuai@linux.alibaba.com>
To: Lukas Wunner <lukas@wunner.de>
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, bhelgaas@google.com,
	kbusch@kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com,
	mahesh@linux.ibm.com, oohall@gmail.com,
	Jonathan.Cameron@huawei.com, terry.bowman@amd.com,
	tianruidong@linux.alibaba.com
Subject: Re: [PATCH v6 3/5] PCI/AER: Report fatal errors of RCiEP and EP if link recoverd
Date: Mon, 20 Oct 2025 23:20:58 +0800	[thread overview]
Message-ID: <30fe11dd-3f21-4a61-adb0-74e39087c84c@linux.alibaba.com> (raw)
In-Reply-To: <aPZGNP79kJO74W4J@wunner.de>



在 2025/10/20 22:24, Lukas Wunner 写道:
> On Mon, Oct 20, 2025 at 10:17:10PM +0800, Shuai Xue wrote:
>> void aer_report_frozen_error(struct pci_dev *dev)
>> {
>>      struct aer_err_info info;
>>
>>      if (dev->pci_type != PCI_EXP_TYPE_ENDPOINT &&
>>          dev->pci_type != PCI_EXP_TYPE_RC_END)
>>          return;
>>
>>      aer_info_init(&info);
>>      aer_add_error_device(&info, dev);
>>      info.severity = AER_FATAL;
>>      if (aer_get_device_error_info(&info, 0, true))
>>          aer_print_error(&info, 0);
>>
>>      /* pci_dev_put() pairs with pci_dev_get() in aer_add_error_device() */
>>      pci_dev_put(dev);
>> }

Hi Lukas,
> 
> Much better.  Again, I think you don't need to rename add_error_device()
> and then the code comment even fits on the same line:

Good point. I'll keep the original function name and use the one-line
comment format.

> 
> 	pci_dev_put(dev);  /* pairs with pci_dev_get() in add_error_device() */
> 
>>>>     .slot_reset()
>>>>       => pci_restore_state()
>>>>         => pci_aer_clear_status()
>>>
>>> This was added in 2015 by b07461a8e45b.  The commit claims that
>>> the errors are stale and can be ignored.  It turns out they cannot.
>>>
>>> So maybe pci_restore_state() should print information about the
>>> errors before clearing them?
>>
>> While that could work, we would lose the error severity information at
> 
> Wait, we've got that saved in pci_cap_saved_state, so we could restore
> the severity register, report leftover errors, then clear those errors?

You're right that the severity register is also sticky, so we could
retrieve error severity directly from AER registers.

However, I have concerns about implementing this approach:

1. Scope creep: The current implementation is explicit - we know we're
in an AER error recovery path and intentionally report the missed fatal
errors. Your suggestion would make error reporting implicit during any
device state restoration.

2. Wider impact: pci_save_state() and pci_restore_state() are used
extensively beyond AER error handling - in power management, reset
operations, and various driver flows. Adding error reporting to
pci_restore_state() could generate unexpected error messages in
non-error scenarios.

3. Architectural consistency: As you noted earlier, "pci_restore_state()
is only supposed to restore state, as the name implies, and not clear
errors." Adding error reporting to this function would further violate
this principle - we'd be making it do even more than just restore state.

Would you prefer I implement this broader change, or shall we proceed
with the targeted helper function approach for now? The helper function
solves the immediate problem while keeping the changes focused on the
AER recovery path.

> Thanks,
> 
> Lukas

Thanks.
Shuai

  reply	other threads:[~2025-10-20 15:21 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-15  2:41 [PATCH v6 0/5] PCI/AER: Report fatal errors of RCiEP and EP if link recoverd Shuai Xue
2025-10-15  2:41 ` [PATCH v6 1/5] PCI/DPC: Clarify naming for error port in DPC Handling Shuai Xue
2025-10-15  2:41 ` [PATCH v6 2/5] PCI/DPC: Run recovery on device that detected the error Shuai Xue
2025-10-15  2:41 ` [PATCH v6 3/5] PCI/AER: Report fatal errors of RCiEP and EP if link recoverd Shuai Xue
2025-10-20 10:10   ` Lukas Wunner
2025-10-20 12:58     ` Shuai Xue
2025-10-20 13:54       ` Lukas Wunner
2025-10-20 14:17         ` Shuai Xue
2025-10-20 14:24           ` Lukas Wunner
2025-10-20 15:20             ` Shuai Xue [this message]
2025-10-23 10:48               ` Lukas Wunner
2025-10-24  6:43                 ` Shuai Xue
2025-12-16  8:07                   ` Shuai Xue
2025-10-20 18:38   ` Kuppuswamy Sathyanarayanan
2025-10-21  1:51     ` Shuai Xue
2025-10-15  2:41 ` [PATCH v6 4/5] PCI/ERR: Use pcie_aer_is_native() to check for native AER control Shuai Xue
2025-10-20 10:17   ` Lukas Wunner
2025-10-20 13:09     ` Shuai Xue
2025-10-20 13:58       ` Lukas Wunner
2025-10-20 14:45         ` Shuai Xue
2025-10-23 10:29           ` Lukas Wunner
2025-10-24  3:09             ` Shuai Xue
2025-10-24  3:14               ` Lukas Wunner
2025-10-24  3:38                 ` Shuai Xue
2025-10-24  4:03                   ` Lukas Wunner
2025-10-24  5:37                     ` Shuai Xue
2025-10-20 18:43   ` Kuppuswamy Sathyanarayanan
2025-10-15  2:41 ` [PATCH v6 5/5] PCI/AER: Clear both AER fatal and non-fatal status Shuai Xue
2025-10-20 18:44   ` Kuppuswamy Sathyanarayanan
2025-10-21  1:33     ` Shuai Xue

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=30fe11dd-3f21-4a61-adb0-74e39087c84c@linux.alibaba.com \
    --to=xueshuai@linux.alibaba.com \
    --cc=Jonathan.Cameron@huawei.com \
    --cc=bhelgaas@google.com \
    --cc=kbusch@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lukas@wunner.de \
    --cc=mahesh@linux.ibm.com \
    --cc=oohall@gmail.com \
    --cc=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc=terry.bowman@amd.com \
    --cc=tianruidong@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox