public inbox for linux-pci@vger.kernel.org
From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>, Bjorn Helgaas <helgaas@kernel.org>
Cc: Sizhe Liu <liusizhe5@huawei.com>,
	bhelgaas@google.com, jonathan.cameron@huawei.com,
	shiju.jose@huawei.com, keith.busch@intel.com,
	linux-pci@vger.kernel.org, linuxarm@huawei.com,
	prime.zeng@hisilicon.com, fanghao11@huawei.com,
	shenyang39@huawei.com, Shuai Xue <xueshuai@linux.alibaba.com>,
	Terry Bowman <terry.bowman@amd.com>
Subject: Re: [PATCH] PCI/ERR: Clear fatal status of the reporting device
Date: Fri, 27 Feb 2026 14:47:16 -0800
Message-ID: <0fdec25a-1ea5-4526-8809-eff71553067c@linux.intel.com>
In-Reply-To: <aaIJa_2XiQht4swq@wunner.de>

Hi Lukas,

On 2/27/2026 1:15 PM, Lukas Wunner wrote:
> On Fri, Feb 27, 2026 at 06:25:05PM +0800, Sizhe Liu wrote:
>> During PCIe native AER error recovery, ERR_FATAL status bits are not cleared
>> after fatal error handling. This causes stale ERR_FATAL bits to be reported
>> in subsequent AER events, even after reporting "device recovery successful".
> 
> Wrong.  The bits are cleared by:
> 
> report_slot_reset()
>   err_handler->slot_reset()
>     pci_restore_state()
>       pci_aer_clear_status()
>         pci_aer_raw_clear_status()

Thanks for the correction and for sharing the call flow. I was not aware
that the fatal status bits are already cleared via pci_restore_state().

That raises a question. If pci_restore_state() already clears all AER
status bits through pci_aer_raw_clear_status(), do we still need the
explicit pci_aer_clear_nonfatal_status() call in pcie_do_recovery()?
Similarly, could pcie_clear_device_status() also be moved there?
I see that calls to pcie_clear_device_status() are sprinkled across all
the error handling paths (EDR, DPC and AER).

Also, since pci_aer_raw_clear_status() clears all error status registers,
is there a risk of silently losing newly detected errors that arrive while
recovery is still in progress?
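
To make the question concrete, here is a minimal user-space sketch of a
write-1-to-clear (RW1C) status register, which is how the AER status
registers behave. It is only a model, not kernel code: struct rw1c_reg, the
rw1c_*() helpers, the two clear_by_*() strategies and the ERR_A/ERR_B bit
names are all hypothetical, and it assumes the clear is done as a read
followed by a write-back of the value that was read. The racing_err
argument simulates an error that the hardware logs in the middle of the
clear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical error bits; these are NOT the real AER bit positions. */
#define ERR_A (1u << 4)
#define ERR_B (1u << 14)

/* Hypothetical model of a write-1-to-clear (RW1C) status register. */
struct rw1c_reg {
	uint32_t status;
};

static uint32_t rw1c_read(const struct rw1c_reg *reg)
{
	return reg->status;
}

/* Writing 1 to a bit clears it; writing 0 leaves the bit untouched. */
static void rw1c_write(struct rw1c_reg *reg, uint32_t val)
{
	reg->status &= ~val;
}

/* Models the hardware logging one or more new errors. */
static void rw1c_log_error(struct rw1c_reg *reg, uint32_t bits)
{
	reg->status |= bits;
}

/*
 * Strategy 1 (assumed): read the register, then write back what was read.
 * racing_err is logged between the read and the write.
 * Returns the status left in the register afterwards.
 */
static uint32_t clear_by_writeback(struct rw1c_reg *reg, uint32_t racing_err)
{
	uint32_t seen = rw1c_read(reg);

	rw1c_log_error(reg, racing_err);
	rw1c_write(reg, seen);
	return rw1c_read(reg);
}

/*
 * Strategy 2 (for contrast): write all ones without looking first.
 * racing_err is logged just before the write.
 */
static uint32_t clear_by_all_ones(struct rw1c_reg *reg, uint32_t racing_err)
{
	rw1c_log_error(reg, racing_err);
	rw1c_write(reg, UINT32_MAX);
	return rw1c_read(reg);
}
```

In this model, a write-back clear preserves a racing error whose bit was
not set at read time, but a second occurrence of an already-set bit is
indistinguishable from the first and is cleared with it. A blind all-ones
write drops every racing error. Whether the kernel path behaves like either
strategy would need to be confirmed against the actual code.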


> 
> Thanks,
> 
> Lukas
> 

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer

