From: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>, Bjorn Helgaas <helgaas@kernel.org>
Cc: Riana Tauro <riana.tauro@intel.com>,
Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>,
"Sean C. Dardis" <sean.c.dardis@intel.com>,
Terry Bowman <terry.bowman@amd.com>,
Niklas Schnelle <schnelle@linux.ibm.com>,
Linas Vepstas <linasvepstas@gmail.com>,
Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
Oliver OHalloran <oohall@gmail.com>,
Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>,
linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org,
Shahed Shaikh <shshaikh@marvell.com>,
Manish Chopra <manishc@marvell.com>,
GR-Linux-NIC-Dev@marvell.com, Nilesh Javali <njavali@marvell.com>,
GR-QLogic-Storage-Upstream@marvell.com,
"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
"Martin K. Petersen" <martin.petersen@oracle.com>,
linux-scsi@vger.kernel.org, Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>,
Paolo Abeni <pabeni@redhat.com>,
netdev@vger.kernel.org
Subject: Re: [PATCH 4/5] PCI/ERR: Update device error_state already after reset
Date: Wed, 13 Aug 2025 16:43:39 -0700
Message-ID: <004298f7-ae08-428e-9b98-995fc56e55b1@linux.intel.com>
In-Reply-To: <4517af6359ffb9d66152b827a5d2833459144e3f.1755008151.git.lukas@wunner.de>
On 8/12/25 10:11 PM, Lukas Wunner wrote:
> After a Fatal Error has been reported by a device and has been recovered
> through a Secondary Bus Reset, AER updates the device's error_state to
> pci_channel_io_normal before invoking its driver's ->resume() callback.
>
> By contrast, EEH updates the error_state earlier, namely after resetting
> the device and before invoking its driver's ->slot_reset() callback.
> Commit c58dc575f3c8 ("powerpc/pseries: Set error_state to
> pci_channel_io_normal in eeh_report_reset()") explains in great detail
> that the earlier invocation is necessitated by various drivers checking
> accessibility of the device with pci_channel_offline() and avoiding
> accesses if it returns true. It returns true for any other error_state
> than pci_channel_io_normal.
>
> The device should be accessible already after reset, hence the reasoning
> is that it's safe to update the error_state immediately afterwards.
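To make the ordering issue described above concrete, here is a minimal
userspace sketch. It is not kernel code: the model_* names, the struct
and the enum are hypothetical stand-ins, and the only rule taken from
the commit message is that the device counts as offline for any
error_state other than "normal". It shows why a driver that checks
pci_channel_offline() in ->slot_reset() touches the hardware only if
the state was updated beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's pci_channel_state_t values. */
enum model_channel_state {
	MODEL_IO_NORMAL = 1,
	MODEL_IO_FROZEN,
	MODEL_IO_PERM_FAILURE,
};

struct model_dev {
	enum model_channel_state error_state;
	int mmio_accesses;	/* counts accesses the driver actually made */
};

/* Mirrors the rule quoted above: anything but "normal" counts as offline. */
static bool model_channel_offline(const struct model_dev *dev)
{
	assert(dev != NULL);
	return dev->error_state != MODEL_IO_NORMAL;
}

/* Hypothetical driver callback that skips hardware access while offline. */
static bool model_driver_slot_reset(struct model_dev *dev)
{
	if (model_channel_offline(dev))
		return false;	/* e.g. a mailbox command would be skipped */
	dev->mmio_accesses++;
	return true;
}

/* Run slot_reset with the state updated late (old AER) or early (EEH). */
static bool model_recover(struct model_dev *dev, bool update_before_slot_reset)
{
	bool ok;

	dev->error_state = MODEL_IO_FROZEN;	/* Fatal Error reported */
	/* ... the Secondary Bus Reset would happen here ... */
	if (update_before_slot_reset)
		dev->error_state = MODEL_IO_NORMAL;
	ok = model_driver_slot_reset(dev);
	dev->error_state = MODEL_IO_NORMAL;	/* still set before ->resume() */
	return ok;
}
```

Calling model_recover(&dev, false) models the old AER ordering, where the
callback sees a frozen device and makes no access; passing true models
the EEH ordering that this patch adopts.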
>
> This deviation between AER and EEH seems problematic because drivers
> behave differently depending on which error recovery mechanism the
> platform uses. Three drivers have gone so far as to update the
> error_state themselves, presumably to work around AER's behavior.
>
> For consistency, amend AER to update the error_state at the same recovery
> steps as EEH. Drop the now unnecessary workaround from the three drivers.
>
> Keep updating the error_state before ->resume() in case ->error_detected()
> or ->mmio_enabled() return PCI_ERS_RESULT_RECOVERED, which causes
> ->slot_reset() to be skipped. There are drivers doing this even for Fatal
> Errors, e.g. mhi_pci_error_detected().
>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 1 -
>  drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c    | 2 --
>  drivers/pci/pcie/err.c                              | 3 ++-
>  drivers/scsi/qla2xxx/qla_os.c                       | 5 -----
>  4 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> index d7cdea8f604d..91e7b38143ea 100644
> --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> @@ -4215,7 +4215,6 @@ static pci_ers_result_t qlcnic_83xx_io_slot_reset(struct pci_dev *pdev)
>  	struct qlcnic_adapter *adapter = pci_get_drvdata(pdev);
>  	int err = 0;
>
> -	pdev->error_state = pci_channel_io_normal;
>  	err = pci_enable_device(pdev);
>  	if (err)
>  		goto disconnect;
> diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> index 53cdd36c4123..e051d8c7a28d 100644
> --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> @@ -3766,8 +3766,6 @@ static int qlcnic_attach_func(struct pci_dev *pdev)
>  	struct qlcnic_adapter *adapter = pci_get_drvdata(pdev);
>  	struct net_device *netdev = adapter->netdev;
>
> -	pdev->error_state = pci_channel_io_normal;
> -
>  	err = pci_enable_device(pdev);
>  	if (err)
>  		return err;
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 930bb60fb761..bebe4bc111d7 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -153,7 +153,8 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>
>  	device_lock(&dev->dev);
>  	pdrv = dev->driver;
> -	if (!pdrv || !pdrv->err_handler || !pdrv->err_handler->slot_reset)
> +	if (!pci_dev_set_io_state(dev, pci_channel_io_normal) ||
> +	    !pdrv || !pdrv->err_handler || !pdrv->err_handler->slot_reset)
>  		goto out;
>
>  	err_handler = pdrv->err_handler;
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index d4b484c0fd9d..4460421834cb 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -7883,11 +7883,6 @@ qla2xxx_pci_slot_reset(struct pci_dev *pdev)
> "Slot Reset.\n");
>
>  	ha->pci_error_state = QLA_PCI_SLOT_RESET;
> -	/* Workaround: qla2xxx driver which access hardware earlier
> -	 * needs error state to be pci_channel_io_online.
> -	 * Otherwise mailbox command timesout.
> -	 */
> -	pdev->error_state = pci_channel_io_normal;
>
>  	pci_restore_state(pdev);
>
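
For readers following the err.c hunk, the evaluation order of the new
condition can be sketched in userspace as below. This is a simplified,
single-threaded model, not the kernel implementation: the sketch_*
names are hypothetical, and the assumption that the state helper
refuses the transition only for a permanently failed device is mine,
not something stated in the patch. The point it illustrates is that the
state update is evaluated first, so it also happens for devices whose
driver has no ->slot_reset() callback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum sketch_state { SKETCH_NORMAL = 1, SKETCH_FROZEN, SKETCH_PERM_FAILURE };

struct sketch_dev {
	enum sketch_state error_state;
	bool has_slot_reset;	/* driver provides ->slot_reset() */
	int slot_reset_calls;
};

/*
 * Simplified stand-in for pci_dev_set_io_state(dev, pci_channel_io_normal).
 * Assumption: the transition is refused once the device has permanently
 * failed, and succeeds otherwise.
 */
static bool sketch_set_io_normal(struct sketch_dev *dev)
{
	assert(dev != NULL);
	if (dev->error_state == SKETCH_PERM_FAILURE)
		return false;
	dev->error_state = SKETCH_NORMAL;
	return true;
}

/* Same short-circuit order as the patched condition in report_slot_reset(). */
static void sketch_report_slot_reset(struct sketch_dev *dev)
{
	if (!sketch_set_io_normal(dev) || !dev->has_slot_reset)
		return;		/* corresponds to "goto out" */
	dev->slot_reset_calls++;	/* corresponds to calling ->slot_reset() */
}
```

With a frozen device and no callback, the state still ends up normal and
no callback runs; with a permanently failed device, neither the state
nor the callback counter changes.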
--
Sathyanarayanan Kuppuswamy
Linux Kernel Developer