linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
To: Lukas Wunner <lukas@wunner.de>, Bjorn Helgaas <helgaas@kernel.org>
Cc: Riana Tauro <riana.tauro@intel.com>,
	Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>,
	"Sean C. Dardis" <sean.c.dardis@intel.com>,
	Terry Bowman <terry.bowman@amd.com>,
	Niklas Schnelle <schnelle@linux.ibm.com>,
	Linas Vepstas <linasvepstas@gmail.com>,
	Mahesh J Salgaonkar <mahesh@linux.ibm.com>,
	Oliver OHalloran <oohall@gmail.com>,
	Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>,
	linuxppc-dev@lists.ozlabs.org, linux-pci@vger.kernel.org,
	Shahed Shaikh <shshaikh@marvell.com>,
	Manish Chopra <manishc@marvell.com>,
	GR-Linux-NIC-Dev@marvell.com, Nilesh Javali <njavali@marvell.com>,
	GR-QLogic-Storage-Upstream@marvell.com,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>,
	"Martin K. Petersen" <martin.petersen@oracle.com>,
	linux-scsi@vger.kernel.org, Andrew Lunn <andrew+netdev@lunn.ch>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <"ku ba"@kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	netdev@vger.kernel.org
Subject: Re: [PATCH 4/5] PCI/ERR: Update device error_state already after reset
Date: Wed, 13 Aug 2025 16:43:39 -0700	[thread overview]
Message-ID: <004298f7-ae08-428e-9b98-995fc56e55b1@linux.intel.com> (raw)
In-Reply-To: <4517af6359ffb9d66152b827a5d2833459144e3f.1755008151.git.lukas@wunner.de>


On 8/12/25 10:11 PM, Lukas Wunner wrote:
> After a Fatal Error has been reported by a device and has been recovered
> through a Secondary Bus Reset, AER updates the device's error_state to
> pci_channel_io_normal before invoking its driver's ->resume() callback.
>
> By contrast, EEH updates the error_state earlier, namely after resetting
> the device and before invoking its driver's ->slot_reset() callback.
> Commit c58dc575f3c8 ("powerpc/pseries: Set error_state to
> pci_channel_io_normal in eeh_report_reset()") explains in great detail
> that the earlier invocation is necessitated by various drivers checking
> accessibility of the device with pci_channel_offline() and avoiding
> accesses if it returns true.  It returns true for any other error_state
> than pci_channel_io_normal.
>
> The device should be accessible already after reset, hence the reasoning
> is that it's safe to update the error_state immediately afterwards.
>
> This deviation between AER and EEH seems problematic because drivers
> behave differently depending on which error recovery mechanism the
> platform uses.  Three drivers have gone so far as to update the
> error_state themselves, presumably to work around AER's behavior.
>
> For consistency, amend AER to update the error_state at the same recovery
> steps as EEH.  Drop the now unnecessary workaround from the three drivers.
>
> Keep updating the error_state before ->resume() in case ->error_detected()
> or ->mmio_enabled() return PCI_ERS_RESULT_RECOVERED, which causes
> ->slot_reset() to be skipped.  There are drivers doing this even for Fatal
> Errors, e.g. mhi_pci_error_detected().
>
> Signed-off-by: Lukas Wunner <lukas@wunner.de>
> ---

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

>   drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c | 1 -
>   drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c    | 2 --
>   drivers/pci/pcie/err.c                              | 3 ++-
>   drivers/scsi/qla2xxx/qla_os.c                       | 5 -----
>   4 files changed, 2 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> index d7cdea8f604d..91e7b38143ea 100644
> --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_83xx_hw.c
> @@ -4215,7 +4215,6 @@ static pci_ers_result_t qlcnic_83xx_io_slot_reset(struct pci_dev *pdev)
>   	struct qlcnic_adapter *adapter = pci_get_drvdata(pdev);
>   	int err = 0;
>   
> -	pdev->error_state = pci_channel_io_normal;
>   	err = pci_enable_device(pdev);
>   	if (err)
>   		goto disconnect;
> diff --git a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> index 53cdd36c4123..e051d8c7a28d 100644
> --- a/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> +++ b/drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c
> @@ -3766,8 +3766,6 @@ static int qlcnic_attach_func(struct pci_dev *pdev)
>   	struct qlcnic_adapter *adapter = pci_get_drvdata(pdev);
>   	struct net_device *netdev = adapter->netdev;
>   
> -	pdev->error_state = pci_channel_io_normal;
> -
>   	err = pci_enable_device(pdev);
>   	if (err)
>   		return err;
> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
> index 930bb60fb761..bebe4bc111d7 100644
> --- a/drivers/pci/pcie/err.c
> +++ b/drivers/pci/pcie/err.c
> @@ -153,7 +153,8 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>   
>   	device_lock(&dev->dev);
>   	pdrv = dev->driver;
> -	if (!pdrv || !pdrv->err_handler || !pdrv->err_handler->slot_reset)
> +	if (!pci_dev_set_io_state(dev, pci_channel_io_normal) ||
> +	    !pdrv || !pdrv->err_handler || !pdrv->err_handler->slot_reset)
>   		goto out;
>   
>   	err_handler = pdrv->err_handler;
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index d4b484c0fd9d..4460421834cb 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -7883,11 +7883,6 @@ qla2xxx_pci_slot_reset(struct pci_dev *pdev)
>   	       "Slot Reset.\n");
>   
>   	ha->pci_error_state = QLA_PCI_SLOT_RESET;
> -	/* Workaround: qla2xxx driver which access hardware earlier
> -	 * needs error state to be pci_channel_io_online.
> -	 * Otherwise mailbox command timesout.
> -	 */
> -	pdev->error_state = pci_channel_io_normal;
>   
>   	pci_restore_state(pdev);
>   

-- 
Sathyanarayanan Kuppuswamy
Linux Kernel Developer


  reply	other threads:[~2025-08-13 23:43 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-13  5:11 [PATCH 0/5] PCI: Reduce AER / EEH deviations Lukas Wunner
2025-08-13  5:11 ` [PATCH 1/5] PCI/AER: Allow drivers to opt in to Bus Reset on Non-Fatal Errors Lukas Wunner
2025-08-13 23:01   ` Sathyanarayanan Kuppuswamy
2025-08-17 13:45     ` Lukas Wunner
2025-08-14  7:56   ` Niklas Schnelle
2025-08-14  9:36     ` Lukas Wunner
2025-08-14 19:29       ` Sathyanarayanan Kuppuswamy
2025-08-17 13:17         ` Lukas Wunner
2025-08-17 16:10           ` Sathyanarayanan Kuppuswamy
2025-08-14 20:31       ` Niklas Schnelle
2025-08-18 23:17     ` Linas Vepstas
2025-08-17 16:11   ` Sathyanarayanan Kuppuswamy
2025-08-13  5:11 ` [PATCH 2/5] PCI/ERR: Fix uevent on failure to recover Lukas Wunner
2025-08-13 23:01   ` Sathyanarayanan Kuppuswamy
2025-08-14  7:08   ` Niklas Schnelle
2025-08-13  5:11 ` [PATCH 3/5] PCI/ERR: Notify drivers " Lukas Wunner
2025-08-13 23:05   ` Sathyanarayanan Kuppuswamy
2025-08-13  5:11 ` [PATCH 4/5] PCI/ERR: Update device error_state already after reset Lukas Wunner
2025-08-13 23:43   ` Sathyanarayanan Kuppuswamy [this message]
2025-08-13  5:11 ` [PATCH 5/5] PCI/ERR: Remove remnants of .link_reset() callback Lukas Wunner
2025-08-14  0:40   ` Sathyanarayanan Kuppuswamy
2025-08-13 18:21 ` [PATCH 0/5] PCI: Reduce AER / EEH deviations Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=004298f7-ae08-428e-9b98-995fc56e55b1@linux.intel.com \
    --to=sathyanarayanan.kuppuswamy@linux.intel.com \
    --cc="ku ba"@kernel.org \
    --cc=GR-Linux-NIC-Dev@marvell.com \
    --cc=GR-QLogic-Storage-Upstream@marvell.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=andrew+netdev@lunn.ch \
    --cc=aravind.iddamsetty@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=helgaas@kernel.org \
    --cc=linasvepstas@gmail.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lukas@wunner.de \
    --cc=mahesh@linux.ibm.com \
    --cc=manishc@marvell.com \
    --cc=manivannan.sadhasivam@oss.qualcomm.com \
    --cc=martin.petersen@oracle.com \
    --cc=netdev@vger.kernel.org \
    --cc=njavali@marvell.com \
    --cc=oohall@gmail.com \
    --cc=pabeni@redhat.com \
    --cc=riana.tauro@intel.com \
    --cc=schnelle@linux.ibm.com \
    --cc=sean.c.dardis@intel.com \
    --cc=shshaikh@marvell.com \
    --cc=terry.bowman@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).