From: poza@codeaurora.org
To: Thomas Tai <thomas.tai@oracle.com>
Cc: bhelgaas@google.com, keith.busch@intel.com,
linux-pci@vger.kernel.org, linux-pci-owner@vger.kernel.org
Subject: Re: [PATCH 1/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed
Date: Tue, 14 Aug 2018 14:52:29 +0530 [thread overview]
Message-ID: <b0104a716319d76e734c307cb4bedd9d@codeaurora.org> (raw)
In-Reply-To: <51f4b387d9bd96a42d526a6a029fc43b@codeaurora.org>
On 2018-08-14 14:46, poza@codeaurora.org wrote:
> On 2018-08-13 22:21, Thomas Tai wrote:
>> In order to prevent the pcie_do_fatal_recovery() from
>> using the device after it is removed, the device's
>> domain:bus:devfn is stored at the entry of
>> pcie_do_fatal_recovery(). After rescanning the bus, the
>> stored domain:bus:devfn is used to find the device and
>> uses to report pci_info. The original issue only happens
>> on an non-bridge device, a local variable is used instead
>> of checking the device's header type.
>>
>> Signed-off-by: Thomas Tai <thomas.tai@oracle.com>
>> ---
>> drivers/pci/pcie/err.c | 33 +++++++++++++++++++++++----------
>> 1 file changed, 23 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
>> index f02e334..3414445 100644
>> --- a/drivers/pci/pcie/err.c
>> +++ b/drivers/pci/pcie/err.c
>> @@ -287,15 +287,20 @@ void pcie_do_fatal_recovery(struct pci_dev *dev,
>> u32 service)
>> struct pci_bus *parent;
>> struct pci_dev *pdev, *temp;
>> pci_ers_result_t result;
>> + bool is_bridge_device = false;
>> + u16 domain = pci_domain_nr(dev->bus);
>> + u8 bus = dev->bus->number;
>> + u8 devfn = dev->devfn;
>>
>> - if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)
>> + if (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE) {
>> + is_bridge_device = true;
>> udev = dev;
>> - else
>> + } else {
>> udev = dev->bus->self;
>> + }
>>
>> parent = udev->subordinate;
>> pci_lock_rescan_remove();
>> - pci_dev_get(dev);
>> list_for_each_entry_safe_reverse(pdev, temp, &parent->devices,
>> bus_list) {
>> pci_dev_get(pdev);
>> @@ -309,27 +314,35 @@ void pcie_do_fatal_recovery(struct pci_dev *dev,
>> u32 service)
>>
>> result = reset_link(udev, service);
>>
>> - if ((service == PCIE_PORT_SERVICE_AER) &&
>> - (dev->hdr_type == PCI_HEADER_TYPE_BRIDGE)) {
>> + if (service == PCIE_PORT_SERVICE_AER && is_bridge_device) {
>> /*
>> * If the error is reported by a bridge, we think this error
>> * is related to the downstream link of the bridge, so we
>> * do error recovery on all subordinates of the bridge instead
>> * of the bridge and clear the error status of the bridge.
>> */
>> - pci_cleanup_aer_uncorrect_error_status(dev);
>> + pci_cleanup_aer_uncorrect_error_status(udev);
>> }
>>
>> if (result == PCI_ERS_RESULT_RECOVERED) {
>> if (pcie_wait_for_link(udev, true))
>> pci_rescan_bus(udev->bus);
>> - pci_info(dev, "Device recovery from fatal error successful\n");
>> + /* find the pci_dev after rescanning the bus */
>> + dev = pci_get_domain_bus_and_slot(domain, bus, devfn);
> one of the motivations was to remove and re-enumerate rather then
> going thorugh driver's recovery sequence
> was; it might be possible that hotplug capable bridge, the device
> might have changed.
> hence this check will fail
>> + if (dev)
>> + pci_info(dev, "Device recovery from fatal error successful\n");
>> + else
>> + pr_err("AER: Can not find pci_dev for %04x:%02x:%02x.%x\n",
>> + domain, bus,
>> + PCI_SLOT(devfn), PCI_FUNC(devfn));
>> + pci_dev_put(dev);
>> } else {
>> - pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT);
>> - pci_info(dev, "Device recovery from fatal error failed\n");
>> + if (is_bridge_device)
> previously there was no condition.. why is this required now ?
> and it was doing on "dev" while you are doing it on "udev"
is that the reason we dont watnt o use dev after it is removed ? instead
do pci_uevent_ers on bridge ?
>> + pci_uevent_ers(udev, PCI_ERS_RESULT_DISCONNECT);
>> + pr_err("AER: Device %04x:%02x:%02x.%x recovery from fatal error
>> failed\n",
>> + domain, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
>> }
>>
>> - pci_dev_put(dev);
>> pci_unlock_rescan_remove();
>> }
>
> Regards,
> Oza.
next prev parent reply other threads:[~2018-08-14 12:08 UTC|newest]
Thread overview: 91+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-08-13 16:51 [PATCH 0/1] PCI/AER: prevent pcie_do_fatal_recovery from using device after it is removed Thomas Tai
2018-08-13 16:51 ` [PATCH 1/1] " Thomas Tai
2018-08-14 9:16 ` poza
2018-08-14 9:22 ` poza [this message]
2018-08-14 13:51 ` Thomas Tai
2018-08-15 14:57 ` poza
2018-08-15 15:02 ` Thomas Tai
2018-08-15 15:26 ` poza
2018-08-15 15:43 ` Thomas Tai
2018-08-15 15:59 ` poza
2018-08-15 16:04 ` Thomas Tai
2018-08-15 21:55 ` Benjamin Herrenschmidt
2018-08-15 21:56 ` Benjamin Herrenschmidt
2018-08-16 6:36 ` poza
2018-08-16 6:51 ` Benjamin Herrenschmidt
2018-08-16 6:59 ` Benjamin Herrenschmidt
2018-08-16 8:07 ` poza
2018-08-16 8:12 ` Benjamin Herrenschmidt
2018-08-16 9:03 ` poza
2018-08-16 10:07 ` Benjamin Herrenschmidt
2018-08-16 14:11 ` poza
2018-08-16 23:30 ` Benjamin Herrenschmidt
2018-08-17 10:29 ` poza
2018-08-17 10:44 ` poza
2018-08-18 7:38 ` Benjamin Herrenschmidt
2018-08-16 7:05 ` Benjamin Herrenschmidt
2018-08-16 7:15 ` Benjamin Herrenschmidt
2018-08-16 7:56 ` poza
2018-08-16 8:10 ` Benjamin Herrenschmidt
2018-08-16 8:05 ` poza
2018-08-16 8:15 ` Benjamin Herrenschmidt
2018-08-16 8:22 ` poza
2018-08-16 8:28 ` Benjamin Herrenschmidt
2018-08-16 13:30 ` Thomas Tai
2018-08-16 13:46 ` Sinan Kaya
2018-08-16 23:27 ` Benjamin Herrenschmidt
2018-08-17 6:35 ` poza
2018-08-19 2:24 ` Bjorn Helgaas
2018-08-20 5:09 ` poza
2018-08-20 5:15 ` Benjamin Herrenschmidt
2018-08-20 13:02 ` Thomas Tai
2018-08-20 13:27 ` Benjamin Herrenschmidt
2018-08-19 2:19 ` Bjorn Helgaas
2018-08-19 21:41 ` Sinan Kaya
2018-08-20 2:03 ` Benjamin Herrenschmidt
2018-08-20 5:19 ` poza
2018-08-20 5:33 ` Benjamin Herrenschmidt
2018-08-20 7:56 ` poza
2018-08-20 11:22 ` Benjamin Herrenschmidt
2018-08-20 13:26 ` poza
2018-08-20 21:02 ` Benjamin Herrenschmidt
2018-08-21 5:14 ` poza
2018-08-21 6:06 ` Benjamin Herrenschmidt
2018-08-21 14:37 ` Keith Busch
2018-08-21 15:07 ` Sinan Kaya
2018-08-21 15:29 ` Keith Busch
2018-08-21 15:50 ` Sinan Kaya
2018-08-21 15:55 ` Sinan Kaya
2018-08-21 16:44 ` Keith Busch
2018-08-21 15:30 ` poza
2018-08-21 21:14 ` Benjamin Herrenschmidt
2018-08-21 22:04 ` Keith Busch
2018-08-21 22:33 ` Keith Busch
2018-08-21 23:06 ` Benjamin Herrenschmidt
2018-08-21 23:13 ` Keith Busch
2018-08-22 0:36 ` Benjamin Herrenschmidt
2018-08-30 0:01 ` Keith Busch
2018-08-30 0:10 ` Sinan Kaya
2018-08-30 0:46 ` Keith Busch
2018-08-30 4:26 ` Benjamin Herrenschmidt
2018-08-20 15:53 ` Keith Busch
2018-08-20 16:13 ` poza
2018-08-20 16:32 ` Keith Busch
2018-08-20 21:05 ` Benjamin Herrenschmidt
2018-08-20 21:21 ` Sinan Kaya
2018-08-20 21:35 ` Keith Busch
2018-08-20 21:53 ` Benjamin Herrenschmidt
2018-08-20 22:02 ` Sinan Kaya
2018-08-20 22:04 ` Benjamin Herrenschmidt
2018-08-20 22:13 ` Sinan Kaya
2018-08-20 22:19 ` Benjamin Herrenschmidt
2018-08-22 9:13 ` Lukas Wunner
2018-08-22 14:38 ` Keith Busch
2018-08-22 14:51 ` Sinan Kaya
2018-08-20 22:13 ` Keith Busch
2018-08-20 22:19 ` Benjamin Herrenschmidt
2018-08-21 1:30 ` Keith Busch
2018-08-20 4:37 ` Benjamin Herrenschmidt
2018-08-20 4:39 ` PATCH] Partial revert of "PCI/AER: Handle ERR_FATAL with removal and re-enumeration of devices" Benjamin Herrenschmidt
2018-08-21 19:50 ` Bjorn Helgaas
2018-08-22 4:35 ` poza
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b0104a716319d76e734c307cb4bedd9d@codeaurora.org \
--to=poza@codeaurora.org \
--cc=bhelgaas@google.com \
--cc=keith.busch@intel.com \
--cc=linux-pci-owner@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=thomas.tai@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.