From: Wei Hu <weh@microsoft.com>
To: kys@microsoft.com, haiyangz@microsoft.com,
sthemmin@microsoft.com, wei.liu@kernel.org,
lorenzo.pieralisi@arm.com, robh@kernel.org, bhelgaas@google.com,
linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org, decui@microsoft.com,
mikelley@microsoft.com
Cc: Wei Hu <weh@microsoft.com>
Subject: [PATCH v2] PCI: hv: Fix a timing issue which causes kdump to fail occasionally
Date: Fri, 17 Jul 2020 10:55:28 +0800
Message-ID: <20200717025528.3093-1-weh@microsoft.com>
Kdump can occasionally fail on a Hyper-V guest that uses an Accelerated
Networking interface. This is because the retry in hv_pci_enter_d0()
relies on an asynchronous host event arriving at the guest before
hv_send_resources_allocated() is called. Fix the problem by moving the
retry to hv_pci_probe(), which removes this dependence and makes the
calling sequence synchronous.
v2: Add Fixes tag per Michael Kelley's review comment.
Fixes: c81992e7f4aa ("PCI: hv: Retry PCI bus D0 entry on invalid device state")
Signed-off-by: Wei Hu <weh@microsoft.com>
---
drivers/pci/controller/pci-hyperv.c | 66 ++++++++++++++---------------
1 file changed, 32 insertions(+), 34 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index bf40ff09c99d..738ee30f3334 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2759,10 +2759,8 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
struct pci_bus_d0_entry *d0_entry;
struct hv_pci_compl comp_pkt;
struct pci_packet *pkt;
- bool retry = true;
int ret;
-enter_d0_retry:
/*
* Tell the host that the bus is ready to use, and moved into the
* powered-on state. This includes telling the host which region
@@ -2789,38 +2787,6 @@ static int hv_pci_enter_d0(struct hv_device *hdev)
if (ret)
goto exit;
- /*
- * In certain case (Kdump) the pci device of interest was
- * not cleanly shut down and resource is still held on host
- * side, the host could return invalid device status.
- * We need to explicitly request host to release the resource
- * and try to enter D0 again.
- */
- if (comp_pkt.completion_status < 0 && retry) {
- retry = false;
-
- dev_err(&hdev->device, "Retrying D0 Entry\n");
-
- /*
- * Hv_pci_bus_exit() calls hv_send_resource_released()
- * to free up resources of its child devices.
- * In the kdump kernel we need to set the
- * wslot_res_allocated to 255 so it scans all child
- * devices to release resources allocated in the
- * normal kernel before panic happened.
- */
- hbus->wslot_res_allocated = 255;
-
- ret = hv_pci_bus_exit(hdev, true);
-
- if (ret == 0) {
- kfree(pkt);
- goto enter_d0_retry;
- }
- dev_err(&hdev->device,
- "Retrying D0 failed with ret %d\n", ret);
- }
-
if (comp_pkt.completion_status < 0) {
dev_err(&hdev->device,
"PCI Pass-through VSP failed D0 Entry with status %x\n",
@@ -3058,6 +3024,7 @@ static int hv_pci_probe(struct hv_device *hdev,
struct hv_pcibus_device *hbus;
u16 dom_req, dom;
char *name;
+ bool enter_d0_retry = true;
int ret;
/*
@@ -3178,11 +3145,42 @@ static int hv_pci_probe(struct hv_device *hdev,
if (ret)
goto free_fwnode;
+retry:
ret = hv_pci_query_relations(hdev);
if (ret)
goto free_irq_domain;
ret = hv_pci_enter_d0(hdev);
+ /*
+ * In certain case (Kdump) the pci device of interest was
+ * not cleanly shut down and resource is still held on host
+ * side, the host could return invalid device status.
+ * We need to explicitly request host to release the resource
+ * and try to enter D0 again.
+ * The retry should start from hv_pci_query_relations() call.
+ */
+ if (ret == -EPROTO && enter_d0_retry) {
+ enter_d0_retry = false;
+
+ dev_err(&hdev->device, "Retrying D0 Entry\n");
+
+ /*
+ * Hv_pci_bus_exit() calls hv_send_resources_released()
+ * to free up resources of its child devices.
+ * In the kdump kernel we need to set the
+ * wslot_res_allocated to 255 so it scans all child
+ * devices to release resources allocated in the
+ * normal kernel before panic happened.
+ */
+ hbus->wslot_res_allocated = 255;
+ ret = hv_pci_bus_exit(hdev, true);
+
+ if (ret == 0)
+ goto retry;
+
+ dev_err(&hdev->device,
+ "Retrying D0 failed with ret %d\n", ret);
+ }
if (ret)
goto free_irq_domain;
--
2.20.1
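
For readers who want to see the control flow outside the driver, the retry-once sequence that the patch moves into hv_pci_probe() can be sketched as a small user-space C program. This is only an illustration under stated assumptions: struct fake_host and the fake_*() functions are hypothetical stand-ins invented here, not part of pci-hyperv.c, and the model simply assumes that D0 entry fails with -EPROTO while stale resources are held and that the bus-exit step releases them.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of host-side state; not a kernel structure. */
struct fake_host {
	bool stale_resources;	/* left over from the crashed kernel */
	int query_calls;	/* times relations were queried */
	int d0_calls;		/* times D0 entry was attempted */
	int exit_calls;		/* times the bus-exit path ran */
};

/* Stand-in for hv_pci_query_relations(): always succeeds here. */
static int fake_query_relations(struct fake_host *h)
{
	h->query_calls++;
	return 0;
}

/* Stand-in for hv_pci_enter_d0(): fails while stale resources exist. */
static int fake_enter_d0(struct fake_host *h)
{
	h->d0_calls++;
	return h->stale_resources ? -EPROTO : 0;
}

/* Stand-in for hv_pci_bus_exit(): assumed to release the resources. */
static int fake_bus_exit(struct fake_host *h)
{
	h->exit_calls++;
	h->stale_resources = false;
	return 0;
}

/* Mirrors the shape of the retry added to hv_pci_probe() in the diff. */
static int fake_probe(struct fake_host *h)
{
	bool enter_d0_retry = true;
	int ret;

retry:
	ret = fake_query_relations(h);
	if (ret)
		return ret;

	ret = fake_enter_d0(h);
	if (ret == -EPROTO && enter_d0_retry) {
		/* Allow exactly one retry, restarting from the query step. */
		enter_d0_retry = false;

		ret = fake_bus_exit(h);
		if (ret == 0)
			goto retry;
	}

	return ret;
}
```

In this model a host that starts with stale_resources set goes through query, failed D0 entry, bus exit, then a second query and a successful D0 entry, all in one synchronous call chain. A host without stale resources takes the straight path and never runs the bus-exit step.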