From: Dexuan Cui <decui@microsoft.com>
To: bhelgaas@google.com, davem@davemloft.net, decui@microsoft.com,
edumazet@google.com, haiyangz@microsoft.com, jakeo@microsoft.com,
kuba@kernel.org, kw@linux.com, kys@microsoft.com,
leon@kernel.org, linux-pci@vger.kernel.org,
lpieralisi@kernel.org, mikelley@microsoft.com, pabeni@redhat.com,
robh@kernel.org, saeedm@nvidia.com, wei.liu@kernel.org,
longli@microsoft.com, boqun.feng@gmail.com
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-rdma@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 1/6] PCI: hv: fix a race condition bug in hv_pci_query_relations()
Date: Mon, 27 Mar 2023 21:51:17 -0700
Message-ID: <20230328045122.25850-2-decui@microsoft.com>
In-Reply-To: <20230328045122.25850-1-decui@microsoft.com>
Fix the longstanding race between hv_pci_query_relations() and
survey_child_resources() by flushing hbus->wq before exiting from
hv_pci_query_relations(): otherwise the workqueue can still complete()
hbus->survey_event, which points at the stack variable 'comp', after
'comp' has gone out of scope.
Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
drivers/pci/controller/pci-hyperv.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
With the below debug code:
@@ -2103,6 +2103,8 @@ static void survey_child_resources(struct hv_pcibus_device *hbus)
 	}
 	spin_unlock_irqrestore(&hbus->device_list_lock, flags);
 
+	ssleep(15);
+	printk("%s: completing %px\n", __func__, event);
 	complete(event);
 }
@@ -3305,8 +3307,12 @@ static int hv_pci_query_relations(struct hv_device *hdev)
 	ret = vmbus_sendpacket(hdev->channel, &message, sizeof(message),
 			       0, VM_PKT_DATA_INBAND, 0);
-	if (!ret)
+	if (!ret) {
+		ssleep(10); // unassign the PCI device on the host during the 10s
 		ret = wait_for_response(hdev, &comp);
+		printk("%s: comp=%px is becoming invalid! ret=%d\n",
+		       __func__, &comp, ret);
+	}
 
 	return ret;
 }
@@ -3635,6 +3641,8 @@ static int hv_pci_probe(struct hv_device *hdev,
 retry:
 	ret = hv_pci_query_relations(hdev);
 
+	printk("hv_pci_query_relations() exited\n");
+
 	if (ret)
 		goto free_irq_domain;
I'm able to repro the below hang issue:
[ 74.544744] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: PCI VMBus probing: Using version 0x10004
[ 76.886944] hv_netvsc 818fe754-b912-4445-af51-1f584812e3c9 eth0: VF slot 1 removed
[ 84.788266] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: The device is gone.
[ 84.792586] hv_pci_query_relations: comp=ffffa7504012fb58 is becoming invalid! ret=-19
[ 84.797505] hv_pci_query_relations() exited
[ 89.652268] survey_child_resources: completing ffffa7504012fb58
[ 150.392242] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 150.398447] rcu: 15-...0: (2 ticks this GP) idle=867c/1/0x4000000000000000 softirq=947/947 fqs=5234
[ 150.405851] rcu: (detected by 14, t=15004 jiffies, g=2553, q=4833 ncpus=16)
[ 150.410870] Sending NMI from CPU 14 to CPUs 15:
[ 150.414836] NMI backtrace for cpu 15
[ 150.414840] CPU: 15 PID: 10 Comm: kworker/u32:0 Tainted: G W E 6.3.0-rc3-decui-dirty #34
...
[ 150.414849] Workqueue: hv_pci_468b pci_devices_present_work [pci_hyperv]
[ 150.414866] RIP: 0010:__pv_queued_spin_lock_slowpath+0x10f/0x3c0
...
[ 150.414905] Call Trace:
[ 150.414907] <TASK>
[ 150.414911] _raw_spin_lock_irqsave+0x40/0x50
[ 150.414917] complete+0x1d/0x60
[ 150.414924] pci_devices_present_work+0x5dd/0x680 [pci_hyperv]
[ 150.414946] process_one_work+0x21f/0x430
[ 150.414952] worker_thread+0x4a/0x3c0
With this patch, the hang issue goes away:
[ 186.143612] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: The device is gone.
[ 186.148034] hv_pci_query_relations: comp=ffffa7cfd0aa3b50 is becoming invalid! ret=-19
[ 191.263611] survey_child_resources: completing ffffa7cfd0aa3b50
[ 191.267732] hv_pci_query_relations() exited
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index f33370b75628..b82c7cde19e6 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3308,6 +3308,19 @@ static int hv_pci_query_relations(struct hv_device *hdev)
 	if (!ret)
 		ret = wait_for_response(hdev, &comp);
 
+	/*
+	 * In the case of fast device addition/removal, it's possible that
+	 * vmbus_sendpacket() or wait_for_response() returns -ENODEV but we
+	 * already got a PCI_BUS_RELATIONS* message from the host and the
+	 * channel callback already scheduled a work to hbus->wq, which can be
+	 * running survey_child_resources() -> complete(&hbus->survey_event),
+	 * even after hv_pci_query_relations() exits and the stack variable
+	 * 'comp' is no longer valid. This can cause a strange hang issue
+	 * or sometimes a page fault. Flush hbus->wq before we exit from
+	 * hv_pci_query_relations() to avoid the issues.
+	 */
+	flush_workqueue(hbus->wq);
+
 	return ret;
 }
--
2.25.1