linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dexuan Cui <decui@microsoft.com>
To: bhelgaas@google.com, davem@davemloft.net, decui@microsoft.com,
	edumazet@google.com, haiyangz@microsoft.com, jakeo@microsoft.com,
	kuba@kernel.org, kw@linux.com, kys@microsoft.com,
	leon@kernel.org, linux-pci@vger.kernel.org,
	lpieralisi@kernel.org, mikelley@microsoft.com, pabeni@redhat.com,
	robh@kernel.org, saeedm@nvidia.com, wei.liu@kernel.org,
	longli@microsoft.com, boqun.feng@gmail.com
Cc: linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org, netdev@vger.kernel.org
Subject: [PATCH 1/6] PCI: hv: fix a race condition bug in hv_pci_query_relations()
Date: Mon, 27 Mar 2023 21:51:17 -0700	[thread overview]
Message-ID: <20230328045122.25850-2-decui@microsoft.com> (raw)
In-Reply-To: <20230328045122.25850-1-decui@microsoft.com>

Fix the longstanding race between hv_pci_query_relations() and
survey_child_resources() by flushing the workqueue before we exit from
hv_pci_query_relations().

Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Signed-off-by: Dexuan Cui <decui@microsoft.com>

---
 drivers/pci/controller/pci-hyperv.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

With the below debug code:

@@ -2103,6 +2103,8 @@ static void survey_child_resources(struct hv_pcibus_device *hbus)
 	}

 	spin_unlock_irqrestore(&hbus->device_list_lock, flags);
+	ssleep(15);
+	printk("%s: completing %px\n", __func__, event);
 	complete(event);
 }

@@ -3305,8 +3307,12 @@ static int hv_pci_query_relations(struct hv_device *hdev)

 	ret = vmbus_sendpacket(hdev->channel, &message, sizeof(message),
 			       0, VM_PKT_DATA_INBAND, 0);
-	if (!ret)
+	if (!ret) {
+		ssleep(10); // unassign the PCI device on the host during the 10s
 		ret = wait_for_response(hdev, &comp);
+		printk("%s: comp=%px is becoming invalid! ret=%d\n",
+			__func__, &comp, ret);
+	}

 	return ret;
 }
@@ -3635,6 +3641,8 @@ static int hv_pci_probe(struct hv_device *hdev,

 retry:
 	ret = hv_pci_query_relations(hdev);
+	printk("hv_pci_query_relations() exited\n");
+
 	if (ret)
 		goto free_irq_domain;

I'm able to repro the below hang issue:

[   74.544744] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: PCI VMBus probing: Using version 0x10004
[   76.886944] hv_netvsc 818fe754-b912-4445-af51-1f584812e3c9 eth0: VF slot 1 removed
[   84.788266] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: The device is gone.
[   84.792586] hv_pci_query_relations: comp=ffffa7504012fb58 is becoming invalid! ret=-19
[   84.797505] hv_pci_query_relations() exited
[   89.652268] survey_child_resources: completing ffffa7504012fb58
[  150.392242] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  150.398447] rcu:     15-...0: (2 ticks this GP) idle=867c/1/0x4000000000000000 softirq=947/947 fqs=5234
[  150.405851] rcu:     (detected by 14, t=15004 jiffies, g=2553, q=4833 ncpus=16)
[  150.410870] Sending NMI from CPU 14 to CPUs 15:
[  150.414836] NMI backtrace for cpu 15
[  150.414840] CPU: 15 PID: 10 Comm: kworker/u32:0 Tainted: G        W   E      6.3.0-rc3-decui-dirty #34
...
[  150.414849] Workqueue: hv_pci_468b pci_devices_present_work [pci_hyperv]
[  150.414866] RIP: 0010:__pv_queued_spin_lock_slowpath+0x10f/0x3c0
...
[  150.414905] Call Trace:
[  150.414907]  <TASK>
[  150.414911]  _raw_spin_lock_irqsave+0x40/0x50
[  150.414917]  complete+0x1d/0x60
[  150.414924]  pci_devices_present_work+0x5dd/0x680 [pci_hyperv]
[  150.414946]  process_one_work+0x21f/0x430
[  150.414952]  worker_thread+0x4a/0x3c0

With this patch, the hang issue goes away:

[  186.143612] hv_pci b92a0085-468b-407a-a88a-d33fac8edc75: The device is gone.
[  186.148034] hv_pci_query_relations: comp=ffffa7cfd0aa3b50 is becoming invalid! ret=-19
[  191.263611] survey_child_resources: completing ffffa7cfd0aa3b50
[  191.267732] hv_pci_query_relations() exited

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index f33370b75628..b82c7cde19e6 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -3308,6 +3308,19 @@ static int hv_pci_query_relations(struct hv_device *hdev)
 	if (!ret)
 		ret = wait_for_response(hdev, &comp);
 
+	/*
+	 * In the case of fast device addition/removal, it's possible that
+	 * vmbus_sendpacket() or wait_for_response() returns -ENODEV but we
+	 * already got a PCI_BUS_RELATIONS* message from the host and the
+	 * channel callback already scheduled a work to hbus->wq, which can be
+	 * running survey_child_resources() -> complete(&hbus->survey_event),
+	 * even after hv_pci_query_relations() exits and the stack variable
+	 * 'comp' is no longer valid. This can cause a strange hang issue
+	 * or sometimes a page fault. Flush hbus->wq before we exit from
+	 * hv_pci_query_relations() to avoid the issues.
+	 */
+	flush_workqueue(hbus->wq);
+
 	return ret;
 }
 
-- 
2.25.1


  reply	other threads:[~2023-03-28  4:53 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-28  4:51 [PATCH 0/6] pci-hyper: fix race condition bugs for fast device hotplug Dexuan Cui
2023-03-28  4:51 ` Dexuan Cui [this message]
2023-03-28  5:29   ` [EXTERNAL] [PATCH 1/6] PCI: hv: fix a race condition bug in hv_pci_query_relations() Saurabh Singh Sengar
2023-03-28  5:38     ` Dexuan Cui
2023-03-28 17:24       ` Bjorn Helgaas
2023-03-29  8:37         ` Dexuan Cui
2023-03-29 16:06           ` Bjorn Helgaas
2023-03-28 16:48   ` Long Li
2023-03-29  8:21     ` Dexuan Cui
2023-03-28  4:51 ` [PATCH 2/6] PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic Dexuan Cui
2023-03-28  4:51 ` [PATCH 3/6] PCI: hv: Remove the useless hv_pcichild_state from struct hv_pci_dev Dexuan Cui
2023-03-28  4:51 ` [PATCH 4/6] Revert "PCI: hv: Fix a timing issue which causes kdump to fail occasionally" Dexuan Cui
2023-03-28  6:33   ` Dexuan Cui
2023-03-28 12:46     ` Wei Hu
2023-03-30  5:17     ` Wei Hu
2023-03-28  4:51 ` [PATCH 5/6] PCI: hv: Add a per-bus mutex state_lock Dexuan Cui
2023-03-29 16:41   ` Michael Kelley (LINUX)
2023-03-29 16:54     ` Dexuan Cui
2023-03-28  4:51 ` [PATCH 6/6] PCI: hv: Use async probing to reduce boot time Dexuan Cui
2023-03-29 16:55   ` Michael Kelley (LINUX)
2023-03-29 22:42     ` Dexuan Cui

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230328045122.25850-2-decui@microsoft.com \
    --to=decui@microsoft.com \
    --cc=bhelgaas@google.com \
    --cc=boqun.feng@gmail.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=haiyangz@microsoft.com \
    --cc=jakeo@microsoft.com \
    --cc=kuba@kernel.org \
    --cc=kw@linux.com \
    --cc=kys@microsoft.com \
    --cc=leon@kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=longli@microsoft.com \
    --cc=lpieralisi@kernel.org \
    --cc=mikelley@microsoft.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=robh@kernel.org \
    --cc=saeedm@nvidia.com \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).