From: Lorenzo Pieralisi <lpieralisi@kernel.org>
To: Dexuan Cui <decui@microsoft.com>
Cc: bhelgaas@google.com, davem@davemloft.net, edumazet@google.com,
haiyangz@microsoft.com, jakeo@microsoft.com, kuba@kernel.org,
kw@linux.com, kys@microsoft.com, leon@kernel.org,
linux-pci@vger.kernel.org, mikelley@microsoft.com,
pabeni@redhat.com, robh@kernel.org, saeedm@nvidia.com,
wei.liu@kernel.org, longli@microsoft.com, boqun.feng@gmail.com,
ssengar@microsoft.com, helgaas@kernel.org,
linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-rdma@vger.kernel.org, netdev@vger.kernel.org,
josete@microsoft.com, stable@vger.kernel.org
Subject: Re: [PATCH v3 2/6] PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic
Date: Thu, 25 May 2023 12:15:54 +0200 [thread overview]
Message-ID: <ZG81WpJBBegbLSbT@lpieralisi> (raw)
In-Reply-To: <20230420024037.5921-3-decui@microsoft.com>
On Wed, Apr 19, 2023 at 07:40:33PM -0700, Dexuan Cui wrote:
> When the host tries to remove a PCI device, the host first sends a
> PCI_EJECT message to the guest, and the guest is supposed to gracefully
> remove the PCI device and send a PCI_EJECTION_COMPLETE message to the host;
> the host then sends a VMBus message CHANNELMSG_RESCIND_CHANNELOFFER to
> the guest (when the guest receives this message, the device is already
> unassigned from the guest) and the guest can do some final cleanup work;
> if the guest fails to respond to the PCI_EJECT message within one minute,
> the host sends the VMBus message CHANNELMSG_RESCIND_CHANNELOFFER and
> removes the PCI device forcibly.
>
> In the case of fast device addition/removal, it's possible that the PCI
> device driver is still configuring MSI-X interrupts when the guest receives
> the PCI_EJECT message; the channel callback calls hv_pci_eject_device(),
> which sets hpdev->state to hv_pcichild_ejecting, and schedules a work
> hv_eject_device_work(); if the PCI device driver is calling
> pci_alloc_irq_vectors() -> ... -> hv_compose_msi_msg(), we can break the
> while loop in hv_compose_msi_msg() due to the updated hpdev->state, and
> leave data->chip_data with its default value of NULL; later, when the PCI
> device driver calls request_irq() -> ... -> hv_irq_unmask(), the guest
> crashes in hv_arch_irq_unmask() due to data->chip_data being NULL.
>
> Fix the issue by not testing hpdev->state in the while loop: when the
> guest receives PCI_EJECT, the device is still assigned to the guest, and
> the guest has one minute to finish the device removal gracefully. We don't
> really need to (and we should not) test hpdev->state in the loop.
>
> Fixes: de0aa7b2f97d ("PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()")
> Signed-off-by: Dexuan Cui <decui@microsoft.com>
> Reviewed-by: Michael Kelley <mikelley@microsoft.com>
> Cc: stable@vger.kernel.org
> ---
>
> v2:
> Removed the "debug code".
> No change to the patch body.
> Added Cc:stable
>
> v3:
> Added Michael's Reviewed-by.
>
> drivers/pci/controller/pci-hyperv.c | 11 +++++------
> 1 file changed, 5 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
> index b82c7cde19e66..1b11cf7391933 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -643,6 +643,11 @@ static void hv_arch_irq_unmask(struct irq_data *data)
> pbus = pdev->bus;
> hbus = container_of(pbus->sysdata, struct hv_pcibus_device, sysdata);
> int_desc = data->chip_data;
> + if (!int_desc) {
> + dev_warn(&hbus->hdev->device, "%s() can not unmask irq %u\n",
> + __func__, data->irq);
> + return;
> + }
That's a check that should be there regardless ?
> spin_lock_irqsave(&hbus->retarget_msi_interrupt_lock, flags);
>
> @@ -1911,12 +1916,6 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
> hv_pci_onchannelcallback(hbus);
> spin_unlock_irqrestore(&channel->sched_lock, flags);
>
> - if (hpdev->state == hv_pcichild_ejecting) {
> - dev_err_once(&hbus->hdev->device,
> - "the device is being ejected\n");
> - goto enable_tasklet;
> - }
> -
> udelay(100);
> }
I don't understand why this code is in hv_compose_msi_msg() in the first
place (and why only in that function ?) to me this looks like you are
adding plasters in the code that can turn out to be problematic while
ejecting a device, this does not seem robust at all - that's my opinion.
Feel free to merge this code, I can't ACK it, sorry.
Lorenzo
next prev parent reply other threads:[~2023-05-25 10:16 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-20 2:40 [PATCH v3 0/6] pci-hyper: Fix race condition bugs for fast device hotplug Dexuan Cui
2023-04-20 2:40 ` [PATCH v3 1/6] PCI: hv: Fix a race condition bug in hv_pci_query_relations() Dexuan Cui
2023-05-25 8:14 ` Lorenzo Pieralisi
2023-06-15 3:55 ` Dexuan Cui
2023-04-20 2:40 ` [PATCH v3 2/6] PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic Dexuan Cui
2023-05-25 10:15 ` Lorenzo Pieralisi [this message]
2023-06-15 4:27 ` Dexuan Cui
2023-04-20 2:40 ` [PATCH v3 3/6] PCI: hv: Remove the useless hv_pcichild_state from struct hv_pci_dev Dexuan Cui
2023-05-25 8:16 ` Lorenzo Pieralisi
2023-06-15 4:36 ` Dexuan Cui
2023-04-20 2:40 ` [PATCH v3 4/6] Revert "PCI: hv: Fix a timing issue which causes kdump to fail occasionally" Dexuan Cui
2023-05-25 8:22 ` Lorenzo Pieralisi
2023-06-15 4:41 ` Dexuan Cui
2023-04-20 2:40 ` [PATCH v3 5/6] PCI: hv: Add a per-bus mutex state_lock Dexuan Cui
2023-05-25 8:27 ` Lorenzo Pieralisi
2023-04-20 2:40 ` [PATCH v3 6/6] PCI: hv: Use async probing to reduce boot time Dexuan Cui
2023-04-23 19:11 ` Simon Horman
2023-04-24 20:50 ` Dexuan Cui
2023-05-10 8:23 ` Lorenzo Pieralisi
2023-05-10 17:12 ` Dexuan Cui
2023-05-17 0:02 ` Dexuan Cui
2023-05-23 19:30 ` Dexuan Cui
2023-04-21 2:04 ` [PATCH v3 0/6] pci-hyper: Fix race condition bugs for fast device hotplug Dexuan Cui
2023-04-21 22:23 ` Dexuan Cui
2023-05-08 16:52 ` Wei Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZG81WpJBBegbLSbT@lpieralisi \
--to=lpieralisi@kernel.org \
--cc=bhelgaas@google.com \
--cc=boqun.feng@gmail.com \
--cc=davem@davemloft.net \
--cc=decui@microsoft.com \
--cc=edumazet@google.com \
--cc=haiyangz@microsoft.com \
--cc=helgaas@kernel.org \
--cc=jakeo@microsoft.com \
--cc=josete@microsoft.com \
--cc=kuba@kernel.org \
--cc=kw@linux.com \
--cc=kys@microsoft.com \
--cc=leon@kernel.org \
--cc=linux-hyperv@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-rdma@vger.kernel.org \
--cc=longli@microsoft.com \
--cc=mikelley@microsoft.com \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=robh@kernel.org \
--cc=saeedm@nvidia.com \
--cc=ssengar@microsoft.com \
--cc=stable@vger.kernel.org \
--cc=wei.liu@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).