From: "Michael S. Tsirkin" <mst@redhat.com>
To: "Zhangbo (Oscar)" <oscar.zhangbo@huawei.com>
Cc: fangying <fangying1@huawei.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"limingwang \(A\)" <limingwang@huawei.com>,
"dengkai \(A\)" <dengkai1@huawei.com>
Subject: Re: [Qemu-devel] [PATCH] pcie: fix device hotplug failure at the meantime of VM boot
Date: Tue, 23 Jul 2019 05:45:07 -0400 [thread overview]
Message-ID: <20190723053038-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <0259E1C966E8C54AA93AA2B1240828E672DF1C92@dggeml509-mbx.china.huawei.com>
On Tue, Jul 23, 2019 at 07:47:51AM +0000, Zhangbo (Oscar) wrote:
> If the PCI_EXP_LNKSTA_DLLLA capability is set by default, linux kernel will send
> PDC event to detect whether there is a device in pcie slot. If a device is pluged
> in the pcie-root-port at the same time, hot-plug device will send ABP + PDC
> events to the kernel. The VM kernel will wrongly unplug the device if two PDC
> events get too close. Thus we'd better set the PCI_EXP_LNKSTA_DLLLA
> capability only in hotplug callback.
Could you please describe a reproducer in a bit more detail?
>
> By the way, we should clean up the PCI_EXP_LNKSTA_DLLLA capability during
> unplug to avoid VM restart or migration failure which will enter the same
> abnormal scenario as above.
>
> Signed-off-by: limingwang@huawei.com
> Signed-off-by: fangying1@huawei.com
> Signed-off-by: oscar.zhangbo@huawei.com
So looking at linux I see:
* pciehp_card_present_or_link_active() - whether given slot is occupied
* @ctrl: PCIe hotplug controller
*
* Unlike pciehp_card_present(), which determines presence solely from the
* Presence Detect State bit, this helper also returns true if the Link Active
* bit is set. This is a concession to broken hotplug ports which hardwire
* Presence Detect State to zero, such as Wilocity's [1ae9:0200].
so it looks like linux actually looks at presence detect state,
but we have a bug just like Wilocity's and keeping
link active up fixes that. Can't we fix the bug instead?
> ---
> hw/pci/pcie.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
> index a6beb56..174f392 100644
> --- a/hw/pci/pcie.c
> +++ b/hw/pci/pcie.c
> @@ -75,10 +75,6 @@ pcie_cap_v1_fill(PCIDevice *dev, uint8_t port, uint8_t type, uint8_t version)
> QEMU_PCI_EXP_LNKSTA_NLW(QEMU_PCI_EXP_LNK_X1) |
> QEMU_PCI_EXP_LNKSTA_CLS(QEMU_PCI_EXP_LNK_2_5GT));
>
> - if (dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) {
> - pci_word_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA,
> - PCI_EXP_LNKSTA_DLLLA);
> - }
>
> /* We changed link status bits over time, and changing them across
> * migrations is generally fine as hardware changes them too.
Does this actually change anything?
I don't know why do we bother setting it here but we do
set it later in pcie_cap_slot_plug_cb, correct?
I'd like to understand whether this is part of fix or
just a cleanup.
> @@ -484,6 +480,11 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
> return;
> }
>
> + if (pci_dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) {
> + pci_word_test_and_clear_mask(exp_cap + PCI_EXP_LNKSTA,
> + PCI_EXP_LNKSTA_DLLLA);
> + }
> +
> pcie_cap_slot_push_attention_button(PCI_DEVICE(hotplug_dev));
> }
So this reports data link inactive immediately after
unplug request. Seems a bit questionable: guest did not
respond yet. I'd like to see a comment explaining why
this makes sense.
> --
> 1.8.3.1
prev parent reply other threads:[~2019-07-23 9:45 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-07-23 7:47 [Qemu-devel] [PATCH] pcie: fix device hotplug failure at the meantime of VM boot Zhangbo (Oscar)
2019-07-23 9:18 ` Philippe Mathieu-Daudé
2019-07-23 9:35 ` Zhangbo (Oscar)
2019-07-23 9:45 ` Michael S. Tsirkin [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190723053038-mutt-send-email-mst@kernel.org \
--to=mst@redhat.com \
--cc=dengkai1@huawei.com \
--cc=fangying1@huawei.com \
--cc=limingwang@huawei.com \
--cc=oscar.zhangbo@huawei.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.