From: Chao Gao <chao.gao@intel.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: xen-devel@lists.xenproject.org, Jan Beulich <jbeulich@suse.com>
Subject: Re: [PATCH v5 1/3] xen/pt: fix some pass-thru devices don't work across reboot
Date: Tue, 22 Jan 2019 13:50:20 +0800 [thread overview]
Message-ID: <20190122055015.GA19297@gao-cwp> (raw)
In-Reply-To: <20190116103823.2h5kjq77y2nmhfn4@mac>
On Wed, Jan 16, 2019 at 11:38:23AM +0100, Roger Pau Monné wrote:
>On Wed, Jan 16, 2019 at 04:17:30PM +0800, Chao Gao wrote:
>> I find some pass-thru devices don't work any more across guest
>> reboot. Assigning it to another domain also meets the same issue. And
>> the only way to make it work again is un-binding and binding it to
>> pciback. Someone reported this issue one year ago [1].
>>
>> If the device's driver doesn't disable MSI-X during shutdown or qemu is
>> killed/crashed before the domain shutdown, this domain's pirq won't be
>> unmapped. Then xen takes over this work, unmapping all pirq-s, when
>> destroying guest. But as pciback has already disabled meory decoding before
>> xen unmapping pirq, Xen has to sets the host_maskall flag and maskall bit
>> to mask a MSI rather than sets maskbit in MSI-x table. The call trace of
>> this process is:
>>
>> ->arch_domain_destroy
>> ->free_domain_pirqs
>> ->unmap_domain_pirq (if pirq isn't unmapped by qemu)
>> ->pirq_guest_force_unbind
>> ->__pirq_guest_unbind
>> ->mask_msi_irq(=desc->handler->disable())
>> ->the warning in msi_set_mask_bit()
>>
>> The host_maskall bit will prevent guests from clearing the maskall bit
>> even the device is assigned to another guest later. Then guests cannot
>> receive MSIs from this device.
>>
>> To fix this issue, a pirq is unmapped before memory decoding is disabled by
>> pciback. Specifically, when a device is detached from a guest, all established
>> mappings between pirq and msi are destroying before changing the ownership.
>>
>> [1]: https://lists.xenproject.org/archives/html/xen-devel/2017-09/msg02520.html
>
>Thanks, I think the approach is fine, just a couple of comments.
>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Changes in v5:
>> - fix the potential infinite loop
>> - assert that unmap_domain_pirq() won't fail
>> - assert msi_list is empty after the loop in pci_unmap_msi
>> - provide a stub for pt_irq_destroy_bind_msi() if !CONFIG_HVM to fix a
>> compilation error when building PVShim
>>
>> Changes in v4:
>> - split out change to 'msix->warned' field
>> - handle multiple msi cases
>> - use list_first_entry_or_null to traverse 'pdev->msi_list'
>> ---
>> xen/drivers/passthrough/io.c | 57 ++++++++++++++++++++++++++------------
>> xen/drivers/passthrough/pci.c | 64 +++++++++++++++++++++++++++++++++++++++++++
>> xen/include/xen/iommu.h | 4 +++
>> 3 files changed, 107 insertions(+), 18 deletions(-)
>>
>> diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
>> index a6eb8a4..56ee1ef 100644
>> --- a/xen/drivers/passthrough/io.c
>> +++ b/xen/drivers/passthrough/io.c
>> @@ -619,6 +619,42 @@ int pt_irq_create_bind(
>> return 0;
>> }
>>
>> +static void pt_irq_destroy_bind_common(struct domain *d, struct pirq *pirq)
>> +{
>> + struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
>> +
>> + ASSERT(spin_is_locked(&d->event_lock));
>> +
>> + if ( pirq_dpci && (pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) &&
>> + list_empty(&pirq_dpci->digl_list) )
>> + {
>> + pirq_guest_unbind(d, pirq);
>> + msixtbl_pt_unregister(d, pirq);
>> + if ( pt_irq_need_timer(pirq_dpci->flags) )
>> + kill_timer(&pirq_dpci->timer);
>> + pirq_dpci->flags = 0;
>> + /*
>> + * See comment in pt_irq_create_bind's PT_IRQ_TYPE_MSI before the
>> + * call to pt_pirq_softirq_reset.
>> + */
>> + pt_pirq_softirq_reset(pirq_dpci);
>> +
>> + pirq_cleanup_check(pirq, d);
>> + }
>> +}
>> +
>> +void pt_irq_destroy_bind_msi(struct domain *d, struct pirq *pirq)
>> +{
>> + struct hvm_pirq_dpci *pirq_dpci = pirq_dpci(pirq);
>
>const
>
>> +
>> + ASSERT(spin_is_locked(&d->event_lock));
>> +
>> + if ( pirq_dpci && pirq_dpci->gmsi.posted )
>> + pi_update_irte(NULL, pirq, 0);
>> +
>> + pt_irq_destroy_bind_common(d, pirq);
>> +}
>> +
>> int pt_irq_destroy_bind(
>> struct domain *d, const struct xen_domctl_bind_pt_irq *pt_irq_bind)
>> {
>> @@ -727,26 +763,11 @@ int pt_irq_destroy_bind(
>> }
>> else
>> what = "bogus";
>> - }
>> - else if ( pirq_dpci && pirq_dpci->gmsi.posted )
>> - pi_update_irte(NULL, pirq, 0);
>> -
>> - if ( pirq_dpci && (pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) &&
>> - list_empty(&pirq_dpci->digl_list) )
>> - {
>> - pirq_guest_unbind(d, pirq);
>> - msixtbl_pt_unregister(d, pirq);
>> - if ( pt_irq_need_timer(pirq_dpci->flags) )
>> - kill_timer(&pirq_dpci->timer);
>> - pirq_dpci->flags = 0;
>> - /*
>> - * See comment in pt_irq_create_bind's PT_IRQ_TYPE_MSI before the
>> - * call to pt_pirq_softirq_reset.
>> - */
>> - pt_pirq_softirq_reset(pirq_dpci);
>>
>> - pirq_cleanup_check(pirq, d);
>> + pt_irq_destroy_bind_common(d, pirq);
>> }
>> + else
>> + pt_irq_destroy_bind_msi(d, pirq);
>>
>> spin_unlock(&d->event_lock);
>>
>> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
>> index 93c20b9..4f2be02 100644
>> --- a/xen/drivers/passthrough/pci.c
>> +++ b/xen/drivers/passthrough/pci.c
>> @@ -1514,6 +1514,68 @@ static int assign_device(struct domain *d, u16 seg, u8 bus, u8 devfn, u32 flag)
>> return rc;
>> }
>>
>> +/*
>> + * Unmap established mappings between domain's pirq and device's MSI.
>> + * These mappings were set up by qemu/guest and are expected to be
>> + * destroyed when changing the device's ownership.
>> + */
>> +static void pci_unmap_msi(struct pci_dev *pdev)
>> +{
>> + struct msi_desc *entry, *tmp;
>> + struct domain *d = pdev->domain;
>> +
>> + ASSERT(pcidevs_locked());
>> + ASSERT(d);
>> +
>> + spin_lock(&d->event_lock);
>> + list_for_each_entry_safe(entry, tmp, &pdev->msi_list, list)
>> + {
>> + struct pirq *info;
>> + int ret, pirq = 0;
>> + unsigned int nr = entry->msi_attrib.type != PCI_CAP_ID_MSIX
>> + ? entry->msi.nvec : 1;
>
>I think you should mask the entry, like it's done in
>pt_irq_destroy_bind, see the call to guest_mask_msi_irq. That gives a
>consistent state between bind and unbind.
>
>> +
>> + while ( nr-- )
>> + {
>> + struct hvm_pirq_dpci *pirq_dpci;
>
>Nit: you could reduce the scope of info by declaring it here AFAICT.
>
>> +
>> + pirq = domain_irq_to_pirq(d, entry[nr].irq);
>> + WARN_ON(pirq < 0);
>> + if ( pirq <= 0 )
>> + continue;
>> +
>> + info = pirq_info(d, pirq);
>> + if ( !info )
>> + continue;
>> +
>> + pirq_dpci = pirq_dpci(info);
>> + if ( pirq_dpci &&
>> + (pirq_dpci->flags & HVM_IRQ_DPCI_MACH_MSI) &&
>> + (pirq_dpci->flags & HVM_IRQ_DPCI_GUEST_MSI) )
>> + pt_irq_destroy_bind_msi(d, info);
>> + }
>> +
>> + if ( pirq > 0 )
>> + {
>> + /*
>> + * The pirq is derived from an entry in msi_list rather than an
>> + * arbitrary value passed down. There should be a irq (msi) mapped
>> + * to this pirq. In this case, unmapping this pirq should succeed.
>> + * Otherwise, something goes wrong.
>> + */
>> + ret = unmap_domain_pirq(d, pirq);
>> + ASSERT(!ret);
>
>unmap_domain_pirq can fail, why not make pci_unmap_msi return an int
>and propagate the error to the caller?
>
>deassign_device returning an error should also be fine.
>
>> + }
>> + }
>> + /*
>> + * All pirq-s should have been unmapped and corresponding msi_desc
>> + * entries should have been removed in the above loop.
>> + */
>> + ASSERT(list_empty(&pdev->msi_list));
>> +
>> + spin_unlock(&d->event_lock);
>> +}
>> +
>> /* caller should hold the pcidevs_lock */
>> int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
>> {
>> @@ -1529,6 +1591,8 @@ int deassign_device(struct domain *d, u16 seg, u8 bus, u8 devfn)
>> if ( !pdev )
>> return -ENODEV;
>>
>> + pci_unmap_msi(pdev);
>
>Just want to make sure, since deassign_device will be called for both
>PV and HVM domains. AFAICT pci_unmap_msi is safe to call when the
>device is assigned to a PV guest, but would like your confirmation.
Tested with a PV guest loaded by Pygrub. PV guest doesn't suffer the
msi-x issue I want to fix.
With these three patches applied, I got some error messages from Xen
and Dom0 as follow:
(XEN) irq.c:2176: dom3: forcing unbind of pirq 332
(XEN) irq.c:2176: dom3: forcing unbind of pirq 331
(XEN) irq.c:2176: dom3: forcing unbind of pirq 328
(XEN) irq.c:2148: dom3: pirq 359 not mapped
[ 2887.067685] xen:events: unmap irq failed -22
(XEN) irq.c:2148: dom3: pirq 358 not mapped
[ 2887.075917] xen:events: unmap irq failed -22
(XEN) irq.c:2148: dom3: pirq 357 not mapped
It seems, the cause of such error is that pirq-s are unmapped and forcibly
unbound on deassignment; subsequent unmapping pirq issued by dom0 fail.
From some aspects, this error is expected. Because with this patch,
pirq-s are expected to be mapped by qemu or dom0 kernel (for pv case) before
deassignment and mapping/binding pirq after deassignment should fail.
So what's your opinion on handling such error? We should figure out another
method to fix msi-x issue to avoid such error or suppress these errors in
qemu and linux kernel?
Thanks
Chao
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
next prev parent reply other threads:[~2019-01-22 5:47 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-01-16 8:17 [PATCH v5 1/3] xen/pt: fix some pass-thru devices don't work across reboot Chao Gao
2019-01-16 8:17 ` [PATCH v5 2/3] libxl: don't reset device when it is accessible by the guest Chao Gao
2019-01-16 8:17 ` [PATCH v5 3/3] xen/pt: initialize 'warned' field of arch_msix properly Chao Gao
2019-01-16 10:38 ` [PATCH v5 1/3] xen/pt: fix some pass-thru devices don't work across reboot Roger Pau Monné
2019-01-16 11:59 ` Chao Gao
2019-01-16 12:34 ` Roger Pau Monné
2019-01-16 14:52 ` Chao Gao
2019-01-16 15:29 ` Roger Pau Monné
2019-01-22 5:50 ` Chao Gao [this message]
2019-01-22 8:24 ` Jan Beulich
2019-01-22 16:08 ` Chao Gao
2019-01-22 16:28 ` Jan Beulich
2019-01-25 11:10 ` Chao Gao
2019-01-25 15:57 ` Jan Beulich
2019-01-22 9:18 ` Roger Pau Monné
2019-01-22 16:11 ` Chao Gao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190122055015.GA19297@gao-cwp \
--to=chao.gao@intel.com \
--cc=jbeulich@suse.com \
--cc=roger.pau@citrix.com \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.