From: Alex Williamson <alex.williamson@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Jintack Lim <jintack@cs.columbia.edu>,
mst@redhat.com, QEMU Devel Mailing List <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] iommu emulation
Date: Wed, 15 Feb 2017 19:47:58 -0700 [thread overview]
Message-ID: <20170215194758.4622d73c@t450s.home> (raw)
In-Reply-To: <20170216022839.GC15985@pxdev.xzpeter.org>
On Thu, 16 Feb 2017 10:28:39 +0800
Peter Xu <peterx@redhat.com> wrote:
> On Wed, Feb 15, 2017 at 11:15:52AM -0700, Alex Williamson wrote:
>
> [...]
>
> > > Alex, do you like something like below to fix above issue that Jintack
> > > has encountered?
> > >
> > > (note: this code is not for compile, only trying show what I mean...)
> > >
> > > ------8<-------
> > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> > > index 332f41d..4dca631 100644
> > > --- a/hw/vfio/pci.c
> > > +++ b/hw/vfio/pci.c
> > > @@ -1877,25 +1877,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > > */
> > > config = g_memdup(pdev->config, vdev->config_size);
> > >
> > > - /*
> > > - * Extended capabilities are chained with each pointing to the next, so we
> > > - * can drop anything other than the head of the chain simply by modifying
> > > - * the previous next pointer. For the head of the chain, we can modify the
> > > - * capability ID to something that cannot match a valid capability. ID
> > > - * 0 is reserved for this since absence of capabilities is indicated by
> > > - * 0 for the ID, version, AND next pointer. However, pcie_add_capability()
> > > - * uses ID 0 as reserved for list management and will incorrectly match and
> > > - * assert if we attempt to pre-load the head of the chain with this ID.
> > > - * Use ID 0xFFFF temporarily since it is also seems to be reserved in
> > > - * part for identifying absence of capabilities in a root complex register
> > > - * block. If the ID still exists after adding capabilities, switch back to
> > > - * zero. We'll mark this entire first dword as emulated for this purpose.
> > > - */
> > > - pci_set_long(pdev->config + PCI_CONFIG_SPACE_SIZE,
> > > - PCI_EXT_CAP(0xFFFF, 0, 0));
> > > - pci_set_long(pdev->wmask + PCI_CONFIG_SPACE_SIZE, 0);
> > > - pci_set_long(vdev->emulated_config_bits + PCI_CONFIG_SPACE_SIZE, ~0);
> > > -
> > > for (next = PCI_CONFIG_SPACE_SIZE; next;
> > > next = PCI_EXT_CAP_NEXT(pci_get_long(config + next))) {
> > > header = pci_get_long(config + next);
> > > @@ -1917,6 +1898,8 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > > switch (cap_id) {
> > > case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
> > > case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
> > > + /* keep this ecap header (4 bytes), but mask cap_id to 0xffff */
> > > + ...
> > > trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
> > > break;
> > > default:
> > > @@ -1925,11 +1908,6 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> > >
> > > }
> > >
> > > - /* Cleanup chain head ID if necessary */
> > > - if (pci_get_word(pdev->config + PCI_CONFIG_SPACE_SIZE) == 0xFFFF) {
> > > - pci_set_word(pdev->config + PCI_CONFIG_SPACE_SIZE, 0);
> > > - }
> > > -
> > > g_free(config);
> > > return;
> > > }
> > > ----->8-----
> > >
> > > Since after all we need the assumption that 0xffff is reserved for
> > > cap_id. Then, we can just remove the "first 0xffff then 0x0" hack,
> > > which is imho error-prone and hacky.
> >
> > This doesn't fix the bug, which is that pcie_add_capability() uses a
> > valid capability ID for it's own internal tracking. It's only doing
> > this to find the end of the capability chain, which we could do in a
> > spec complaint way by looking for a zero next pointer. Fix that and
> > then vfio doesn't need to do this set to 0xffff then back to zero
> > nonsense at all. Capability ID zero is valid. Thanks,
>
> Yeah I see Michael's fix on the capability list stuff. However, imho
> these are two different issues? Or say, even if with that patch, we
> should still need this hack (first 0x0, then 0xffff) right? Since
> looks like that patch didn't solve the problem if the first pcie ecap
> is masked at 0x100.
I thought the problem was that QEMU in the host exposes a device with a
capability ID of 0 to the L1 guest. QEMU in the L1 guest balks at a
capability ID of 0 because that's how it finds the end of the chain.
Therefore if we make QEMU not use capability ID 0 for internal
purposes, things work. vfio using 0xffff and swapping back to 0x0
becomes unnecessary, but doesn't hurt anything. Thanks,
Alex
next prev parent reply other threads:[~2017-02-16 2:48 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CAHyh4xiVKjd+D=qaizUZ02O8xLYhpoVKOqC9cR0ZWWyLq9HtbQ@mail.gmail.com>
[not found] ` <20170208031216.GA5151@pxdev.xzpeter.org>
[not found] ` <CAHyh4xg7NVPjXu3c+xGWNzQqwLgFqFJTPo4SgN-X+FNuHjGihQ@mail.gmail.com>
[not found] ` <CAHyh4xhOPmfLoU_fvtbBF1Wqbzji9q6rp_bRN38qfnwvhQq+9A@mail.gmail.com>
2017-02-09 3:52 ` [Qemu-devel] iommu emulation Peter Xu
2017-02-09 13:01 ` Jintack Lim
2017-02-14 7:35 ` Peter Xu
2017-02-14 12:50 ` Jintack Lim
2017-02-15 2:52 ` Peter Xu
2017-02-15 3:34 ` Peter Xu
2017-02-15 18:15 ` Alex Williamson
2017-02-16 2:28 ` Peter Xu
2017-02-16 2:47 ` Alex Williamson [this message]
2017-02-21 10:33 ` Jintack Lim
2017-02-23 23:04 ` Jintack Lim
2017-03-02 22:20 ` Bandan Das
2017-03-02 23:36 ` Jintack Lim
2017-03-03 3:43 ` Peter Xu
2017-03-03 7:45 ` Bandan Das
2017-02-15 22:05 ` Jintack Lim
2017-02-15 22:50 ` Alex Williamson
2017-02-15 23:25 ` Jintack Lim
2017-02-16 1:17 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170215194758.4622d73c@t450s.home \
--to=alex.williamson@redhat.com \
--cc=jintack@cs.columbia.edu \
--cc=mst@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).