qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, yi.l.liu@intel.com,
	"\\  Michael S . Tsirkin \\ " <mst@redhat.com>,
	Jintack Lim <jintack@cs.columbia.edu>,
	Marcel Apfelbaum <marcel@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] intel_iommu: make sure its init before PCI dev
Date: Wed, 22 Feb 2017 20:24:51 -0700	[thread overview]
Message-ID: <20170222202451.458de865@t450s.home> (raw)
In-Reply-To: <20170223030647.GB4015@pxdev.xzpeter.org>

On Thu, 23 Feb 2017 11:06:47 +0800
Peter Xu <peterx@redhat.com> wrote:

> On Wed, Feb 22, 2017 at 10:30:47AM -0700, Alex Williamson wrote:
> > On Wed, 22 Feb 2017 13:49:25 +0800
> > Peter Xu <peterx@redhat.com> wrote:
> >   
> > > Intel vIOMMU devices are created with "-device" parameter, while here
> > > actually we need to make sure this device will be created before some
> > > other PCI devices (like vfio-pci devices) so that we know iommu_fn will
> > > be setup correctly before realizations of those PCI devices.
> > > 
> > > Here we do explicit check to make sure intel-iommu device will be inited
> > > before all the rest of the PCI devices. This is done by checking against
> > > the devices dangled under current root PCIe bus and we should see
> > > nothing there besides integrated ICH9 ones.
> > > 
> > > If the user violated this rule, we abort the program.
> > > 
> > > Maybe one day we will be able to manage the ordering of device
> > > initialization, and then we can grant VT-d devices a higher init
> > > priority. But before that, let's have this explicit check to make sure
> > > of it.
> > > 
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > >  hw/i386/intel_iommu.c | 40 ++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 40 insertions(+)
> > > 
> > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> > > index 22d8226..db74124 100644
> > > --- a/hw/i386/intel_iommu.c
> > > +++ b/hw/i386/intel_iommu.c
> > > @@ -31,6 +31,7 @@
> > >  #include "hw/i386/apic-msidef.h"
> > >  #include "hw/boards.h"
> > >  #include "hw/i386/x86-iommu.h"
> > > +#include "hw/i386/ich9.h"
> > >  #include "hw/pci-host/q35.h"
> > >  #include "sysemu/kvm.h"
> > >  #include "hw/i386/apic_internal.h"
> > > @@ -2560,6 +2561,41 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp)
> > >      return true;
> > >  }
> > >  
> > > +static bool vtd_has_inited_pci_devices(PCIBus *bus, Error **errp)
> > > +{
> > > +    int i;
> > > +    uint8_t func;
> > > +
> > > +    /* We check against root bus */
> > > +    assert(bus && pci_bus_is_root(bus));
> > > +
> > > +    /*
> > > +     * We need to make sure vIOMMU device is created before other PCI
> > > +     * devices other than the integrated ICH9 ones, so that they can
> > > +     * get correct iommu_fn setup even during its realize(). Some
> > > +     * devices (e.g., vfio-pci) will need a correct iommu_fn to work.
> > > +     */
> > > +    for (i = 1; i < PCI_FUNC_MAX * PCI_SLOT_MAX; i++) {
> > > +        /* Skip the checking against ICH9 integrated devices */
> > > +        if (PCI_SLOT(i) == ICH9_LPC_DEV) {
> > > +            func = PCI_FUNC(i);
> > > +            if (func == ICH9_LPC_FUNC ||
> > > +                func == ICH9_SATA1_FUNC ||
> > > +                func == ICH9_SMB_FUNC) {
> > > +                continue;
> > > +            }
> > > +        }  
> > 
> > 
> > Whitelisting specific devfns seems pretty sketchy and fragile.  Can we
> > even assume we're on a Q35 chipset?  I don't see that vtd_realize()
> > takes any particular precautions not to allow initialization on 440fx,
> > or whatever generic chipset we come up with next that may not have
> > these specific devices.  
> 
> Yes. IIUC VT-d now can only work with Q35, right? Cc Marcel and Paolo
> in case I misunderstood it.

440FX is not a PCIe chipset therefore on bare metal there'd be no such
thing as requester IDs with which to lookup context entries.  Of course
a VM doesn't care about those details, but I think it best not to tempt
users with invalid configs.
 
> If so, maybe here we should check against q35 in vtd realization. How
> about something like:
> 
>     if (!object_property_find((Object *)pcms, "q35", NULL)) {
>         error_setg(errp, "Currently Intel vIOMMU only support Q35 platform. "
>                    "Please specify \"-M q35\" to enable it.");
>         return;
>     }
> 
> ?
> 
> > It would probably be a better idea to use
> > object_dynamic_cast() if you want to whitelist specific devices.
> > Perhaps this could even be used to validate the chipset as well.  
> 
> Now Jintack reported another issue, that we may have two default
> devices there if not specifying "-nodefaults", and that two devices
> will always be the first ones to be inited.
> 
> How about here we just explicitly check against vfio-pci devices, so
> we just make sure vfio-pci devices will be put after intel-iommu?
> Since actually vfio-pci devices are the only ones that we know we need
> to be inited explicitly after the VT-d device.

I was afraid you were going to come to this conclusion.  That works,
BUT it just means the problem gets ignored as a vfio problem when
really vfio is doing nothing wrong other than caring about the device
address space during its initialization.  Then users have a perfectly
working config, add a vfio-pci device and things explode.  If you want
to impose a user ordering requirement, do it consistently.  Thanks,

Alex

  reply	other threads:[~2017-02-23  3:24 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-22  5:49 [Qemu-devel] [PATCH] intel_iommu: make sure its init before PCI dev Peter Xu
2017-02-22 11:42 ` Jintack Lim
2017-02-22 13:37   ` Jintack Lim
2017-02-23  2:35     ` Peter Xu
2017-02-22 17:30 ` Alex Williamson
2017-02-23  3:06   ` Peter Xu
2017-02-23  3:24     ` Alex Williamson [this message]
2017-02-23  5:42       ` Peter Xu
2017-02-23 23:26         ` Michael S. Tsirkin
2017-02-23  8:10     ` Marcel Apfelbaum
2017-02-23  8:16       ` Peter Xu
2017-02-23 12:02         ` Marcel Apfelbaum
2017-02-23 15:35           ` Alex Williamson
2017-02-28 14:37             ` Marcel Apfelbaum
2017-03-09 12:31             ` Paolo Bonzini
2017-03-09 15:29               ` Michael S. Tsirkin
2017-03-09 15:34                 ` Paolo Bonzini
2017-02-23 23:21 ` Michael S. Tsirkin
2017-02-24  2:45   ` Peter Xu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170222202451.458de865@t450s.home \
    --to=alex.williamson@redhat.com \
    --cc=jintack@cs.columbia.edu \
    --cc=marcel@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=yi.l.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).