Re: [Qemu-devel] [PATCH v3 3/3] IOMMU: Integrate between VFIO and vIOMMU to support device assignment


From: Alex Williamson <alex.williamson@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Aviv B.D" <bd.aviv@gmail.com>,
	qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>,
	Jan Kiszka <jan.kiszka@siemens.com>
Subject: Re: [Qemu-devel] [PATCH v3 3/3] IOMMU: Integrate between VFIO and vIOMMU to support device assignment
Date: Mon, 6 Jun 2016 11:30:24 -0600
Message-ID: <20160606113024.350e3d85@ul30vt.home>
In-Reply-To: <20160606073825.GH21254@pxdev.xzpeter.org>

On Mon, 6 Jun 2016 15:38:25 +0800
Peter Xu <peterx@redhat.com> wrote:

> Some questions not quite related to this patch content but vfio...
> 
> On Mon, May 23, 2016 at 11:53:42AM -0600, Alex Williamson wrote:
> > On Sat, 21 May 2016 19:19:50 +0300
> > "Aviv B.D" <bd.aviv@gmail.com> wrote:  
> 
> [...]
> 
> > > +#if 0
> > >  static hwaddr vfio_container_granularity(VFIOContainer *container)
> > >  {
> > >      return (hwaddr)1 << ctz64(container->iova_pgsizes);
> > >  }
> > > -
> > > +#endif  
> 
> Here we are fetching the smallest page size that the host IOMMU
> supports, so even if the host IOMMU supports large pages, they will
> not be used as long as the guest has a vIOMMU enabled, right?

Not using this replay mechanism, correct.  AFAIK, this replay code has
only been tested on POWER where the window is much, much smaller than
the 64-bit address space and hugepages are not supported.  A replay
callback into the iommu could not only walk the address space more
efficiently, but also attempt to map with hugepages.  It would however
need to be cautious not to coalesce separate mappings by the guest into
a single mapping through vfio, or else we're going to have an
inconsistency between mapping and unmapping that vfio does not expect
or support.
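
To illustrate the "do not coalesce" constraint, here is a minimal
sketch, not QEMU code: GuestMapping, map_fn, replay_mappings() and
count_bytes() are hypothetical names invented for this example.  It
assumes the vIOMMU can enumerate the guest's mappings directly, and it
issues exactly one notification per guest mapping, at whatever size the
guest chose, without merging contiguous neighbours.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one guest IOMMU mapping (not a QEMU type). */
typedef struct {
    uint64_t iova;   /* guest I/O virtual address, aligned to size */
    uint64_t size;   /* size chosen by the guest, e.g. 4K or 2M */
    uint64_t paddr;  /* translated address */
} GuestMapping;

/* Hypothetical notifier: would end up as one vfio DMA map request. */
typedef void (*map_fn)(const GuestMapping *m, void *opaque);

/*
 * Replay the guest's mappings one by one.  Adjacent entries are
 * deliberately NOT merged, even when they are contiguous, so that a
 * later guest unmap corresponds to exactly one earlier map.
 * Returns the number of notifications issued.
 */
static size_t replay_mappings(const GuestMapping *maps, size_t n,
                              map_fn notify, void *opaque)
{
    size_t issued = 0;

    for (size_t i = 0; i < n; i++) {
        /* size must be a power of two and iova aligned to it */
        assert(maps[i].size != 0 &&
               (maps[i].size & (maps[i].size - 1)) == 0);
        assert((maps[i].iova & (maps[i].size - 1)) == 0);

        notify(&maps[i], opaque);
        issued++;
    }
    return issued;
}

/* Example notifier: accumulates the number of bytes mapped. */
static void count_bytes(const GuestMapping *m, void *opaque)
{
    *(uint64_t *)opaque += m->size;
}
```

With one 2M mapping followed by two contiguous 4K mappings, this issues
three notifications rather than one merged 2M+8K mapping, so each
mapping can later be unmapped on its own.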
 
> > 
> > 
> > Clearly this is unacceptable, the code has a purpose.
> >   
> > >  static void vfio_listener_region_add(MemoryListener *listener,
> > >                                       MemoryRegionSection *section)
> > >  {
> > > @@ -384,11 +387,13 @@ static void vfio_listener_region_add(MemoryListener *listener,
> > >          giommu->n.notify = vfio_iommu_map_notify;
> > >          QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next);
> > >  
> > > +        vtd_register_giommu(giommu);  
> > 
> > vfio will not assume VT-d, this is why we register the notifier below.
> >   
> > >          memory_region_register_iommu_notifier(giommu->iommu, &giommu->n);
> > > +#if 0
> > >          memory_region_iommu_replay(giommu->iommu, &giommu->n,
> > >                                     vfio_container_granularity(container),
> > >                                     false);  
> 
> For memory_region_iommu_replay(), we are using
> vfio_container_granularity() as the granularity, which is the host
> IOMMU page size. However inside it:
> 
> void memory_region_iommu_replay(MemoryRegion *mr, Notifier *n,
>                                 hwaddr granularity, bool is_write)
> {
>     hwaddr addr;
>     IOMMUTLBEntry iotlb;
> 
>     for (addr = 0; addr < memory_region_size(mr); addr += granularity) {
>         iotlb = mr->iommu_ops->translate(mr, addr, is_write);
>         if (iotlb.perm != IOMMU_NONE) {
>             n->notify(n, &iotlb);
>         }
> 
>         /* if (2^64 - MR size) < granularity, it's possible to get an
>          * infinite loop here.  This should catch such a wraparound */
>         if ((addr + granularity) < addr) {
>             break;
>         }
>     }
> }
> 
> Is it possible that the iotlb entry maps a large page (or any page
> whose size differs from granularity)? The above code seems to assume
> that the host and guest IOMMUs use the same page size == granularity?

I think this is answered above.  This is not remotely efficient code
for a real 64-bit IOMMU (BTW, VT-d does not support the full 64-bit
address space either, I believe it's more like 48 bits) and is not going
to replay hugepages, but it will give us sufficiently correct IOMMU
entries... eventually. Thanks,

Alex
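
For a rough sense of the "eventually" above, here is a minimal sketch
(not QEMU code; replay_iterations() is a hypothetical helper) that
counts how many translate() calls the linear replay loop quoted above
would make.  It assumes the region size is 2^addr_bits bytes and that
the granularity is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of loop iterations (one translate() call each) for a linear
 * walk of a 2^addr_bits byte region in steps of 'granularity' bytes.
 * addr_bits must be below 64 so the size fits in a uint64_t.
 */
static uint64_t replay_iterations(unsigned addr_bits, uint64_t granularity)
{
    assert(addr_bits < 64);
    assert(granularity != 0 && (granularity & (granularity - 1)) == 0);

    uint64_t size = (uint64_t)1 << addr_bits;

    /* A region smaller than one step is still visited once, at 0. */
    return size >= granularity ? size / granularity : 1;
}
```

For a 48-bit address space at 4K granularity this gives 2^36, roughly
68.7 billion translate() calls, regardless of how few pages the guest
has actually mapped.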
