From: Alex Williamson <alex.williamson@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>,
	tianyu.lan@intel.com, kevin.tian@intel.com, mst@redhat.com,
	jan.kiszka@siemens.com, bd.aviv@gmail.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH RFC v4 18/20] intel_iommu: enable vfio devices
Date: Mon, 23 Jan 2017 11:03:08 -0700
Message-ID: <20170123110308.5d36ce87@t450s.home>
In-Reply-To: <20170123033429.GF26526@pxdev.xzpeter.org>

On Mon, 23 Jan 2017 11:34:29 +0800
Peter Xu <peterx@redhat.com> wrote:

> On Mon, Jan 23, 2017 at 09:55:39AM +0800, Jason Wang wrote:
> > 
> > 
> > On 2017-01-22 17:04, Peter Xu wrote:
> > >On Sun, Jan 22, 2017 at 04:08:04PM +0800, Jason Wang wrote:
> > >
> > >[...]
> > >  
> > >>>+static void vtd_iotlb_page_invalidate_notify(IntelIOMMUState *s,
> > >>>+                                           uint16_t domain_id, hwaddr addr,
> > >>>+                                           uint8_t am)
> > >>>+{
> > >>>+    IntelIOMMUNotifierNode *node;
> > >>>+    VTDContextEntry ce;
> > >>>+    int ret;
> > >>>+
> > >>>+    QLIST_FOREACH(node, &(s->notifiers_list), next) {
> > >>>+        VTDAddressSpace *vtd_as = node->vtd_as;
> > >>>+        ret = vtd_dev_to_context_entry(s, pci_bus_num(vtd_as->bus),
> > >>>+                                       vtd_as->devfn, &ce);
> > >>>+        if (!ret && domain_id == VTD_CONTEXT_ENTRY_DID(ce.hi)) {
> > >>>+            vtd_page_walk(&ce, addr, addr + (1 << am) * VTD_PAGE_SIZE,
> > >>>+                          vtd_page_invalidate_notify_hook,
> > >>>+                          (void *)&vtd_as->iommu, true);  
> > >>Why not simply trigger the notifier here? (or is this vfio required?)  
> > >Because we may only want to notify part of the region - we are with
> > >mask here, but not exact size.
> > >
> > >Consider this: guest (with caching mode) maps 12K memory (4K*3 pages),
> > >the mask will be extended to 16K in the guest. In that case, we need
> > >to explicitly go over the page entry to know that the 4th page should
> > >not be notified.  
> > 
> > I see. Then it was required by vfio only, I think we can add a fast path for
> > !CM in this case by triggering the notifier directly.  
> 
> I noted this down (to be further investigated in my todo), but I don't
> know whether this can work, because I think it is still legal for the
> guest to merge more than one PSI into one. For example, I don't know
> whether the sequence below is legal:
> 
> - guest invalidate page (0, 4k)
> - guest map new page (4k, 8k)
> - guest send single PSI of (0, 8k)
> 
> In that case, it contains both map/unmap, and looks like it doesn't
> disobey the spec either?
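
For reference, the mask arithmetic behind the quoted 12K/16K example can
be sketched as below.  This is only an illustration, assuming 4K pages
and a base address aligned to the resulting range; psi_addr_mask() and
psi_range_bytes() are hypothetical helpers, not QEMU or VT-d driver
functions.  It mirrors the `(1 << am) * VTD_PAGE_SIZE` expression in the
quoted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4K pages, standing in for VTD_PAGE_SIZE in the patch. */
#define TOY_PAGE_SIZE 4096ULL

/*
 * Smallest address mask (am) such that 2^am pages cover npages.
 * Assumes the base address is aligned to the resulting range.
 */
static unsigned psi_addr_mask(uint64_t npages)
{
    unsigned am = 0;

    assert(npages >= 1);
    while ((1ULL << am) < npages) {
        am++;
    }
    return am;
}

/* Number of bytes covered by a PSI carrying the given address mask. */
static uint64_t psi_range_bytes(unsigned am)
{
    return (1ULL << am) * TOY_PAGE_SIZE;
}
```

With three pages, psi_addr_mask(3) yields 2, so the invalidation spans
four pages; the fourth page is the one the page walk has to skip.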

The topic of mapping and invalidation granularity also makes me
slightly concerned with the abstraction we use for the type1 IOMMU
backend.  With the "v2" type1 configuration we currently use in QEMU,
the user may only unmap with the same minimum granularity with which
the original mapping was created.  For instance if an iommu notifier
map request gets to vfio with an 8k range, the resulting mapping can
only be removed by an invalidation covering the full range.  Trying to
bisect that original mapping by only invalidating 4k of the range will
generate an error.
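
To make the rule concrete, here is a toy model of the unmap check.  It
is a sketch of the semantics described above, not the kernel's vfio
code: ToyContainer, toy_map() and toy_unmap() are made-up names, and the
fixed-size table is only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_MAPPINGS 16

typedef struct {
    uint64_t iova;
    uint64_t size;
    bool used;
} ToyMapping;

typedef struct {
    ToyMapping maps[TOY_MAX_MAPPINGS];
} ToyContainer;

/* Record one mapping; fails on overlap or when the table is full. */
static int toy_map(ToyContainer *c, uint64_t iova, uint64_t size)
{
    ToyMapping *slot = NULL;

    assert(size > 0);
    for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
        ToyMapping *m = &c->maps[i];
        if (!m->used) {
            if (!slot) {
                slot = m;
            }
        } else if (m->iova < iova + size && iova < m->iova + m->size) {
            return -1;
        }
    }
    if (!slot) {
        return -1;
    }
    *slot = (ToyMapping){ .iova = iova, .size = size, .used = true };
    return 0;
}

/*
 * Unmap [iova, iova + size).  Fails without side effects if the range
 * would bisect an existing mapping; otherwise removes every mapping
 * fully contained in the range.
 */
static int toy_unmap(ToyContainer *c, uint64_t iova, uint64_t size)
{
    uint64_t end = iova + size;

    for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
        ToyMapping *m = &c->maps[i];
        uint64_t m_end = m->iova + m->size;
        bool overlaps = m->used && m->iova < end && iova < m_end;
        bool contained = m->iova >= iova && m_end <= end;
        if (overlaps && !contained) {
            return -1;
        }
    }
    for (size_t i = 0; i < TOY_MAX_MAPPINGS; i++) {
        ToyMapping *m = &c->maps[i];
        if (m->used && m->iova >= iova && m->iova + m->size <= end) {
            m->used = false;
        }
    }
    return 0;
}
```

In this model, mapping an 8k range and then unmapping only its first 4k
fails, while unmapping the full 8k succeeds.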

I would think (but please confirm) that this works as long as we're
only tracking mappings generated by the guest OS.  If the guest OS maps
with 4k pages, we get map notifies for each of those 4k pages.  If it
uses 2MB pages, we get 2MB ranges, and invalidations will come in the
same granularity.

An area of concern, though, is the replay mechanism in QEMU.  I'll need
to look for it in the code, but replaying an IOMMU domain into a new
container *cannot* coalesce mappings, or else it limits the granularity
with which we can later accept unmaps.  Take for instance a guest that
has mapped a contiguous 2MB range with 4K pages.  It can unmap any 4K
page within that range.  However, if vfio gets a single 2MB mapping
rather than 512 4K mappings, then the host IOMMU may use a hugepage
mapping, and our unmap granularity is now 2MB.  Thanks,

Alex
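
P.S. A rough sketch of what a non-coalescing replay means in practice:
one notification per guest page, never one per contiguous range.  The
names here (toy_map_fn, toy_replay_per_page, toy_count_cb) are
hypothetical and do not correspond to QEMU's replay code, which I have
not checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Callback invoked once per replayed mapping (hypothetical type). */
typedef void (*toy_map_fn)(uint64_t iova, uint64_t size, void *opaque);

/*
 * Replay a contiguous guest range at the guest's own page granularity.
 * Each guest page is reported separately, so a later unmap of any
 * single page matches a mapping that was actually created.
 * Returns the number of notifications sent.
 */
static size_t toy_replay_per_page(uint64_t iova, uint64_t size,
                                  uint64_t page_size,
                                  toy_map_fn fn, void *opaque)
{
    size_t sent = 0;

    assert(page_size > 0 && size % page_size == 0);
    for (uint64_t off = 0; off < size; off += page_size) {
        fn(iova + off, page_size, opaque);
        sent++;
    }
    return sent;
}

/* Example callback: counts the mappings it is asked to create. */
static void toy_count_cb(uint64_t iova, uint64_t size, void *opaque)
{
    (void)iova;
    (void)size;
    (*(size_t *)opaque)++;
}
```

Replaying a 2MB range with a 4K page size through toy_count_cb produces
512 separate map calls, matching the 512 4K mappings the guest created.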
