From: Peter Xu <peterx@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Aviv B.D" <bd.aviv@gmail.com>,
	qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>,
	Jan Kiszka <jan.kiszka@siemens.com>,
	Jason Wang <jasowang@redhat.com>
Subject: Re: [Qemu-devel] [PATCH v7 0/5] IOMMU: intel_iommu support map and unmap notifications
Date: Fri, 2 Dec 2016 14:17:07 +0800
Message-ID: <20161202061707.GE21601@pxdev.xzpeter.org>
In-Reply-To: <20161201084204.24ce53b4@t450s.home>

On Thu, Dec 01, 2016 at 08:42:04AM -0700, Alex Williamson wrote:
> On Wed, 30 Nov 2016 17:23:59 +0800
> Peter Xu <peterx@redhat.com> wrote:
> 
> > On Mon, Nov 28, 2016 at 05:51:50PM +0200, Aviv B.D wrote:
> > > * intel_iommu's replay op is not implemented yet (May come in different patch 
> > >   set).
> > >   The replay function is required for hotplug vfio device and to move devices 
> > >   between existing domains.  
> > 
> > I have been thinking about this replay thing recently, and now I am
> > starting to doubt whether the whole vt-d vIOMMU framework suits this...
> > 
> > Generally speaking, current work is throwing away the IOMMU "domain"
> > layer here. We maintain the mapping only per device, and we don't care
> > too much about which domain it belongs. This seems problematic.
> > 
> > The simplest failing case for this is (let's assume cache-mode is
> > enabled): if we have two assigned devices A and B, both belong to the
> > same domain 1. Meanwhile, in domain 1 assume we have one mapping which
> > is the first page (iova range 0-0xfff). Then, if guest wants to
> > invalidate the page, it'll notify VT-d vIOMMU with an invalidation
> > message. If we do this invalidation per-device, we'll need to UNMAP
> > the region twice - once for A, once for B (if we have more devices, we
> > will unmap more times), and we can never know we have done duplicated
> > work since we don't keep domain info, so we don't know they are using
> > the same address space. The first unmap will work, and then we'll
> > possibly get DMA unmap errors for the rest.
> > 
> > Looks like we just cannot live without knowing this domain layer,
> > because the address space is bound to the domain. If we want to sync
> > the address space (here to set up a correct shadow page table), we need
> > to do it per-domain.
> > 
> > What I can think of as a solution is that we introduce this "domain"
> > layer - like a memory region per domain. When invalidation happens,
> > it's per-domain, not per-device any more (actually I guess that's what
> > current vt-d iommu driver in kernel is doing, we just ignored it - we
> > fetch the devices that match the domain ID). We can/need to maintain
> > something different, like sid <-> domain mappings (we can do this as
> > long as we are notified when context entries changed), per-domain
> > mappings (just like per-device mappings that we are trying to build in
> > this series, but what we really need is IMHO per domain one), etc.
> > When device switches domain, we switch the IOMMU memory region
> > accordingly.
> > 
> > Does this make any sense? Comments are very welcome (especially
> > from AlexW and DavidG).
> 
> It's been a bit since I've looked at VT-d emulation, but I certainly
> remember that it's way more convoluted than I expected.  It seems like
> a domain should create an AddressSpace and any devices assigned to that
> domain should make use of that single address space, but IIRC VT-d
> creates an address space per device, ie. per context entry.

Yes, I think this idea (one address space per domain) came from one of
your replies in the past, and I just found it more essential than I
thought before.

I'll see whether I can sort this out before moving on to the replay
implementation, because IIUC the replay will depend on it
(introducing the domain layer in VT-d IOMMU emulation).
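
To make the problem and the proposed direction concrete, here is a rough
toy model in Python. It is only a sketch under stated assumptions, not
QEMU code: HostMappings, invalidate_per_device, DomainLayer and every
method name are hypothetical and exist only for illustration. It shows
the duplicated UNMAP when two devices share domain 1, and how a
sid <-> domain table with one mapping set per domain avoids it.

```python
# Toy model only: none of these names exist in QEMU or the kernel.

class HostMappings:
    """Stands in for the host-side mappings of one address space."""

    def __init__(self):
        self.pages = set()
        self.failed_unmaps = 0

    def map(self, iova):
        self.pages.add(iova)

    def unmap(self, iova):
        if iova not in self.pages:
            # Duplicated unmap: the page was already removed.
            self.failed_unmaps += 1
            return False
        self.pages.remove(iova)
        return True


def invalidate_per_device(devices, host, iova):
    """Per-device scheme: one UNMAP per device, domain is unknown."""
    return [host.unmap(iova) for _ in devices]


class DomainLayer:
    """Per-domain scheme: sid -> domain id, one mapping set per domain."""

    def __init__(self):
        self.sid_to_domain = {}
        self.domains = {}

    def context_entry_changed(self, sid, domain_id):
        # Called when the guest changes a context entry; a device that
        # switches domain simply starts pointing at another mapping set.
        self.sid_to_domain[sid] = domain_id
        self.domains.setdefault(domain_id, HostMappings())

    def address_space_of(self, sid):
        return self.domains[self.sid_to_domain[sid]]

    def invalidate(self, domain_id, iova):
        # One UNMAP per domain, however many devices belong to it.
        return self.domains[domain_id].unmap(iova)


# Devices A and B both belong to domain 1, which maps the first page.
host = HostMappings()
host.map(0x0)
per_device_results = invalidate_per_device(["A", "B"], host, 0x0)

layer = DomainLayer()
layer.context_entry_changed("A", 1)
layer.context_entry_changed("B", 1)
layer.address_space_of("A").map(0x0)
per_domain_result = layer.invalidate(1, 0x0)
```

In the per-device case the second unmap fails because the first one
already removed the page; in the per-domain case A and B resolve to the
same mapping set, so the invalidation is applied once.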

Thanks!

-- peterx
