From: Peter Xu <peterx@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-devel@nongnu.org, tianyu.lan@intel.com,
	kevin.tian@intel.com, mst@redhat.com, jan.kiszka@siemens.com,
	jasowang@redhat.com, bd.aviv@gmail.com,
	david@gibson.dropbear.id.au
Subject: Re: [Qemu-devel] [PATCH] intel_iommu: allow dynamic switch of IOMMU region
Date: Tue, 20 Dec 2016 11:44:41 +0800
Message-ID: <20161220034441.GA19964@pxdev.xzpeter.org>
In-Reply-To: <20161219095650.0a3ac113@t450s.home>

On Mon, Dec 19, 2016 at 09:56:50AM -0700, Alex Williamson wrote:
> On Mon, 19 Dec 2016 22:41:26 +0800
> Peter Xu <peterx@redhat.com> wrote:
> 
> > This is preparation work to finally enable dynamic switching ON/OFF for
> > VT-d protection. The old VT-d code uses a static IOMMU region, and
> > that won't satisfy vfio-pci device listeners.
> > 
> > Let me explain.
> > 
> > vfio-pci devices depend on the memory region listener and IOMMU replay
> > mechanism to make sure the device mapping is coherent with the guest
> > even if there are domain switches. And there are two kinds of domain
> > switches:
> > 
> >   (1) switch from domain A -> B
> >   (2) switch from domain A -> no domain (e.g., turn DMAR off)
> > 
> > Case (1) is handled during context entry invalidation by the VT-d
> > replay logic. What the replay function should do here is to replay
> > the existing page mappings in domain B.
> > 
> > However, for case (2), we don't want to replay any domain mappings; we
> > just need the default GPA->HPA mappings (the address_space_memory
> > mapping). This patch handles case (2) by building up that mapping
> > automatically, leveraging the vfio-pci memory listeners.
> > 
> > Another important thing that this patch does is to separate
> > IR (Interrupt Remapping) from DMAR (DMA Remapping). The IR region should
> > not depend on the DMAR region (as it did before this patch). It should
> > be a standalone region, and it should be possible to activate it without
> > DMAR (which is common behavior for the Linux kernel: by default it
> > enables IR while leaving DMAR disabled).
> 
> 
> This seems like an improvement, but I will note that there are existing
> locked memory accounting issues inherent with VT-d and vfio.  With
> VT-d, each device has a unique AddressSpace.  This requires that each
> is managed via a separate vfio container.  Each container is accounted
> for separately for locked pages.  libvirt currently only knows that if
> any vfio devices are attached, the locked memory limit for the
> process needs to be set high enough for the VM memory.  When VT-d is
> involved, we either need to figure out how to associate otherwise
> independent vfio containers to share locked page accounting or teach
> libvirt that the locked memory requirement needs to be multiplied by
> the number of attached vfio devices.  The latter seems far less
> complicated but reduces the containment of QEMU a bit since the
> process has the ability to lock potentially many multiples of the VM
> address size.  Thanks,

Yes, this patch just tries to move VT-d forward a bit, rather than
solve everything at once. I think we can do better than this in the
future, for example, one address space per guest IOMMU domain (as you
have mentioned before). However, I suppose that will need more work
(and I can't yet estimate how much). So I am planning to get device
assignment functionally working first, and then we can improve further
on top of a working version. The same thinking applies to the IOMMU
replay RFC series.
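
To make the two domain-switch cases from the patch description
concrete, here is a toy model in Python. It is only a sketch and not
QEMU code: the function name, the dict-based tables, and the use of
None for "no domain" are all hypothetical.

```python
def mappings_after_switch(new_domain, domain_tables, default_map):
    """Return the mappings a vfio-pci device should end up with.

    new_domain    -- domain id, or None when DMAR is turned off
                     (hypothetical convention for this sketch)
    domain_tables -- {domain id: {IOVA: HPA}} guest IOMMU page mappings
    default_map   -- {GPA: HPA}, standing in for address_space_memory
    """
    if new_domain is None:
        # Case (2): no domain, so no domain mappings are replayed;
        # the device falls back to the default GPA->HPA mappings.
        return dict(default_map)
    # Case (1): switch A -> B, replay the existing mappings of domain B.
    return dict(domain_tables[new_domain])


default_map = {0x1000: 0xA000}
domain_tables = {"A": {0x3000: 0xC000}, "B": {0x2000: 0xB000}}

after_switch_to_b = mappings_after_switch("B", domain_tables, default_map)
after_dmar_off = mappings_after_switch(None, domain_tables, default_map)
```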

Regarding the locked memory accounting issue: do we have an existing
way to do the accounting? If so, would you (or anyone) please
elaborate a bit? If not, is that ongoing or planned work?
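
For reference, my understanding of the second option you describe
(multiplying the limit by the number of attached vfio devices) is the
arithmetic below. This is only a sketch in Python under my own
assumptions: the function name and the example sizes are hypothetical,
and it is not how libvirt computes the limit today.

```python
GIB = 1 << 30


def locked_memory_limit(vm_memory_bytes, n_vfio_devices, container_per_device):
    """Estimate the locked memory limit for the QEMU process, in bytes.

    container_per_device is True when each device sits in its own vfio
    container (the VT-d case, one AddressSpace per device), so each
    container may lock up to the whole VM memory and is accounted
    separately.
    """
    if n_vfio_devices == 0:
        # Assumption of this sketch: no vfio-driven requirement at all.
        return 0
    multiplier = n_vfio_devices if container_per_device else 1
    return vm_memory_bytes * multiplier


# Hypothetical example: a 4 GiB guest with 3 assigned devices.
shared = locked_memory_limit(4 * GIB, 3, container_per_device=False)
per_device = locked_memory_limit(4 * GIB, 3, container_per_device=True)
```

With these example numbers the limit grows from 4 GiB to 12 GiB, which
is the reduced containment you mention.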

Thanks,

-- peterx
