Re: [Qemu-devel] [PATCH for-2.9 2/2] intel_iommu: extend supported guest aw to 48 bits

From: Peter Xu <peterx@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	jasowang@redhat.com, famz@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-2.9 2/2] intel_iommu: extend supported guest aw to 48 bits
Date: Tue, 13 Dec 2016 14:12:12 +0800
Message-ID: <20161213061212.GC32222@pxdev.xzpeter.org>
In-Reply-To: <20161212224828.5cc9f841@t450s.home>

On Mon, Dec 12, 2016 at 10:48:28PM -0700, Alex Williamson wrote:
> On Tue, 13 Dec 2016 13:24:29 +0800
> Peter Xu <peterx@redhat.com> wrote:
> 
> > On Mon, Dec 12, 2016 at 08:51:50PM -0700, Alex Williamson wrote:
> > 
> > [...]
> > 
> > > > > I'm not sure how the vIOMMU supporting 39 bits or 48 bits is directly
> > > > > relevant to vfio, we're not sharing page tables.  There is already a
> > > > > case today, without vIOMMU that you can make a guest which has more
> > > > > guest physical address space than the hardware IOMMU by overcommitting
> > > > > system memory.  Generally this quickly resolves itself when we start
> > > > > pinning pages since the physical address width of the IOMMU is
> > > > > typically the same as the physical address width of the host system
> > > > > (ie. we exhaust the host memory).    
> > > > 
> > > > Hi, Alex,
> > > > 
> > > > > Here, does "hardware IOMMU" mean the IOMMU IOVA address space width?
> > > > > For example, if the guest has a 48-bit physical address width (without
> > > > > a vIOMMU), but the host hardware IOMMU only supports 39 bits for its IOVA
> > > > > address space, could device assignment work in this case?
> > > 
> > > The current usage depends entirely on what the user (VM) tries to map.
> > > You could expose a vIOMMU with a 64bit address width, but the moment
> > > you try to perform a DMA mapping with IOVA beyond bit 39 (if that's the
> > > host IOMMU address width), the ioctl will fail and the VM will abort.
> > > IOW, you can claim whatever vIOMMU address width you want, but if you
> > > > lay out guest memory or devices in such a way that actually requires IOVA
> > > mapping beyond the host capabilities, you're going to abort.  Likewise,
> > > without a vIOMMU if the guest memory layout is sufficiently sparse to
> > > require such IOVAs, you're going to abort.  Thanks,  
> > 
> > Thanks for the explanation. I got the point.
> > 
> > > However, should we allow guest behavior to affect the hypervisor? In this
> > > case, if the guest maps an IOVA range beyond 39 bits (assuming the vIOMMU
> > > declares a 48-bit address width), the VM will crash. How
> > > about we shrink the vIOMMU address width to 39 bits during boot if we
> > > detect that assigned devices are configured? IMHO no matter what we
> > > do in the guest, the hypervisor should keep the guest alive from the
> > > hypervisor's POV (emulation of the guest hardware should not be stopped
> > > by guest behavior). If any operation in the guest can bring the hypervisor
> > > down, isn't it a bug?
> 
> Any case of the guest crashing the hypervisor (ie. the host) is a
> serious bug, but a guest causing its own VM to abort is an entirely
> different class, and in some cases justified.  For instance, you only
> need a guest misbehaving in the virtio protocol to generate a VM
> abort.  The cases Kevin raises make me reconsider because they are
> cases of a VM behaving properly, within the specifications of the
> hardware exposed to it, generating a VM abort, and in the case of vfio
> exposed through to a guest user, allow the VM to be susceptible to the
> actions of that user.
> 
> Of course any time we tie VM hardware to a host constraint, we're
> asking for trouble.  Your example of shrinking the vIOMMU address
> width to 39bits on boot highlights that.  Clearly cold plug devices are
> only one scenario, what about hotplug devices?  We cannot dynamically
> change the vIOMMU address width.  What about migration, we could start
> the VM w/o an assigned device on a 48bit capable host and migrate it to
> a 39bit host and then attempt to hot add an assigned device.  For the
> greatest compatibility, why would we ever configure the VM with a vIOMMU
> address width beyond the minimum necessary to support the potential
> populated guest physical memory?  Thanks,

For now, I feel a tunable for the address width is more essential - let's
just name it "aw-bits"; it should only be used by advanced
users. By default, we can use an address width that is safe enough, like 39
bits (I assume that most pIOMMUs support at least 39 bits).
User configuration can override it (for now, we can limit the options to
only 39/48 bits).

Then, we can temporarily live even without an interface to detect
host parameters - when a user specifies a particular width, he/she will
manage the rest (and, of course, take the risk of VM aborts).
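
To make the proposal a bit more concrete, here is a minimal sketch in C
of how such a tunable could be validated, assuming only what is said
above: a default of 39 bits and 39/48 as the only accepted values. All
names here (aw_bits_resolve, iova_range_fits, and so on) are
hypothetical and are not taken from the patch or from QEMU. The last
helper illustrates the failure case Alex describes, where a mapping
reaches beyond the width the host can handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not QEMU's actual code. */
#define AW_BITS_39      39
#define AW_BITS_48      48
#define AW_BITS_DEFAULT AW_BITS_39   /* assumed safe on most pIOMMUs */

/* Only the two widths discussed in this thread are accepted for now. */
static bool aw_bits_valid(unsigned aw_bits)
{
    return aw_bits == AW_BITS_39 || aw_bits == AW_BITS_48;
}

/*
 * Resolve the user's setting. 0 means "not specified" and selects the
 * default. Returns 0 if the requested width is not supported.
 */
static unsigned aw_bits_resolve(unsigned requested)
{
    if (requested == 0) {
        return AW_BITS_DEFAULT;
    }
    return aw_bits_valid(requested) ? requested : 0;
}

/* Highest IOVA addressable with the given (valid) address width. */
static uint64_t aw_bits_max_iova(unsigned aw_bits)
{
    assert(aw_bits_valid(aw_bits));
    return (UINT64_C(1) << aw_bits) - 1;
}

/*
 * True if [iova, iova + size) fits entirely below the address width.
 * Written to avoid overflow in iova + size.
 */
static bool iova_range_fits(uint64_t iova, uint64_t size, unsigned aw_bits)
{
    uint64_t max = aw_bits_max_iova(aw_bits);

    return size != 0 && iova <= max && size - 1 <= max - iova;
}
```

With a check like iova_range_fits(), a guest mapping at bit 39 passes
for a 48-bit vIOMMU but would still be rejected against a 39-bit host
width, which is exactly the risk the user accepts when overriding the
default.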

Thanks,

-- peterx

