From: Alex Williamson <alex.williamson@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
qemu-devel@nongnu.org, Cornelia Huck <cohuck@redhat.com>,
kvm@vger.kernel.org, david@redhat.com
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Balloon inhibit enhancements, vfio restriction
Date: Wed, 8 Aug 2018 16:23:04 -0600
Message-ID: <20180808162304.42d3fecc@t450s.home>
In-Reply-To: <20180808034543.GC24415@xz-mi>

On Wed, 8 Aug 2018 11:45:43 +0800
Peter Xu <peterx@redhat.com> wrote:
> On Wed, Aug 08, 2018 at 12:58:32AM +0300, Michael S. Tsirkin wrote:
> > At least with VTD, it seems entirely possible to change e.g. a PMD
> > atomically to point to a different set of PTEs, then flush.
> > That will allow removing memory at high granularity for
> > an arbitrary device without mdev or PASID dependency.
>
> My understanding is that the guest driver should prohibit this kind of
> operation (say, modifying PMD).
There's currently no need for this sort of operation within the dma api
and the iommu api doesn't offer it either.
> Actually I don't see how it can
> happen in Linux if the kernel drivers always call the IOMMU API since
> there are only map/unmap APIs rather than this atomic-modify API.
Exactly, the vfio dma mapping api is just an extension of the iommu api
and there's only map and unmap. Furthermore, unmap can currently unmap,
and report back, more than was requested if the original mapping made
use of superpages in the iommu, so the only way to achieve page-level
granularity is to create only page-size mappings. Otherwise we're
talking about new apis across the board.
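
To make the superpage point concrete, here is a rough sketch in Python.
It is a toy model only, not the kernel's iommu or vfio code: ToyIommu,
its map/unmap methods, and the allow_superpages flag are hypothetical
names invented for illustration, and the 2MB superpage size is simply
one of the sizes mentioned in this thread.

```python
PAGE = 4096                   # base page size assumed for the sketch
SUPERPAGE = 2 * 1024 * 1024   # 2MB superpage (assumed size)


class ToyIommu:
    """Hypothetical model of a map/unmap-only interface."""

    def __init__(self):
        self.entries = {}  # iova of each leaf entry -> size of that entry

    def map(self, iova, size, allow_superpages=True):
        """Cover [iova, iova + size) with leaf entries, using a superpage
        wherever the range is aligned and large enough."""
        end = iova + size
        while iova < end:
            if (allow_superpages and iova % SUPERPAGE == 0
                    and end - iova >= SUPERPAGE):
                step = SUPERPAGE
            else:
                step = PAGE
            self.entries[iova] = step
            iova += step

    def unmap(self, iova, size):
        """Drop every leaf entry overlapping the request and return the
        number of bytes actually unmapped, which may exceed `size`."""
        unmapped = 0
        for start in sorted(self.entries):  # sorted() copies the keys
            length = self.entries[start]
            if start < iova + size and start + length > iova:
                del self.entries[start]
                unmapped += length
        return unmapped


# One 2MB mapping backed by a superpage: asking for 4KB takes out all 2MB.
big = ToyIommu()
big.map(0, SUPERPAGE)
print(big.unmap(PAGE, PAGE))    # 2097152

# Same region forced to page-size mappings: 512 entries, exact unmap.
small = ToyIommu()
small.map(0, SUPERPAGE, allow_superpages=False)
print(len(small.entries))       # 512
print(small.unmap(PAGE, PAGE))  # 4096
```

The second case is the "only page-size mappings" option: page-level
granularity is available, but only by giving up the superpage entry.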
> The thing is that IMHO it's the guest driver's responsibility to make
> sure the pages will never be used by the device before it removes the
> entry (including modifying the PMD since that actually removes all the
> entries on the old PMD). If not, I would see it a guest kernel bug
> instead of the bug in the emulation code.
This is why there is no atomic modify in the dma api: we have drivers
that directly manage the buffers for a device and know when a buffer is
in use and when it's not. There's never a need, currently, to replace
the iova mapping for a single page within a larger buffer. Maybe the
dma api could also find use for it, but it seems more specific to the
iommu api that we have a "buffer", which happens to be a contiguous RAM
region for the VM, where we do want to change the mapping of a single
page. That single page might currently be mapped by a 2MB or 1GB page
in the case of Intel, or by an arbitrary page size in the case of AMD.
vfio is the driver managing these mappings, but unlike a driver using
the dma api, we don't have any insight into the device behavior,
including in-flight dma. We can stop all dma for the device, but not
without interfering with, and potentially breaking, the behavior of the
device.
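
For a sense of scale on the page sizes above, a back-of-the-envelope
calculation in Python. The 4GB guest RAM size is an arbitrary
assumption for illustration, entries_needed is a hypothetical helper,
and real page tables also carry intermediate levels that this ignores.

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30


def entries_needed(region_bytes, page_bytes):
    """Leaf entries needed to cover an aligned region with one page size."""
    return -(-region_bytes // page_bytes)  # ceiling division


guest_ram = 4 * GB  # assumed guest size, not taken from this thread

for label, page in (("4KB", 4 * KB), ("2MB", 2 * MB), ("1GB", 1 * GB)):
    print(label, entries_needed(guest_ram, page))
# 4KB 1048576
# 2MB 2048
# 1GB 4
```

Replacing one page inside a 1GB mapping without an atomic update means
tearing down an entry that covers 262144 base pages.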
So again, I think this comes down to either new iommu driver support,
new iommu apis, and new vfio apis to enable some sort of atomic update
interface, or sacrificing performance and adding bloat by forcing
page-size mappings. Thanks,

Alex