From: David Gibson <david@gibson.dropbear.id.au>
To: Alex Williamson <alex.williamson@redhat.com>
Cc: lvivier@redhat.com, thuth@redhat.com, abologna@redhat.com,
qemu-devel@nongnu.org, qemu-ppc@nongnu.org, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH 3/7] vfio: Check guest IOVA ranges against host IOMMU capabilities
Date: Fri, 25 Sep 2015 15:20:59 +1000 [thread overview]
Message-ID: <20150925052059.GB11620@voom.redhat.com> (raw)
In-Reply-To: <1443115921.23936.567.camel@redhat.com>
On Thu, Sep 24, 2015 at 11:32:01AM -0600, Alex Williamson wrote:
> On Thu, 2015-09-24 at 14:33 +1000, David Gibson wrote:
> > The current vfio core code assumes that the host IOMMU is capable of
> > mapping any IOVA the guest wants to use to where we need. However, real
> > IOMMUs generally only support translating a certain range of IOVAs (the
> > "DMA window") not a full 64-bit address space.
> >
> > The common x86 IOMMUs support a wide enough range that guests are very
> > unlikely to go beyond it in practice; however, the IOMMU used on IBM Power
> > machines - in the default configuration - supports only a much more limited
> > IOVA range, usually 0..2GiB.
> >
> > If the guest attempts to set up an IOVA range that the host IOMMU can't
> > map, qemu won't report an error until it actually attempts to map a bad
> > IOVA. If guest RAM is being mapped directly into the IOMMU (i.e. no guest
> > visible IOMMU) then this will show up very quickly. If there is a guest
> > visible IOMMU, however, the problem might not show up until much later when
> > the guest actually attempts to DMA with an IOVA the host can't handle.
> >
> > This patch adds a test so that we will detect earlier if the guest is
> > attempting to use IOVA ranges that the host IOMMU won't be able to deal
> > with.
> >
> > For now, we assume that "Type1" (x86) IOMMUs can support any IOVA. This is
> > incorrect, but no worse than what we have already. We can't do better for
> > now because the Type1 kernel interface doesn't tell us what IOVA range the
> > IOMMU actually supports.
> >
> > For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported
> > IOVA range and validate guest IOVA ranges against it, and this patch does
> > so.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > Reviewed-by: Laurent Vivier <lvivier@redhat.com>
> > ---
> > hw/vfio/common.c | 40 +++++++++++++++++++++++++++++++++++++---
> > include/hw/vfio/vfio-common.h | 6 ++++++
> > 2 files changed, 43 insertions(+), 3 deletions(-)
> >
> > diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> > index 95a4850..f90cc75 100644
> > --- a/hw/vfio/common.c
> > +++ b/hw/vfio/common.c
> > @@ -343,14 +343,22 @@ static void vfio_listener_region_add(MemoryListener *listener,
> > if (int128_ge(int128_make64(iova), llend)) {
> > return;
> > }
> > + end = int128_get64(llend);
> > +
> > + if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) {
> > + error_report("vfio: IOMMU container %p can't map guest IOVA region"
> > + " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx,
> > + container, iova, end - 1);
> > + ret = -EFAULT; /* FIXME: better choice here? */
>
> "Bad address" makes sense to me. This looks like an RFC comment, can we
> remove it?
Ok.
>
> > + goto fail;
> > + }
> >
> > memory_region_ref(section->mr);
> >
> > if (memory_region_is_iommu(section->mr)) {
> > VFIOGuestIOMMU *giommu;
> >
> > - trace_vfio_listener_region_add_iommu(iova,
> > - int128_get64(int128_sub(llend, int128_one())));
> > + trace_vfio_listener_region_add_iommu(iova, end - 1);
> > /*
> > * FIXME: We should do some checking to see if the
> > * capabilities of the host VFIO IOMMU are adequate to model
> > @@ -387,7 +395,6 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >
> > /* Here we assume that memory_region_is_ram(section->mr)==true */
> >
> > - end = int128_get64(llend);
> > vaddr = memory_region_get_ram_ptr(section->mr) +
> > section->offset_within_region +
> > (iova - section->offset_within_address_space);
> > @@ -685,7 +692,19 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
> > ret = -errno;
> > goto free_container_exit;
> > }
> > +
> > + /*
> > + * FIXME: This assumes that a Type1 IOMMU can map any 64-bit
> > + * IOVA whatsoever. That's not actually true, but the current
> > + * kernel interface doesn't tell us what it can map, and the
> > + * existing Type1 IOMMUs generally support any IOVA we're
> > + * going to actually try in practice.
> > + */
> > + container->min_iova = 0;
> > + container->max_iova = (hwaddr)-1;
> > } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) {
> > + struct vfio_iommu_spapr_tce_info info;
> > +
> > ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
> > if (ret) {
> > error_report("vfio: failed to set group container: %m");
> > @@ -710,6 +729,21 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
> > ret = -errno;
> > goto free_container_exit;
> > }
> > +
> > + /*
> > + * FIXME: This only considers the host IOMMU' 32-bit window.
>
> IOMMU's?
Yes.
> > + * At some point we need to add support for the optional
> > + * 64-bit window and dynamic windows
> > + */
> > + info.argsz = sizeof(info);
> > + ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info);
> > + if (ret) {
> > + error_report("vfio: VFIO_IOMMU_SPAPR_TCE_GET_INFO failed: %m");
> > + ret = -errno;
> > + goto free_container_exit;
> > + }
> > + container->min_iova = info.dma32_window_start;
> > + container->max_iova = container->min_iova + info.dma32_window_size - 1;
> > } else {
> > error_report("vfio: No available IOMMU models");
> > ret = -EINVAL;
> > diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h
> > index fbbe6de..859dbec 100644
> > --- a/include/hw/vfio/vfio-common.h
> > +++ b/include/hw/vfio/vfio-common.h
> > @@ -65,6 +65,12 @@ typedef struct VFIOContainer {
> > MemoryListener listener;
> > int error;
> > bool initialized;
> > + /*
> > + * FIXME: This assumes the host IOMMU can support only a
> > + * single contiguous IOVA window. We may need to generalize
> > + * that in future
> > + */
>
> There sure are a lot of FIXMEs here. This just seems to be an
> implementation note. I certainly encourage comments, but they don't all
> need to start with FIXME unless it's something we really should fix.
> "... may need to generalize..." does not sound like such a case.
True enough. I'll file off some of the "FIXME"s.
>
> > + hwaddr min_iova, max_iova;
> > QLIST_HEAD(, VFIOGuestIOMMU) giommu_list;
> > QLIST_HEAD(, VFIOGroup) group_list;
> > QLIST_ENTRY(VFIOContainer) next;
>
>
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson