qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [0/5] RFC: Preparations for supporting VFIO with guest IOMMUs
@ 2013-04-24 12:01 David Gibson
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 1/5] pci: Simpler implementation of PCI_COMMAND_MASTER bit David Gibson
                   ` (4 more replies)
  0 siblings, 5 replies; 19+ messages in thread
From: David Gibson @ 2013-04-24 12:01 UTC (permalink / raw)
  To: alex.williamson, aik; +Cc: qemu-devel

This patch series represents my attempt at better integration of the
vfio code with qemu's DMAContext and other guest IOMMU handling code,
as discussed by Alex Williamson and myself on IRC.  It opens the way
for using VFIO with a guest system containing an IOMMU by passing
guest IOMMU operations through to the host IOMMU via VFIO.  That's
opposed to the present model of having no IOMMU in the guest, and
simply mapping all guest RAM into the host IOMMU.

I hit some complications with some changes in the guest iommu
infrastructure since it went in.  Patches 1 & 2 clean that up, patches
3-5 do the necessary VFIO infrastructure changes.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH 1/5] pci: Simpler implementation of PCI_COMMAND_MASTER bit
  2013-04-24 12:01 [Qemu-devel] [0/5] RFC: Preparations for supporting VFIO with guest IOMMUs David Gibson
@ 2013-04-24 12:01 ` David Gibson
  2013-04-24 12:36   ` Paolo Bonzini
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 2/5] pci: Don't create an address space object for every PCI device David Gibson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: David Gibson @ 2013-04-24 12:01 UTC (permalink / raw)
  To: alex.williamson, aik; +Cc: qemu-devel, David Gibson

In commit 1c380f9460522f32c8dd2577b2a53d518ec91c6d "pci: honor
PCI_COMMAND_MASTER" the PCI_COMMAND_MASTER bit of the PCI command register
was implemented by toggling the enable bit on a memory region alias
interposed between the PCI device's dma address space and the main
system memory region.

Introducing an extra alias region for every PCI device just to implement
that bit seems like serious overkill.  Furthermore, it doesn't work when
there's a (guest side) iommu present, since that uses a different path for
constructing the PCI device's dma address space.

This patch removes the aliased window, instead implementing
PCI_COMMAND_MASTER with tests in the PCI DMA functions.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/pci/pci.c         |   14 +-------------
 include/hw/pci/pci.h |   19 ++++++++++++++-----
 2 files changed, 15 insertions(+), 18 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 051da67..2fdd4b2 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -425,10 +425,6 @@ static int get_pci_config_device(QEMUFile *f, void *pv, size_t size)
 
     pci_update_mappings(s);
 
-    memory_region_set_enabled(&s->bus_master_enable_region,
-                              pci_get_word(s->config + PCI_COMMAND)
-                              & PCI_COMMAND_MASTER);
-
     g_free(config);
     return 0;
 }
@@ -850,11 +846,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
         /* FIXME: Make dma_context_fn use MemoryRegions instead, so this path is
          * taken unconditionally */
         /* FIXME: inherit memory region from bus creator */
-        memory_region_init_alias(&pci_dev->bus_master_enable_region, "bus master",
-                                 get_system_memory(), 0,
-                                 memory_region_size(get_system_memory()));
-        memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
-        address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region);
+        address_space_init(&pci_dev->bus_master_as, get_system_memory());
         pci_dev->dma = g_new(DMAContext, 1);
         dma_context_init(pci_dev->dma, &pci_dev->bus_master_as, NULL, NULL, NULL);
     }
@@ -913,7 +905,6 @@ static void do_pci_unregister_device(PCIDevice *pci_dev)
 
     if (!pci_dev->bus->dma_context_fn) {
         address_space_destroy(&pci_dev->bus_master_as);
-        memory_region_destroy(&pci_dev->bus_master_enable_region);
         g_free(pci_dev->dma);
         pci_dev->dma = NULL;
     }
@@ -1201,9 +1192,6 @@ void pci_default_write_config(PCIDevice *d, uint32_t addr, uint32_t val, int l)
 
     if (range_covers_byte(addr, l, PCI_COMMAND)) {
         pci_update_irq_disabled(d, was_irq_disabled);
-        memory_region_set_enabled(&d->bus_master_enable_region,
-                                  pci_get_word(d->config + PCI_COMMAND)
-                                    & PCI_COMMAND_MASTER);
     }
 
     msi_write_config(d, addr, val, l);
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 7e5986a..8f682cc 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -241,7 +241,6 @@ struct PCIDevice {
     char name[64];
     PCIIORegion io_regions[PCI_NUM_REGIONS];
     AddressSpace bus_master_as;
-    MemoryRegion bus_master_enable_region;
     DMAContext *dma;
 
     /* do not access the following fields */
@@ -648,8 +647,12 @@ static inline DMAContext *pci_dma_context(PCIDevice *dev)
 static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
                              void *buf, dma_addr_t len, DMADirection dir)
 {
-    dma_memory_rw(pci_dma_context(dev), addr, buf, len, dir);
-    return 0;
+    if (pci_get_word(dev->config + PCI_COMMAND) & PCI_COMMAND_MASTER) {
+        dma_memory_rw(pci_dma_context(dev), addr, buf, len, dir);
+        return 0;
+    } else {
+        return -EPERM;
+    }
 }
 
 static inline int pci_dma_read(PCIDevice *dev, dma_addr_t addr,
@@ -668,12 +671,18 @@ static inline int pci_dma_write(PCIDevice *dev, dma_addr_t addr,
     static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev,      \
                                                    dma_addr_t addr)     \
     {                                                                   \
-        return ld##_l##_dma(pci_dma_context(dev), addr);                \
+        if (pci_get_word(dev->config + PCI_COMMAND) & PCI_COMMAND_MASTER) { \
+            return ld##_l##_dma(pci_dma_context(dev), addr);            \
+        } else {                                                        \
+            return -1;                                                  \
+        }                                                               \
     }                                                                   \
     static inline void st##_s##_pci_dma(PCIDevice *dev,                 \
                                         dma_addr_t addr, uint##_bits##_t val) \
     {                                                                   \
-        st##_s##_dma(pci_dma_context(dev), addr, val);                  \
+        if (pci_get_word(dev->config + PCI_COMMAND) & PCI_COMMAND_MASTER) { \
+            st##_s##_dma(pci_dma_context(dev), addr, val);              \
+        }                                                               \
     }
 
 PCI_DMA_DEFINE_LDST(ub, b, 8);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH 2/5] pci: Don't create an address space object for every PCI device
  2013-04-24 12:01 [Qemu-devel] [0/5] RFC: Preparations for supporting VFIO with guest IOMMUs David Gibson
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 1/5] pci: Simpler implementation of PCI_COMMAND_MASTER bit David Gibson
@ 2013-04-24 12:01 ` David Gibson
  2013-04-24 12:34   ` Paolo Bonzini
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 3/5] vfio: Associate VFIO groups with DMAContexts David Gibson
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 19+ messages in thread
From: David Gibson @ 2013-04-24 12:01 UTC (permalink / raw)
  To: alex.williamson, aik; +Cc: qemu-devel, David Gibson

Commit 817dcc5368988b023c5e1d3f1444fd370c77c6a9 "pci: give each device its
own address space" did exactly what the comment suggests, it creates an
AddressSpace object for every PCI device to represent the address space
it uses for DMA.  All the objects so constructed are basically identical
to address_space_memory.

While its true that PCI devices can have different effective DMA address
spaces, they usually don't.  Further, the PCI structure already had a way
to represent its DMA address space, through its DMAContext pointer.  The
way those are assigned through an optional callback in the bus also
addresses the "FIXME: inherit memory region from bus creator".

So while its true that the DMAContext handling needs to be better
integrated with the MemoryRegion and AddressSpace handling, that commit
wasn't actually a step in the right direction for it.  Since then, the
DMAContext has been extended so it can backend onto an AddressSpace, and
thereby, a MemoryRegion.  Effectively a DMAContext is now an AddressSpace
with iommu translation handling on top.

Therefore, this patch essentially reverts the earlier commit, making all
PCI devices by default shared the global dma_context_memory which backs
onto main system memory.  Those cases which need to set up different DMA
address spaces for each PCI device should supply a suitable dma_context_fn
in the bus to correctly assign / create suitable DMAContext structures.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/pci/pci.c         |   14 ++------------
 include/hw/pci/pci.h |    1 -
 2 files changed, 2 insertions(+), 13 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2fdd4b2..038be92 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -843,12 +843,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
     if (bus->dma_context_fn) {
         pci_dev->dma = bus->dma_context_fn(bus, bus->dma_context_opaque, devfn);
     } else {
-        /* FIXME: Make dma_context_fn use MemoryRegions instead, so this path is
-         * taken unconditionally */
-        /* FIXME: inherit memory region from bus creator */
-        address_space_init(&pci_dev->bus_master_as, get_system_memory());
-        pci_dev->dma = g_new(DMAContext, 1);
-        dma_context_init(pci_dev->dma, &pci_dev->bus_master_as, NULL, NULL, NULL);
+        pci_dev->dma = &dma_context_memory;
     }
     pci_dev->devfn = devfn;
     pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
@@ -902,12 +897,7 @@ static void do_pci_unregister_device(PCIDevice *pci_dev)
     qemu_free_irqs(pci_dev->irq);
     pci_dev->bus->devices[pci_dev->devfn] = NULL;
     pci_config_free(pci_dev);
-
-    if (!pci_dev->bus->dma_context_fn) {
-        address_space_destroy(&pci_dev->bus_master_as);
-        g_free(pci_dev->dma);
-        pci_dev->dma = NULL;
-    }
+    pci_dev->dma = NULL;
 }
 
 static void pci_unregister_io_regions(PCIDevice *pci_dev)
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 8f682cc..206da1f 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -240,7 +240,6 @@ struct PCIDevice {
     int32_t devfn;
     char name[64];
     PCIIORegion io_regions[PCI_NUM_REGIONS];
-    AddressSpace bus_master_as;
     DMAContext *dma;
 
     /* do not access the following fields */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH 3/5] vfio: Associate VFIO groups with DMAContexts
  2013-04-24 12:01 [Qemu-devel] [0/5] RFC: Preparations for supporting VFIO with guest IOMMUs David Gibson
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 1/5] pci: Simpler implementation of PCI_COMMAND_MASTER bit David Gibson
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 2/5] pci: Don't create an address space object for every PCI device David Gibson
@ 2013-04-24 12:01 ` David Gibson
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext David Gibson
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 5/5] vfio: Only use memory listeners when appropriate David Gibson
  4 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2013-04-24 12:01 UTC (permalink / raw)
  To: alex.williamson, aik; +Cc: qemu-devel, David Gibson

The only model so far supported for VFIO passthrough devices is the model
usually used on x86, where all of the guest's RAM is mapped into the
(host) IOMMU and there is no IOMMU visible in the guest.  Later, however
we want to also support guest visible IOMMUs.

In order to do that the vfio subsystem needs to know which address space
its devices are supposed to be in.  In other words the PCI device's
DMAContext needs to be passed through to vfio.  This patch updates the
internal interfaces to do that.  So far it doesn't do much with it, except
to verify/enforce that a group is never added to multiple contexts.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/misc/vfio.c |   38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 693a9ff..f77a599 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -116,6 +116,7 @@ enum {
 struct VFIOGroup;
 
 typedef struct VFIOContainer {
+    DMAContext *dma;
     int fd; /* /dev/vfio/vfio, empowered by the attached groups */
     struct {
         /* enable abstraction to support various iommu backends */
@@ -2612,13 +2613,19 @@ static int vfio_load_rom(VFIODevice *vdev)
     return 0;
 }
 
-static int vfio_connect_container(VFIOGroup *group)
+static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
 {
     VFIOContainer *container;
     int ret, fd;
 
     if (group->container) {
-        return 0;
+        if (group->container->dma == dma) {
+            return 0;
+        } else {
+            error_report("vfio: group %d used in multiple DMA contexts",
+                         group->groupid);
+            return -EBUSY;
+        }
     }
 
     QLIST_FOREACH(container, &container_list, next) {
@@ -2644,6 +2651,7 @@ static int vfio_connect_container(VFIOGroup *group)
     }
 
     container = g_malloc0(sizeof(*container));
+    container->dma = dma;
     container->fd = fd;
 
     if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
@@ -2683,12 +2691,12 @@ static int vfio_connect_container(VFIOGroup *group)
     return 0;
 }
 
-static void vfio_disconnect_container(VFIOGroup *group)
+static void vfio_disconnect_context(VFIOGroup *group)
 {
     VFIOContainer *container = group->container;
 
     if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) {
-        error_report("vfio: error disconnecting group %d from container",
+        error_report("vfio: error disconnecting group %d from context",
                      group->groupid);
     }
 
@@ -2700,13 +2708,13 @@ static void vfio_disconnect_container(VFIOGroup *group)
             container->iommu_data.release(container);
         }
         QLIST_REMOVE(container, next);
-        DPRINTF("vfio_disconnect_container: close container->fd\n");
+        DPRINTF("vfio_disconnect_context: close container->fd\n");
         close(container->fd);
         g_free(container);
     }
 }
 
-static VFIOGroup *vfio_get_group(int groupid)
+static VFIOGroup *vfio_get_group(int groupid, DMAContext *dma)
 {
     VFIOGroup *group;
     char path[32];
@@ -2714,7 +2722,15 @@ static VFIOGroup *vfio_get_group(int groupid)
 
     QLIST_FOREACH(group, &group_list, next) {
         if (group->groupid == groupid) {
-            return group;
+            /* Found it.  Now is it already in the right context? */
+            assert(group->container);
+            if (group->container->dma == dma) {
+                return group;
+            } else {
+                error_report("vfio: group %d used in multiple DMA contexts",
+                             group->groupid);
+                return NULL;
+            }
         }
     }
 
@@ -2747,8 +2763,8 @@ static VFIOGroup *vfio_get_group(int groupid)
     group->groupid = groupid;
     QLIST_INIT(&group->device_list);
 
-    if (vfio_connect_container(group)) {
-        error_report("vfio: failed to setup container for group %d", groupid);
+    if (vfio_connect_context(group, dma)) {
+        error_report("vfio: failed to setup context for group %d", groupid);
         close(group->fd);
         g_free(group);
         return NULL;
@@ -2765,7 +2781,7 @@ static void vfio_put_group(VFIOGroup *group)
         return;
     }
 
-    vfio_disconnect_container(group);
+    vfio_disconnect_context(group);
     QLIST_REMOVE(group, next);
     DPRINTF("vfio_put_group: close group->fd\n");
     close(group->fd);
@@ -2980,7 +2996,7 @@ static int vfio_initfn(PCIDevice *pdev)
     DPRINTF("%s(%04x:%02x:%02x.%x) group %d\n", __func__, vdev->host.domain,
             vdev->host.bus, vdev->host.slot, vdev->host.function, groupid);
 
-    group = vfio_get_group(groupid);
+    group = vfio_get_group(groupid, pdev->dma);
     if (!group) {
         error_report("vfio: failed to get group %d", groupid);
         return -ENOENT;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-24 12:01 [Qemu-devel] [0/5] RFC: Preparations for supporting VFIO with guest IOMMUs David Gibson
                   ` (2 preceding siblings ...)
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 3/5] vfio: Associate VFIO groups with DMAContexts David Gibson
@ 2013-04-24 12:01 ` David Gibson
  2013-04-24 15:12   ` Alex Williamson
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 5/5] vfio: Only use memory listeners when appropriate David Gibson
  4 siblings, 1 reply; 19+ messages in thread
From: David Gibson @ 2013-04-24 12:01 UTC (permalink / raw)
  To: alex.williamson, aik; +Cc: qemu-devel, David Gibson

At the moment, vfio maintains a global list of containers that are assumed
to be more or less interchangeable, since they are all set up with a
MemoryListener to have all of system memory mapped.  However, that only
makes sense if all the containers are used on devices which really do
expect a dma address space identical to system memory.

This patch moves towards that by making the list of containers per
DMAContext (which corresponds to a dma address space) instead of global.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 dma-helpers.c          |    2 ++
 hw/misc/vfio.c         |   13 ++++++++-----
 include/hw/misc/vfio.h |   28 ++++++++++++++++++++++++++++
 include/sysemu/dma.h   |    2 ++
 stubs/Makefile.objs    |    1 +
 stubs/vfio.c           |    6 ++++++
 6 files changed, 47 insertions(+), 5 deletions(-)
 create mode 100644 include/hw/misc/vfio.h
 create mode 100644 stubs/vfio.c

diff --git a/dma-helpers.c b/dma-helpers.c
index 272632f..f0c7866 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -11,6 +11,7 @@
 #include "trace.h"
 #include "qemu/range.h"
 #include "qemu/thread.h"
+#include "hw/misc/vfio.h"
 
 /* #define DEBUG_IOMMU */
 
@@ -386,6 +387,7 @@ void dma_context_init(DMAContext *dma, AddressSpace *as, DMATranslateFunc transl
     dma->translate = translate;
     dma->map = map;
     dma->unmap = unmap;
+    dma_context_init_vfio(dma);
 }
 
 void *iommu_dma_memory_map(DMAContext *dma, dma_addr_t addr, dma_addr_t *len,
diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index f77a599..ab870a8 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -39,6 +39,7 @@
 #include "qemu/range.h"
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
+#include "hw/misc/vfio.h"
 
 /* #define DEBUG_VFIO */
 #ifdef DEBUG_VFIO
@@ -179,9 +180,6 @@ typedef struct VFIOGroup {
 
 #define MSIX_CAP_LENGTH 12
 
-static QLIST_HEAD(, VFIOContainer)
-    container_list = QLIST_HEAD_INITIALIZER(container_list);
-
 static QLIST_HEAD(, VFIOGroup)
     group_list = QLIST_HEAD_INITIALIZER(group_list);
 
@@ -2613,6 +2611,11 @@ static int vfio_load_rom(VFIODevice *vdev)
     return 0;
 }
 
+void dma_context_init_vfio(DMAContext *dma)
+{
+    QLIST_INIT(&dma->vfio.containers);
+}
+
 static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
 {
     VFIOContainer *container;
@@ -2628,7 +2631,7 @@ static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
         }
     }
 
-    QLIST_FOREACH(container, &container_list, next) {
+    QLIST_FOREACH(container, &dma->vfio.containers, next) {
         if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
             group->container = container;
             QLIST_INSERT_HEAD(&container->group_list, group, container_next);
@@ -2683,7 +2686,7 @@ static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
     }
 
     QLIST_INIT(&container->group_list);
-    QLIST_INSERT_HEAD(&container_list, container, next);
+    QLIST_INSERT_HEAD(&dma->vfio.containers, container, next);
 
     group->container = container;
     QLIST_INSERT_HEAD(&container->group_list, group, container_next);
diff --git a/include/hw/misc/vfio.h b/include/hw/misc/vfio.h
new file mode 100644
index 0000000..18fe144
--- /dev/null
+++ b/include/hw/misc/vfio.h
@@ -0,0 +1,28 @@
+/*
+ * vfio based device assignment
+ *
+ * Copyright 2013 David Gibson, IBM Corporation.
+ * Copyright Red Hat, Inc. 2012
+ *
+ * This work is licensed under the terms of the GNU GPL, version
+ * 2. See the COPYING file in the top-level directory.
+ */
+#ifndef QEMU_VFIO_H
+#define QEMU_VFIO_H
+
+#include "qemu/queue.h"
+
+typedef struct DMAContext DMAContext;
+struct DMAContext;
+
+typedef struct VFIOContainer VFIOContainer;
+struct VFIOContainer;
+
+typedef struct DMAContextVFIO DMAContextVFIO;
+struct DMAContextVFIO {
+    QLIST_HEAD(, VFIOContainer) containers;
+};
+
+void dma_context_init_vfio(DMAContext *dma);
+
+#endif /* QEMU_VFIO_H */
diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h
index a52c93a..8692d0a 100644
--- a/include/sysemu/dma.h
+++ b/include/sysemu/dma.h
@@ -15,6 +15,7 @@
 #include "hw/hw.h"
 #include "block/block.h"
 #include "sysemu/kvm.h"
+#include "hw/misc/vfio.h"
 
 typedef struct DMAContext DMAContext;
 typedef struct ScatterGatherEntry ScatterGatherEntry;
@@ -66,6 +67,7 @@ struct DMAContext {
     DMATranslateFunc *translate;
     DMAMapFunc *map;
     DMAUnmapFunc *unmap;
+    DMAContextVFIO vfio;
 };
 
 /* A global DMA context corresponding to the address_space_memory
diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
index 9c55b34..858ca6b 100644
--- a/stubs/Makefile.objs
+++ b/stubs/Makefile.objs
@@ -20,6 +20,7 @@ stub-obj-y += reset.o
 stub-obj-y += set-fd-handler.o
 stub-obj-y += slirp.o
 stub-obj-y += sysbus.o
+stub-obj-y += vfio.o
 stub-obj-y += vm-stop.o
 stub-obj-y += vmstate.o
 stub-obj-$(CONFIG_WIN32) += fd-register.o
diff --git a/stubs/vfio.c b/stubs/vfio.c
new file mode 100644
index 0000000..6fe4a84
--- /dev/null
+++ b/stubs/vfio.c
@@ -0,0 +1,6 @@
+#include "hw/misc/vfio.h"
+#include "sysemu/dma.h"
+
+void dma_context_init_vfio(DMAContext *dma)
+{
+}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [Qemu-devel] [PATCH 5/5] vfio: Only use memory listeners when appropriate
  2013-04-24 12:01 [Qemu-devel] [0/5] RFC: Preparations for supporting VFIO with guest IOMMUs David Gibson
                   ` (3 preceding siblings ...)
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext David Gibson
@ 2013-04-24 12:01 ` David Gibson
  4 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2013-04-24 12:01 UTC (permalink / raw)
  To: alex.williamson, aik; +Cc: qemu-devel, David Gibson

Currently, vfio registers a MemoryListener for every vfio container we
create, to keep the container's mappings in sync with main system memory.
That's only correct though, if the context the container is attached to
represents a dma address space which actually matches main system memory -
roughly speaking that means that there is no guest side IOMMU above the
vfio device in question.

This patch corrects the code, by only registering the MemoryListener when
the container belongs to a DMAContext which does not include an IOMMU (i.e.
which has no ->translate function).  In other cases we given an error; that
will change when vfio support for guest side IOMMUs is added.

In addition, this generalizes the code slightly, by attaching the
MemoryListener to the DMAContext's underlying AddressSpace, rather than
just assuming that it is main system memory.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/misc/vfio.c |   79 +++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 46 insertions(+), 33 deletions(-)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index ab870a8..dce6189 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -2616,28 +2616,11 @@ void dma_context_init_vfio(DMAContext *dma)
     QLIST_INIT(&dma->vfio.containers);
 }
 
-static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
+static int vfio_connect_container_address_space(VFIOGroup *group,
+                                                DMAContext *dma)
 {
     VFIOContainer *container;
-    int ret, fd;
-
-    if (group->container) {
-        if (group->container->dma == dma) {
-            return 0;
-        } else {
-            error_report("vfio: group %d used in multiple DMA contexts",
-                         group->groupid);
-            return -EBUSY;
-        }
-    }
-
-    QLIST_FOREACH(container, &dma->vfio.containers, next) {
-        if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
-            group->container = container;
-            QLIST_INSERT_HEAD(&container->group_list, group, container_next);
-            return 0;
-        }
-    }
+    int fd, ret;
 
     fd = qemu_open("/dev/vfio/vfio", O_RDWR);
     if (fd < 0) {
@@ -2657,15 +2640,15 @@ static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
     container->dma = dma;
     container->fd = fd;
 
-    if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
-        ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
-        if (ret) {
-            error_report("vfio: failed to set group container: %m");
-            g_free(container);
-            close(fd);
-            return -errno;
-        }
+    ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
+    if (ret) {
+        error_report("vfio: failed to set group container: %m");
+        g_free(container);
+        close(fd);
+        return -errno;
+    }
 
+    if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_TYPE1_IOMMU)) {
         ret = ioctl(fd, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);
         if (ret) {
             error_report("vfio: failed to set iommu for container: %m");
@@ -2673,11 +2656,6 @@ static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
             close(fd);
             return -errno;
         }
-
-        container->iommu_data.listener = vfio_memory_listener;
-        container->iommu_data.release = vfio_listener_release;
-
-        memory_listener_register(&container->iommu_data.listener, &address_space_memory);
     } else {
         error_report("vfio: No available IOMMU models");
         g_free(container);
@@ -2685,6 +2663,11 @@ static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
         return -EINVAL;
     }
 
+    container->iommu_data.listener = vfio_memory_listener;
+    container->iommu_data.release = vfio_listener_release;
+
+    memory_listener_register(&container->iommu_data.listener, dma->as);
+
     QLIST_INIT(&container->group_list);
     QLIST_INSERT_HEAD(&dma->vfio.containers, container, next);
 
@@ -2694,6 +2677,36 @@ static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
     return 0;
 }
 
+static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
+{
+    VFIOContainer *container;
+
+    if (group->container) {
+        if (group->container->dma == dma) {
+            return 0;
+        } else {
+            error_report("vfio: group %d used in multiple DMA contexts",
+                         group->groupid);
+            return -EBUSY;
+        }
+    }
+
+    QLIST_FOREACH(container, &dma->vfio.containers, next) {
+        if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
+            group->container = container;
+            QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+            return 0;
+        }
+    }
+
+    if (!dma->translate) {
+        return vfio_connect_container_address_space(group, dma);
+    }
+
+    error_report("vfio: no support for guest side IOMMU");
+    return -ENODEV;
+}
+
 static void vfio_disconnect_context(VFIOGroup *group)
 {
     VFIOContainer *container = group->container;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] pci: Don't create an address space object for every PCI device
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 2/5] pci: Don't create an address space object for every PCI device David Gibson
@ 2013-04-24 12:34   ` Paolo Bonzini
  2013-04-24 13:07     ` David Gibson
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-04-24 12:34 UTC (permalink / raw)
  To: David Gibson; +Cc: aik, alex.williamson, qemu-devel

Il 24/04/2013 14:01, David Gibson ha scritto:
> So while its true that the DMAContext handling needs to be better
> integrated with the MemoryRegion and AddressSpace handling, that commit
> wasn't actually a step in the right direction for it.  Since then, the
> DMAContext has been extended so it can backend onto an AddressSpace, and
> thereby, a MemoryRegion.  Effectively a DMAContext is now an AddressSpace
> with iommu translation handling on top.
> 
> Therefore, this patch essentially reverts the earlier commit, making all
> PCI devices by default shared the global dma_context_memory which backs
> onto main system memory.  Those cases which need to set up different DMA
> address spaces for each PCI device should supply a suitable dma_context_fn
> in the bus to correctly assign / create suitable DMAContext structures.

I think this will be handled correctly when I submit IOMMU AddressSpace
patches (next week or so).  The structure will be

        PCI device 1                           PCI device 2
    ---------------------------------------------------------------
       AddressSpace 1                          AddressSpace 2
           |                                       |
           | (enable/disable)                      | (enable/disable)
           '-------------------.  .----------------'
                               v  v
                         IOMMU AddressSpace
                                |
                                | (translation)
                                v
                           system memory

VFIO will be able to access the IOMMU AddressSpace simply via
pci_dev->iommu, and that field will be ==-identical  for different PCI
devices.

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] pci: Simpler implementation of PCI_COMMAND_MASTER bit
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 1/5] pci: Simpler implementation of PCI_COMMAND_MASTER bit David Gibson
@ 2013-04-24 12:36   ` Paolo Bonzini
  2013-04-24 13:06     ` David Gibson
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-04-24 12:36 UTC (permalink / raw)
  To: David Gibson; +Cc: aik, alex.williamson, qemu-devel

Il 24/04/2013 14:01, David Gibson ha scritto:
> In commit 1c380f9460522f32c8dd2577b2a53d518ec91c6d "pci: honor
> PCI_COMMAND_MASTER" the PCI_COMMAND_MASTER bit of the PCI command register
> was implemented by toggling the enable bit on a memory region alias
> interposed between the PCI device's dma address space and the main
> system memory region.
> 
> Introducing an extra alias region for every PCI device just to implement
> that bit seems like serious overkill.  Furthermore, it doesn't work when
> there's a (guest side) iommu present, since that uses a different path for
> constructing the PCI device's dma address space.
> 
> This patch removes the aliased window, instead implementing
> PCI_COMMAND_MASTER with tests in the PCI DMA functions.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

This doesn't work.

You have no guarantee that PCI devices use the PCI DMA functions.  The
device could just pass the DMAContext to another function, and indeed
the OHCI controller does exactly that.

This will be even simpler after IOMMU/DMAContext are also unified in the
AddressSpace framework.

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 1/5] pci: Simpler implementation of PCI_COMMAND_MASTER bit
  2013-04-24 12:36   ` Paolo Bonzini
@ 2013-04-24 13:06     ` David Gibson
  0 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2013-04-24 13:06 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: aik, alex.williamson, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1625 bytes --]

On Wed, Apr 24, 2013 at 02:36:46PM +0200, Paolo Bonzini wrote:
> Il 24/04/2013 14:01, David Gibson ha scritto:
> > In commit 1c380f9460522f32c8dd2577b2a53d518ec91c6d "pci: honor
> > PCI_COMMAND_MASTER" the PCI_COMMAND_MASTER bit of the PCI command register
> > was implemented by toggling the enable bit on a memory region alias
> > interposed between the PCI device's dma address space and the main
> > system memory region.
> > 
> > Introducing an extra alias region for every PCI device just to implement
> > that bit seems like serious overkill.  Furthermore, it doesn't work when
> > there's a (guest side) iommu present, since that uses a different path for
> > constructing the PCI device's dma address space.
> > 
> > This patch removes the aliased window, instead implementing
> > PCI_COMMAND_MASTER with tests in the PCI DMA functions.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> This doesn't work.

Well.. say rather that it fails to work in a different set of
circumstances from those in which the current scheme fails to work.

> You have no guarantee that PCI devices use the PCI DMA functions.  The
> device could just pass the DMAContext to another function, and indeed
> the OHCI controller does exactly that.

Ah, good point.  Drat.

> This will be even simpler after IOMMU/DMAContext are also unified in the
> AddressSpace framework.
> 
> Paolo
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 2/5] pci: Don't create an address space object for every PCI device
  2013-04-24 12:34   ` Paolo Bonzini
@ 2013-04-24 13:07     ` David Gibson
  0 siblings, 0 replies; 19+ messages in thread
From: David Gibson @ 2013-04-24 13:07 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: aik, alex.williamson, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 2220 bytes --]

On Wed, Apr 24, 2013 at 02:34:21PM +0200, Paolo Bonzini wrote:
> Il 24/04/2013 14:01, David Gibson ha scritto:
> > So while its true that the DMAContext handling needs to be better
> > integrated with the MemoryRegion and AddressSpace handling, that commit
> > wasn't actually a step in the right direction for it.  Since then, the
> > DMAContext has been extended so it can backend onto an AddressSpace, and
> > thereby, a MemoryRegion.  Effectively a DMAContext is now an AddressSpace
> > with iommu translation handling on top.
> > 
> > Therefore, this patch essentially reverts the earlier commit, making all
> > PCI devices by default shared the global dma_context_memory which backs
> > onto main system memory.  Those cases which need to set up different DMA
> > address spaces for each PCI device should supply a suitable dma_context_fn
> > in the bus to correctly assign / create suitable DMAContext structures.
> 
> I think this will be handled correctly when I submit IOMMU AddressSpace
> patches (next week or so).  The structure will be
> 
>         PCI device 1                           PCI device 2
>     ---------------------------------------------------------------
>        AddressSpace 1                          AddressSpace 2
>            |                                       |
>            | (enable/disable)                      | (enable/disable)
>            '-------------------.  .----------------'
>                                v  v
>                          IOMMU AddressSpace
>                                 |
>                                 | (translation)
>                                 v
>                            system memory
> 
> VFIO will be able to access the IOMMU AddressSpace simply via
> pci_dev->iommu, and that field will be ==-identical  for different PCI
> devices.

Hrm, ok.  Still seems excessively complicated, but as long as there's
a way to get from the pci device to the common address space, I can
work with that.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-24 12:01 ` [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext David Gibson
@ 2013-04-24 15:12   ` Alex Williamson
  2013-04-24 16:33     ` Paolo Bonzini
  2013-04-25  6:35     ` David Gibson
  0 siblings, 2 replies; 19+ messages in thread
From: Alex Williamson @ 2013-04-24 15:12 UTC (permalink / raw)
  To: David Gibson; +Cc: aik, qemu-devel

On Wed, 2013-04-24 at 22:01 +1000, David Gibson wrote:
> At the moment, vfio maintains a global list of containers that are assumed
> to be more or less interchangeable, since they are all set up with a
> MemoryListener to have all of system memory mapped.  However, that only
> makes sense if all the containers are used on devices which really do
> expect a dma address space identical to system memory.
> 
> This patch moves towards that by making the list of containers per
> DMAContext (which corresponds to a dma address space) instead of global.

This seems like an unnecessary intrusion into common code.  Why not
create a vfio specific list of dma objects, each with a list of
containers?  Thanks,

Alex

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  dma-helpers.c          |    2 ++
>  hw/misc/vfio.c         |   13 ++++++++-----
>  include/hw/misc/vfio.h |   28 ++++++++++++++++++++++++++++
>  include/sysemu/dma.h   |    2 ++
>  stubs/Makefile.objs    |    1 +
>  stubs/vfio.c           |    6 ++++++
>  6 files changed, 47 insertions(+), 5 deletions(-)
>  create mode 100644 include/hw/misc/vfio.h
>  create mode 100644 stubs/vfio.c
> 
> diff --git a/dma-helpers.c b/dma-helpers.c
> index 272632f..f0c7866 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -11,6 +11,7 @@
>  #include "trace.h"
>  #include "qemu/range.h"
>  #include "qemu/thread.h"
> +#include "hw/misc/vfio.h"
>  
>  /* #define DEBUG_IOMMU */
>  
> @@ -386,6 +387,7 @@ void dma_context_init(DMAContext *dma, AddressSpace *as, DMATranslateFunc transl
>      dma->translate = translate;
>      dma->map = map;
>      dma->unmap = unmap;
> +    dma_context_init_vfio(dma);
>  }
>  
>  void *iommu_dma_memory_map(DMAContext *dma, dma_addr_t addr, dma_addr_t *len,
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index f77a599..ab870a8 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -39,6 +39,7 @@
>  #include "qemu/range.h"
>  #include "sysemu/kvm.h"
>  #include "sysemu/sysemu.h"
> +#include "hw/misc/vfio.h"
>  
>  /* #define DEBUG_VFIO */
>  #ifdef DEBUG_VFIO
> @@ -179,9 +180,6 @@ typedef struct VFIOGroup {
>  
>  #define MSIX_CAP_LENGTH 12
>  
> -static QLIST_HEAD(, VFIOContainer)
> -    container_list = QLIST_HEAD_INITIALIZER(container_list);
> -
>  static QLIST_HEAD(, VFIOGroup)
>      group_list = QLIST_HEAD_INITIALIZER(group_list);
>  
> @@ -2613,6 +2611,11 @@ static int vfio_load_rom(VFIODevice *vdev)
>      return 0;
>  }
>  
> +void dma_context_init_vfio(DMAContext *dma)
> +{
> +    QLIST_INIT(&dma->vfio.containers);
> +}
> +
>  static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
>  {
>      VFIOContainer *container;
> @@ -2628,7 +2631,7 @@ static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
>          }
>      }
>  
> -    QLIST_FOREACH(container, &container_list, next) {
> +    QLIST_FOREACH(container, &dma->vfio.containers, next) {
>          if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
>              group->container = container;
>              QLIST_INSERT_HEAD(&container->group_list, group, container_next);
> @@ -2683,7 +2686,7 @@ static int vfio_connect_context(VFIOGroup *group, DMAContext *dma)
>      }
>  
>      QLIST_INIT(&container->group_list);
> -    QLIST_INSERT_HEAD(&container_list, container, next);
> +    QLIST_INSERT_HEAD(&dma->vfio.containers, container, next);
>  
>      group->container = container;
>      QLIST_INSERT_HEAD(&container->group_list, group, container_next);
> diff --git a/include/hw/misc/vfio.h b/include/hw/misc/vfio.h
> new file mode 100644
> index 0000000..18fe144
> --- /dev/null
> +++ b/include/hw/misc/vfio.h
> @@ -0,0 +1,28 @@
> +/*
> + * vfio based device assignment
> + *
> + * Copyright 2013 David Gibson, IBM Corporation.
> + * Copyright Red Hat, Inc. 2012
> + *
> + * This work is licensed under the terms of the GNU GPL, version
> + * 2. See the COPYING file in the top-level directory.
> + */
> +#ifndef QEMU_VFIO_H
> +#define QEMU_VFIO_H
> +
> +#include "qemu/queue.h"
> +
> +typedef struct DMAContext DMAContext;
> +struct DMAContext;
> +
> +typedef struct VFIOContainer VFIOContainer;
> +struct VFIOContainer;
> +
> +typedef struct DMAContextVFIO DMAContextVFIO;
> +struct DMAContextVFIO {
> +    QLIST_HEAD(, VFIOContainer) containers;
> +};
> +
> +void dma_context_init_vfio(DMAContext *dma);
> +
> +#endif /* QEMU_VFIO_H */
> diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h
> index a52c93a..8692d0a 100644
> --- a/include/sysemu/dma.h
> +++ b/include/sysemu/dma.h
> @@ -15,6 +15,7 @@
>  #include "hw/hw.h"
>  #include "block/block.h"
>  #include "sysemu/kvm.h"
> +#include "hw/misc/vfio.h"
>  
>  typedef struct DMAContext DMAContext;
>  typedef struct ScatterGatherEntry ScatterGatherEntry;
> @@ -66,6 +67,7 @@ struct DMAContext {
>      DMATranslateFunc *translate;
>      DMAMapFunc *map;
>      DMAUnmapFunc *unmap;
> +    DMAContextVFIO vfio;
>  };
>  
>  /* A global DMA context corresponding to the address_space_memory
> diff --git a/stubs/Makefile.objs b/stubs/Makefile.objs
> index 9c55b34..858ca6b 100644
> --- a/stubs/Makefile.objs
> +++ b/stubs/Makefile.objs
> @@ -20,6 +20,7 @@ stub-obj-y += reset.o
>  stub-obj-y += set-fd-handler.o
>  stub-obj-y += slirp.o
>  stub-obj-y += sysbus.o
> +stub-obj-y += vfio.o
>  stub-obj-y += vm-stop.o
>  stub-obj-y += vmstate.o
>  stub-obj-$(CONFIG_WIN32) += fd-register.o
> diff --git a/stubs/vfio.c b/stubs/vfio.c
> new file mode 100644
> index 0000000..6fe4a84
> --- /dev/null
> +++ b/stubs/vfio.c
> @@ -0,0 +1,6 @@
> +#include "hw/misc/vfio.h"
> +#include "sysemu/dma.h"
> +
> +void dma_context_init_vfio(DMAContext *dma)
> +{
> +}

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-24 15:12   ` Alex Williamson
@ 2013-04-24 16:33     ` Paolo Bonzini
  2013-04-25  6:36       ` David Gibson
  2013-04-25  6:35     ` David Gibson
  1 sibling, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-04-24 16:33 UTC (permalink / raw)
  To: Alex Williamson; +Cc: aik, qemu-devel, David Gibson

Il 24/04/2013 17:12, Alex Williamson ha scritto:
>> > At the moment, vfio maintains a global list of containers that are assumed
>> > to be more or less interchangeable, since they are all set up with a
>> > MemoryListener to have all of system memory mapped.  However, that only
>> > makes sense if all the containers are used on devices which really do
>> > expect a dma address space identical to system memory.
>> > 
>> > This patch moves towards that by making the list of containers per
>> > DMAContext (which corresponds to a dma address space) instead of global.
> This seems like an unnecessary intrusion into common code.  Why not
> create a vfio specific list of dma objects, each with a list of
> containers?  Thanks,

Yeah, I suggest that this is re-evaluated on top of the iommu patches.

You can find them at git://github.com/bonzini/qemu.git, branch iommu.
It seems to work with pseries, at least my guest crashes at the same
place with and without.  USB works, and so do VGA and spapr-vscsi.

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-24 15:12   ` Alex Williamson
  2013-04-24 16:33     ` Paolo Bonzini
@ 2013-04-25  6:35     ` David Gibson
  1 sibling, 0 replies; 19+ messages in thread
From: David Gibson @ 2013-04-25  6:35 UTC (permalink / raw)
  To: Alex Williamson; +Cc: aik, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1490 bytes --]

On Wed, Apr 24, 2013 at 09:12:39AM -0600, Alex Williamson wrote:
> On Wed, 2013-04-24 at 22:01 +1000, David Gibson wrote:
> > At the moment, vfio maintains a global list of containers that are assumed
> > to be more or less interchangeable, since they are all set up with a
> > MemoryListener to have all of system memory mapped.  However, that only
> > makes sense if all the containers are used on devices which really do
> > expect a dma address space identical to system memory.
> > 
> > This patch moves towards that by making the list of containers per
> > DMAContext (which corresponds to a dma address space) instead of global.
> 
> This seems like an unnecessary intrusion into common code.  Why not
> create a vfio specific list of dma objects, each with a list of
> containers?  Thanks,

Possible, but ugly.  DMAContext *is* the handle for a DMA address
space.  Having a parallel array of VFIO contextns would be converting
from the generic handle to the VFIO one would be a search, rather than
just a dereference, and making sure the lifetime of the VFIO one
matches the lifetime of the generic object would be unnecessarily
awkward.

Which reminds me, I forgot to implement a VFIO hook for DMAContext
destruction in this patch.  That does need to be fixed.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-24 16:33     ` Paolo Bonzini
@ 2013-04-25  6:36       ` David Gibson
  2013-04-26  8:44         ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: David Gibson @ 2013-04-25  6:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: aik, Alex Williamson, qemu-devel

[-- Attachment #1: Type: text/plain, Size: 1418 bytes --]

On Wed, Apr 24, 2013 at 06:33:33PM +0200, Paolo Bonzini wrote:
> Il 24/04/2013 17:12, Alex Williamson ha scritto:
> >> > At the moment, vfio maintains a global list of containers that are assumed
> >> > to be more or less interchangeable, since they are all set up with a
> >> > MemoryListener to have all of system memory mapped.  However, that only
> >> > makes sense if all the containers are used on devices which really do
> >> > expect a dma address space identical to system memory.
> >> > 
> >> > This patch moves towards that by making the list of containers per
> >> > DMAContext (which corresponds to a dma address space) instead of global.
> > This seems like an unnecessary intrusion into common code.  Why not
> > create a vfio specific list of dma objects, each with a list of
> > containers?  Thanks,
> 
> Yeah, I suggest that this is re-evaluated on top of the iommu patches.

> You can find them at git://github.com/bonzini/qemu.git, branch iommu.
> It seems to work with pseries, at least my guest crashes at the same
> place with and without.  USB works, and so do VGA and spapr-vscsi.

Ok, I'll have a look when I get a chance.  Any guesses as to when they
might reach mainline?

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-25  6:36       ` David Gibson
@ 2013-04-26  8:44         ` Paolo Bonzini
  2013-04-26  8:46           ` Alexey Kardashevskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-04-26  8:44 UTC (permalink / raw)
  To: David Gibson; +Cc: aik, Alex Williamson, qemu-devel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Il 25/04/2013 08:36, David Gibson ha scritto:
> On Wed, Apr 24, 2013 at 06:33:33PM +0200, Paolo Bonzini wrote:
>> Il 24/04/2013 17:12, Alex Williamson ha scritto:
>>>>> At the moment, vfio maintains a global list of containers
>>>>> that are assumed to be more or less interchangeable, since
>>>>> they are all set up with a MemoryListener to have all of
>>>>> system memory mapped.  However, that only makes sense if
>>>>> all the containers are used on devices which really do 
>>>>> expect a dma address space identical to system memory.
>>>>> 
>>>>> This patch moves towards that by making the list of
>>>>> containers per DMAContext (which corresponds to a dma
>>>>> address space) instead of global.
>>> This seems like an unnecessary intrusion into common code.  Why
>>> not create a vfio specific list of dma objects, each with a
>>> list of containers?  Thanks,
>> 
>> Yeah, I suggest that this is re-evaluated on top of the iommu
>> patches.
> 
>> You can find them at git://github.com/bonzini/qemu.git, branch
>> iommu. It seems to work with pseries, at least my guest crashes
>> at the same place with and without.  USB works, and so do VGA and
>> spapr-vscsi.
> 
> Ok, I'll have a look when I get a chance.  Any guesses as to when
> they might reach mainline?

If I get your Tested-by, early in 1.6.

Paolo

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJRej5jAAoJEBvWZb6bTYby6I0P/1lUAJhlrZ3mJ7h3HOK/Hee4
3U+csDnpp9kBkkoueWcfg/u5FCEIQ5MDII4/qvcJ6onLB6kp6cTyU5KRUQEFw+SF
er/CfC3bIIo1wz3Ze6l9shOLa4bqiCawBGA5+dRKzh9KWuNwFmmkWjbUNBEgNcnb
0Iz2/jdL6KWEE8a7brCchJZpv6Ib0AhDMEp1wf6OwdJOWQx6BbLSr/SzFTb3/Wgs
JOawqd6Dd/O45E+b0rDkXoF2Fit/XvrBSm0Hju8zvT3XnWeJxIMEiFroJIN053F2
/QHPWoF/xLRnP1JPhxiudSHZCKOMxQn/XBYa+olxt4MhRzEU+0F3bgUY19+sjfbh
OAgtB5icjTP+gFPC2mw3wZZnHJ6Q0bWkLOSmQ4UXpcv88ZagJUDUTCiAWp2dPYnE
czKLUC5IB1KAp8AWb3dqiym/BpDPuO0s/15t19e2/qJOZKU4PDQJD9hSnnbcYwRN
dVMQWVpxKro2cZUXV6NbAhNXIdfM4WCdRDUDtRssAeaVOG5iiwdr481PC90+o7O1
gLLfKfsMytHflKwCre9ew6o+obwHXvSLbliou0kJAY0iuYyoAogu6zq4Eu+i+ia7
28JX5qpI/hRebacIKDryc0xGHGah06JHMLdITF7rLqH9fVg2X1Mz+4FAGwjUmO+m
+23q7kaLcfBMjmKKfS3O
=FkrW
-----END PGP SIGNATURE-----

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-26  8:44         ` Paolo Bonzini
@ 2013-04-26  8:46           ` Alexey Kardashevskiy
  2013-04-26  8:52             ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Alexey Kardashevskiy @ 2013-04-26  8:46 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Alex Williamson, qemu-devel, David Gibson

On 04/26/2013 06:44 PM, Paolo Bonzini wrote:

> Il 25/04/2013 08:36, David Gibson ha scritto:
>> On Wed, Apr 24, 2013 at 06:33:33PM +0200, Paolo Bonzini wrote:
>>> Il 24/04/2013 17:12, Alex Williamson ha scritto:
>>>>>> At the moment, vfio maintains a global list of containers
>>>>>> that are assumed to be more or less interchangeable, since
>>>>>> they are all set up with a MemoryListener to have all of
>>>>>> system memory mapped.  However, that only makes sense if
>>>>>> all the containers are used on devices which really do
>>>>>> expect a dma address space identical to system memory.
>>>>>>
>>>>>> This patch moves towards that by making the list of
>>>>>> containers per DMAContext (which corresponds to a dma
>>>>>> address space) instead of global.
>>>> This seems like an unnecessary intrusion into common code.  Why
>>>> not create a vfio specific list of dma objects, each with a
>>>> list of containers?  Thanks,
>>>
>>> Yeah, I suggest that this is re-evaluated on top of the iommu
>>> patches.
>>
>>> You can find them at git://github.com/bonzini/qemu.git, branch
>>> iommu. It seems to work with pseries, at least my guest crashes
>>> at the same place with and without.  USB works, and so do VGA and
>>> spapr-vscsi.
>>
>> Ok, I'll have a look when I get a chance.  Any guesses as to when
>> they might reach mainline?
>
> If I get your Tested-by, early in 1.6.


Emulated PCI works on ppc64/spapr so far.



-- 
Alexey

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-26  8:46           ` Alexey Kardashevskiy
@ 2013-04-26  8:52             ` Paolo Bonzini
  2013-04-26  8:56               ` Alexey Kardashevskiy
  0 siblings, 1 reply; 19+ messages in thread
From: Paolo Bonzini @ 2013-04-26  8:52 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alex Williamson, qemu-devel, David Gibson

Il 26/04/2013 10:46, Alexey Kardashevskiy ha scritto:
> On 04/26/2013 06:44 PM, Paolo Bonzini wrote:
>> Il 25/04/2013 08:36, David Gibson ha scritto:
>>> On Wed, Apr 24, 2013 at 06:33:33PM +0200, Paolo Bonzini wrote:
>>>> Il 24/04/2013 17:12, Alex Williamson ha scritto:
>>>>>>> At the moment, vfio maintains a global list of containers
>>>>>>> that are assumed to be more or less interchangeable, since
>>>>>>> they are all set up with a MemoryListener to have all of
>>>>>>> system memory mapped.  However, that only makes sense if
>>>>>>> all the containers are used on devices which really do
>>>>>>> expect a dma address space identical to system memory.
>>>>>>>
>>>>>>> This patch moves towards that by making the list of
>>>>>>> containers per DMAContext (which corresponds to a dma
>>>>>>> address space) instead of global.
>>>>> This seems like an unnecessary intrusion into common code.  Why
>>>>> not create a vfio specific list of dma objects, each with a
>>>>> list of containers?  Thanks,
>>>>
>>>> Yeah, I suggest that this is re-evaluated on top of the iommu
>>>> patches.
>>>
>>>> You can find them at git://github.com/bonzini/qemu.git, branch
>>>> iommu. It seems to work with pseries, at least my guest crashes
>>>> at the same place with and without.  USB works, and so do VGA and
>>>> spapr-vscsi.
>>>
>>> Ok, I'll have a look when I get a chance.  Any guesses as to when
>>> they might reach mainline?
>>
>> If I get your Tested-by, early in 1.6.
> 
> Emulated PCI works on ppc64/spapr so far.

What about VIO?  Doesn't it go through the IOMMU as well?  I'm sure you
can test it more than I did (it did break in Avi's original patches, so
it must not be that bad...).

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-26  8:52             ` Paolo Bonzini
@ 2013-04-26  8:56               ` Alexey Kardashevskiy
  2013-04-26  9:08                 ` Paolo Bonzini
  0 siblings, 1 reply; 19+ messages in thread
From: Alexey Kardashevskiy @ 2013-04-26  8:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Alex Williamson, qemu-devel, David Gibson

On 04/26/2013 06:52 PM, Paolo Bonzini wrote:
> Il 26/04/2013 10:46, Alexey Kardashevskiy ha scritto:
>> On 04/26/2013 06:44 PM, Paolo Bonzini wrote:
>>> Il 25/04/2013 08:36, David Gibson ha scritto:
>>>> On Wed, Apr 24, 2013 at 06:33:33PM +0200, Paolo Bonzini wrote:
>>>>> Il 24/04/2013 17:12, Alex Williamson ha scritto:
>>>>>>>> At the moment, vfio maintains a global list of containers
>>>>>>>> that are assumed to be more or less interchangeable, since
>>>>>>>> they are all set up with a MemoryListener to have all of
>>>>>>>> system memory mapped.  However, that only makes sense if
>>>>>>>> all the containers are used on devices which really do
>>>>>>>> expect a dma address space identical to system memory.
>>>>>>>>
>>>>>>>> This patch moves towards that by making the list of
>>>>>>>> containers per DMAContext (which corresponds to a dma
>>>>>>>> address space) instead of global.
>>>>>> This seems like an unnecessary intrusion into common code.  Why
>>>>>> not create a vfio specific list of dma objects, each with a
>>>>>> list of containers?  Thanks,
>>>>>
>>>>> Yeah, I suggest that this is re-evaluated on top of the iommu
>>>>> patches.
>>>>
>>>>> You can find them at git://github.com/bonzini/qemu.git, branch
>>>>> iommu. It seems to work with pseries, at least my guest crashes
>>>>> at the same place with and without.  USB works, and so do VGA and
>>>>> spapr-vscsi.
>>>>
>>>> Ok, I'll have a look when I get a chance.  Any guesses as to when
>>>> they might reach mainline?
>>>
>>> If I get your Tested-by, early in 1.6.
>>
>> Emulated PCI works on ppc64/spapr so far.
>
> What about VIO?  Doesn't it go through the IOMMU as well?  I'm sure you
> can test it more than I did (it did break in Avi's original patches, so
> it must not be that bad...).


I run QEMU with
-net nic,model=ibmveth -net user,hostfwd=tcp::5000-:22
and run in the guest: "dhclient ; wget google.com" which worked fine. When 
it damaged, it does not go further than dhclient :)



-- 
Alexey

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext
  2013-04-26  8:56               ` Alexey Kardashevskiy
@ 2013-04-26  9:08                 ` Paolo Bonzini
  0 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2013-04-26  9:08 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alex Williamson, qemu-devel, David Gibson

Il 26/04/2013 10:56, Alexey Kardashevskiy ha scritto:
> On 04/26/2013 06:52 PM, Paolo Bonzini wrote:
>> Il 26/04/2013 10:46, Alexey Kardashevskiy ha scritto:
>>> On 04/26/2013 06:44 PM, Paolo Bonzini wrote:
>>>> Il 25/04/2013 08:36, David Gibson ha scritto:
>>>>> On Wed, Apr 24, 2013 at 06:33:33PM +0200, Paolo Bonzini wrote:
>>>>>> Il 24/04/2013 17:12, Alex Williamson ha scritto:
>>>>>>>>> At the moment, vfio maintains a global list of containers
>>>>>>>>> that are assumed to be more or less interchangeable, since
>>>>>>>>> they are all set up with a MemoryListener to have all of
>>>>>>>>> system memory mapped.  However, that only makes sense if
>>>>>>>>> all the containers are used on devices which really do
>>>>>>>>> expect a dma address space identical to system memory.
>>>>>>>>>
>>>>>>>>> This patch moves towards that by making the list of
>>>>>>>>> containers per DMAContext (which corresponds to a dma
>>>>>>>>> address space) instead of global.
>>>>>>> This seems like an unnecessary intrusion into common code.  Why
>>>>>>> not create a vfio specific list of dma objects, each with a
>>>>>>> list of containers?  Thanks,
>>>>>>
>>>>>> Yeah, I suggest that this is re-evaluated on top of the iommu
>>>>>> patches.
>>>>>
>>>>>> You can find them at git://github.com/bonzini/qemu.git, branch
>>>>>> iommu. It seems to work with pseries, at least my guest crashes
>>>>>> at the same place with and without.  USB works, and so do VGA and
>>>>>> spapr-vscsi.
>>>>>
>>>>> Ok, I'll have a look when I get a chance.  Any guesses as to when
>>>>> they might reach mainline?
>>>>
>>>> If I get your Tested-by, early in 1.6.
>>>
>>> Emulated PCI works on ppc64/spapr so far.
>>
>> What about VIO?  Doesn't it go through the IOMMU as well?  I'm sure you
>> can test it more than I did (it did break in Avi's original patches, so
>> it must not be that bad...).
> 
> I run QEMU with
> -net nic,model=ibmveth -net user,hostfwd=tcp::5000-:22
> and run in the guest: "dhclient ; wget google.com" which worked fine.
> When it damaged, it does not go further than dhclient :)

Ok, that counts as a Tested-by from you, I guess.

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2013-04-26  9:08 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-24 12:01 [Qemu-devel] [0/5] RFC: Preparations for supporting VFIO with guest IOMMUs David Gibson
2013-04-24 12:01 ` [Qemu-devel] [PATCH 1/5] pci: Simpler implementation of PCI_COMMAND_MASTER bit David Gibson
2013-04-24 12:36   ` Paolo Bonzini
2013-04-24 13:06     ` David Gibson
2013-04-24 12:01 ` [Qemu-devel] [PATCH 2/5] pci: Don't create an address space object for every PCI device David Gibson
2013-04-24 12:34   ` Paolo Bonzini
2013-04-24 13:07     ` David Gibson
2013-04-24 12:01 ` [Qemu-devel] [PATCH 3/5] vfio: Associate VFIO groups with DMAContexts David Gibson
2013-04-24 12:01 ` [Qemu-devel] [PATCH 4/5] vfio: Move container list to DMAContext David Gibson
2013-04-24 15:12   ` Alex Williamson
2013-04-24 16:33     ` Paolo Bonzini
2013-04-25  6:36       ` David Gibson
2013-04-26  8:44         ` Paolo Bonzini
2013-04-26  8:46           ` Alexey Kardashevskiy
2013-04-26  8:52             ` Paolo Bonzini
2013-04-26  8:56               ` Alexey Kardashevskiy
2013-04-26  9:08                 ` Paolo Bonzini
2013-04-25  6:35     ` David Gibson
2013-04-24 12:01 ` [Qemu-devel] [PATCH 5/5] vfio: Only use memory listeners when appropriate David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).