qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 04/13] Implement cpu_physical_memory_zero()
  2012-03-01  5:35 [Qemu-devel] [0/13] RFC: Support for guest-visible IOMMUs David Gibson
@ 2012-03-01  5:36 ` David Gibson
  0 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-01  5:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: eduard.munteanu, rth, David Gibson, mst

This patch adds cpu_physical_memory_zero() function.  This is equivalent to
calling cpu_physical_memory_write() with a buffer full of zeroes, but
avoids actually allocating such a buffer along the way.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 cpu-common.h |    1 +
 exec.c       |   56 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index a40c57d..aee37bf 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -53,6 +53,7 @@ void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
 
 void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                             int len, int is_write);
+void cpu_physical_memory_zero(target_phys_addr_t addr, int len);
 static inline void cpu_physical_memory_read(target_phys_addr_t addr,
                                             void *buf, int len)
 {
diff --git a/exec.c b/exec.c
index 0094f53..145a753 100644
--- a/exec.c
+++ b/exec.c
@@ -3639,6 +3639,62 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
     }
 }
 
+void cpu_physical_memory_zero(target_phys_addr_t addr, int len)
+{
+    int l, io_index;
+    uint8_t *ptr;
+    target_phys_addr_t page;
+    ram_addr_t pd;
+    PhysPageDesc p;
+
+    while (len > 0) {
+        page = addr & TARGET_PAGE_MASK;
+        l = (page + TARGET_PAGE_SIZE) - addr;
+        if (l > len) {
+            l = len;
+        }
+        p = phys_page_find(page >> TARGET_PAGE_BITS);
+        pd = p.phys_offset;
+
+        if ((pd & ~TARGET_PAGE_MASK) != io_mem_ram.ram_addr) {
+            target_phys_addr_t addr1;
+            io_index = pd & (IO_MEM_NB_ENTRIES - 1);
+            addr1 = (addr & ~TARGET_PAGE_MASK) + p.region_offset;
+            /* XXX: could force cpu_single_env to NULL to avoid
+               potential bugs */
+            if (l >= 4 && ((addr1 & 3) == 0)) {
+                /* 32 bit write access */
+                io_mem_write(io_index, addr1, 0, 4);
+                l = 4;
+            } else if (l >= 2 && ((addr1 & 1) == 0)) {
+                /* 16 bit write access */
+                io_mem_write(io_index, addr1, 0, 2);
+                l = 2;
+            } else {
+                /* 8 bit write access */
+                io_mem_write(io_index, addr1, 0, 1);
+                l = 1;
+            }
+        } else {
+            ram_addr_t addr1;
+            addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK);
+            /* RAM case */
+            ptr = qemu_get_ram_ptr(addr1);
+            memset(ptr, 0, l);
+            if (!cpu_physical_memory_is_dirty(addr1)) {
+                /* invalidate code */
+                tb_invalidate_phys_page_range(addr1, addr1 + l, 0);
+                /* set dirty bit */
+                cpu_physical_memory_set_dirty_flags(
+                    addr1, (0xff & ~CODE_DIRTY_FLAG));
+            }
+            qemu_put_ram_ptr(ptr);
+        }
+        len -= l;
+        addr += l;
+    }
+}
+
 /* used for ROM loading : can write in RAM and ROM */
 void cpu_physical_memory_write_rom(target_phys_addr_t addr,
                                    const uint8_t *buf, int len)
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [0/13] Implement support for guest visible IOMMUs
@ 2012-03-09  5:01 David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io David Gibson
                   ` (12 more replies)
  0 siblings, 13 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: rth, mst, agraf, eduard.munteanu

This patch series introduces a general DMA infrastructure which allows
the emulation of guest-visible IOMMUs.  That is, it provides a
framework by which an IOMMU device can be implemented, such that DMA
from other device emulations will be translated according to the
mappings provided by the IOMMU.

One example IOMMU implementation is included, for the para-virtualized
TCE tables specified by PAPR and used in the pseries machine for both
virtual IO and PCI devices.

This series is an updated and cleaned-up version of patches posted by
Eduard - Gabriel Munteanu some time ago.  Those prompted some
discussion at the time, but no resolution was reached.

The series also converts a number of existing device models to use the
new DMA infrastructure, so they can be used with IOMMUs.  Along with
the pci_dma_*() wrapper functions which are already in, this means
that with this series applied, essentially all PCI devices should work
with an emulated IOMMU, as well as pseries VIO devices.  Other types
of devices would need some further conversion to work with the new
framework, but that should be quite straightforward.

This new version of the series has incorporated feedback from the last
RFC round.  I think it's ready to be considered for merge.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  9:34   ` Kevin Wolf
  2012-03-09 10:01   ` Paolo Bonzini
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 02/13] Better support for dma_addr_t variables David Gibson
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, mst, agraf, eduard.munteanu, David Gibson, rth

Currently dma_bdrv_io() takes a 'to_dev' boolean parameter to
determine the direction of DMA it is emulating.  We already have a
DMADirection enum designed specifically to encode DMA directions.
This patch uses it for dma_bdrv_io() as well.  This involves removing
the DMADirection definition from the #ifdef it was inside, but since that
only existed to protect the definition of dma_addr_t from places where
config.h is not included, there wasn't any reason for it to be there in
the first place.

Cc: Kevin Wolf <kwolf@redhat.com>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 dma-helpers.c  |   20 ++++++++++++--------
 dma.h          |   12 ++++++------
 hw/ide/core.c  |    3 ++-
 hw/ide/macio.c |    3 ++-
 4 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index c29ea6d..5f19a85 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -42,7 +42,7 @@ typedef struct {
     BlockDriverAIOCB *acb;
     QEMUSGList *sg;
     uint64_t sector_num;
-    bool to_dev;
+    DMADirection dir;
     bool in_cancel;
     int sg_cur_index;
     dma_addr_t sg_cur_byte;
@@ -76,7 +76,8 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
 
     for (i = 0; i < dbs->iov.niov; ++i) {
         cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
-                                  dbs->iov.iov[i].iov_len, !dbs->to_dev,
+                                  dbs->iov.iov[i].iov_len,
+                                  dbs->dir != DMA_DIRECTION_TO_DEVICE,
                                   dbs->iov.iov[i].iov_len);
     }
     qemu_iovec_reset(&dbs->iov);
@@ -123,7 +124,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
     while (dbs->sg_cur_index < dbs->sg->nsg) {
         cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
         cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
-        mem = cpu_physical_memory_map(cur_addr, &cur_len, !dbs->to_dev);
+        mem = cpu_physical_memory_map(cur_addr, &cur_len,
+                                      dbs->dir != DMA_DIRECTION_TO_DEVICE);
         if (!mem)
             break;
         qemu_iovec_add(&dbs->iov, mem, cur_len);
@@ -170,11 +172,11 @@ static AIOPool dma_aio_pool = {
 BlockDriverAIOCB *dma_bdrv_io(
     BlockDriverState *bs, QEMUSGList *sg, uint64_t sector_num,
     DMAIOFunc *io_func, BlockDriverCompletionFunc *cb,
-    void *opaque, bool to_dev)
+    void *opaque, DMADirection dir)
 {
     DMAAIOCB *dbs = qemu_aio_get(&dma_aio_pool, bs, cb, opaque);
 
-    trace_dma_bdrv_io(dbs, bs, sector_num, to_dev);
+    trace_dma_bdrv_io(dbs, bs, sector_num, (dir == DMA_DIRECTION_TO_DEVICE));
 
     dbs->acb = NULL;
     dbs->bs = bs;
@@ -182,7 +184,7 @@ BlockDriverAIOCB *dma_bdrv_io(
     dbs->sector_num = sector_num;
     dbs->sg_cur_index = 0;
     dbs->sg_cur_byte = 0;
-    dbs->to_dev = to_dev;
+    dbs->dir = dir;
     dbs->io_func = io_func;
     dbs->bh = NULL;
     qemu_iovec_init(&dbs->iov, sg->nsg);
@@ -195,14 +197,16 @@ BlockDriverAIOCB *dma_bdrv_read(BlockDriverState *bs,
                                 QEMUSGList *sg, uint64_t sector,
                                 void (*cb)(void *opaque, int ret), void *opaque)
 {
-    return dma_bdrv_io(bs, sg, sector, bdrv_aio_readv, cb, opaque, false);
+    return dma_bdrv_io(bs, sg, sector, bdrv_aio_readv, cb, opaque,
+                       DMA_DIRECTION_FROM_DEVICE);
 }
 
 BlockDriverAIOCB *dma_bdrv_write(BlockDriverState *bs,
                                  QEMUSGList *sg, uint64_t sector,
                                  void (*cb)(void *opaque, int ret), void *opaque)
 {
-    return dma_bdrv_io(bs, sg, sector, bdrv_aio_writev, cb, opaque, true);
+    return dma_bdrv_io(bs, sg, sector, bdrv_aio_writev, cb, opaque,
+                       DMA_DIRECTION_TO_DEVICE);
 }
 
 
diff --git a/dma.h b/dma.h
index 20e86d2..05ac325 100644
--- a/dma.h
+++ b/dma.h
@@ -17,6 +17,11 @@
 
 typedef struct ScatterGatherEntry ScatterGatherEntry;
 
+typedef enum {
+    DMA_DIRECTION_TO_DEVICE = 0,
+    DMA_DIRECTION_FROM_DEVICE = 1,
+} DMADirection;
+
 struct QEMUSGList {
     ScatterGatherEntry *sg;
     int nsg;
@@ -29,11 +34,6 @@ typedef target_phys_addr_t dma_addr_t;
 
 #define DMA_ADDR_FMT TARGET_FMT_plx
 
-typedef enum {
-    DMA_DIRECTION_TO_DEVICE = 0,
-    DMA_DIRECTION_FROM_DEVICE = 1,
-} DMADirection;
-
 struct ScatterGatherEntry {
     dma_addr_t base;
     dma_addr_t len;
@@ -51,7 +51,7 @@ typedef BlockDriverAIOCB *DMAIOFunc(BlockDriverState *bs, int64_t sector_num,
 BlockDriverAIOCB *dma_bdrv_io(BlockDriverState *bs,
                               QEMUSGList *sg, uint64_t sector_num,
                               DMAIOFunc *io_func, BlockDriverCompletionFunc *cb,
-                              void *opaque, bool to_dev);
+                              void *opaque, DMADirection dir);
 BlockDriverAIOCB *dma_bdrv_read(BlockDriverState *bs,
                                 QEMUSGList *sg, uint64_t sector,
                                 BlockDriverCompletionFunc *cb, void *opaque);
diff --git a/hw/ide/core.c b/hw/ide/core.c
index 4d568ac..43da841 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -604,7 +604,8 @@ void ide_dma_cb(void *opaque, int ret)
         break;
     case IDE_DMA_TRIM:
         s->bus->dma->aiocb = dma_bdrv_io(s->bs, &s->sg, sector_num,
-                                         ide_issue_trim, ide_dma_cb, s, true);
+                                         ide_issue_trim, ide_dma_cb, s,
+                                         DMA_DIRECTION_TO_DEVICE);
         break;
     }
     return;
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index abbc41b..edcf885 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -149,7 +149,8 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
         break;
     case IDE_DMA_TRIM:
         m->aiocb = dma_bdrv_io(s->bs, &s->sg, sector_num,
-                               ide_issue_trim, pmac_ide_transfer_cb, s, true);
+                               ide_issue_trim, pmac_ide_transfer_cb, s,
+                               DMA_DIRECTION_TO_DEVICE);
         break;
     }
     return;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 02/13] Better support for dma_addr_t variables
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09 10:00   ` Paolo Bonzini
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 03/13] usb-xhci: Use PCI DMA helper functions David Gibson
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: rth, mst, agraf, David Gibson, eduard.munteanu

A while back, we introduced the dma_addr_t type, which is supposed to
be used for bus visible memory addresses.  At present, this is an
alias for target_phys_addr_t, but this will change when we eventually
add support for guest visible IOMMUs.

There are some instances of target_phys_addr_t in the code now which
should really be dma_addr_t, but can't be trivially converted due to
missing features which this patch corrects.

 * We add DMA_ADDR_BITS analagous to TARGET_PHYS_ADDR_BITS.  This is
   important where we need to make a compile-time (#if) based on the
   size of dma_addr_t.

 * We add a new helper macro to create device properties which take a
   dma_addr_t.  This lets us correctly convert code which cuurrently
   has DEFINE_PROP_TADDR() for variables which should be dma_addr_t
   instead of target_phys_addr_t.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 Makefile.objs |    2 +-
 dma.h         |    1 +
 hw/qdev-dma.c |   77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/qdev-dma.h |    8 ++++++
 4 files changed, 87 insertions(+), 1 deletions(-)
 create mode 100644 hw/qdev-dma.c
 create mode 100644 hw/qdev-dma.h

diff --git a/Makefile.objs b/Makefile.objs
index 808de6a..59aed69 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -287,7 +287,7 @@ hw-obj-$(CONFIG_LSI_SCSI_PCI) += lsi53c895a.o
 hw-obj-$(CONFIG_ESP) += esp.o
 
 hw-obj-y += dma-helpers.o sysbus.o isa-bus.o
-hw-obj-y += qdev-addr.o
+hw-obj-y += qdev-addr.o qdev-dma.o
 
 # VGA
 hw-obj-$(CONFIG_VGA_PCI) += vga-pci.o
diff --git a/dma.h b/dma.h
index 05ac325..463095c 100644
--- a/dma.h
+++ b/dma.h
@@ -32,6 +32,7 @@ struct QEMUSGList {
 #if defined(TARGET_PHYS_ADDR_BITS)
 typedef target_phys_addr_t dma_addr_t;
 
+#define DMA_ADDR_BITS TARGET_PHYS_ADDR_BITS
 #define DMA_ADDR_FMT TARGET_FMT_plx
 
 struct ScatterGatherEntry {
diff --git a/hw/qdev-dma.c b/hw/qdev-dma.c
new file mode 100644
index 0000000..2b21bf4
--- /dev/null
+++ b/hw/qdev-dma.c
@@ -0,0 +1,77 @@
+#include "qdev.h"
+#include "qdev-dma.h"
+#include "dma.h"
+
+/* --- target physical address --- */
+
+static int parse_dmaaddr(DeviceState *dev, Property *prop, const char *str)
+{
+    dma_addr_t *ptr = qdev_get_prop_ptr(dev, prop);
+
+    *ptr = strtoull(str, NULL, 16);
+    return 0;
+}
+
+static int print_dmaaddr(DeviceState *dev, Property *prop, char *dest,
+                         size_t len)
+{
+    dma_addr_t *ptr = qdev_get_prop_ptr(dev, prop);
+    return snprintf(dest, len, "0x" DMA_ADDR_FMT, *ptr);
+}
+
+static void get_dmaaddr(Object *obj, Visitor *v, void *opaque,
+                      const char *name, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    dma_addr_t *ptr = qdev_get_prop_ptr(dev, prop);
+    int64_t value;
+
+    value = *ptr;
+    visit_type_int(v, &value, name, errp);
+}
+
+static void set_dmaaddr(Object *obj, Visitor *v, void *opaque,
+                      const char *name, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    Property *prop = opaque;
+    dma_addr_t *ptr = qdev_get_prop_ptr(dev, prop);
+    Error *local_err = NULL;
+    int64_t value;
+
+    if (dev->state != DEV_STATE_CREATED) {
+        error_set(errp, QERR_PERMISSION_DENIED);
+        return;
+    }
+
+    visit_type_int(v, &value, name, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    if ((uint64_t)value <= (uint64_t) ~(dma_addr_t)0) {
+        *ptr = value;
+    } else {
+        error_set(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
+                  dev->id?:"", name, value, (uint64_t) 0,
+                  (uint64_t) ~(dma_addr_t)0);
+    }
+}
+
+
+PropertyInfo qdev_prop_dmaaddr = {
+    .name  = "dmaaddr",
+    .parse = parse_dmaaddr,
+    .print = print_dmaaddr,
+    .get   = get_dmaaddr,
+    .set   = set_dmaaddr,
+};
+
+void qdev_prop_set_dmaaddr(DeviceState *dev, const char *name, dma_addr_t value)
+{
+    Error *errp = NULL;
+    object_property_set_int(OBJECT(dev), value, name, &errp);
+    assert(!errp);
+
+}
diff --git a/hw/qdev-dma.h b/hw/qdev-dma.h
new file mode 100644
index 0000000..8b01fda
--- /dev/null
+++ b/hw/qdev-dma.h
@@ -0,0 +1,8 @@
+#include "dma.h"
+
+#define DEFINE_PROP_DMAADDR(_n, _s, _f, _d)                               \
+    DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_dmaaddr, dma_addr_t)
+
+extern PropertyInfo qdev_prop_dmaaddr;
+void qdev_prop_set_dmaaddr(DeviceState *dev, const char *name,
+                           dma_addr_t value);
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 03/13] usb-xhci: Use PCI DMA helper functions
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 02/13] Better support for dma_addr_t variables David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 04/13] Implement cpu_physical_memory_zero() David Gibson
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, agraf, Gerd Hoffman, eduard.munteanu, David Gibson, rth

Shortly before 1.0, we added helper functions / wrappers for doing PCI DMA
from individual devices.  This makes what's going on clearer and means that
when we add IOMMU support somewhere in the future, only the general PCI
code will have to change, not every device that uses PCI DMA.

However, usb-xhci is not using these wrappers, despite being a PCI only
device.  This patch remedies the situation, using the pci dma functions
instead of direct calls to cpu_physical_memory_{read,write}().  Likewise
address parameters for DMA are changed to dma_addr_t instead of
target_phys_addr_t.

Cc: Gerd Hoffman <kraxel@redhat.com>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/usb-xhci.c |  176 +++++++++++++++++++++++++++-----------------------------
 1 files changed, 85 insertions(+), 91 deletions(-)

diff --git a/hw/usb-xhci.c b/hw/usb-xhci.c
index fc5b542..f0ffa8f 100644
--- a/hw/usb-xhci.c
+++ b/hw/usb-xhci.c
@@ -22,7 +22,6 @@
 #include "qemu-timer.h"
 #include "usb.h"
 #include "pci.h"
-#include "qdev-addr.h"
 #include "msi.h"
 
 //#define DEBUG_XHCI
@@ -140,7 +139,7 @@ typedef struct XHCITRB {
     uint64_t parameter;
     uint32_t status;
     uint32_t control;
-    target_phys_addr_t addr;
+    dma_addr_t addr;
     bool ccs;
 } XHCITRB;
 
@@ -291,8 +290,8 @@ typedef enum EPType {
 } EPType;
 
 typedef struct XHCIRing {
-    target_phys_addr_t base;
-    target_phys_addr_t dequeue;
+    dma_addr_t base;
+    dma_addr_t dequeue;
     bool ccs;
 } XHCIRing;
 
@@ -345,7 +344,7 @@ typedef struct XHCIEPContext {
     unsigned int next_bg;
     XHCITransfer bg_transfers[BG_XFERS];
     EPType type;
-    target_phys_addr_t pctx;
+    dma_addr_t pctx;
     unsigned int max_psize;
     bool has_bg;
     uint32_t state;
@@ -353,7 +352,7 @@ typedef struct XHCIEPContext {
 
 typedef struct XHCISlot {
     bool enabled;
-    target_phys_addr_t ctx;
+    dma_addr_t ctx;
     unsigned int port;
     unsigned int devaddr;
     XHCIEPContext * eps[31];
@@ -402,7 +401,7 @@ struct XHCIState {
     uint32_t erdp_low;
     uint32_t erdp_high;
 
-    target_phys_addr_t er_start;
+    dma_addr_t er_start;
     uint32_t er_size;
     bool er_pcs;
     unsigned int er_ep_idx;
@@ -479,18 +478,18 @@ static const char *trb_name(XHCITRB *trb)
 static void xhci_kick_ep(XHCIState *xhci, unsigned int slotid,
                          unsigned int epid);
 
-static inline target_phys_addr_t xhci_addr64(uint32_t low, uint32_t high)
+static inline dma_addr_t xhci_addr64(uint32_t low, uint32_t high)
 {
-#if TARGET_PHYS_ADDR_BITS > 32
-    return low | ((target_phys_addr_t)high << 32);
+#if DMA_ADDR_BITS > 32
+    return low | ((dma_addr_t)high << 32);
 #else
     return low;
 #endif
 }
 
-static inline target_phys_addr_t xhci_mask64(uint64_t addr)
+static inline dma_addr_t xhci_mask64(uint64_t addr)
 {
-#if TARGET_PHYS_ADDR_BITS > 32
+#if DMA_ADDR_BITS > 32
     return addr;
 #else
     return addr & 0xffffffff;
@@ -532,7 +531,7 @@ static void xhci_die(XHCIState *xhci)
 static void xhci_write_event(XHCIState *xhci, XHCIEvent *event)
 {
     XHCITRB ev_trb;
-    target_phys_addr_t addr;
+    dma_addr_t addr;
 
     ev_trb.parameter = cpu_to_le64(event->ptr);
     ev_trb.status = cpu_to_le32(event->length | (event->ccode << 24));
@@ -548,7 +547,7 @@ static void xhci_write_event(XHCIState *xhci, XHCIEvent *event)
             trb_name(&ev_trb));
 
     addr = xhci->er_start + TRB_SIZE*xhci->er_ep_idx;
-    cpu_physical_memory_write(addr, (uint8_t *) &ev_trb, TRB_SIZE);
+    pci_dma_write(&xhci->pci_dev, addr, &ev_trb, TRB_SIZE);
 
     xhci->er_ep_idx++;
     if (xhci->er_ep_idx >= xhci->er_size) {
@@ -559,7 +558,7 @@ static void xhci_write_event(XHCIState *xhci, XHCIEvent *event)
 
 static void xhci_events_update(XHCIState *xhci)
 {
-    target_phys_addr_t erdp;
+    dma_addr_t erdp;
     unsigned int dp_idx;
     bool do_irq = 0;
 
@@ -570,8 +569,8 @@ static void xhci_events_update(XHCIState *xhci)
     erdp = xhci_addr64(xhci->erdp_low, xhci->erdp_high);
     if (erdp < xhci->er_start ||
         erdp >= (xhci->er_start + TRB_SIZE*xhci->er_size)) {
-        fprintf(stderr, "xhci: ERDP out of bounds: "TARGET_FMT_plx"\n", erdp);
-        fprintf(stderr, "xhci: ER at "TARGET_FMT_plx" len %d\n",
+        fprintf(stderr, "xhci: ERDP out of bounds: "DMA_ADDR_FMT"\n", erdp);
+        fprintf(stderr, "xhci: ER at "DMA_ADDR_FMT" len %d\n",
                 xhci->er_start, xhci->er_size);
         xhci_die(xhci);
         return;
@@ -630,7 +629,7 @@ static void xhci_events_update(XHCIState *xhci)
 
 static void xhci_event(XHCIState *xhci, XHCIEvent *event)
 {
-    target_phys_addr_t erdp;
+    dma_addr_t erdp;
     unsigned int dp_idx;
 
     if (xhci->er_full) {
@@ -649,8 +648,8 @@ static void xhci_event(XHCIState *xhci, XHCIEvent *event)
     erdp = xhci_addr64(xhci->erdp_low, xhci->erdp_high);
     if (erdp < xhci->er_start ||
         erdp >= (xhci->er_start + TRB_SIZE*xhci->er_size)) {
-        fprintf(stderr, "xhci: ERDP out of bounds: "TARGET_FMT_plx"\n", erdp);
-        fprintf(stderr, "xhci: ER at "TARGET_FMT_plx" len %d\n",
+        fprintf(stderr, "xhci: ERDP out of bounds: "DMA_ADDR_FMT"\n", erdp);
+        fprintf(stderr, "xhci: ER at "DMA_ADDR_FMT" len %d\n",
                 xhci->er_start, xhci->er_size);
         xhci_die(xhci);
         return;
@@ -686,7 +685,7 @@ static void xhci_event(XHCIState *xhci, XHCIEvent *event)
 }
 
 static void xhci_ring_init(XHCIState *xhci, XHCIRing *ring,
-                           target_phys_addr_t base)
+                           dma_addr_t base)
 {
     ring->base = base;
     ring->dequeue = base;
@@ -694,18 +693,18 @@ static void xhci_ring_init(XHCIState *xhci, XHCIRing *ring,
 }
 
 static TRBType xhci_ring_fetch(XHCIState *xhci, XHCIRing *ring, XHCITRB *trb,
-                               target_phys_addr_t *addr)
+                               dma_addr_t *addr)
 {
     while (1) {
         TRBType type;
-        cpu_physical_memory_read(ring->dequeue, (uint8_t *) trb, TRB_SIZE);
+        pci_dma_read(&xhci->pci_dev, ring->dequeue, trb, TRB_SIZE);
         trb->addr = ring->dequeue;
         trb->ccs = ring->ccs;
         le64_to_cpus(&trb->parameter);
         le32_to_cpus(&trb->status);
         le32_to_cpus(&trb->control);
 
-        DPRINTF("xhci: TRB fetched [" TARGET_FMT_plx "]: "
+        DPRINTF("xhci: TRB fetched [" DMA_ADDR_FMT "]: "
                 "%016" PRIx64 " %08x %08x %s\n",
                 ring->dequeue, trb->parameter, trb->status, trb->control,
                 trb_name(trb));
@@ -735,19 +734,19 @@ static int xhci_ring_chain_length(XHCIState *xhci, const XHCIRing *ring)
 {
     XHCITRB trb;
     int length = 0;
-    target_phys_addr_t dequeue = ring->dequeue;
+    dma_addr_t dequeue = ring->dequeue;
     bool ccs = ring->ccs;
     /* hack to bundle together the two/three TDs that make a setup transfer */
     bool control_td_set = 0;
 
     while (1) {
         TRBType type;
-        cpu_physical_memory_read(dequeue, (uint8_t *) &trb, TRB_SIZE);
+        pci_dma_read(&xhci->pci_dev, dequeue, &trb, TRB_SIZE);
         le64_to_cpus(&trb.parameter);
         le32_to_cpus(&trb.status);
         le32_to_cpus(&trb.control);
 
-        DPRINTF("xhci: TRB peeked [" TARGET_FMT_plx "]: "
+        DPRINTF("xhci: TRB peeked [" DMA_ADDR_FMT "]: "
                 "%016" PRIx64 " %08x %08x\n",
                 dequeue, trb.parameter, trb.status, trb.control);
 
@@ -790,8 +789,8 @@ static void xhci_er_reset(XHCIState *xhci)
         xhci_die(xhci);
         return;
     }
-    target_phys_addr_t erstba = xhci_addr64(xhci->erstba_low, xhci->erstba_high);
-    cpu_physical_memory_read(erstba, (uint8_t *) &seg, sizeof(seg));
+    dma_addr_t erstba = xhci_addr64(xhci->erstba_low, xhci->erstba_high);
+    pci_dma_read(&xhci->pci_dev, erstba, &seg, sizeof(seg));
     le32_to_cpus(&seg.addr_low);
     le32_to_cpus(&seg.addr_high);
     le32_to_cpus(&seg.size);
@@ -807,7 +806,7 @@ static void xhci_er_reset(XHCIState *xhci)
     xhci->er_pcs = 1;
     xhci->er_full = 0;
 
-    DPRINTF("xhci: event ring:" TARGET_FMT_plx " [%d]\n",
+    DPRINTF("xhci: event ring:" DMA_ADDR_FMT " [%d]\n",
             xhci->er_start, xhci->er_size);
 }
 
@@ -833,24 +832,24 @@ static void xhci_set_ep_state(XHCIState *xhci, XHCIEPContext *epctx,
         return;
     }
 
-    cpu_physical_memory_read(epctx->pctx, (uint8_t *) ctx, sizeof(ctx));
+    pci_dma_read(&xhci->pci_dev, epctx->pctx, ctx, sizeof(ctx));
     ctx[0] &= ~EP_STATE_MASK;
     ctx[0] |= state;
     ctx[2] = epctx->ring.dequeue | epctx->ring.ccs;
     ctx[3] = (epctx->ring.dequeue >> 16) >> 16;
-    DPRINTF("xhci: set epctx: " TARGET_FMT_plx " state=%d dequeue=%08x%08x\n",
+    DPRINTF("xhci: set epctx: " DMA_ADDR_FMT " state=%d dequeue=%08x%08x\n",
             epctx->pctx, state, ctx[3], ctx[2]);
-    cpu_physical_memory_write(epctx->pctx, (uint8_t *) ctx, sizeof(ctx));
+    pci_dma_write(&xhci->pci_dev, epctx->pctx, ctx, sizeof(ctx));
     epctx->state = state;
 }
 
 static TRBCCode xhci_enable_ep(XHCIState *xhci, unsigned int slotid,
-                               unsigned int epid, target_phys_addr_t pctx,
+                               unsigned int epid, dma_addr_t pctx,
                                uint32_t *ctx)
 {
     XHCISlot *slot;
     XHCIEPContext *epctx;
-    target_phys_addr_t dequeue;
+    dma_addr_t dequeue;
     int i;
 
     assert(slotid >= 1 && slotid <= MAXSLOTS);
@@ -1087,7 +1086,7 @@ static TRBCCode xhci_set_ep_dequeue(XHCIState *xhci, unsigned int slotid,
 {
     XHCISlot *slot;
     XHCIEPContext *epctx;
-    target_phys_addr_t dequeue;
+    dma_addr_t dequeue;
 
     assert(slotid >= 1 && slotid <= MAXSLOTS);
 
@@ -1142,7 +1141,7 @@ static int xhci_xfer_data(XHCITransfer *xfer, uint8_t *data,
 
     for (i = 0; i < xfer->trb_count; i++) {
         XHCITRB *trb = &xfer->trbs[i];
-        target_phys_addr_t addr;
+        dma_addr_t addr;
         unsigned int chunk = 0;
 
         switch (TRB_TYPE(*trb)) {
@@ -1173,11 +1172,11 @@ static int xhci_xfer_data(XHCITransfer *xfer, uint8_t *data,
                     memcpy(data, &idata, chunk);
                 } else {
                     DPRINTF("xhci_xfer_data: r/w(%d) %d bytes at "
-                            TARGET_FMT_plx "\n", in_xfer, chunk, addr);
+                            DMA_ADDR_FMT "\n", in_xfer, chunk, addr);
                     if (in_xfer) {
-                        cpu_physical_memory_write(addr, data, chunk);
+                        pci_dma_write(&xhci->pci_dev, addr, data, chunk);
                     } else {
-                        cpu_physical_memory_read(addr, data, chunk);
+                        pci_dma_read(&xhci->pci_dev, addr, data, chunk);
                     }
 #ifdef DEBUG_DATA
                     unsigned int count = chunk;
@@ -1240,7 +1239,7 @@ static void xhci_stall_ep(XHCITransfer *xfer)
     epctx->ring.ccs = xfer->trbs[0].ccs;
     xhci_set_ep_state(xhci, epctx, EP_HALTED);
     DPRINTF("xhci: stalled slot %d ep %d\n", xfer->slotid, xfer->epid);
-    DPRINTF("xhci: will continue at "TARGET_FMT_plx"\n", epctx->ring.dequeue);
+    DPRINTF("xhci: will continue at "DMA_ADDR_FMT"\n", epctx->ring.dequeue);
 }
 
 static int xhci_submit(XHCIState *xhci, XHCITransfer *xfer,
@@ -1805,7 +1804,7 @@ static TRBCCode xhci_address_slot(XHCIState *xhci, unsigned int slotid,
 {
     XHCISlot *slot;
     USBDevice *dev;
-    target_phys_addr_t ictx, octx, dcbaap;
+    dma_addr_t ictx, octx, dcbaap;
     uint64_t poctx;
     uint32_t ictl_ctx[2];
     uint32_t slot_ctx[4];
@@ -1818,15 +1817,14 @@ static TRBCCode xhci_address_slot(XHCIState *xhci, unsigned int slotid,
     DPRINTF("xhci_address_slot(%d)\n", slotid);
 
     dcbaap = xhci_addr64(xhci->dcbaap_low, xhci->dcbaap_high);
-    cpu_physical_memory_read(dcbaap + 8*slotid,
-                             (uint8_t *) &poctx, sizeof(poctx));
+    pci_dma_read(&xhci->pci_dev, dcbaap + 8*slotid, &poctx, sizeof(poctx));
     ictx = xhci_mask64(pictx);
     octx = xhci_mask64(le64_to_cpu(poctx));
 
-    DPRINTF("xhci: input context at "TARGET_FMT_plx"\n", ictx);
-    DPRINTF("xhci: output context at "TARGET_FMT_plx"\n", octx);
+    DPRINTF("xhci: input context at "DMA_ADDR_FMT"\n", ictx);
+    DPRINTF("xhci: output context at "DMA_ADDR_FMT"\n", octx);
 
-    cpu_physical_memory_read(ictx, (uint8_t *) ictl_ctx, sizeof(ictl_ctx));
+    pci_dma_read(&xhci->pci_dev, ictx, ictl_ctx, sizeof(ictl_ctx));
 
     if (ictl_ctx[0] != 0x0 || ictl_ctx[1] != 0x3) {
         fprintf(stderr, "xhci: invalid input context control %08x %08x\n",
@@ -1834,8 +1832,8 @@ static TRBCCode xhci_address_slot(XHCIState *xhci, unsigned int slotid,
         return CC_TRB_ERROR;
     }
 
-    cpu_physical_memory_read(ictx+32, (uint8_t *) slot_ctx, sizeof(slot_ctx));
-    cpu_physical_memory_read(ictx+64, (uint8_t *) ep0_ctx, sizeof(ep0_ctx));
+    pci_dma_read(&xhci->pci_dev, ictx+32, slot_ctx, sizeof(slot_ctx));
+    pci_dma_read(&xhci->pci_dev, ictx+64, ep0_ctx, sizeof(ep0_ctx));
 
     DPRINTF("xhci: input slot context: %08x %08x %08x %08x\n",
             slot_ctx[0], slot_ctx[1], slot_ctx[2], slot_ctx[3]);
@@ -1884,8 +1882,8 @@ static TRBCCode xhci_address_slot(XHCIState *xhci, unsigned int slotid,
     DPRINTF("xhci: output ep0 context: %08x %08x %08x %08x %08x\n",
             ep0_ctx[0], ep0_ctx[1], ep0_ctx[2], ep0_ctx[3], ep0_ctx[4]);
 
-    cpu_physical_memory_write(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
-    cpu_physical_memory_write(octx+32, (uint8_t *) ep0_ctx, sizeof(ep0_ctx));
+    pci_dma_write(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
+    pci_dma_write(&xhci->pci_dev, octx+32, ep0_ctx, sizeof(ep0_ctx));
 
     return res;
 }
@@ -1894,7 +1892,7 @@ static TRBCCode xhci_address_slot(XHCIState *xhci, unsigned int slotid,
 static TRBCCode xhci_configure_slot(XHCIState *xhci, unsigned int slotid,
                                   uint64_t pictx, bool dc)
 {
-    target_phys_addr_t ictx, octx;
+    dma_addr_t ictx, octx;
     uint32_t ictl_ctx[2];
     uint32_t slot_ctx[4];
     uint32_t islot_ctx[4];
@@ -1908,8 +1906,8 @@ static TRBCCode xhci_configure_slot(XHCIState *xhci, unsigned int slotid,
     ictx = xhci_mask64(pictx);
     octx = xhci->slots[slotid-1].ctx;
 
-    DPRINTF("xhci: input context at "TARGET_FMT_plx"\n", ictx);
-    DPRINTF("xhci: output context at "TARGET_FMT_plx"\n", octx);
+    DPRINTF("xhci: input context at "DMA_ADDR_FMT"\n", ictx);
+    DPRINTF("xhci: output context at "DMA_ADDR_FMT"\n", octx);
 
     if (dc) {
         for (i = 2; i <= 31; i++) {
@@ -1918,17 +1916,17 @@ static TRBCCode xhci_configure_slot(XHCIState *xhci, unsigned int slotid,
             }
         }
 
-        cpu_physical_memory_read(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
+        pci_dma_read(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
         slot_ctx[3] &= ~(SLOT_STATE_MASK << SLOT_STATE_SHIFT);
         slot_ctx[3] |= SLOT_ADDRESSED << SLOT_STATE_SHIFT;
         DPRINTF("xhci: output slot context: %08x %08x %08x %08x\n",
                 slot_ctx[0], slot_ctx[1], slot_ctx[2], slot_ctx[3]);
-        cpu_physical_memory_write(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
+        pci_dma_write(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
 
         return CC_SUCCESS;
     }
 
-    cpu_physical_memory_read(ictx, (uint8_t *) ictl_ctx, sizeof(ictl_ctx));
+    pci_dma_read(&xhci->pci_dev, ictx, ictl_ctx, sizeof(ictl_ctx));
 
     if ((ictl_ctx[0] & 0x3) != 0x0 || (ictl_ctx[1] & 0x3) != 0x1) {
         fprintf(stderr, "xhci: invalid input context control %08x %08x\n",
@@ -1936,8 +1934,8 @@ static TRBCCode xhci_configure_slot(XHCIState *xhci, unsigned int slotid,
         return CC_TRB_ERROR;
     }
 
-    cpu_physical_memory_read(ictx+32, (uint8_t *) islot_ctx, sizeof(islot_ctx));
-    cpu_physical_memory_read(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
+    pci_dma_read(&xhci->pci_dev, ictx+32, islot_ctx, sizeof(islot_ctx));
+    pci_dma_read(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
 
     if (SLOT_STATE(slot_ctx[3]) < SLOT_ADDRESSED) {
         fprintf(stderr, "xhci: invalid slot state %08x\n", slot_ctx[3]);
@@ -1949,8 +1947,8 @@ static TRBCCode xhci_configure_slot(XHCIState *xhci, unsigned int slotid,
             xhci_disable_ep(xhci, slotid, i);
         }
         if (ictl_ctx[1] & (1<<i)) {
-            cpu_physical_memory_read(ictx+32+(32*i),
-                                     (uint8_t *) ep_ctx, sizeof(ep_ctx));
+            pci_dma_read(&xhci->pci_dev, ictx+32+(32*i), ep_ctx,
+                         sizeof(ep_ctx));
             DPRINTF("xhci: input ep%d.%d context: %08x %08x %08x %08x %08x\n",
                     i/2, i%2, ep_ctx[0], ep_ctx[1], ep_ctx[2],
                     ep_ctx[3], ep_ctx[4]);
@@ -1962,8 +1960,7 @@ static TRBCCode xhci_configure_slot(XHCIState *xhci, unsigned int slotid,
             DPRINTF("xhci: output ep%d.%d context: %08x %08x %08x %08x %08x\n",
                     i/2, i%2, ep_ctx[0], ep_ctx[1], ep_ctx[2],
                     ep_ctx[3], ep_ctx[4]);
-            cpu_physical_memory_write(octx+(32*i),
-                                      (uint8_t *) ep_ctx, sizeof(ep_ctx));
+            pci_dma_write(&xhci->pci_dev, octx+(32*i), ep_ctx, sizeof(ep_ctx));
         }
     }
 
@@ -1975,7 +1972,7 @@ static TRBCCode xhci_configure_slot(XHCIState *xhci, unsigned int slotid,
     DPRINTF("xhci: output slot context: %08x %08x %08x %08x\n",
             slot_ctx[0], slot_ctx[1], slot_ctx[2], slot_ctx[3]);
 
-    cpu_physical_memory_write(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
+    pci_dma_write(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
 
     return CC_SUCCESS;
 }
@@ -1984,7 +1981,7 @@ static TRBCCode xhci_configure_slot(XHCIState *xhci, unsigned int slotid,
 static TRBCCode xhci_evaluate_slot(XHCIState *xhci, unsigned int slotid,
                                    uint64_t pictx)
 {
-    target_phys_addr_t ictx, octx;
+    dma_addr_t ictx, octx;
     uint32_t ictl_ctx[2];
     uint32_t iep0_ctx[5];
     uint32_t ep0_ctx[5];
@@ -1997,10 +1994,10 @@ static TRBCCode xhci_evaluate_slot(XHCIState *xhci, unsigned int slotid,
     ictx = xhci_mask64(pictx);
     octx = xhci->slots[slotid-1].ctx;
 
-    DPRINTF("xhci: input context at "TARGET_FMT_plx"\n", ictx);
-    DPRINTF("xhci: output context at "TARGET_FMT_plx"\n", octx);
+    DPRINTF("xhci: input context at "DMA_ADDR_FMT"\n", ictx);
+    DPRINTF("xhci: output context at "DMA_ADDR_FMT"\n", octx);
 
-    cpu_physical_memory_read(ictx, (uint8_t *) ictl_ctx, sizeof(ictl_ctx));
+    pci_dma_read(&xhci->pci_dev, ictx, ictl_ctx, sizeof(ictl_ctx));
 
     if (ictl_ctx[0] != 0x0 || ictl_ctx[1] & ~0x3) {
         fprintf(stderr, "xhci: invalid input context control %08x %08x\n",
@@ -2009,13 +2006,12 @@ static TRBCCode xhci_evaluate_slot(XHCIState *xhci, unsigned int slotid,
     }
 
     if (ictl_ctx[1] & 0x1) {
-        cpu_physical_memory_read(ictx+32,
-                                 (uint8_t *) islot_ctx, sizeof(islot_ctx));
+        pci_dma_read(&xhci->pci_dev, ictx+32, islot_ctx, sizeof(islot_ctx));
 
         DPRINTF("xhci: input slot context: %08x %08x %08x %08x\n",
                 islot_ctx[0], islot_ctx[1], islot_ctx[2], islot_ctx[3]);
 
-        cpu_physical_memory_read(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
+        pci_dma_read(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
 
         slot_ctx[1] &= ~0xFFFF; /* max exit latency */
         slot_ctx[1] |= islot_ctx[1] & 0xFFFF;
@@ -2025,18 +2021,17 @@ static TRBCCode xhci_evaluate_slot(XHCIState *xhci, unsigned int slotid,
         DPRINTF("xhci: output slot context: %08x %08x %08x %08x\n",
                 slot_ctx[0], slot_ctx[1], slot_ctx[2], slot_ctx[3]);
 
-        cpu_physical_memory_write(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
+        pci_dma_write(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
     }
 
     if (ictl_ctx[1] & 0x2) {
-        cpu_physical_memory_read(ictx+64,
-                                 (uint8_t *) iep0_ctx, sizeof(iep0_ctx));
+        pci_dma_read(&xhci->pci_dev, ictx+64, iep0_ctx, sizeof(iep0_ctx));
 
         DPRINTF("xhci: input ep0 context: %08x %08x %08x %08x %08x\n",
                 iep0_ctx[0], iep0_ctx[1], iep0_ctx[2],
                 iep0_ctx[3], iep0_ctx[4]);
 
-        cpu_physical_memory_read(octx+32, (uint8_t *) ep0_ctx, sizeof(ep0_ctx));
+        pci_dma_read(&xhci->pci_dev, octx+32, ep0_ctx, sizeof(ep0_ctx));
 
         ep0_ctx[1] &= ~0xFFFF0000; /* max packet size*/
         ep0_ctx[1] |= iep0_ctx[1] & 0xFFFF0000;
@@ -2044,8 +2039,7 @@ static TRBCCode xhci_evaluate_slot(XHCIState *xhci, unsigned int slotid,
         DPRINTF("xhci: output ep0 context: %08x %08x %08x %08x %08x\n",
                 ep0_ctx[0], ep0_ctx[1], ep0_ctx[2], ep0_ctx[3], ep0_ctx[4]);
 
-        cpu_physical_memory_write(octx+32,
-                                  (uint8_t *) ep0_ctx, sizeof(ep0_ctx));
+        pci_dma_write(&xhci->pci_dev, octx+32, ep0_ctx, sizeof(ep0_ctx));
     }
 
     return CC_SUCCESS;
@@ -2054,7 +2048,7 @@ static TRBCCode xhci_evaluate_slot(XHCIState *xhci, unsigned int slotid,
 static TRBCCode xhci_reset_slot(XHCIState *xhci, unsigned int slotid)
 {
     uint32_t slot_ctx[4];
-    target_phys_addr_t octx;
+    dma_addr_t octx;
     int i;
 
     assert(slotid >= 1 && slotid <= MAXSLOTS);
@@ -2062,7 +2056,7 @@ static TRBCCode xhci_reset_slot(XHCIState *xhci, unsigned int slotid)
 
     octx = xhci->slots[slotid-1].ctx;
 
-    DPRINTF("xhci: output context at "TARGET_FMT_plx"\n", octx);
+    DPRINTF("xhci: output context at "DMA_ADDR_FMT"\n", octx);
 
     for (i = 2; i <= 31; i++) {
         if (xhci->slots[slotid-1].eps[i-1]) {
@@ -2070,12 +2064,12 @@ static TRBCCode xhci_reset_slot(XHCIState *xhci, unsigned int slotid)
         }
     }
 
-    cpu_physical_memory_read(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
+    pci_dma_read(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
     slot_ctx[3] &= ~(SLOT_STATE_MASK << SLOT_STATE_SHIFT);
     slot_ctx[3] |= SLOT_DEFAULT << SLOT_STATE_SHIFT;
     DPRINTF("xhci: output slot context: %08x %08x %08x %08x\n",
             slot_ctx[0], slot_ctx[1], slot_ctx[2], slot_ctx[3]);
-    cpu_physical_memory_write(octx, (uint8_t *) slot_ctx, sizeof(slot_ctx));
+    pci_dma_write(&xhci->pci_dev, octx, slot_ctx, sizeof(slot_ctx));
 
     return CC_SUCCESS;
 }
@@ -2098,19 +2092,19 @@ static unsigned int xhci_get_slot(XHCIState *xhci, XHCIEvent *event, XHCITRB *tr
 
 static TRBCCode xhci_get_port_bandwidth(XHCIState *xhci, uint64_t pctx)
 {
-    target_phys_addr_t ctx;
+    dma_addr_t ctx;
     uint8_t bw_ctx[MAXPORTS+1];
 
     DPRINTF("xhci_get_port_bandwidth()\n");
 
     ctx = xhci_mask64(pctx);
 
-    DPRINTF("xhci: bandwidth context at "TARGET_FMT_plx"\n", ctx);
+    DPRINTF("xhci: bandwidth context at "DMA_ADDR_FMT"\n", ctx);
 
     /* TODO: actually implement real values here */
     bw_ctx[0] = 0;
     memset(&bw_ctx[1], 80, MAXPORTS); /* 80% */
-    cpu_physical_memory_write(ctx, bw_ctx, sizeof(bw_ctx));
+    pci_dma_write(&xhci->pci_dev, ctx, bw_ctx, sizeof(bw_ctx));
 
     return CC_SUCCESS;
 }
@@ -2131,13 +2125,13 @@ static uint32_t xhci_nec_challenge(uint32_t hi, uint32_t lo)
     return ~val;
 }
 
-static void xhci_via_challenge(uint64_t addr)
+static void xhci_via_challenge(XHCIState *xhci, uint64_t addr)
 {
     uint32_t buf[8];
     uint32_t obuf[8];
-    target_phys_addr_t paddr = xhci_mask64(addr);
+    dma_addr_t paddr = xhci_mask64(addr);
 
-    cpu_physical_memory_read(paddr, (uint8_t *) &buf, 32);
+    pci_dma_read(&xhci->pci_dev, paddr, &buf, 32);
 
     memcpy(obuf, buf, sizeof(obuf));
 
@@ -2153,7 +2147,7 @@ static void xhci_via_challenge(uint64_t addr)
         obuf[7] = obuf[2] ^ obuf[3] ^ 0x65866593;
     }
 
-    cpu_physical_memory_write(paddr, (uint8_t *) &obuf, 32);
+    pci_dma_write(&xhci->pci_dev, paddr, &obuf, 32);
 }
 
 static void xhci_process_commands(XHCIState *xhci)
@@ -2161,7 +2155,7 @@ static void xhci_process_commands(XHCIState *xhci)
     XHCITRB trb;
     TRBType type;
     XHCIEvent event = {ER_COMMAND_COMPLETE, CC_SUCCESS};
-    target_phys_addr_t addr;
+    dma_addr_t addr;
     unsigned int i, slotid = 0;
 
     DPRINTF("xhci_process_commands()\n");
@@ -2250,7 +2244,7 @@ static void xhci_process_commands(XHCIState *xhci)
             event.ccode = xhci_get_port_bandwidth(xhci, trb.parameter);
             break;
         case CR_VENDOR_VIA_CHALLENGE_RESPONSE:
-            xhci_via_challenge(trb.parameter);
+            xhci_via_challenge(xhci, trb.parameter);
             break;
         case CR_VENDOR_NEC_FIRMWARE_REVISION:
             event.type = 48; /* NEC reply */
@@ -2540,7 +2534,7 @@ static void xhci_oper_write(XHCIState *xhci, uint32_t reg, uint32_t val)
             xhci_event(xhci, &event);
             DPRINTF("xhci: command ring stopped (CRCR=%08x)\n", xhci->crcr_low);
         } else {
-            target_phys_addr_t base = xhci_addr64(xhci->crcr_low & ~0x3f, val);
+            dma_addr_t base = xhci_addr64(xhci->crcr_low & ~0x3f, val);
             xhci_ring_init(xhci, &xhci->cmd_ring, base);
         }
         xhci->crcr_low &= ~(CRCR_CA | CRCR_CS);
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 04/13] Implement cpu_physical_memory_zero()
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (2 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 03/13] usb-xhci: Use PCI DMA helper functions David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 05/13] iommu: Add universal DMA helper functions David Gibson
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: rth, mst, agraf, David Gibson, eduard.munteanu

This patch adds cpu_physical_memory_zero() function.  This is equivalent to
calling cpu_physical_memory_write() with a buffer full of zeroes, but
avoids actually allocating such a buffer along the way.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 cpu-common.h |    1 +
 exec.c       |   57 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index dca5175..146429c 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -53,6 +53,7 @@ void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
 
 void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                             int len, int is_write);
+void cpu_physical_memory_zero(target_phys_addr_t addr, int len);
 static inline void cpu_physical_memory_read(target_phys_addr_t addr,
                                             void *buf, int len)
 {
diff --git a/exec.c b/exec.c
index a37d748..61d29bd 100644
--- a/exec.c
+++ b/exec.c
@@ -3914,6 +3914,63 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
     }
 }
 
+void cpu_physical_memory_zero(target_phys_addr_t addr, int len)
+{
+    int l, io_index;
+    uint8_t *ptr;
+    target_phys_addr_t page;
+    MemoryRegionSection section;
+
+    while (len > 0) {
+        page = addr & TARGET_PAGE_MASK;
+        l = (page + TARGET_PAGE_SIZE) - addr;
+        if (l > len)
+            l = len;
+        section = phys_page_find(page >> TARGET_PAGE_BITS);
+
+        if (!memory_region_is_ram(section.mr)) {
+            target_phys_addr_t addr1;
+            io_index = memory_region_get_ram_addr(section.mr)
+                & (IO_MEM_NB_ENTRIES - 1);
+            addr1 = (addr & ~TARGET_PAGE_MASK)
+                + section.offset_within_region;
+            /* XXX: could force cpu_single_env to NULL to avoid
+               potential bugs */
+            if (l >= 4 && ((addr1 & 3) == 0)) {
+                /* 32 bit write access */
+                io_mem_write(io_index, addr1, 0, 4);
+                l = 4;
+            } else if (l >= 2 && ((addr1 & 1) == 0)) {
+                /* 16 bit write access */
+                io_mem_write(io_index, addr1, 0, 2);
+                l = 2;
+            } else {
+                /* 8 bit write access */
+                io_mem_write(io_index, addr1, 0, 1);
+                l = 1;
+            }
+        } else if (!section.readonly) {
+            ram_addr_t addr1;
+            addr1 = (memory_region_get_ram_addr(section.mr)
+                     + section.offset_within_region)
+                | (addr & ~TARGET_PAGE_MASK);
+            /* RAM case */
+            ptr = qemu_get_ram_ptr(addr1);
+            memset(ptr, 0, l);
+            if (!cpu_physical_memory_is_dirty(addr1)) {
+                /* invalidate code */
+                tb_invalidate_phys_page_range(addr1, addr1 + l, 0);
+                /* set dirty bit */
+                cpu_physical_memory_set_dirty_flags(
+                    addr1, (0xff & ~CODE_DIRTY_FLAG));
+            }
+            qemu_put_ram_ptr(ptr);
+        }
+        len -= l;
+        addr += l;
+    }
+}
+
 /* used for ROM loading : can write in RAM and ROM */
 void cpu_physical_memory_write_rom(target_phys_addr_t addr,
                                    const uint8_t *buf, int len)
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 05/13] iommu: Add universal DMA helper functions
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (3 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 04/13] Implement cpu_physical_memory_zero() David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09 10:06   ` Paolo Bonzini
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 06/13] usb-ohci: Use " David Gibson
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, Joerg Rodel, agraf, eduard.munteanu, David Gibson, rth

Not that long ago, every device implementation using DMA directly
accessed guest memory using cpu_physical_memory_*().  This meant that
adding support for a guest visible IOMMU would require changing every
one of these devices to go through IOMMU translation.

Shortly before qemu 1.0, I made a start on fixing this by providing
helper functions for PCI DMA.  These are currently just stubs which
call the direct access functions, but mean that an IOMMU can be
implemented in one place, rather than for every PCI device.

Clearly, this doesn't help for non PCI devices, which could also be
IOMMU translated on some platforms.  It is also problematic for the
devices which have both PCI and non-PCI version (e.g. OHCI, AHCI) - we
cannot use the the pci_dma_*() functions, because they assume the
presence of a PCIDevice, but we don't want to have to check between
pci_dma_*() and cpu_physical_memory_*() every time we do a DMA in the
device code.

This patch makes the first step on addressing both these problems, by
introducing new (stub) dma helper functions which can be used for any
DMA capable device.

These dma functions take a DMAContext *, a new (currently empty)
variable describing the DMA address space in which the operation is to
take place.  NULL indicates untranslated DMA directly into guest
physical address space.  The intention is that in future non-NULL
values will given information about any necessary IOMMU translation.

DMA using devices must obtain a DMAContext (or, potentially, contexts)
from their bus or platform.  For now this patch just converts the PCI
wrappers to be implemented in terms of the universal wrappers,
converting other drivers can take place over time.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Joerg Rodel <Joerg.Rodel@amd.com>
Cc: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Cc: Richard Henderson <rth@twiddle.net>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 dma.h         |  103 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/pci.h      |   22 +++++++-----
 qemu-common.h |    1 +
 3 files changed, 117 insertions(+), 9 deletions(-)

diff --git a/dma.h b/dma.h
index 463095c..8b6ef44 100644
--- a/dma.h
+++ b/dma.h
@@ -35,6 +35,109 @@ typedef target_phys_addr_t dma_addr_t;
 #define DMA_ADDR_BITS TARGET_PHYS_ADDR_BITS
 #define DMA_ADDR_FMT TARGET_FMT_plx
 
+typedef void DMAInvalidateMapFunc(void *);
+
+/* Checks that the given range of addresses is valid for DMA.  This is
+ * useful for certain cases, but usually you should just use
+ * dma_memory_{read,write}() and check for errors */
+static inline bool dma_memory_valid(DMAContext *dma, dma_addr_t addr,
+                                    dma_addr_t len, DMADirection dir)
+{
+    /* Stub version, with no iommu we assume all bus addresses are valid */
+    return true;
+}
+
+static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
+                                void *buf, dma_addr_t len, DMADirection dir)
+{
+    /* Stub version when we have no iommu support */
+    cpu_physical_memory_rw(addr, buf, (target_phys_addr_t)len,
+                           dir == DMA_DIRECTION_FROM_DEVICE);
+    return 0;
+}
+
+static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
+                                  void *buf, dma_addr_t len)
+{
+    return dma_memory_rw(dma, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
+}
+
+static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
+                                   const void *buf, dma_addr_t len)
+{
+    return dma_memory_rw(dma, addr, (void *)buf, len,
+                         DMA_DIRECTION_FROM_DEVICE);
+}
+
+static inline int dma_memory_zero(DMAContext *dma, dma_addr_t addr,
+                                  dma_addr_t len)
+{
+    /* Stub version when we have no iommu support */
+    cpu_physical_memory_zero(addr, len);
+    return 0;
+}
+
+static inline void *dma_memory_map(DMAContext *dma,
+                                   DMAInvalidateMapFunc *cb, void *opaque,
+                                   dma_addr_t addr, dma_addr_t *len,
+                                   DMADirection dir)
+{
+    target_phys_addr_t xlen = *len;
+    void *p;
+
+    p = cpu_physical_memory_map(addr, &xlen,
+                                dir == DMA_DIRECTION_FROM_DEVICE);
+    *len = xlen;
+    return p;
+}
+
+static inline void dma_memory_unmap(DMAContext *dma,
+                                    void *buffer, dma_addr_t len,
+                                    DMADirection dir, dma_addr_t access_len)
+{
+    return cpu_physical_memory_unmap(buffer, (target_phys_addr_t)len,
+                                     dir == DMA_DIRECTION_FROM_DEVICE,
+                                     access_len);
+}
+
+#define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \
+    static inline uint##_bits##_t ld##_lname##_##_end##_dma(DMAContext *dma, \
+                                                            dma_addr_t addr) \
+    {                                                                   \
+        uint##_bits##_t val;                                            \
+        dma_memory_read(dma, addr, &val, (_bits) / 8);                  \
+        return _end##_bits##_to_cpu(val);                               \
+    }                                                                   \
+    static inline void st##_sname##_##_end##_dma(DMAContext *dma,       \
+                                                 dma_addr_t addr,       \
+                                                 uint##_bits##_t val)   \
+    {                                                                   \
+        val = cpu_to_##_end##_bits(val);                                \
+        dma_memory_write(dma, addr, &val, (_bits) / 8);                 \
+    }
+
+static inline uint8_t ldub_dma(DMAContext *dma, dma_addr_t addr)
+{
+    uint8_t val;
+
+    dma_memory_read(dma, addr, &val, 1);
+    return val;
+}
+
+static inline void stb_dma(DMAContext *dma, dma_addr_t addr, uint8_t val)
+{
+    dma_memory_write(dma, addr, &val, 1);
+}
+
+DEFINE_LDST_DMA(uw, w, 16, le);
+DEFINE_LDST_DMA(l, l, 32, le);
+DEFINE_LDST_DMA(q, q, 64, le);
+DEFINE_LDST_DMA(uw, w, 16, be);
+DEFINE_LDST_DMA(l, l, 32, be);
+DEFINE_LDST_DMA(q, q, 64, be);
+
+#undef DEFINE_LDST_DMA
+
 struct ScatterGatherEntry {
     dma_addr_t base;
     dma_addr_t len;
diff --git a/hw/pci.h b/hw/pci.h
index 4f19fdb..c021805 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -545,10 +545,16 @@ static inline uint32_t pci_config_size(const PCIDevice *d)
 }
 
 /* DMA access functions */
+static inline DMAContext *pci_dma_context(PCIDevice *dev)
+{
+    /* Stub for when we have no PCI iommu support */
+    return NULL;
+}
+
 static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
                              void *buf, dma_addr_t len, DMADirection dir)
 {
-    cpu_physical_memory_rw(addr, buf, len, dir == DMA_DIRECTION_FROM_DEVICE);
+    dma_memory_rw(pci_dma_context(dev), addr, buf, len, dir);
     return 0;
 }
 
@@ -568,12 +574,12 @@ static inline int pci_dma_write(PCIDevice *dev, dma_addr_t addr,
     static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev,      \
                                                    dma_addr_t addr)     \
     {                                                                   \
-        return ld##_l##_phys(addr);                                     \
+        return ld##_l##_dma(pci_dma_context(dev), addr);                \
     }                                                                   \
     static inline void st##_s##_pci_dma(PCIDevice *dev,                 \
-                          dma_addr_t addr, uint##_bits##_t val)         \
+                                        dma_addr_t addr, uint##_bits##_t val) \
     {                                                                   \
-        st##_s##_phys(addr, val);                                       \
+        st##_s##_dma(pci_dma_context(dev), addr, val);                  \
     }
 
 PCI_DMA_DEFINE_LDST(ub, b, 8);
@@ -587,21 +593,19 @@ PCI_DMA_DEFINE_LDST(q_be, q_be, 64);
 #undef PCI_DMA_DEFINE_LDST
 
 static inline void *pci_dma_map(PCIDevice *dev, dma_addr_t addr,
+                                DMAInvalidateMapFunc *cb, void *opaque,
                                 dma_addr_t *plen, DMADirection dir)
 {
-    target_phys_addr_t len = *plen;
     void *buf;
 
-    buf = cpu_physical_memory_map(addr, &len, dir == DMA_DIRECTION_FROM_DEVICE);
-    *plen = len;
+    buf = dma_memory_map(pci_dma_context(dev), cb, opaque, addr, plen, dir);
     return buf;
 }
 
 static inline void pci_dma_unmap(PCIDevice *dev, void *buffer, dma_addr_t len,
                                  DMADirection dir, dma_addr_t access_len)
 {
-    cpu_physical_memory_unmap(buffer, len, dir == DMA_DIRECTION_FROM_DEVICE,
-                              access_len);
+    dma_memory_unmap(pci_dma_context(dev), buffer, len, dir, access_len);
 }
 
 static inline void pci_dma_sglist_init(QEMUSGList *qsg, PCIDevice *dev,
diff --git a/qemu-common.h b/qemu-common.h
index dbfce6f..af2dd51 100644
--- a/qemu-common.h
+++ b/qemu-common.h
@@ -256,6 +256,7 @@ typedef struct SSIBus SSIBus;
 typedef struct EventNotifier EventNotifier;
 typedef struct VirtIODevice VirtIODevice;
 typedef struct QEMUSGList QEMUSGList;
+typedef struct DMAContext DMAContext;
 
 typedef uint64_t pcibus_t;
 
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 06/13] usb-ohci: Use universal DMA helper functions
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (4 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 05/13] iommu: Add universal DMA helper functions David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers David Gibson
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, agraf, Gerd Hoffmann, eduard.munteanu, David Gibson, rth

The OHCI device emulation can provide both PCI and SysBus OHCI
implementations.  Because of this, it was not previously converted to
use the PCI DMA helper functions.

This patch converts it to use the new universal DMA helper functions.
In the PCI case, it obtains its DMAContext from pci_dma_context(), in
the SysBus case, it uses NULL - i.e. assumes for now that there will
be no IOMMU translation for a SysBus OHCI.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/usb-ohci.c |   93 +++++++++++++++++++++++++++++++-------------------------
 1 files changed, 51 insertions(+), 42 deletions(-)

diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index bc99519..4ad53d7 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -32,7 +32,7 @@
 #include "pci.h"
 #include "usb-ohci.h"
 #include "sysbus.h"
-#include "qdev-addr.h"
+#include "qdev-dma.h"
 
 //#define DEBUG_OHCI
 /* Dump packet contents.  */
@@ -63,6 +63,7 @@ typedef struct {
     USBBus bus;
     qemu_irq irq;
     MemoryRegion mem;
+    DMAContext *dma;
     int num_ports;
     const char *name;
 
@@ -105,7 +106,7 @@ typedef struct {
     uint32_t htest;
 
     /* SM501 local memory offset */
-    target_phys_addr_t localmem_base;
+    dma_addr_t localmem_base;
 
     /* Active packets.  */
     uint32_t old_ctl;
@@ -483,14 +484,14 @@ static void ohci_reset(void *opaque)
 
 /* Get an array of dwords from main memory */
 static inline int get_dwords(OHCIState *ohci,
-                             uint32_t addr, uint32_t *buf, int num)
+                             dma_addr_t addr, uint32_t *buf, int num)
 {
     int i;
 
     addr += ohci->localmem_base;
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-        cpu_physical_memory_read(addr, buf, sizeof(*buf));
+        dma_memory_read(ohci->dma, addr, buf, sizeof(*buf));
         *buf = le32_to_cpu(*buf);
     }
 
@@ -499,7 +500,7 @@ static inline int get_dwords(OHCIState *ohci,
 
 /* Put an array of dwords in to main memory */
 static inline int put_dwords(OHCIState *ohci,
-                             uint32_t addr, uint32_t *buf, int num)
+                             dma_addr_t addr, uint32_t *buf, int num)
 {
     int i;
 
@@ -507,7 +508,7 @@ static inline int put_dwords(OHCIState *ohci,
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
         uint32_t tmp = cpu_to_le32(*buf);
-        cpu_physical_memory_write(addr, &tmp, sizeof(tmp));
+        dma_memory_write(ohci->dma, addr, &tmp, sizeof(tmp));
     }
 
     return 1;
@@ -515,14 +516,14 @@ static inline int put_dwords(OHCIState *ohci,
 
 /* Get an array of words from main memory */
 static inline int get_words(OHCIState *ohci,
-                            uint32_t addr, uint16_t *buf, int num)
+                            dma_addr_t addr, uint16_t *buf, int num)
 {
     int i;
 
     addr += ohci->localmem_base;
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-        cpu_physical_memory_read(addr, buf, sizeof(*buf));
+        dma_memory_read(ohci->dma, addr, buf, sizeof(*buf));
         *buf = le16_to_cpu(*buf);
     }
 
@@ -531,7 +532,7 @@ static inline int get_words(OHCIState *ohci,
 
 /* Put an array of words in to main memory */
 static inline int put_words(OHCIState *ohci,
-                            uint32_t addr, uint16_t *buf, int num)
+                            dma_addr_t addr, uint16_t *buf, int num)
 {
     int i;
 
@@ -539,40 +540,40 @@ static inline int put_words(OHCIState *ohci,
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
         uint16_t tmp = cpu_to_le16(*buf);
-        cpu_physical_memory_write(addr, &tmp, sizeof(tmp));
+        dma_memory_write(ohci->dma, addr, &tmp, sizeof(tmp));
     }
 
     return 1;
 }
 
 static inline int ohci_read_ed(OHCIState *ohci,
-                               uint32_t addr, struct ohci_ed *ed)
+                               dma_addr_t addr, struct ohci_ed *ed)
 {
     return get_dwords(ohci, addr, (uint32_t *)ed, sizeof(*ed) >> 2);
 }
 
 static inline int ohci_read_td(OHCIState *ohci,
-                               uint32_t addr, struct ohci_td *td)
+                               dma_addr_t addr, struct ohci_td *td)
 {
     return get_dwords(ohci, addr, (uint32_t *)td, sizeof(*td) >> 2);
 }
 
 static inline int ohci_read_iso_td(OHCIState *ohci,
-                                   uint32_t addr, struct ohci_iso_td *td)
+                                   dma_addr_t addr, struct ohci_iso_td *td)
 {
     return (get_dwords(ohci, addr, (uint32_t *)td, 4) &&
             get_words(ohci, addr + 16, td->offset, 8));
 }
 
 static inline int ohci_read_hcca(OHCIState *ohci,
-                                 uint32_t addr, struct ohci_hcca *hcca)
+                                 dma_addr_t addr, struct ohci_hcca *hcca)
 {
-    cpu_physical_memory_read(addr + ohci->localmem_base, hcca, sizeof(*hcca));
+    dma_memory_read(ohci->dma, addr + ohci->localmem_base, hcca, sizeof(*hcca));
     return 1;
 }
 
 static inline int ohci_put_ed(OHCIState *ohci,
-                              uint32_t addr, struct ohci_ed *ed)
+                              dma_addr_t addr, struct ohci_ed *ed)
 {
     /* ed->tail is under control of the HCD.
      * Since just ed->head is changed by HC, just write back this
@@ -584,64 +585,63 @@ static inline int ohci_put_ed(OHCIState *ohci,
 }
 
 static inline int ohci_put_td(OHCIState *ohci,
-                              uint32_t addr, struct ohci_td *td)
+                              dma_addr_t addr, struct ohci_td *td)
 {
     return put_dwords(ohci, addr, (uint32_t *)td, sizeof(*td) >> 2);
 }
 
 static inline int ohci_put_iso_td(OHCIState *ohci,
-                                  uint32_t addr, struct ohci_iso_td *td)
+                                  dma_addr_t addr, struct ohci_iso_td *td)
 {
     return (put_dwords(ohci, addr, (uint32_t *)td, 4) &&
             put_words(ohci, addr + 16, td->offset, 8));
 }
 
 static inline int ohci_put_hcca(OHCIState *ohci,
-                                uint32_t addr, struct ohci_hcca *hcca)
+                                dma_addr_t addr, struct ohci_hcca *hcca)
 {
-    cpu_physical_memory_write(addr + ohci->localmem_base + HCCA_WRITEBACK_OFFSET,
-                              (char *)hcca + HCCA_WRITEBACK_OFFSET,
-                              HCCA_WRITEBACK_SIZE);
+    dma_memory_write(ohci->dma,
+                     addr + ohci->localmem_base + HCCA_WRITEBACK_OFFSET,
+                     (char *)hcca + HCCA_WRITEBACK_OFFSET,
+                     HCCA_WRITEBACK_SIZE);
     return 1;
 }
 
 /* Read/Write the contents of a TD from/to main memory.  */
 static void ohci_copy_td(OHCIState *ohci, struct ohci_td *td,
-                         uint8_t *buf, int len, int write)
+                         uint8_t *buf, int len, DMADirection dir)
 {
-    uint32_t ptr;
-    uint32_t n;
+    dma_addr_t ptr, n;
 
     ptr = td->cbp;
     n = 0x1000 - (ptr & 0xfff);
     if (n > len)
         n = len;
-    cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, n, write);
+    dma_memory_rw(ohci->dma, ptr + ohci->localmem_base, buf, n, dir);
     if (n == len)
         return;
     ptr = td->be & ~0xfffu;
     buf += n;
-    cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, len - n, write);
+    dma_memory_rw(ohci->dma, ptr + ohci->localmem_base, buf, len - n, dir);
 }
 
 /* Read/Write the contents of an ISO TD from/to main memory.  */
 static void ohci_copy_iso_td(OHCIState *ohci,
                              uint32_t start_addr, uint32_t end_addr,
-                             uint8_t *buf, int len, int write)
+                             uint8_t *buf, int len, DMADirection dir)
 {
-    uint32_t ptr;
-    uint32_t n;
+    dma_addr_t ptr, n;
 
     ptr = start_addr;
     n = 0x1000 - (ptr & 0xfff);
     if (n > len)
         n = len;
-    cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, n, write);
+    dma_memory_rw(ohci->dma, ptr + ohci->localmem_base, buf, n, dir);
     if (n == len)
         return;
     ptr = end_addr & ~0xfffu;
     buf += n;
-    cpu_physical_memory_rw(ptr + ohci->localmem_base, buf, len - n, write);
+    dma_memory_rw(ohci->dma, ptr + ohci->localmem_base, buf, len - n, dir);
 }
 
 static void ohci_process_lists(OHCIState *ohci, int completion);
@@ -804,7 +804,8 @@ static int ohci_service_iso_td(OHCIState *ohci, struct ohci_ed *ed,
     }
 
     if (len && dir != OHCI_TD_DIR_IN) {
-        ohci_copy_iso_td(ohci, start_addr, end_addr, ohci->usb_buf, len, 0);
+        ohci_copy_iso_td(ohci, start_addr, end_addr, ohci->usb_buf, len,
+                         DMA_DIRECTION_TO_DEVICE);
     }
 
     if (completion) {
@@ -828,7 +829,8 @@ static int ohci_service_iso_td(OHCIState *ohci, struct ohci_ed *ed,
     /* Writeback */
     if (dir == OHCI_TD_DIR_IN && ret >= 0 && ret <= len) {
         /* IN transfer succeeded */
-        ohci_copy_iso_td(ohci, start_addr, end_addr, ohci->usb_buf, ret, 1);
+        ohci_copy_iso_td(ohci, start_addr, end_addr, ohci->usb_buf, ret,
+                         DMA_DIRECTION_FROM_DEVICE);
         OHCI_SET_BM(iso_td.offset[relative_frame_number], TD_PSW_CC,
                     OHCI_CC_NOERROR);
         OHCI_SET_BM(iso_td.offset[relative_frame_number], TD_PSW_SIZE, ret);
@@ -971,7 +973,8 @@ static int ohci_service_td(OHCIState *ohci, struct ohci_ed *ed)
                 pktlen = len;
             }
             if (!completion) {
-                ohci_copy_td(ohci, &td, ohci->usb_buf, pktlen, 0);
+                ohci_copy_td(ohci, &td, ohci->usb_buf, pktlen,
+                             DMA_DIRECTION_TO_DEVICE);
             }
         }
     }
@@ -1021,7 +1024,8 @@ static int ohci_service_td(OHCIState *ohci, struct ohci_ed *ed)
     }
     if (ret >= 0) {
         if (dir == OHCI_TD_DIR_IN) {
-            ohci_copy_td(ohci, &td, ohci->usb_buf, ret, 1);
+            ohci_copy_td(ohci, &td, ohci->usb_buf, ret,
+                         DMA_DIRECTION_FROM_DEVICE);
 #ifdef DEBUG_PACKET
             DPRINTF("  data:");
             for (i = 0; i < ret; i++)
@@ -1747,11 +1751,14 @@ static USBBusOps ohci_bus_ops = {
 };
 
 static int usb_ohci_init(OHCIState *ohci, DeviceState *dev,
-                         int num_ports, uint32_t localmem_base,
-                         char *masterbus, uint32_t firstport)
+                         int num_ports, dma_addr_t localmem_base,
+                         char *masterbus, uint32_t firstport,
+                         DMAContext *dma)
 {
     int i;
 
+    ohci->dma = dma;
+
     if (usb_frame_time == 0) {
 #ifdef OHCI_TIME_WARP
         usb_frame_time = get_ticks_per_sec();
@@ -1816,7 +1823,8 @@ static int usb_ohci_initfn_pci(struct PCIDevice *dev)
     ohci->pci_dev.config[PCI_INTERRUPT_PIN] = 0x01; /* interrupt pin A */
 
     if (usb_ohci_init(&ohci->state, &dev->qdev, ohci->num_ports, 0,
-                      ohci->masterbus, ohci->firstport) != 0) {
+                      ohci->masterbus, ohci->firstport,
+                      pci_dma_context(dev)) != 0) {
         return -1;
     }
     ohci->state.irq = ohci->pci_dev.irq[0];
@@ -1835,7 +1843,7 @@ typedef struct {
     SysBusDevice busdev;
     OHCIState ohci;
     uint32_t num_ports;
-    target_phys_addr_t dma_offset;
+    dma_addr_t dma_offset;
 } OHCISysBusState;
 
 static int ohci_init_pxa(SysBusDevice *dev)
@@ -1843,7 +1851,8 @@ static int ohci_init_pxa(SysBusDevice *dev)
     OHCISysBusState *s = FROM_SYSBUS(OHCISysBusState, dev);
 
     /* Cannot fail as we pass NULL for masterbus */
-    usb_ohci_init(&s->ohci, &dev->qdev, s->num_ports, s->dma_offset, NULL, 0);
+    usb_ohci_init(&s->ohci, &dev->qdev, s->num_ports, s->dma_offset, NULL, 0,
+                  NULL);
     sysbus_init_irq(dev, &s->ohci.irq);
     sysbus_init_mmio(dev, &s->ohci.mem);
 
@@ -1879,7 +1888,7 @@ static TypeInfo ohci_pci_info = {
 
 static Property ohci_sysbus_properties[] = {
     DEFINE_PROP_UINT32("num-ports", OHCISysBusState, num_ports, 3),
-    DEFINE_PROP_TADDR("dma-offset", OHCISysBusState, dma_offset, 3),
+    DEFINE_PROP_DMAADDR("dma-offset", OHCISysBusState, dma_offset, 3),
     DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (5 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 06/13] usb-ohci: Use " David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  9:43   ` Kevin Wolf
  2012-03-09 10:11   ` Paolo Bonzini
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 08/13] ide/ahci: Use universal DMA helper functions David Gibson
                   ` (5 subsequent siblings)
  12 siblings, 2 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, mst, agraf, eduard.munteanu, David Gibson, rth

dma-helpers.c contains a number of helper functions for doing
scatter/gather DMA, and various block device related DMA.  Currently,
these directly access guest memory using cpu_physical_memory_*(),
assuming no IOMMU translation.

This patch updates this code to use the new universal DMA helper
functions.  qemu_sglist_init() now takes a DMAContext * to describe
the DMA address space in which the scatter/gather will take place.

We minimally update the callers qemu_sglist_init() to pass NULL
(i.e. no translation, same as current behaviour).  Some of those
callers should pass something else in some cases to allow proper IOMMU
translation in future, but that will be fixed in later patches.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 dma-helpers.c  |   26 ++++++++++++++++++--------
 dma.h          |    3 ++-
 hw/ide/ahci.c  |    3 ++-
 hw/ide/macio.c |    4 ++--
 hw/pci.h       |    2 +-
 5 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index 5f19a85..9dcfb2c 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -11,12 +11,13 @@
 #include "block_int.h"
 #include "trace.h"
 
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint)
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma)
 {
     qsg->sg = g_malloc(alloc_hint * sizeof(ScatterGatherEntry));
     qsg->nsg = 0;
     qsg->nalloc = alloc_hint;
     qsg->size = 0;
+    qsg->dma = dma;
 }
 
 void qemu_sglist_add(QEMUSGList *qsg, dma_addr_t base, dma_addr_t len)
@@ -75,10 +76,9 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
     int i;
 
     for (i = 0; i < dbs->iov.niov; ++i) {
-        cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
-                                  dbs->iov.iov[i].iov_len,
-                                  dbs->dir != DMA_DIRECTION_TO_DEVICE,
-                                  dbs->iov.iov[i].iov_len);
+        dma_memory_unmap(dbs->sg->dma, dbs->iov.iov[i].iov_base,
+                         dbs->iov.iov[i].iov_len, dbs->dir,
+                         dbs->iov.iov[i].iov_len);
     }
     qemu_iovec_reset(&dbs->iov);
 }
@@ -104,10 +104,20 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
     }
 }
 
+static void dma_bdrv_cancel(void *opaque)
+{
+    DMAAIOCB *dbs = opaque;
+
+    bdrv_aio_cancel(dbs->acb);
+    dma_bdrv_unmap(dbs);
+    qemu_iovec_destroy(&dbs->iov);
+    qemu_aio_release(dbs);
+}
+
 static void dma_bdrv_cb(void *opaque, int ret)
 {
     DMAAIOCB *dbs = (DMAAIOCB *)opaque;
-    target_phys_addr_t cur_addr, cur_len;
+    dma_addr_t cur_addr, cur_len;
     void *mem;
 
     trace_dma_bdrv_cb(dbs, ret);
@@ -124,8 +134,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
     while (dbs->sg_cur_index < dbs->sg->nsg) {
         cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
         cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
-        mem = cpu_physical_memory_map(cur_addr, &cur_len,
-                                      dbs->dir != DMA_DIRECTION_TO_DEVICE);
+        mem = dma_memory_map(dbs->sg->dma, dma_bdrv_cancel, dbs,
+                             cur_addr, &cur_len, dbs->dir);
         if (!mem)
             break;
         qemu_iovec_add(&dbs->iov, mem, cur_len);
diff --git a/dma.h b/dma.h
index 8b6ef44..a66e3d7 100644
--- a/dma.h
+++ b/dma.h
@@ -27,6 +27,7 @@ struct QEMUSGList {
     int nsg;
     int nalloc;
     size_t size;
+    DMAContext *dma;
 };
 
 #if defined(TARGET_PHYS_ADDR_BITS)
@@ -143,7 +144,7 @@ struct ScatterGatherEntry {
     dma_addr_t len;
 };
 
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint);
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma);
 void qemu_sglist_add(QEMUSGList *qsg, dma_addr_t base, dma_addr_t len);
 void qemu_sglist_destroy(QEMUSGList *qsg);
 #endif
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 041ce1e..6a218b5 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -667,7 +667,8 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist)
     if (sglist_alloc_hint > 0) {
         AHCI_SG *tbl = (AHCI_SG *)prdt;
 
-        qemu_sglist_init(sglist, sglist_alloc_hint);
+        /* FIXME: pass the correct DMAContext */
+        qemu_sglist_init(sglist, sglist_alloc_hint, NULL);
         for (i = 0; i < sglist_alloc_hint; i++) {
             /* flags_size is zero-based */
             qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr),
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index edcf885..568a299 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -76,7 +76,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
 
     s->io_buffer_size = io->len;
 
-    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
+    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL);
     qemu_sglist_add(&s->sg, io->addr, io->len);
     io->addr += io->len;
     io->len = 0;
@@ -133,7 +133,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
     s->io_buffer_index = 0;
     s->io_buffer_size = io->len;
 
-    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1);
+    qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1, NULL);
     qemu_sglist_add(&s->sg, io->addr, io->len);
     io->addr += io->len;
     io->len = 0;
diff --git a/hw/pci.h b/hw/pci.h
index c021805..41dcd05 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -611,7 +611,7 @@ static inline void pci_dma_unmap(PCIDevice *dev, void *buffer, dma_addr_t len,
 static inline void pci_dma_sglist_init(QEMUSGList *qsg, PCIDevice *dev,
                                        int alloc_hint)
 {
-    qemu_sglist_init(qsg, alloc_hint);
+    qemu_sglist_init(qsg, alloc_hint, pci_dma_context(dev));
 }
 
 extern const VMStateDescription vmstate_pci_device;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 08/13] ide/ahci: Use universal DMA helper functions
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (6 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  9:48   ` Kevin Wolf
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 09/13] usb: Convert usb_packet_{map, unmap} to universal DMA helpers David Gibson
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Kevin Wolf, mst, agraf, eduard.munteanu, David Gibson, rth

The AHCI device can provide both PCI and SysBus AHCI device
emulations.  For this reason, it wasn't previously converted to use
the pci_dma_*() helper functions.  Now that we have universal DMA
helper functions, this converts AHCI to use them.

The DMAContext is obtained from pci_dma_context() in the PCI case and
set to NULL in the SysBus case (i.e. we assume for now that a SysBus
AHCI has no IOMMU translation).

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/ide/ahci.c |    7 ++++---
 hw/ide/ahci.h |    3 ++-
 hw/ide/ich.c  |    2 +-
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index 6a218b5..3d31179 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -668,7 +668,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist)
         AHCI_SG *tbl = (AHCI_SG *)prdt;
 
         /* FIXME: pass the correct DMAContext */
-        qemu_sglist_init(sglist, sglist_alloc_hint, NULL);
+        qemu_sglist_init(sglist, sglist_alloc_hint, ad->hba->dma);
         for (i = 0; i < sglist_alloc_hint; i++) {
             /* flags_size is zero-based */
             qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr),
@@ -1115,11 +1115,12 @@ static const IDEDMAOps ahci_dma_ops = {
     .reset = ahci_dma_reset,
 };
 
-void ahci_init(AHCIState *s, DeviceState *qdev, int ports)
+void ahci_init(AHCIState *s, DeviceState *qdev, DMAContext *dma, int ports)
 {
     qemu_irq *irqs;
     int i;
 
+    s->dma = dma;
     s->ports = ports;
     s->dev = g_malloc0(sizeof(AHCIDevice) * ports);
     ahci_reg_init(s);
@@ -1182,7 +1183,7 @@ static const VMStateDescription vmstate_sysbus_ahci = {
 static int sysbus_ahci_init(SysBusDevice *dev)
 {
     SysbusAHCIState *s = FROM_SYSBUS(SysbusAHCIState, dev);
-    ahci_init(&s->ahci, &dev->qdev, s->num_ports);
+    ahci_init(&s->ahci, &dev->qdev, NULL, s->num_ports);
 
     sysbus_init_mmio(dev, &s->ahci.mem);
     sysbus_init_irq(dev, &s->ahci.irq);
diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h
index b223d2c..af8c6ef 100644
--- a/hw/ide/ahci.h
+++ b/hw/ide/ahci.h
@@ -299,6 +299,7 @@ typedef struct AHCIState {
     uint32_t idp_index;     /* Current IDP index */
     int ports;
     qemu_irq irq;
+    DMAContext *dma;
 } AHCIState;
 
 typedef struct AHCIPCIState {
@@ -329,7 +330,7 @@ typedef struct NCQFrame {
     uint8_t reserved10;
 } QEMU_PACKED NCQFrame;
 
-void ahci_init(AHCIState *s, DeviceState *qdev, int ports);
+void ahci_init(AHCIState *s, DeviceState *qdev, DMAContext *dma, int ports);
 void ahci_uninit(AHCIState *s);
 
 void ahci_reset(void *opaque);
diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index 560ae37..5354e13 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -91,7 +91,7 @@ static int pci_ich9_ahci_init(PCIDevice *dev)
     uint8_t *sata_cap;
     d = DO_UPCAST(struct AHCIPCIState, card, dev);
 
-    ahci_init(&d->ahci, &dev->qdev, 6);
+    ahci_init(&d->ahci, &dev->qdev, pci_dma_context(dev), 6);
 
     pci_config_set_prog_interface(d->card.config, AHCI_PROGMODE_MAJOR_REV_1);
 
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 09/13] usb: Convert usb_packet_{map, unmap} to universal DMA helpers
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (7 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 08/13] ide/ahci: Use universal DMA helper functions David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure David Gibson
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: rth, mst, agraf, David Gibson, eduard.munteanu

The USB UHCI and EHCI drivers were converted some time ago to use the
pci_dma_*() helper functions.  However, this conversion was not complete
because in some places both these drivers do DMA via the usb_packet_map()
function in usb-libhw.c.  That function directly used
cpu_physical_memory_map().

Now that the sglist code uses DMA wrappers properly, we can convert the
functions in usb-libhw.c, thus conpleting the conversion of UHCI and EHCI
to use the DMA wrappers.

Note that usb_packet_map() invokes dma_memory_map() with a NULL invalidate
callback function.  When IOMMU support is added, this will mean that
usb_packet_map() and the corresponding usb_packet_unmap() must be called in
close proximity without dropping the qemu device lock - otherwise the guest
might invalidate IOMMU mappings while they are still in use by the device
code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/usb-ehci.c  |    4 ++--
 hw/usb-libhw.c |   22 ++++++++++++----------
 hw/usb-uhci.c  |    2 +-
 hw/usb.h       |    2 +-
 4 files changed, 16 insertions(+), 14 deletions(-)

diff --git a/hw/usb-ehci.c b/hw/usb-ehci.c
index afc8ccf..f7dd50e 100644
--- a/hw/usb-ehci.c
+++ b/hw/usb-ehci.c
@@ -1343,7 +1343,7 @@ err:
         set_field(&q->qh.token, q->tbytes, QTD_TOKEN_TBYTES);
     }
     ehci_finish_transfer(q, q->usb_status);
-    usb_packet_unmap(&q->packet);
+    usb_packet_unmap(&q->packet, &q->sgl);
 
     q->qh.token ^= QTD_TOKEN_DTOGGLE;
     q->qh.token &= ~QTD_TOKEN_ACTIVE;
@@ -1464,7 +1464,7 @@ static int ehci_process_itd(EHCIState *ehci,
                 usb_packet_map(&ehci->ipacket, &ehci->isgl);
                 ret = usb_handle_packet(dev, &ehci->ipacket);
                 assert(ret != USB_RET_ASYNC);
-                usb_packet_unmap(&ehci->ipacket);
+                usb_packet_unmap(&ehci->ipacket, &ehci->isgl);
             } else {
                 DPRINTF("ISOCH: attempt to addess non-iso endpoint\n");
                 ret = USB_RET_NAK;
diff --git a/hw/usb-libhw.c b/hw/usb-libhw.c
index 162b42b..b06c8ac 100644
--- a/hw/usb-libhw.c
+++ b/hw/usb-libhw.c
@@ -26,15 +26,16 @@
 
 int usb_packet_map(USBPacket *p, QEMUSGList *sgl)
 {
-    int is_write = (p->pid == USB_TOKEN_IN);
-    target_phys_addr_t len;
+    DMADirection dir = (p->pid == USB_TOKEN_IN) ?
+        DMA_DIRECTION_FROM_DEVICE : DMA_DIRECTION_TO_DEVICE;
+    dma_addr_t len;
     void *mem;
     int i;
 
     for (i = 0; i < sgl->nsg; i++) {
         len = sgl->sg[i].len;
-        mem = cpu_physical_memory_map(sgl->sg[i].base, &len,
-                                      is_write);
+        mem = dma_memory_map(sgl->dma, NULL, NULL,
+                             sgl->sg[i].base, &len, dir);
         if (!mem) {
             goto err;
         }
@@ -46,18 +47,19 @@ int usb_packet_map(USBPacket *p, QEMUSGList *sgl)
     return 0;
 
 err:
-    usb_packet_unmap(p);
+    usb_packet_unmap(p, sgl);
     return -1;
 }
 
-void usb_packet_unmap(USBPacket *p)
+void usb_packet_unmap(USBPacket *p, QEMUSGList *sgl)
 {
-    int is_write = (p->pid == USB_TOKEN_IN);
+    DMADirection dir = (p->pid == USB_TOKEN_IN) ?
+        DMA_DIRECTION_FROM_DEVICE : DMA_DIRECTION_TO_DEVICE;
     int i;
 
     for (i = 0; i < p->iov.niov; i++) {
-        cpu_physical_memory_unmap(p->iov.iov[i].iov_base,
-                                  p->iov.iov[i].iov_len, is_write,
-                                  p->iov.iov[i].iov_len);
+        dma_memory_unmap(sgl->dma, p->iov.iov[i].iov_base,
+                         p->iov.iov[i].iov_len, dir,
+                         p->iov.iov[i].iov_len);
     }
 }
diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c
index 70e3881..a7d2dbc 100644
--- a/hw/usb-uhci.c
+++ b/hw/usb-uhci.c
@@ -863,7 +863,7 @@ static int uhci_handle_td(UHCIState *s, uint32_t addr, UHCI_TD *td, uint32_t *in
 
 done:
     len = uhci_complete_td(s, td, async, int_mask);
-    usb_packet_unmap(&async->packet);
+    usb_packet_unmap(&async->packet, &async->sgl);
     uhci_async_free(async);
     return len;
 }
diff --git a/hw/usb.h b/hw/usb.h
index 8e83697..720a045 100644
--- a/hw/usb.h
+++ b/hw/usb.h
@@ -336,7 +336,7 @@ void usb_packet_set_state(USBPacket *p, USBPacketState state);
 void usb_packet_setup(USBPacket *p, int pid, USBEndpoint *ep);
 void usb_packet_addbuf(USBPacket *p, void *ptr, size_t len);
 int usb_packet_map(USBPacket *p, QEMUSGList *sgl);
-void usb_packet_unmap(USBPacket *p);
+void usb_packet_unmap(USBPacket *p, QEMUSGList *sgl);
 void usb_packet_copy(USBPacket *p, void *ptr, size_t bytes);
 void usb_packet_skip(USBPacket *p, size_t bytes);
 void usb_packet_cleanup(USBPacket *p);
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (8 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 09/13] usb: Convert usb_packet_{map, unmap} to universal DMA helpers David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09 10:23   ` Paolo Bonzini
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 11/13] pseries: Convert sPAPR TCEs to use generic IOMMU infrastructure David Gibson
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: rth, mst, agraf, David Gibson, eduard.munteanu

This patch adds the basic infrastructure necessary to emulate an IOMMU
visible to the guest.  The DMAContext structure is extended with
information and a callback describing the translation, and the various
DMA functions used by devices will now perform IOMMU translation using
this callback.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>

Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 configure     |   12 ++++
 dma-helpers.c |  191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 dma.h         |  125 +++++++++++++++++++++++++++++++-------
 3 files changed, 306 insertions(+), 22 deletions(-)

diff --git a/configure b/configure
index a5eb832..e6fba2f 100755
--- a/configure
+++ b/configure
@@ -138,6 +138,7 @@ linux_aio=""
 cap_ng=""
 attr=""
 libattr=""
+iommu="yes"
 xfs=""
 
 vhost_net="no"
@@ -784,6 +785,10 @@ for opt do
   ;;
   --enable-vhost-net) vhost_net="yes"
   ;;
+  --enable-iommu) iommu="yes"
+  ;;
+  --disable-iommu) iommu="no"
+  ;;
   --disable-opengl) opengl="no"
   ;;
   --enable-opengl) opengl="yes"
@@ -1085,6 +1090,8 @@ echo "  --enable-docs            enable documentation build"
 echo "  --disable-docs           disable documentation build"
 echo "  --disable-vhost-net      disable vhost-net acceleration support"
 echo "  --enable-vhost-net       enable vhost-net acceleration support"
+echo "  --disable-iommu          disable IOMMU emulation support"
+echo "  --enable-iommu           enable IOMMU emulation support"
 echo "  --enable-trace-backend=B Set trace backend"
 echo "                           Available backends:" $("$source_path"/scripts/tracetool --list-backends)
 echo "  --with-trace-file=NAME   Full PATH,NAME of file to store traces"
@@ -2924,6 +2931,7 @@ echo "posix_madvise     $posix_madvise"
 echo "uuid support      $uuid"
 echo "libcap-ng support $cap_ng"
 echo "vhost-net support $vhost_net"
+echo "IOMMU support     $iommu"
 echo "Trace backend     $trace_backend"
 echo "Trace output file $trace_file-<pid>"
 echo "spice support     $spice"
@@ -3801,6 +3809,10 @@ if test "$target_softmmu" = "yes" -a \( \
   echo "CONFIG_NEED_MMU=y" >> $config_target_mak
 fi
 
+if test "$iommu" = "yes" ; then
+  echo "CONFIG_IOMMU=y" >> $config_host_mak
+fi
+
 if test "$gprof" = "yes" ; then
   echo "TARGET_GPROF=yes" >> $config_target_mak
   if test "$target_linux_user" = "yes" ; then
diff --git a/dma-helpers.c b/dma-helpers.c
index 9dcfb2c..248575d 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -10,6 +10,9 @@
 #include "dma.h"
 #include "block_int.h"
 #include "trace.h"
+#include "range.h"
+
+/* #define DEBUG_IOMMU */
 
 void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma)
 {
@@ -255,3 +258,191 @@ void dma_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
 {
     bdrv_acct_start(bs, cookie, sg->size, type);
 }
+
+#ifdef CONFIG_IOMMU
+bool __dma_memory_valid(DMAContext *dma, dma_addr_t addr, dma_addr_t len,
+                        DMADirection dir)
+{
+    target_phys_addr_t paddr, plen;
+
+#ifdef DEBUG_IOMMU
+    fprintf(stderr, "dma_memory_check context=%p addr=0x" DMA_ADDR_FMT
+            " len=0x" DMA_ADDR_FMT " dir=%d\n", dma, addr, len, dir);
+#endif
+
+    while (len) {
+        if (dma->translate(dma, addr, &paddr, &plen, dir) != 0) {
+            return false;
+        }
+
+        /* The translation might be valid for larger regions. */
+        if (plen > len) {
+            plen = len;
+        }
+
+        len -= plen;
+        addr += plen;
+    }
+
+    return true;
+}
+
+int __dma_memory_rw(DMAContext *dma, dma_addr_t addr,
+                    void *buf, dma_addr_t len, DMADirection dir)
+{
+    target_phys_addr_t paddr, plen;
+    int err;
+
+#ifdef DEBUG_IOMMU
+    fprintf(stderr, "dma_memory_rw context=%p addr=0x" DMA_ADDR_FMT " len=0x"
+            DMA_ADDR_FMT " dir=%d\n", dma, addr, len, dir);
+#endif
+
+    while (len) {
+        err = dma->translate(dma, addr, &paddr, &plen, dir);
+        if (err) {
+            return -1;
+        }
+
+        /* The translation might be valid for larger regions. */
+        if (plen > len) {
+            plen = len;
+        }
+
+        cpu_physical_memory_rw(paddr, buf, plen,
+                               dir == DMA_DIRECTION_FROM_DEVICE);
+
+        len -= plen;
+        addr += plen;
+        buf += plen;
+    }
+
+    return 0;
+}
+
+int __dma_memory_zero(DMAContext *dma, dma_addr_t addr, dma_addr_t len)
+{
+    target_phys_addr_t paddr, plen;
+    int err;
+
+#ifdef DEBUG_IOMMU
+    fprintf(stderr, "dma_memory_zero context=%p addr=0x" DMA_ADDR_FMT
+            " len=0x" DMA_ADDR_FMT "\n", dma, addr, len);
+#endif
+
+    while (len) {
+        err = dma->translate(dma, addr, &paddr, &plen,
+                             DMA_DIRECTION_FROM_DEVICE);
+        if (err) {
+            return err;
+        }
+
+        /* The translation might be valid for larger regions. */
+        if (plen > len) {
+            plen = len;
+        }
+
+        cpu_physical_memory_zero(paddr, plen);
+
+        len -= plen;
+        addr += plen;
+    }
+
+    return 0;
+}
+
+typedef struct DMAMemoryMap DMAMemoryMap;
+struct DMAMemoryMap {
+    dma_addr_t              addr;
+    size_t                  len;
+    void                    *buf;
+    DMAInvalidateMapFunc    *invalidate;
+    void                    *invalidate_opaque;
+
+    QLIST_ENTRY(DMAMemoryMap) list;
+};
+
+void dma_context_init(DMAContext *dma, DMATranslateFunc fn)
+{
+#ifdef DEBUG_IOMMU
+    fprintf(stderr, "dma_context_init(%p, %p)\n", dma, fn);
+#endif
+    dma->translate = fn;
+    QLIST_INIT(&dma->memory_maps);
+}
+
+void dma_invalidate_memory_range(DMAContext *dma,
+                                 dma_addr_t addr, dma_addr_t len)
+{
+    DMAMemoryMap *map;
+
+    QLIST_FOREACH(map, &dma->memory_maps, list) {
+        if (ranges_overlap(addr, len, map->addr, map->len)) {
+            map->invalidate(map->invalidate_opaque);
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+
+void *__dma_memory_map(DMAContext *dma, DMAInvalidateMapFunc *cb,
+                       void *cb_opaque, dma_addr_t addr, dma_addr_t *len,
+                       DMADirection dir)
+{
+    int err;
+    target_phys_addr_t paddr, plen;
+    void *buf;
+
+    plen = *len;
+    err = dma->translate(dma, addr, &paddr, &plen, dir);
+    if (err) {
+        return NULL;
+    }
+
+    /*
+     * If this is true, the virtual region is contiguous,
+     * but the translated physical region isn't. We just
+     * clamp *len, much like cpu_physical_memory_map() does.
+     */
+    if (plen < *len) {
+        *len = plen;
+    }
+
+    buf = cpu_physical_memory_map(paddr, &plen,
+                                  dir == DMA_DIRECTION_FROM_DEVICE);
+    *len = plen;
+
+    /* We treat maps as remote TLBs to cope with stuff like AIO. */
+    if (cb) {
+        DMAMemoryMap *map;
+
+        map = g_malloc(sizeof(DMAMemoryMap));
+        map->addr = addr;
+        map->len = *len;
+        map->buf = buf;
+        map->invalidate = cb;
+        map->invalidate_opaque = cb_opaque;
+
+        QLIST_INSERT_HEAD(&dma->memory_maps, map, list);
+    }
+
+    return buf;
+}
+
+void __dma_memory_unmap(DMAContext *dma, void *buffer, dma_addr_t len,
+                        DMADirection dir, dma_addr_t access_len)
+{
+    DMAMemoryMap *map;
+
+    cpu_physical_memory_unmap(buffer, len,
+                              dir == DMA_DIRECTION_FROM_DEVICE,
+                              access_len);
+
+    QLIST_FOREACH(map, &dma->memory_maps, list) {
+        if ((map->buf == buffer) && (map->len == len)) {
+            QLIST_REMOVE(map, list);
+            free(map);
+        }
+    }
+}
+#endif /* CONFIG_IOMMU */
diff --git a/dma.h b/dma.h
index a66e3d7..ec06163 100644
--- a/dma.h
+++ b/dma.h
@@ -15,6 +15,7 @@
 #include "hw/hw.h"
 #include "block.h"
 
+typedef struct DMAContext DMAContext;
 typedef struct ScatterGatherEntry ScatterGatherEntry;
 
 typedef enum {
@@ -31,30 +32,82 @@ struct QEMUSGList {
 };
 
 #if defined(TARGET_PHYS_ADDR_BITS)
+
+#ifdef CONFIG_IOMMU
+/*
+ * When an IOMMU is present, bus addresses become distinct from
+ * CPU/memory physical addresses and may be a different size.  Because
+ * the IOVA size depends more on the bus than on the platform, we more
+ * or less have to treat these as 64-bit always to cover all (or at
+ * least most) cases.
+ */
+typedef uint64_t dma_addr_t;
+
+#define DMA_ADDR_BITS 64
+#define DMA_ADDR_FMT "%" PRIx64
+#else
 typedef target_phys_addr_t dma_addr_t;
 
 #define DMA_ADDR_BITS TARGET_PHYS_ADDR_BITS
 #define DMA_ADDR_FMT TARGET_FMT_plx
+#endif
+
+typedef int DMATranslateFunc(DMAContext *dma,
+                             dma_addr_t addr,
+                             target_phys_addr_t *paddr,
+                             target_phys_addr_t *len,
+                             DMADirection dir);
+
+typedef struct DMAContext {
+#ifdef CONFIG_IOMMU
+    DMATranslateFunc *translate;
+    QLIST_HEAD(memory_maps, DMAMemoryMap) memory_maps;
+#endif
+} DMAContext;
+
+#ifdef CONFIG_IOMMU
+static inline bool dma_has_iommu(DMAContext *dma)
+{
+    return !!dma;
+}
+#else
+static inline bool dma_has_iommu(DMAContext *dma)
+{
+    return false;
+}
+#endif
 
 typedef void DMAInvalidateMapFunc(void *);
 
 /* Checks that the given range of addresses is valid for DMA.  This is
  * useful for certain cases, but usually you should just use
  * dma_memory_{read,write}() and check for errors */
-static inline bool dma_memory_valid(DMAContext *dma, dma_addr_t addr,
-                                    dma_addr_t len, DMADirection dir)
+bool __dma_memory_valid(DMAContext *dma, dma_addr_t addr, dma_addr_t len,
+                        DMADirection dir);
+static inline bool dma_memory_valid(DMAContext *dma,
+                                    dma_addr_t addr, dma_addr_t len,
+                                    DMADirection dir)
 {
-    /* Stub version, with no iommu we assume all bus addresses are valid */
-    return true;
+    if (!dma_has_iommu(dma)) {
+        return true;
+    } else {
+        return __dma_memory_valid(dma, addr, len, dir);
+    }
 }
 
+int __dma_memory_rw(DMAContext *dma, dma_addr_t addr,
+                    void *buf, dma_addr_t len, DMADirection dir);
 static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
                                 void *buf, dma_addr_t len, DMADirection dir)
 {
-    /* Stub version when we have no iommu support */
-    cpu_physical_memory_rw(addr, buf, (target_phys_addr_t)len,
-                           dir == DMA_DIRECTION_FROM_DEVICE);
-    return 0;
+    if (!dma_has_iommu(dma)) {
+        /* Fast-path for no IOMMU */
+        cpu_physical_memory_rw(addr, buf, len,
+                               dir == DMA_DIRECTION_FROM_DEVICE);
+        return 0;
+    } else {
+        return __dma_memory_rw(dma, addr, buf, len, dir);
+    }
 }
 
 static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
@@ -70,35 +123,55 @@ static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
                          DMA_DIRECTION_FROM_DEVICE);
 }
 
+int __dma_memory_zero(DMAContext *dma, dma_addr_t addr, dma_addr_t len);
 static inline int dma_memory_zero(DMAContext *dma, dma_addr_t addr,
                                   dma_addr_t len)
 {
-    /* Stub version when we have no iommu support */
-    cpu_physical_memory_zero(addr, len);
-    return 0;
+    if (!dma_has_iommu(dma)) {
+        /* Fast-path for no IOMMU */
+        cpu_physical_memory_zero(addr, len);
+        return 0;
+    } else {
+        return __dma_memory_zero(dma, addr, len);
+    }
 }
 
+void *__dma_memory_map(DMAContext *dma,
+                       DMAInvalidateMapFunc *cb, void *opaque,
+                       dma_addr_t addr, dma_addr_t *len,
+                       DMADirection dir);
 static inline void *dma_memory_map(DMAContext *dma,
-                                   DMAInvalidateMapFunc *cb, void *opaque,
+                                   DMAInvalidateMapFunc *cb, void *cb_opaque,
                                    dma_addr_t addr, dma_addr_t *len,
                                    DMADirection dir)
 {
-    target_phys_addr_t xlen = *len;
-    void *p;
-
-    p = cpu_physical_memory_map(addr, &xlen,
-                                dir == DMA_DIRECTION_FROM_DEVICE);
-    *len = xlen;
-    return p;
+    if (!dma_has_iommu(dma)) {
+        target_phys_addr_t xlen = *len;
+        void *p;
+
+        p = cpu_physical_memory_map(addr, &xlen,
+                                    dir == DMA_DIRECTION_FROM_DEVICE);
+        *len = xlen;
+        return p;
+    } else {
+        return __dma_memory_map(dma, cb, cb_opaque, addr, len, dir);
+    }
 }
 
+void __dma_memory_unmap(DMAContext *dma,
+                        void *buffer, dma_addr_t len,
+                        DMADirection dir, dma_addr_t access_len);
 static inline void dma_memory_unmap(DMAContext *dma,
                                     void *buffer, dma_addr_t len,
                                     DMADirection dir, dma_addr_t access_len)
 {
-    return cpu_physical_memory_unmap(buffer, (target_phys_addr_t)len,
-                                     dir == DMA_DIRECTION_FROM_DEVICE,
-                                     access_len);
+    if (!dma_has_iommu(dma)) {
+        return cpu_physical_memory_unmap(buffer, (target_phys_addr_t)len,
+                                         dir == DMA_DIRECTION_FROM_DEVICE,
+                                         access_len);
+    } else {
+        __dma_memory_unmap(dma, buffer, len, dir, access_len);
+    }
 }
 
 #define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \
@@ -139,6 +212,14 @@ DEFINE_LDST_DMA(q, q, 64, be);
 
 #undef DEFINE_LDST_DMA
 
+#ifdef CONFIG_IOMMU
+
+void dma_context_init(DMAContext *dma, DMATranslateFunc fn);
+void dma_invalidate_memory_range(DMAContext *dma,
+                                 dma_addr_t addr, dma_addr_t len);
+
+#endif /* CONFIG_IOMMU */
+
 struct ScatterGatherEntry {
     dma_addr_t base;
     dma_addr_t len;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 11/13] pseries: Convert sPAPR TCEs to use generic IOMMU infrastructure
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (9 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 12/13] iommu: Allow PCI to use " David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 13/13] pseries: Implement IOMMU and DMA for PAPR PCI devices David Gibson
  12 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: rth, mst, agraf, David Gibson, eduard.munteanu

The pseries platform already contains an IOMMU implementation, since it is
essential for the platform's paravirtualized VIO devices.  This IOMMU
support is currently built into the implementation of the VIO "bus" and
the various VIO devices.

This patch converts this code to make use of the new common IOMMU
infrastructure.

Cc: Alex Graf <agraf@suse.de>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 Makefile.target  |    2 +-
 configure        |    2 +-
 hw/spapr.c       |    3 +
 hw/spapr.h       |   16 +++
 hw/spapr_iommu.c |  242 ++++++++++++++++++++++++++++++++++++++++++++++
 hw/spapr_llan.c  |   61 ++++++------
 hw/spapr_vio.c   |  282 ++++--------------------------------------------------
 hw/spapr_vio.h   |   71 +++++++-------
 hw/spapr_vscsi.c |   24 +++--
 target-ppc/kvm.c |    4 +-
 10 files changed, 365 insertions(+), 342 deletions(-)
 create mode 100644 hw/spapr_iommu.c

diff --git a/Makefile.target b/Makefile.target
index 5f3fc40..b7bc3f6 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -257,7 +257,7 @@ obj-ppc-y += ppc_oldworld.o
 # NewWorld PowerMac
 obj-ppc-y += ppc_newworld.o
 # IBM pSeries (sPAPR)
-obj-ppc-$(CONFIG_PSERIES) += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o
+obj-ppc-$(CONFIG_PSERIES) += spapr.o spapr_hcall.o spapr_rtas.o spapr_vio.o spapr_iommu.o
 obj-ppc-$(CONFIG_PSERIES) += xics.o spapr_vty.o spapr_llan.o spapr_vscsi.o
 obj-ppc-$(CONFIG_PSERIES) += spapr_pci.o device-hotplug.o pci-hotplug.o
 # PowerPC 4xx boards
diff --git a/configure b/configure
index e6fba2f..29f0997 100755
--- a/configure
+++ b/configure
@@ -3640,7 +3640,7 @@ case "$target_arch2" in
       fi
     fi
 esac
-if test "$target_arch2" = "ppc64" -a "$fdt" = "yes"; then
+if test "$target_arch2" = "ppc64" -a "$fdt" = "yes" -a "$iommu" = "yes" ; then
   echo "CONFIG_PSERIES=y" >> $config_target_mak
 fi
 if test "$target_bigendian" = "yes" ; then
diff --git a/hw/spapr.c b/hw/spapr.c
index 106c896..ac1da63 100644
--- a/hw/spapr.c
+++ b/hw/spapr.c
@@ -619,6 +619,9 @@ static void ppc_spapr_init(ram_addr_t ram_size,
     spapr->icp = xics_system_init(XICS_IRQS);
     spapr->next_irq = 16;
 
+    /* Set up IOMMU */
+    spapr_iommu_init();
+
     /* Set up VIO bus */
     spapr->vio_bus = spapr_vio_bus_init();
 
diff --git a/hw/spapr.h b/hw/spapr.h
index 689f255..210b868 100644
--- a/hw/spapr.h
+++ b/hw/spapr.h
@@ -319,4 +319,20 @@ target_ulong spapr_rtas_call(sPAPREnvironment *spapr,
 int spapr_rtas_device_tree_setup(void *fdt, target_phys_addr_t rtas_addr,
                                  target_phys_addr_t rtas_size);
 
+#define SPAPR_TCE_PAGE_SHIFT   12
+#define SPAPR_TCE_PAGE_SIZE    (1ULL << SPAPR_TCE_PAGE_SHIFT)
+#define SPAPR_TCE_PAGE_MASK    (SPAPR_TCE_PAGE_SIZE - 1)
+
+typedef struct sPAPRTCE {
+    uint64_t tce;
+} sPAPRTCE;
+
+#define SPAPR_VIO_BASE_LIOBN    0x00000000
+
+void spapr_iommu_init(void);
+DMAContext *spapr_tce_new_dma_context(uint32_t liobn, size_t window_size);
+void spapr_tce_free(DMAContext *dma);
+int spapr_dma_dt(void *fdt, int node_off, const char *propname,
+                 DMAContext *dma);
+
 #endif /* !defined (__HW_SPAPR_H__) */
diff --git a/hw/spapr_iommu.c b/hw/spapr_iommu.c
new file mode 100644
index 0000000..4bec791
--- /dev/null
+++ b/hw/spapr_iommu.c
@@ -0,0 +1,242 @@
+/*
+ * QEMU sPAPR IOMMU (TCE) code
+ *
+ * Copyright (c) 2010 David Gibson, IBM Corporation <dwg@au1.ibm.com>
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+#include "hw.h"
+#include "kvm.h"
+#include "qdev.h"
+#include "kvm_ppc.h"
+#include "dma.h"
+
+#include "hw/spapr.h"
+
+#include <libfdt.h>
+
+/* #define DEBUG_TCE */
+
+enum sPAPRTCEAccess {
+    SPAPR_TCE_FAULT = 0,
+    SPAPR_TCE_RO = 1,
+    SPAPR_TCE_WO = 2,
+    SPAPR_TCE_RW = 3,
+};
+
+typedef struct sPAPRTCETable sPAPRTCETable;
+
+struct sPAPRTCETable {
+    DMAContext dma;
+    uint32_t liobn;
+    uint32_t window_size;
+    sPAPRTCE *table;
+    int fd;
+    QLIST_ENTRY(sPAPRTCETable) list;
+};
+
+
+QLIST_HEAD(spapr_tce_tables, sPAPRTCETable) spapr_tce_tables;
+
+static sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn)
+{
+    sPAPRTCETable *tcet;
+
+    QLIST_FOREACH(tcet, &spapr_tce_tables, list) {
+        if (tcet->liobn == liobn) {
+            return tcet;
+        }
+    }
+
+    return NULL;
+}
+
+static int spapr_tce_translate(DMAContext *dma,
+                               dma_addr_t addr,
+                               target_phys_addr_t *paddr,
+                               target_phys_addr_t *len,
+                               DMADirection dir)
+{
+    sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
+    enum sPAPRTCEAccess access = (dir == DMA_DIRECTION_FROM_DEVICE)
+        ? SPAPR_TCE_WO : SPAPR_TCE_RO;
+    uint64_t tce;
+
+#ifdef DEBUG_TCE
+    fprintf(stderr, "spapr_tce_translate liobn=0x%" PRIx32 " addr=0x"
+            DMA_ADDR_FMT "\n", tcet->liobn, addr);
+#endif
+
+    /* Check if we are in bound */
+    if (addr >= tcet->window_size) {
+#ifdef DEBUG_TCE
+        fprintf(stderr, "spapr_tce_translate out of bounds\n");
+#endif
+        return -EFAULT;
+    }
+
+    tce = tcet->table[addr >> SPAPR_TCE_PAGE_SHIFT].tce;
+
+    /* Check TCE */
+    if (!(tce & access)) {
+        return -EPERM;
+    }
+
+    /* How much til end of page ? */
+    *len = ((~addr) & SPAPR_TCE_PAGE_MASK) + 1;
+
+    /* Translate */
+    *paddr = (tce & ~SPAPR_TCE_PAGE_MASK) |
+        (addr & SPAPR_TCE_PAGE_MASK);
+
+#ifdef DEBUG_TCE
+    fprintf(stderr, " ->  *paddr=0x" TARGET_FMT_plx ", *len=0x"
+            TARGET_FMT_plx "\n", *paddr, *len);
+#endif
+
+    return 0;
+}
+
+DMAContext *spapr_tce_new_dma_context(uint32_t liobn, size_t window_size)
+{
+    sPAPRTCETable *tcet;
+
+    if (!window_size) {
+        return NULL;
+    }
+
+    tcet = g_malloc0(sizeof(*tcet));
+    dma_context_init(&tcet->dma, spapr_tce_translate);
+
+    tcet->liobn = liobn;
+    tcet->window_size = window_size;
+
+    if (kvm_enabled()) {
+        tcet->table = kvmppc_create_spapr_tce(liobn,
+                                              window_size,
+                                              &tcet->fd);
+    }
+
+    if (!tcet->table) {
+        size_t table_size = (window_size >> SPAPR_TCE_PAGE_SHIFT)
+            * sizeof(sPAPRTCE);
+        tcet->table = g_malloc0(table_size);
+    }
+
+#ifdef DEBUG_TCE
+    fprintf(stderr, "spapr_iommu: New TCE table, liobn=0x%x, context @ %p, "
+            "table @ %p, fd=%d\n", liobn, &tcet->dma, tcet->table, tcet->fd);
+#endif
+
+    QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
+
+    return &tcet->dma;
+}
+
+void spapr_tce_free(DMAContext *dma)
+{
+
+    if (dma) {
+        sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
+
+        QLIST_REMOVE(tcet, list);
+
+        if (!kvm_enabled() ||
+            (kvmppc_remove_spapr_tce(tcet->table, tcet->fd,
+                                     tcet->window_size) != 0)) {
+            g_free(tcet->table);
+        }
+
+        g_free(tcet);
+    }
+}
+
+
+static target_ulong h_put_tce(CPUState *env, sPAPREnvironment *spapr,
+                              target_ulong opcode, target_ulong *args)
+{
+    target_ulong liobn = args[0];
+    target_ulong ioba = args[1];
+    target_ulong tce = args[2];
+    sPAPRTCETable *tcet = spapr_tce_find_by_liobn(liobn);
+    sPAPRTCE *tcep;
+
+    if (liobn & 0xFFFFFFFF00000000ULL) {
+        hcall_dprintf("spapr_vio_put_tce on out-of-boundsw LIOBN "
+                      TARGET_FMT_lx "\n", liobn);
+        return H_PARAMETER;
+    }
+    if (!tcet) {
+        hcall_dprintf("spapr_vio_put_tce on non-existent LIOBN "
+                      TARGET_FMT_lx "\n", liobn);
+        return H_PARAMETER;
+    }
+
+    ioba &= ~(SPAPR_TCE_PAGE_SIZE - 1);
+
+#ifdef DEBUG_TCE
+    fprintf(stderr, "spapr_vio_put_tce on liobn=" TARGET_FMT_lx /*%s*/
+            "  ioba 0x" TARGET_FMT_lx "  TCE 0x" TARGET_FMT_lx "\n",
+            liobn, /*dev->qdev.id, */ioba, tce);
+#endif
+
+    if (ioba >= tcet->window_size) {
+        hcall_dprintf("spapr_vio_put_tce on out-of-boards IOBA 0x"
+                      TARGET_FMT_lx "\n", ioba);
+        return H_PARAMETER;
+    }
+
+    tcep = tcet->table + (ioba >> SPAPR_TCE_PAGE_SHIFT);
+    tcep->tce = tce;
+
+    return H_SUCCESS;
+}
+
+void spapr_iommu_init(void)
+{
+    QLIST_INIT(&spapr_tce_tables);
+
+    /* hcall-tce */
+    spapr_register_hypercall(H_PUT_TCE, h_put_tce);
+}
+
+int spapr_dma_dt(void *fdt, int node_off, const char *propname,
+                 DMAContext *dma)
+{
+    if (dma) {
+        sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
+        uint32_t dma_prop[] = {cpu_to_be32(tcet->liobn),
+                               0, 0,
+                               0, cpu_to_be32(tcet->window_size)};
+        int ret;
+
+        ret = fdt_setprop_cell(fdt, node_off, "ibm,#dma-address-cells", 2);
+        if (ret < 0) {
+            return ret;
+        }
+
+        ret = fdt_setprop_cell(fdt, node_off, "ibm,#dma-size-cells", 2);
+        if (ret < 0) {
+            return ret;
+        }
+
+        ret = fdt_setprop(fdt, node_off, propname, dma_prop,
+                          sizeof(dma_prop));
+        if (ret < 0) {
+            return ret;
+        }
+    }
+
+    return 0;
+}
diff --git a/hw/spapr_llan.c b/hw/spapr_llan.c
index 77d4047..55f639d 100644
--- a/hw/spapr_llan.c
+++ b/hw/spapr_llan.c
@@ -71,7 +71,7 @@ typedef uint64_t vlan_bd_t;
 #define VLAN_RXQ_BD_OFF      0
 #define VLAN_FILTER_BD_OFF   8
 #define VLAN_RX_BDS_OFF      16
-#define VLAN_MAX_BUFS        ((SPAPR_VIO_TCE_PAGE_SIZE - VLAN_RX_BDS_OFF) / 8)
+#define VLAN_MAX_BUFS        ((SPAPR_TCE_PAGE_SIZE - VLAN_RX_BDS_OFF) / 8)
 
 typedef struct VIOsPAPRVLANDevice {
     VIOsPAPRDevice sdev;
@@ -95,7 +95,7 @@ static ssize_t spapr_vlan_receive(VLANClientState *nc, const uint8_t *buf,
 {
     VIOsPAPRDevice *sdev = DO_UPCAST(NICState, nc, nc)->opaque;
     VIOsPAPRVLANDevice *dev = (VIOsPAPRVLANDevice *)sdev;
-    vlan_bd_t rxq_bd = ldq_tce(sdev, dev->buf_list + VLAN_RXQ_BD_OFF);
+    vlan_bd_t rxq_bd = vio_ldq(sdev, dev->buf_list + VLAN_RXQ_BD_OFF);
     vlan_bd_t bd;
     int buf_ptr = dev->use_buf_ptr;
     uint64_t handle;
@@ -114,11 +114,11 @@ static ssize_t spapr_vlan_receive(VLANClientState *nc, const uint8_t *buf,
 
     do {
         buf_ptr += 8;
-        if (buf_ptr >= SPAPR_VIO_TCE_PAGE_SIZE) {
+        if (buf_ptr >= SPAPR_TCE_PAGE_SIZE) {
             buf_ptr = VLAN_RX_BDS_OFF;
         }
 
-        bd = ldq_tce(sdev, dev->buf_list + buf_ptr);
+        bd = vio_ldq(sdev, dev->buf_list + buf_ptr);
         dprintf("use_buf_ptr=%d bd=0x%016llx\n",
                 buf_ptr, (unsigned long long)bd);
     } while ((!(bd & VLAN_BD_VALID) || (VLAN_BD_LEN(bd) < (size + 8)))
@@ -132,12 +132,12 @@ static ssize_t spapr_vlan_receive(VLANClientState *nc, const uint8_t *buf,
     /* Remove the buffer from the pool */
     dev->rx_bufs--;
     dev->use_buf_ptr = buf_ptr;
-    stq_tce(sdev, dev->buf_list + dev->use_buf_ptr, 0);
+    vio_stq(sdev, dev->buf_list + dev->use_buf_ptr, 0);
 
     dprintf("Found buffer: ptr=%d num=%d\n", dev->use_buf_ptr, dev->rx_bufs);
 
     /* Transfer the packet data */
-    if (spapr_tce_dma_write(sdev, VLAN_BD_ADDR(bd) + 8, buf, size) < 0) {
+    if (spapr_vio_dma_write(sdev, VLAN_BD_ADDR(bd) + 8, buf, size) < 0) {
         return -1;
     }
 
@@ -149,23 +149,23 @@ static ssize_t spapr_vlan_receive(VLANClientState *nc, const uint8_t *buf,
         control ^= VLAN_RXQC_TOGGLE;
     }
 
-    handle = ldq_tce(sdev, VLAN_BD_ADDR(bd));
-    stq_tce(sdev, VLAN_BD_ADDR(rxq_bd) + dev->rxq_ptr + 8, handle);
-    stw_tce(sdev, VLAN_BD_ADDR(rxq_bd) + dev->rxq_ptr + 4, size);
-    sth_tce(sdev, VLAN_BD_ADDR(rxq_bd) + dev->rxq_ptr + 2, 8);
-    stb_tce(sdev, VLAN_BD_ADDR(rxq_bd) + dev->rxq_ptr, control);
+    handle = vio_ldq(sdev, VLAN_BD_ADDR(bd));
+    vio_stq(sdev, VLAN_BD_ADDR(rxq_bd) + dev->rxq_ptr + 8, handle);
+    vio_stl(sdev, VLAN_BD_ADDR(rxq_bd) + dev->rxq_ptr + 4, size);
+    vio_sth(sdev, VLAN_BD_ADDR(rxq_bd) + dev->rxq_ptr + 2, 8);
+    vio_stb(sdev, VLAN_BD_ADDR(rxq_bd) + dev->rxq_ptr, control);
 
     dprintf("wrote rxq entry (ptr=0x%llx): 0x%016llx 0x%016llx\n",
             (unsigned long long)dev->rxq_ptr,
-            (unsigned long long)ldq_tce(sdev, VLAN_BD_ADDR(rxq_bd) +
+            (unsigned long long)vio_ldq(sdev, VLAN_BD_ADDR(rxq_bd) +
                                         dev->rxq_ptr),
-            (unsigned long long)ldq_tce(sdev, VLAN_BD_ADDR(rxq_bd) +
+            (unsigned long long)vio_ldq(sdev, VLAN_BD_ADDR(rxq_bd) +
                                         dev->rxq_ptr + 8));
 
     dev->rxq_ptr += 16;
     if (dev->rxq_ptr >= VLAN_BD_LEN(rxq_bd)) {
         dev->rxq_ptr = 0;
-        stq_tce(sdev, dev->buf_list + VLAN_RXQ_BD_OFF, rxq_bd ^ VLAN_BD_TOGGLE);
+        vio_stq(sdev, dev->buf_list + VLAN_RXQ_BD_OFF, rxq_bd ^ VLAN_BD_TOGGLE);
     }
 
     if (sdev->signal_state & 1) {
@@ -246,8 +246,10 @@ static int check_bd(VIOsPAPRVLANDevice *dev, vlan_bd_t bd,
         return -1;
     }
 
-    if (spapr_vio_check_tces(&dev->sdev, VLAN_BD_ADDR(bd),
-                             VLAN_BD_LEN(bd), SPAPR_TCE_RW) != 0) {
+    if (!spapr_vio_dma_valid(&dev->sdev, VLAN_BD_ADDR(bd),
+                             VLAN_BD_LEN(bd), DMA_DIRECTION_FROM_DEVICE)
+        || !spapr_vio_dma_valid(&dev->sdev, VLAN_BD_ADDR(bd),
+                                VLAN_BD_LEN(bd), DMA_DIRECTION_TO_DEVICE)) {
         return -1;
     }
 
@@ -277,15 +279,15 @@ static target_ulong h_register_logical_lan(CPUState *env,
         return H_RESOURCE;
     }
 
-    if (check_bd(dev, VLAN_VALID_BD(buf_list, SPAPR_VIO_TCE_PAGE_SIZE),
-                 SPAPR_VIO_TCE_PAGE_SIZE) < 0) {
+    if (check_bd(dev, VLAN_VALID_BD(buf_list, SPAPR_TCE_PAGE_SIZE),
+                 SPAPR_TCE_PAGE_SIZE) < 0) {
         hcall_dprintf("Bad buf_list 0x" TARGET_FMT_lx " for "
                       "H_REGISTER_LOGICAL_LAN\n", buf_list);
         return H_PARAMETER;
     }
 
-    filter_list_bd = VLAN_VALID_BD(filter_list, SPAPR_VIO_TCE_PAGE_SIZE);
-    if (check_bd(dev, filter_list_bd, SPAPR_VIO_TCE_PAGE_SIZE) < 0) {
+    filter_list_bd = VLAN_VALID_BD(filter_list, SPAPR_TCE_PAGE_SIZE);
+    if (check_bd(dev, filter_list_bd, SPAPR_TCE_PAGE_SIZE) < 0) {
         hcall_dprintf("Bad filter_list 0x" TARGET_FMT_lx " for "
                       "H_REGISTER_LOGICAL_LAN\n", filter_list);
         return H_PARAMETER;
@@ -303,17 +305,17 @@ static target_ulong h_register_logical_lan(CPUState *env,
     rec_queue &= ~VLAN_BD_TOGGLE;
 
     /* Initialize the buffer list */
-    stq_tce(sdev, buf_list, rec_queue);
-    stq_tce(sdev, buf_list + 8, filter_list_bd);
-    spapr_tce_dma_zero(sdev, buf_list + VLAN_RX_BDS_OFF,
-                       SPAPR_VIO_TCE_PAGE_SIZE - VLAN_RX_BDS_OFF);
+    vio_stq(sdev, buf_list, rec_queue);
+    vio_stq(sdev, buf_list + 8, filter_list_bd);
+    spapr_vio_dma_zero(sdev, buf_list + VLAN_RX_BDS_OFF,
+                       SPAPR_TCE_PAGE_SIZE - VLAN_RX_BDS_OFF);
     dev->add_buf_ptr = VLAN_RX_BDS_OFF - 8;
     dev->use_buf_ptr = VLAN_RX_BDS_OFF - 8;
     dev->rx_bufs = 0;
     dev->rxq_ptr = 0;
 
     /* Initialize the receive queue */
-    spapr_tce_dma_zero(sdev, VLAN_BD_ADDR(rec_queue), VLAN_BD_LEN(rec_queue));
+    spapr_vio_dma_zero(sdev, VLAN_BD_ADDR(rec_queue), VLAN_BD_LEN(rec_queue));
 
     dev->isopen = 1;
     return H_SUCCESS;
@@ -374,14 +376,14 @@ static target_ulong h_add_logical_lan_buffer(CPUState *env,
 
     do {
         dev->add_buf_ptr += 8;
-        if (dev->add_buf_ptr >= SPAPR_VIO_TCE_PAGE_SIZE) {
+        if (dev->add_buf_ptr >= SPAPR_TCE_PAGE_SIZE) {
             dev->add_buf_ptr = VLAN_RX_BDS_OFF;
         }
 
-        bd = ldq_tce(sdev, dev->buf_list + dev->add_buf_ptr);
+        bd = vio_ldq(sdev, dev->buf_list + dev->add_buf_ptr);
     } while (bd & VLAN_BD_VALID);
 
-    stq_tce(sdev, dev->buf_list + dev->add_buf_ptr, buf);
+    vio_stq(sdev, dev->buf_list + dev->add_buf_ptr, buf);
 
     dev->rx_bufs++;
 
@@ -447,7 +449,7 @@ static target_ulong h_send_logical_lan(CPUState *env, sPAPREnvironment *spapr,
     lbuf = alloca(total_len);
     p = lbuf;
     for (i = 0; i < nbufs; i++) {
-        ret = spapr_tce_dma_read(sdev, VLAN_BD_ADDR(bufs[i]),
+        ret = spapr_vio_dma_read(sdev, VLAN_BD_ADDR(bufs[i]),
                                  p, VLAN_BD_LEN(bufs[i]));
         if (ret < 0) {
             return ret;
@@ -492,6 +494,7 @@ static void spapr_vlan_class_init(ObjectClass *klass, void *data)
     k->dt_compatible = "IBM,l-lan";
     k->signal_mask = 0x1;
     dc->props = spapr_vlan_properties;
+    k->rtce_window_size = 0x10000000;
 }
 
 static TypeInfo spapr_vlan_info = {
diff --git a/hw/spapr_vio.c b/hw/spapr_vio.c
index ca0c8e7..686d16f 100644
--- a/hw/spapr_vio.c
+++ b/hw/spapr_vio.c
@@ -39,7 +39,6 @@
 #endif /* CONFIG_FDT */
 
 /* #define DEBUG_SPAPR */
-/* #define DEBUG_TCE */
 
 #ifdef DEBUG_SPAPR
 #define dprintf(fmt, ...) \
@@ -141,26 +140,9 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
         }
     }
 
-    if (dev->rtce_window_size) {
-        uint32_t dma_prop[] = {cpu_to_be32(dev->reg),
-                               0, 0,
-                               0, cpu_to_be32(dev->rtce_window_size)};
-
-        ret = fdt_setprop_cell(fdt, node_off, "ibm,#dma-address-cells", 2);
-        if (ret < 0) {
-            return ret;
-        }
-
-        ret = fdt_setprop_cell(fdt, node_off, "ibm,#dma-size-cells", 2);
-        if (ret < 0) {
-            return ret;
-        }
-
-        ret = fdt_setprop(fdt, node_off, "ibm,my-dma-window", dma_prop,
-                          sizeof(dma_prop));
-        if (ret < 0) {
-            return ret;
-        }
+    ret = spapr_dma_dt(fdt, node_off, "ibm,my-dma-window", dev->dma);
+    if (ret < 0) {
+        return ret;
     }
 
     if (pc->devnode) {
@@ -175,234 +157,6 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
 #endif /* CONFIG_FDT */
 
 /*
- * RTCE handling
- */
-
-static void rtce_init(VIOsPAPRDevice *dev)
-{
-    size_t size = (dev->rtce_window_size >> SPAPR_VIO_TCE_PAGE_SHIFT)
-        * sizeof(VIOsPAPR_RTCE);
-
-    if (size) {
-        dev->rtce_table = kvmppc_create_spapr_tce(dev->reg,
-                                                  dev->rtce_window_size,
-                                                  &dev->kvmtce_fd);
-
-        if (!dev->rtce_table) {
-            dev->rtce_table = g_malloc0(size);
-        }
-    }
-}
-
-static target_ulong h_put_tce(CPUState *env, sPAPREnvironment *spapr,
-                              target_ulong opcode, target_ulong *args)
-{
-    target_ulong liobn = args[0];
-    target_ulong ioba = args[1];
-    target_ulong tce = args[2];
-    VIOsPAPRDevice *dev = spapr_vio_find_by_reg(spapr->vio_bus, liobn);
-    VIOsPAPR_RTCE *rtce;
-
-    if (!dev) {
-        hcall_dprintf("spapr_vio_put_tce on non-existent LIOBN "
-                      TARGET_FMT_lx "\n", liobn);
-        return H_PARAMETER;
-    }
-
-    ioba &= ~(SPAPR_VIO_TCE_PAGE_SIZE - 1);
-
-#ifdef DEBUG_TCE
-    fprintf(stderr, "spapr_vio_put_tce on %s  ioba 0x" TARGET_FMT_lx
-            "  TCE 0x" TARGET_FMT_lx "\n", dev->qdev.id, ioba, tce);
-#endif
-
-    if (ioba >= dev->rtce_window_size) {
-        hcall_dprintf("spapr_vio_put_tce on out-of-boards IOBA 0x"
-                      TARGET_FMT_lx "\n", ioba);
-        return H_PARAMETER;
-    }
-
-    rtce = dev->rtce_table + (ioba >> SPAPR_VIO_TCE_PAGE_SHIFT);
-    rtce->tce = tce;
-
-    return H_SUCCESS;
-}
-
-int spapr_vio_check_tces(VIOsPAPRDevice *dev, target_ulong ioba,
-                         target_ulong len, enum VIOsPAPR_TCEAccess access)
-{
-    int start, end, i;
-
-    start = ioba >> SPAPR_VIO_TCE_PAGE_SHIFT;
-    end = (ioba + len - 1) >> SPAPR_VIO_TCE_PAGE_SHIFT;
-
-    for (i = start; i <= end; i++) {
-        if ((dev->rtce_table[i].tce & access) != access) {
-#ifdef DEBUG_TCE
-            fprintf(stderr, "FAIL on %d\n", i);
-#endif
-            return -1;
-        }
-    }
-
-    return 0;
-}
-
-int spapr_tce_dma_write(VIOsPAPRDevice *dev, uint64_t taddr, const void *buf,
-                        uint32_t size)
-{
-#ifdef DEBUG_TCE
-    fprintf(stderr, "spapr_tce_dma_write taddr=0x%llx size=0x%x\n",
-            (unsigned long long)taddr, size);
-#endif
-
-    /* Check for bypass */
-    if (dev->flags & VIO_PAPR_FLAG_DMA_BYPASS) {
-        cpu_physical_memory_write(taddr, buf, size);
-        return 0;
-    }
-
-    while (size) {
-        uint64_t tce;
-        uint32_t lsize;
-        uint64_t txaddr;
-
-        /* Check if we are in bound */
-        if (taddr >= dev->rtce_window_size) {
-#ifdef DEBUG_TCE
-            fprintf(stderr, "spapr_tce_dma_write out of bounds\n");
-#endif
-            return H_DEST_PARM;
-        }
-        tce = dev->rtce_table[taddr >> SPAPR_VIO_TCE_PAGE_SHIFT].tce;
-
-        /* How much til end of page ? */
-        lsize = MIN(size, ((~taddr) & SPAPR_VIO_TCE_PAGE_MASK) + 1);
-
-        /* Check TCE */
-        if (!(tce & 2)) {
-            return H_DEST_PARM;
-        }
-
-        /* Translate */
-        txaddr = (tce & ~SPAPR_VIO_TCE_PAGE_MASK) |
-            (taddr & SPAPR_VIO_TCE_PAGE_MASK);
-
-#ifdef DEBUG_TCE
-        fprintf(stderr, " -> write to txaddr=0x%llx, size=0x%x\n",
-                (unsigned long long)txaddr, lsize);
-#endif
-
-        /* Do it */
-        cpu_physical_memory_write(txaddr, buf, lsize);
-        buf += lsize;
-        taddr += lsize;
-        size -= lsize;
-    }
-    return 0;
-}
-
-int spapr_tce_dma_zero(VIOsPAPRDevice *dev, uint64_t taddr, uint32_t size)
-{
-    /* FIXME: allocating a temp buffer is nasty, but just stepping
-     * through writing zeroes is awkward.  This will do for now. */
-    uint8_t zeroes[size];
-
-#ifdef DEBUG_TCE
-    fprintf(stderr, "spapr_tce_dma_zero taddr=0x%llx size=0x%x\n",
-            (unsigned long long)taddr, size);
-#endif
-
-    memset(zeroes, 0, size);
-    return spapr_tce_dma_write(dev, taddr, zeroes, size);
-}
-
-void stb_tce(VIOsPAPRDevice *dev, uint64_t taddr, uint8_t val)
-{
-    spapr_tce_dma_write(dev, taddr, &val, sizeof(val));
-}
-
-void sth_tce(VIOsPAPRDevice *dev, uint64_t taddr, uint16_t val)
-{
-    val = tswap16(val);
-    spapr_tce_dma_write(dev, taddr, &val, sizeof(val));
-}
-
-
-void stw_tce(VIOsPAPRDevice *dev, uint64_t taddr, uint32_t val)
-{
-    val = tswap32(val);
-    spapr_tce_dma_write(dev, taddr, &val, sizeof(val));
-}
-
-void stq_tce(VIOsPAPRDevice *dev, uint64_t taddr, uint64_t val)
-{
-    val = tswap64(val);
-    spapr_tce_dma_write(dev, taddr, &val, sizeof(val));
-}
-
-int spapr_tce_dma_read(VIOsPAPRDevice *dev, uint64_t taddr, void *buf,
-                       uint32_t size)
-{
-#ifdef DEBUG_TCE
-    fprintf(stderr, "spapr_tce_dma_write taddr=0x%llx size=0x%x\n",
-            (unsigned long long)taddr, size);
-#endif
-
-    /* Check for bypass */
-    if (dev->flags & VIO_PAPR_FLAG_DMA_BYPASS) {
-        cpu_physical_memory_read(taddr, buf, size);
-        return 0;
-    }
-
-    while (size) {
-        uint64_t tce;
-        uint32_t lsize;
-        uint64_t txaddr;
-
-        /* Check if we are in bound */
-        if (taddr >= dev->rtce_window_size) {
-#ifdef DEBUG_TCE
-            fprintf(stderr, "spapr_tce_dma_read out of bounds\n");
-#endif
-            return H_DEST_PARM;
-        }
-        tce = dev->rtce_table[taddr >> SPAPR_VIO_TCE_PAGE_SHIFT].tce;
-
-        /* How much til end of page ? */
-        lsize = MIN(size, ((~taddr) & SPAPR_VIO_TCE_PAGE_MASK) + 1);
-
-        /* Check TCE */
-        if (!(tce & 1)) {
-            return H_DEST_PARM;
-        }
-
-        /* Translate */
-        txaddr = (tce & ~SPAPR_VIO_TCE_PAGE_MASK) |
-            (taddr & SPAPR_VIO_TCE_PAGE_MASK);
-
-#ifdef DEBUG_TCE
-        fprintf(stderr, " -> write to txaddr=0x%llx, size=0x%x\n",
-                (unsigned long long)txaddr, lsize);
-#endif
-        /* Do it */
-        cpu_physical_memory_read(txaddr, buf, lsize);
-        buf += lsize;
-        taddr += lsize;
-        size -= lsize;
-    }
-    return H_SUCCESS;
-}
-
-uint64_t ldq_tce(VIOsPAPRDevice *dev, uint64_t taddr)
-{
-    uint64_t val;
-
-    spapr_tce_dma_read(dev, taddr, &val, sizeof(val));
-    return tswap64(val);
-}
-
-/*
  * CRQ handling
  */
 static target_ulong h_reg_crq(CPUState *env, sPAPREnvironment *spapr,
@@ -525,7 +279,7 @@ int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq)
     }
 
     /* Maybe do a fast path for KVM just writing to the pages */
-    rc = spapr_tce_dma_read(dev, dev->crq.qladdr + dev->crq.qnext, &byte, 1);
+    rc = spapr_vio_dma_read(dev, dev->crq.qladdr + dev->crq.qnext, &byte, 1);
     if (rc) {
         return rc;
     }
@@ -533,7 +287,7 @@ int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq)
         return 1;
     }
 
-    rc = spapr_tce_dma_write(dev, dev->crq.qladdr + dev->crq.qnext + 8,
+    rc = spapr_vio_dma_write(dev, dev->crq.qladdr + dev->crq.qnext + 8,
                              &crq[8], 8);
     if (rc) {
         return rc;
@@ -541,7 +295,7 @@ int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq)
 
     kvmppc_eieio();
 
-    rc = spapr_tce_dma_write(dev, dev->crq.qladdr + dev->crq.qnext, crq, 8);
+    rc = spapr_vio_dma_write(dev, dev->crq.qladdr + dev->crq.qnext, crq, 8);
     if (rc) {
         return rc;
     }
@@ -559,13 +313,13 @@ int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq)
 
 static void spapr_vio_quiesce_one(VIOsPAPRDevice *dev)
 {
-    dev->flags &= ~VIO_PAPR_FLAG_DMA_BYPASS;
+    VIOsPAPRDeviceClass *pc = VIO_SPAPR_DEVICE_GET_CLASS(dev);
+    uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev->reg;
 
-    if (dev->rtce_table) {
-        size_t size = (dev->rtce_window_size >> SPAPR_VIO_TCE_PAGE_SHIFT)
-            * sizeof(VIOsPAPR_RTCE);
-        memset(dev->rtce_table, 0, size);
+    if (dev->dma) {
+        spapr_tce_free(dev->dma);
     }
+    dev->dma = spapr_tce_new_dma_context(liobn, pc->rtce_window_size);
 
     dev->crq.qladdr = 0;
     dev->crq.qsize = 0;
@@ -592,9 +346,13 @@ static void rtas_set_tce_bypass(sPAPREnvironment *spapr, uint32_t token,
         return;
     }
     if (enable) {
-        dev->flags |= VIO_PAPR_FLAG_DMA_BYPASS;
+        spapr_tce_free(dev->dma);
+        dev->dma = NULL;
     } else {
-        dev->flags &= ~VIO_PAPR_FLAG_DMA_BYPASS;
+        VIOsPAPRDeviceClass *pc = VIO_SPAPR_DEVICE_GET_CLASS(dev);
+        uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev->reg;
+
+        dev->dma = spapr_tce_new_dma_context(liobn, pc->rtce_window_size);
     }
 
     rtas_st(rets, 0, 0);
@@ -653,6 +411,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
 {
     VIOsPAPRDevice *dev = (VIOsPAPRDevice *)qdev;
     VIOsPAPRDeviceClass *pc = VIO_SPAPR_DEVICE_GET_CLASS(dev);
+    uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev->reg;
     char *id;
     int ret;
 
@@ -675,7 +434,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
         return -1;
     }
 
-    rtce_init(dev);
+    dev->dma = spapr_tce_new_dma_context(liobn, pc->rtce_window_size);
 
     return pc->init(dev);
 }
@@ -722,9 +481,6 @@ VIOsPAPRBus *spapr_vio_bus_init(void)
     /* hcall-vio */
     spapr_register_hypercall(H_VIO_SIGNAL, h_vio_signal);
 
-    /* hcall-tce */
-    spapr_register_hypercall(H_PUT_TCE, h_put_tce);
-
     /* hcall-crq */
     spapr_register_hypercall(H_REG_CRQ, h_reg_crq);
     spapr_register_hypercall(H_FREE_CRQ, h_free_crq);
diff --git a/hw/spapr_vio.h b/hw/spapr_vio.h
index d8527be..61032ab 100644
--- a/hw/spapr_vio.h
+++ b/hw/spapr_vio.h
@@ -21,16 +21,7 @@
  * License along with this library; if not, see <http://www.gnu.org/licenses/>.
  */
 
-#define SPAPR_VIO_TCE_PAGE_SHIFT   12
-#define SPAPR_VIO_TCE_PAGE_SIZE    (1ULL << SPAPR_VIO_TCE_PAGE_SHIFT)
-#define SPAPR_VIO_TCE_PAGE_MASK    (SPAPR_VIO_TCE_PAGE_SIZE - 1)
-
-enum VIOsPAPR_TCEAccess {
-    SPAPR_TCE_FAULT = 0,
-    SPAPR_TCE_RO = 1,
-    SPAPR_TCE_WO = 2,
-    SPAPR_TCE_RW = 3,
-};
+#include "dma.h"
 
 #define SPAPR_VTY_BASE_ADDRESS     0x30000000
 
@@ -44,10 +35,6 @@ enum VIOsPAPR_TCEAccess {
 
 struct VIOsPAPRDevice;
 
-typedef struct VIOsPAPR_RTCE {
-    uint64_t tce;
-} VIOsPAPR_RTCE;
-
 typedef struct VIOsPAPR_CRQ {
     uint64_t qladdr;
     uint32_t qsize;
@@ -63,6 +50,7 @@ typedef struct VIOsPAPRDeviceClass {
 
     const char *dt_name, *dt_type, *dt_compatible;
     target_ulong signal_mask;
+    uint32_t rtce_window_size;
     int (*init)(VIOsPAPRDevice *dev);
     void (*hcalls)(VIOsPAPRBus *bus);
     int (*devnode)(VIOsPAPRDevice *dev, void *fdt, int node_off);
@@ -72,20 +60,15 @@ struct VIOsPAPRDevice {
     DeviceState qdev;
     uint32_t reg;
     uint32_t flags;
-#define VIO_PAPR_FLAG_DMA_BYPASS        0x1
     qemu_irq qirq;
     uint32_t vio_irq_num;
     target_ulong signal_state;
-    uint32_t rtce_window_size;
-    VIOsPAPR_RTCE *rtce_table;
-    int kvmtce_fd;
     VIOsPAPR_CRQ crq;
+    DMAContext *dma;
 };
 
 #define DEFINE_SPAPR_PROPERTIES(type, field, default_reg, default_dma_window) \
-        DEFINE_PROP_UINT32("reg", type, field.reg, default_reg), \
-        DEFINE_PROP_UINT32("dma-window", type, field.rtce_window_size, \
-                           default_dma_window)
+        DEFINE_PROP_UINT32("reg", type, field.reg, default_reg)
 
 struct VIOsPAPRBus {
     BusState bus;
@@ -102,20 +85,38 @@ extern int spapr_populate_chosen_stdout(void *fdt, VIOsPAPRBus *bus);
 
 extern int spapr_vio_signal(VIOsPAPRDevice *dev, target_ulong mode);
 
-int spapr_vio_check_tces(VIOsPAPRDevice *dev, target_ulong ioba,
-                         target_ulong len,
-                         enum VIOsPAPR_TCEAccess access);
-
-int spapr_tce_dma_read(VIOsPAPRDevice *dev, uint64_t taddr,
-                       void *buf, uint32_t size);
-int spapr_tce_dma_write(VIOsPAPRDevice *dev, uint64_t taddr,
-                        const void *buf, uint32_t size);
-int spapr_tce_dma_zero(VIOsPAPRDevice *dev, uint64_t taddr, uint32_t size);
-void stb_tce(VIOsPAPRDevice *dev, uint64_t taddr, uint8_t val);
-void sth_tce(VIOsPAPRDevice *dev, uint64_t taddr, uint16_t val);
-void stw_tce(VIOsPAPRDevice *dev, uint64_t taddr, uint32_t val);
-void stq_tce(VIOsPAPRDevice *dev, uint64_t taddr, uint64_t val);
-uint64_t ldq_tce(VIOsPAPRDevice *dev, uint64_t taddr);
+static inline bool spapr_vio_dma_valid(VIOsPAPRDevice *dev, uint64_t taddr,
+                                       uint32_t size, DMADirection dir)
+{
+    return dma_memory_valid(dev->dma, taddr, size, dir);
+}
+
+static inline int spapr_vio_dma_read(VIOsPAPRDevice *dev, uint64_t taddr,
+                                     void *buf, uint32_t size)
+{
+    return (dma_memory_read(dev->dma, taddr, buf, size) != 0) ?
+        H_DEST_PARM : H_SUCCESS;
+}
+
+static inline int spapr_vio_dma_write(VIOsPAPRDevice *dev, uint64_t taddr,
+                                      const void *buf, uint32_t size)
+{
+    return (dma_memory_write(dev->dma, taddr, buf, size) != 0) ?
+        H_DEST_PARM : H_SUCCESS;
+}
+
+static inline int spapr_vio_dma_zero(VIOsPAPRDevice *dev, uint64_t taddr,
+                                     uint32_t size)
+{
+    return (dma_memory_zero(dev->dma, taddr, size) != 0) ?
+        H_DEST_PARM : H_SUCCESS;
+}
+
+#define vio_stb(_dev, _addr, _val) (stb_dma((_dev)->dma, (_addr), (_val)))
+#define vio_sth(_dev, _addr, _val) (stw_be_dma((_dev)->dma, (_addr), (_val)))
+#define vio_stl(_dev, _addr, _val) (stl_be_dma((_dev)->dma, (_addr), (_val)))
+#define vio_stq(_dev, _addr, _val) (stq_be_dma((_dev)->dma, (_addr), (_val)))
+#define vio_ldq(_dev, _addr) (ldq_be_dma((_dev)->dma, (_addr)))
 
 int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq);
 
diff --git a/hw/spapr_vscsi.c b/hw/spapr_vscsi.c
index 2167017..4a54211 100644
--- a/hw/spapr_vscsi.c
+++ b/hw/spapr_vscsi.c
@@ -169,7 +169,7 @@ static int vscsi_send_iu(VSCSIState *s, vscsi_req *req,
     long rc, rc1;
 
     /* First copy the SRP */
-    rc = spapr_tce_dma_write(&s->vdev, req->crq.s.IU_data_ptr,
+    rc = spapr_vio_dma_write(&s->vdev, req->crq.s.IU_data_ptr,
                              &req->iu, length);
     if (rc) {
         fprintf(stderr, "vscsi_send_iu: DMA write failure !\n");
@@ -285,9 +285,9 @@ static int vscsi_srp_direct_data(VSCSIState *s, vscsi_req *req,
     llen = MIN(len, md->len);
     if (llen) {
         if (req->writing) { /* writing = to device = reading from memory */
-            rc = spapr_tce_dma_read(&s->vdev, md->va, buf, llen);
+            rc = spapr_vio_dma_read(&s->vdev, md->va, buf, llen);
         } else {
-            rc = spapr_tce_dma_write(&s->vdev, md->va, buf, llen);
+            rc = spapr_vio_dma_write(&s->vdev, md->va, buf, llen);
         }
     }
     md->len -= llen;
@@ -333,10 +333,11 @@ static int vscsi_srp_indirect_data(VSCSIState *s, vscsi_req *req,
             md = req->cur_desc = &req->ext_desc;
             dprintf("VSCSI:   Reading desc from 0x%llx\n",
                     (unsigned long long)td->va);
-            rc = spapr_tce_dma_read(&s->vdev, td->va, md,
+            rc = spapr_vio_dma_read(&s->vdev, td->va, md,
                                     sizeof(struct srp_direct_buf));
             if (rc) {
-                dprintf("VSCSI: tce_dma_read -> %d reading ext_desc\n", rc);
+                dprintf("VSCSI: spapr_vio_dma_read -> %d reading ext_desc\n",
+                        rc);
                 break;
             }
             vscsi_swap_desc(md);
@@ -349,12 +350,12 @@ static int vscsi_srp_indirect_data(VSCSIState *s, vscsi_req *req,
         /* Perform transfer */
         llen = MIN(len, md->len);
         if (req->writing) { /* writing = to device = reading from memory */
-            rc = spapr_tce_dma_read(&s->vdev, md->va, buf, llen);
+            rc = spapr_vio_dma_read(&s->vdev, md->va, buf, llen);
         } else {
-            rc = spapr_tce_dma_write(&s->vdev, md->va, buf, llen);
+            rc = spapr_vio_dma_write(&s->vdev, md->va, buf, llen);
         }
         if (rc) {
-            dprintf("VSCSI: tce_dma_r/w(%d) -> %d\n", req->writing, rc);
+            dprintf("VSCSI: spapr_vio_dma_r/w(%d) -> %d\n", req->writing, rc);
             break;
         }
         dprintf("VSCSI:     data: %02x %02x %02x %02x...\n",
@@ -732,7 +733,7 @@ static int vscsi_send_adapter_info(VSCSIState *s, vscsi_req *req)
     sinfo = &req->iu.mad.adapter_info;
 
 #if 0 /* What for ? */
-    rc = spapr_tce_dma_read(&s->vdev, be64_to_cpu(sinfo->buffer),
+    rc = spapr_vio_dma_read(&s->vdev, be64_to_cpu(sinfo->buffer),
                             &info, be16_to_cpu(sinfo->common.length));
     if (rc) {
         fprintf(stderr, "vscsi_send_adapter_info: DMA read failure !\n");
@@ -746,7 +747,7 @@ static int vscsi_send_adapter_info(VSCSIState *s, vscsi_req *req)
     info.os_type = cpu_to_be32(2);
     info.port_max_txu[0] = cpu_to_be32(VSCSI_MAX_SECTORS << 9);
 
-    rc = spapr_tce_dma_write(&s->vdev, be64_to_cpu(sinfo->buffer),
+    rc = spapr_vio_dma_write(&s->vdev, be64_to_cpu(sinfo->buffer),
                              &info, be16_to_cpu(sinfo->common.length));
     if (rc)  {
         fprintf(stderr, "vscsi_send_adapter_info: DMA write failure !\n");
@@ -808,7 +809,7 @@ static void vscsi_got_payload(VSCSIState *s, vscsi_crq *crq)
     }
 
     /* XXX Handle failure differently ? */
-    if (spapr_tce_dma_read(&s->vdev, crq->s.IU_data_ptr, &req->iu,
+    if (spapr_vio_dma_read(&s->vdev, crq->s.IU_data_ptr, &req->iu,
                            crq->s.IU_length)) {
         fprintf(stderr, "vscsi_got_payload: DMA read failure !\n");
         g_free(req);
@@ -964,6 +965,7 @@ static void spapr_vscsi_class_init(ObjectClass *klass, void *data)
     k->dt_compatible = "IBM,v-scsi";
     k->signal_mask = 0x00000001;
     dc->props = spapr_vscsi_properties;
+    k->rtce_window_size = 0x10000000;
 }
 
 static TypeInfo spapr_vscsi_info = {
diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index 90c6941..53ecc90 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -858,7 +858,7 @@ void *kvmppc_create_spapr_tce(uint32_t liobn, uint32_t window_size, int *pfd)
         return NULL;
     }
 
-    len = (window_size / SPAPR_VIO_TCE_PAGE_SIZE) * sizeof(VIOsPAPR_RTCE);
+    len = (window_size / SPAPR_TCE_PAGE_SIZE) * sizeof(sPAPRTCE);
     /* FIXME: round this up to page size */
 
     table = mmap(NULL, len, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
@@ -881,7 +881,7 @@ int kvmppc_remove_spapr_tce(void *table, int fd, uint32_t window_size)
         return -1;
     }
 
-    len = (window_size / SPAPR_VIO_TCE_PAGE_SIZE)*sizeof(VIOsPAPR_RTCE);
+    len = (window_size / SPAPR_TCE_PAGE_SIZE)*sizeof(sPAPRTCE);
     if ((munmap(table, len) < 0) ||
         (close(fd) < 0)) {
         fprintf(stderr, "KVM: Unexpected error removing TCE table: %s",
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 12/13] iommu: Allow PCI to use IOMMU infrastructure
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (10 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 11/13] pseries: Convert sPAPR TCEs to use generic IOMMU infrastructure David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 13/13] pseries: Implement IOMMU and DMA for PAPR PCI devices David Gibson
  12 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: rth, mst, agraf, David Gibson, eduard.munteanu

This patch adds some hooks to let PCI devices and busses use the new IOMMU
infrastructure.  When IOMMU support is enabled, each PCI device now
contains a DMAContext * which is used by the pci_dma_*() wrapper functions.

By default, the contexts are initialized to NULL, assuming no IOMMU.
However the platform or host bridge code which sets up the PCI bus can use
pci_setup_iommu() to set a function which will determine the correct
DMAContext for a given PCI device.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
---
 hw/pci.c           |   13 +++++++++++++
 hw/pci.h           |   13 ++++++++++++-
 hw/pci_internals.h |    4 ++++
 3 files changed, 29 insertions(+), 1 deletions(-)

diff --git a/hw/pci.c b/hw/pci.c
index bf046bf..1f54c45 100644
--- a/hw/pci.c
+++ b/hw/pci.c
@@ -761,6 +761,11 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
         return NULL;
     }
     pci_dev->bus = bus;
+#ifdef CONFIG_IOMMU
+    if (bus->dma_context_fn) {
+        pci_dev->dma = bus->dma_context_fn(bus, bus->dma_context_opaque, devfn);
+    }
+#endif
     pci_dev->devfn = devfn;
     pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
     pci_dev->irq_state = 0;
@@ -2005,6 +2010,14 @@ static void pci_device_class_init(ObjectClass *klass, void *data)
     k->bus_info = &pci_bus_info;
 }
 
+#ifdef CONFIG_IOMMU
+void pci_setup_iommu(PCIBus *bus, PCIDMAContextFunc fn, void *opaque)
+{
+    bus->dma_context_fn = fn;
+    bus->dma_context_opaque = opaque;
+}
+#endif
+
 static TypeInfo pci_device_type_info = {
     .name = TYPE_PCI_DEVICE,
     .parent = TYPE_DEVICE,
diff --git a/hw/pci.h b/hw/pci.h
index 41dcd05..1273dc3 100644
--- a/hw/pci.h
+++ b/hw/pci.h
@@ -170,6 +170,7 @@ typedef struct PCIDeviceClass {
 
 struct PCIDevice {
     DeviceState qdev;
+
     /* PCI config space */
     uint8_t *config;
 
@@ -191,6 +192,9 @@ struct PCIDevice {
     uint32_t devfn;
     char name[64];
     PCIIORegion io_regions[PCI_NUM_REGIONS];
+#ifdef CONFIG_IOMMU
+    DMAContext *dma;
+#endif
 
     /* do not access the following fields */
     PCIConfigReadFunc *config_read;
@@ -311,6 +315,10 @@ int pci_read_devaddr(Monitor *mon, const char *addr, int *domp, int *busp,
 
 void pci_device_deassert_intx(PCIDevice *dev);
 
+typedef DMAContext *(*PCIDMAContextFunc)(PCIBus *, void *, int);
+
+void pci_setup_iommu(PCIBus *bus, PCIDMAContextFunc fn, void *opaque);
+
 static inline void
 pci_set_byte(uint8_t *config, uint8_t val)
 {
@@ -547,8 +555,11 @@ static inline uint32_t pci_config_size(const PCIDevice *d)
 /* DMA access functions */
 static inline DMAContext *pci_dma_context(PCIDevice *dev)
 {
-    /* Stub for when we have no PCI iommu support */
+#ifdef CONFIG_IOMMU
+    return dev->dma;
+#else
     return NULL;
+#endif
 }
 
 static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
diff --git a/hw/pci_internals.h b/hw/pci_internals.h
index 96690b7..b6b7a0e 100644
--- a/hw/pci_internals.h
+++ b/hw/pci_internals.h
@@ -16,6 +16,10 @@ extern struct BusInfo pci_bus_info;
 
 struct PCIBus {
     BusState qbus;
+#ifdef CONFIG_IOMMU
+    PCIDMAContextFunc dma_context_fn;
+    void *dma_context_opaque;
+#endif
     uint8_t devfn_min;
     pci_set_irq_fn set_irq;
     pci_map_irq_fn map_irq;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 13/13] pseries: Implement IOMMU and DMA for PAPR PCI devices
  2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
                   ` (11 preceding siblings ...)
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 12/13] iommu: Allow PCI to use " David Gibson
@ 2012-03-09  5:01 ` David Gibson
  2012-03-09 10:23   ` Paolo Bonzini
  12 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-09  5:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: rth, mst, agraf, David Gibson, eduard.munteanu

Currently the pseries machine emulation does not support DMA for emulated
PCI devices, because the PAPR spec always requires a (guest visible,
paravirtualized) IOMMU which was not implemented.  Now that we have
infrastructure for IOMMU emulation, we can correct this and allow PCI DMA
for pseries.

With the existing PAPR IOMMU code used for VIO devices, this is almost
trivial. We use a single DMAContext for each (virtual) PCI host bridge,
which is the usual configuration on real PAPR machines (which often have
_many_ PCI host bridges).

Cc: Alex Graf <agraf@suse.de>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/spapr.h     |    1 +
 hw/spapr_pci.c |   15 +++++++++++++++
 hw/spapr_pci.h |    1 +
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/hw/spapr.h b/hw/spapr.h
index 210b868..00b15c8 100644
--- a/hw/spapr.h
+++ b/hw/spapr.h
@@ -328,6 +328,7 @@ typedef struct sPAPRTCE {
 } sPAPRTCE;
 
 #define SPAPR_VIO_BASE_LIOBN    0x00000000
+#define SPAPR_PCI_BASE_LIOBN    0x80000000
 
 void spapr_iommu_init(void);
 DMAContext *spapr_tce_new_dma_context(uint32_t liobn, size_t window_size);
diff --git a/hw/spapr_pci.c b/hw/spapr_pci.c
index 523227b..b53fd40 100644
--- a/hw/spapr_pci.c
+++ b/hw/spapr_pci.c
@@ -201,6 +201,14 @@ static MemoryRegionOps spapr_io_ops = {
 /*
  * PHB PCI device
  */
+static DMAContext *spapr_pci_dma_context_fn(PCIBus *bus, void *opaque,
+                                            int devfn)
+{
+    sPAPRPHBState *phb = opaque;
+
+    return phb->dma;
+}
+
 static int spapr_phb_init(SysBusDevice *s)
 {
     sPAPRPHBState *phb = FROM_SYSBUS(sPAPRPHBState, s);
@@ -208,6 +216,7 @@ static int spapr_phb_init(SysBusDevice *s)
     char namebuf[64];
     int i;
     PCIBus *bus;
+    uint32_t liobn;
 
     sprintf(busname, "pci@%" PRIx64, phb->buid);
 
@@ -248,6 +257,10 @@ static int spapr_phb_init(SysBusDevice *s)
                                                  PCI_DEVFN(0, 0),
                                                  SPAPR_PCI_NUM_LSI);
 
+    liobn = SPAPR_PCI_BASE_LIOBN | (pci_find_domain(bus) << 16);
+    phb->dma = spapr_tce_new_dma_context(liobn, 0x40000000);
+    pci_setup_iommu(bus, spapr_pci_dma_context_fn, phb);
+
     QLIST_INSERT_HEAD(&spapr->phbs, phb, list);
 
     /* Initialize the LSI table */
@@ -400,6 +413,8 @@ int spapr_populate_pci_devices(sPAPRPHBState *phb,
     _FDT(fdt_setprop(fdt, bus_off, "interrupt-map", &interrupt_map,
                      7 * sizeof(interrupt_map[0])));
 
+    spapr_dma_dt(fdt, bus_off, "ibm,dma-window", phb->dma);
+
     return 0;
 }
 
diff --git a/hw/spapr_pci.h b/hw/spapr_pci.h
index b4b8a73..365c75e 100644
--- a/hw/spapr_pci.h
+++ b/hw/spapr_pci.h
@@ -37,6 +37,7 @@ typedef struct sPAPRPHBState {
     MemoryRegion memspace, iospace;
     target_phys_addr_t mem_win_addr, mem_win_size, io_win_addr, io_win_size;
     MemoryRegion memwindow, iowindow;
+    DMAContext *dma;
 
     struct {
         uint32_t dt_irq;
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io David Gibson
@ 2012-03-09  9:34   ` Kevin Wolf
  2012-03-09 10:01   ` Paolo Bonzini
  1 sibling, 0 replies; 39+ messages in thread
From: Kevin Wolf @ 2012-03-09  9:34 UTC (permalink / raw)
  To: David Gibson; +Cc: rth, eduard.munteanu, mst, qemu-devel, agraf

Am 09.03.2012 06:01, schrieb David Gibson:
> Currently dma_bdrv_io() takes a 'to_dev' boolean parameter to
> determine the direction of DMA it is emulating.  We already have a
> DMADirection enum designed specifically to encode DMA directions.
> This patch uses it for dma_bdrv_io() as well.  This involves removing
> the DMADirection definition from the #ifdef it was inside, but since that
> only existed to protect the definition of dma_addr_t from places where
> config.h is not included, there wasn't any reason for it to be there in
> the first place.
> 
> Cc: Kevin Wolf <kwolf@redhat.com>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Kevin Wolf <kwolf@redhat.com>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers David Gibson
@ 2012-03-09  9:43   ` Kevin Wolf
  2012-03-13 10:42     ` David Gibson
  2012-03-09 10:11   ` Paolo Bonzini
  1 sibling, 1 reply; 39+ messages in thread
From: Kevin Wolf @ 2012-03-09  9:43 UTC (permalink / raw)
  To: David Gibson; +Cc: rth, eduard.munteanu, mst, qemu-devel, agraf

Am 09.03.2012 06:01, schrieb David Gibson:
> dma-helpers.c contains a number of helper functions for doing
> scatter/gather DMA, and various block device related DMA.  Currently,
> these directly access guest memory using cpu_physical_memory_*(),
> assuming no IOMMU translation.
> 
> This patch updates this code to use the new universal DMA helper
> functions.  qemu_sglist_init() now takes a DMAContext * to describe
> the DMA address space in which the scatter/gather will take place.
> 
> We minimally update the callers qemu_sglist_init() to pass NULL
> (i.e. no translation, same as current behaviour).  Some of those
> callers should pass something else in some cases to allow proper IOMMU
> translation in future, but that will be fixed in later patches.
> 
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  dma-helpers.c  |   26 ++++++++++++++++++--------
>  dma.h          |    3 ++-
>  hw/ide/ahci.c  |    3 ++-
>  hw/ide/macio.c |    4 ++--
>  hw/pci.h       |    2 +-
>  5 files changed, 25 insertions(+), 13 deletions(-)
> 
> diff --git a/dma-helpers.c b/dma-helpers.c
> index 5f19a85..9dcfb2c 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -11,12 +11,13 @@
>  #include "block_int.h"
>  #include "trace.h"
>  
> -void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint)
> +void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma)
>  {
>      qsg->sg = g_malloc(alloc_hint * sizeof(ScatterGatherEntry));
>      qsg->nsg = 0;
>      qsg->nalloc = alloc_hint;
>      qsg->size = 0;
> +    qsg->dma = dma;
>  }
>  
>  void qemu_sglist_add(QEMUSGList *qsg, dma_addr_t base, dma_addr_t len)
> @@ -75,10 +76,9 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
>      int i;
>  
>      for (i = 0; i < dbs->iov.niov; ++i) {
> -        cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
> -                                  dbs->iov.iov[i].iov_len,
> -                                  dbs->dir != DMA_DIRECTION_TO_DEVICE,
> -                                  dbs->iov.iov[i].iov_len);
> +        dma_memory_unmap(dbs->sg->dma, dbs->iov.iov[i].iov_base,
> +                         dbs->iov.iov[i].iov_len, dbs->dir,
> +                         dbs->iov.iov[i].iov_len);
>      }
>      qemu_iovec_reset(&dbs->iov);
>  }
> @@ -104,10 +104,20 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
>      }
>  }
>  
> +static void dma_bdrv_cancel(void *opaque)
> +{
> +    DMAAIOCB *dbs = opaque;
> +
> +    bdrv_aio_cancel(dbs->acb);
> +    dma_bdrv_unmap(dbs);
> +    qemu_iovec_destroy(&dbs->iov);
> +    qemu_aio_release(dbs);
> +}

I'm lacking the context to know when this is actually called, but it
looks suspicious. Did you consider that bdrv_aio_cancel() can actually
invoke the completion callback?

What's the difference between the existing dma_aio_cancel() and the
function that you need here?

Kevin

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/13] ide/ahci: Use universal DMA helper functions
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 08/13] ide/ahci: Use universal DMA helper functions David Gibson
@ 2012-03-09  9:48   ` Kevin Wolf
  2012-03-10  5:30     ` David Gibson
  0 siblings, 1 reply; 39+ messages in thread
From: Kevin Wolf @ 2012-03-09  9:48 UTC (permalink / raw)
  To: David Gibson; +Cc: rth, eduard.munteanu, mst, qemu-devel, agraf

Am 09.03.2012 06:01, schrieb David Gibson:
> The AHCI device can provide both PCI and SysBus AHCI device
> emulations.  For this reason, it wasn't previously converted to use
> the pci_dma_*() helper functions.  Now that we have universal DMA
> helper functions, this converts AHCI to use them.
> 
> The DMAContext is obtained from pci_dma_context() in the PCI case and
> set to NULL in the SysBus case (i.e. we assume for now that a SysBus
> AHCI has no IOMMU translation).
> 
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/ide/ahci.c |    7 ++++---
>  hw/ide/ahci.h |    3 ++-
>  hw/ide/ich.c  |    2 +-
>  3 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> index 6a218b5..3d31179 100644
> --- a/hw/ide/ahci.c
> +++ b/hw/ide/ahci.c
> @@ -668,7 +668,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist)
>          AHCI_SG *tbl = (AHCI_SG *)prdt;
>  
>          /* FIXME: pass the correct DMAContext */
> -        qemu_sglist_init(sglist, sglist_alloc_hint, NULL);
> +        qemu_sglist_init(sglist, sglist_alloc_hint, ad->hba->dma);

Does the FIXME still apply after this change?

Kevin

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 02/13] Better support for dma_addr_t variables
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 02/13] Better support for dma_addr_t variables David Gibson
@ 2012-03-09 10:00   ` Paolo Bonzini
  0 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2012-03-09 10:00 UTC (permalink / raw)
  To: David Gibson; +Cc: agraf, mst, eduard.munteanu, qemu-devel, rth

Il 09/03/2012 06:01, David Gibson ha scritto:
> A while back, we introduced the dma_addr_t type, which is supposed to
> be used for bus visible memory addresses.  At present, this is an
> alias for target_phys_addr_t, but this will change when we eventually
> add support for guest visible IOMMUs.
> 
> There are some instances of target_phys_addr_t in the code now which
> should really be dma_addr_t, but can't be trivially converted due to
> missing features which this patch corrects.
> 
>  * We add DMA_ADDR_BITS analagous to TARGET_PHYS_ADDR_BITS.  This is
>    important where we need to make a compile-time (#if) based on the
>    size of dma_addr_t.
> 
>  * We add a new helper macro to create device properties which take a
>    dma_addr_t.  This lets us correctly convert code which cuurrently
>    has DEFINE_PROP_TADDR() for variables which should be dma_addr_t
>    instead of target_phys_addr_t.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  Makefile.objs |    2 +-
>  dma.h         |    1 +
>  hw/qdev-dma.c |   77 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  hw/qdev-dma.h |    8 ++++++
>  4 files changed, 87 insertions(+), 1 deletions(-)
>  create mode 100644 hw/qdev-dma.c
>  create mode 100644 hw/qdev-dma.h
> 
> diff --git a/Makefile.objs b/Makefile.objs
> index 808de6a..59aed69 100644
> --- a/Makefile.objs
> +++ b/Makefile.objs
> @@ -287,7 +287,7 @@ hw-obj-$(CONFIG_LSI_SCSI_PCI) += lsi53c895a.o
>  hw-obj-$(CONFIG_ESP) += esp.o
>  
>  hw-obj-y += dma-helpers.o sysbus.o isa-bus.o
> -hw-obj-y += qdev-addr.o
> +hw-obj-y += qdev-addr.o qdev-dma.o
>  
>  # VGA
>  hw-obj-$(CONFIG_VGA_PCI) += vga-pci.o
> diff --git a/dma.h b/dma.h
> index 05ac325..463095c 100644
> --- a/dma.h
> +++ b/dma.h
> @@ -32,6 +32,7 @@ struct QEMUSGList {
>  #if defined(TARGET_PHYS_ADDR_BITS)
>  typedef target_phys_addr_t dma_addr_t;
>  
> +#define DMA_ADDR_BITS TARGET_PHYS_ADDR_BITS
>  #define DMA_ADDR_FMT TARGET_FMT_plx
>  
>  struct ScatterGatherEntry {
> diff --git a/hw/qdev-dma.c b/hw/qdev-dma.c
> new file mode 100644
> index 0000000..2b21bf4
> --- /dev/null
> +++ b/hw/qdev-dma.c
> @@ -0,0 +1,77 @@
> +#include "qdev.h"
> +#include "qdev-dma.h"
> +#include "dma.h"
> +
> +/* --- target physical address --- */
> +
> +static int parse_dmaaddr(DeviceState *dev, Property *prop, const char *str)
> +{
> +    dma_addr_t *ptr = qdev_get_prop_ptr(dev, prop);
> +
> +    *ptr = strtoull(str, NULL, 16);
> +    return 0;
> +}

My fault here.  I didn't update qdev-addr.c in patch
97aa6e9b8f9df37add21d86fac1a9ca6ce7df9b7, and you obviously based your
work on it.  Can you please add an if statement here like this:

+    if (str[0] != '0' || str[1] != 'x') {
+        return -EINVAL;
+    }
+

Paolo

> +static int print_dmaaddr(DeviceState *dev, Property *prop, char *dest,
> +                         size_t len)
> +{
> +    dma_addr_t *ptr = qdev_get_prop_ptr(dev, prop);
> +    return snprintf(dest, len, "0x" DMA_ADDR_FMT, *ptr);
> +}


> +static void get_dmaaddr(Object *obj, Visitor *v, void *opaque,
> +                      const char *name, Error **errp)
> +{
> +    DeviceState *dev = DEVICE(obj);
> +    Property *prop = opaque;
> +    dma_addr_t *ptr = qdev_get_prop_ptr(dev, prop);
> +    int64_t value;
> +
> +    value = *ptr;
> +    visit_type_int(v, &value, name, errp);
> +}
> +
> +static void set_dmaaddr(Object *obj, Visitor *v, void *opaque,
> +                      const char *name, Error **errp)
> +{
> +    DeviceState *dev = DEVICE(obj);
> +    Property *prop = opaque;
> +    dma_addr_t *ptr = qdev_get_prop_ptr(dev, prop);
> +    Error *local_err = NULL;
> +    int64_t value;
> +
> +    if (dev->state != DEV_STATE_CREATED) {
> +        error_set(errp, QERR_PERMISSION_DENIED);
> +        return;
> +    }
> +
> +    visit_type_int(v, &value, name, &local_err);
> +    if (local_err) {
> +        error_propagate(errp, local_err);
> +        return;
> +    }
> +    if ((uint64_t)value <= (uint64_t) ~(dma_addr_t)0) {
> +        *ptr = value;
> +    } else {
> +        error_set(errp, QERR_PROPERTY_VALUE_OUT_OF_RANGE,
> +                  dev->id?:"", name, value, (uint64_t) 0,
> +                  (uint64_t) ~(dma_addr_t)0);
> +    }
> +}
> +
> +
> +PropertyInfo qdev_prop_dmaaddr = {
> +    .name  = "dmaaddr",
> +    .parse = parse_dmaaddr,
> +    .print = print_dmaaddr,
> +    .get   = get_dmaaddr,
> +    .set   = set_dmaaddr,
> +};
> +
> +void qdev_prop_set_dmaaddr(DeviceState *dev, const char *name, dma_addr_t value)
> +{
> +    Error *errp = NULL;
> +    object_property_set_int(OBJECT(dev), value, name, &errp);
> +    assert(!errp);
> +
> +}
> diff --git a/hw/qdev-dma.h b/hw/qdev-dma.h
> new file mode 100644
> index 0000000..8b01fda
> --- /dev/null
> +++ b/hw/qdev-dma.h
> @@ -0,0 +1,8 @@
> +#include "dma.h"
> +
> +#define DEFINE_PROP_DMAADDR(_n, _s, _f, _d)                               \
> +    DEFINE_PROP_DEFAULT(_n, _s, _f, _d, qdev_prop_dmaaddr, dma_addr_t)
> +
> +extern PropertyInfo qdev_prop_dmaaddr;
> +void qdev_prop_set_dmaaddr(DeviceState *dev, const char *name,
> +                           dma_addr_t value);

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io David Gibson
  2012-03-09  9:34   ` Kevin Wolf
@ 2012-03-09 10:01   ` Paolo Bonzini
  2012-03-13  5:53     ` David Gibson
  1 sibling, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2012-03-09 10:01 UTC (permalink / raw)
  To: David Gibson; +Cc: Kevin Wolf, mst, agraf, qemu-devel, eduard.munteanu, rth

Il 09/03/2012 06:01, David Gibson ha scritto:
> Currently dma_bdrv_io() takes a 'to_dev' boolean parameter to
> determine the direction of DMA it is emulating.  We already have a
> DMADirection enum designed specifically to encode DMA directions.
> This patch uses it for dma_bdrv_io() as well.  This involves removing
> the DMADirection definition from the #ifdef it was inside, but since that
> only existed to protect the definition of dma_addr_t from places where
> config.h is not included, there wasn't any reason for it to be there in
> the first place.

Looks good, but please use this for dma_buf_* too.

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/13] iommu: Add universal DMA helper functions
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 05/13] iommu: Add universal DMA helper functions David Gibson
@ 2012-03-09 10:06   ` Paolo Bonzini
  2012-03-09 11:19     ` David Gibson
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2012-03-09 10:06 UTC (permalink / raw)
  To: David Gibson; +Cc: mst, Joerg Rodel, agraf, qemu-devel, eduard.munteanu, rth

Il 09/03/2012 06:01, David Gibson ha scritto:
> +static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
> +                                  void *buf, dma_addr_t len)
> +{
> +    return dma_memory_rw(dma, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
> +}
> +
> +static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
> +                                   const void *buf, dma_addr_t len)
> +{
> +    return dma_memory_rw(dma, addr, (void *)buf, len,
> +                         DMA_DIRECTION_FROM_DEVICE);
> +}

This is opposite to the convention of dma_buf_read/dma_buf_write, which
is from the point of view of the device.  Either way is okay for me,
including removing dma_buf_read/dma_buf_write altogether and using
dma_buf_rw in the few users (so that there will be an explicit
DMADirection argument that clears all confusion).

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers David Gibson
  2012-03-09  9:43   ` Kevin Wolf
@ 2012-03-09 10:11   ` Paolo Bonzini
  2012-03-13  6:37     ` David Gibson
  1 sibling, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2012-03-09 10:11 UTC (permalink / raw)
  To: David Gibson; +Cc: Kevin Wolf, mst, agraf, qemu-devel, eduard.munteanu, rth

Il 09/03/2012 06:01, David Gibson ha scritto:
> dma-helpers.c contains a number of helper functions for doing
> scatter/gather DMA, and various block device related DMA.  Currently,
> these directly access guest memory using cpu_physical_memory_*(),
> assuming no IOMMU translation.
> 
> This patch updates this code to use the new universal DMA helper
> functions.  qemu_sglist_init() now takes a DMAContext * to describe
> the DMA address space in which the scatter/gather will take place.
> 
> We minimally update the callers qemu_sglist_init() to pass NULL
> (i.e. no translation, same as current behaviour).  Some of those
> callers should pass something else in some cases to allow proper IOMMU
> translation in future, but that will be fixed in later patches.
> 
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  dma-helpers.c  |   26 ++++++++++++++++++--------
>  dma.h          |    3 ++-
>  hw/ide/ahci.c  |    3 ++-
>  hw/ide/macio.c |    4 ++--
>  hw/pci.h       |    2 +-
>  5 files changed, 25 insertions(+), 13 deletions(-)
> 
> diff --git a/dma-helpers.c b/dma-helpers.c
> index 5f19a85..9dcfb2c 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -11,12 +11,13 @@
>  #include "block_int.h"
>  #include "trace.h"
>  
> -void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint)
> +void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma)
>  {
>      qsg->sg = g_malloc(alloc_hint * sizeof(ScatterGatherEntry));
>      qsg->nsg = 0;
>      qsg->nalloc = alloc_hint;
>      qsg->size = 0;
> +    qsg->dma = dma;
>  }
>  
>  void qemu_sglist_add(QEMUSGList *qsg, dma_addr_t base, dma_addr_t len)
> @@ -75,10 +76,9 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
>      int i;
>  
>      for (i = 0; i < dbs->iov.niov; ++i) {
> -        cpu_physical_memory_unmap(dbs->iov.iov[i].iov_base,
> -                                  dbs->iov.iov[i].iov_len,
> -                                  dbs->dir != DMA_DIRECTION_TO_DEVICE,
> -                                  dbs->iov.iov[i].iov_len);
> +        dma_memory_unmap(dbs->sg->dma, dbs->iov.iov[i].iov_base,
> +                         dbs->iov.iov[i].iov_len, dbs->dir,
> +                         dbs->iov.iov[i].iov_len);
>      }
>      qemu_iovec_reset(&dbs->iov);
>  }
> @@ -104,10 +104,20 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
>      }
>  }
>  
> +static void dma_bdrv_cancel(void *opaque)
> +{
> +    DMAAIOCB *dbs = opaque;
> +
> +    bdrv_aio_cancel(dbs->acb);
> +    dma_bdrv_unmap(dbs);
> +    qemu_iovec_destroy(&dbs->iov);
> +    qemu_aio_release(dbs);
> +}

What Kevin said.  Instead of a generic callback, dma_memory_map should
probably just receive the AIOCB (in this case &dbs->common) and call
bdrv_aio_cancel on it.

>  static void dma_bdrv_cb(void *opaque, int ret)
>  {
>      DMAAIOCB *dbs = (DMAAIOCB *)opaque;
> -    target_phys_addr_t cur_addr, cur_len;
> +    dma_addr_t cur_addr, cur_len;
>      void *mem;
>  
>      trace_dma_bdrv_cb(dbs, ret);
> @@ -124,8 +134,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
>      while (dbs->sg_cur_index < dbs->sg->nsg) {
>          cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
>          cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
> -        mem = cpu_physical_memory_map(cur_addr, &cur_len,
> -                                      dbs->dir != DMA_DIRECTION_TO_DEVICE);
> +        mem = dma_memory_map(dbs->sg->dma, dma_bdrv_cancel, dbs,
> +                             cur_addr, &cur_len, dbs->dir);
>          if (!mem)
>              break;
>          qemu_iovec_add(&dbs->iov, mem, cur_len);

dma_buf_rw should also use the DMAContext here (passing a NULL
invalidate function).

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure David Gibson
@ 2012-03-09 10:23   ` Paolo Bonzini
  2012-03-13  5:07     ` David Gibson
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2012-03-09 10:23 UTC (permalink / raw)
  To: David Gibson; +Cc: agraf, mst, eduard.munteanu, qemu-devel, rth

Il 09/03/2012 06:01, David Gibson ha scritto:
> This patch adds the basic infrastructure necessary to emulate an IOMMU
> visible to the guest.  The DMAContext structure is extended with
> information and a callback describing the translation, and the various
> DMA functions used by devices will now perform IOMMU translation using
> this callback.
> 
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> 
> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  configure     |   12 ++++
>  dma-helpers.c |  191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  dma.h         |  125 +++++++++++++++++++++++++++++++-------
>  3 files changed, 306 insertions(+), 22 deletions(-)
> 
> diff --git a/configure b/configure
> index a5eb832..e6fba2f 100755
> --- a/configure
> +++ b/configure
> @@ -138,6 +138,7 @@ linux_aio=""
>  cap_ng=""
>  attr=""
>  libattr=""
> +iommu="yes"
>  xfs=""
>  
>  vhost_net="no"
> @@ -784,6 +785,10 @@ for opt do
>    ;;
>    --enable-vhost-net) vhost_net="yes"
>    ;;
> +  --enable-iommu) iommu="yes"
> +  ;;
> +  --disable-iommu) iommu="no"
> +  ;;

This need not be a configure option.  Just enable it, or it will bitrot.

>    --disable-opengl) opengl="no"
>    ;;
>    --enable-opengl) opengl="yes"
> @@ -1085,6 +1090,8 @@ echo "  --enable-docs            enable documentation build"
>  echo "  --disable-docs           disable documentation build"
>  echo "  --disable-vhost-net      disable vhost-net acceleration support"
>  echo "  --enable-vhost-net       enable vhost-net acceleration support"
> +echo "  --disable-iommu          disable IOMMU emulation support"
> +echo "  --enable-iommu           enable IOMMU emulation support"
>  echo "  --enable-trace-backend=B Set trace backend"
>  echo "                           Available backends:" $("$source_path"/scripts/tracetool --list-backends)
>  echo "  --with-trace-file=NAME   Full PATH,NAME of file to store traces"
> @@ -2924,6 +2931,7 @@ echo "posix_madvise     $posix_madvise"
>  echo "uuid support      $uuid"
>  echo "libcap-ng support $cap_ng"
>  echo "vhost-net support $vhost_net"
> +echo "IOMMU support     $iommu"
>  echo "Trace backend     $trace_backend"
>  echo "Trace output file $trace_file-<pid>"
>  echo "spice support     $spice"
> @@ -3801,6 +3809,10 @@ if test "$target_softmmu" = "yes" -a \( \
>    echo "CONFIG_NEED_MMU=y" >> $config_target_mak
>  fi
>  
> +if test "$iommu" = "yes" ; then
> +  echo "CONFIG_IOMMU=y" >> $config_host_mak
> +fi
> +
>  if test "$gprof" = "yes" ; then
>    echo "TARGET_GPROF=yes" >> $config_target_mak
>    if test "$target_linux_user" = "yes" ; then
> diff --git a/dma-helpers.c b/dma-helpers.c
> index 9dcfb2c..248575d 100644
> --- a/dma-helpers.c
> +++ b/dma-helpers.c
> @@ -10,6 +10,9 @@
>  #include "dma.h"
>  #include "block_int.h"
>  #include "trace.h"
> +#include "range.h"
> +
> +/* #define DEBUG_IOMMU */
>  
>  void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma)
>  {
> @@ -255,3 +258,191 @@ void dma_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
>  {
>      bdrv_acct_start(bs, cookie, sg->size, type);
>  }
> +
> +#ifdef CONFIG_IOMMU
> +bool __dma_memory_valid(DMAContext *dma, dma_addr_t addr, dma_addr_t len,
> +                        DMADirection dir)

No __ identifiers, please.  You could call these
dma_context_memory_{valid,read,write}.

> +{
> +    target_phys_addr_t paddr, plen;
> +
> +#ifdef DEBUG_IOMMU
> +    fprintf(stderr, "dma_memory_check context=%p addr=0x" DMA_ADDR_FMT
> +            " len=0x" DMA_ADDR_FMT " dir=%d\n", dma, addr, len, dir);
> +#endif
> +
> +    while (len) {
> +        if (dma->translate(dma, addr, &paddr, &plen, dir) != 0) {
> +            return false;
> +        }
> +
> +        /* The translation might be valid for larger regions. */
> +        if (plen > len) {
> +            plen = len;
> +        }
> +
> +        len -= plen;
> +        addr += plen;
> +    }
> +
> +    return true;
> +}
> +
> +int __dma_memory_rw(DMAContext *dma, dma_addr_t addr,
> +                    void *buf, dma_addr_t len, DMADirection dir)
> +{
> +    target_phys_addr_t paddr, plen;
> +    int err;
> +
> +#ifdef DEBUG_IOMMU
> +    fprintf(stderr, "dma_memory_rw context=%p addr=0x" DMA_ADDR_FMT " len=0x"
> +            DMA_ADDR_FMT " dir=%d\n", dma, addr, len, dir);
> +#endif
> +
> +    while (len) {
> +        err = dma->translate(dma, addr, &paddr, &plen, dir);
> +        if (err) {
> +            return -1;
> +        }
> +
> +        /* The translation might be valid for larger regions. */
> +        if (plen > len) {
> +            plen = len;
> +        }
> +
> +        cpu_physical_memory_rw(paddr, buf, plen,
> +                               dir == DMA_DIRECTION_FROM_DEVICE);
> +
> +        len -= plen;
> +        addr += plen;
> +        buf += plen;
> +    }
> +
> +    return 0;
> +}
> +
> +int __dma_memory_zero(DMAContext *dma, dma_addr_t addr, dma_addr_t len)
> +{
> +    target_phys_addr_t paddr, plen;
> +    int err;
> +
> +#ifdef DEBUG_IOMMU
> +    fprintf(stderr, "dma_memory_zero context=%p addr=0x" DMA_ADDR_FMT
> +            " len=0x" DMA_ADDR_FMT "\n", dma, addr, len);
> +#endif
> +
> +    while (len) {
> +        err = dma->translate(dma, addr, &paddr, &plen,
> +                             DMA_DIRECTION_FROM_DEVICE);
> +        if (err) {
> +            return err;
> +        }
> +
> +        /* The translation might be valid for larger regions. */
> +        if (plen > len) {
> +            plen = len;
> +        }
> +
> +        cpu_physical_memory_zero(paddr, plen);
> +
> +        len -= plen;
> +        addr += plen;
> +    }
> +
> +    return 0;
> +}
> +
> +typedef struct DMAMemoryMap DMAMemoryMap;
> +struct DMAMemoryMap {
> +    dma_addr_t              addr;
> +    size_t                  len;
> +    void                    *buf;
> +    DMAInvalidateMapFunc    *invalidate;
> +    void                    *invalidate_opaque;
> +
> +    QLIST_ENTRY(DMAMemoryMap) list;
> +};
> +
> +void dma_context_init(DMAContext *dma, DMATranslateFunc fn)
> +{
> +#ifdef DEBUG_IOMMU
> +    fprintf(stderr, "dma_context_init(%p, %p)\n", dma, fn);
> +#endif
> +    dma->translate = fn;
> +    QLIST_INIT(&dma->memory_maps);
> +}
> +
> +void dma_invalidate_memory_range(DMAContext *dma,
> +                                 dma_addr_t addr, dma_addr_t len)
> +{
> +    DMAMemoryMap *map;
> +
> +    QLIST_FOREACH(map, &dma->memory_maps, list) {
> +        if (ranges_overlap(addr, len, map->addr, map->len)) {
> +            map->invalidate(map->invalidate_opaque);
> +            QLIST_REMOVE(map, list);
> +            free(map);
> +        }
> +    }
> +}
> +
> +void *__dma_memory_map(DMAContext *dma, DMAInvalidateMapFunc *cb,
> +                       void *cb_opaque, dma_addr_t addr, dma_addr_t *len,
> +                       DMADirection dir)
> +{
> +    int err;
> +    target_phys_addr_t paddr, plen;
> +    void *buf;
> +
> +    plen = *len;
> +    err = dma->translate(dma, addr, &paddr, &plen, dir);
> +    if (err) {
> +        return NULL;
> +    }
> +
> +    /*
> +     * If this is true, the virtual region is contiguous,
> +     * but the translated physical region isn't. We just
> +     * clamp *len, much like cpu_physical_memory_map() does.
> +     */
> +    if (plen < *len) {
> +        *len = plen;
> +    }
> +
> +    buf = cpu_physical_memory_map(paddr, &plen,
> +                                  dir == DMA_DIRECTION_FROM_DEVICE);
> +    *len = plen;
> +
> +    /* We treat maps as remote TLBs to cope with stuff like AIO. */
> +    if (cb) {
> +        DMAMemoryMap *map;
> +
> +        map = g_malloc(sizeof(DMAMemoryMap));
> +        map->addr = addr;
> +        map->len = *len;
> +        map->buf = buf;
> +        map->invalidate = cb;
> +        map->invalidate_opaque = cb_opaque;
> +
> +        QLIST_INSERT_HEAD(&dma->memory_maps, map, list);
> +    }
> +
> +    return buf;
> +}
> +
> +void __dma_memory_unmap(DMAContext *dma, void *buffer, dma_addr_t len,
> +                        DMADirection dir, dma_addr_t access_len)
> +{
> +    DMAMemoryMap *map;
> +
> +    cpu_physical_memory_unmap(buffer, len,
> +                              dir == DMA_DIRECTION_FROM_DEVICE,
> +                              access_len);
> +
> +    QLIST_FOREACH(map, &dma->memory_maps, list) {
> +        if ((map->buf == buffer) && (map->len == len)) {
> +            QLIST_REMOVE(map, list);
> +            free(map);
> +        }
> +    }
> +}
> +#endif /* CONFIG_IOMMU */
> diff --git a/dma.h b/dma.h
> index a66e3d7..ec06163 100644
> --- a/dma.h
> +++ b/dma.h
> @@ -15,6 +15,7 @@
>  #include "hw/hw.h"
>  #include "block.h"
>  
> +typedef struct DMAContext DMAContext;
>  typedef struct ScatterGatherEntry ScatterGatherEntry;
>  
>  typedef enum {
> @@ -31,30 +32,82 @@ struct QEMUSGList {
>  };
>  
>  #if defined(TARGET_PHYS_ADDR_BITS)
> +
> +#ifdef CONFIG_IOMMU

... which i making dma_addr_t always 64-bit.  This way you can even
remove the new qdev-dma.c file.  Just make DEFINE_PROP_DMAADDR an alias
for DEFINE_PROP_HEX64.

> +/*
> + * When an IOMMU is present, bus addresses become distinct from
> + * CPU/memory physical addresses and may be a different size.  Because
> + * the IOVA size depends more on the bus than on the platform, we more
> + * or less have to treat these as 64-bit always to cover all (or at
> + * least most) cases.
> + */
> +typedef uint64_t dma_addr_t;
> +
> +#define DMA_ADDR_BITS 64
> +#define DMA_ADDR_FMT "%" PRIx64
> +#else
>  typedef target_phys_addr_t dma_addr_t;
>  
>  #define DMA_ADDR_BITS TARGET_PHYS_ADDR_BITS
>  #define DMA_ADDR_FMT TARGET_FMT_plx
> +#endif
> +
> +typedef int DMATranslateFunc(DMAContext *dma,
> +                             dma_addr_t addr,
> +                             target_phys_addr_t *paddr,
> +                             target_phys_addr_t *len,
> +                             DMADirection dir);
> +
> +typedef struct DMAContext {
> +#ifdef CONFIG_IOMMU
> +    DMATranslateFunc *translate;
> +    QLIST_HEAD(memory_maps, DMAMemoryMap) memory_maps;
> +#endif
> +} DMAContext;
> +
> +#ifdef CONFIG_IOMMU
> +static inline bool dma_has_iommu(DMAContext *dma)
> +{
> +    return !!dma;
> +}
> +#else
> +static inline bool dma_has_iommu(DMAContext *dma)
> +{
> +    return false;
> +}
> +#endif
>  
>  typedef void DMAInvalidateMapFunc(void *);
>  
>  /* Checks that the given range of addresses is valid for DMA.  This is
>   * useful for certain cases, but usually you should just use
>   * dma_memory_{read,write}() and check for errors */
> -static inline bool dma_memory_valid(DMAContext *dma, dma_addr_t addr,
> -                                    dma_addr_t len, DMADirection dir)
> +bool __dma_memory_valid(DMAContext *dma, dma_addr_t addr, dma_addr_t len,
> +                        DMADirection dir);
> +static inline bool dma_memory_valid(DMAContext *dma,
> +                                    dma_addr_t addr, dma_addr_t len,
> +                                    DMADirection dir)
>  {
> -    /* Stub version, with no iommu we assume all bus addresses are valid */
> -    return true;
> +    if (!dma_has_iommu(dma)) {
> +        return true;
> +    } else {
> +        return __dma_memory_valid(dma, addr, len, dir);
> +    }
>  }
>  
> +int __dma_memory_rw(DMAContext *dma, dma_addr_t addr,
> +                    void *buf, dma_addr_t len, DMADirection dir);
>  static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
>                                  void *buf, dma_addr_t len, DMADirection dir)
>  {
> -    /* Stub version when we have no iommu support */
> -    cpu_physical_memory_rw(addr, buf, (target_phys_addr_t)len,
> -                           dir == DMA_DIRECTION_FROM_DEVICE);
> -    return 0;
> +    if (!dma_has_iommu(dma)) {
> +        /* Fast-path for no IOMMU */
> +        cpu_physical_memory_rw(addr, buf, len,
> +                               dir == DMA_DIRECTION_FROM_DEVICE);
> +        return 0;
> +    } else {
> +        return __dma_memory_rw(dma, addr, buf, len, dir);
> +    }
>  }
>  
>  static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
> @@ -70,35 +123,55 @@ static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
>                           DMA_DIRECTION_FROM_DEVICE);
>  }
>  
> +int __dma_memory_zero(DMAContext *dma, dma_addr_t addr, dma_addr_t len);
>  static inline int dma_memory_zero(DMAContext *dma, dma_addr_t addr,
>                                    dma_addr_t len)
>  {
> -    /* Stub version when we have no iommu support */
> -    cpu_physical_memory_zero(addr, len);
> -    return 0;
> +    if (!dma_has_iommu(dma)) {
> +        /* Fast-path for no IOMMU */
> +        cpu_physical_memory_zero(addr, len);
> +        return 0;
> +    } else {
> +        return __dma_memory_zero(dma, addr, len);
> +    }
>  }
>  
> +void *__dma_memory_map(DMAContext *dma,
> +                       DMAInvalidateMapFunc *cb, void *opaque,
> +                       dma_addr_t addr, dma_addr_t *len,
> +                       DMADirection dir);
>  static inline void *dma_memory_map(DMAContext *dma,
> -                                   DMAInvalidateMapFunc *cb, void *opaque,
> +                                   DMAInvalidateMapFunc *cb, void *cb_opaque,
>                                     dma_addr_t addr, dma_addr_t *len,
>                                     DMADirection dir)
>  {
> -    target_phys_addr_t xlen = *len;
> -    void *p;
> -
> -    p = cpu_physical_memory_map(addr, &xlen,
> -                                dir == DMA_DIRECTION_FROM_DEVICE);
> -    *len = xlen;
> -    return p;
> +    if (!dma_has_iommu(dma)) {
> +        target_phys_addr_t xlen = *len;
> +        void *p;
> +
> +        p = cpu_physical_memory_map(addr, &xlen,
> +                                    dir == DMA_DIRECTION_FROM_DEVICE);
> +        *len = xlen;
> +        return p;
> +    } else {
> +        return __dma_memory_map(dma, cb, cb_opaque, addr, len, dir);
> +    }
>  }
>  
> +void __dma_memory_unmap(DMAContext *dma,
> +                        void *buffer, dma_addr_t len,
> +                        DMADirection dir, dma_addr_t access_len);
>  static inline void dma_memory_unmap(DMAContext *dma,
>                                      void *buffer, dma_addr_t len,
>                                      DMADirection dir, dma_addr_t access_len)
>  {
> -    return cpu_physical_memory_unmap(buffer, (target_phys_addr_t)len,
> -                                     dir == DMA_DIRECTION_FROM_DEVICE,
> -                                     access_len);
> +    if (!dma_has_iommu(dma)) {
> +        return cpu_physical_memory_unmap(buffer, (target_phys_addr_t)len,
> +                                         dir == DMA_DIRECTION_FROM_DEVICE,
> +                                         access_len);
> +    } else {
> +        __dma_memory_unmap(dma, buffer, len, dir, access_len);
> +    }
>  }
>  
>  #define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \
> @@ -139,6 +212,14 @@ DEFINE_LDST_DMA(q, q, 64, be);
>  
>  #undef DEFINE_LDST_DMA
>  
> +#ifdef CONFIG_IOMMU
> +
> +void dma_context_init(DMAContext *dma, DMATranslateFunc fn);
> +void dma_invalidate_memory_range(DMAContext *dma,
> +                                 dma_addr_t addr, dma_addr_t len);

Should dma_invalidate_memory_range be changed to a no-op if dma == NULL?
 No idea, just asking.

Paolo

> +
> +#endif /* CONFIG_IOMMU */
> +
>  struct ScatterGatherEntry {
>      dma_addr_t base;
>      dma_addr_t len;

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 13/13] pseries: Implement IOMMU and DMA for PAPR PCI devices
  2012-03-09  5:01 ` [Qemu-devel] [PATCH 13/13] pseries: Implement IOMMU and DMA for PAPR PCI devices David Gibson
@ 2012-03-09 10:23   ` Paolo Bonzini
  2012-03-09 10:58     ` David Gibson
  0 siblings, 1 reply; 39+ messages in thread
From: Paolo Bonzini @ 2012-03-09 10:23 UTC (permalink / raw)
  To: David Gibson; +Cc: agraf, mst, eduard.munteanu, qemu-devel, rth

Il 09/03/2012 06:01, David Gibson ha scritto:
> Currently the pseries machine emulation does not support DMA for emulated
> PCI devices, because the PAPR spec always requires a (guest visible,
> paravirtualized) IOMMU which was not implemented.  Now that we have
> infrastructure for IOMMU emulation, we can correct this and allow PCI DMA
> for pseries.
> 
> With the existing PAPR IOMMU code used for VIO devices, this is almost
> trivial. We use a single DMAContext for each (virtual) PCI host bridge,
> which is the usual configuration on real PAPR machines (which often have
> _many_ PCI host bridges).

What about virtio?

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 13/13] pseries: Implement IOMMU and DMA for PAPR PCI devices
  2012-03-09 10:23   ` Paolo Bonzini
@ 2012-03-09 10:58     ` David Gibson
  2012-03-11  2:02       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-09 10:58 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: agraf, mst, eduard.munteanu, qemu-devel, rth

On Fri, Mar 09, 2012 at 11:23:58AM +0100, Paolo Bonzini wrote:
> Il 09/03/2012 06:01, David Gibson ha scritto:
> > Currently the pseries machine emulation does not support DMA for emulated
> > PCI devices, because the PAPR spec always requires a (guest visible,
> > paravirtualized) IOMMU which was not implemented.  Now that we have
> > infrastructure for IOMMU emulation, we can correct this and allow PCI DMA
> > for pseries.
> > 
> > With the existing PAPR IOMMU code used for VIO devices, this is almost
> > trivial. We use a single DMAContext for each (virtual) PCI host bridge,
> > which is the usual configuration on real PAPR machines (which often have
> > _many_ PCI host bridges).
> 
> What about virtio?

virtio doesn't use virtualized PCI DMA, it uses direct hypervisor
access to guest memory, by guest physical address.  It *shouldn't*,
but it does - that's the way it's specced and that's the way the guest
kernel expects it to be.  It could be fixed with a new feature bit,
but that's a project for another day.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/13] iommu: Add universal DMA helper functions
  2012-03-09 10:06   ` Paolo Bonzini
@ 2012-03-09 11:19     ` David Gibson
  2012-03-09 11:55       ` Paolo Bonzini
  0 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-09 11:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: mst, Joerg Rodel, agraf, qemu-devel, eduard.munteanu, rth

On Fri, Mar 09, 2012 at 11:06:41AM +0100, Paolo Bonzini wrote:
> Il 09/03/2012 06:01, David Gibson ha scritto:
> > +static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
> > +                                  void *buf, dma_addr_t len)
> > +{
> > +    return dma_memory_rw(dma, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
> > +}
> > +
> > +static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
> > +                                   const void *buf, dma_addr_t len)
> > +{
> > +    return dma_memory_rw(dma, addr, (void *)buf, len,
> > +                         DMA_DIRECTION_FROM_DEVICE);
> > +}
> 
> This is opposite to the convention of dma_buf_read/dma_buf_write, which
> is from the point of view of the device.

Uh, these are from the point of view of the device - the device writes
to memory == transfer from device.  More importantly, this is the same
convention as cpu_physical_memory_{read,write}() which is what these
generally replace.  dma_buf_read() is from the point of view of the
core/memory.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 05/13] iommu: Add universal DMA helper functions
  2012-03-09 11:19     ` David Gibson
@ 2012-03-09 11:55       ` Paolo Bonzini
  0 siblings, 0 replies; 39+ messages in thread
From: Paolo Bonzini @ 2012-03-09 11:55 UTC (permalink / raw)
  To: qemu-devel, mst, Joerg Rodel, agraf, eduard.munteanu, rth

Il 09/03/2012 12:19, David Gibson ha scritto:
>> > 
>> > This is opposite to the convention of dma_buf_read/dma_buf_write, which
>> > is from the point of view of the device.
> Uh, these are from the point of view of the device - the device writes
> to memory == transfer from device.

Right, dma_buf_read = "read from device".  The idea with dma_buf_read
was to make it the same as dma_bdrv_read().

> More importantly, this is the same
> convention as cpu_physical_memory_{read,write}() which is what these
> generally replace.

Understood.

My point was not really to bikeshed, just to note that probably
dma_buf_read/dma_buf_write become very confusing now.  So it is best to
remove them and just replace it with dma_buf_rw and a DMADirection
argument).  Can you do that?

Thanks,

Paolo

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 08/13] ide/ahci: Use universal DMA helper functions
  2012-03-09  9:48   ` Kevin Wolf
@ 2012-03-10  5:30     ` David Gibson
  0 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-10  5:30 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: rth, eduard.munteanu, mst, qemu-devel, agraf

On Fri, Mar 09, 2012 at 10:48:20AM +0100, Kevin Wolf wrote:
> Am 09.03.2012 06:01, schrieb David Gibson:
> > The AHCI device can provide both PCI and SysBus AHCI device
> > emulations.  For this reason, it wasn't previously converted to use
> > the pci_dma_*() helper functions.  Now that we have universal DMA
> > helper functions, this converts AHCI to use them.
> > 
> > The DMAContext is obtained from pci_dma_context() in the PCI case and
> > set to NULL in the SysBus case (i.e. we assume for now that a SysBus
> > AHCI has no IOMMU translation).
> > 
> > Cc: Kevin Wolf <kwolf@redhat.com>
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/ide/ahci.c |    7 ++++---
> >  hw/ide/ahci.h |    3 ++-
> >  hw/ide/ich.c  |    2 +-
> >  3 files changed, 7 insertions(+), 5 deletions(-)
> > 
> > diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
> > index 6a218b5..3d31179 100644
> > --- a/hw/ide/ahci.c
> > +++ b/hw/ide/ahci.c
> > @@ -668,7 +668,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist)
> >          AHCI_SG *tbl = (AHCI_SG *)prdt;
> >  
> >          /* FIXME: pass the correct DMAContext */
> > -        qemu_sglist_init(sglist, sglist_alloc_hint, NULL);
> > +        qemu_sglist_init(sglist, sglist_alloc_hint, ad->hba->dma);
> 
> Does the FIXME still apply after this change?

Ah, no, I'll fix that in the next version.  I've also spotted there
are still a bunch of cpu_physical_memory_*() calls in there that I
either missed, or which have been added since I did the conversion.
I'll tackle that next week.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 13/13] pseries: Implement IOMMU and DMA for PAPR PCI devices
  2012-03-09 10:58     ` David Gibson
@ 2012-03-11  2:02       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 39+ messages in thread
From: Benjamin Herrenschmidt @ 2012-03-11  2:02 UTC (permalink / raw)
  To: David Gibson; +Cc: mst, qemu-devel, agraf, eduard.munteanu, Paolo Bonzini, rth

On Fri, 2012-03-09 at 21:58 +1100, David Gibson wrote:
> > What about virtio?
> 
> virtio doesn't use virtualized PCI DMA, it uses direct hypervisor
> access to guest memory, by guest physical address.  It *shouldn't*,
> but it does - that's the way it's specced and that's the way the guest
> kernel expects it to be.  It could be fixed with a new feature bit,
> but that's a project for another day.

More precisely, the patch don't break virtio as virtio just bypasses
this all.

Also having virtio go through the iommu might not be such a great idea,
it should definitely remain optional. The ability of virtio to go
straight to guest memory has some significant performance advantages.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure
  2012-03-09 10:23   ` Paolo Bonzini
@ 2012-03-13  5:07     ` David Gibson
  2012-03-13 13:56       ` Alexander Graf
  0 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-13  5:07 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: agraf, mst, eduard.munteanu, qemu-devel, rth

On Fri, Mar 09, 2012 at 11:23:01AM +0100, Paolo Bonzini wrote:
> Il 09/03/2012 06:01, David Gibson ha scritto:
> > This patch adds the basic infrastructure necessary to emulate an IOMMU
> > visible to the guest.  The DMAContext structure is extended with
> > information and a callback describing the translation, and the various
> > DMA functions used by devices will now perform IOMMU translation using
> > this callback.
> > 
> > Cc: Michael S. Tsirkin <mst@redhat.com>
> > Cc: Richard Henderson <rth@twiddle.net>
> > 
> > Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  configure     |   12 ++++
> >  dma-helpers.c |  191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  dma.h         |  125 +++++++++++++++++++++++++++++++-------
> >  3 files changed, 306 insertions(+), 22 deletions(-)
> > 
> > diff --git a/configure b/configure
> > index a5eb832..e6fba2f 100755
> > --- a/configure
> > +++ b/configure
> > @@ -138,6 +138,7 @@ linux_aio=""
> >  cap_ng=""
> >  attr=""
> >  libattr=""
> > +iommu="yes"
> >  xfs=""
> >  
> >  vhost_net="no"
> > @@ -784,6 +785,10 @@ for opt do
> >    ;;
> >    --enable-vhost-net) vhost_net="yes"
> >    ;;
> > +  --enable-iommu) iommu="yes"
> > +  ;;
> > +  --disable-iommu) iommu="no"
> > +  ;;
> 
> This need not be a configure option.  Just enable it, or it will
> bitrot.

I did wonder about that.  I'd like to hear that suggested by more than
one person before I unconditionally add code which will impose an
overhead on all emulated DMAs.

[snip]
> > +
> > +#ifdef CONFIG_IOMMU
> > +bool __dma_memory_valid(DMAContext *dma, dma_addr_t addr, dma_addr_t len,
> > +                        DMADirection dir)
> 
> No __ identifiers, please.  You could call these
> dma_context_memory_{valid,read,write}.

Ok, I'll think of different names.

[snip]
> > diff --git a/dma.h b/dma.h
> > index a66e3d7..ec06163 100644
> > --- a/dma.h
> > +++ b/dma.h
> > @@ -15,6 +15,7 @@
> >  #include "hw/hw.h"
> >  #include "block.h"
> >  
> > +typedef struct DMAContext DMAContext;
> >  typedef struct ScatterGatherEntry ScatterGatherEntry;
> >  
> >  typedef enum {
> > @@ -31,30 +32,82 @@ struct QEMUSGList {
> >  };
> >  
> >  #if defined(TARGET_PHYS_ADDR_BITS)
> > +
> > +#ifdef CONFIG_IOMMU
> 
> ... which i making dma_addr_t always 64-bit.  This way you can even
> remove the new qdev-dma.c file.  Just make DEFINE_PROP_DMAADDR an alias
> for DEFINE_PROP_HEX64.

Ah, true, I could #define it to DEFINE_PROP_TADDR before this change,
too.  Ok will, revise accordingly.

[snip]
> > +#ifdef CONFIG_IOMMU
> > +
> > +void dma_context_init(DMAContext *dma, DMATranslateFunc fn);
> > +void dma_invalidate_memory_range(DMAContext *dma,
> > +                                 dma_addr_t addr, dma_addr_t len);
> 
> Should dma_invalidate_memory_range be changed to a no-op if dma == NULL?
>  No idea, just asking.

Not really.  Since dma_invalidate_memory_range() would be called from
the IOMMU implementation when translations are removed, it should
never be called with dma == NULL (if there's IOMMU code to do the
calling, then there must be a non-NULL DMAContext to describe that IOMMU).

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io
  2012-03-09 10:01   ` Paolo Bonzini
@ 2012-03-13  5:53     ` David Gibson
  0 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-13  5:53 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, mst, agraf, qemu-devel, eduard.munteanu, rth

On Fri, Mar 09, 2012 at 11:01:16AM +0100, Paolo Bonzini wrote:
> Il 09/03/2012 06:01, David Gibson ha scritto:
> > Currently dma_bdrv_io() takes a 'to_dev' boolean parameter to
> > determine the direction of DMA it is emulating.  We already have a
> > DMADirection enum designed specifically to encode DMA directions.
> > This patch uses it for dma_bdrv_io() as well.  This involves removing
> > the DMADirection definition from the #ifdef it was inside, but since that
> > only existed to protect the definition of dma_addr_t from places where
> > config.h is not included, there wasn't any reason for it to be there in
> > the first place.
> 
> Looks good, but please use this for dma_buf_* too.

Well, dma_buf_rw() is the only one that needs change, and it is
static, so the importance is not so great.  However, more importantly,
dma_buf_rw() still uses cpu_physical_memory_rw() because I missed
switching it to the universal wrappers.  I'll fix that up as well as
the DMADirection stuff in the next spin of the patch adding DMAContext
awareness to sglists and the rest.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers
  2012-03-09 10:11   ` Paolo Bonzini
@ 2012-03-13  6:37     ` David Gibson
  0 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-13  6:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Kevin Wolf, mst, agraf, qemu-devel, eduard.munteanu, rth

On Fri, Mar 09, 2012 at 11:11:41AM +0100, Paolo Bonzini wrote:
> Il 09/03/2012 06:01, David Gibson ha scritto:
[snip]
> > +static void dma_bdrv_cancel(void *opaque)
> > +{
> > +    DMAAIOCB *dbs = opaque;
> > +
> > +    bdrv_aio_cancel(dbs->acb);
> > +    dma_bdrv_unmap(dbs);
> > +    qemu_iovec_destroy(&dbs->iov);
> > +    qemu_aio_release(dbs);
> > +}
> 
> What Kevin said.  Instead of a generic callback, dma_memory_map should
> probably just receive the AIOCB (in this case &dbs->common) and call
> bdrv_aio_cancel on it.

Uh, that doesn't really make sense.  There's no guarantee that a
dma_memory_map() invocation will be associated with an AIOCB, although
that is a mjor use.

> >  static void dma_bdrv_cb(void *opaque, int ret)
> >  {
> >      DMAAIOCB *dbs = (DMAAIOCB *)opaque;
> > -    target_phys_addr_t cur_addr, cur_len;
> > +    dma_addr_t cur_addr, cur_len;
> >      void *mem;
> >  
> >      trace_dma_bdrv_cb(dbs, ret);
> > @@ -124,8 +134,8 @@ static void dma_bdrv_cb(void *opaque, int ret)
> >      while (dbs->sg_cur_index < dbs->sg->nsg) {
> >          cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
> >          cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
> > -        mem = cpu_physical_memory_map(cur_addr, &cur_len,
> > -                                      dbs->dir != DMA_DIRECTION_TO_DEVICE);
> > +        mem = dma_memory_map(dbs->sg->dma, dma_bdrv_cancel, dbs,
> > +                             cur_addr, &cur_len, dbs->dir);
> >          if (!mem)
> >              break;
> >          qemu_iovec_add(&dbs->iov, mem, cur_len);
> > dma_buf_rw should also use the DMAContext here (passing a NULL
> invalidate function).

Done in the next spin.  Although no invalidate function is needed,
NULL or otherwise, since dma_buf_rw uses plain _rw, rather than
map/unmap.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers
  2012-03-09  9:43   ` Kevin Wolf
@ 2012-03-13 10:42     ` David Gibson
  2012-03-13 11:55       ` Kevin Wolf
  0 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-13 10:42 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: rth, eduard.munteanu, mst, qemu-devel, agraf

On Fri, Mar 09, 2012 at 10:43:35AM +0100, Kevin Wolf wrote:
> Am 09.03.2012 06:01, schrieb David Gibson:
[snip]
> > @@ -104,10 +104,20 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
> >      }
> >  }
> >  
> > +static void dma_bdrv_cancel(void *opaque)
> > +{
> > +    DMAAIOCB *dbs = opaque;
> > +
> > +    bdrv_aio_cancel(dbs->acb);
> > +    dma_bdrv_unmap(dbs);
> > +    qemu_iovec_destroy(&dbs->iov);
> > +    qemu_aio_release(dbs);
> > +}
> 
> I'm lacking the context to know when this is actually called, but it
> looks suspicious. Did you consider that bdrv_aio_cancel() can actually
> invoke the completion callback?
> 
> What's the difference between the existing dma_aio_cancel() and the
> function that you need here?

So, first thing to note is that as I said in another sub-thread, there
are several approaches we could take for handling invalidation of
IOMMU mappings while they're in use by drivers, and I'm not sure which
is best yet.

Second is that this piece of code comes from Eduard - Gabriel's
original and I haven't actually understood it as well as I should have
:).

So, examining in more detail it looks like dma_aio_cancel() is the
right thing to do (I don't know if it existed when Eduard - Gabriel
wrote the initial version).

The semantics of the callback are that after it's complete, there
should be no further access to the dma_memory_map()ed memory areas.  I
haven't yet understood the bdrv stuff sufficiently well to be
completely sure that's true.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers
  2012-03-13 10:42     ` David Gibson
@ 2012-03-13 11:55       ` Kevin Wolf
  0 siblings, 0 replies; 39+ messages in thread
From: Kevin Wolf @ 2012-03-13 11:55 UTC (permalink / raw)
  To: qemu-devel, agraf, eduard.munteanu, mst, rth

Am 13.03.2012 11:42, schrieb David Gibson:
> On Fri, Mar 09, 2012 at 10:43:35AM +0100, Kevin Wolf wrote:
>> Am 09.03.2012 06:01, schrieb David Gibson:
> [snip]
>>> @@ -104,10 +104,20 @@ static void dma_complete(DMAAIOCB *dbs, int ret)
>>>      }
>>>  }
>>>  
>>> +static void dma_bdrv_cancel(void *opaque)
>>> +{
>>> +    DMAAIOCB *dbs = opaque;
>>> +
>>> +    bdrv_aio_cancel(dbs->acb);
>>> +    dma_bdrv_unmap(dbs);
>>> +    qemu_iovec_destroy(&dbs->iov);
>>> +    qemu_aio_release(dbs);
>>> +}
>>
>> I'm lacking the context to know when this is actually called, but it
>> looks suspicious. Did you consider that bdrv_aio_cancel() can actually
>> invoke the completion callback?
>>
>> What's the difference between the existing dma_aio_cancel() and the
>> function that you need here?
> 
> So, first thing to note is that as I said in another sub-thread, there
> are several approaches we could take for handling invalidation of
> IOMMU mappings while they're in use by drivers, and I'm not sure which
> is best yet.
> 
> Second is that this piece of code comes from Eduard - Gabriel's
> original and I haven't actually understood it as well as I should have
> :).
> 
> So, examining in more detail it looks like dma_aio_cancel() is the
> right thing to do (I don't know if it existed when Eduard - Gabriel
> wrote the initial version).

I'm pretty sure that it's older than these patches.

> The semantics of the callback are that after it's complete, there
> should be no further access to the dma_memory_map()ed memory areas.  I
> haven't yet understood the bdrv stuff sufficiently well to be
> completely sure that's true.

Yes, I think it is. Once bdrv_aio_cancel() returns, the block layer
doesn't do anything with the request any more. The caller (e.g. IDE)
can't use the memory areas either because it doesn't even know them.

Kevin

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure
  2012-03-13  5:07     ` David Gibson
@ 2012-03-13 13:56       ` Alexander Graf
  2012-03-13 14:04         ` David Gibson
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2012-03-13 13:56 UTC (permalink / raw)
  To: David Gibson; +Cc: Paolo Bonzini, mst, eduard.munteanu, qemu-devel, rth


On 13.03.2012, at 06:07, David Gibson wrote:

> On Fri, Mar 09, 2012 at 11:23:01AM +0100, Paolo Bonzini wrote:
>> Il 09/03/2012 06:01, David Gibson ha scritto:
>>> This patch adds the basic infrastructure necessary to emulate an IOMMU
>>> visible to the guest.  The DMAContext structure is extended with
>>> information and a callback describing the translation, and the various
>>> DMA functions used by devices will now perform IOMMU translation using
>>> this callback.
>>> 
>>> Cc: Michael S. Tsirkin <mst@redhat.com>
>>> Cc: Richard Henderson <rth@twiddle.net>
>>> 
>>> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>> ---
>>> configure     |   12 ++++
>>> dma-helpers.c |  191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> dma.h         |  125 +++++++++++++++++++++++++++++++-------
>>> 3 files changed, 306 insertions(+), 22 deletions(-)
>>> 
>>> diff --git a/configure b/configure
>>> index a5eb832..e6fba2f 100755
>>> --- a/configure
>>> +++ b/configure
>>> @@ -138,6 +138,7 @@ linux_aio=""
>>> cap_ng=""
>>> attr=""
>>> libattr=""
>>> +iommu="yes"
>>> xfs=""
>>> 
>>> vhost_net="no"
>>> @@ -784,6 +785,10 @@ for opt do
>>>   ;;
>>>   --enable-vhost-net) vhost_net="yes"
>>>   ;;
>>> +  --enable-iommu) iommu="yes"
>>> +  ;;
>>> +  --disable-iommu) iommu="no"
>>> +  ;;
>> 
>> This need not be a configure option.  Just enable it, or it will
>> bitrot.
> 
> I did wonder about that.  I'd like to hear that suggested by more than
> one person before I unconditionally add code which will impose an
> overhead on all emulated DMAs.

Maybe make it a target option? That way targets that want and need an IOMMU can compile the code in, while the others don't see performance hits. That would of course mean that we'd still have conditional code paths, but at least they wouldn't bitrot, as building multiple targets means you'd build both paths.


Alex

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure
  2012-03-13 13:56       ` Alexander Graf
@ 2012-03-13 14:04         ` David Gibson
  2012-03-13 14:37           ` Alexander Graf
  0 siblings, 1 reply; 39+ messages in thread
From: David Gibson @ 2012-03-13 14:04 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paolo Bonzini, mst, eduard.munteanu, qemu-devel, rth

On Tue, Mar 13, 2012 at 02:56:47PM +0100, Alexander Graf wrote:
> 
> On 13.03.2012, at 06:07, David Gibson wrote:
> 
> > On Fri, Mar 09, 2012 at 11:23:01AM +0100, Paolo Bonzini wrote:
> >> Il 09/03/2012 06:01, David Gibson ha scritto:
> >>> This patch adds the basic infrastructure necessary to emulate an IOMMU
> >>> visible to the guest.  The DMAContext structure is extended with
> >>> information and a callback describing the translation, and the various
> >>> DMA functions used by devices will now perform IOMMU translation using
> >>> this callback.
> >>> 
> >>> Cc: Michael S. Tsirkin <mst@redhat.com>
> >>> Cc: Richard Henderson <rth@twiddle.net>
> >>> 
> >>> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >>> ---
> >>> configure     |   12 ++++
> >>> dma-helpers.c |  191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>> dma.h         |  125 +++++++++++++++++++++++++++++++-------
> >>> 3 files changed, 306 insertions(+), 22 deletions(-)
> >>> 
> >>> diff --git a/configure b/configure
> >>> index a5eb832..e6fba2f 100755
> >>> --- a/configure
> >>> +++ b/configure
> >>> @@ -138,6 +138,7 @@ linux_aio=""
> >>> cap_ng=""
> >>> attr=""
> >>> libattr=""
> >>> +iommu="yes"
> >>> xfs=""
> >>> 
> >>> vhost_net="no"
> >>> @@ -784,6 +785,10 @@ for opt do
> >>>   ;;
> >>>   --enable-vhost-net) vhost_net="yes"
> >>>   ;;
> >>> +  --enable-iommu) iommu="yes"
> >>> +  ;;
> >>> +  --disable-iommu) iommu="no"
> >>> +  ;;
> >> 
> >> This need not be a configure option.  Just enable it, or it will
> >> bitrot.
> > 
> > I did wonder about that.  I'd like to hear that suggested by more than
> > one person before I unconditionally add code which will impose an
> > overhead on all emulated DMAs.
> 
> Maybe make it a target option? That way targets that want and need
> an IOMMU can compile the code in, while the others don't see
> performance hits. That would of course mean that we'd still have
> conditional code paths, but at least they wouldn't bitrot, as
> building multiple targets means you'd build both paths.

Heh, yeah, so I tried that way back.  It's problematic, unfortunately,
as it would require pulling a number of things out of libhw and back
into target dependent code.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure
  2012-03-13 14:04         ` David Gibson
@ 2012-03-13 14:37           ` Alexander Graf
  2012-03-14  9:05             ` David Gibson
  0 siblings, 1 reply; 39+ messages in thread
From: Alexander Graf @ 2012-03-13 14:37 UTC (permalink / raw)
  To: David Gibson; +Cc: Paolo Bonzini, mst, eduard.munteanu, qemu-devel, rth


On 13.03.2012, at 15:04, David Gibson wrote:

> On Tue, Mar 13, 2012 at 02:56:47PM +0100, Alexander Graf wrote:
>> 
>> On 13.03.2012, at 06:07, David Gibson wrote:
>> 
>>> On Fri, Mar 09, 2012 at 11:23:01AM +0100, Paolo Bonzini wrote:
>>>> Il 09/03/2012 06:01, David Gibson ha scritto:
>>>>> This patch adds the basic infrastructure necessary to emulate an IOMMU
>>>>> visible to the guest.  The DMAContext structure is extended with
>>>>> information and a callback describing the translation, and the various
>>>>> DMA functions used by devices will now perform IOMMU translation using
>>>>> this callback.
>>>>> 
>>>>> Cc: Michael S. Tsirkin <mst@redhat.com>
>>>>> Cc: Richard Henderson <rth@twiddle.net>
>>>>> 
>>>>> Signed-off-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
>>>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>>>> ---
>>>>> configure     |   12 ++++
>>>>> dma-helpers.c |  191 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>> dma.h         |  125 +++++++++++++++++++++++++++++++-------
>>>>> 3 files changed, 306 insertions(+), 22 deletions(-)
>>>>> 
>>>>> diff --git a/configure b/configure
>>>>> index a5eb832..e6fba2f 100755
>>>>> --- a/configure
>>>>> +++ b/configure
>>>>> @@ -138,6 +138,7 @@ linux_aio=""
>>>>> cap_ng=""
>>>>> attr=""
>>>>> libattr=""
>>>>> +iommu="yes"
>>>>> xfs=""
>>>>> 
>>>>> vhost_net="no"
>>>>> @@ -784,6 +785,10 @@ for opt do
>>>>>  ;;
>>>>>  --enable-vhost-net) vhost_net="yes"
>>>>>  ;;
>>>>> +  --enable-iommu) iommu="yes"
>>>>> +  ;;
>>>>> +  --disable-iommu) iommu="no"
>>>>> +  ;;
>>>> 
>>>> This need not be a configure option.  Just enable it, or it will
>>>> bitrot.
>>> 
>>> I did wonder about that.  I'd like to hear that suggested by more than
>>> one person before I unconditionally add code which will impose an
>>> overhead on all emulated DMAs.
>> 
>> Maybe make it a target option? That way targets that want and need
>> an IOMMU can compile the code in, while the others don't see
>> performance hits. That would of course mean that we'd still have
>> conditional code paths, but at least they wouldn't bitrot, as
>> building multiple targets means you'd build both paths.
> 
> Heh, yeah, so I tried that way back.  It's problematic, unfortunately,
> as it would require pulling a number of things out of libhw and back
> into target dependent code.

Hrm. How about measuring the performance overhead then? Maybe it's negligible?


Alex

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure
  2012-03-13 14:37           ` Alexander Graf
@ 2012-03-14  9:05             ` David Gibson
  0 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-14  9:05 UTC (permalink / raw)
  To: Alexander Graf; +Cc: Paolo Bonzini, mst, eduard.munteanu, qemu-devel, rth

On Tue, Mar 13, 2012 at 03:37:09PM +0100, Alexander Graf wrote:
> On 13.03.2012, at 15:04, David Gibson wrote:
> > On Tue, Mar 13, 2012 at 02:56:47PM +0100, Alexander Graf wrote:
> >> On 13.03.2012, at 06:07, David Gibson wrote:
[snip]
> >>>> This need not be a configure option.  Just enable it, or it will
> >>>> bitrot.
> >>> 
> >>> I did wonder about that.  I'd like to hear that suggested by more than
> >>> one person before I unconditionally add code which will impose an
> >>> overhead on all emulated DMAs.
> >> 
> >> Maybe make it a target option? That way targets that want and need
> >> an IOMMU can compile the code in, while the others don't see
> >> performance hits. That would of course mean that we'd still have
> >> conditional code paths, but at least they wouldn't bitrot, as
> >> building multiple targets means you'd build both paths.
> > 
> > Heh, yeah, so I tried that way back.  It's problematic, unfortunately,
> > as it would require pulling a number of things out of libhw and back
> > into target dependent code.
> 
> Hrm. How about measuring the performance overhead then? Maybe it's
> negligible?

Yeah, I should do that.  Finding a suitable testcase and running it on
a variety of targets is non-trivial, though.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [Qemu-devel] [PATCH 04/13] Implement cpu_physical_memory_zero()
  2012-03-22  2:14 [Qemu-devel] [0/13] RFC: Guest visible IOMMU David Gibson
@ 2012-03-22  2:14 ` David Gibson
  0 siblings, 0 replies; 39+ messages in thread
From: David Gibson @ 2012-03-22  2:14 UTC (permalink / raw)
  To: qemu-devel; +Cc: David Gibson

This patch adds cpu_physical_memory_zero() function.  This is equivalent to
calling cpu_physical_memory_write() with a buffer full of zeroes, but
avoids actually allocating such a buffer along the way.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 cpu-common.h |    1 +
 exec.c       |   53 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index dca5175..146429c 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -53,6 +53,7 @@ void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev);
 
 void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                             int len, int is_write);
+void cpu_physical_memory_zero(target_phys_addr_t addr, int len);
 static inline void cpu_physical_memory_read(target_phys_addr_t addr,
                                             void *buf, int len)
 {
diff --git a/exec.c b/exec.c
index 265e895..43f83be 100644
--- a/exec.c
+++ b/exec.c
@@ -3868,6 +3868,59 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
     }
 }
 
+void cpu_physical_memory_zero(target_phys_addr_t addr, int len)
+{
+    int l;
+    uint8_t *ptr;
+    target_phys_addr_t page;
+    MemoryRegionSection *section;
+
+    while (len > 0) {
+        page = addr & TARGET_PAGE_MASK;
+        l = (page + TARGET_PAGE_SIZE) - addr;
+        if (l > len)
+            l = len;
+        section = phys_page_find(page >> TARGET_PAGE_BITS);
+
+        if (!memory_region_is_ram(section->mr)) {
+            target_phys_addr_t addr1;
+            addr1 = section_addr(section, addr);
+            /* XXX: could force cpu_single_env to NULL to avoid
+               potential bugs */
+            if (l >= 4 && ((addr1 & 3) == 0)) {
+                /* 32 bit write access */
+                io_mem_write(section->mr, addr1, 0, 4);
+                l = 4;
+            } else if (l >= 2 && ((addr1 & 1) == 0)) {
+                /* 16 bit write access */
+                io_mem_write(section->mr, addr1, 0, 2);
+                l = 2;
+            } else {
+                /* 8 bit write access */
+                io_mem_write(section->mr, addr1, 0, 1);
+                l = 1;
+            }
+        } else if (!section->readonly) {
+            ram_addr_t addr1;
+            addr1 = memory_region_get_ram_addr(section->mr)
+                + section_addr(section, addr);
+            /* RAM case */
+            ptr = qemu_get_ram_ptr(addr1);
+            memset(ptr, 0, l);
+            if (!cpu_physical_memory_is_dirty(addr1)) {
+                /* invalidate code */
+                tb_invalidate_phys_page_range(addr1, addr1 + l, 0);
+                /* set dirty bit */
+                cpu_physical_memory_set_dirty_flags(
+                    addr1, (0xff & ~CODE_DIRTY_FLAG));
+            }
+            qemu_put_ram_ptr(ptr);
+        }
+        len -= l;
+        addr += l;
+    }
+}
+
 /* used for ROM loading : can write in RAM and ROM */
 void cpu_physical_memory_write_rom(target_phys_addr_t addr,
                                    const uint8_t *buf, int len)
-- 
1.7.9.1

^ permalink raw reply related	[flat|nested] 39+ messages in thread

end of thread, other threads:[~2012-03-22  2:15 UTC | newest]

Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-09  5:01 [Qemu-devel] [0/13] Implement support for guest visible IOMMUs David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 01/13] Use DMADirection type for dma_bdrv_io David Gibson
2012-03-09  9:34   ` Kevin Wolf
2012-03-09 10:01   ` Paolo Bonzini
2012-03-13  5:53     ` David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 02/13] Better support for dma_addr_t variables David Gibson
2012-03-09 10:00   ` Paolo Bonzini
2012-03-09  5:01 ` [Qemu-devel] [PATCH 03/13] usb-xhci: Use PCI DMA helper functions David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 04/13] Implement cpu_physical_memory_zero() David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 05/13] iommu: Add universal DMA helper functions David Gibson
2012-03-09 10:06   ` Paolo Bonzini
2012-03-09 11:19     ` David Gibson
2012-03-09 11:55       ` Paolo Bonzini
2012-03-09  5:01 ` [Qemu-devel] [PATCH 06/13] usb-ohci: Use " David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 07/13] iommu: Make sglists and dma_bdrv helpers use new universal DMA helpers David Gibson
2012-03-09  9:43   ` Kevin Wolf
2012-03-13 10:42     ` David Gibson
2012-03-13 11:55       ` Kevin Wolf
2012-03-09 10:11   ` Paolo Bonzini
2012-03-13  6:37     ` David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 08/13] ide/ahci: Use universal DMA helper functions David Gibson
2012-03-09  9:48   ` Kevin Wolf
2012-03-10  5:30     ` David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 09/13] usb: Convert usb_packet_{map, unmap} to universal DMA helpers David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 10/13] iommu: Introduce IOMMU emulation infrastructure David Gibson
2012-03-09 10:23   ` Paolo Bonzini
2012-03-13  5:07     ` David Gibson
2012-03-13 13:56       ` Alexander Graf
2012-03-13 14:04         ` David Gibson
2012-03-13 14:37           ` Alexander Graf
2012-03-14  9:05             ` David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 11/13] pseries: Convert sPAPR TCEs to use generic IOMMU infrastructure David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 12/13] iommu: Allow PCI to use " David Gibson
2012-03-09  5:01 ` [Qemu-devel] [PATCH 13/13] pseries: Implement IOMMU and DMA for PAPR PCI devices David Gibson
2012-03-09 10:23   ` Paolo Bonzini
2012-03-09 10:58     ` David Gibson
2012-03-11  2:02       ` Benjamin Herrenschmidt
  -- strict thread matches above, loose matches on Subject: below --
2012-03-22  2:14 [Qemu-devel] [0/13] RFC: Guest visible IOMMU David Gibson
2012-03-22  2:14 ` [Qemu-devel] [PATCH 04/13] Implement cpu_physical_memory_zero() David Gibson
2012-03-01  5:35 [Qemu-devel] [0/13] RFC: Support for guest-visible IOMMUs David Gibson
2012-03-01  5:36 ` [Qemu-devel] [PATCH 04/13] Implement cpu_physical_memory_zero() David Gibson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).