qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation
@ 2013-05-30 21:16 Paolo Bonzini
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 01/21] memory: Introduce address_space_lookup_region Paolo Bonzini
                   ` (20 more replies)
  0 siblings, 21 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel

This part includes Jan's subpage reorganization (needed to unify MMIO
and PIO dispatch) and the implementation of IOMMU regions.  Compared to
previous submissions, there is full support for 64-bit-sized IOMMU
regions, so I'm reverting the patches from part 1 that reduced the
maximum size of sections to 62 bits.

Patches 1-10 are the new ones.

Alexey Kardashevskiy (1):
  memory: give name to every AddressSpace

Avi Kivity (3):
  memory: iommu support
  vfio: abort if an emulated iommu is used
  pci: use memory core for iommu support

David Gibson (1):
  memory: Add iommu map/unmap notifiers

Jan Kiszka (4):
  memory: Introduce address_space_lookup_region
  exec: Allow unaligned address_space_rw
  exec: Resolve subpages in one step except for IOTLB fills
  exec: Implement subpage_read/write via address_space_rw

Paolo Bonzini (12):
  memory: move private types to exec.c
  exec: return MemoryRegion from address_space_translate
  Revert "memory: limit sections in the radix tree to the actual address
    space size"
  Revert "s390x: reduce TARGET_PHYS_ADDR_SPACE_BITS to 62"
  exec: reorganize mem_add to match Int128 version
  memory: make section size a 128-bit integer
  spapr: convert TCE API to use an opaque type
  spapr: make IOMMU translation go through IOMMUTLBEntry
  spapr: use memory core for iommu support
  dma: eliminate old-style IOMMU support
  spapr_vio: take care of creating our own AddressSpace/DMAContext
  dma: eliminate DMAContext

 cputlb.c                       |   4 +-
 dma-helpers.c                  | 190 +----------------
 exec.c                         | 456 ++++++++++++++++++++++-------------------
 hw/core/loader.c               |   2 +-
 hw/display/exynos4210_fimd.c   |   4 +-
 hw/display/framebuffer.c       |   3 +-
 hw/dma/pl330.c                 |   8 +-
 hw/ide/ahci.c                  |  18 +-
 hw/ide/ahci.h                  |   4 +-
 hw/ide/ich.c                   |   2 +-
 hw/ide/macio.c                 |   4 +-
 hw/misc/vfio.c                 |   6 +-
 hw/pci/pci.c                   |  44 ++--
 hw/ppc/spapr_iommu.c           | 119 ++++++-----
 hw/ppc/spapr_pci.c             |  16 +-
 hw/ppc/spapr_vio.c             |  13 +-
 hw/scsi/megasas.c              |   4 +-
 hw/scsi/virtio-scsi.c          |   2 +-
 hw/scsi/vmw_pvscsi.c           |   2 +-
 hw/sd/sdhci.c                  |  22 +-
 hw/usb/hcd-ehci-pci.c          |   4 +-
 hw/usb/hcd-ehci-sysbus.c       |   2 +-
 hw/usb/hcd-ehci.c              |  12 +-
 hw/usb/hcd-ehci.h              |   2 +-
 hw/usb/hcd-ohci.c              |  30 +--
 hw/usb/libhw.c                 |   4 +-
 hw/virtio/dataplane/hostmem.c  |   2 +-
 hw/virtio/vhost.c              |   4 +-
 hw/virtio/virtio-balloon.c     |   2 +-
 include/exec/cputlb.h          |   4 +
 include/exec/memory-internal.h |  15 --
 include/exec/memory.h          | 114 ++++++++++-
 include/hw/pci-host/spapr.h    |   3 +-
 include/hw/pci/pci.h           |  21 +-
 include/hw/pci/pci_bus.h       |   4 +-
 include/hw/ppc/spapr.h         |  12 +-
 include/hw/ppc/spapr_vio.h     |  21 +-
 include/qemu/int128.h          |  10 +
 include/sysemu/dma.h           | 144 ++++---------
 kvm-all.c                      |  23 ++-
 memory.c                       |  56 +++--
 target-s390x/cpu.h             |   5 +-
 xen-all.c                      |   6 +-
 43 files changed, 671 insertions(+), 752 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 01/21] memory: Introduce address_space_lookup_region
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 21:56   ` Richard Henderson
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 02/21] memory: move private types to exec.c Paolo Bonzini
                   ` (19 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

This introduces a wrapper for phys_page_find (before we complicate
address_space_translate with IOMMU translation).

The function will also include subpage handling.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 5b8b40d..2cd4eb3 100644
--- a/exec.c
+++ b/exec.c
@@ -203,6 +203,12 @@ bool memory_region_is_unassigned(MemoryRegion *mr)
         && mr != &io_mem_watch;
 }
 
+static MemoryRegionSection *address_space_lookup_region(AddressSpace *as,
+                                                        hwaddr addr)
+{
+    return phys_page_find(as->dispatch, addr >> TARGET_PAGE_BITS);
+}
+
 MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
                                              hwaddr *xlat, hwaddr *plen,
                                              bool is_write)
@@ -210,7 +216,7 @@ MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
     MemoryRegionSection *section;
     Int128 diff;
 
-    section = phys_page_find(as->dispatch, addr >> TARGET_PAGE_BITS);
+    section = address_space_lookup_region(as, addr);
     /* Compute offset within MemoryRegionSection */
     addr -= section->offset_within_address_space;
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 02/21] memory: move private types to exec.c
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 01/21] memory: Introduce address_space_lookup_region Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 21:57   ` Richard Henderson
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 03/21] exec: Allow unaligned address_space_rw Paolo Bonzini
                   ` (18 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                         | 16 ++++++++++++++++
 include/exec/memory-internal.h | 15 ---------------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/exec.c b/exec.c
index 2cd4eb3..bae4d30 100644
--- a/exec.c
+++ b/exec.c
@@ -81,6 +81,22 @@ int use_icount;
 
 #if !defined(CONFIG_USER_ONLY)
 
+typedef struct PhysPageEntry PhysPageEntry;
+
+struct PhysPageEntry {
+    uint16_t is_leaf:1;
+     /* index into phys_sections (is_leaf) or phys_map_nodes (!is_leaf) */
+    uint16_t ptr:15;
+};
+
+struct AddressSpaceDispatch {
+    /* This is a multi-level map on the physical address space.
+     * The bottom level has pointers to MemoryRegionSections.
+     */
+    PhysPageEntry phys_map;
+    MemoryListener listener;
+};
+
 static MemoryRegionSection *phys_sections;
 static unsigned phys_sections_nb, phys_sections_nb_alloc;
 static uint16_t phys_section_unassigned;
diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
index 799c02a..26689fe 100644
--- a/include/exec/memory-internal.h
+++ b/include/exec/memory-internal.h
@@ -22,24 +22,9 @@
 #ifndef CONFIG_USER_ONLY
 #include "hw/xen/xen.h"
 
-typedef struct PhysPageEntry PhysPageEntry;
-
-struct PhysPageEntry {
-    uint16_t is_leaf : 1;
-     /* index into phys_sections (is_leaf) or phys_map_nodes (!is_leaf) */
-    uint16_t ptr : 15;
-};
 
 typedef struct AddressSpaceDispatch AddressSpaceDispatch;
 
-struct AddressSpaceDispatch {
-    /* This is a multi-level map on the physical address space.
-     * The bottom level has pointers to MemoryRegionSections.
-     */
-    PhysPageEntry phys_map;
-    MemoryListener listener;
-};
-
 void address_space_init_dispatch(AddressSpace *as);
 void address_space_destroy_dispatch(AddressSpace *as);
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 03/21] exec: Allow unaligned address_space_rw
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 01/21] memory: Introduce address_space_lookup_region Paolo Bonzini
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 02/21] memory: move private types to exec.c Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 21:57   ` Richard Henderson
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 04/21] exec: Resolve subpages in one step except for IOTLB fills Paolo Bonzini
                   ` (17 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

This will be needed for some corner cases with para-virtual I/O ports.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index bae4d30..5556169 100644
--- a/exec.c
+++ b/exec.c
@@ -1913,12 +1913,12 @@ static inline bool memory_access_is_direct(MemoryRegion *mr, bool is_write)
     return false;
 }
 
-static inline int memory_access_size(int l, hwaddr addr)
+static inline int memory_access_size(MemoryRegion *mr, int l, hwaddr addr)
 {
-    if (l >= 4 && ((addr & 3) == 0)) {
+    if (l >= 4 && (((addr & 3) == 0 || mr->ops->impl.unaligned))) {
         return 4;
     }
-    if (l >= 2 && ((addr & 1) == 0)) {
+    if (l >= 2 && (((addr & 1) == 0) || mr->ops->impl.unaligned)) {
         return 2;
     }
     return 1;
@@ -1940,7 +1940,7 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
 
         if (is_write) {
             if (!memory_access_is_direct(section->mr, is_write)) {
-                l = memory_access_size(l, addr1);
+                l = memory_access_size(section->mr, l, addr1);
                 /* XXX: could force cpu_single_env to NULL to avoid
                    potential bugs */
                 if (l == 4) {
@@ -1966,7 +1966,7 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
         } else {
             if (!memory_access_is_direct(section->mr, is_write)) {
                 /* I/O case */
-                l = memory_access_size(l, addr1);
+                l = memory_access_size(section->mr, l, addr1);
                 if (l == 4) {
                     /* 32 bit read access */
                     error |= io_mem_read(section->mr, addr1, &val, 4);
@@ -2097,7 +2097,7 @@ bool address_space_access_valid(AddressSpace *as, hwaddr addr, int len, bool is_
         l = len;
         section = address_space_translate(as, addr, &xlat, &l, is_write);
         if (!memory_access_is_direct(section->mr, is_write)) {
-            l = memory_access_size(l, addr);
+            l = memory_access_size(section->mr, l, addr);
             if (!memory_region_access_valid(section->mr, xlat, l, is_write)) {
                 return false;
             }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 04/21] exec: Resolve subpages in one step except for IOTLB fills
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (2 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 03/21] exec: Allow unaligned address_space_rw Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 21:58   ` Richard Henderson
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 05/21] exec: Implement subpage_read/write via address_space_rw Paolo Bonzini
                   ` (16 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

Except for the case of setting the IOTLB entry in TCG mode, we can avoid
the subpage dispatching handlers and do the resolution directly on
address_space_lookup_region. An IOTLB entry describes a full page, not
only the region that the first access to a sub-divided page may return.

This patch therefore introduces a special translation function,
address_space_translate_for_iotlb, that avoids the subpage resolutions.
In contrast, callers of the existing address_space_translate service
will now always receive the terminal memory region section. This will be
important for breaking the BQL and for enabling unaligned memory region.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cputlb.c              |  4 ++--
 exec.c                | 49 ++++++++++++++++++++++++++++++++++++-------------
 include/exec/cputlb.h |  4 ++++
 3 files changed, 42 insertions(+), 15 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 86666c8..060c67d 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -256,8 +256,8 @@ void tlb_set_page(CPUArchState *env, target_ulong vaddr,
     }
 
     sz = size;
-    section = address_space_translate(&address_space_memory, paddr, &xlat, &sz,
-                                      false);
+    section = address_space_translate_for_iotlb(&address_space_memory, paddr,
+                                                &xlat, &sz);
     assert(sz >= TARGET_PAGE_SIZE);
 
 #if defined(DEBUG_TLB)
diff --git a/exec.c b/exec.c
index 5556169..ddc51a6 100644
--- a/exec.c
+++ b/exec.c
@@ -97,6 +97,13 @@ struct AddressSpaceDispatch {
     MemoryListener listener;
 };
 
+#define SUBPAGE_IDX(addr) ((addr) & ~TARGET_PAGE_MASK)
+typedef struct subpage_t {
+    MemoryRegion iomem;
+    hwaddr base;
+    uint16_t sub_section[TARGET_PAGE_SIZE];
+} subpage_t;
+
 static MemoryRegionSection *phys_sections;
 static unsigned phys_sections_nb, phys_sections_nb_alloc;
 static uint16_t phys_section_unassigned;
@@ -220,19 +227,28 @@ bool memory_region_is_unassigned(MemoryRegion *mr)
 }
 
 static MemoryRegionSection *address_space_lookup_region(AddressSpace *as,
-                                                        hwaddr addr)
+                                                        hwaddr addr,
+                                                        bool resolve_subpage)
 {
-    return phys_page_find(as->dispatch, addr >> TARGET_PAGE_BITS);
+    MemoryRegionSection *section;
+    subpage_t *subpage;
+
+    section = phys_page_find(as->dispatch, addr >> TARGET_PAGE_BITS);
+    if (resolve_subpage && section->mr->subpage) {
+        subpage = container_of(section->mr, subpage_t, iomem);
+        section = &phys_sections[subpage->sub_section[SUBPAGE_IDX(addr)]];
+    }
+    return section;
 }
 
-MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
-                                             hwaddr *xlat, hwaddr *plen,
-                                             bool is_write)
+static MemoryRegionSection *
+address_space_translate_internal(AddressSpace *as, hwaddr addr, hwaddr *xlat,
+                                 hwaddr *plen, bool resolve_subpage)
 {
     MemoryRegionSection *section;
     Int128 diff;
 
-    section = address_space_lookup_region(as, addr);
+    section = address_space_lookup_region(as, addr, resolve_subpage);
     /* Compute offset within MemoryRegionSection */
     addr -= section->offset_within_address_space;
 
@@ -243,6 +259,20 @@ MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
     *plen = MIN(int128_get64(diff), *plen);
     return section;
 }
+
+MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
+                                             hwaddr *xlat, hwaddr *plen,
+                                             bool is_write)
+{
+    return address_space_translate_internal(as, addr, xlat, plen, true);
+}
+
+MemoryRegionSection *
+address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat,
+                                  hwaddr *plen)
+{
+    return address_space_translate_internal(as, addr, xlat, plen, false);
+}
 #endif
 
 void cpu_exec_init_all(void)
@@ -697,13 +727,6 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 
 #if !defined(CONFIG_USER_ONLY)
 
-#define SUBPAGE_IDX(addr) ((addr) & ~TARGET_PAGE_MASK)
-typedef struct subpage_t {
-    MemoryRegion iomem;
-    hwaddr base;
-    uint16_t sub_section[TARGET_PAGE_SIZE];
-} subpage_t;
-
 static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
                              uint16_t section);
 static subpage_t *subpage_init(hwaddr base);
diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h
index e821660..e21cb60 100644
--- a/include/exec/cputlb.h
+++ b/include/exec/cputlb.h
@@ -32,6 +32,10 @@ extern int tlb_flush_count;
 
 /* exec.c */
 void tb_flush_jmp_cache(CPUArchState *env, target_ulong addr);
+
+MemoryRegionSection *
+address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat,
+                                  hwaddr *plen);
 hwaddr memory_region_section_get_iotlb(CPUArchState *env,
                                        MemoryRegionSection *section,
                                        target_ulong vaddr,
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 05/21] exec: Implement subpage_read/write via address_space_rw
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (3 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 04/21] exec: Resolve subpages in one step except for IOTLB fills Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 21:59   ` Richard Henderson
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 06/21] exec: return MemoryRegion from address_space_translate Paolo Bonzini
                   ` (15 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

This will allow to add support for unaligned memory regions: the subpage
container region can activate unaligned support unconditionally because
the read/write handler will now ensure that accesses are split as
required by calling address_space_rw. We can furthermore drop the
special handling of RAM subpages, address_space_rw takes care of this
already.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 125 +++++++++++++++++++++++++----------------------------------------
 1 file changed, 47 insertions(+), 78 deletions(-)

diff --git a/exec.c b/exec.c
index ddc51a6..bf374e4 100644
--- a/exec.c
+++ b/exec.c
@@ -66,7 +66,7 @@ AddressSpace address_space_memory;
 DMAContext dma_context_memory;
 
 MemoryRegion io_mem_rom, io_mem_notdirty;
-static MemoryRegion io_mem_unassigned, io_mem_subpage_ram;
+static MemoryRegion io_mem_unassigned;
 
 #endif
 
@@ -95,11 +95,13 @@ struct AddressSpaceDispatch {
      */
     PhysPageEntry phys_map;
     MemoryListener listener;
+    AddressSpace *as;
 };
 
 #define SUBPAGE_IDX(addr) ((addr) & ~TARGET_PAGE_MASK)
 typedef struct subpage_t {
     MemoryRegion iomem;
+    AddressSpace *as;
     hwaddr base;
     uint16_t sub_section[TARGET_PAGE_SIZE];
 } subpage_t;
@@ -729,7 +731,7 @@ hwaddr memory_region_section_get_iotlb(CPUArchState *env,
 
 static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
                              uint16_t section);
-static subpage_t *subpage_init(hwaddr base);
+static subpage_t *subpage_init(AddressSpace *as, hwaddr base);
 static void destroy_page_desc(uint16_t section_index)
 {
     MemoryRegionSection *section = &phys_sections[section_index];
@@ -806,7 +808,7 @@ static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *secti
     assert(existing->mr->subpage || existing->mr == &io_mem_unassigned);
 
     if (!(existing->mr->subpage)) {
-        subpage = subpage_init(base);
+        subpage = subpage_init(d->as, base);
         subsection.mr = &subpage->iomem;
         phys_page_set(d, base >> TARGET_PAGE_BITS, 1,
                       phys_section_add(&subsection));
@@ -1569,60 +1571,64 @@ static const MemoryRegionOps watch_mem_ops = {
 static uint64_t subpage_read(void *opaque, hwaddr addr,
                              unsigned len)
 {
-    subpage_t *mmio = opaque;
-    unsigned int idx = SUBPAGE_IDX(addr);
-    uint64_t val;
+    subpage_t *subpage = opaque;
+    uint8_t buf[4];
 
-    MemoryRegionSection *section;
 #if defined(DEBUG_SUBPAGE)
-    printf("%s: subpage %p len %d addr " TARGET_FMT_plx " idx %d\n", __func__,
-           mmio, len, addr, idx);
+    printf("%s: subpage %p len %d addr " TARGET_FMT_plx "\n", __func__,
+           subpage, len, addr);
 #endif
-
-    section = &phys_sections[mmio->sub_section[idx]];
-    addr += mmio->base;
-    addr -= section->offset_within_address_space;
-    addr += section->offset_within_region;
-    io_mem_read(section->mr, addr, &val, len);
-    return val;
+    address_space_read(subpage->as, addr + subpage->base, buf, len);
+    switch (len) {
+    case 1:
+        return ldub_p(buf);
+    case 2:
+        return lduw_p(buf);
+    case 4:
+        return ldl_p(buf);
+    default:
+        abort();
+    }
 }
 
 static void subpage_write(void *opaque, hwaddr addr,
                           uint64_t value, unsigned len)
 {
-    subpage_t *mmio = opaque;
-    unsigned int idx = SUBPAGE_IDX(addr);
-    MemoryRegionSection *section;
+    subpage_t *subpage = opaque;
+    uint8_t buf[4];
+
 #if defined(DEBUG_SUBPAGE)
     printf("%s: subpage %p len %d addr " TARGET_FMT_plx
-           " idx %d value %"PRIx64"\n",
-           __func__, mmio, len, addr, idx, value);
+           " value %"PRIx64"\n",
+           __func__, subpage, len, addr, value);
 #endif
-
-    section = &phys_sections[mmio->sub_section[idx]];
-    addr += mmio->base;
-    addr -= section->offset_within_address_space;
-    addr += section->offset_within_region;
-    io_mem_write(section->mr, addr, value, len);
+    switch (len) {
+    case 1:
+        stb_p(buf, value);
+        break;
+    case 2:
+        stw_p(buf, value);
+        break;
+    case 4:
+        stl_p(buf, value);
+        break;
+    default:
+        abort();
+    }
+    address_space_write(subpage->as, addr + subpage->base, buf, len);
 }
 
 static bool subpage_accepts(void *opaque, hwaddr addr,
                             unsigned size, bool is_write)
 {
-    subpage_t *mmio = opaque;
-    unsigned int idx = SUBPAGE_IDX(addr);
-    MemoryRegionSection *section;
+    subpage_t *subpage = opaque;
 #if defined(DEBUG_SUBPAGE)
-    printf("%s: subpage %p %c len %d addr " TARGET_FMT_plx
-           " idx %d\n", __func__, mmio,
-           is_write ? 'w' : 'r', len, addr, idx);
+    printf("%s: subpage %p %c len %d addr " TARGET_FMT_plx "\n",
+           __func__, subpage, is_write ? 'w' : 'r', len, addr);
 #endif
 
-    section = &phys_sections[mmio->sub_section[idx]];
-    addr += mmio->base;
-    addr -= section->offset_within_address_space;
-    addr += section->offset_within_region;
-    return memory_region_access_valid(section->mr, addr, size, is_write);
+    return address_space_access_valid(subpage->as, addr + subpage->base,
+                                      size, is_write);
 }
 
 static const MemoryRegionOps subpage_ops = {
@@ -1632,38 +1638,6 @@ static const MemoryRegionOps subpage_ops = {
     .endianness = DEVICE_NATIVE_ENDIAN,
 };
 
-static uint64_t subpage_ram_read(void *opaque, hwaddr addr,
-                                 unsigned size)
-{
-    ram_addr_t raddr = addr;
-    void *ptr = qemu_get_ram_ptr(raddr);
-    switch (size) {
-    case 1: return ldub_p(ptr);
-    case 2: return lduw_p(ptr);
-    case 4: return ldl_p(ptr);
-    default: abort();
-    }
-}
-
-static void subpage_ram_write(void *opaque, hwaddr addr,
-                              uint64_t value, unsigned size)
-{
-    ram_addr_t raddr = addr;
-    void *ptr = qemu_get_ram_ptr(raddr);
-    switch (size) {
-    case 1: return stb_p(ptr, value);
-    case 2: return stw_p(ptr, value);
-    case 4: return stl_p(ptr, value);
-    default: abort();
-    }
-}
-
-static const MemoryRegionOps subpage_ram_ops = {
-    .read = subpage_ram_read,
-    .write = subpage_ram_write,
-    .endianness = DEVICE_NATIVE_ENDIAN,
-};
-
 static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
                              uint16_t section)
 {
@@ -1677,11 +1651,6 @@ static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
     printf("%s: %p start %08x end %08x idx %08x eidx %08x mem %ld\n", __func__,
            mmio, start, end, idx, eidx, memory);
 #endif
-    if (memory_region_is_ram(phys_sections[section].mr)) {
-        MemoryRegionSection new_section = phys_sections[section];
-        new_section.mr = &io_mem_subpage_ram;
-        section = phys_section_add(&new_section);
-    }
     for (; idx <= eidx; idx++) {
         mmio->sub_section[idx] = section;
     }
@@ -1689,12 +1658,13 @@ static int subpage_register (subpage_t *mmio, uint32_t start, uint32_t end,
     return 0;
 }
 
-static subpage_t *subpage_init(hwaddr base)
+static subpage_t *subpage_init(AddressSpace *as, hwaddr base)
 {
     subpage_t *mmio;
 
     mmio = g_malloc0(sizeof(subpage_t));
 
+    mmio->as = as;
     mmio->base = base;
     memory_region_init_io(&mmio->iomem, &subpage_ops, mmio,
                           "subpage", TARGET_PAGE_SIZE);
@@ -1732,8 +1702,6 @@ static void io_mem_init(void)
                           "unassigned", UINT64_MAX);
     memory_region_init_io(&io_mem_notdirty, &notdirty_mem_ops, NULL,
                           "notdirty", UINT64_MAX);
-    memory_region_init_io(&io_mem_subpage_ram, &subpage_ram_ops, NULL,
-                          "subpage-ram", UINT64_MAX);
     memory_region_init_io(&io_mem_watch, &watch_mem_ops, NULL,
                           "watch", UINT64_MAX);
 }
@@ -1823,6 +1791,7 @@ void address_space_init_dispatch(AddressSpace *as)
         .region_nop = mem_add,
         .priority = 0,
     };
+    d->as = as;
     as->dispatch = d;
     memory_listener_register(&d->listener, as);
 }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 06/21] exec: return MemoryRegion from address_space_translate
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (4 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 05/21] exec: Implement subpage_read/write via address_space_rw Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 22:00   ` Richard Henderson
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 07/21] Revert "memory: limit sections in the radix tree to the actual address space size" Paolo Bonzini
                   ` (14 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel

Only address_space_translate_for_iotlb needs to return the section.
Every caller of address_space_translate now uses only section->mr,
return it directly.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                | 150 +++++++++++++++++++++++++-------------------------
 include/exec/memory.h |   8 +--
 2 files changed, 79 insertions(+), 79 deletions(-)

diff --git a/exec.c b/exec.c
index bf374e4..2a81b6e 100644
--- a/exec.c
+++ b/exec.c
@@ -262,11 +262,11 @@ address_space_translate_internal(AddressSpace *as, hwaddr addr, hwaddr *xlat,
     return section;
 }
 
-MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
-                                             hwaddr *xlat, hwaddr *plen,
-                                             bool is_write)
+MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
+                                      hwaddr *xlat, hwaddr *plen,
+                                      bool is_write)
 {
-    return address_space_translate_internal(as, addr, xlat, plen, true);
+    return address_space_translate_internal(as, addr, xlat, plen, true)->mr;
 }
 
 MemoryRegionSection *
@@ -1923,58 +1923,58 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
     uint8_t *ptr;
     uint64_t val;
     hwaddr addr1;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     bool error = false;
 
     while (len > 0) {
         l = len;
-        section = address_space_translate(as, addr, &addr1, &l, is_write);
+        mr = address_space_translate(as, addr, &addr1, &l, is_write);
 
         if (is_write) {
-            if (!memory_access_is_direct(section->mr, is_write)) {
-                l = memory_access_size(section->mr, l, addr1);
+            if (!memory_access_is_direct(mr, is_write)) {
+                l = memory_access_size(mr, l, addr1);
                 /* XXX: could force cpu_single_env to NULL to avoid
                    potential bugs */
                 if (l == 4) {
                     /* 32 bit write access */
                     val = ldl_p(buf);
-                    error |= io_mem_write(section->mr, addr1, val, 4);
+                    error |= io_mem_write(mr, addr1, val, 4);
                 } else if (l == 2) {
                     /* 16 bit write access */
                     val = lduw_p(buf);
-                    error |= io_mem_write(section->mr, addr1, val, 2);
+                    error |= io_mem_write(mr, addr1, val, 2);
                 } else {
                     /* 8 bit write access */
                     val = ldub_p(buf);
-                    error |= io_mem_write(section->mr, addr1, val, 1);
+                    error |= io_mem_write(mr, addr1, val, 1);
                 }
             } else {
-                addr1 += memory_region_get_ram_addr(section->mr);
+                addr1 += memory_region_get_ram_addr(mr);
                 /* RAM case */
                 ptr = qemu_get_ram_ptr(addr1);
                 memcpy(ptr, buf, l);
                 invalidate_and_set_dirty(addr1, l);
             }
         } else {
-            if (!memory_access_is_direct(section->mr, is_write)) {
+            if (!memory_access_is_direct(mr, is_write)) {
                 /* I/O case */
-                l = memory_access_size(section->mr, l, addr1);
+                l = memory_access_size(mr, l, addr1);
                 if (l == 4) {
                     /* 32 bit read access */
-                    error |= io_mem_read(section->mr, addr1, &val, 4);
+                    error |= io_mem_read(mr, addr1, &val, 4);
                     stl_p(buf, val);
                 } else if (l == 2) {
                     /* 16 bit read access */
-                    error |= io_mem_read(section->mr, addr1, &val, 2);
+                    error |= io_mem_read(mr, addr1, &val, 2);
                     stw_p(buf, val);
                 } else {
                     /* 8 bit read access */
-                    error |= io_mem_read(section->mr, addr1, &val, 1);
+                    error |= io_mem_read(mr, addr1, &val, 1);
                     stb_p(buf, val);
                 }
             } else {
                 /* RAM case */
-                ptr = qemu_get_ram_ptr(section->mr->ram_addr + addr1);
+                ptr = qemu_get_ram_ptr(mr->ram_addr + addr1);
                 memcpy(buf, ptr, l);
             }
         }
@@ -2011,18 +2011,18 @@ void cpu_physical_memory_write_rom(hwaddr addr,
     hwaddr l;
     uint8_t *ptr;
     hwaddr addr1;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
 
     while (len > 0) {
         l = len;
-        section = address_space_translate(&address_space_memory,
-                                          addr, &addr1, &l, true);
+        mr = address_space_translate(&address_space_memory,
+                                     addr, &addr1, &l, true);
 
-        if (!(memory_region_is_ram(section->mr) ||
-              memory_region_is_romd(section->mr))) {
+        if (!(memory_region_is_ram(mr) ||
+              memory_region_is_romd(mr))) {
             /* do nothing */
         } else {
-            addr1 += memory_region_get_ram_addr(section->mr);
+            addr1 += memory_region_get_ram_addr(mr);
             /* ROM/RAM case */
             ptr = qemu_get_ram_ptr(addr1);
             memcpy(ptr, buf, l);
@@ -2082,15 +2082,15 @@ static void cpu_notify_map_clients(void)
 
 bool address_space_access_valid(AddressSpace *as, hwaddr addr, int len, bool is_write)
 {
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     hwaddr l, xlat;
 
     while (len > 0) {
         l = len;
-        section = address_space_translate(as, addr, &xlat, &l, is_write);
-        if (!memory_access_is_direct(section->mr, is_write)) {
-            l = memory_access_size(section->mr, l, addr);
-            if (!memory_region_access_valid(section->mr, xlat, l, is_write)) {
+        mr = address_space_translate(as, addr, &xlat, &l, is_write);
+        if (!memory_access_is_direct(mr, is_write)) {
+            l = memory_access_size(mr, l, addr);
+            if (!memory_region_access_valid(mr, xlat, l, is_write)) {
                 return false;
             }
         }
@@ -2116,16 +2116,16 @@ void *address_space_map(AddressSpace *as,
     hwaddr len = *plen;
     hwaddr todo = 0;
     hwaddr l, xlat;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     ram_addr_t raddr = RAM_ADDR_MAX;
     ram_addr_t rlen;
     void *ret;
 
     while (len > 0) {
         l = len;
-        section = address_space_translate(as, addr, &xlat, &l, is_write);
+        mr = address_space_translate(as, addr, &xlat, &l, is_write);
 
-        if (!memory_access_is_direct(section->mr, is_write)) {
+        if (!memory_access_is_direct(mr, is_write)) {
             if (todo || bounce.buffer) {
                 break;
             }
@@ -2140,9 +2140,9 @@ void *address_space_map(AddressSpace *as,
             return bounce.buffer;
         }
         if (!todo) {
-            raddr = memory_region_get_ram_addr(section->mr) + xlat;
+            raddr = memory_region_get_ram_addr(mr) + xlat;
         } else {
-            if (memory_region_get_ram_addr(section->mr) + xlat != raddr + todo) {
+            if (memory_region_get_ram_addr(mr) + xlat != raddr + todo) {
                 break;
             }
         }
@@ -2209,15 +2209,15 @@ static inline uint32_t ldl_phys_internal(hwaddr addr,
 {
     uint8_t *ptr;
     uint64_t val;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     hwaddr l = 4;
     hwaddr addr1;
 
-    section = address_space_translate(&address_space_memory, addr, &addr1, &l,
-                                      false);
-    if (l < 4 || !memory_access_is_direct(section->mr, false)) {
+    mr = address_space_translate(&address_space_memory, addr, &addr1, &l,
+                                 false);
+    if (l < 4 || !memory_access_is_direct(mr, false)) {
         /* I/O case */
-        io_mem_read(section->mr, addr1, &val, 4);
+        io_mem_read(mr, addr1, &val, 4);
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap32(val);
@@ -2229,7 +2229,7 @@ static inline uint32_t ldl_phys_internal(hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(section->mr)
+        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
                                 & TARGET_PAGE_MASK)
                                + addr1);
         switch (endian) {
@@ -2268,15 +2268,15 @@ static inline uint64_t ldq_phys_internal(hwaddr addr,
 {
     uint8_t *ptr;
     uint64_t val;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     hwaddr l = 8;
     hwaddr addr1;
 
-    section = address_space_translate(&address_space_memory, addr, &addr1, &l,
-                                      false);
-    if (l < 8 || !memory_access_is_direct(section->mr, false)) {
+    mr = address_space_translate(&address_space_memory, addr, &addr1, &l,
+                                 false);
+    if (l < 8 || !memory_access_is_direct(mr, false)) {
         /* I/O case */
-        io_mem_read(section->mr, addr1, &val, 8);
+        io_mem_read(mr, addr1, &val, 8);
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap64(val);
@@ -2288,7 +2288,7 @@ static inline uint64_t ldq_phys_internal(hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(section->mr)
+        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
                                 & TARGET_PAGE_MASK)
                                + addr1);
         switch (endian) {
@@ -2335,15 +2335,15 @@ static inline uint32_t lduw_phys_internal(hwaddr addr,
 {
     uint8_t *ptr;
     uint64_t val;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     hwaddr l = 2;
     hwaddr addr1;
 
-    section = address_space_translate(&address_space_memory, addr, &addr1, &l,
-                                      false);
-    if (l < 2 || !memory_access_is_direct(section->mr, false)) {
+    mr = address_space_translate(&address_space_memory, addr, &addr1, &l,
+                                 false);
+    if (l < 2 || !memory_access_is_direct(mr, false)) {
         /* I/O case */
-        io_mem_read(section->mr, addr1, &val, 2);
+        io_mem_read(mr, addr1, &val, 2);
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap16(val);
@@ -2355,7 +2355,7 @@ static inline uint32_t lduw_phys_internal(hwaddr addr,
 #endif
     } else {
         /* RAM case */
-        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(section->mr)
+        ptr = qemu_get_ram_ptr((memory_region_get_ram_addr(mr)
                                 & TARGET_PAGE_MASK)
                                + addr1);
         switch (endian) {
@@ -2394,16 +2394,16 @@ uint32_t lduw_be_phys(hwaddr addr)
 void stl_phys_notdirty(hwaddr addr, uint32_t val)
 {
     uint8_t *ptr;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     hwaddr l = 4;
     hwaddr addr1;
 
-    section = address_space_translate(&address_space_memory, addr, &addr1, &l,
-                                      true);
-    if (l < 4 || !memory_access_is_direct(section->mr, true)) {
-        io_mem_write(section->mr, addr1, val, 4);
+    mr = address_space_translate(&address_space_memory, addr, &addr1, &l,
+                                 true);
+    if (l < 4 || !memory_access_is_direct(mr, true)) {
+        io_mem_write(mr, addr1, val, 4);
     } else {
-        addr1 += memory_region_get_ram_addr(section->mr) & TARGET_PAGE_MASK;
+        addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
         ptr = qemu_get_ram_ptr(addr1);
         stl_p(ptr, val);
 
@@ -2424,13 +2424,13 @@ static inline void stl_phys_internal(hwaddr addr, uint32_t val,
                                      enum device_endian endian)
 {
     uint8_t *ptr;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     hwaddr l = 4;
     hwaddr addr1;
 
-    section = address_space_translate(&address_space_memory, addr, &addr1, &l,
-                                      true);
-    if (l < 4 || !memory_access_is_direct(section->mr, true)) {
+    mr = address_space_translate(&address_space_memory, addr, &addr1, &l,
+                                 true);
+    if (l < 4 || !memory_access_is_direct(mr, true)) {
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap32(val);
@@ -2440,10 +2440,10 @@ static inline void stl_phys_internal(hwaddr addr, uint32_t val,
             val = bswap32(val);
         }
 #endif
-        io_mem_write(section->mr, addr1, val, 4);
+        io_mem_write(mr, addr1, val, 4);
     } else {
         /* RAM case */
-        addr1 += memory_region_get_ram_addr(section->mr) & TARGET_PAGE_MASK;
+        addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
         ptr = qemu_get_ram_ptr(addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
@@ -2487,13 +2487,13 @@ static inline void stw_phys_internal(hwaddr addr, uint32_t val,
                                      enum device_endian endian)
 {
     uint8_t *ptr;
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     hwaddr l = 2;
     hwaddr addr1;
 
-    section = address_space_translate(&address_space_memory, addr, &addr1, &l,
-                                      true);
-    if (l < 2 || !memory_access_is_direct(section->mr, true)) {
+    mr = address_space_translate(&address_space_memory, addr, &addr1, &l,
+                                 true);
+    if (l < 2 || !memory_access_is_direct(mr, true)) {
 #if defined(TARGET_WORDS_BIGENDIAN)
         if (endian == DEVICE_LITTLE_ENDIAN) {
             val = bswap16(val);
@@ -2503,10 +2503,10 @@ static inline void stw_phys_internal(hwaddr addr, uint32_t val,
             val = bswap16(val);
         }
 #endif
-        io_mem_write(section->mr, addr1, val, 2);
+        io_mem_write(mr, addr1, val, 2);
     } else {
         /* RAM case */
-        addr1 += memory_region_get_ram_addr(section->mr) & TARGET_PAGE_MASK;
+        addr1 += memory_region_get_ram_addr(mr) & TARGET_PAGE_MASK;
         ptr = qemu_get_ram_ptr(addr1);
         switch (endian) {
         case DEVICE_LITTLE_ENDIAN:
@@ -2608,13 +2608,13 @@ bool virtio_is_big_endian(void)
 #ifndef CONFIG_USER_ONLY
 bool cpu_physical_memory_is_io(hwaddr phys_addr)
 {
-    MemoryRegionSection *section;
+    MemoryRegion *mr;
     hwaddr l = 1;
 
-    section = address_space_translate(&address_space_memory,
-                                      phys_addr, &phys_addr, &l, false);
+    mr = address_space_translate(&address_space_memory,
+                                 phys_addr, &phys_addr, &l, false);
 
-    return !(memory_region_is_ram(section->mr) ||
-             memory_region_is_romd(section->mr));
+    return !(memory_region_is_ram(mr) ||
+             memory_region_is_romd(mr));
 }
 #endif
diff --git a/include/exec/memory.h b/include/exec/memory.h
index d53a6a1..c747f67 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -859,7 +859,7 @@ bool address_space_write(AddressSpace *as, hwaddr addr,
 bool address_space_read(AddressSpace *as, hwaddr addr, uint8_t *buf, int len);
 
 /* address_space_translate: translate an address range into an address space
- * into a MemoryRegionSection and an address range into that section
+ * into a MemoryRegion and an address range into that section
  *
  * @as: #AddressSpace to be accessed
  * @addr: address within that address space
@@ -868,9 +868,9 @@ bool address_space_read(AddressSpace *as, hwaddr addr, uint8_t *buf, int len);
  * @len: pointer to length
  * @is_write: indicates the transfer direction
  */
-MemoryRegionSection *address_space_translate(AddressSpace *as, hwaddr addr,
-                                             hwaddr *xlat, hwaddr *len,
-                                             bool is_write);
+MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
+                                      hwaddr *xlat, hwaddr *len,
+                                      bool is_write);
 
 /* address_space_access_valid: check for validity of accessing an address
  * space range
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 07/21] Revert "memory: limit sections in the radix tree to the actual address space size"
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (5 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 06/21] exec: return MemoryRegion from address_space_translate Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 22:04   ` Richard Henderson
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 08/21] Revert "s390x: reduce TARGET_PHYS_ADDR_SPACE_BITS to 62" Paolo Bonzini
                   ` (13 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel

This reverts commit 86a8623692b1b559a419a92eb8b6897c221bca74.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                | 13 +------------
 include/exec/memory.h |  3 ---
 2 files changed, 1 insertion(+), 15 deletions(-)

diff --git a/exec.c b/exec.c
index 2a81b6e..e3a08d6 100644
--- a/exec.c
+++ b/exec.c
@@ -835,21 +835,10 @@ static void register_multipage(AddressSpaceDispatch *d, MemoryRegionSection *sec
                   section_index);
 }
 
-QEMU_BUILD_BUG_ON(TARGET_PHYS_ADDR_SPACE_BITS > MAX_PHYS_ADDR_SPACE_BITS)
-
-static MemoryRegionSection limit(MemoryRegionSection section)
-{
-    section.size = MIN(section.offset_within_address_space + section.size,
-                       MAX_PHYS_ADDR + 1)
-                   - section.offset_within_address_space;
-
-    return section;
-}
-
 static void mem_add(MemoryListener *listener, MemoryRegionSection *section)
 {
     AddressSpaceDispatch *d = container_of(listener, AddressSpaceDispatch, listener);
-    MemoryRegionSection now = limit(*section), remain = limit(*section);
+    MemoryRegionSection now = *section, remain = *section;
 
     if ((now.offset_within_address_space & ~TARGET_PAGE_MASK)
         || (now.size < TARGET_PAGE_SIZE)) {
diff --git a/include/exec/memory.h b/include/exec/memory.h
index c747f67..c11a3f8 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -26,9 +26,6 @@
 #include "exec/ioport.h"
 #include "qemu/int128.h"
 
-#define MAX_PHYS_ADDR_SPACE_BITS 62
-#define MAX_PHYS_ADDR            (((hwaddr)1 << MAX_PHYS_ADDR_SPACE_BITS) - 1)
-
 typedef struct MemoryRegionOps MemoryRegionOps;
 typedef struct MemoryRegionPortio MemoryRegionPortio;
 typedef struct MemoryRegionMmio MemoryRegionMmio;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 08/21] Revert "s390x: reduce TARGET_PHYS_ADDR_SPACE_BITS to 62"
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (6 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 07/21] Revert "memory: limit sections in the radix tree to the actual address space size" Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 09/21] exec: reorganize mem_add to match Int128 version Paolo Bonzini
                   ` (12 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel

This reverts commit 311f83ca08c011b048c063c2fd3038a8957970bc.  The next
patches will implement full support for 2^64-byte regions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target-s390x/cpu.h | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
index 6304c4d..0ce82cf 100644
--- a/target-s390x/cpu.h
+++ b/target-s390x/cpu.h
@@ -34,10 +34,7 @@
 #include "exec/cpu-defs.h"
 #define TARGET_PAGE_BITS 12
 
-/* Actually 64-bits, limited by the memory API to 62 bits.  We
- * never use that much.
- */
-#define TARGET_PHYS_ADDR_SPACE_BITS 62
+#define TARGET_PHYS_ADDR_SPACE_BITS 64
 #define TARGET_VIRT_ADDR_SPACE_BITS 64
 
 #include "exec/cpu-all.h"
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 09/21] exec: reorganize mem_add to match Int128 version
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (7 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 08/21] Revert "s390x: reduce TARGET_PHYS_ADDR_SPACE_BITS to 62" Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 22:42   ` Richard Henderson
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer Paolo Bonzini
                   ` (11 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel

When adding support for 2^64-byte sections, we will have to change
the structure of mem_add to avoid failures in int128_get64.
Reorganize the code now before introducing Int128.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c | 39 ++++++++++++++++-----------------------
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/exec.c b/exec.c
index e3a08d6..cf3ea6c 100644
--- a/exec.c
+++ b/exec.c
@@ -824,15 +824,11 @@ static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *secti
 static void register_multipage(AddressSpaceDispatch *d, MemoryRegionSection *section)
 {
     hwaddr start_addr = section->offset_within_address_space;
-    ram_addr_t size = section->size;
-    hwaddr addr;
     uint16_t section_index = phys_section_add(section);
+    uint64_t num_pages = section->size >> TARGET_PAGE_BITS;
 
-    assert(size);
-
-    addr = start_addr;
-    phys_page_set(d, addr >> TARGET_PAGE_BITS, size >> TARGET_PAGE_BITS,
-                  section_index);
+    assert(num_pages);
+    phys_page_set(d, start_addr >> TARGET_PAGE_BITS, num_pages, section_index);
 }
 
 static void mem_add(MemoryListener *listener, MemoryRegionSection *section)
@@ -840,32 +836,29 @@ static void mem_add(MemoryListener *listener, MemoryRegionSection *section)
     AddressSpaceDispatch *d = container_of(listener, AddressSpaceDispatch, listener);
     MemoryRegionSection now = *section, remain = *section;
 
-    if ((now.offset_within_address_space & ~TARGET_PAGE_MASK)
-        || (now.size < TARGET_PAGE_SIZE)) {
-        now.size = MIN(TARGET_PAGE_ALIGN(now.offset_within_address_space)
-                       - now.offset_within_address_space,
-                       now.size);
+    if (now.offset_within_address_space & ~TARGET_PAGE_MASK) {
+        uint64_t left = TARGET_PAGE_ALIGN(now.offset_within_address_space)
+                       - now.offset_within_address_space;
+
+        now.size = MIN(left, now.size);
         register_subpage(d, &now);
+    } else {
+        now.size = 0;
+    }
+    while (remain.size != now.size) {
         remain.size -= now.size;
         remain.offset_within_address_space += now.size;
         remain.offset_within_region += now.size;
-    }
-    while (remain.size >= TARGET_PAGE_SIZE) {
         now = remain;
-        if (remain.offset_within_region & ~TARGET_PAGE_MASK) {
+        if (remain.size < TARGET_PAGE_SIZE) {
+            register_subpage(d, &now);
+        } else if (remain.offset_within_region & ~TARGET_PAGE_MASK) {
             now.size = TARGET_PAGE_SIZE;
             register_subpage(d, &now);
         } else {
-            now.size &= TARGET_PAGE_MASK;
+            now.size &= -TARGET_PAGE_SIZE;
             register_multipage(d, &now);
         }
-        remain.size -= now.size;
-        remain.offset_within_address_space += now.size;
-        remain.offset_within_region += now.size;
-    }
-    now = remain;
-    if (now.size) {
-        register_subpage(d, &now);
     }
 }
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (8 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 09/21] exec: reorganize mem_add to match Int128 version Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31  6:56   ` Alexey Kardashevskiy
                     ` (2 more replies)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 11/21] memory: iommu support Paolo Bonzini
                   ` (10 subsequent siblings)
  20 siblings, 3 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel

So far, the size of all regions passed to listeners could fit in 64 bits,
because artificial regions (containers and aliases) are eliminated by
the memory core, leaving only device regions which have reasonable sizes

An IOMMU however cannot be eliminated by the memory core, and may have
an artificial size, hence we may need 65 bits to represent its size.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                        | 37 +++++++++++++++++++++----------------
 hw/core/loader.c              |  2 +-
 hw/display/exynos4210_fimd.c  |  4 ++--
 hw/display/framebuffer.c      |  3 ++-
 hw/misc/vfio.c                |  4 ++--
 hw/virtio/dataplane/hostmem.c |  2 +-
 hw/virtio/vhost.c             |  4 ++--
 hw/virtio/virtio-balloon.c    |  2 +-
 include/exec/memory.h         |  5 ++++-
 include/qemu/int128.h         | 10 ++++++++++
 kvm-all.c                     | 23 ++++++++++++++---------
 memory.c                      | 14 +++++++-------
 xen-all.c                     |  6 +++---
 13 files changed, 70 insertions(+), 46 deletions(-)

diff --git a/exec.c b/exec.c
index cf3ea6c..b86f0cc 100644
--- a/exec.c
+++ b/exec.c
@@ -801,7 +801,7 @@ static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *secti
     MemoryRegionSection *existing = phys_page_find(d, base >> TARGET_PAGE_BITS);
     MemoryRegionSection subsection = {
         .offset_within_address_space = base,
-        .size = TARGET_PAGE_SIZE,
+        .size = int128_make64(TARGET_PAGE_SIZE),
     };
     hwaddr start, end;
 
@@ -816,16 +816,18 @@ static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *secti
         subpage = container_of(existing->mr, subpage_t, iomem);
     }
     start = section->offset_within_address_space & ~TARGET_PAGE_MASK;
-    end = start + section->size - 1;
+    end = start + int128_get64(section->size) - 1;
     subpage_register(subpage, start, end, phys_section_add(section));
 }
 
 
-static void register_multipage(AddressSpaceDispatch *d, MemoryRegionSection *section)
+static void register_multipage(AddressSpaceDispatch *d,
+                               MemoryRegionSection *section)
 {
     hwaddr start_addr = section->offset_within_address_space;
     uint16_t section_index = phys_section_add(section);
-    uint64_t num_pages = section->size >> TARGET_PAGE_BITS;
+    uint64_t num_pages = int128_get64(int128_rshift(section->size,
+                                                    TARGET_PAGE_BITS));
 
     assert(num_pages);
     phys_page_set(d, start_addr >> TARGET_PAGE_BITS, num_pages, section_index);
@@ -835,28 +837,29 @@ static void mem_add(MemoryListener *listener, MemoryRegionSection *section)
 {
     AddressSpaceDispatch *d = container_of(listener, AddressSpaceDispatch, listener);
     MemoryRegionSection now = *section, remain = *section;
+    Int128 page_size = int128_make64(TARGET_PAGE_SIZE);
 
     if (now.offset_within_address_space & ~TARGET_PAGE_MASK) {
         uint64_t left = TARGET_PAGE_ALIGN(now.offset_within_address_space)
                        - now.offset_within_address_space;
 
-        now.size = MIN(left, now.size);
+        now.size = int128_min(int128_make64(left), now.size);
         register_subpage(d, &now);
     } else {
-        now.size = 0;
+        now.size = int128_zero();
     }
-    while (remain.size != now.size) {
-        remain.size -= now.size;
-        remain.offset_within_address_space += now.size;
-        remain.offset_within_region += now.size;
+    while (int128_ne(remain.size, now.size)) {
+        remain.size = int128_sub(remain.size, now.size);
+        remain.offset_within_address_space += int128_get64(now.size);
+        remain.offset_within_region += int128_get64(now.size);
         now = remain;
-        if (remain.size < TARGET_PAGE_SIZE) {
+        if (int128_lt(remain.size, page_size)) {
             register_subpage(d, &now);
         } else if (remain.offset_within_region & ~TARGET_PAGE_MASK) {
-            now.size = TARGET_PAGE_SIZE;
+            now.size = page_size;
             register_subpage(d, &now);
         } else {
-            now.size &= -TARGET_PAGE_SIZE;
+            now.size = int128_and(now.size, int128_neg(page_size));
             register_multipage(d, &now);
         }
     }
@@ -1666,7 +1669,7 @@ static uint16_t dummy_section(MemoryRegion *mr)
         .mr = mr,
         .offset_within_address_space = 0,
         .offset_within_region = 0,
-        .size = UINT64_MAX,
+        .size = int128_2_64(),
     };
 
     return phys_section_add(&section);
@@ -1735,14 +1738,16 @@ static void io_region_add(MemoryListener *listener,
     mrio->mr = section->mr;
     mrio->offset = section->offset_within_region;
     iorange_init(&mrio->iorange, &memory_region_iorange_ops,
-                 section->offset_within_address_space, section->size);
+                 section->offset_within_address_space,
+                 int128_get64(section->size));
     ioport_register(&mrio->iorange);
 }
 
 static void io_region_del(MemoryListener *listener,
                           MemoryRegionSection *section)
 {
-    isa_unassign_ioport(section->offset_within_address_space, section->size);
+    isa_unassign_ioport(section->offset_within_address_space,
+                        int128_get64(section->size));
 }
 
 static MemoryListener core_memory_listener = {
diff --git a/hw/core/loader.c b/hw/core/loader.c
index 7507914..3a60cbe 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -726,7 +726,7 @@ int rom_load_all(void)
         addr  = rom->addr;
         addr += rom->romsize;
         section = memory_region_find(get_system_memory(), rom->addr, 1);
-        rom->isrom = section.size && memory_region_is_rom(section.mr);
+        rom->isrom = int128_nz(section.size) && memory_region_is_rom(section.mr);
     }
     qemu_register_reset(rom_reset, NULL);
     roms_loaded = 1;
diff --git a/hw/display/exynos4210_fimd.c b/hw/display/exynos4210_fimd.c
index 6cb5016..0da00a9 100644
--- a/hw/display/exynos4210_fimd.c
+++ b/hw/display/exynos4210_fimd.c
@@ -1133,7 +1133,7 @@ static void fimd_update_memory_section(Exynos4210fimdState *s, unsigned win)
     DPRINT_TRACE("Window %u framebuffer changed: address=0x%08x, len=0x%x\n",
             win, fb_start_addr, w->fb_len);
 
-    if (w->mem_section.size != w->fb_len ||
+    if (int128_get64(w->mem_section.size) != w->fb_len ||
             !memory_region_is_ram(w->mem_section.mr)) {
         DPRINT_ERROR("Failed to find window %u framebuffer region\n", win);
         goto error_return;
@@ -1155,7 +1155,7 @@ static void fimd_update_memory_section(Exynos4210fimdState *s, unsigned win)
 
 error_return:
     w->mem_section.mr = NULL;
-    w->mem_section.size = 0;
+    w->mem_section.size = int128_zero();
     w->host_fb_addr = NULL;
     w->fb_len = 0;
 }
diff --git a/hw/display/framebuffer.c b/hw/display/framebuffer.c
index 6be31db..49c9e59 100644
--- a/hw/display/framebuffer.c
+++ b/hw/display/framebuffer.c
@@ -54,7 +54,8 @@ void framebuffer_update_display(
     src_len = src_width * rows;
 
     mem_section = memory_region_find(address_space, base, src_len);
-    if (mem_section.size != src_len || !memory_region_is_ram(mem_section.mr)) {
+    if (int128_get64(mem_section.size) != src_len ||
+            !memory_region_is_ram(mem_section.mr)) {
         return;
     }
     mem = mem_section.mr;
diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index 693a9ff..c89676b 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -1953,7 +1953,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
     }
 
     iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
-    end = (section->offset_within_address_space + section->size) &
+    end = (section->offset_within_address_space + int128_get64(section->size)) &
           TARGET_PAGE_MASK;
 
     if (iova >= end) {
@@ -1997,7 +1997,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }
 
     iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
-    end = (section->offset_within_address_space + section->size) &
+    end = (section->offset_within_address_space + int128_get64(section->size)) &
           TARGET_PAGE_MASK;
 
     if (iova >= end) {
diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c
index 37292ff..7e46723 100644
--- a/hw/virtio/dataplane/hostmem.c
+++ b/hw/virtio/dataplane/hostmem.c
@@ -90,7 +90,7 @@ static void hostmem_append_new_region(HostMem *hostmem,
     hostmem->new_regions[num] = (HostMemRegion){
         .host_addr = ram_ptr + section->offset_within_region,
         .guest_addr = section->offset_within_address_space,
-        .size = section->size,
+        .size = int128_get64(section->size),
         .readonly = section->readonly,
     };
     hostmem->num_new_regions++;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index fbabf99..baf84ea 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -81,7 +81,7 @@ static int vhost_sync_dirty_bitmap(struct vhost_dev *dev,
         return 0;
     }
     start_addr = section->offset_within_address_space;
-    end_addr = range_get_last(start_addr, section->size);
+    end_addr = range_get_last(start_addr, int128_get64(section->size));
     start_addr = MAX(first, start_addr);
     end_addr = MIN(last, end_addr);
 
@@ -379,7 +379,7 @@ static void vhost_set_memory(MemoryListener *listener,
     struct vhost_dev *dev = container_of(listener, struct vhost_dev,
                                          memory_listener);
     hwaddr start_addr = section->offset_within_address_space;
-    ram_addr_t size = section->size;
+    ram_addr_t size = int128_get64(section->size);
     bool log_dirty = memory_region_is_logging(section->mr);
     int s = offsetof(struct vhost_memory, regions) +
         (dev->mem->nregions + 1) * sizeof dev->mem->regions[0];
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index d669756..a27051c 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -197,7 +197,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 
             /* FIXME: remove get_system_memory(), but how? */
             section = memory_region_find(get_system_memory(), pa, 1);
-            if (!section.size || !memory_region_is_ram(section.mr))
+            if (!int128_nz(section.size) || !memory_region_is_ram(section.mr)) {
                 continue;
+            }
 
             /* Using memory_region_get_ram_ptr is bending the rules a bit, but
                should be OK because we only want a single page.  */
diff --git a/include/exec/memory.h b/include/exec/memory.h
index c11a3f8..fddc6ad 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -26,6 +26,9 @@
 #include "exec/ioport.h"
 #include "qemu/int128.h"
 
+#define MAX_PHYS_ADDR_SPACE_BITS 62
+#define MAX_PHYS_ADDR            (((hwaddr)1 << MAX_PHYS_ADDR_SPACE_BITS) - 1)
+
 typedef struct MemoryRegionOps MemoryRegionOps;
 typedef struct MemoryRegionPortio MemoryRegionPortio;
 typedef struct MemoryRegionMmio MemoryRegionMmio;
@@ -185,7 +188,7 @@ struct MemoryRegionSection {
     MemoryRegion *mr;
     AddressSpace *address_space;
     hwaddr offset_within_region;
-    uint64_t size;
+    Int128 size;
     hwaddr offset_within_address_space;
     bool readonly;
 };
diff --git a/include/qemu/int128.h b/include/qemu/int128.h
index b3864b6..f094c5c 100644
--- a/include/qemu/int128.h
+++ b/include/qemu/int128.h
@@ -34,6 +34,16 @@ static inline Int128 int128_2_64(void)
     return (Int128) { 0, 1 };
 }
 
+static inline Int128 int128_and(Int128 a, Int128 b)
+{
+    return (Int128) { a.lo & b.lo, a.hi & b.hi };
+}
+
+static inline Int128 int128_rshift(Int128 a, int n)
+{
+    return (Int128) { (a.lo >> n) | (a.hi << (64 - n)), (a.hi >> n) };
+}
+
 static inline Int128 int128_add(Int128 a, Int128 b)
 {
     Int128 r = { a.lo + b.lo, a.hi + b.hi };
diff --git a/kvm-all.c b/kvm-all.c
index 8222729..86c9af3 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -316,7 +316,7 @@ static void kvm_log_start(MemoryListener *listener,
     int r;
 
     r = kvm_dirty_pages_log_change(section->offset_within_address_space,
-                                   section->size, true);
+                                   int128_get64(section->size), true);
     if (r < 0) {
         abort();
     }
@@ -328,7 +328,7 @@ static void kvm_log_stop(MemoryListener *listener,
     int r;
 
     r = kvm_dirty_pages_log_change(section->offset_within_address_space,
-                                   section->size, false);
+                                   int128_get64(section->size), false);
     if (r < 0) {
         abort();
     }
@@ -366,7 +366,8 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
     unsigned int i, j;
     unsigned long page_number, c;
     hwaddr addr, addr1;
-    unsigned int len = ((section->size / getpagesize()) + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
+    unsigned int pages = int128_get64(section->size) / getpagesize();
+    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
     unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
 
     /*
@@ -409,7 +410,7 @@ static int kvm_physical_sync_dirty_bitmap(MemoryRegionSection *section)
     KVMSlot *mem;
     int ret = 0;
     hwaddr start_addr = section->offset_within_address_space;
-    hwaddr end_addr = start_addr + section->size;
+    hwaddr end_addr = start_addr + int128_get64(section->size);
 
     d.dirty_bitmap = NULL;
     while (start_addr < end_addr) {
@@ -619,7 +620,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
     MemoryRegion *mr = section->mr;
     bool log_dirty = memory_region_is_logging(mr);
     hwaddr start_addr = section->offset_within_address_space;
-    ram_addr_t size = section->size;
+    ram_addr_t size = int128_get64(section->size);
     void *ram = NULL;
     unsigned delta;
 
@@ -811,7 +812,8 @@ static void kvm_mem_ioeventfd_add(MemoryListener *listener,
     int r;
 
     r = kvm_set_ioeventfd_mmio(fd, section->offset_within_address_space,
-                               data, true, section->size, match_data);
+                               data, true, int128_get64(section->size),
+                               match_data);
     if (r < 0) {
         abort();
     }
@@ -826,7 +828,8 @@ static void kvm_mem_ioeventfd_del(MemoryListener *listener,
     int r;
 
     r = kvm_set_ioeventfd_mmio(fd, section->offset_within_address_space,
-                               data, false, section->size, match_data);
+                               data, false, int128_get64(section->size),
+                               match_data);
     if (r < 0) {
         abort();
     }
@@ -841,7 +844,8 @@ static void kvm_io_ioeventfd_add(MemoryListener *listener,
     int r;
 
     r = kvm_set_ioeventfd_pio(fd, section->offset_within_address_space,
-                              data, true, section->size, match_data);
+                              data, true, int128_get64(section->size),
+                              match_data);
     if (r < 0) {
         abort();
     }
@@ -857,7 +861,8 @@ static void kvm_io_ioeventfd_del(MemoryListener *listener,
     int r;
 
     r = kvm_set_ioeventfd_pio(fd, section->offset_within_address_space,
-                              data, false, section->size, match_data);
+                              data, false, int128_get64(section->size),
+                              match_data);
     if (r < 0) {
         abort();
     }
diff --git a/memory.c b/memory.c
index 5cb8f4a..5be2d7b 100644
--- a/memory.c
+++ b/memory.c
@@ -152,7 +152,7 @@ static bool memory_listener_match(MemoryListener *listener,
         .mr = (fr)->mr,                                                 \
         .address_space = (as),                                          \
         .offset_within_region = (fr)->offset_in_region,                 \
-        .size = int128_get64((fr)->addr.size),                          \
+        .size = (fr)->addr.size,                                        \
         .offset_within_address_space = int128_get64((fr)->addr.start),  \
         .readonly = (fr)->readonly,                                     \
               }))
@@ -634,7 +634,7 @@ static void address_space_add_del_ioeventfds(AddressSpace *as,
             section = (MemoryRegionSection) {
                 .address_space = as,
                 .offset_within_address_space = int128_get64(fd->addr.start),
-                .size = int128_get64(fd->addr.size),
+                .size = fd->addr.size,
             };
             MEMORY_LISTENER_CALL(eventfd_del, Forward, &section,
                                  fd->match_data, fd->data, fd->e);
@@ -647,7 +647,7 @@ static void address_space_add_del_ioeventfds(AddressSpace *as,
             section = (MemoryRegionSection) {
                 .address_space = as,
                 .offset_within_address_space = int128_get64(fd->addr.start),
-                .size = int128_get64(fd->addr.size),
+                .size = fd->addr.size,
             };
             MEMORY_LISTENER_CALL(eventfd_add, Reverse, &section,
                                  fd->match_data, fd->data, fd->e);
@@ -1215,7 +1215,7 @@ static void memory_region_update_coalesced_range_as(MemoryRegion *mr, AddressSpa
             section = (MemoryRegionSection) {
                 .address_space = as,
                 .offset_within_address_space = int128_get64(fr->addr.start),
-                .size = int128_get64(fr->addr.size),
+                .size = fr->addr.size,
             };
 
             MEMORY_LISTENER_CALL(coalesced_mmio_del, Reverse, &section,
@@ -1506,7 +1506,7 @@ static FlatRange *address_space_lookup(AddressSpace *as, AddrRange addr)
 MemoryRegionSection memory_region_find(MemoryRegion *mr,
                                        hwaddr addr, uint64_t size)
 {
-    MemoryRegionSection ret = { .mr = NULL, .size = 0 };
+    MemoryRegionSection ret = { .mr = NULL };
     MemoryRegion *root;
     AddressSpace *as;
     AddrRange range;
@@ -1536,7 +1536,7 @@ MemoryRegionSection memory_region_find(MemoryRegion *mr,
     ret.offset_within_region = fr->offset_in_region;
     ret.offset_within_region += int128_get64(int128_sub(range.start,
                                                         fr->addr.start));
-    ret.size = int128_get64(range.size);
+    ret.size = range.size;
     ret.offset_within_address_space = int128_get64(range.start);
     ret.readonly = fr->readonly;
     return ret;
@@ -1584,7 +1584,7 @@ static void listener_add_address_space(MemoryListener *listener,
             .mr = fr->mr,
             .address_space = as,
             .offset_within_region = fr->offset_in_region,
-            .size = int128_get64(fr->addr.size),
+            .size = fr->addr.size,
             .offset_within_address_space = int128_get64(fr->addr.start),
             .readonly = fr->readonly,
         };
diff --git a/xen-all.c b/xen-all.c
index 539a154..cd520b1 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -418,7 +418,7 @@ static void xen_set_memory(struct MemoryListener *listener,
 {
     XenIOState *state = container_of(listener, XenIOState, memory_listener);
     hwaddr start_addr = section->offset_within_address_space;
-    ram_addr_t size = section->size;
+    ram_addr_t size = int128_get64(section->size);
     bool log_dirty = memory_region_is_logging(section->mr);
     hvmmem_type_t mem_type;
 
@@ -522,7 +522,7 @@ static void xen_log_start(MemoryListener *listener,
     XenIOState *state = container_of(listener, XenIOState, memory_listener);
 
     xen_sync_dirty_bitmap(state, section->offset_within_address_space,
-                          section->size);
+                          int128_get64(section->size));
 }
 
 static void xen_log_stop(MemoryListener *listener, MemoryRegionSection *section)
@@ -539,7 +539,7 @@ static void xen_log_sync(MemoryListener *listener, MemoryRegionSection *section)
     XenIOState *state = container_of(listener, XenIOState, memory_listener);
 
     xen_sync_dirty_bitmap(state, section->offset_within_address_space,
-                          section->size);
+                          int128_get64(section->size));
 }
 
 static void xen_log_global_start(MemoryListener *listener)
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 11/21] memory: iommu support
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (9 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer Paolo Bonzini
@ 2013-05-30 21:16 ` Paolo Bonzini
  2013-05-31 22:54   ` Richard Henderson
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 12/21] memory: Add iommu map/unmap notifiers Paolo Bonzini
                   ` (9 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: Avi Kivity

From: Avi Kivity <avi.kivity@gmail.com>

Add a new memory region type that translates addresses it is given,
then forwards them to a target address space.  This is similar to
an alias, except that the mapping is more flexible than a linear
translation and trucation, and also less efficient since the
translation happens at runtime.

The implementation uses an AddressSpace mapping the target region to
avoid hierarchical dispatch all the way to the resolved region; only
iommu regions are looked up dynamically.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[Modified to put translation in address_space_translate; assume
 IOMMUs are not reachable from TCG. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                | 35 +++++++++++++++++++++++++--
 include/exec/memory.h | 66 +++++++++++++++++++++++++++++++++++++++++++++++----
 memory.c              | 16 +++++++++++++
 3 files changed, 111 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index b86f0cc..f6716d5 100644
--- a/exec.c
+++ b/exec.c
@@ -266,14 +266,45 @@ MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
                                       hwaddr *xlat, hwaddr *plen,
                                       bool is_write)
 {
-    return address_space_translate_internal(as, addr, xlat, plen, true)->mr;
+    IOMMUTLBEntry iotlb;
+    MemoryRegionSection *section;
+    MemoryRegion *mr;
+    hwaddr len = *plen;
+
+    for (;;) {
+        section = address_space_translate_internal(as, addr, &addr, plen, true);
+        mr = section->mr;
+
+        if (!mr->iommu_ops) {
+            break;
+        }
+
+        iotlb = mr->iommu_ops->translate(mr, addr);
+        addr = ((iotlb.translated_addr & ~iotlb.addr_mask)
+                | (addr & iotlb.addr_mask));
+        len = MIN(len, (addr | iotlb.addr_mask) - addr + 1);
+        if (!(iotlb.perm & (1 << is_write))) {
+            mr = &io_mem_unassigned;
+            break;
+        }
+
+        as = iotlb.target_as;
+    }
+
+    *plen = len;
+    *xlat = addr;
+    return mr;
 }
 
 MemoryRegionSection *
 address_space_translate_for_iotlb(AddressSpace *as, hwaddr addr, hwaddr *xlat,
                                   hwaddr *plen)
 {
-    return address_space_translate_internal(as, addr, xlat, plen, false);
+    MemoryRegionSection *section;
+    section = address_space_translate_internal(as, addr, xlat, plen, false);
+
+    assert(!section->mr->iommu_ops);
+    return section;
 }
 #endif
 
diff --git a/include/exec/memory.h b/include/exec/memory.h
index fddc6ad..714e805 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -53,6 +53,24 @@ struct MemoryRegionIORange {
     hwaddr offset;
 };
 
+typedef struct IOMMUTLBEntry IOMMUTLBEntry;
+
+/* See address_space_translate: bit 0 is read, bit 1 is write.  */
+typedef enum {
+    IOMMU_NONE = 0,
+    IOMMU_RO   = 1,
+    IOMMU_WO   = 2,
+    IOMMU_RW   = 3,
+} IOMMUAccessFlags;
+
+struct IOMMUTLBEntry {
+    AddressSpace    *target_as;
+    hwaddr           iova;
+    hwaddr           translated_addr;
+    hwaddr           addr_mask;  /* 0xfff = 4k translation */
+    IOMMUAccessFlags perm;
+};
+
 /*
  * Memory region callbacks
  */
@@ -115,12 +133,20 @@ struct MemoryRegionOps {
     const MemoryRegionMmio old_mmio;
 };
 
+typedef struct MemoryRegionIOMMUOps MemoryRegionIOMMUOps;
+
+struct MemoryRegionIOMMUOps {
+    /* Return a TLB entry that contains a given address. */
+    IOMMUTLBEntry (*translate)(MemoryRegion *iommu, hwaddr addr);
+};
+
 typedef struct CoalescedMemoryRange CoalescedMemoryRange;
 typedef struct MemoryRegionIoeventfd MemoryRegionIoeventfd;
 
 struct MemoryRegion {
     /* All fields are private - violators will be prosecuted */
     const MemoryRegionOps *ops;
+    const MemoryRegionIOMMUOps *iommu_ops;
     void *opaque;
     MemoryRegion *parent;
     Int128 size;
@@ -332,6 +358,24 @@ void memory_region_init_rom_device(MemoryRegion *mr,
 void memory_region_init_reservation(MemoryRegion *mr,
                                     const char *name,
                                     uint64_t size);
+
+/**
+ * memory_region_init_iommu: Initialize a memory region that translates
+ * addresses
+ *
+ * An IOMMU region translates addresses and forwards accesses to a target
+ * memory region.
+ *
+ * @mr: the #MemoryRegion to be initialized
+ * @ops: a function that translates addresses into the @target region
+ * @name: used for debugging; not visible to the user or ABI
+ * @size: size of the region.
+ */
+void memory_region_init_iommu(MemoryRegion *mr,
+                              MemoryRegionIOMMUOps *ops,
+                              const char *name,
+                              uint64_t size);
+
 /**
  * memory_region_destroy: Destroy a memory region and reclaim all resources.
  *
@@ -371,6 +415,15 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
 }
 
 /**
+ * memory_region_is_iommu: check whether a memory region is an iommu
+ *
+ * Returns %true is a memory region is an iommu.
+ *
+ * @mr: the memory region being queried
+ */
+bool memory_region_is_iommu(MemoryRegion *mr);
+
+/**
  * memory_region_name: get a memory region's name
  *
  * Returns the string that was used to initialize the memory region.
@@ -825,7 +878,8 @@ void address_space_destroy(AddressSpace *as);
 /**
  * address_space_rw: read from or write to an address space.
  *
- * Return true if the operation hit any unassigned memory.
+ * Return true if the operation hit any unassigned memory or encountered an
+ * IOMMU fault.
  *
  * @as: #AddressSpace to be accessed
  * @addr: address within that address space
@@ -838,7 +892,8 @@ bool address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
 /**
  * address_space_write: write to address space.
  *
- * Return true if the operation hit any unassigned memory.
+ * Return true if the operation hit any unassigned memory or encountered an
+ * IOMMU fault.
  *
  * @as: #AddressSpace to be accessed
  * @addr: address within that address space
@@ -850,7 +905,8 @@ bool address_space_write(AddressSpace *as, hwaddr addr,
 /**
  * address_space_read: read from an address space.
  *
- * Return true if the operation hit any unassigned memory.
+ * Return true if the operation hit any unassigned memory or encountered an
+ * IOMMU fault.
  *
  * @as: #AddressSpace to be accessed
  * @addr: address within that address space
@@ -875,7 +931,9 @@ MemoryRegion *address_space_translate(AddressSpace *as, hwaddr addr,
 /* address_space_access_valid: check for validity of accessing an address
  * space range
  *
- * Check whether memory is assigned to the given address space range.
+ * Check whether memory is assigned to the given address space range, and
+ * access is permitted by any IOMMU regions that are active for the address
+ * space.
  *
  * For now, addr and len should be aligned to a page size.  This limitation
  * will be lifted in the future.
diff --git a/memory.c b/memory.c
index 5be2d7b..cc37ecf 100644
--- a/memory.c
+++ b/memory.c
@@ -824,6 +824,7 @@ void memory_region_init(MemoryRegion *mr,
 {
     mr->ops = &unassigned_mem_ops;
     mr->opaque = NULL;
+    mr->iommu_ops = NULL;
     mr->parent = NULL;
     mr->size = int128_make64(size);
     if (size == UINT64_MAX) {
@@ -1063,6 +1064,16 @@ void memory_region_init_rom_device(MemoryRegion *mr,
     mr->ram_addr = qemu_ram_alloc(size, mr);
 }
 
+void memory_region_init_iommu(MemoryRegion *mr,
+                              MemoryRegionIOMMUOps *ops,
+                              const char *name,
+                              uint64_t size)
+{
+    memory_region_init(mr, name, size);
+    mr->iommu_ops = ops,
+    mr->terminates = true;  /* then re-forwards */
+}
+
 void memory_region_init_reservation(MemoryRegion *mr,
                                     const char *name,
                                     uint64_t size)
@@ -1108,6 +1119,11 @@ bool memory_region_is_rom(MemoryRegion *mr)
     return mr->ram && mr->readonly;
 }
 
+bool memory_region_is_iommu(MemoryRegion *mr)
+{
+    return mr->iommu_ops;
+}
+
 void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
 {
     uint8_t mask = 1 << client;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 12/21] memory: Add iommu map/unmap notifiers
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (10 preceding siblings ...)
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 11/21] memory: iommu support Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-31 22:56   ` Richard Henderson
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 13/21] vfio: abort if an emulated iommu is used Paolo Bonzini
                   ` (8 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: David Gibson

From: David Gibson <david@gibson.dropbear.id.au>

This patch adds a NotifierList to MemoryRegions which represent IOMMUs
allowing other parts of the code to register interest in mappings or
unmappings from the IOMMU.  All IOMMU implementations will need to call
memory_region_notify_iommu() to inform those waiting on the notifier list,
whenever an IOMMU mapping is made or removed.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 include/exec/memory.h | 32 ++++++++++++++++++++++++++++++++
 memory.c              | 18 ++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/include/exec/memory.h b/include/exec/memory.h
index 714e805..c81acfd 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -25,6 +25,7 @@
 #include "exec/iorange.h"
 #include "exec/ioport.h"
 #include "qemu/int128.h"
+#include "qemu/notify.h"
 
 #define MAX_PHYS_ADDR_SPACE_BITS 62
 #define MAX_PHYS_ADDR            (((hwaddr)1 << MAX_PHYS_ADDR_SPACE_BITS) - 1)
@@ -173,6 +174,7 @@ struct MemoryRegion {
     uint8_t dirty_log_mask;
     unsigned ioeventfd_nb;
     MemoryRegionIoeventfd *ioeventfds;
+    NotifierList iommu_notify;
 };
 
 struct MemoryRegionPortio {
@@ -424,6 +426,36 @@ static inline bool memory_region_is_romd(MemoryRegion *mr)
 bool memory_region_is_iommu(MemoryRegion *mr);
 
 /**
+ * memory_region_notify_iommu: notify a change in an IOMMU translation entry.
+ *
+ * @mr: the memory region that was changed
+ * @entry: the new entry in the IOMMU translation table.  The entry
+ *         replaces all old entries for the same virtual I/O address range.
+ *         Deleted entries have .@perm == 0.
+ */
+void memory_region_notify_iommu(MemoryRegion *mr,
+                                IOMMUTLBEntry entry);
+
+/**
+ * memory_region_register_iommu_notifier: register a notifier for changes to
+ * IOMMU translation entries.
+ *
+ * @mr: the memory region to observe
+ * @n: the notifier to be added; the notifier receives a pointer to an
+ *     #IOMMUTLBEntry as the opaque value; the pointer ceases to be
+ *     valid on exit from the notifier.
+ */
+void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n);
+
+/**
+ * memory_region_unregister_iommu_notifier: unregister a notifier for
+ * changes to IOMMU translation entries.
+ *
+ * @n: the notifier to be removed.
+ */
+void memory_region_unregister_iommu_notifier(Notifier *n);
+
+/**
  * memory_region_name: get a memory region's name
  *
  * Returns the string that was used to initialize the memory region.
diff --git a/memory.c b/memory.c
index cc37ecf..802867d 100644
--- a/memory.c
+++ b/memory.c
@@ -1072,6 +1072,7 @@ void memory_region_init_iommu(MemoryRegion *mr,
     memory_region_init(mr, name, size);
     mr->iommu_ops = ops,
     mr->terminates = true;  /* then re-forwards */
+    notifier_list_init(&mr->iommu_notify);
 }
 
 void memory_region_init_reservation(MemoryRegion *mr,
@@ -1124,6 +1125,23 @@ bool memory_region_is_iommu(MemoryRegion *mr)
     return mr->iommu_ops;
 }
 
+void memory_region_register_iommu_notifier(MemoryRegion *mr, Notifier *n)
+{
+    notifier_list_add(&mr->iommu_notify, n);
+}
+
+void memory_region_unregister_iommu_notifier(Notifier *n)
+{
+    notifier_remove(n);
+}
+
+void memory_region_notify_iommu(MemoryRegion *mr,
+                                IOMMUTLBEntry entry)
+{
+    assert(memory_region_is_iommu(mr));
+    notifier_list_notify(&mr->iommu_notify, &entry);
+}
+
 void memory_region_set_log(MemoryRegion *mr, bool log, unsigned client)
 {
     uint8_t mask = 1 << client;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 13/21] vfio: abort if an emulated iommu is used
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (11 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 12/21] memory: Add iommu map/unmap notifiers Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-31 22:57   ` Richard Henderson
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 14/21] spapr: convert TCE API to use an opaque type Paolo Bonzini
                   ` (7 subsequent siblings)
  20 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Avi Kivity

From: Avi Kivity <avi.kivity@gmail.com>

vfio doesn't support guest iommus yet, indicate it to the user
by gently depositing a core on their disk.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/misc/vfio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index c89676b..52fb036 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -1939,6 +1939,8 @@ static void vfio_listener_region_add(MemoryListener *listener,
     void *vaddr;
     int ret;
 
+    assert(!memory_region_is_iommu(section->mr));
+
     if (vfio_listener_skipped_section(section)) {
         DPRINTF("SKIPPING region_add %"HWADDR_PRIx" - %"PRIx64"\n",
                 section->offset_within_address_space,
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 14/21] spapr: convert TCE API to use an opaque type
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (12 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 13/21] vfio: abort if an emulated iommu is used Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 15/21] spapr: make IOMMU translation go through IOMMUTLBEntry Paolo Bonzini
                   ` (6 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel

The TCE table is currently returned as a DMAContext, and non-type-safe
APIs are called later passing back the DMAContext.  Since we want to move
away from DMAContext, use an opaque type instead, and add an accessor
to retrieve the DMAContext from it.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/ppc/spapr_iommu.c        | 54 ++++++++++++++++++---------------------------
 hw/ppc/spapr_pci.c          |  8 +++----
 hw/ppc/spapr_vio.c          | 13 ++++++-----
 include/hw/pci-host/spapr.h |  2 +-
 include/hw/ppc/spapr.h      | 12 +++++-----
 include/hw/ppc/spapr_vio.h  |  1 +
 6 files changed, 42 insertions(+), 48 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index e1fe941..7a507e0 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -36,8 +36,6 @@ enum sPAPRTCEAccess {
     SPAPR_TCE_RW = 3,
 };
 
-typedef struct sPAPRTCETable sPAPRTCETable;
-
 struct sPAPRTCETable {
     DMAContext dma;
     uint32_t liobn;
@@ -122,7 +120,7 @@ static int spapr_tce_translate(DMAContext *dma,
     return 0;
 }
 
-DMAContext *spapr_tce_new_dma_context(uint32_t liobn, size_t window_size)
+sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size)
 {
     sPAPRTCETable *tcet;
 
@@ -155,43 +153,40 @@ DMAContext *spapr_tce_new_dma_context(uint32_t liobn, size_t window_size)
     }
 
 #ifdef DEBUG_TCE
-    fprintf(stderr, "spapr_iommu: New TCE table, liobn=0x%x, context @ %p, "
-            "table @ %p, fd=%d\n", liobn, &tcet->dma, tcet->table, tcet->fd);
+    fprintf(stderr, "spapr_iommu: New TCE table @ %p, liobn=0x%x, "
+            "table @ %p, fd=%d\n", tcet, liobn, tcet->table, tcet->fd);
 #endif
 
     QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
 
-    return &tcet->dma;
+    return tcet;
 }
 
-void spapr_tce_free(DMAContext *dma)
+void spapr_tce_free(sPAPRTCETable *tcet)
 {
+    QLIST_REMOVE(tcet, list);
 
-    if (dma) {
-        sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
-
-        QLIST_REMOVE(tcet, list);
-
-        if (!kvm_enabled() ||
-            (kvmppc_remove_spapr_tce(tcet->table, tcet->fd,
-                                     tcet->window_size) != 0)) {
-            g_free(tcet->table);
-        }
-
-        g_free(tcet);
+    if (!kvm_enabled() ||
+        (kvmppc_remove_spapr_tce(tcet->table, tcet->fd,
+                                 tcet->window_size) != 0)) {
+        g_free(tcet->table);
     }
+
+    g_free(tcet);
 }
 
-void spapr_tce_set_bypass(DMAContext *dma, bool bypass)
+DMAContext *spapr_tce_get_dma(sPAPRTCETable *tcet)
 {
-    sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
+    return &tcet->dma;
+}
 
+void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass)
+{
     tcet->bypass = bypass;
 }
 
-void spapr_tce_reset(DMAContext *dma)
+void spapr_tce_reset(sPAPRTCETable *tcet)
 {
-    sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
     size_t table_size = (tcet->window_size >> SPAPR_TCE_PAGE_SHIFT)
         * sizeof(sPAPRTCE);
 
@@ -277,17 +272,12 @@ int spapr_dma_dt(void *fdt, int node_off, const char *propname,
 }
 
 int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
-                      DMAContext *iommu)
+                      sPAPRTCETable *tcet)
 {
-    if (!iommu) {
+    if (!tcet) {
         return 0;
     }
 
-    if (iommu->translate == spapr_tce_translate) {
-        sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, iommu);
-        return spapr_dma_dt(fdt, node_off, propname,
-                tcet->liobn, 0, tcet->window_size);
-    }
-
-    return -1;
+    return spapr_dma_dt(fdt, node_off, propname,
+                        tcet->liobn, 0, tcet->window_size);
 }
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 62ff323..eb64a8f 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -511,7 +511,7 @@ static DMAContext *spapr_pci_dma_context_fn(PCIBus *bus, void *opaque,
 {
     sPAPRPHBState *phb = opaque;
 
-    return phb->dma;
+    return spapr_tce_get_dma(phb->tcet);
 }
 
 static int spapr_phb_init(SysBusDevice *s)
@@ -646,8 +646,8 @@ static int spapr_phb_init(SysBusDevice *s)
 
     sphb->dma_window_start = 0;
     sphb->dma_window_size = 0x40000000;
-    sphb->dma = spapr_tce_new_dma_context(sphb->dma_liobn, sphb->dma_window_size);
-    if (!sphb->dma) {
+    sphb->tcet = spapr_tce_new_table(sphb->dma_liobn, sphb->dma_window_size);
+    if (!sphb->tcet) {
         fprintf(stderr, "Unable to create TCE table for %s\n", sphb->dtbusname);
         return -1;
     }
@@ -676,7 +676,7 @@ static void spapr_phb_reset(DeviceState *qdev)
     sPAPRPHBState *sphb = SPAPR_PCI_HOST_BRIDGE(s);
 
     /* Reset the IOMMU state */
-    spapr_tce_reset(sphb->dma);
+    spapr_tce_reset(sphb->tcet);
 }
 
 static Property spapr_phb_properties[] = {
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index 1405c32..a06ac94 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -145,7 +145,7 @@ static int vio_make_devnode(VIOsPAPRDevice *dev,
         }
     }
 
-    ret = spapr_tcet_dma_dt(fdt, node_off, "ibm,my-dma-window", dev->dma);
+    ret = spapr_tcet_dma_dt(fdt, node_off, "ibm,my-dma-window", dev->tcet);
     if (ret < 0) {
         return ret;
     }
@@ -319,8 +319,8 @@ int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq)
 
 static void spapr_vio_quiesce_one(VIOsPAPRDevice *dev)
 {
-    if (dev->dma) {
-        spapr_tce_reset(dev->dma);
+    if (dev->tcet) {
+        spapr_tce_reset(dev->tcet);
     }
     free_crq(dev);
 }
@@ -345,12 +345,12 @@ static void rtas_set_tce_bypass(sPAPREnvironment *spapr, uint32_t token,
         return;
     }
 
-    if (!dev->dma) {
+    if (!dev->tcet) {
         rtas_st(rets, 0, -3);
         return;
     }
 
-    spapr_tce_set_bypass(dev->dma, !!enable);
+    spapr_tce_set_bypass(dev->tcet, !!enable);
 
     rtas_st(rets, 0, 0);
 }
@@ -457,7 +457,8 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
 
     if (pc->rtce_window_size) {
         uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev->reg;
-        dev->dma = spapr_tce_new_dma_context(liobn, pc->rtce_window_size);
+        dev->tcet = spapr_tce_new_table(liobn, pc->rtce_window_size);
+        dev->dma = spapr_tce_get_dma(dev->tcet);
     }
 
     return pc->init(dev);
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index b21080c..653dd40 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -49,7 +49,7 @@ typedef struct sPAPRPHBState {
     uint32_t dma_liobn;
     uint64_t dma_window_start;
     uint64_t dma_window_size;
-    DMAContext *dma;
+    sPAPRTCETable *tcet;
 
     struct {
         uint32_t irq;
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 864bee9..e8d617b 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -342,17 +342,19 @@ typedef struct sPAPRTCE {
 
 #define RTAS_ERROR_LOG_MAX      2048
 
+typedef struct sPAPRTCETable sPAPRTCETable;
 
 void spapr_iommu_init(void);
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
-DMAContext *spapr_tce_new_dma_context(uint32_t liobn, size_t window_size);
-void spapr_tce_free(DMAContext *dma);
-void spapr_tce_reset(DMAContext *dma);
-void spapr_tce_set_bypass(DMAContext *dma, bool bypass);
+sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size);
+DMAContext *spapr_tce_get_dma(sPAPRTCETable *tcet);
+void spapr_tce_free(sPAPRTCETable *tcet);
+void spapr_tce_reset(sPAPRTCETable *tcet);
+void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass);
 int spapr_dma_dt(void *fdt, int node_off, const char *propname,
                  uint32_t liobn, uint64_t window, uint32_t size);
 int spapr_tcet_dma_dt(void *fdt, int node_off, const char *propname,
-                      DMAContext *dma);
+                      sPAPRTCETable *tcet);
 
 #endif /* !defined (__HW_SPAPR_H__) */
diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
index f98ec0a..56f2821 100644
--- a/include/hw/ppc/spapr_vio.h
+++ b/include/hw/ppc/spapr_vio.h
@@ -63,6 +63,7 @@ struct VIOsPAPRDevice {
     uint32_t irq;
     target_ulong signal_state;
     VIOsPAPR_CRQ crq;
+    sPAPRTCETable *tcet;
     DMAContext *dma;
 };
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 15/21] spapr: make IOMMU translation go through IOMMUTLBEntry
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (13 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 14/21] spapr: convert TCE API to use an opaque type Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 16/21] spapr: use memory core for iommu support Paolo Bonzini
                   ` (5 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel

The next step is to introduce the translation code that will be used for
IOMMU MemoryRegions, but still do the actual translation in a DMAContext.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/ppc/spapr_iommu.c | 60 ++++++++++++++++++++++++++++++----------------------
 1 file changed, 35 insertions(+), 25 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 7a507e0..cf5ccb1 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -68,15 +68,8 @@ static sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn)
     return NULL;
 }
 
-static int spapr_tce_translate(DMAContext *dma,
-                               dma_addr_t addr,
-                               hwaddr *paddr,
-                               hwaddr *len,
-                               DMADirection dir)
+static IOMMUTLBEntry spapr_tce_translate_iommu(sPAPRTCETable *tcet, hwaddr addr)
 {
-    sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
-    enum sPAPRTCEAccess access = (dir == DMA_DIRECTION_FROM_DEVICE)
-        ? SPAPR_TCE_WO : SPAPR_TCE_RO;
     uint64_t tce;
 
 #ifdef DEBUG_TCE
@@ -85,9 +78,13 @@ static int spapr_tce_translate(DMAContext *dma,
 #endif
 
     if (tcet->bypass) {
-        *paddr = addr;
-        *len = (hwaddr)-1;
-        return 0;
+        return (IOMMUTLBEntry) {
+            .target_as = &address_space_memory,
+            .iova = 0,
+            .translated_addr = 0,
+            .addr_mask = ~(hwaddr)0,
+            .perm = IOMMU_RW,
+        };
     }
 
     /* Check if we are in bound */
@@ -95,28 +92,41 @@ static int spapr_tce_translate(DMAContext *dma,
 #ifdef DEBUG_TCE
         fprintf(stderr, "spapr_tce_translate out of bounds\n");
 #endif
-        return -EFAULT;
+        return (IOMMUTLBEntry) { .perm = IOMMU_NONE };
     }
 
     tce = tcet->table[addr >> SPAPR_TCE_PAGE_SHIFT].tce;
 
-    /* Check TCE */
-    if (!(tce & access)) {
+#ifdef DEBUG_TCE
+    fprintf(stderr, " ->  *paddr=0x%llx, *len=0x%llx\n",
+            (tce & ~SPAPR_TCE_PAGE_MASK), SPAPR_TCE_PAGE_MASK + 1);
+#endif
+
+    return (IOMMUTLBEntry) {
+        .target_as = &address_space_memory,
+        .iova = addr & ~SPAPR_TCE_PAGE_MASK,
+        .translated_addr = tce & ~SPAPR_TCE_PAGE_MASK,
+        .addr_mask = SPAPR_TCE_PAGE_MASK,
+        .perm = tce,
+    };
+}
+
+static int spapr_tce_translate(DMAContext *dma,
+                               dma_addr_t addr,
+                               hwaddr *paddr,
+                               hwaddr *len,
+                               DMADirection dir)
+ {
+    sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
+    bool is_write = (dir == DMA_DIRECTION_FROM_DEVICE);
+    IOMMUTLBEntry entry = spapr_tce_translate_iommu(tcet, addr);
+    if (!(entry.perm & (1 << is_write))) {
         return -EPERM;
     }
 
-    /* How much til end of page ? */
-    *len = ((~addr) & SPAPR_TCE_PAGE_MASK) + 1;
-
     /* Translate */
-    *paddr = (tce & ~SPAPR_TCE_PAGE_MASK) |
-        (addr & SPAPR_TCE_PAGE_MASK);
-
-#ifdef DEBUG_TCE
-    fprintf(stderr, " ->  *paddr=0x" TARGET_FMT_plx ", *len=0x"
-            TARGET_FMT_plx "\n", *paddr, *len);
-#endif
-
+    *paddr = entry.translated_addr | (addr & entry.addr_mask);
+    *len = (addr | entry.addr_mask) - addr + 1;
     return 0;
 }
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 16/21] spapr: use memory core for iommu support
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (14 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 15/21] spapr: make IOMMU translation go through IOMMUTLBEntry Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 17/21] dma: eliminate old-style IOMMU support Paolo Bonzini
                   ` (4 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel

Now we can stop using a "translating" DMAContext, but we do not yet modify
the sPAPRTCETable users to get an AddressSpace; they keep using the table
via a DMAContext.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/ppc/spapr_iommu.c   | 48 +++++++++++++++++++++++++++---------------------
 include/hw/ppc/spapr.h |  1 +
 2 files changed, 28 insertions(+), 21 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index cf5ccb1..6e33929 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -37,12 +37,16 @@ enum sPAPRTCEAccess {
 };
 
 struct sPAPRTCETable {
+    /* temporary until everyone has its own AddressSpace */
     DMAContext dma;
+    AddressSpace as;
+
     uint32_t liobn;
     uint32_t window_size;
     sPAPRTCE *table;
     bool bypass;
     int fd;
+    MemoryRegion iommu;
     QLIST_ENTRY(sPAPRTCETable) list;
 };
 
@@ -68,8 +72,9 @@ static sPAPRTCETable *spapr_tce_find_by_liobn(uint32_t liobn)
     return NULL;
 }
 
-static IOMMUTLBEntry spapr_tce_translate_iommu(sPAPRTCETable *tcet, hwaddr addr)
+static IOMMUTLBEntry spapr_tce_translate_iommu(MemoryRegion *iommu, hwaddr addr)
 {
+    sPAPRTCETable *tcet = container_of(iommu, sPAPRTCETable, iommu);
     uint64_t tce;
 
 #ifdef DEBUG_TCE
@@ -111,24 +116,9 @@ static IOMMUTLBEntry spapr_tce_translate_iommu(sPAPRTCETable *tcet, hwaddr addr)
     };
 }
 
-static int spapr_tce_translate(DMAContext *dma,
-                               dma_addr_t addr,
-                               hwaddr *paddr,
-                               hwaddr *len,
-                               DMADirection dir)
- {
-    sPAPRTCETable *tcet = DO_UPCAST(sPAPRTCETable, dma, dma);
-    bool is_write = (dir == DMA_DIRECTION_FROM_DEVICE);
-    IOMMUTLBEntry entry = spapr_tce_translate_iommu(tcet, addr);
-    if (!(entry.perm & (1 << is_write))) {
-        return -EPERM;
-    }
-
-    /* Translate */
-    *paddr = entry.translated_addr | (addr & entry.addr_mask);
-    *len = (addr | entry.addr_mask) - addr + 1;
-    return 0;
-}
+static MemoryRegionIOMMUOps spapr_iommu_ops = {
+    .translate = spapr_tce_translate_iommu,
+};
 
 sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size)
 {
@@ -145,8 +135,6 @@ sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size)
     }
 
     tcet = g_malloc0(sizeof(*tcet));
-    dma_context_init(&tcet->dma, &address_space_memory, spapr_tce_translate, NULL, NULL);
-
     tcet->liobn = liobn;
     tcet->window_size = window_size;
 
@@ -167,6 +155,11 @@ sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size)
             "table @ %p, fd=%d\n", tcet, liobn, tcet->table, tcet->fd);
 #endif
 
+    memory_region_init_iommu(&tcet->iommu, &spapr_iommu_ops,
+                             "iommu-spapr", UINT64_MAX);
+    address_space_init(&tcet->as, &tcet->iommu);
+    dma_context_init(&tcet->dma, &tcet->as, NULL, NULL, NULL);
+
     QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
 
     return tcet;
@@ -190,6 +183,11 @@ DMAContext *spapr_tce_get_dma(sPAPRTCETable *tcet)
     return &tcet->dma;
 }
 
+MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet)
+{
+    return &tcet->iommu;
+}
+
 void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass)
 {
     tcet->bypass = bypass;
@@ -208,6 +206,7 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba,
                                 target_ulong tce)
 {
     sPAPRTCE *tcep;
+    IOMMUTLBEntry entry;
 
     if (ioba >= tcet->window_size) {
         hcall_dprintf("spapr_vio_put_tce on out-of-bounds IOBA 0x"
@@ -218,6 +217,13 @@ static target_ulong put_tce_emu(sPAPRTCETable *tcet, target_ulong ioba,
     tcep = tcet->table + (ioba >> SPAPR_TCE_PAGE_SHIFT);
     tcep->tce = tce;
 
+    entry.target_as = &address_space_memory,
+    entry.iova = ioba & ~SPAPR_TCE_PAGE_MASK;
+    entry.translated_addr = tce & ~SPAPR_TCE_PAGE_MASK;
+    entry.addr_mask = SPAPR_TCE_PAGE_MASK;
+    entry.perm = tce;
+    memory_region_notify_iommu(&tcet->iommu, entry);
+
     return H_SUCCESS;
 }
 
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index e8d617b..142abb7 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -349,6 +349,7 @@ void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
 sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size);
 DMAContext *spapr_tce_get_dma(sPAPRTCETable *tcet);
+MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet);
 void spapr_tce_free(sPAPRTCETable *tcet);
 void spapr_tce_reset(sPAPRTCETable *tcet);
 void spapr_tce_set_bypass(sPAPRTCETable *tcet, bool bypass);
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 17/21] dma: eliminate old-style IOMMU support
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (15 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 16/21] spapr: use memory core for iommu support Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 18/21] pci: use memory core for iommu support Paolo Bonzini
                   ` (3 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel

The translate function in the DMAContext is now always NULL.
Remove every reference to it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 dma-helpers.c        | 178 +++------------------------------------------------
 exec.c               |   3 +-
 hw/pci/pci.c         |   3 +-
 hw/ppc/spapr_iommu.c |   2 +-
 include/sysemu/dma.h |  79 ++++-------------------
 5 files changed, 26 insertions(+), 239 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index 2e298b6..c53705a 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -14,32 +14,26 @@
 
 /* #define DEBUG_IOMMU */
 
-static void do_dma_memory_set(AddressSpace *as,
-                              dma_addr_t addr, uint8_t c, dma_addr_t len)
+int dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c, dma_addr_t len)
 {
+    AddressSpace *as = dma->as;
+
+    dma_barrier(dma, DMA_DIRECTION_FROM_DEVICE);
+
 #define FILLBUF_SIZE 512
     uint8_t fillbuf[FILLBUF_SIZE];
     int l;
+    bool error = false;
 
     memset(fillbuf, c, FILLBUF_SIZE);
     while (len > 0) {
         l = len < FILLBUF_SIZE ? len : FILLBUF_SIZE;
-        address_space_rw(as, addr, fillbuf, l, true);
+        error |= address_space_rw(as, addr, fillbuf, l, true);
         len -= l;
         addr += l;
     }
-}
 
-int dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c, dma_addr_t len)
-{
-    dma_barrier(dma, DMA_DIRECTION_FROM_DEVICE);
-
-    if (dma_has_iommu(dma)) {
-        return iommu_dma_memory_set(dma, addr, c, len);
-    }
-    do_dma_memory_set(dma->as, addr, c, len);
-
-    return 0;
+    return error;
 }
 
 void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma)
@@ -278,162 +272,10 @@ void dma_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
     bdrv_acct_start(bs, cookie, sg->size, type);
 }
 
-bool iommu_dma_memory_valid(DMAContext *dma, dma_addr_t addr, dma_addr_t len,
-                            DMADirection dir)
+void dma_context_init(DMAContext *dma, AddressSpace *as)
 {
-    hwaddr paddr, plen;
-
 #ifdef DEBUG_IOMMU
-    fprintf(stderr, "dma_memory_check context=%p addr=0x" DMA_ADDR_FMT
-            " len=0x" DMA_ADDR_FMT " dir=%d\n", dma, addr, len, dir);
-#endif
-
-    while (len) {
-        if (dma->translate(dma, addr, &paddr, &plen, dir) != 0) {
-            return false;
-        }
-
-        /* The translation might be valid for larger regions. */
-        if (plen > len) {
-            plen = len;
-        }
-
-        if (!address_space_access_valid(dma->as, paddr, len,
-                                        dir == DMA_DIRECTION_FROM_DEVICE)) {
-            return false;
-        }
-
-        len -= plen;
-        addr += plen;
-    }
-
-    return true;
-}
-
-int iommu_dma_memory_rw(DMAContext *dma, dma_addr_t addr,
-                        void *buf, dma_addr_t len, DMADirection dir)
-{
-    hwaddr paddr, plen;
-    int err;
-
-#ifdef DEBUG_IOMMU
-    fprintf(stderr, "dma_memory_rw context=%p addr=0x" DMA_ADDR_FMT " len=0x"
-            DMA_ADDR_FMT " dir=%d\n", dma, addr, len, dir);
-#endif
-
-    while (len) {
-        err = dma->translate(dma, addr, &paddr, &plen, dir);
-        if (err) {
-	    /*
-             * In case of failure on reads from the guest, we clean the
-             * destination buffer so that a device that doesn't test
-             * for errors will not expose qemu internal memory.
-	     */
-	    memset(buf, 0, len);
-            return -1;
-        }
-
-        /* The translation might be valid for larger regions. */
-        if (plen > len) {
-            plen = len;
-        }
-
-        address_space_rw(dma->as, paddr, buf, plen, dir == DMA_DIRECTION_FROM_DEVICE);
-
-        len -= plen;
-        addr += plen;
-        buf += plen;
-    }
-
-    return 0;
-}
-
-int iommu_dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c,
-                         dma_addr_t len)
-{
-    hwaddr paddr, plen;
-    int err;
-
-#ifdef DEBUG_IOMMU
-    fprintf(stderr, "dma_memory_set context=%p addr=0x" DMA_ADDR_FMT
-            " len=0x" DMA_ADDR_FMT "\n", dma, addr, len);
-#endif
-
-    while (len) {
-        err = dma->translate(dma, addr, &paddr, &plen,
-                             DMA_DIRECTION_FROM_DEVICE);
-        if (err) {
-            return err;
-        }
-
-        /* The translation might be valid for larger regions. */
-        if (plen > len) {
-            plen = len;
-        }
-
-        do_dma_memory_set(dma->as, paddr, c, plen);
-
-        len -= plen;
-        addr += plen;
-    }
-
-    return 0;
-}
-
-void dma_context_init(DMAContext *dma, AddressSpace *as, DMATranslateFunc translate,
-                      DMAMapFunc map, DMAUnmapFunc unmap)
-{
-#ifdef DEBUG_IOMMU
-    fprintf(stderr, "dma_context_init(%p, %p, %p, %p)\n",
-            dma, translate, map, unmap);
+    fprintf(stderr, "dma_context_init(%p -> %p)\n", dma, as);
 #endif
     dma->as = as;
-    dma->translate = translate;
-    dma->map = map;
-    dma->unmap = unmap;
-}
-
-void *iommu_dma_memory_map(DMAContext *dma, dma_addr_t addr, dma_addr_t *len,
-                           DMADirection dir)
-{
-    int err;
-    hwaddr paddr, plen;
-    void *buf;
-
-    if (dma->map) {
-        return dma->map(dma, addr, len, dir);
-    }
-
-    plen = *len;
-    err = dma->translate(dma, addr, &paddr, &plen, dir);
-    if (err) {
-        return NULL;
-    }
-
-    /*
-     * If this is true, the virtual region is contiguous,
-     * but the translated physical region isn't. We just
-     * clamp *len, much like address_space_map() does.
-     */
-    if (plen < *len) {
-        *len = plen;
-    }
-
-    buf = address_space_map(dma->as, paddr, &plen, dir == DMA_DIRECTION_FROM_DEVICE);
-    *len = plen;
-
-    return buf;
-}
-
-void iommu_dma_memory_unmap(DMAContext *dma, void *buffer, dma_addr_t len,
-                            DMADirection dir, dma_addr_t access_len)
-{
-    if (dma->unmap) {
-        dma->unmap(dma, buffer, len, dir, access_len);
-        return;
-    }
-
-    address_space_unmap(dma->as, buffer, len, dir == DMA_DIRECTION_FROM_DEVICE,
-                        access_len);
-
 }
diff --git a/exec.c b/exec.c
index f6716d5..e7c70de 100644
--- a/exec.c
+++ b/exec.c
@@ -1840,8 +1840,7 @@ static void memory_map_init(void)
     memory_listener_register(&io_memory_listener, &address_space_io);
     memory_listener_register(&tcg_memory_listener, &address_space_memory);
 
-    dma_context_init(&dma_context_memory, &address_space_memory,
-                     NULL, NULL, NULL);
+    dma_context_init(&dma_context_memory, &address_space_memory);
 }
 
 MemoryRegion *get_system_memory(void)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index bb3879b..c9d585c 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -814,8 +814,9 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
         memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
         address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region);
         pci_dev->dma = g_new(DMAContext, 1);
-        dma_context_init(pci_dev->dma, &pci_dev->bus_master_as, NULL, NULL, NULL);
+        dma_context_init(pci_dev->dma, &pci_dev->bus_master_as);
     }
+
     pci_dev->devfn = devfn;
     pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
     pci_dev->irq_state = 0;
diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 6e33929..4667e11 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -158,7 +158,7 @@ sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size)
     memory_region_init_iommu(&tcet->iommu, &spapr_iommu_ops,
                              "iommu-spapr", UINT64_MAX);
     address_space_init(&tcet->as, &tcet->iommu);
-    dma_context_init(&tcet->dma, &tcet->as, NULL, NULL, NULL);
+    dma_context_init(&tcet->dma, &tcet->as);
 
     QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
 
diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h
index 02e0dcd..3317dcf 100644
--- a/include/sysemu/dma.h
+++ b/include/sysemu/dma.h
@@ -46,26 +46,8 @@ typedef uint64_t dma_addr_t;
 #define DMA_ADDR_BITS 64
 #define DMA_ADDR_FMT "%" PRIx64
 
-typedef int DMATranslateFunc(DMAContext *dma,
-                             dma_addr_t addr,
-                             hwaddr *paddr,
-                             hwaddr *len,
-                             DMADirection dir);
-typedef void* DMAMapFunc(DMAContext *dma,
-                         dma_addr_t addr,
-                         dma_addr_t *len,
-                         DMADirection dir);
-typedef void DMAUnmapFunc(DMAContext *dma,
-                          void *buffer,
-                          dma_addr_t len,
-                          DMADirection dir,
-                          dma_addr_t access_len);
-
 struct DMAContext {
     AddressSpace *as;
-    DMATranslateFunc *translate;
-    DMAMapFunc *map;
-    DMAUnmapFunc *unmap;
 };
 
 /* A global DMA context corresponding to the address_space_memory
@@ -98,41 +80,22 @@ static inline void dma_barrier(DMAContext *dma, DMADirection dir)
     }
 }
 
-static inline bool dma_has_iommu(DMAContext *dma)
-{
-    return dma && dma->translate;
-}
-
 /* Checks that the given range of addresses is valid for DMA.  This is
  * useful for certain cases, but usually you should just use
  * dma_memory_{read,write}() and check for errors */
-bool iommu_dma_memory_valid(DMAContext *dma, dma_addr_t addr, dma_addr_t len,
-                            DMADirection dir);
 static inline bool dma_memory_valid(DMAContext *dma,
                                     dma_addr_t addr, dma_addr_t len,
                                     DMADirection dir)
 {
-    if (!dma_has_iommu(dma)) {
-        return address_space_access_valid(dma->as, addr, len,
-                                          dir == DMA_DIRECTION_FROM_DEVICE);
-    } else {
-        return iommu_dma_memory_valid(dma, addr, len, dir);
-    }
+    return address_space_access_valid(dma->as, addr, len,
+                                      dir == DMA_DIRECTION_FROM_DEVICE);
 }
 
-int iommu_dma_memory_rw(DMAContext *dma, dma_addr_t addr,
-                        void *buf, dma_addr_t len, DMADirection dir);
 static inline int dma_memory_rw_relaxed(DMAContext *dma, dma_addr_t addr,
                                         void *buf, dma_addr_t len,
                                         DMADirection dir)
 {
-    if (!dma_has_iommu(dma)) {
-        /* Fast-path for no IOMMU */
-        address_space_rw(dma->as, addr, buf, len, dir == DMA_DIRECTION_FROM_DEVICE);
-        return 0;
-    } else {
-        return iommu_dma_memory_rw(dma, addr, buf, len, dir);
-    }
+    return address_space_rw(dma->as, addr, buf, len, dir == DMA_DIRECTION_FROM_DEVICE);
 }
 
 static inline int dma_memory_read_relaxed(DMAContext *dma, dma_addr_t addr,
@@ -170,43 +133,26 @@ static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
                          DMA_DIRECTION_FROM_DEVICE);
 }
 
-int iommu_dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c,
-			 dma_addr_t len);
-
 int dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c, dma_addr_t len);
 
-void *iommu_dma_memory_map(DMAContext *dma,
-                           dma_addr_t addr, dma_addr_t *len,
-                           DMADirection dir);
 static inline void *dma_memory_map(DMAContext *dma,
                                    dma_addr_t addr, dma_addr_t *len,
                                    DMADirection dir)
 {
-    if (!dma_has_iommu(dma)) {
-        hwaddr xlen = *len;
-        void *p;
-
-        p = address_space_map(dma->as, addr, &xlen, dir == DMA_DIRECTION_FROM_DEVICE);
-        *len = xlen;
-        return p;
-    } else {
-        return iommu_dma_memory_map(dma, addr, len, dir);
-    }
+    hwaddr xlen = *len;
+    void *p;
+
+    p = address_space_map(dma->as, addr, &xlen, dir == DMA_DIRECTION_FROM_DEVICE);
+    *len = xlen;
+    return p;
 }
 
-void iommu_dma_memory_unmap(DMAContext *dma,
-                            void *buffer, dma_addr_t len,
-                            DMADirection dir, dma_addr_t access_len);
 static inline void dma_memory_unmap(DMAContext *dma,
                                     void *buffer, dma_addr_t len,
                                     DMADirection dir, dma_addr_t access_len)
 {
-    if (!dma_has_iommu(dma)) {
-        address_space_unmap(dma->as, buffer, (hwaddr)len,
-                            dir == DMA_DIRECTION_FROM_DEVICE, access_len);
-    } else {
-        iommu_dma_memory_unmap(dma, buffer, len, dir, access_len);
-    }
+    address_space_unmap(dma->as, buffer, (hwaddr)len,
+                        dir == DMA_DIRECTION_FROM_DEVICE, access_len);
 }
 
 #define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \
@@ -247,8 +193,7 @@ DEFINE_LDST_DMA(q, q, 64, be);
 
 #undef DEFINE_LDST_DMA
 
-void dma_context_init(DMAContext *dma, AddressSpace *as, DMATranslateFunc translate,
-                      DMAMapFunc map, DMAUnmapFunc unmap);
+void dma_context_init(DMAContext *dma, AddressSpace *as);
 
 struct ScatterGatherEntry {
     dma_addr_t base;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 18/21] pci: use memory core for iommu support
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (16 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 17/21] dma: eliminate old-style IOMMU support Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 19/21] spapr_vio: take care of creating our own AddressSpace/DMAContext Paolo Bonzini
                   ` (2 subsequent siblings)
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Avi Kivity, David Gibson

From: Avi Kivity <avi.kivity@gmail.com>

Use the new iommu support in the memory core for iommu support.  The only
user, spapr, is also converted, but it still provides a DMAContext
interface until the non-PCI bits switch to AddressSpace.

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
[ Do not calls memory_region_del_subregion() on the device's
  bus_master_enable_region, it is an alias; return an AddressSpace
  from the IOMMU hook and remove the destructor hook. - David Gibson ]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/pci/pci.c                | 46 ++++++++++++++++++++++-----------------------
 hw/ppc/spapr_pci.c          |  8 ++++----
 include/hw/pci-host/spapr.h |  1 +
 include/hw/pci/pci.h        |  4 ++--
 include/hw/pci/pci_bus.h    |  4 ++--
 5 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index c9d585c..9e6bd77 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -279,6 +279,13 @@ int pci_find_domain(const PCIBus *bus)
     return -1;
 }
 
+static AddressSpace *pci_default_address_space(PCIBus *bus, void *opaque,
+                                               int devfn)
+{
+    /* FIXME: inherit memory region from bus creator */
+    return &address_space_memory;
+}
+
 static void pci_bus_init(PCIBus *bus, DeviceState *parent,
                          const char *name,
                          MemoryRegion *address_space_mem,
@@ -289,6 +295,7 @@ static void pci_bus_init(PCIBus *bus, DeviceState *parent,
     bus->devfn_min = devfn_min;
     bus->address_space_mem = address_space_mem;
     bus->address_space_io = address_space_io;
+    pci_setup_iommu(bus, pci_default_address_space, NULL);
 
     /* host bridge */
     QLIST_INIT(&bus->child);
@@ -786,6 +793,7 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
     PCIDeviceClass *pc = PCI_DEVICE_GET_CLASS(pci_dev);
     PCIConfigReadFunc *config_read = pc->config_read;
     PCIConfigWriteFunc *config_write = pc->config_write;
+    AddressSpace *dma_as;
 
     if (devfn < 0) {
         for(devfn = bus->devfn_min ; devfn < ARRAY_SIZE(bus->devices);
@@ -801,21 +809,15 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
                      PCI_SLOT(devfn), PCI_FUNC(devfn), name, bus->devices[devfn]->name);
         return NULL;
     }
+
     pci_dev->bus = bus;
-    if (bus->dma_context_fn) {
-        pci_dev->dma = bus->dma_context_fn(bus, bus->dma_context_opaque, devfn);
-    } else {
-        /* FIXME: Make dma_context_fn use MemoryRegions instead, so this path is
-         * taken unconditionally */
-        /* FIXME: inherit memory region from bus creator */
-        memory_region_init_alias(&pci_dev->bus_master_enable_region, "bus master",
-                                 get_system_memory(), 0,
-                                 memory_region_size(get_system_memory()));
-        memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
-        address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region);
-        pci_dev->dma = g_new(DMAContext, 1);
-        dma_context_init(pci_dev->dma, &pci_dev->bus_master_as);
-    }
+    dma_as = bus->iommu_fn(bus, bus->iommu_opaque, devfn);
+    memory_region_init_alias(&pci_dev->bus_master_enable_region, "bus master",
+                             dma_as->root, 0, memory_region_size(dma_as->root));
+    memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
+    address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region);
+    pci_dev->dma = g_new(DMAContext, 1);
+    dma_context_init(pci_dev->dma, &pci_dev->bus_master_as);
 
     pci_dev->devfn = devfn;
     pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
@@ -870,12 +872,10 @@ static void do_pci_unregister_device(PCIDevice *pci_dev)
     pci_dev->bus->devices[pci_dev->devfn] = NULL;
     pci_config_free(pci_dev);
 
-    if (!pci_dev->bus->dma_context_fn) {
-        address_space_destroy(&pci_dev->bus_master_as);
-        memory_region_destroy(&pci_dev->bus_master_enable_region);
-        g_free(pci_dev->dma);
-        pci_dev->dma = NULL;
-    }
+    address_space_destroy(&pci_dev->bus_master_as);
+    memory_region_destroy(&pci_dev->bus_master_enable_region);
+    g_free(pci_dev->dma);
+    pci_dev->dma = NULL;
 }
 
 static void pci_unregister_io_regions(PCIDevice *pci_dev)
@@ -2232,10 +2232,10 @@ static void pci_device_class_init(ObjectClass *klass, void *data)
     k->props = pci_props;
 }
 
-void pci_setup_iommu(PCIBus *bus, PCIDMAContextFunc fn, void *opaque)
+void pci_setup_iommu(PCIBus *bus, PCIIOMMUFunc fn, void *opaque)
 {
-    bus->dma_context_fn = fn;
-    bus->dma_context_opaque = opaque;
+    bus->iommu_fn = fn;
+    bus->iommu_opaque = opaque;
 }
 
 static const TypeInfo pci_device_type_info = {
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index eb64a8f..459398c 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -506,12 +506,11 @@ static const MemoryRegionOps spapr_msi_ops = {
 /*
  * PHB PCI device
  */
-static DMAContext *spapr_pci_dma_context_fn(PCIBus *bus, void *opaque,
-                                            int devfn)
+static AddressSpace *spapr_pci_dma_iommu(PCIBus *bus, void *opaque, int devfn)
 {
     sPAPRPHBState *phb = opaque;
 
-    return spapr_tce_get_dma(phb->tcet);
+    return &phb->iommu_as;
 }
 
 static int spapr_phb_init(SysBusDevice *s)
@@ -651,7 +650,8 @@ static int spapr_phb_init(SysBusDevice *s)
         fprintf(stderr, "Unable to create TCE table for %s\n", sphb->dtbusname);
         return -1;
     }
-    pci_setup_iommu(bus, spapr_pci_dma_context_fn, sphb);
+    address_space_init(&sphb->iommu_as, spapr_tce_get_iommu(sphb->tcet));
+    pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb);
 
     QLIST_INSERT_HEAD(&spapr->phbs, sphb, list);
 
diff --git a/include/hw/pci-host/spapr.h b/include/hw/pci-host/spapr.h
index 653dd40..1e23dbf 100644
--- a/include/hw/pci-host/spapr.h
+++ b/include/hw/pci-host/spapr.h
@@ -50,6 +50,7 @@ typedef struct sPAPRPHBState {
     uint64_t dma_window_start;
     uint64_t dma_window_size;
     sPAPRTCETable *tcet;
+    AddressSpace iommu_as;
 
     struct {
         uint32_t irq;
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 8d075ab..3a85bce 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -401,9 +401,9 @@ int pci_read_devaddr(Monitor *mon, const char *addr, int *domp, int *busp,
 
 void pci_device_deassert_intx(PCIDevice *dev);
 
-typedef DMAContext *(*PCIDMAContextFunc)(PCIBus *, void *, int);
+typedef AddressSpace *(*PCIIOMMUFunc)(PCIBus *, void *, int);
 
-void pci_setup_iommu(PCIBus *bus, PCIDMAContextFunc fn, void *opaque);
+void pci_setup_iommu(PCIBus *bus, PCIIOMMUFunc fn, void *opaque);
 
 static inline void
 pci_set_byte(uint8_t *config, uint8_t val)
diff --git a/include/hw/pci/pci_bus.h b/include/hw/pci/pci_bus.h
index 6ee443c..66762f6 100644
--- a/include/hw/pci/pci_bus.h
+++ b/include/hw/pci/pci_bus.h
@@ -10,8 +10,8 @@
 
 struct PCIBus {
     BusState qbus;
-    PCIDMAContextFunc dma_context_fn;
-    void *dma_context_opaque;
+    PCIIOMMUFunc iommu_fn;
+    void *iommu_opaque;
     uint8_t devfn_min;
     pci_set_irq_fn set_irq;
     pci_map_irq_fn map_irq;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 19/21] spapr_vio: take care of creating our own AddressSpace/DMAContext
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (17 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 18/21] pci: use memory core for iommu support Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 20/21] dma: eliminate DMAContext Paolo Bonzini
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 21/21] memory: give name to every AddressSpace Paolo Bonzini
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel

Fetch the root region from the sPAPRTCETable, and use it to build
an AddressSpace and DMAContext.

Now, everywhere we have a DMAContext we also have access to the
corresponding AddressSpace (either because we create it just before
the DMAContext, or because dma_context_memory's AddressSpace is
trivially address_space_memory).

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/ppc/spapr_iommu.c       | 11 -----------
 hw/ppc/spapr_vio.c         |  3 ++-
 include/hw/ppc/spapr.h     |  1 -
 include/hw/ppc/spapr_vio.h | 21 +++++++++++----------
 4 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/hw/ppc/spapr_iommu.c b/hw/ppc/spapr_iommu.c
index 4667e11..91bc8e4 100644
--- a/hw/ppc/spapr_iommu.c
+++ b/hw/ppc/spapr_iommu.c
@@ -37,10 +37,6 @@ enum sPAPRTCEAccess {
 };
 
 struct sPAPRTCETable {
-    /* temporary until everyone has its own AddressSpace */
-    DMAContext dma;
-    AddressSpace as;
-
     uint32_t liobn;
     uint32_t window_size;
     sPAPRTCE *table;
@@ -157,8 +153,6 @@ sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size)
 
     memory_region_init_iommu(&tcet->iommu, &spapr_iommu_ops,
                              "iommu-spapr", UINT64_MAX);
-    address_space_init(&tcet->as, &tcet->iommu);
-    dma_context_init(&tcet->dma, &tcet->as);
 
     QLIST_INSERT_HEAD(&spapr_tce_tables, tcet, list);
 
@@ -178,11 +172,6 @@ void spapr_tce_free(sPAPRTCETable *tcet)
     g_free(tcet);
 }
 
-DMAContext *spapr_tce_get_dma(sPAPRTCETable *tcet)
-{
-    return &tcet->dma;
-}
-
 MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet)
 {
     return &tcet->iommu;
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index a06ac94..8d77a36 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -458,7 +458,8 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
     if (pc->rtce_window_size) {
         uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev->reg;
         dev->tcet = spapr_tce_new_table(liobn, pc->rtce_window_size);
-        dev->dma = spapr_tce_get_dma(dev->tcet);
+        address_space_init(&dev->as, spapr_tce_get_iommu(dev->tcet));
+        dma_context_init(&dev->dma, &dev->as);
     }
 
     return pc->init(dev);
diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
index 142abb7..a83720e 100644
--- a/include/hw/ppc/spapr.h
+++ b/include/hw/ppc/spapr.h
@@ -348,7 +348,6 @@ void spapr_iommu_init(void);
 void spapr_events_init(sPAPREnvironment *spapr);
 void spapr_events_fdt_skel(void *fdt, uint32_t epow_irq);
 sPAPRTCETable *spapr_tce_new_table(uint32_t liobn, size_t window_size);
-DMAContext *spapr_tce_get_dma(sPAPRTCETable *tcet);
 MemoryRegion *spapr_tce_get_iommu(sPAPRTCETable *tcet);
 void spapr_tce_free(sPAPRTCETable *tcet);
 void spapr_tce_reset(sPAPRTCETable *tcet);
diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
index 56f2821..2de58f1 100644
--- a/include/hw/ppc/spapr_vio.h
+++ b/include/hw/ppc/spapr_vio.h
@@ -63,8 +63,9 @@ struct VIOsPAPRDevice {
     uint32_t irq;
     target_ulong signal_state;
     VIOsPAPR_CRQ crq;
+    AddressSpace as;
+    DMAContext dma;
     sPAPRTCETable *tcet;
-    DMAContext *dma;
 };
 
 #define DEFINE_SPAPR_PROPERTIES(type, field)           \
@@ -92,35 +93,35 @@ static inline qemu_irq spapr_vio_qirq(VIOsPAPRDevice *dev)
 static inline bool spapr_vio_dma_valid(VIOsPAPRDevice *dev, uint64_t taddr,
                                        uint32_t size, DMADirection dir)
 {
-    return dma_memory_valid(dev->dma, taddr, size, dir);
+    return dma_memory_valid(&dev->dma, taddr, size, dir);
 }
 
 static inline int spapr_vio_dma_read(VIOsPAPRDevice *dev, uint64_t taddr,
                                      void *buf, uint32_t size)
 {
-    return (dma_memory_read(dev->dma, taddr, buf, size) != 0) ?
+    return (dma_memory_read(&dev->dma, taddr, buf, size) != 0) ?
         H_DEST_PARM : H_SUCCESS;
 }
 
 static inline int spapr_vio_dma_write(VIOsPAPRDevice *dev, uint64_t taddr,
                                       const void *buf, uint32_t size)
 {
-    return (dma_memory_write(dev->dma, taddr, buf, size) != 0) ?
+    return (dma_memory_write(&dev->dma, taddr, buf, size) != 0) ?
         H_DEST_PARM : H_SUCCESS;
 }
 
 static inline int spapr_vio_dma_set(VIOsPAPRDevice *dev, uint64_t taddr,
                                     uint8_t c, uint32_t size)
 {
-    return (dma_memory_set(dev->dma, taddr, c, size) != 0) ?
+    return (dma_memory_set(&dev->dma, taddr, c, size) != 0) ?
         H_DEST_PARM : H_SUCCESS;
 }
 
-#define vio_stb(_dev, _addr, _val) (stb_dma((_dev)->dma, (_addr), (_val)))
-#define vio_sth(_dev, _addr, _val) (stw_be_dma((_dev)->dma, (_addr), (_val)))
-#define vio_stl(_dev, _addr, _val) (stl_be_dma((_dev)->dma, (_addr), (_val)))
-#define vio_stq(_dev, _addr, _val) (stq_be_dma((_dev)->dma, (_addr), (_val)))
-#define vio_ldq(_dev, _addr) (ldq_be_dma((_dev)->dma, (_addr)))
+#define vio_stb(_dev, _addr, _val) (stb_dma(&(_dev)->dma, (_addr), (_val)))
+#define vio_sth(_dev, _addr, _val) (stw_be_dma(&(_dev)->dma, (_addr), (_val)))
+#define vio_stl(_dev, _addr, _val) (stl_be_dma(&(_dev)->dma, (_addr), (_val)))
+#define vio_stq(_dev, _addr, _val) (stq_be_dma(&(_dev)->dma, (_addr), (_val)))
+#define vio_ldq(_dev, _addr) (ldq_be_dma(&(_dev)->dma, (_addr)))
 
 int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq);
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 20/21] dma: eliminate DMAContext
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (18 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 19/21] spapr_vio: take care of creating our own AddressSpace/DMAContext Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 21/21] memory: give name to every AddressSpace Paolo Bonzini
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel

The DMAContext is a simple pointer to an AddressSpace that is now always
already available.  Make everyone hold the address space directly,
and clean up the DMA API to use the AddressSpace directly.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 dma-helpers.c              | 24 +++++----------
 exec.c                     |  3 --
 hw/dma/pl330.c             |  8 ++---
 hw/ide/ahci.c              | 18 +++++------
 hw/ide/ahci.h              |  4 +--
 hw/ide/ich.c               |  2 +-
 hw/ide/macio.c             |  4 +--
 hw/pci/pci.c               |  4 ---
 hw/ppc/spapr_vio.c         |  1 -
 hw/scsi/megasas.c          |  4 +--
 hw/scsi/virtio-scsi.c      |  2 +-
 hw/scsi/vmw_pvscsi.c       |  2 +-
 hw/sd/sdhci.c              | 22 +++++++-------
 hw/usb/hcd-ehci-pci.c      |  4 +--
 hw/usb/hcd-ehci-sysbus.c   |  2 +-
 hw/usb/hcd-ehci.c          | 12 ++++----
 hw/usb/hcd-ehci.h          |  2 +-
 hw/usb/hcd-ohci.c          | 30 +++++++++----------
 hw/usb/libhw.c             |  4 +--
 include/hw/pci/pci.h       | 17 +++++------
 include/hw/ppc/spapr_vio.h | 19 ++++++------
 include/sysemu/dma.h       | 75 ++++++++++++++++++++--------------------------
 22 files changed, 116 insertions(+), 147 deletions(-)

diff --git a/dma-helpers.c b/dma-helpers.c
index c53705a..5afba47 100644
--- a/dma-helpers.c
+++ b/dma-helpers.c
@@ -14,11 +14,9 @@
 
 /* #define DEBUG_IOMMU */
 
-int dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c, dma_addr_t len)
+int dma_memory_set(AddressSpace *as, dma_addr_t addr, uint8_t c, dma_addr_t len)
 {
-    AddressSpace *as = dma->as;
-
-    dma_barrier(dma, DMA_DIRECTION_FROM_DEVICE);
+    dma_barrier(as, DMA_DIRECTION_FROM_DEVICE);
 
 #define FILLBUF_SIZE 512
     uint8_t fillbuf[FILLBUF_SIZE];
@@ -36,13 +34,13 @@ int dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c, dma_addr_t len)
     return error;
 }
 
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma)
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, AddressSpace *as)
 {
     qsg->sg = g_malloc(alloc_hint * sizeof(ScatterGatherEntry));
     qsg->nsg = 0;
     qsg->nalloc = alloc_hint;
     qsg->size = 0;
-    qsg->dma = dma;
+    qsg->as = as;
 }
 
 void qemu_sglist_add(QEMUSGList *qsg, dma_addr_t base, dma_addr_t len)
@@ -102,7 +100,7 @@ static void dma_bdrv_unmap(DMAAIOCB *dbs)
     int i;
 
     for (i = 0; i < dbs->iov.niov; ++i) {
-        dma_memory_unmap(dbs->sg->dma, dbs->iov.iov[i].iov_base,
+        dma_memory_unmap(dbs->sg->as, dbs->iov.iov[i].iov_base,
                          dbs->iov.iov[i].iov_len, dbs->dir,
                          dbs->iov.iov[i].iov_len);
     }
@@ -150,7 +148,7 @@ static void dma_bdrv_cb(void *opaque, int ret)
     while (dbs->sg_cur_index < dbs->sg->nsg) {
         cur_addr = dbs->sg->sg[dbs->sg_cur_index].base + dbs->sg_cur_byte;
         cur_len = dbs->sg->sg[dbs->sg_cur_index].len - dbs->sg_cur_byte;
-        mem = dma_memory_map(dbs->sg->dma, cur_addr, &cur_len, dbs->dir);
+        mem = dma_memory_map(dbs->sg->as, cur_addr, &cur_len, dbs->dir);
         if (!mem)
             break;
         qemu_iovec_add(&dbs->iov, mem, cur_len);
@@ -247,7 +245,7 @@ static uint64_t dma_buf_rw(uint8_t *ptr, int32_t len, QEMUSGList *sg,
     while (len > 0) {
         ScatterGatherEntry entry = sg->sg[sg_cur_index++];
         int32_t xfer = MIN(len, entry.len);
-        dma_memory_rw(sg->dma, entry.base, ptr, xfer, dir);
+        dma_memory_rw(sg->as, entry.base, ptr, xfer, dir);
         ptr += xfer;
         len -= xfer;
         resid -= xfer;
@@ -271,11 +269,3 @@ void dma_acct_start(BlockDriverState *bs, BlockAcctCookie *cookie,
 {
     bdrv_acct_start(bs, cookie, sg->size, type);
 }
-
-void dma_context_init(DMAContext *dma, AddressSpace *as)
-{
-#ifdef DEBUG_IOMMU
-    fprintf(stderr, "dma_context_init(%p -> %p)\n", dma, as);
-#endif
-    dma->as = as;
-}
diff --git a/exec.c b/exec.c
index e7c70de..e58f842 100644
--- a/exec.c
+++ b/exec.c
@@ -63,7 +63,6 @@ static MemoryRegion *system_io;
 
 AddressSpace address_space_io;
 AddressSpace address_space_memory;
-DMAContext dma_context_memory;
 
 MemoryRegion io_mem_rom, io_mem_notdirty;
 static MemoryRegion io_mem_unassigned;
@@ -1839,8 +1838,6 @@ static void memory_map_init(void)
     memory_listener_register(&core_memory_listener, &address_space_memory);
     memory_listener_register(&io_memory_listener, &address_space_io);
     memory_listener_register(&tcg_memory_listener, &address_space_memory);
-
-    dma_context_init(&dma_context_memory, &address_space_memory);
 }
 
 MemoryRegion *get_system_memory(void)
diff --git a/hw/dma/pl330.c b/hw/dma/pl330.c
index 60f5299..044c087 100644
--- a/hw/dma/pl330.c
+++ b/hw/dma/pl330.c
@@ -1074,7 +1074,7 @@ static inline const PL330InsnDesc *pl330_fetch_insn(PL330Chan *ch)
     uint8_t opcode;
     int i;
 
-    dma_memory_read(&dma_context_memory, ch->pc, &opcode, 1);
+    dma_memory_read(&address_space_memory, ch->pc, &opcode, 1);
     for (i = 0; insn_desc[i].size; i++) {
         if ((opcode & insn_desc[i].opmask) == insn_desc[i].opcode) {
             return &insn_desc[i];
@@ -1088,7 +1088,7 @@ static inline void pl330_exec_insn(PL330Chan *ch, const PL330InsnDesc *insn)
     uint8_t buf[PL330_INSN_MAXSIZE];
 
     assert(insn->size <= PL330_INSN_MAXSIZE);
-    dma_memory_read(&dma_context_memory, ch->pc, buf, insn->size);
+    dma_memory_read(&address_space_memory, ch->pc, buf, insn->size);
     insn->exec(ch, buf[0], &buf[1], insn->size - 1);
 }
 
@@ -1153,7 +1153,7 @@ static int pl330_exec_cycle(PL330Chan *channel)
     if (q != NULL && q->len <= pl330_fifo_num_free(&s->fifo)) {
         int len = q->len - (q->addr & (q->len - 1));
 
-        dma_memory_read(&dma_context_memory, q->addr, buf, len);
+        dma_memory_read(&address_space_memory, q->addr, buf, len);
         if (PL330_ERR_DEBUG > 1) {
             DB_PRINT("PL330 read from memory @%08x (size = %08x):\n",
                       q->addr, len);
@@ -1185,7 +1185,7 @@ static int pl330_exec_cycle(PL330Chan *channel)
             fifo_res = pl330_fifo_get(&s->fifo, buf, len, q->tag);
         }
         if (fifo_res == PL330_FIFO_OK || q->z) {
-            dma_memory_write(&dma_context_memory, q->addr, buf, len);
+            dma_memory_write(&address_space_memory, q->addr, buf, len);
             if (PL330_ERR_DEBUG > 1) {
                 DB_PRINT("PL330 read from memory @%08x (size = %08x):\n",
                          q->addr, len);
diff --git a/hw/ide/ahci.c b/hw/ide/ahci.c
index eab6096..1adfa0b 100644
--- a/hw/ide/ahci.c
+++ b/hw/ide/ahci.c
@@ -597,7 +597,7 @@ static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t *cmd_fis)
     if (!cmd_fis) {
         /* map cmd_fis */
         uint64_t tbl_addr = le64_to_cpu(ad->cur_cmd->tbl_addr);
-        cmd_fis = dma_memory_map(ad->hba->dma, tbl_addr, &cmd_len,
+        cmd_fis = dma_memory_map(ad->hba->as, tbl_addr, &cmd_len,
                                  DMA_DIRECTION_TO_DEVICE);
         cmd_mapped = 1;
     }
@@ -630,7 +630,7 @@ static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t *cmd_fis)
     ahci_trigger_irq(ad->hba, ad, PORT_IRQ_D2H_REG_FIS);
 
     if (cmd_mapped) {
-        dma_memory_unmap(ad->hba->dma, cmd_fis, cmd_len,
+        dma_memory_unmap(ad->hba->as, cmd_fis, cmd_len,
                          DMA_DIRECTION_TO_DEVICE, cmd_len);
     }
 }
@@ -657,7 +657,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
     }
 
     /* map PRDT */
-    if (!(prdt = dma_memory_map(ad->hba->dma, prdt_addr, &prdt_len,
+    if (!(prdt = dma_memory_map(ad->hba->as, prdt_addr, &prdt_len,
                                 DMA_DIRECTION_TO_DEVICE))){
         DPRINTF(ad->port_no, "map failed\n");
         return -1;
@@ -691,7 +691,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
             goto out;
         }
 
-        qemu_sglist_init(sglist, (sglist_alloc_hint - off_idx), ad->hba->dma);
+        qemu_sglist_init(sglist, (sglist_alloc_hint - off_idx), ad->hba->as);
         qemu_sglist_add(sglist, le64_to_cpu(tbl[off_idx].addr + off_pos),
                         le32_to_cpu(tbl[off_idx].flags_size) + 1 - off_pos);
 
@@ -703,7 +703,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
     }
 
 out:
-    dma_memory_unmap(ad->hba->dma, prdt, prdt_len,
+    dma_memory_unmap(ad->hba->as, prdt, prdt_len,
                      DMA_DIRECTION_TO_DEVICE, prdt_len);
     return r;
 }
@@ -836,7 +836,7 @@ static int handle_cmd(AHCIState *s, int port, int slot)
     tbl_addr = le64_to_cpu(cmd->tbl_addr);
 
     cmd_len = 0x80;
-    cmd_fis = dma_memory_map(s->dma, tbl_addr, &cmd_len,
+    cmd_fis = dma_memory_map(s->as, tbl_addr, &cmd_len,
                              DMA_DIRECTION_FROM_DEVICE);
 
     if (!cmd_fis) {
@@ -963,7 +963,7 @@ static int handle_cmd(AHCIState *s, int port, int slot)
     }
 
 out:
-    dma_memory_unmap(s->dma, cmd_fis, cmd_len, DMA_DIRECTION_FROM_DEVICE,
+    dma_memory_unmap(s->as, cmd_fis, cmd_len, DMA_DIRECTION_FROM_DEVICE,
                      cmd_len);
 
     if (s->dev[port].port.ifs[0].status & (BUSY_STAT|DRQ_STAT)) {
@@ -1145,12 +1145,12 @@ static const IDEDMAOps ahci_dma_ops = {
     .reset = ahci_dma_reset,
 };
 
-void ahci_init(AHCIState *s, DeviceState *qdev, DMAContext *dma, int ports)
+void ahci_init(AHCIState *s, DeviceState *qdev, AddressSpace *as, int ports)
 {
     qemu_irq *irqs;
     int i;
 
-    s->dma = dma;
+    s->as = as;
     s->ports = ports;
     s->dev = g_malloc0(sizeof(AHCIDevice) * ports);
     ahci_reg_init(s);
diff --git a/hw/ide/ahci.h b/hw/ide/ahci.h
index 85f37fe..341a571 100644
--- a/hw/ide/ahci.h
+++ b/hw/ide/ahci.h
@@ -297,7 +297,7 @@ typedef struct AHCIState {
     uint32_t idp_index;     /* Current IDP index */
     int32_t ports;
     qemu_irq irq;
-    DMAContext *dma;
+    AddressSpace *as;
 } AHCIState;
 
 typedef struct AHCIPCIState {
@@ -338,7 +338,7 @@ typedef struct NCQFrame {
     uint8_t reserved10;
 } QEMU_PACKED NCQFrame;
 
-void ahci_init(AHCIState *s, DeviceState *qdev, DMAContext *dma, int ports);
+void ahci_init(AHCIState *s, DeviceState *qdev, AddressSpace *as, int ports);
 void ahci_uninit(AHCIState *s);
 
 void ahci_reset(AHCIState *s);
diff --git a/hw/ide/ich.c b/hw/ide/ich.c
index ed1f1a2..6c0c0c2 100644
--- a/hw/ide/ich.c
+++ b/hw/ide/ich.c
@@ -104,7 +104,7 @@ static int pci_ich9_ahci_init(PCIDevice *dev)
     uint8_t *sata_cap;
     d = DO_UPCAST(struct AHCIPCIState, card, dev);
 
-    ahci_init(&d->ahci, &dev->qdev, pci_dma_context(dev), 6);
+    ahci_init(&d->ahci, &dev->qdev, pci_get_address_space(dev), 6);
 
     pci_config_set_prog_interface(d->card.config, AHCI_PROGMODE_MAJOR_REV_1);
 
diff --git a/hw/ide/macio.c b/hw/ide/macio.c
index e1e4f41..a1952b0 100644
--- a/hw/ide/macio.c
+++ b/hw/ide/macio.c
@@ -71,7 +71,7 @@ static void pmac_ide_atapi_transfer_cb(void *opaque, int ret)
     s->io_buffer_size = io->len;
 
     qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1,
-                     &dma_context_memory);
+                     &address_space_memory);
     qemu_sglist_add(&s->sg, io->addr, io->len);
     io->addr += io->len;
     io->len = 0;
@@ -128,7 +128,7 @@ static void pmac_ide_transfer_cb(void *opaque, int ret)
     s->io_buffer_size = io->len;
 
     qemu_sglist_init(&s->sg, io->len / MACIO_PAGE_SIZE + 1,
-                     &dma_context_memory);
+                     &address_space_memory);
     qemu_sglist_add(&s->sg, io->addr, io->len);
     io->addr += io->len;
     io->len = 0;
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 9e6bd77..2b1bc6d 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -816,8 +816,6 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
                              dma_as->root, 0, memory_region_size(dma_as->root));
     memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
     address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region);
-    pci_dev->dma = g_new(DMAContext, 1);
-    dma_context_init(pci_dev->dma, &pci_dev->bus_master_as);
 
     pci_dev->devfn = devfn;
     pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
@@ -874,8 +872,6 @@ static void do_pci_unregister_device(PCIDevice *pci_dev)
 
     address_space_destroy(&pci_dev->bus_master_as);
     memory_region_destroy(&pci_dev->bus_master_enable_region);
-    g_free(pci_dev->dma);
-    pci_dev->dma = NULL;
 }
 
 static void pci_unregister_io_regions(PCIDevice *pci_dev)
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index 8d77a36..5e72f1b 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -459,7 +459,6 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
         uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev->reg;
         dev->tcet = spapr_tce_new_table(liobn, pc->rtce_window_size);
         address_space_init(&dev->as, spapr_tce_get_iommu(dev->tcet));
-        dma_context_init(&dev->dma, &dev->as);
     }
 
     return pc->init(dev);
diff --git a/hw/scsi/megasas.c b/hw/scsi/megasas.c
index fe6550c..7ee7d97 100644
--- a/hw/scsi/megasas.c
+++ b/hw/scsi/megasas.c
@@ -232,7 +232,7 @@ static int megasas_map_sgl(MegasasState *s, MegasasCmd *cmd, union mfi_sgl *sgl)
                                          MEGASAS_MAX_SGE);
         return iov_count;
     }
-    qemu_sglist_init(&cmd->qsg, iov_count, pci_dma_context(&s->dev));
+    qemu_sglist_init(&cmd->qsg, iov_count, pci_get_address_space(&s->dev));
     for (i = 0; i < iov_count; i++) {
         dma_addr_t iov_pa, iov_size_p;
 
@@ -628,7 +628,7 @@ static int megasas_map_dcmd(MegasasState *s, MegasasCmd *cmd)
     }
     iov_pa = megasas_sgl_get_addr(cmd, &cmd->frame->dcmd.sgl);
     iov_size = megasas_sgl_get_len(cmd, &cmd->frame->dcmd.sgl);
-    qemu_sglist_init(&cmd->qsg, 1, pci_dma_context(&s->dev));
+    qemu_sglist_init(&cmd->qsg, 1, pci_get_address_space(&s->dev));
     qemu_sglist_add(&cmd->qsg, iov_pa, iov_size);
     cmd->iov_size = iov_size;
     return cmd->iov_size;
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 08dd3f3..b8a0abf 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -80,7 +80,7 @@ static void virtio_scsi_bad_req(void)
 static void qemu_sgl_init_external(QEMUSGList *qsgl, struct iovec *sg,
                                    hwaddr *addr, int num)
 {
-    qemu_sglist_init(qsgl, num, &dma_context_memory);
+    qemu_sglist_init(qsgl, num, &address_space_memory);
     while (num--) {
         qemu_sglist_add(qsgl, *(addr++), (sg++)->iov_len);
     }
diff --git a/hw/scsi/vmw_pvscsi.c b/hw/scsi/vmw_pvscsi.c
index 48d12f4..eb2270f 100644
--- a/hw/scsi/vmw_pvscsi.c
+++ b/hw/scsi/vmw_pvscsi.c
@@ -617,7 +617,7 @@ pvscsi_build_sglist(PVSCSIState *s, PVSCSIRequest *r)
 {
     PCIDevice *d = PCI_DEVICE(s);
 
-    qemu_sglist_init(&r->sgl, 1, pci_dma_context(d));
+    pci_dma_sglist_init(&r->sgl, d, 1);
     if (r->req.flags & PVSCSI_FLAG_CMD_WITH_SG_LIST) {
         pvscsi_convert_sglist(r);
     } else {
diff --git a/hw/sd/sdhci.c b/hw/sd/sdhci.c
index 91dc9b0..5d9247a 100644
--- a/hw/sd/sdhci.c
+++ b/hw/sd/sdhci.c
@@ -496,7 +496,7 @@ static void sdhci_sdma_transfer_multi_blocks(SDHCIState *s)
                     s->blkcnt--;
                 }
             }
-            dma_memory_write(&dma_context_memory, s->sdmasysad,
+            dma_memory_write(&address_space_memory, s->sdmasysad,
                              &s->fifo_buffer[begin], s->data_count - begin);
             s->sdmasysad += s->data_count - begin;
             if (s->data_count == block_size) {
@@ -518,7 +518,7 @@ static void sdhci_sdma_transfer_multi_blocks(SDHCIState *s)
                 s->data_count = block_size;
                 boundary_count -= block_size - begin;
             }
-            dma_memory_read(&dma_context_memory, s->sdmasysad,
+            dma_memory_read(&address_space_memory, s->sdmasysad,
                             &s->fifo_buffer[begin], s->data_count);
             s->sdmasysad += s->data_count - begin;
             if (s->data_count == block_size) {
@@ -557,10 +557,10 @@ static void sdhci_sdma_transfer_single_block(SDHCIState *s)
         for (n = 0; n < datacnt; n++) {
             s->fifo_buffer[n] = sd_read_data(s->card);
         }
-        dma_memory_write(&dma_context_memory, s->sdmasysad, s->fifo_buffer,
+        dma_memory_write(&address_space_memory, s->sdmasysad, s->fifo_buffer,
                          datacnt);
     } else {
-        dma_memory_read(&dma_context_memory, s->sdmasysad, s->fifo_buffer,
+        dma_memory_read(&address_space_memory, s->sdmasysad, s->fifo_buffer,
                         datacnt);
         for (n = 0; n < datacnt; n++) {
             sd_write_data(s->card, s->fifo_buffer[n]);
@@ -588,7 +588,7 @@ static void get_adma_description(SDHCIState *s, ADMADescr *dscr)
     hwaddr entry_addr = (hwaddr)s->admasysaddr;
     switch (SDHC_DMA_TYPE(s->hostctl)) {
     case SDHC_CTRL_ADMA2_32:
-        dma_memory_read(&dma_context_memory, entry_addr, (uint8_t *)&adma2,
+        dma_memory_read(&address_space_memory, entry_addr, (uint8_t *)&adma2,
                         sizeof(adma2));
         adma2 = le64_to_cpu(adma2);
         /* The spec does not specify endianness of descriptor table.
@@ -600,7 +600,7 @@ static void get_adma_description(SDHCIState *s, ADMADescr *dscr)
         dscr->incr = 8;
         break;
     case SDHC_CTRL_ADMA1_32:
-        dma_memory_read(&dma_context_memory, entry_addr, (uint8_t *)&adma1,
+        dma_memory_read(&address_space_memory, entry_addr, (uint8_t *)&adma1,
                         sizeof(adma1));
         adma1 = le32_to_cpu(adma1);
         dscr->addr = (hwaddr)(adma1 & 0xFFFFF000);
@@ -613,12 +613,12 @@ static void get_adma_description(SDHCIState *s, ADMADescr *dscr)
         }
         break;
     case SDHC_CTRL_ADMA2_64:
-        dma_memory_read(&dma_context_memory, entry_addr,
+        dma_memory_read(&address_space_memory, entry_addr,
                         (uint8_t *)(&dscr->attr), 1);
-        dma_memory_read(&dma_context_memory, entry_addr + 2,
+        dma_memory_read(&address_space_memory, entry_addr + 2,
                         (uint8_t *)(&dscr->length), 2);
         dscr->length = le16_to_cpu(dscr->length);
-        dma_memory_read(&dma_context_memory, entry_addr + 4,
+        dma_memory_read(&address_space_memory, entry_addr + 4,
                         (uint8_t *)(&dscr->addr), 8);
         dscr->attr = le64_to_cpu(dscr->attr);
         dscr->attr &= 0xfffffff8;
@@ -678,7 +678,7 @@ static void sdhci_do_adma(SDHCIState *s)
                         s->data_count = block_size;
                         length -= block_size - begin;
                     }
-                    dma_memory_write(&dma_context_memory, dscr.addr,
+                    dma_memory_write(&address_space_memory, dscr.addr,
                                      &s->fifo_buffer[begin],
                                      s->data_count - begin);
                     dscr.addr += s->data_count - begin;
@@ -702,7 +702,7 @@ static void sdhci_do_adma(SDHCIState *s)
                         s->data_count = block_size;
                         length -= block_size - begin;
                     }
-                    dma_memory_read(&dma_context_memory, dscr.addr,
+                    dma_memory_read(&address_space_memory, dscr.addr,
                                     &s->fifo_buffer[begin], s->data_count);
                     dscr.addr += s->data_count - begin;
                     if (s->data_count == block_size) {
diff --git a/hw/usb/hcd-ehci-pci.c b/hw/usb/hcd-ehci-pci.c
index 0eb7826..f1b5f5d 100644
--- a/hw/usb/hcd-ehci-pci.c
+++ b/hw/usb/hcd-ehci-pci.c
@@ -63,7 +63,7 @@ static int usb_ehci_pci_initfn(PCIDevice *dev)
     s->caps[0x09] = 0x68;        /* EECP */
 
     s->irq = dev->irq[3];
-    s->dma = pci_dma_context(dev);
+    s->as = pci_get_address_space(dev);
 
     s->capsbase = 0x00;
     s->opregbase = 0x20;
@@ -86,7 +86,7 @@ static void usb_ehci_pci_write_config(PCIDevice *dev, uint32_t addr,
         return;
     }
     busmaster = pci_get_word(dev->config + PCI_COMMAND) & PCI_COMMAND_MASTER;
-    i->ehci.dma = busmaster ? pci_dma_context(dev) : NULL;
+    i->ehci.as = busmaster ? pci_get_address_space(dev) : &address_space_memory;
 }
 
 static Property ehci_pci_properties[] = {
diff --git a/hw/usb/hcd-ehci-sysbus.c b/hw/usb/hcd-ehci-sysbus.c
index b68a66a..f9e4fd3 100644
--- a/hw/usb/hcd-ehci-sysbus.c
+++ b/hw/usb/hcd-ehci-sysbus.c
@@ -40,7 +40,7 @@ static int usb_ehci_sysbus_initfn(SysBusDevice *dev)
 
     s->capsbase = sec->capsbase;
     s->opregbase = sec->opregbase;
-    s->dma = &dma_context_memory;
+    s->as = &address_space_memory;
 
     usb_ehci_initfn(s, DEVICE(dev));
     sysbus_init_irq(dev, &s->irq);
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index 0d3799d..1ad2159 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -446,7 +446,7 @@ static inline int get_dwords(EHCIState *ehci, uint32_t addr,
 {
     int i;
 
-    if (!ehci->dma) {
+    if (!ehci->as) {
         ehci_raise_irq(ehci, USBSTS_HSE);
         ehci->usbcmd &= ~USBCMD_RUNSTOP;
         trace_usb_ehci_dma_error();
@@ -454,7 +454,7 @@ static inline int get_dwords(EHCIState *ehci, uint32_t addr,
     }
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-        dma_memory_read(ehci->dma, addr, buf, sizeof(*buf));
+        dma_memory_read(ehci->as, addr, buf, sizeof(*buf));
         *buf = le32_to_cpu(*buf);
     }
 
@@ -467,7 +467,7 @@ static inline int put_dwords(EHCIState *ehci, uint32_t addr,
 {
     int i;
 
-    if (!ehci->dma) {
+    if (!ehci->as) {
         ehci_raise_irq(ehci, USBSTS_HSE);
         ehci->usbcmd &= ~USBCMD_RUNSTOP;
         trace_usb_ehci_dma_error();
@@ -476,7 +476,7 @@ static inline int put_dwords(EHCIState *ehci, uint32_t addr,
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
         uint32_t tmp = cpu_to_le32(*buf);
-        dma_memory_write(ehci->dma, addr, &tmp, sizeof(tmp));
+        dma_memory_write(ehci->as, addr, &tmp, sizeof(tmp));
     }
 
     return num;
@@ -1245,7 +1245,7 @@ static int ehci_init_transfer(EHCIPacket *p)
     cpage  = get_field(p->qtd.token, QTD_TOKEN_CPAGE);
     bytes  = get_field(p->qtd.token, QTD_TOKEN_TBYTES);
     offset = p->qtd.bufptr[0] & ~QTD_BUFPTR_MASK;
-    qemu_sglist_init(&p->sgl, 5, p->queue->ehci->dma);
+    qemu_sglist_init(&p->sgl, 5, p->queue->ehci->as);
 
     while (bytes > 0) {
         if (cpage > 4) {
@@ -1484,7 +1484,7 @@ static int ehci_process_itd(EHCIState *ehci,
                 return -1;
             }
 
-            qemu_sglist_init(&ehci->isgl, 2, ehci->dma);
+            qemu_sglist_init(&ehci->isgl, 2, ehci->as);
             if (off + len > 4096) {
                 /* transfer crosses page border */
                 uint32_t len2 = off + len - 4096;
diff --git a/hw/usb/hcd-ehci.h b/hw/usb/hcd-ehci.h
index e95bb7e..2fcb92f 100644
--- a/hw/usb/hcd-ehci.h
+++ b/hw/usb/hcd-ehci.h
@@ -261,7 +261,7 @@ struct EHCIState {
     USBBus bus;
     qemu_irq irq;
     MemoryRegion mem;
-    DMAContext *dma;
+    AddressSpace *as;
     MemoryRegion mem_caps;
     MemoryRegion mem_opreg;
     MemoryRegion mem_ports;
diff --git a/hw/usb/hcd-ohci.c b/hw/usb/hcd-ohci.c
index 51241cd..5513924 100644
--- a/hw/usb/hcd-ohci.c
+++ b/hw/usb/hcd-ohci.c
@@ -62,7 +62,7 @@ typedef struct {
     USBBus bus;
     qemu_irq irq;
     MemoryRegion mem;
-    DMAContext *dma;
+    AddressSpace *as;
     int num_ports;
     const char *name;
 
@@ -508,7 +508,7 @@ static inline int get_dwords(OHCIState *ohci,
     addr += ohci->localmem_base;
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-        dma_memory_read(ohci->dma, addr, buf, sizeof(*buf));
+        dma_memory_read(ohci->as, addr, buf, sizeof(*buf));
         *buf = le32_to_cpu(*buf);
     }
 
@@ -525,7 +525,7 @@ static inline int put_dwords(OHCIState *ohci,
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
         uint32_t tmp = cpu_to_le32(*buf);
-        dma_memory_write(ohci->dma, addr, &tmp, sizeof(tmp));
+        dma_memory_write(ohci->as, addr, &tmp, sizeof(tmp));
     }
 
     return 1;
@@ -540,7 +540,7 @@ static inline int get_words(OHCIState *ohci,
     addr += ohci->localmem_base;
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
-        dma_memory_read(ohci->dma, addr, buf, sizeof(*buf));
+        dma_memory_read(ohci->as, addr, buf, sizeof(*buf));
         *buf = le16_to_cpu(*buf);
     }
 
@@ -557,7 +557,7 @@ static inline int put_words(OHCIState *ohci,
 
     for (i = 0; i < num; i++, buf++, addr += sizeof(*buf)) {
         uint16_t tmp = cpu_to_le16(*buf);
-        dma_memory_write(ohci->dma, addr, &tmp, sizeof(tmp));
+        dma_memory_write(ohci->as, addr, &tmp, sizeof(tmp));
     }
 
     return 1;
@@ -585,7 +585,7 @@ static inline int ohci_read_iso_td(OHCIState *ohci,
 static inline int ohci_read_hcca(OHCIState *ohci,
                                  dma_addr_t addr, struct ohci_hcca *hcca)
 {
-    dma_memory_read(ohci->dma, addr + ohci->localmem_base, hcca, sizeof(*hcca));
+    dma_memory_read(ohci->as, addr + ohci->localmem_base, hcca, sizeof(*hcca));
     return 1;
 }
 
@@ -617,7 +617,7 @@ static inline int ohci_put_iso_td(OHCIState *ohci,
 static inline int ohci_put_hcca(OHCIState *ohci,
                                 dma_addr_t addr, struct ohci_hcca *hcca)
 {
-    dma_memory_write(ohci->dma,
+    dma_memory_write(ohci->as,
                      addr + ohci->localmem_base + HCCA_WRITEBACK_OFFSET,
                      (char *)hcca + HCCA_WRITEBACK_OFFSET,
                      HCCA_WRITEBACK_SIZE);
@@ -634,12 +634,12 @@ static void ohci_copy_td(OHCIState *ohci, struct ohci_td *td,
     n = 0x1000 - (ptr & 0xfff);
     if (n > len)
         n = len;
-    dma_memory_rw(ohci->dma, ptr + ohci->localmem_base, buf, n, dir);
+    dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, n, dir);
     if (n == len)
         return;
     ptr = td->be & ~0xfffu;
     buf += n;
-    dma_memory_rw(ohci->dma, ptr + ohci->localmem_base, buf, len - n, dir);
+    dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, len - n, dir);
 }
 
 /* Read/Write the contents of an ISO TD from/to main memory.  */
@@ -653,12 +653,12 @@ static void ohci_copy_iso_td(OHCIState *ohci,
     n = 0x1000 - (ptr & 0xfff);
     if (n > len)
         n = len;
-    dma_memory_rw(ohci->dma, ptr + ohci->localmem_base, buf, n, dir);
+    dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, n, dir);
     if (n == len)
         return;
     ptr = end_addr & ~0xfffu;
     buf += n;
-    dma_memory_rw(ohci->dma, ptr + ohci->localmem_base, buf, len - n, dir);
+    dma_memory_rw(ohci->as, ptr + ohci->localmem_base, buf, len - n, dir);
 }
 
 static void ohci_process_lists(OHCIState *ohci, int completion);
@@ -1788,11 +1788,11 @@ static USBBusOps ohci_bus_ops = {
 static int usb_ohci_init(OHCIState *ohci, DeviceState *dev,
                          int num_ports, dma_addr_t localmem_base,
                          char *masterbus, uint32_t firstport,
-                         DMAContext *dma)
+                         AddressSpace *as)
 {
     int i;
 
-    ohci->dma = dma;
+    ohci->as = as;
 
     if (usb_frame_time == 0) {
 #ifdef OHCI_TIME_WARP
@@ -1859,7 +1859,7 @@ static int usb_ohci_initfn_pci(struct PCIDevice *dev)
 
     if (usb_ohci_init(&ohci->state, &dev->qdev, ohci->num_ports, 0,
                       ohci->masterbus, ohci->firstport,
-                      pci_dma_context(dev)) != 0) {
+                      pci_get_address_space(dev)) != 0) {
         return -1;
     }
     ohci->state.irq = ohci->pci_dev.irq[0];
@@ -1882,7 +1882,7 @@ static int ohci_init_pxa(SysBusDevice *dev)
 
     /* Cannot fail as we pass NULL for masterbus */
     usb_ohci_init(&s->ohci, &dev->qdev, s->num_ports, s->dma_offset, NULL, 0,
-                  &dma_context_memory);
+                  &address_space_memory);
     sysbus_init_irq(dev, &s->ohci.irq);
     sysbus_init_mmio(dev, &s->ohci.mem);
 
diff --git a/hw/usb/libhw.c b/hw/usb/libhw.c
index d2d4b51..8df11c4 100644
--- a/hw/usb/libhw.c
+++ b/hw/usb/libhw.c
@@ -37,7 +37,7 @@ int usb_packet_map(USBPacket *p, QEMUSGList *sgl)
 
         while (len) {
             dma_addr_t xlen = len;
-            mem = dma_memory_map(sgl->dma, base, &xlen, dir);
+            mem = dma_memory_map(sgl->as, base, &xlen, dir);
             if (!mem) {
                 goto err;
             }
@@ -63,7 +63,7 @@ void usb_packet_unmap(USBPacket *p, QEMUSGList *sgl)
     int i;
 
     for (i = 0; i < p->iov.niov; i++) {
-        dma_memory_unmap(sgl->dma, p->iov.iov[i].iov_base,
+        dma_memory_unmap(sgl->as, p->iov.iov[i].iov_base,
                          p->iov.iov[i].iov_len, dir,
                          p->iov.iov[i].iov_len);
     }
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index 3a85bce..6ef1f97 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -242,7 +242,6 @@ struct PCIDevice {
     PCIIORegion io_regions[PCI_NUM_REGIONS];
     AddressSpace bus_master_as;
     MemoryRegion bus_master_enable_region;
-    DMAContext *dma;
 
     /* do not access the following fields */
     PCIConfigReadFunc *config_read;
@@ -639,15 +638,15 @@ static inline uint32_t pci_config_size(const PCIDevice *d)
 }
 
 /* DMA access functions */
-static inline DMAContext *pci_dma_context(PCIDevice *dev)
+static inline AddressSpace *pci_get_address_space(PCIDevice *dev)
 {
-    return dev->dma;
+    return &dev->bus_master_as;
 }
 
 static inline int pci_dma_rw(PCIDevice *dev, dma_addr_t addr,
                              void *buf, dma_addr_t len, DMADirection dir)
 {
-    dma_memory_rw(pci_dma_context(dev), addr, buf, len, dir);
+    dma_memory_rw(pci_get_address_space(dev), addr, buf, len, dir);
     return 0;
 }
 
@@ -667,12 +666,12 @@ static inline int pci_dma_write(PCIDevice *dev, dma_addr_t addr,
     static inline uint##_bits##_t ld##_l##_pci_dma(PCIDevice *dev,      \
                                                    dma_addr_t addr)     \
     {                                                                   \
-        return ld##_l##_dma(pci_dma_context(dev), addr);                \
+        return ld##_l##_dma(pci_get_address_space(dev), addr);          \
     }                                                                   \
     static inline void st##_s##_pci_dma(PCIDevice *dev,                 \
                                         dma_addr_t addr, uint##_bits##_t val) \
     {                                                                   \
-        st##_s##_dma(pci_dma_context(dev), addr, val);                  \
+        st##_s##_dma(pci_get_address_space(dev), addr, val);            \
     }
 
 PCI_DMA_DEFINE_LDST(ub, b, 8);
@@ -690,20 +689,20 @@ static inline void *pci_dma_map(PCIDevice *dev, dma_addr_t addr,
 {
     void *buf;
 
-    buf = dma_memory_map(pci_dma_context(dev), addr, plen, dir);
+    buf = dma_memory_map(pci_get_address_space(dev), addr, plen, dir);
     return buf;
 }
 
 static inline void pci_dma_unmap(PCIDevice *dev, void *buffer, dma_addr_t len,
                                  DMADirection dir, dma_addr_t access_len)
 {
-    dma_memory_unmap(pci_dma_context(dev), buffer, len, dir, access_len);
+    dma_memory_unmap(pci_get_address_space(dev), buffer, len, dir, access_len);
 }
 
 static inline void pci_dma_sglist_init(QEMUSGList *qsg, PCIDevice *dev,
                                        int alloc_hint)
 {
-    qemu_sglist_init(qsg, alloc_hint, pci_dma_context(dev));
+    qemu_sglist_init(qsg, alloc_hint, pci_get_address_space(dev));
 }
 
 extern const VMStateDescription vmstate_pci_device;
diff --git a/include/hw/ppc/spapr_vio.h b/include/hw/ppc/spapr_vio.h
index 2de58f1..3609327 100644
--- a/include/hw/ppc/spapr_vio.h
+++ b/include/hw/ppc/spapr_vio.h
@@ -64,7 +64,6 @@ struct VIOsPAPRDevice {
     target_ulong signal_state;
     VIOsPAPR_CRQ crq;
     AddressSpace as;
-    DMAContext dma;
     sPAPRTCETable *tcet;
 };
 
@@ -93,35 +92,35 @@ static inline qemu_irq spapr_vio_qirq(VIOsPAPRDevice *dev)
 static inline bool spapr_vio_dma_valid(VIOsPAPRDevice *dev, uint64_t taddr,
                                        uint32_t size, DMADirection dir)
 {
-    return dma_memory_valid(&dev->dma, taddr, size, dir);
+    return dma_memory_valid(&dev->as, taddr, size, dir);
 }
 
 static inline int spapr_vio_dma_read(VIOsPAPRDevice *dev, uint64_t taddr,
                                      void *buf, uint32_t size)
 {
-    return (dma_memory_read(&dev->dma, taddr, buf, size) != 0) ?
+    return (dma_memory_read(&dev->as, taddr, buf, size) != 0) ?
         H_DEST_PARM : H_SUCCESS;
 }
 
 static inline int spapr_vio_dma_write(VIOsPAPRDevice *dev, uint64_t taddr,
                                       const void *buf, uint32_t size)
 {
-    return (dma_memory_write(&dev->dma, taddr, buf, size) != 0) ?
+    return (dma_memory_write(&dev->as, taddr, buf, size) != 0) ?
         H_DEST_PARM : H_SUCCESS;
 }
 
 static inline int spapr_vio_dma_set(VIOsPAPRDevice *dev, uint64_t taddr,
                                     uint8_t c, uint32_t size)
 {
-    return (dma_memory_set(&dev->dma, taddr, c, size) != 0) ?
+    return (dma_memory_set(&dev->as, taddr, c, size) != 0) ?
         H_DEST_PARM : H_SUCCESS;
 }
 
-#define vio_stb(_dev, _addr, _val) (stb_dma(&(_dev)->dma, (_addr), (_val)))
-#define vio_sth(_dev, _addr, _val) (stw_be_dma(&(_dev)->dma, (_addr), (_val)))
-#define vio_stl(_dev, _addr, _val) (stl_be_dma(&(_dev)->dma, (_addr), (_val)))
-#define vio_stq(_dev, _addr, _val) (stq_be_dma(&(_dev)->dma, (_addr), (_val)))
-#define vio_ldq(_dev, _addr) (ldq_be_dma(&(_dev)->dma, (_addr)))
+#define vio_stb(_dev, _addr, _val) (stb_dma(&(_dev)->as, (_addr), (_val)))
+#define vio_sth(_dev, _addr, _val) (stw_be_dma(&(_dev)->as, (_addr), (_val)))
+#define vio_stl(_dev, _addr, _val) (stl_be_dma(&(_dev)->as, (_addr), (_val)))
+#define vio_stq(_dev, _addr, _val) (stq_be_dma(&(_dev)->as, (_addr), (_val)))
+#define vio_ldq(_dev, _addr) (ldq_be_dma(&(_dev)->as, (_addr)))
 
 int spapr_vio_send_crq(VIOsPAPRDevice *dev, uint8_t *crq);
 
diff --git a/include/sysemu/dma.h b/include/sysemu/dma.h
index 3317dcf..031d1f5 100644
--- a/include/sysemu/dma.h
+++ b/include/sysemu/dma.h
@@ -12,11 +12,11 @@
 
 #include <stdio.h>
 #include "exec/memory.h"
+#include "exec/address-spaces.h"
 #include "hw/hw.h"
 #include "block/block.h"
 #include "sysemu/kvm.h"
 
-typedef struct DMAContext DMAContext;
 typedef struct ScatterGatherEntry ScatterGatherEntry;
 
 typedef enum {
@@ -29,7 +29,7 @@ struct QEMUSGList {
     int nsg;
     int nalloc;
     size_t size;
-    DMAContext *dma;
+    AddressSpace *as;
 };
 
 #ifndef CONFIG_USER_ONLY
@@ -46,16 +46,7 @@ typedef uint64_t dma_addr_t;
 #define DMA_ADDR_BITS 64
 #define DMA_ADDR_FMT "%" PRIx64
 
-struct DMAContext {
-    AddressSpace *as;
-};
-
-/* A global DMA context corresponding to the address_space_memory
- * AddressSpace, for sysbus devices which do DMA.
- */
-extern DMAContext dma_context_memory;
-
-static inline void dma_barrier(DMAContext *dma, DMADirection dir)
+static inline void dma_barrier(AddressSpace *as, DMADirection dir)
 {
     /*
      * This is called before DMA read and write operations
@@ -83,105 +74,105 @@ static inline void dma_barrier(DMAContext *dma, DMADirection dir)
 /* Checks that the given range of addresses is valid for DMA.  This is
  * useful for certain cases, but usually you should just use
  * dma_memory_{read,write}() and check for errors */
-static inline bool dma_memory_valid(DMAContext *dma,
+static inline bool dma_memory_valid(AddressSpace *as,
                                     dma_addr_t addr, dma_addr_t len,
                                     DMADirection dir)
 {
-    return address_space_access_valid(dma->as, addr, len,
+    return address_space_access_valid(as, addr, len,
                                       dir == DMA_DIRECTION_FROM_DEVICE);
 }
 
-static inline int dma_memory_rw_relaxed(DMAContext *dma, dma_addr_t addr,
+static inline int dma_memory_rw_relaxed(AddressSpace *as, dma_addr_t addr,
                                         void *buf, dma_addr_t len,
                                         DMADirection dir)
 {
-    return address_space_rw(dma->as, addr, buf, len, dir == DMA_DIRECTION_FROM_DEVICE);
+    return address_space_rw(as, addr, buf, len, dir == DMA_DIRECTION_FROM_DEVICE);
 }
 
-static inline int dma_memory_read_relaxed(DMAContext *dma, dma_addr_t addr,
+static inline int dma_memory_read_relaxed(AddressSpace *as, dma_addr_t addr,
                                           void *buf, dma_addr_t len)
 {
-    return dma_memory_rw_relaxed(dma, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
+    return dma_memory_rw_relaxed(as, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
 }
 
-static inline int dma_memory_write_relaxed(DMAContext *dma, dma_addr_t addr,
+static inline int dma_memory_write_relaxed(AddressSpace *as, dma_addr_t addr,
                                            const void *buf, dma_addr_t len)
 {
-    return dma_memory_rw_relaxed(dma, addr, (void *)buf, len,
+    return dma_memory_rw_relaxed(as, addr, (void *)buf, len,
                                  DMA_DIRECTION_FROM_DEVICE);
 }
 
-static inline int dma_memory_rw(DMAContext *dma, dma_addr_t addr,
+static inline int dma_memory_rw(AddressSpace *as, dma_addr_t addr,
                                 void *buf, dma_addr_t len,
                                 DMADirection dir)
 {
-    dma_barrier(dma, dir);
+    dma_barrier(as, dir);
 
-    return dma_memory_rw_relaxed(dma, addr, buf, len, dir);
+    return dma_memory_rw_relaxed(as, addr, buf, len, dir);
 }
 
-static inline int dma_memory_read(DMAContext *dma, dma_addr_t addr,
+static inline int dma_memory_read(AddressSpace *as, dma_addr_t addr,
                                   void *buf, dma_addr_t len)
 {
-    return dma_memory_rw(dma, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
+    return dma_memory_rw(as, addr, buf, len, DMA_DIRECTION_TO_DEVICE);
 }
 
-static inline int dma_memory_write(DMAContext *dma, dma_addr_t addr,
+static inline int dma_memory_write(AddressSpace *as, dma_addr_t addr,
                                    const void *buf, dma_addr_t len)
 {
-    return dma_memory_rw(dma, addr, (void *)buf, len,
+    return dma_memory_rw(as, addr, (void *)buf, len,
                          DMA_DIRECTION_FROM_DEVICE);
 }
 
-int dma_memory_set(DMAContext *dma, dma_addr_t addr, uint8_t c, dma_addr_t len);
+int dma_memory_set(AddressSpace *as, dma_addr_t addr, uint8_t c, dma_addr_t len);
 
-static inline void *dma_memory_map(DMAContext *dma,
+static inline void *dma_memory_map(AddressSpace *as,
                                    dma_addr_t addr, dma_addr_t *len,
                                    DMADirection dir)
 {
     hwaddr xlen = *len;
     void *p;
 
-    p = address_space_map(dma->as, addr, &xlen, dir == DMA_DIRECTION_FROM_DEVICE);
+    p = address_space_map(as, addr, &xlen, dir == DMA_DIRECTION_FROM_DEVICE);
     *len = xlen;
     return p;
 }
 
-static inline void dma_memory_unmap(DMAContext *dma,
+static inline void dma_memory_unmap(AddressSpace *as,
                                     void *buffer, dma_addr_t len,
                                     DMADirection dir, dma_addr_t access_len)
 {
-    address_space_unmap(dma->as, buffer, (hwaddr)len,
+    address_space_unmap(as, buffer, (hwaddr)len,
                         dir == DMA_DIRECTION_FROM_DEVICE, access_len);
 }
 
 #define DEFINE_LDST_DMA(_lname, _sname, _bits, _end) \
-    static inline uint##_bits##_t ld##_lname##_##_end##_dma(DMAContext *dma, \
+    static inline uint##_bits##_t ld##_lname##_##_end##_dma(AddressSpace *as, \
                                                             dma_addr_t addr) \
     {                                                                   \
         uint##_bits##_t val;                                            \
-        dma_memory_read(dma, addr, &val, (_bits) / 8);                  \
+        dma_memory_read(as, addr, &val, (_bits) / 8);                   \
         return _end##_bits##_to_cpu(val);                               \
     }                                                                   \
-    static inline void st##_sname##_##_end##_dma(DMAContext *dma,       \
+    static inline void st##_sname##_##_end##_dma(AddressSpace *as,      \
                                                  dma_addr_t addr,       \
                                                  uint##_bits##_t val)   \
     {                                                                   \
         val = cpu_to_##_end##_bits(val);                                \
-        dma_memory_write(dma, addr, &val, (_bits) / 8);                 \
+        dma_memory_write(as, addr, &val, (_bits) / 8);                  \
     }
 
-static inline uint8_t ldub_dma(DMAContext *dma, dma_addr_t addr)
+static inline uint8_t ldub_dma(AddressSpace *as, dma_addr_t addr)
 {
     uint8_t val;
 
-    dma_memory_read(dma, addr, &val, 1);
+    dma_memory_read(as, addr, &val, 1);
     return val;
 }
 
-static inline void stb_dma(DMAContext *dma, dma_addr_t addr, uint8_t val)
+static inline void stb_dma(AddressSpace *as, dma_addr_t addr, uint8_t val)
 {
-    dma_memory_write(dma, addr, &val, 1);
+    dma_memory_write(as, addr, &val, 1);
 }
 
 DEFINE_LDST_DMA(uw, w, 16, le);
@@ -193,14 +184,12 @@ DEFINE_LDST_DMA(q, q, 64, be);
 
 #undef DEFINE_LDST_DMA
 
-void dma_context_init(DMAContext *dma, AddressSpace *as);
-
 struct ScatterGatherEntry {
     dma_addr_t base;
     dma_addr_t len;
 };
 
-void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, DMAContext *dma);
+void qemu_sglist_init(QEMUSGList *qsg, int alloc_hint, AddressSpace *as);
 void qemu_sglist_add(QEMUSGList *qsg, dma_addr_t base, dma_addr_t len);
 void qemu_sglist_destroy(QEMUSGList *qsg);
 #endif
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [Qemu-devel] [PATCH 21/21] memory: give name to every AddressSpace
  2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
                   ` (19 preceding siblings ...)
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 20/21] dma: eliminate DMAContext Paolo Bonzini
@ 2013-05-30 21:17 ` Paolo Bonzini
  20 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-30 21:17 UTC (permalink / raw)
  To: qemu-devel; +Cc: Alexey Kardashevskiy

From: Alexey Kardashevskiy <aik@ozlabs.ru>

The "info mtree" command in QEMU console prints only "memory" and "I/O"
address spaces while there are actually a lot more other AddressSpace
structs created by PCI and VIO devices. Those devices do not normally
have names and therefore not present in "info mtree" output.

The patch fixes this.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c                | 6 ++----
 hw/pci/pci.c          | 3 ++-
 hw/ppc/spapr_pci.c    | 4 +++-
 hw/ppc/spapr_vio.c    | 2 +-
 include/exec/memory.h | 6 ++++--
 memory.c              | 8 +++-----
 6 files changed, 15 insertions(+), 14 deletions(-)

diff --git a/exec.c b/exec.c
index e58f842..8909478 100644
--- a/exec.c
+++ b/exec.c
@@ -1827,13 +1827,11 @@ static void memory_map_init(void)
 {
     system_memory = g_malloc(sizeof(*system_memory));
     memory_region_init(system_memory, "system", INT64_MAX);
-    address_space_init(&address_space_memory, system_memory);
-    address_space_memory.name = "memory";
+    address_space_init(&address_space_memory, system_memory, "memory");
 
     system_io = g_malloc(sizeof(*system_io));
     memory_region_init(system_io, "io", 65536);
-    address_space_init(&address_space_io, system_io);
-    address_space_io.name = "I/O";
+    address_space_init(&address_space_io, system_io, "I/O");
 
     memory_listener_register(&core_memory_listener, &address_space_memory);
     memory_listener_register(&io_memory_listener, &address_space_io);
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 2b1bc6d..fa30110 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -815,7 +815,8 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus,
     memory_region_init_alias(&pci_dev->bus_master_enable_region, "bus master",
                              dma_as->root, 0, memory_region_size(dma_as->root));
     memory_region_set_enabled(&pci_dev->bus_master_enable_region, false);
-    address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region);
+    address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_enable_region,
+                       name);
 
     pci_dev->devfn = devfn;
     pstrcpy(pci_dev->name, sizeof(pci_dev->name), name);
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 459398c..04e8362 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -650,7 +650,9 @@ static int spapr_phb_init(SysBusDevice *s)
         fprintf(stderr, "Unable to create TCE table for %s\n", sphb->dtbusname);
         return -1;
     }
-    address_space_init(&sphb->iommu_as, spapr_tce_get_iommu(sphb->tcet));
+    address_space_init(&sphb->iommu_as, spapr_tce_get_iommu(sphb->tcet),
+                       sphb->dtbusname);
+
     pci_setup_iommu(bus, spapr_pci_dma_iommu, sphb);
 
     QLIST_INSERT_HEAD(&spapr->phbs, sphb, list);
diff --git a/hw/ppc/spapr_vio.c b/hw/ppc/spapr_vio.c
index 5e72f1b..3cfa326 100644
--- a/hw/ppc/spapr_vio.c
+++ b/hw/ppc/spapr_vio.c
@@ -458,7 +458,7 @@ static int spapr_vio_busdev_init(DeviceState *qdev)
     if (pc->rtce_window_size) {
         uint32_t liobn = SPAPR_VIO_BASE_LIOBN | dev->reg;
         dev->tcet = spapr_tce_new_table(liobn, pc->rtce_window_size);
-        address_space_init(&dev->as, spapr_tce_get_iommu(dev->tcet));
+        address_space_init(&dev->as, spapr_tce_get_iommu(dev->tcet), qdev->id);
     }
 
     return pc->init(dev);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index c81acfd..a73adbe 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -192,7 +192,7 @@ struct MemoryRegionPortio {
  */
 struct AddressSpace {
     /* All fields are private. */
-    const char *name;
+    char *name;
     MemoryRegion *root;
     struct FlatView *current_map;
     int ioeventfd_nb;
@@ -892,8 +892,10 @@ void mtree_info(fprintf_function mon_printf, void *f);
  *
  * @as: an uninitialized #AddressSpace
  * @root: a #MemoryRegion that routes addesses for the address space
+ * @name: an address space name.  The name is only used for debugging
+ *        output.
  */
-void address_space_init(AddressSpace *as, MemoryRegion *root);
+void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name);
 
 
 /**
diff --git a/memory.c b/memory.c
index 802867d..363f8f2 100644
--- a/memory.c
+++ b/memory.c
@@ -1657,7 +1657,7 @@ void memory_listener_unregister(MemoryListener *listener)
     QTAILQ_REMOVE(&memory_listeners, listener, link);
 }
 
-void address_space_init(AddressSpace *as, MemoryRegion *root)
+void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name)
 {
     memory_region_transaction_begin();
     as->root = root;
@@ -1666,7 +1666,7 @@ void address_space_init(AddressSpace *as, MemoryRegion *root)
     as->ioeventfd_nb = 0;
     as->ioeventfds = NULL;
     QTAILQ_INSERT_TAIL(&address_spaces, as, address_spaces_link);
-    as->name = NULL;
+    as->name = g_strdup(name ? name : "anonymous");
     address_space_init_dispatch(as);
     memory_region_update_pending |= root->enabled;
     memory_region_transaction_commit();
@@ -1681,6 +1681,7 @@ void address_space_destroy(AddressSpace *as)
     QTAILQ_REMOVE(&address_spaces, as, address_spaces_link);
     address_space_destroy_dispatch(as);
     flatview_destroy(as->current_map);
+    g_free(as->name);
     g_free(as->current_map);
     g_free(as->ioeventfds);
 }
@@ -1807,9 +1808,6 @@ void mtree_info(fprintf_function mon_printf, void *f)
     QTAILQ_INIT(&ml_head);
 
     QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
-        if (!as->name) {
-            continue;
-        }
         mon_printf(f, "%s\n", as->name);
         mtree_print_mr(mon_printf, f, as->root, 0, 0, &ml_head);
     }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer Paolo Bonzini
@ 2013-05-31  6:56   ` Alexey Kardashevskiy
  2013-05-31  7:12     ` Paolo Bonzini
  2013-05-31 22:18   ` Richard Henderson
  2013-06-06  8:36   ` Alexey Kardashevskiy
  2 siblings, 1 reply; 46+ messages in thread
From: Alexey Kardashevskiy @ 2013-05-31  6:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 05/31/2013 07:16 AM, Paolo Bonzini wrote:
> So far, the size of all regions passed to listeners could fit in 64 bits,
> because artificial regions (containers and aliases) are eliminated by
> the memory core, leaving only device regions which have reasonable sizes
> 
> An IOMMU however cannot be eliminated by the memory core, and may have
> an artificial size, hence we may need 65 bits to represent its size.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

[...]

> diff --git a/kvm-all.c b/kvm-all.c
> index 8222729..86c9af3 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -316,7 +316,7 @@ static void kvm_log_start(MemoryListener *listener,
>      int r;
>  
>      r = kvm_dirty_pages_log_change(section->offset_within_address_space,
> -                                   section->size, true);
> +                                   int128_get64(section->size), true);
>      if (r < 0) {
>          abort();
>      }
> @@ -328,7 +328,7 @@ static void kvm_log_stop(MemoryListener *listener,
>      int r;
>  
>      r = kvm_dirty_pages_log_change(section->offset_within_address_space,
> -                                   section->size, false);
> +                                   int128_get64(section->size), false);
>      if (r < 0) {
>          abort();
>      }
> @@ -366,7 +366,8 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
>      unsigned int i, j;
>      unsigned long page_number, c;
>      hwaddr addr, addr1;
> -    unsigned int len = ((section->size / getpagesize()) + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
> +    unsigned int pages = int128_get64(section->size) / getpagesize();
> +    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
>      unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
>  
>      /*
> @@ -409,7 +410,7 @@ static int kvm_physical_sync_dirty_bitmap(MemoryRegionSection *section)
>      KVMSlot *mem;
>      int ret = 0;
>      hwaddr start_addr = section->offset_within_address_space;
> -    hwaddr end_addr = start_addr + section->size;
> +    hwaddr end_addr = start_addr + int128_get64(section->size);
>  
>      d.dirty_bitmap = NULL;
>      while (start_addr < end_addr) {
> @@ -619,7 +620,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
>      MemoryRegion *mr = section->mr;
>      bool log_dirty = memory_region_is_logging(mr);
>      hwaddr start_addr = section->offset_within_address_space;
> -    ram_addr_t size = section->size;
> +    ram_addr_t size = int128_get64(section->size);
>      void *ram = NULL;
>      unsigned delta;
>  

Tried to replay part2 and part3 on qemu.org/master (I assume part1 is
already there), part2 played well, part3 failed. What did I do wrong? Thanks.


alexey@ka1:~/pcipassthru/qemu-impreza$ git am ~/bonzini-iommu3.mbox
Applying: memory: Introduce address_space_lookup_region
Applying: memory: move private types to exec.c
Applying: exec: Allow unaligned address_space_rw
Applying: exec: Resolve subpages in one step except for IOTLB fills
Applying: exec: Implement subpage_read/write via address_space_rw
Applying: exec: return MemoryRegion from address_space_translate
Applying: Revert "memory: limit sections in the radix tree to the actual
address space size"
Applying: Revert "s390x: reduce TARGET_PHYS_ADDR_SPACE_BITS to 62"
Applying: exec: reorganize mem_add to match Int128 version
Applying: memory: make section size a 128-bit integer
error: patch failed: kvm-all.c:619
error: kvm-all.c: patch does not apply
Patch failed at 0010 memory: make section size a 128-bit integer
When you have resolved this problem run "git am --resolved".
If you would prefer to skip this patch, instead run "git am --skip".
To restore the original branch and stop patching run "git am --abort".



-- 
Alexey

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-05-31  6:56   ` Alexey Kardashevskiy
@ 2013-05-31  7:12     ` Paolo Bonzini
  0 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-05-31  7:12 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: qemu-devel

Il 31/05/2013 08:56, Alexey Kardashevskiy ha scritto:
> On 05/31/2013 07:16 AM, Paolo Bonzini wrote:
>> So far, the size of all regions passed to listeners could fit in 64 bits,
>> because artificial regions (containers and aliases) are eliminated by
>> the memory core, leaving only device regions which have reasonable sizes
>>
>> An IOMMU however cannot be eliminated by the memory core, and may have
>> an artificial size, hence we may need 65 bits to represent its size.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> 
> [...]
> 
>> diff --git a/kvm-all.c b/kvm-all.c
>> index 8222729..86c9af3 100644
>> --- a/kvm-all.c
>> +++ b/kvm-all.c
>> @@ -316,7 +316,7 @@ static void kvm_log_start(MemoryListener *listener,
>>      int r;
>>  
>>      r = kvm_dirty_pages_log_change(section->offset_within_address_space,
>> -                                   section->size, true);
>> +                                   int128_get64(section->size), true);
>>      if (r < 0) {
>>          abort();
>>      }
>> @@ -328,7 +328,7 @@ static void kvm_log_stop(MemoryListener *listener,
>>      int r;
>>  
>>      r = kvm_dirty_pages_log_change(section->offset_within_address_space,
>> -                                   section->size, false);
>> +                                   int128_get64(section->size), false);
>>      if (r < 0) {
>>          abort();
>>      }
>> @@ -366,7 +366,8 @@ static int kvm_get_dirty_pages_log_range(MemoryRegionSection *section,
>>      unsigned int i, j;
>>      unsigned long page_number, c;
>>      hwaddr addr, addr1;
>> -    unsigned int len = ((section->size / getpagesize()) + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
>> +    unsigned int pages = int128_get64(section->size) / getpagesize();
>> +    unsigned int len = (pages + HOST_LONG_BITS - 1) / HOST_LONG_BITS;
>>      unsigned long hpratio = getpagesize() / TARGET_PAGE_SIZE;
>>  
>>      /*
>> @@ -409,7 +410,7 @@ static int kvm_physical_sync_dirty_bitmap(MemoryRegionSection *section)
>>      KVMSlot *mem;
>>      int ret = 0;
>>      hwaddr start_addr = section->offset_within_address_space;
>> -    hwaddr end_addr = start_addr + section->size;
>> +    hwaddr end_addr = start_addr + int128_get64(section->size);
>>  
>>      d.dirty_bitmap = NULL;
>>      while (start_addr < end_addr) {
>> @@ -619,7 +620,7 @@ static void kvm_set_phys_mem(MemoryRegionSection *section, bool add)
>>      MemoryRegion *mr = section->mr;
>>      bool log_dirty = memory_region_is_logging(mr);
>>      hwaddr start_addr = section->offset_within_address_space;
>> -    ram_addr_t size = section->size;
>> +    ram_addr_t size = int128_get64(section->size);
>>      void *ram = NULL;
>>      unsigned delta;
>>  
> 
> Tried to replay part2 and part3 on qemu.org/master (I assume part1 is
> already there), part2 played well, part3 failed. What did I do wrong? Thanks.

Needs a rebase, it seems.  I'll push to the iommu branch later today and
repost this patch.

Paolo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 01/21] memory: Introduce address_space_lookup_region
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 01/21] memory: Introduce address_space_lookup_region Paolo Bonzini
@ 2013-05-31 21:56   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 21:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Kiszka, qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> This introduces a wrapper for phys_page_find (before we complicate
> address_space_translate with IOMMU translation).
> 
> The function will also include subpage handling.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 02/21] memory: move private types to exec.c
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 02/21] memory: move private types to exec.c Paolo Bonzini
@ 2013-05-31 21:57   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 21:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c                         | 16 ++++++++++++++++
>  include/exec/memory-internal.h | 15 ---------------
>  2 files changed, 16 insertions(+), 15 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 03/21] exec: Allow unaligned address_space_rw
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 03/21] exec: Allow unaligned address_space_rw Paolo Bonzini
@ 2013-05-31 21:57   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 21:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Kiszka, qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> This will be needed for some corner cases with para-virtual I/O ports.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 04/21] exec: Resolve subpages in one step except for IOTLB fills
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 04/21] exec: Resolve subpages in one step except for IOTLB fills Paolo Bonzini
@ 2013-05-31 21:58   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 21:58 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Kiszka, qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
>  static MemoryRegionSection *address_space_lookup_region(AddressSpace *as,
> -                                                        hwaddr addr)
> +                                                        hwaddr addr,
> +                                                        bool resolve_subpage)
>  {
> -    return phys_page_find(as->dispatch, addr >> TARGET_PAGE_BITS);
> +    MemoryRegionSection *section;
> +    subpage_t *subpage;
> +
> +    section = phys_page_find(as->dispatch, addr >> TARGET_PAGE_BITS);
> +    if (resolve_subpage && section->mr->subpage) {
> +        subpage = container_of(section->mr, subpage_t, iomem);
> +        section = &phys_sections[subpage->sub_section[SUBPAGE_IDX(addr)]];
> +    }
> +    return section;
>  }

Was there somewhere else that no longer needs to check phys_sections?
Or does that get eliminated in the course of subsequent patches?


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 05/21] exec: Implement subpage_read/write via address_space_rw
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 05/21] exec: Implement subpage_read/write via address_space_rw Paolo Bonzini
@ 2013-05-31 21:59   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 21:59 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Jan Kiszka, qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> This will allow to add support for unaligned memory regions: the subpage
> container region can activate unaligned support unconditionally because
> the read/write handler will now ensure that accesses are split as
> required by calling address_space_rw. We can furthermore drop the
> special handling of RAM subpages, address_space_rw takes care of this
> already.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c | 125 +++++++++++++++++++++++++----------------------------------------
>  1 file changed, 47 insertions(+), 78 deletions(-)

I take it this is the subsequent patch that I queried from 4/.
In which case both can have

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 06/21] exec: return MemoryRegion from address_space_translate
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 06/21] exec: return MemoryRegion from address_space_translate Paolo Bonzini
@ 2013-05-31 22:00   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 22:00 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
> Only address_space_translate_for_iotlb needs to return the section.
> Every caller of address_space_translate now uses only section->mr,
> return it directly.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c                | 150 +++++++++++++++++++++++++-------------------------
>  include/exec/memory.h |   8 +--
>  2 files changed, 79 insertions(+), 79 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 07/21] Revert "memory: limit sections in the radix tree to the actual address space size"
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 07/21] Revert "memory: limit sections in the radix tree to the actual address space size" Paolo Bonzini
@ 2013-05-31 22:04   ` Richard Henderson
  2013-06-01  6:29     ` Paolo Bonzini
  0 siblings, 1 reply; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 22:04 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
> This reverts commit 86a8623692b1b559a419a92eb8b6897c221bca74.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c                | 13 +------------
>  include/exec/memory.h |  3 ---
>  2 files changed, 1 insertion(+), 15 deletions(-)

I would think that both the 7/ and 8/ reversion patches would
have to be after 10/ in order to be bi-sectable?


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer Paolo Bonzini
  2013-05-31  6:56   ` Alexey Kardashevskiy
@ 2013-05-31 22:18   ` Richard Henderson
  2013-06-02 14:03     ` Paolo Bonzini
  2013-06-02 14:18     ` Peter Maydell
  2013-06-06  8:36   ` Alexey Kardashevskiy
  2 siblings, 2 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 22:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
> +static inline Int128 int128_rshift(Int128 a, int n)
> +{
> +    return (Int128) { (a.lo >> n) | (a.hi << (64 - n)), (a.hi >> n) };
> +}

Produces wrong results for n == 0, since (a.hi << 64) is undefined.


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 09/21] exec: reorganize mem_add to match Int128 version
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 09/21] exec: reorganize mem_add to match Int128 version Paolo Bonzini
@ 2013-05-31 22:42   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 22:42 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
> When adding support for 2^64-byte sections, we will have to change
> the structure of mem_add to avoid failures in int128_get64.
> Reorganize the code now before introducing Int128.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c | 39 ++++++++++++++++-----------------------
>  1 file changed, 16 insertions(+), 23 deletions(-)

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 11/21] memory: iommu support
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 11/21] memory: iommu support Paolo Bonzini
@ 2013-05-31 22:54   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 22:54 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Avi Kivity

On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
>  struct MemoryRegion {
>      /* All fields are private - violators will be prosecuted */
>      const MemoryRegionOps *ops;
> +    const MemoryRegionIOMMUOps *iommu_ops;
>      void *opaque;
>      MemoryRegion *parent;
...
> +void memory_region_init_iommu(MemoryRegion *mr,
> +                              MemoryRegionIOMMUOps *ops,
> +                              const char *name,
> +                              uint64_t size)
> +{

Surely the incoming ops pointer should be const too?  Otherwise,

Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 12/21] memory: Add iommu map/unmap notifiers
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 12/21] memory: Add iommu map/unmap notifiers Paolo Bonzini
@ 2013-05-31 22:56   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 22:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, David Gibson

On 05/30/2013 02:17 PM, Paolo Bonzini wrote:
> From: David Gibson <david@gibson.dropbear.id.au>
> 
> This patch adds a NotifierList to MemoryRegions which represent IOMMUs
> allowing other parts of the code to register interest in mappings or
> unmappings from the IOMMU.  All IOMMU implementations will need to call
> memory_region_notify_iommu() to inform those waiting on the notifier list,
> whenever an IOMMU mapping is made or removed.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  include/exec/memory.h | 32 ++++++++++++++++++++++++++++++++
>  memory.c              | 18 ++++++++++++++++++
>  2 files changed, 50 insertions(+)


Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 13/21] vfio: abort if an emulated iommu is used
  2013-05-30 21:17 ` [Qemu-devel] [PATCH 13/21] vfio: abort if an emulated iommu is used Paolo Bonzini
@ 2013-05-31 22:57   ` Richard Henderson
  0 siblings, 0 replies; 46+ messages in thread
From: Richard Henderson @ 2013-05-31 22:57 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Avi Kivity

On 05/30/2013 02:17 PM, Paolo Bonzini wrote:
> From: Avi Kivity <avi.kivity@gmail.com>
> 
> vfio doesn't support guest iommus yet, indicate it to the user
> by gently depositing a core on their disk.
> 
> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Avi Kivity <avi.kivity@gmail.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  hw/misc/vfio.c | 2 ++
>  1 file changed, 2 insertions(+)


Reviewed-by: Richard Henderson <rth@twiddle.net>


r~

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 07/21] Revert "memory: limit sections in the radix tree to the actual address space size"
  2013-05-31 22:04   ` Richard Henderson
@ 2013-06-01  6:29     ` Paolo Bonzini
  0 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-06-01  6:29 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

Il 01/06/2013 00:04, Richard Henderson ha scritto:
> On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
>> This reverts commit 86a8623692b1b559a419a92eb8b6897c221bca74.
>>
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>  exec.c                | 13 +------------
>>  include/exec/memory.h |  3 ---
>>  2 files changed, 1 insertion(+), 15 deletions(-)
> 
> I would think that both the 7/ and 8/ reversion patches would
> have to be after 10/ in order to be bi-sectable?

Not necessary, it only matters after IOMMU regions are introduced.
Normal regions are never 2^64 bytes in size.

Putting the reversion afterwards would conflict.

Paolo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-05-31 22:18   ` Richard Henderson
@ 2013-06-02 14:03     ` Paolo Bonzini
  2013-06-02 14:18     ` Peter Maydell
  1 sibling, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-06-02 14:03 UTC (permalink / raw)
  To: Richard Henderson; +Cc: qemu-devel

Il 01/06/2013 00:18, Richard Henderson ha scritto:
> On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
>> +static inline Int128 int128_rshift(Int128 a, int n)
>> +{
>> +    return (Int128) { (a.lo >> n) | (a.hi << (64 - n)), (a.hi >> n) };
>> +}
> 
> Produces wrong results for n == 0, since (a.hi << 64) is undefined.

Good catch, I'm adding an


    if (!n) {
        return a;
    }

before.

Paolo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-05-31 22:18   ` Richard Henderson
  2013-06-02 14:03     ` Paolo Bonzini
@ 2013-06-02 14:18     ` Peter Maydell
  2013-06-02 14:36       ` Paolo Bonzini
  1 sibling, 1 reply; 46+ messages in thread
From: Peter Maydell @ 2013-06-02 14:18 UTC (permalink / raw)
  To: Richard Henderson; +Cc: Paolo Bonzini, qemu-devel

On 31 May 2013 23:18, Richard Henderson <rth@twiddle.net> wrote:
> On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
>> +static inline Int128 int128_rshift(Int128 a, int n)
>> +{
>> +    return (Int128) { (a.lo >> n) | (a.hi << (64 - n)), (a.hi >> n) };
>> +}
>
> Produces wrong results for n == 0, since (a.hi << 64) is undefined.

It produces wrong results for shifts by more than 64,
for that matter.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-06-02 14:18     ` Peter Maydell
@ 2013-06-02 14:36       ` Paolo Bonzini
  2013-06-02 14:50         ` Peter Maydell
  0 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-06-02 14:36 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, Richard Henderson

Il 02/06/2013 16:18, Peter Maydell ha scritto:
> On 31 May 2013 23:18, Richard Henderson <rth@twiddle.net> wrote:
>> On 05/30/2013 02:16 PM, Paolo Bonzini wrote:
>>> +static inline Int128 int128_rshift(Int128 a, int n)
>>> +{
>>> +    return (Int128) { (a.lo >> n) | (a.hi << (64 - n)), (a.hi >> n) };
>>> +}
>>
>> Produces wrong results for n == 0, since (a.hi << 64) is undefined.
> 
> It produces wrong results for shifts by more than 64,
> for that matter.

This should work:

    int64_t h;
    if (!n) {
        return a;
    }
    h = a.hi >> n;
    if (n >= 64) {
        return (Int128) { h, h >> 63 };
    } else {
       return (Int128) { (a.lo >> n) | (a.hi << (64 - n)), h };
    }

Paolo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-06-02 14:36       ` Paolo Bonzini
@ 2013-06-02 14:50         ` Peter Maydell
  2013-06-02 19:52           ` Paolo Bonzini
  0 siblings, 1 reply; 46+ messages in thread
From: Peter Maydell @ 2013-06-02 14:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Richard Henderson

On 2 June 2013 15:36, Paolo Bonzini <pbonzini@redhat.com> wrote:
> This should work:
>
>     int64_t h;
>     if (!n) {
>         return a;
>     }
>     h = a.hi >> n;

This is undefined for n >= 64.

>     if (n >= 64) {
>         return (Int128) { h, h >> 63 };
>     } else {
>        return (Int128) { (a.lo >> n) | (a.hi << (64 - n)), h };
>     }

I would suggest looking at fpu/softfloat-macros.h:shift128Right()
except that that has at least one clearly dubious thing in it
(a check for "count < 64" in an else case that can't be reached
if count < 64)...

thanks
-- PMM

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-06-02 14:50         ` Peter Maydell
@ 2013-06-02 19:52           ` Paolo Bonzini
  0 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-06-02 19:52 UTC (permalink / raw)
  To: Peter Maydell; +Cc: qemu-devel, Richard Henderson

Il 02/06/2013 16:50, Peter Maydell ha scritto:
> On 2 June 2013 15:36, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> This should work:
>>
>>     int64_t h;
>>     if (!n) {
>>         return a;
>>     }
>>     h = a.hi >> n;
> 
> This is undefined for n >= 64.

Yes, it has to be a.hi >> (n & 63).

> I would suggest looking at fpu/softfloat-macros.h:shift128Right()
> except that that has at least one clearly dubious thing in it
> (a check for "count < 64" in an else case that can't be reached
> if count < 64)...

It's a bit different in that I want an arithmetic right shift.

Paolo

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-05-30 21:16 ` [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer Paolo Bonzini
  2013-05-31  6:56   ` Alexey Kardashevskiy
  2013-05-31 22:18   ` Richard Henderson
@ 2013-06-06  8:36   ` Alexey Kardashevskiy
  2013-06-07  1:09     ` Paolo Bonzini
  2 siblings, 1 reply; 46+ messages in thread
From: Alexey Kardashevskiy @ 2013-06-06  8:36 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Alex Williamson, qemu-devel, David Gibson

On 05/31/2013 07:16 AM, Paolo Bonzini wrote:
> So far, the size of all regions passed to listeners could fit in 64 bits,
> because artificial regions (containers and aliases) are eliminated by
> the memory core, leaving only device regions which have reasonable sizes
> 
> An IOMMU however cannot be eliminated by the memory core, and may have
> an artificial size, hence we may need 65 bits to represent its size.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  exec.c                        | 37 +++++++++++++++++++++----------------
>  hw/core/loader.c              |  2 +-
>  hw/display/exynos4210_fimd.c  |  4 ++--
>  hw/display/framebuffer.c      |  3 ++-
>  hw/misc/vfio.c                |  4 ++--
>  hw/virtio/dataplane/hostmem.c |  2 +-
>  hw/virtio/vhost.c             |  4 ++--
>  hw/virtio/virtio-balloon.c    |  2 +-
>  include/exec/memory.h         |  5 ++++-
>  include/qemu/int128.h         | 10 ++++++++++
>  kvm-all.c                     | 23 ++++++++++++++---------
>  memory.c                      | 14 +++++++-------
>  xen-all.c                     |  6 +++---
>  13 files changed, 70 insertions(+), 46 deletions(-)
> 
> diff --git a/exec.c b/exec.c
> index cf3ea6c..b86f0cc 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -801,7 +801,7 @@ static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *secti
>      MemoryRegionSection *existing = phys_page_find(d, base >> TARGET_PAGE_BITS);
>      MemoryRegionSection subsection = {
>          .offset_within_address_space = base,
> -        .size = TARGET_PAGE_SIZE,
> +        .size = int128_make64(TARGET_PAGE_SIZE),
>      };
>      hwaddr start, end;
>  
> @@ -816,16 +816,18 @@ static void register_subpage(AddressSpaceDispatch *d, MemoryRegionSection *secti
>          subpage = container_of(existing->mr, subpage_t, iomem);
>      }
>      start = section->offset_within_address_space & ~TARGET_PAGE_MASK;
> -    end = start + section->size - 1;
> +    end = start + int128_get64(section->size) - 1;
>      subpage_register(subpage, start, end, phys_section_add(section));
>  }
>  
>  
> -static void register_multipage(AddressSpaceDispatch *d, MemoryRegionSection *section)
> +static void register_multipage(AddressSpaceDispatch *d,
> +                               MemoryRegionSection *section)
>  {
>      hwaddr start_addr = section->offset_within_address_space;
>      uint16_t section_index = phys_section_add(section);
> -    uint64_t num_pages = section->size >> TARGET_PAGE_BITS;
> +    uint64_t num_pages = int128_get64(int128_rshift(section->size,
> +                                                    TARGET_PAGE_BITS));
>  
>      assert(num_pages);
>      phys_page_set(d, start_addr >> TARGET_PAGE_BITS, num_pages, section_index);
> @@ -835,28 +837,29 @@ static void mem_add(MemoryListener *listener, MemoryRegionSection *section)
>  {
>      AddressSpaceDispatch *d = container_of(listener, AddressSpaceDispatch, listener);
>      MemoryRegionSection now = *section, remain = *section;
> +    Int128 page_size = int128_make64(TARGET_PAGE_SIZE);
>  
>      if (now.offset_within_address_space & ~TARGET_PAGE_MASK) {
>          uint64_t left = TARGET_PAGE_ALIGN(now.offset_within_address_space)
>                         - now.offset_within_address_space;
>  
> -        now.size = MIN(left, now.size);
> +        now.size = int128_min(int128_make64(left), now.size);
>          register_subpage(d, &now);
>      } else {
> -        now.size = 0;
> +        now.size = int128_zero();
>      }
> -    while (remain.size != now.size) {
> -        remain.size -= now.size;
> -        remain.offset_within_address_space += now.size;
> -        remain.offset_within_region += now.size;
> +    while (int128_ne(remain.size, now.size)) {
> +        remain.size = int128_sub(remain.size, now.size);
> +        remain.offset_within_address_space += int128_get64(now.size);
> +        remain.offset_within_region += int128_get64(now.size);
>          now = remain;
> -        if (remain.size < TARGET_PAGE_SIZE) {
> +        if (int128_lt(remain.size, page_size)) {
>              register_subpage(d, &now);
>          } else if (remain.offset_within_region & ~TARGET_PAGE_MASK) {
> -            now.size = TARGET_PAGE_SIZE;
> +            now.size = page_size;
>              register_subpage(d, &now);
>          } else {
> -            now.size &= -TARGET_PAGE_SIZE;
> +            now.size = int128_and(now.size, int128_neg(page_size));
>              register_multipage(d, &now);
>          }
>      }
> @@ -1666,7 +1669,7 @@ static uint16_t dummy_section(MemoryRegion *mr)
>          .mr = mr,
>          .offset_within_address_space = 0,
>          .offset_within_region = 0,
> -        .size = UINT64_MAX,
> +        .size = int128_2_64(),
>      };
>  
>      return phys_section_add(&section);
> @@ -1735,14 +1738,16 @@ static void io_region_add(MemoryListener *listener,
>      mrio->mr = section->mr;
>      mrio->offset = section->offset_within_region;
>      iorange_init(&mrio->iorange, &memory_region_iorange_ops,
> -                 section->offset_within_address_space, section->size);
> +                 section->offset_within_address_space,
> +                 int128_get64(section->size));
>      ioport_register(&mrio->iorange);
>  }
>  
>  static void io_region_del(MemoryListener *listener,
>                            MemoryRegionSection *section)
>  {
> -    isa_unassign_ioport(section->offset_within_address_space, section->size);
> +    isa_unassign_ioport(section->offset_within_address_space,
> +                        int128_get64(section->size));
>  }
>  
>  static MemoryListener core_memory_listener = {
> diff --git a/hw/core/loader.c b/hw/core/loader.c
> index 7507914..3a60cbe 100644
> --- a/hw/core/loader.c
> +++ b/hw/core/loader.c
> @@ -726,7 +726,7 @@ int rom_load_all(void)
>          addr  = rom->addr;
>          addr += rom->romsize;
>          section = memory_region_find(get_system_memory(), rom->addr, 1);
> -        rom->isrom = section.size && memory_region_is_rom(section.mr);
> +        rom->isrom = int128_nz(section.size) && memory_region_is_rom(section.mr);
>      }
>      qemu_register_reset(rom_reset, NULL);
>      roms_loaded = 1;
> diff --git a/hw/display/exynos4210_fimd.c b/hw/display/exynos4210_fimd.c
> index 6cb5016..0da00a9 100644
> --- a/hw/display/exynos4210_fimd.c
> +++ b/hw/display/exynos4210_fimd.c
> @@ -1133,7 +1133,7 @@ static void fimd_update_memory_section(Exynos4210fimdState *s, unsigned win)
>      DPRINT_TRACE("Window %u framebuffer changed: address=0x%08x, len=0x%x\n",
>              win, fb_start_addr, w->fb_len);
>  
> -    if (w->mem_section.size != w->fb_len ||
> +    if (int128_get64(w->mem_section.size) != w->fb_len ||
>              !memory_region_is_ram(w->mem_section.mr)) {
>          DPRINT_ERROR("Failed to find window %u framebuffer region\n", win);
>          goto error_return;
> @@ -1155,7 +1155,7 @@ static void fimd_update_memory_section(Exynos4210fimdState *s, unsigned win)
>  
>  error_return:
>      w->mem_section.mr = NULL;
> -    w->mem_section.size = 0;
> +    w->mem_section.size = int128_zero();
>      w->host_fb_addr = NULL;
>      w->fb_len = 0;
>  }
> diff --git a/hw/display/framebuffer.c b/hw/display/framebuffer.c
> index 6be31db..49c9e59 100644
> --- a/hw/display/framebuffer.c
> +++ b/hw/display/framebuffer.c
> @@ -54,7 +54,8 @@ void framebuffer_update_display(
>      src_len = src_width * rows;
>  
>      mem_section = memory_region_find(address_space, base, src_len);
> -    if (mem_section.size != src_len || !memory_region_is_ram(mem_section.mr)) {
> +    if (int128_get64(mem_section.size) != src_len ||
> +            !memory_region_is_ram(mem_section.mr)) {
>          return;
>      }
>      mem = mem_section.mr;
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index 693a9ff..c89676b 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -1953,7 +1953,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
>      }
>  
>      iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> -    end = (section->offset_within_address_space + section->size) &
> +    end = (section->offset_within_address_space + int128_get64(section->size)) &
>            TARGET_PAGE_MASK;



Another problem with this patch. Here is some more context (***):

===
     iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
-    end = (section->offset_within_address_space + section->size) &
+    end = (section->offset_within_address_space +
int128_get64(section->size)) &
           TARGET_PAGE_MASK;

     if (iova >= end) {
         return;
     }

     vaddr = memory_region_get_ram_ptr(section->mr) +
             section->offset_within_region +
             (iova - section->offset_within_address_space);

     DPRINTF("region_add %"HWADDR_PRIx" - %"HWADDR_PRIx" [%p]\n",
             iova, end - 1, vaddr);

     ret = vfio_dma_map(container, iova, end - iova, vaddr, section->readonly);

===

What happens:

1. "spapr: use memory core for iommu support"  patch calls
memory_region_init_iommu() with size=UINT64_MAX.

2. "memory: use 128-bit integers for sizes and intermediates" patch fixes
such values to UINT64_MAX+1:

void memory_region_init(MemoryRegion *mr,
                         const char *name,
                         uint64_t size)
 {
     mr->ops = NULL;
     mr->parent = NULL;
-    mr->size = size;
+    mr->size = int128_make64(size);
+    if (size == UINT64_MAX) {
+        mr->size = int128_2_64();
+    }

3. (***) patch calls int128_get64() which fails in assert.


At the moment I fixed it by calling  memory_region_init_iommu(...
UINT64_MAX >> 1) + 1) and it makes me happy (or it can be INT64_MAX+1) but
I am not sure it is canonically right :)

What would be the right fix?


-- 
Alexey

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-06-06  8:36   ` Alexey Kardashevskiy
@ 2013-06-07  1:09     ` Paolo Bonzini
  2013-06-07  1:23       ` Alexey Kardashevskiy
  0 siblings, 1 reply; 46+ messages in thread
From: Paolo Bonzini @ 2013-06-07  1:09 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alex Williamson, qemu-devel, David Gibson

Il 06/06/2013 04:36, Alexey Kardashevskiy ha scritto:
>> > diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
>> > index 693a9ff..c89676b 100644
>> > --- a/hw/misc/vfio.c
>> > +++ b/hw/misc/vfio.c
>> > @@ -1953,7 +1953,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
>> >      }
>> >  
>> >      iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
>> > -    end = (section->offset_within_address_space + section->size) &
>> > +    end = (section->offset_within_address_space + int128_get64(section->size)) &
>> >            TARGET_PAGE_MASK;
> 
> 
> Another problem with this patch. Here is some more context (***):

By the time you get here, this should have already crashed at this
code that patch 13 adds:

diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
index c89676b..52fb036 100644
--- a/hw/misc/vfio.c
+++ b/hw/misc/vfio.c
@@ -1939,6 +1939,8 @@ static void vfio_listener_region_add(MemoryListener *listener,
     void *vaddr;
     int ret;
 
+    assert(!memory_region_is_iommu(section->mr));
+

so it seems like a bug in your VFIO patches.

Paolo

^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-06-07  1:09     ` Paolo Bonzini
@ 2013-06-07  1:23       ` Alexey Kardashevskiy
  2013-06-07  2:39         ` Paolo Bonzini
  0 siblings, 1 reply; 46+ messages in thread
From: Alexey Kardashevskiy @ 2013-06-07  1:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Alex Williamson, qemu-devel, David Gibson

On 06/07/2013 11:09 AM, Paolo Bonzini wrote:
> Il 06/06/2013 04:36, Alexey Kardashevskiy ha scritto:
>>>> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
>>>> index 693a9ff..c89676b 100644
>>>> --- a/hw/misc/vfio.c
>>>> +++ b/hw/misc/vfio.c
>>>> @@ -1953,7 +1953,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>>>      }
>>>>  
>>>>      iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
>>>> -    end = (section->offset_within_address_space + section->size) &
>>>> +    end = (section->offset_within_address_space + int128_get64(section->size)) &
>>>>            TARGET_PAGE_MASK;
>>
>>
>> Another problem with this patch. Here is some more context (***):
> 
> By the time you get here, this should have already crashed at this
> code that patch 13 adds:
> 
> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
> index c89676b..52fb036 100644
> --- a/hw/misc/vfio.c
> +++ b/hw/misc/vfio.c
> @@ -1939,6 +1939,8 @@ static void vfio_listener_region_add(MemoryListener *listener,
>      void *vaddr;
>      int ret;
>  
> +    assert(!memory_region_is_iommu(section->mr));
> +
> 
> so it seems like a bug in your VFIO patches.


No, I have David's patches which fix all of this, I'm planning to post them
soon.

My question is rather what is the whole point of calling
memory_region_init_iommu with size==UINT64_MAX?

mem_add() tries to do register_subpage() when size is not aligned
(UINT64_MAX is not) and fails.

So if we want to init memory region with the size as big as possible on
64bit systems, we either have to replace all 64bit sizes with 64bit end
addresses (and then use 0-ffffffffffffffff) or support int128 sizes
everywhere (even if it is just hi=1, lo=0) or stop initializing memory
regions with sizes way bigger than we really need in next 5 years.

Am I missing a bigger picture?


-- 
Alexey

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer
  2013-06-07  1:23       ` Alexey Kardashevskiy
@ 2013-06-07  2:39         ` Paolo Bonzini
  0 siblings, 0 replies; 46+ messages in thread
From: Paolo Bonzini @ 2013-06-07  2:39 UTC (permalink / raw)
  To: Alexey Kardashevskiy; +Cc: Alex Williamson, qemu-devel, David Gibson

Il 06/06/2013 21:23, Alexey Kardashevskiy ha scritto:
> On 06/07/2013 11:09 AM, Paolo Bonzini wrote:
>> Il 06/06/2013 04:36, Alexey Kardashevskiy ha scritto:
>>>>> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
>>>>> index 693a9ff..c89676b 100644
>>>>> --- a/hw/misc/vfio.c
>>>>> +++ b/hw/misc/vfio.c
>>>>> @@ -1953,7 +1953,7 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>>>>      }
>>>>>  
>>>>>      iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
>>>>> -    end = (section->offset_within_address_space + section->size) &
>>>>> +    end = (section->offset_within_address_space + int128_get64(section->size)) &
>>>>>            TARGET_PAGE_MASK;
>>>
>>>
>>> Another problem with this patch. Here is some more context (***):
>>
>> By the time you get here, this should have already crashed at this
>> code that patch 13 adds:
>>
>> diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
>> index c89676b..52fb036 100644
>> --- a/hw/misc/vfio.c
>> +++ b/hw/misc/vfio.c
>> @@ -1939,6 +1939,8 @@ static void vfio_listener_region_add(MemoryListener *listener,
>>      void *vaddr;
>>      int ret;
>>  
>> +    assert(!memory_region_is_iommu(section->mr));
>> +
>>
>> so it seems like a bug in your VFIO patches.
> 
> 
> No, I have David's patches which fix all of this, I'm planning to post them
> soon.

Then you have to change this code to use int128 throughout.

> My question is rather what is the whole point of calling
> memory_region_init_iommu with size==UINT64_MAX?
> 
> mem_add() tries to do register_subpage() when size is not aligned
> (UINT64_MAX is not) and fails.

UINT64_MAX represent a 2^64-byte region, not 2^64-1.  mem_add does not
see an unaligned size.

> So if we want to init memory region with the size as big as possible on
> 64bit systems, we either have to replace all 64bit sizes with 64bit end
> addresses (and then use 0-ffffffffffffffff) or support int128 sizes
> everywhere (even if it is just hi=1, lo=0) or stop initializing memory
> regions with sizes way bigger than we really need in next 5 years.

The previous version of the patches limited the address space to 62
bits, but s390 guys preferred to have 2^64.

Paolo

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2013-06-07  2:39 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-30 21:16 [Qemu-devel] [PATCH 00/21] Memory/IOMMU patches, part 3: IOMMU implementation Paolo Bonzini
2013-05-30 21:16 ` [Qemu-devel] [PATCH 01/21] memory: Introduce address_space_lookup_region Paolo Bonzini
2013-05-31 21:56   ` Richard Henderson
2013-05-30 21:16 ` [Qemu-devel] [PATCH 02/21] memory: move private types to exec.c Paolo Bonzini
2013-05-31 21:57   ` Richard Henderson
2013-05-30 21:16 ` [Qemu-devel] [PATCH 03/21] exec: Allow unaligned address_space_rw Paolo Bonzini
2013-05-31 21:57   ` Richard Henderson
2013-05-30 21:16 ` [Qemu-devel] [PATCH 04/21] exec: Resolve subpages in one step except for IOTLB fills Paolo Bonzini
2013-05-31 21:58   ` Richard Henderson
2013-05-30 21:16 ` [Qemu-devel] [PATCH 05/21] exec: Implement subpage_read/write via address_space_rw Paolo Bonzini
2013-05-31 21:59   ` Richard Henderson
2013-05-30 21:16 ` [Qemu-devel] [PATCH 06/21] exec: return MemoryRegion from address_space_translate Paolo Bonzini
2013-05-31 22:00   ` Richard Henderson
2013-05-30 21:16 ` [Qemu-devel] [PATCH 07/21] Revert "memory: limit sections in the radix tree to the actual address space size" Paolo Bonzini
2013-05-31 22:04   ` Richard Henderson
2013-06-01  6:29     ` Paolo Bonzini
2013-05-30 21:16 ` [Qemu-devel] [PATCH 08/21] Revert "s390x: reduce TARGET_PHYS_ADDR_SPACE_BITS to 62" Paolo Bonzini
2013-05-30 21:16 ` [Qemu-devel] [PATCH 09/21] exec: reorganize mem_add to match Int128 version Paolo Bonzini
2013-05-31 22:42   ` Richard Henderson
2013-05-30 21:16 ` [Qemu-devel] [PATCH 10/21] memory: make section size a 128-bit integer Paolo Bonzini
2013-05-31  6:56   ` Alexey Kardashevskiy
2013-05-31  7:12     ` Paolo Bonzini
2013-05-31 22:18   ` Richard Henderson
2013-06-02 14:03     ` Paolo Bonzini
2013-06-02 14:18     ` Peter Maydell
2013-06-02 14:36       ` Paolo Bonzini
2013-06-02 14:50         ` Peter Maydell
2013-06-02 19:52           ` Paolo Bonzini
2013-06-06  8:36   ` Alexey Kardashevskiy
2013-06-07  1:09     ` Paolo Bonzini
2013-06-07  1:23       ` Alexey Kardashevskiy
2013-06-07  2:39         ` Paolo Bonzini
2013-05-30 21:16 ` [Qemu-devel] [PATCH 11/21] memory: iommu support Paolo Bonzini
2013-05-31 22:54   ` Richard Henderson
2013-05-30 21:17 ` [Qemu-devel] [PATCH 12/21] memory: Add iommu map/unmap notifiers Paolo Bonzini
2013-05-31 22:56   ` Richard Henderson
2013-05-30 21:17 ` [Qemu-devel] [PATCH 13/21] vfio: abort if an emulated iommu is used Paolo Bonzini
2013-05-31 22:57   ` Richard Henderson
2013-05-30 21:17 ` [Qemu-devel] [PATCH 14/21] spapr: convert TCE API to use an opaque type Paolo Bonzini
2013-05-30 21:17 ` [Qemu-devel] [PATCH 15/21] spapr: make IOMMU translation go through IOMMUTLBEntry Paolo Bonzini
2013-05-30 21:17 ` [Qemu-devel] [PATCH 16/21] spapr: use memory core for iommu support Paolo Bonzini
2013-05-30 21:17 ` [Qemu-devel] [PATCH 17/21] dma: eliminate old-style IOMMU support Paolo Bonzini
2013-05-30 21:17 ` [Qemu-devel] [PATCH 18/21] pci: use memory core for iommu support Paolo Bonzini
2013-05-30 21:17 ` [Qemu-devel] [PATCH 19/21] spapr_vio: take care of creating our own AddressSpace/DMAContext Paolo Bonzini
2013-05-30 21:17 ` [Qemu-devel] [PATCH 20/21] dma: eliminate DMAContext Paolo Bonzini
2013-05-30 21:17 ` [Qemu-devel] [PATCH 21/21] memory: give name to every AddressSpace Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).