[Qemu-devel] [PATCH 1/6] remove smaller slots if registering a bigger one

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [PATCH 1/6] remove smaller slots if registering a bigger one
  2008-12-18 17:01 [Qemu-devel] [PATCH 0/6] bypass tcg memory functions -v2 Glauber Costa
@ 2008-12-18 17:01 ` Glauber Costa
  0 siblings, 0 replies; 18+ messages in thread
From: Glauber Costa @ 2008-12-18 17:01 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ian.Jackson, avi, kvm, stefano.stabellini

It's like a shark eating a bunch of small fishes:
in some situations (vga linear frame buffer mapping,
for example), we need to register a new slot in place
of older, smaller ones. This patch handles this case

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 kvm-all.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 11034df..3c12c37 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -582,6 +582,16 @@ void kvm_set_phys_mem(target_phys_addr_t start_addr,
                 kvm_set_phys_mem(mem_start, mem_size, mem_offset);
 
             return;
+        } else if (start_addr <= mem->start_addr &&
+                   (start_addr + size) >= (mem->start_addr +
+                                           mem->memory_size)) {
+            KVMSlot slot;
+            /* unregister whole slot */
+            memcpy(&slot, mem, sizeof(slot));
+            mem->memory_size = 0;
+            kvm_set_user_memory_region(s, mem);
+
+            kvm_set_phys_mem(start_addr, size, phys_offset);
         } else {
             printf("Registering overlapping slot\n");
             abort();
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
@ 2009-01-20 18:50 Glauber Costa
  2009-01-20 18:51 ` [Qemu-devel] [PATCH 1/6] remove smaller slots if registering a bigger one Glauber Costa
  2009-01-21  5:43 ` [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009 Paul Brook
  0 siblings, 2 replies; 18+ messages in thread
From: Glauber Costa @ 2009-01-20 18:50 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

This series is not very different from the last one that did it.
Just bringing it back from my vacations. Idea is to replace tcg
memory functions with kvm's, selectable at runtime. Right now
we use a conditional selection. In the future, we might use
function pointers to allow for easy coupling of any hypervisor.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 1/6] remove smaller slots if registering a bigger one
  2009-01-20 18:50 [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009 Glauber Costa
@ 2009-01-20 18:51 ` Glauber Costa
  2009-01-20 18:51   ` [Qemu-devel] [PATCH 2/6] re-register whole area upon lfb unmap Glauber Costa
  2009-01-21  5:43 ` [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009 Paul Brook
  1 sibling, 1 reply; 18+ messages in thread
From: Glauber Costa @ 2009-01-20 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

It's like a shark eating a bunch of small fishes:
in some situations (vga linear frame buffer mapping,
for example), we need to register a new slot in place
of older, smaller ones. This patch handles this case

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 kvm-all.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 9fb295c..53aca0a 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -582,6 +582,16 @@ void kvm_set_phys_mem(target_phys_addr_t start_addr,
                 kvm_set_phys_mem(mem_start, mem_size, mem_offset);
 
             return;
+        } else if (start_addr <= mem->start_addr &&
+                   (start_addr + size) >= (mem->start_addr +
+                                           mem->memory_size)) {
+            KVMSlot slot;
+            /* unregister whole slot */
+            memcpy(&slot, mem, sizeof(slot));
+            mem->memory_size = 0;
+            kvm_set_user_memory_region(s, mem);
+
+            kvm_set_phys_mem(start_addr, size, phys_offset);
         } else {
             printf("Registering overlapping slot\n");
             abort();
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 2/6] re-register whole area upon lfb unmap.
  2009-01-20 18:51 ` [Qemu-devel] [PATCH 1/6] remove smaller slots if registering a bigger one Glauber Costa
@ 2009-01-20 18:51   ` Glauber Costa
  2009-01-20 18:51     ` [Qemu-devel] [PATCH 3/6] isolate io handling routine Glauber Costa
  0 siblings, 1 reply; 18+ messages in thread
From: Glauber Costa @ 2009-01-20 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

set phys_offset correctly for the whole vga area when unmapping linear vram
(for vga optimization). We first register the old pieces as unassigned
memory, to make things easier for kvm (and possibly other slot based
implementations in the future). Replacing the region directly would
make the slot management significantly more complex.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 hw/cirrus_vga.c |    7 ++-----
 1 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
index ef939ae..75faafc 100644
--- a/hw/cirrus_vga.c
+++ b/hw/cirrus_vga.c
@@ -2646,11 +2646,8 @@ static void map_linear_vram(CirrusVGAState *s)
         s->lfb_vram_mapped = 1;
         vga_dirty_log_start((VGAState *)s);
     }
-    else {
-        cpu_register_physical_memory(isa_mem_base + 0xa0000, 0x8000, s->vga_io_memory);
-        cpu_register_physical_memory(isa_mem_base + 0xa8000, 0x8000, s->vga_io_memory);
-    }
-
+    else
+        cpu_register_physical_memory(isa_mem_base + 0xa0000, 0x20000, s->vga_io_memory);
 }
 
 static void unmap_linear_vram(CirrusVGAState *s)
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 3/6] isolate io handling routine
  2009-01-20 18:51   ` [Qemu-devel] [PATCH 2/6] re-register whole area upon lfb unmap Glauber Costa
@ 2009-01-20 18:51     ` Glauber Costa
  2009-01-20 18:51       ` [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw Glauber Costa
  0 siblings, 1 reply; 18+ messages in thread
From: Glauber Costa @ 2009-01-20 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

introduce cpu_physical_memory_do_io, which handles
the mmio part of cpu_physical_memory_rw. KVM can use
it to do mmio, since mmio is essentially the same for
both KVM and tcg.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 cpu-all.h |    2 +
 exec.c    |   88 ++++++++++++++++++++++++++++++++++---------------------------
 2 files changed, 51 insertions(+), 39 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index ee0a6e3..0c4432a 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -911,6 +911,8 @@ int cpu_register_io_memory(int io_index,
 CPUWriteMemoryFunc **cpu_get_io_memory_write(int io_index);
 CPUReadMemoryFunc **cpu_get_io_memory_read(int io_index);
 
+int cpu_physical_memory_do_io(target_phys_addr_t addr, uint8_t *buf, int l,
+                            int is_write, unsigned long pd);
 void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                             int len, int is_write);
 static inline void cpu_physical_memory_read(target_phys_addr_t addr,
diff --git a/exec.c b/exec.c
index faa6333..f72fb96 100644
--- a/exec.c
+++ b/exec.c
@@ -2911,12 +2911,57 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
 }
 
 #else
+int cpu_physical_memory_do_io(target_phys_addr_t addr, uint8_t *buf, int l, int is_write, unsigned long pd)
+{
+    int io_index;
+    uint32_t val;
+
+    io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
+    if (is_write) {
+        /* XXX: could force cpu_single_env to NULL to avoid
+           potential bugs */
+        if (l >= 4 && ((addr & 3) == 0)) {
+            /* 32 bit write access */
+            val = ldl_p(buf);
+            io_mem_write[io_index][2](io_mem_opaque[io_index], addr, val);
+            l = 4;
+        } else if (l >= 2 && ((addr & 1) == 0)) {
+            /* 16 bit write access */
+            val = lduw_p(buf);
+            io_mem_write[io_index][1](io_mem_opaque[io_index], addr, val);
+            l = 2;
+        } else {
+            /* 8 bit write access */
+            val = ldub_p(buf);
+            io_mem_write[io_index][0](io_mem_opaque[io_index], addr, val);
+            l = 1;
+        }
+    } else {
+        if (l >= 4 && ((addr & 3) == 0)) {
+            /* 32 bit read access */
+            val = io_mem_read[io_index][2](io_mem_opaque[io_index], addr);
+            stl_p(buf, val);
+            l = 4;
+        } else if (l >= 2 && ((addr & 1) == 0)) {
+            /* 16 bit read access */
+            val = io_mem_read[io_index][1](io_mem_opaque[io_index], addr);
+            stw_p(buf, val);
+            l = 2;
+        } else {
+            /* 8 bit read access */
+            val = io_mem_read[io_index][0](io_mem_opaque[io_index], addr);
+            stb_p(buf, val);
+            l = 1;
+        }
+    }
+    return l;
+}
+
 void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                             int len, int is_write)
 {
-    int l, io_index;
+    int l;
     uint8_t *ptr;
-    uint32_t val;
     target_phys_addr_t page;
     unsigned long pd;
     PhysPageDesc *p;
@@ -2935,27 +2980,9 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
 
         if (is_write) {
             if ((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM) {
-                io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
                 if (p)
                     addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset;
-                /* XXX: could force cpu_single_env to NULL to avoid
-                   potential bugs */
-                if (l >= 4 && ((addr & 3) == 0)) {
-                    /* 32 bit write access */
-                    val = ldl_p(buf);
-                    io_mem_write[io_index][2](io_mem_opaque[io_index], addr, val);
-                    l = 4;
-                } else if (l >= 2 && ((addr & 1) == 0)) {
-                    /* 16 bit write access */
-                    val = lduw_p(buf);
-                    io_mem_write[io_index][1](io_mem_opaque[io_index], addr, val);
-                    l = 2;
-                } else {
-                    /* 8 bit write access */
-                    val = ldub_p(buf);
-                    io_mem_write[io_index][0](io_mem_opaque[io_index], addr, val);
-                    l = 1;
-                }
+                l = cpu_physical_memory_do_io(addr, buf, len, is_write, pd);
             } else {
                 unsigned long addr1;
                 addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK);
@@ -2973,26 +3000,9 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
         } else {
             if ((pd & ~TARGET_PAGE_MASK) > IO_MEM_ROM &&
                 !(pd & IO_MEM_ROMD)) {
-                /* I/O case */
-                io_index = (pd >> IO_MEM_SHIFT) & (IO_MEM_NB_ENTRIES - 1);
                 if (p)
                     addr = (addr & ~TARGET_PAGE_MASK) + p->region_offset;
-                if (l >= 4 && ((addr & 3) == 0)) {
-                    /* 32 bit read access */
-                    val = io_mem_read[io_index][2](io_mem_opaque[io_index], addr);
-                    stl_p(buf, val);
-                    l = 4;
-                } else if (l >= 2 && ((addr & 1) == 0)) {
-                    /* 16 bit read access */
-                    val = io_mem_read[io_index][1](io_mem_opaque[io_index], addr);
-                    stw_p(buf, val);
-                    l = 2;
-                } else {
-                    /* 8 bit read access */
-                    val = io_mem_read[io_index][0](io_mem_opaque[io_index], addr);
-                    stb_p(buf, val);
-                    l = 1;
-                }
+                l = cpu_physical_memory_do_io(addr, buf, len, is_write, pd);
             } else {
                 /* RAM case */
                 ptr = phys_ram_base + (pd & TARGET_PAGE_MASK) +
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw
  2009-01-20 18:51     ` [Qemu-devel] [PATCH 3/6] isolate io handling routine Glauber Costa
@ 2009-01-20 18:51       ` Glauber Costa
  2009-01-20 18:51         ` [Qemu-devel] [PATCH 5/6] hook cpu_register_physical_mem Glauber Costa
  2009-01-20 20:24         ` [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw Avi Kivity
  0 siblings, 2 replies; 18+ messages in thread
From: Glauber Costa @ 2009-01-20 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

This patch introduces a kvm version of cpu_physical_memory_rw.
The main motivation is to bypass tcg version, which contains
tcg-specific code, as well as data structures not used by kvm,
such as l1_phys_map.

In this patch, I'm using a runtime selection of which function
to call, but the mid-term goal is to use function pointers in
a way very close to which QEMUAccel used to be.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 exec.c    |   13 +++++++++++--
 kvm-all.c |   44 ++++++++++++++++++++++++++++++++++++++++----
 kvm.h     |    2 ++
 3 files changed, 53 insertions(+), 6 deletions(-)

diff --git a/exec.c b/exec.c
index f72fb96..20ac972 100644
--- a/exec.c
+++ b/exec.c
@@ -2957,8 +2957,8 @@ int cpu_physical_memory_do_io(target_phys_addr_t addr, uint8_t *buf, int l, int
     return l;
 }
 
-void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
-                            int len, int is_write)
+static void default_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
+                                   int len, int is_write)
 {
     int l;
     uint8_t *ptr;
@@ -3016,6 +3016,15 @@ void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
     }
 }
 
+void cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
+                            int len, int is_write)
+{
+    if (kvm_enabled())
+        kvm_cpu_physical_memory_rw(addr, buf, len, is_write);
+    else
+        default_physical_memory_rw(addr, buf, len, is_write);
+}
+
 /* used for ROM loading : can write in RAM and ROM */
 void cpu_physical_memory_write_rom(target_phys_addr_t addr,
                                    const uint8_t *buf, int len)
diff --git a/kvm-all.c b/kvm-all.c
index 53aca0a..9cbb2f9 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -540,7 +540,7 @@ void kvm_set_phys_mem(target_phys_addr_t start_addr,
 
     mem = kvm_lookup_slot(s, start_addr);
     if (mem) {
-        if ((flags == IO_MEM_UNASSIGNED) || (flags >= TLB_MMIO)) {
+        if ((flags == IO_MEM_UNASSIGNED))  {
             mem->memory_size = 0;
             mem->start_addr = start_addr;
             mem->phys_offset = 0;
@@ -559,6 +559,8 @@ void kvm_set_phys_mem(target_phys_addr_t start_addr,
                 mem->phys_offset)
                 return;
 
+            if ((phys_offset & ~TARGET_PAGE_MASK) != 0)
+                return;
             /* unregister whole slot */
             memcpy(&slot, mem, sizeof(slot));
             mem->memory_size = 0;
@@ -598,16 +600,21 @@ void kvm_set_phys_mem(target_phys_addr_t start_addr,
         }
     }
     /* KVM does not need to know about this memory */
-    if (flags >= IO_MEM_UNASSIGNED)
+    if (flags == IO_MEM_UNASSIGNED)
         return;
 
-    mem = kvm_alloc_slot(s);
+    if (!mem)
+        mem = kvm_alloc_slot(s);
     mem->memory_size = size;
     mem->start_addr = start_addr;
     mem->phys_offset = phys_offset;
     mem->flags = 0;
 
-    kvm_set_user_memory_region(s, mem);
+    if (flags < TLB_MMIO)
+        kvm_set_user_memory_region(s, mem);
+    else{
+        mem->phys_offset = flags;
+    }
     /* FIXME deal with errors */
 }
 
@@ -673,3 +680,32 @@ int kvm_has_sync_mmu(void)
 
     return 0;
 }
+
+void kvm_cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
+                                int len, int is_write)
+{
+    KVMSlot *mem;
+    KVMState *s = kvm_state;
+    int l;
+
+    mem = kvm_lookup_slot(s, addr);
+    if (!mem)
+        return;
+
+    if ((mem->phys_offset & ~TARGET_PAGE_MASK) >= TLB_MMIO) {
+        l = 0;
+        while (len > l)
+            l += cpu_physical_memory_do_io(addr + l, buf + l, len - l, is_write, mem->phys_offset);
+    } else {
+        uint8_t *uaddr = phys_ram_base + mem->phys_offset + (addr - mem->start_addr);
+        l = len;
+        if (addr + len > mem->start_addr + mem->memory_size)
+            l = mem->memory_size - addr;
+        if (!is_write)
+            memcpy(buf, uaddr, l);
+        else
+            memcpy(uaddr, buf, l);
+        if (l != len)
+            kvm_cpu_physical_memory_rw(addr + l, buf + l, len - l, is_write);
+    }
+}
diff --git a/kvm.h b/kvm.h
index efce145..e3e9ca0 100644
--- a/kvm.h
+++ b/kvm.h
@@ -49,6 +49,8 @@ int kvm_has_sync_mmu(void);
 int kvm_coalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 
+void kvm_cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
+                                int len, int is_write);
 /* internal API */
 
 struct KVMState;
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 5/6] hook cpu_register_physical_mem
  2009-01-20 18:51       ` [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw Glauber Costa
@ 2009-01-20 18:51         ` Glauber Costa
  2009-01-20 18:51           ` [Qemu-devel] [PATCH 6/6] cache slot lookup Glauber Costa
  2009-01-20 20:24         ` [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw Avi Kivity
  1 sibling, 1 reply; 18+ messages in thread
From: Glauber Costa @ 2009-01-20 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

Since now we have our own memory read/write function, we don't
depend on all of tcg data structures anymore. So, instead of filling
them up, bypass it altogether by using kvm_set_phys mem alone.

To do that, we now have to provide our own way to get page
information given the address. (kvm_get_physical_page_desc)

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 exec.c    |   44 +++++++++++++++++++++++++++++---------------
 kvm-all.c |   13 +++++++++++++
 kvm.h     |    2 ++
 3 files changed, 44 insertions(+), 15 deletions(-)

diff --git a/exec.c b/exec.c
index 20ac972..f009e41 100644
--- a/exec.c
+++ b/exec.c
@@ -2260,14 +2260,8 @@ static void *subpage_init (target_phys_addr_t base, ram_addr_t *phys,
         }                                                               \
     } while (0)
 
-/* register physical memory. 'size' must be a multiple of the target
-   page size. If (phys_offset & ~TARGET_PAGE_MASK) != 0, then it is an
-   io memory page.  The address used when calling the IO function is
-   the offset from the start of the region, plus region_offset.  Both
-   start_region and regon_offset are rounded down to a page boundary
-   before calculating this offset.  This should not be a problem unless
-   the low bits of start_addr and region_offset differ.  */
-void cpu_register_physical_memory_offset(target_phys_addr_t start_addr,
+
+static void default_register_physical_memory_offset(target_phys_addr_t start_addr,
                                          ram_addr_t size,
                                          ram_addr_t phys_offset,
                                          ram_addr_t region_offset)
@@ -2285,8 +2279,6 @@ void cpu_register_physical_memory_offset(target_phys_addr_t start_addr,
         kqemu_set_phys_mem(start_addr, size, phys_offset);
     }
 #endif
-    if (kvm_enabled())
-        kvm_set_phys_mem(start_addr, size, phys_offset);
 
     region_offset &= TARGET_PAGE_MASK;
     size = (size + TARGET_PAGE_SIZE - 1) & TARGET_PAGE_MASK;
@@ -2353,15 +2345,37 @@ void cpu_register_physical_memory_offset(target_phys_addr_t start_addr,
     }
 }
 
+/* register physical memory. 'size' must be a multiple of the target
+   page size. If (phys_offset & ~TARGET_PAGE_MASK) != 0, then it is an
+   io memory page.  The address used when calling the IO function is
+   the offset from the start of the region, plus region_offset.  Both
+   start_region and regon_offset are rounded down to a page boundary
+   before calculating this offset.  This should not be a problem unless
+   the low bits of start_addr and region_offset differ.  */
+void cpu_register_physical_memory_offset(target_phys_addr_t start_addr,
+                                         ram_addr_t size,
+                                         ram_addr_t phys_offset,
+                                         ram_addr_t region_offset)
+{
+    if (kvm_enabled())
+        kvm_set_phys_mem(start_addr, size, phys_offset);
+    else
+        default_register_physical_memory_offset(start_addr, size, phys_offset, region_offset);
+}
+
 /* XXX: temporary until new memory mapping API */
 ram_addr_t cpu_get_physical_page_desc(target_phys_addr_t addr)
 {
-    PhysPageDesc *p;
+    if (kvm_enabled()) {
+        return kvm_get_physical_page_desc(addr);
+    } else {
+        PhysPageDesc *p;
 
-    p = phys_page_find(addr >> TARGET_PAGE_BITS);
-    if (!p)
-        return IO_MEM_UNASSIGNED;
-    return p->phys_offset;
+        p = phys_page_find(addr >> TARGET_PAGE_BITS);
+        if (!p)
+            return IO_MEM_UNASSIGNED;
+        return p->phys_offset;
+    }
 }
 
 void qemu_register_coalesced_mmio(target_phys_addr_t addr, ram_addr_t size)
diff --git a/kvm-all.c b/kvm-all.c
index 9cbb2f9..85f2922 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -709,3 +709,16 @@ void kvm_cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
             kvm_cpu_physical_memory_rw(addr + l, buf + l, len - l, is_write);
     }
 }
+
+ram_addr_t kvm_get_physical_page_desc(target_phys_addr_t addr)
+{
+
+    KVMSlot *mem;
+    KVMState *s = kvm_state;
+    mem = kvm_lookup_slot(s, addr);
+
+    if (!mem)
+        return IO_MEM_UNASSIGNED;
+    else
+        return (addr - mem->start_addr) + mem->phys_offset;
+}
diff --git a/kvm.h b/kvm.h
index e3e9ca0..776cfcf 100644
--- a/kvm.h
+++ b/kvm.h
@@ -51,6 +51,8 @@ int kvm_uncoalesce_mmio_region(target_phys_addr_t start, ram_addr_t size);
 
 void kvm_cpu_physical_memory_rw(target_phys_addr_t addr, uint8_t *buf,
                                 int len, int is_write);
+
+ram_addr_t kvm_get_physical_page_desc(target_phys_addr_t addr);
 /* internal API */
 
 struct KVMState;
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [Qemu-devel] [PATCH 6/6] cache slot lookup
  2009-01-20 18:51         ` [Qemu-devel] [PATCH 5/6] hook cpu_register_physical_mem Glauber Costa
@ 2009-01-20 18:51           ` Glauber Costa
  0 siblings, 0 replies; 18+ messages in thread
From: Glauber Costa @ 2009-01-20 18:51 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

record slot used in last lookup. For the common mmio case,
we'll usually access the same memory slot repeatedly.

Signed-off-by: Glauber Costa <glommer@redhat.com>
---
 kvm-all.c |   11 ++++++++++-
 1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/kvm-all.c b/kvm-all.c
index 85f2922..6a5a98c 100644
--- a/kvm-all.c
+++ b/kvm-all.c
@@ -75,16 +75,25 @@ static KVMSlot *kvm_alloc_slot(KVMState *s)
     return NULL;
 }
 
+static KVMSlot *last_slot = NULL;
+
 static KVMSlot *kvm_lookup_slot(KVMState *s, target_phys_addr_t start_addr)
 {
     int i;
 
+
+    if (last_slot && (start_addr >= last_slot->start_addr &&
+            start_addr < (last_slot->start_addr + last_slot->memory_size)))
+        return last_slot;
+
     for (i = 0; i < ARRAY_SIZE(s->slots); i++) {
         KVMSlot *mem = &s->slots[i];
 
         if (start_addr >= mem->start_addr &&
-            start_addr < (mem->start_addr + mem->memory_size))
+            start_addr < (mem->start_addr + mem->memory_size)) {
+            last_slot = mem;
             return mem;
+        }
     }
 
     return NULL;
-- 
1.5.6.5

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw
  2009-01-20 18:51       ` [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw Glauber Costa
  2009-01-20 18:51         ` [Qemu-devel] [PATCH 5/6] hook cpu_register_physical_mem Glauber Costa
@ 2009-01-20 20:24         ` Avi Kivity
  2009-01-20 20:33           ` Glauber Costa
  1 sibling, 1 reply; 18+ messages in thread
From: Avi Kivity @ 2009-01-20 20:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

Glauber Costa wrote:
> This patch introduces a kvm version of cpu_physical_memory_rw.
> The main motivation is to bypass tcg version, which contains
> tcg-specific code, as well as data structures not used by kvm,
> such as l1_phys_map.
>
> In this patch, I'm using a runtime selection of which function
> to call, but the mid-term goal is to use function pointers in
> a way very close to which QEMUAccel used to be.
>   

You will need to trivially adjust this for cpu_physical_memory_map().

You're still allocating the page descriptors, right?

My feeling is that qemu could long term drop phys_ram_base and move to a 
slot based structure; most lookups will hit on the first access (the 
largest slot).  The main motivation is to enable memory hotplug.  Of 
course this shouldn't interfere with this patchset going in.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw
  2009-01-20 20:24         ` [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw Avi Kivity
@ 2009-01-20 20:33           ` Glauber Costa
  0 siblings, 0 replies; 18+ messages in thread
From: Glauber Costa @ 2009-01-20 20:33 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori

On Tue, Jan 20, 2009 at 6:24 PM, Avi Kivity <avi@redhat.com> wrote:
> Glauber Costa wrote:
>>
>> This patch introduces a kvm version of cpu_physical_memory_rw.
>> The main motivation is to bypass tcg version, which contains
>> tcg-specific code, as well as data structures not used by kvm,
>> such as l1_phys_map.
>>
>> In this patch, I'm using a runtime selection of which function
>> to call, but the mid-term goal is to use function pointers in
>> a way very close to which QEMUAccel used to be.
>>
>
> You will need to trivially adjust this for cpu_physical_memory_map().
Was it merged yet?

>
> You're still allocating the page descriptors, right?
By page descriptors you mean the l1_phys_map? If so, yes, because of
the way cpu_exec works. We can
live without them right now. With other patches I have queued up, we
get closer to this goal.

>
> My feeling is that qemu could long term drop phys_ram_base and move to a
> slot based structure; most lookups will hit on the first access (the largest
> slot).  The main motivation is to enable memory hotplug.  Of course this
> shouldn't interfere with this patchset going in.
>

-- 
Glauber  Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
  2009-01-20 18:50 [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009 Glauber Costa
  2009-01-20 18:51 ` [Qemu-devel] [PATCH 1/6] remove smaller slots if registering a bigger one Glauber Costa
@ 2009-01-21  5:43 ` Paul Brook
  2009-01-21 11:46   ` Glauber Costa
  2009-01-21 19:26   ` Anthony Liguori
  1 sibling, 2 replies; 18+ messages in thread
From: Paul Brook @ 2009-01-21  5:43 UTC (permalink / raw)
  To: qemu-devel; +Cc: Glauber Costa, aliguori

On Tuesday 20 January 2009, Glauber Costa wrote:
> This series is not very different from the last one that did it.
> Just bringing it back from my vacations. Idea is to replace tcg
> memory functions with kvm's, selectable at runtime. Right now
> we use a conditional selection. In the future, we might use
> function pointers to allow for easy coupling of any hypervisor.

I dislike that you're introducing two different ways or doing the same thing.  
Duplicating all the memory region tracking code seems like a bad way to solve 
this problem.

Paul

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
  2009-01-21  5:43 ` [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009 Paul Brook
@ 2009-01-21 11:46   ` Glauber Costa
  2009-01-21 17:16     ` Ian Jackson
  2009-01-21 19:26   ` Anthony Liguori
  1 sibling, 1 reply; 18+ messages in thread
From: Glauber Costa @ 2009-01-21 11:46 UTC (permalink / raw)
  To: Paul Brook; +Cc: aliguori, qemu-devel

On Wed, Jan 21, 2009 at 05:43:06AM +0000, Paul Brook wrote:
> On Tuesday 20 January 2009, Glauber Costa wrote:
> > This series is not very different from the last one that did it.
> > Just bringing it back from my vacations. Idea is to replace tcg
> > memory functions with kvm's, selectable at runtime. Right now
> > we use a conditional selection. In the future, we might use
> > function pointers to allow for easy coupling of any hypervisor.
> 
> I dislike that you're introducing two different ways or doing the same thing.  
> Duplicating all the memory region tracking code seems like a bad way to solve 
> this problem.
That's because I think I'm introducing two different ways of doing two different
things that just happens to fall under the same name of "memory tracking".

tcg has a lot of stuff that hypervisors won't need, such as keeping track of a tlb,
invalidating memory for the code generator, etc. It is all heavily coupled, and
attempts to make it less like it, just led to a bigger number of hooks, most of them
which were empty hooks. (you might remember the old QEMUAccel approach we took).

Before this patchset, memory registration for KVM is:
   * kvm_set_phys_mem
   * tcg stuff.

After this patchset, it is:
   * kvm_set_phys_mem
   * end of story.

So right now, tcg memory handling is pure overhead for us. Keeping things
separated gives opportunity for better specific optimization when possible,
not to mention the fact that the possibility that we break tcg while inoccently
hacking KVM drops significantly in this area.

In all my laziness I agree that we should not be duplicating things. Hence why,
for example, I tried to commonize the I/O functions: which are the same. (and I
see no benefit in changing the way KVM keeps track of I/O regions in the near
future)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
  2009-01-21 11:46   ` Glauber Costa
@ 2009-01-21 17:16     ` Ian Jackson
  2009-01-21 18:40       ` Glauber Costa
  2009-01-22  0:11       ` Paul Brook
  0 siblings, 2 replies; 18+ messages in thread
From: Ian Jackson @ 2009-01-21 17:16 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori, Paul Brook

Glauber Costa writes ("Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009"):
> On Wed, Jan 21, 2009 at 05:43:06AM +0000, Paul Brook wrote:
> > I dislike that you're introducing two different ways or doing the
> > same thing.  Duplicating all the memory region tracking code seems
> > like a bad way to solve this problem.
>
> That's because I think I'm introducing two different ways of doing
> two different things that just happens to fall under the same name
> of "memory tracking".

For what it's worth I can definitely see where Glauber is coming from.
The Xen tree also has a replacement implementations of things like
cpu_register_io_memory, for very similar reasons.

> In all my laziness I agree that we should not be duplicating
> things. Hence why, for example, I tried to commonize the I/O
> functions: which are the same. (and I see no benefit in changing the
> way KVM keeps track of I/O regions in the near future)

I think the KVM and Xen approaches are probably similar enough that we
can share this code.  I'll have to look at it in detail at some point,
which I don't have time to do right now or necessarily even soon.

But in the meantime I support in principle the approach that KVM are
taking here.

Ian.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
  2009-01-21 17:16     ` Ian Jackson
@ 2009-01-21 18:40       ` Glauber Costa
  2009-01-22  0:11       ` Paul Brook
  1 sibling, 0 replies; 18+ messages in thread
From: Glauber Costa @ 2009-01-21 18:40 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori, Paul Brook

>
> For what it's worth I can definitely see where Glauber is coming from.
I'm coming from Brazil. How did you figure it out? ;-)

-- 
Glauber  Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
  2009-01-21  5:43 ` [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009 Paul Brook
  2009-01-21 11:46   ` Glauber Costa
@ 2009-01-21 19:26   ` Anthony Liguori
  1 sibling, 0 replies; 18+ messages in thread
From: Anthony Liguori @ 2009-01-21 19:26 UTC (permalink / raw)
  To: Paul Brook; +Cc: Glauber Costa, qemu-devel

Paul Brook wrote:
> On Tuesday 20 January 2009, Glauber Costa wrote:
>   
>> This series is not very different from the last one that did it.
>> Just bringing it back from my vacations. Idea is to replace tcg
>> memory functions with kvm's, selectable at runtime. Right now
>> we use a conditional selection. In the future, we might use
>> function pointers to allow for easy coupling of any hypervisor.
>>     
>
> I dislike that you're introducing two different ways or doing the same thing.  
> Duplicating all the memory region tracking code seems like a bad way to solve 
> this problem.
>   

To me, this patch set is really about defining the memory API for QEMU's 
CPU emulation/virtualization core.  It can be made a lot prettier by 
introducing function pointers and such but doing the separation and 
having two implementations is the first step in that direction.

I believe this is the right level of abstraction.  Trying to make shared 
code with a lower level interface is much more complicated because TCG 
has some need for deep hooks in the memory code.  If TCG ever wants to 
use memory slots instead of a tree, I'd rather see common code 
abstracted into a set of libs than making the slot code the main code 
with hooks for TCG.

A big part of the reason for these patches is to make it possible to 
build QEMU without TCG in an architecturally clean way.

Regards,

Anthony Liguori

> Paul
>   

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
  2009-01-21 17:16     ` Ian Jackson
  2009-01-21 18:40       ` Glauber Costa
@ 2009-01-22  0:11       ` Paul Brook
  2009-01-22 14:02         ` Glauber Costa
  2009-01-26 20:42         ` Anthony Liguori
  1 sibling, 2 replies; 18+ messages in thread
From: Paul Brook @ 2009-01-22  0:11 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori, Ian Jackson

On Wednesday 21 January 2009, Ian Jackson wrote:
> Glauber Costa writes ("Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory 
functions -v1.0-2009"):
> > On Wed, Jan 21, 2009 at 05:43:06AM +0000, Paul Brook wrote:
> > > I dislike that you're introducing two different ways or doing the
> > > same thing.  Duplicating all the memory region tracking code seems
> > > like a bad way to solve this problem.
> >
> > That's because I think I'm introducing two different ways of doing
> > two different things that just happens to fall under the same name
> > of "memory tracking".
>
> For what it's worth I can definitely see where Glauber is coming from.
> The Xen tree also has a replacement implementations of things like
> cpu_register_io_memory, for very similar reasons.
>
> > In all my laziness I agree that we should not be duplicating
> > things. Hence why, for example, I tried to commonize the I/O
> > functions: which are the same. (and I see no benefit in changing the
> > way KVM keeps track of I/O regions in the near future)
>
> I think the KVM and Xen approaches are probably similar enough that we
> can share this code.  I'll have to look at it in detail at some point,
> which I don't have time to do right now or necessarily even soon.

I don't see a reason why these need to be different. They're all doing 
basically the same thing. The low level implementation details ara a bit 
different, but in principle kvm, xen and tcg all need to to exactly the same 
thing: Figure out what a particular physical address is mapped to.

As discussed previously, l1_phys_map is not a good solution, and needs to go 
away. Anypatch that involves independent code paths for kvm and tcg 
because "the tcg code doesn't work for kvm" sounds a lot like lazyness. The 
real solution is to fix the current implementation rather than adding a new 
one.

Paul

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
  2009-01-22  0:11       ` Paul Brook
@ 2009-01-22 14:02         ` Glauber Costa
  2009-01-26 20:42         ` Anthony Liguori
  1 sibling, 0 replies; 18+ messages in thread
From: Glauber Costa @ 2009-01-22 14:02 UTC (permalink / raw)
  To: qemu-devel; +Cc: aliguori, Ian Jackson

On Wed, Jan 21, 2009 at 10:11 PM, Paul Brook <paul@codesourcery.com> wrote:
> I don't see a reason why these need to be different. They're all doing
> basically the same thing. The low level implementation details ara a bit
> different, but in principle kvm, xen and tcg all need to to exactly the same
> thing: Figure out what a particular physical address is mapped to.
I'll try to work something out today. I agree that if we do can have
something similar, let's have it.

However, as anthony poses it, we'll differ _somewhere_, and the whole
question is where to hook.
In a level lower than that, we can probably share more code, true, but
we'll end up with an API
that means a lot less, and much more difficult to get right.

The idea of sharing I have in mind right now is close to what I did
wrt to io handling.


-- 
Glauber  Costa.
"Free as in Freedom"
http://glommer.net

"The less confident you are, the more serious you have to act."

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009
  2009-01-22  0:11       ` Paul Brook
  2009-01-22 14:02         ` Glauber Costa
@ 2009-01-26 20:42         ` Anthony Liguori
  1 sibling, 0 replies; 18+ messages in thread
From: Anthony Liguori @ 2009-01-26 20:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Ian Jackson, Paul Brook

Paul Brook wrote:
> On Wednesday 21 January 2009, Ian Jackson wrote:
>   
>>> In all my laziness I agree that we should not be duplicating
>>> things. Hence why, for example, I tried to commonize the I/O
>>> functions: which are the same. (and I see no benefit in changing the
>>> way KVM keeps track of I/O regions in the near future)
>>>       
>> I think the KVM and Xen approaches are probably similar enough that we
>> can share this code.  I'll have to look at it in detail at some point,
>> which I don't have time to do right now or necessarily even soon.
>>     
>
> I don't see a reason why these need to be different. They're all doing 
> basically the same thing. The low level implementation details ara a bit 
> different, but in principle kvm, xen and tcg all need to to exactly the same 
> thing: Figure out what a particular physical address is mapped to.
>   

That's a bit of a simplification.  TCG needs a fair bit more than what 
KVM and Xen needs.  TCG needs to keep track of any write operations to 
memory that could possibly be translated to flush the translation 
caches.  TCG also maintains a software TLB and therefore is going to 
have a different set of sensitivities to the performance characteristics 
of RAM lookup since the performance overhead occurs on every software 
TLB miss.

Conversely, KVM needs to only keep track of MMIO regions since the 
kernel tracks RAM regions.  There is no software TLB so the cost of a 
TLB miss is neglible.  There is no concern about icache flushing because 
the hardware will take care of this.

That said, it may be possible to develop a unified implementation.  
However, it has been expressed previously (by Fabrice, among others) 
that there is a desire to decouple the CPU execution code (be it TCG or 
KVM) from the rest of QEMU so that you could even plug-in new CPU 
emulators rather easily.  That is what this patch series is doing 
primarily, establishing that interface.

> As discussed previously, l1_phys_map is not a good solution, and needs to go 
> away. Anypatch that involves independent code paths for kvm and tcg 
> because "the tcg code doesn't work for kvm" sounds a lot like lazyness. The 
> real solution is to fix the current implementation rather than adding a new 
> one.
>   

Laziness is distinctly different from incremental development.  We know 
this code will do well for KVM, so let's use it for KVM and iron out any 
problems.  After someone has done the work of doing performance 
measurement on the TCG code and adjusting the slot code so that it can 
be hooked appropriate for TCG, we can convert TCG to using it, if 
appropriate.

At the end of the day, we're talking about 200-300 lines of code.  I'm 
not hugely concerned that we're duplicating code within QEMU.

Regards,

Anthony Liguori

> Paul
>
>
>   

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2009-01-26 20:43 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-01-20 18:50 [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009 Glauber Costa
2009-01-20 18:51 ` [Qemu-devel] [PATCH 1/6] remove smaller slots if registering a bigger one Glauber Costa
2009-01-20 18:51   ` [Qemu-devel] [PATCH 2/6] re-register whole area upon lfb unmap Glauber Costa
2009-01-20 18:51     ` [Qemu-devel] [PATCH 3/6] isolate io handling routine Glauber Costa
2009-01-20 18:51       ` [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw Glauber Costa
2009-01-20 18:51         ` [Qemu-devel] [PATCH 5/6] hook cpu_register_physical_mem Glauber Costa
2009-01-20 18:51           ` [Qemu-devel] [PATCH 6/6] cache slot lookup Glauber Costa
2009-01-20 20:24         ` [Qemu-devel] [PATCH 4/6] replace cpu_physical_memory_rw Avi Kivity
2009-01-20 20:33           ` Glauber Costa
2009-01-21  5:43 ` [Qemu-devel] [PATCH 0/6] Bypass tcg memory functions -v1.0-2009 Paul Brook
2009-01-21 11:46   ` Glauber Costa
2009-01-21 17:16     ` Ian Jackson
2009-01-21 18:40       ` Glauber Costa
2009-01-22  0:11       ` Paul Brook
2009-01-22 14:02         ` Glauber Costa
2009-01-26 20:42         ` Anthony Liguori
2009-01-21 19:26   ` Anthony Liguori
  -- strict thread matches above, loose matches on Subject: below --
2008-12-18 17:01 [Qemu-devel] [PATCH 0/6] bypass tcg memory functions -v2 Glauber Costa
2008-12-18 17:01 ` [Qemu-devel] [PATCH 1/6] remove smaller slots if registering a bigger one Glauber Costa

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).