qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
@ 2011-05-03 16:49 Paolo Bonzini
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 1/4] exec: extract cpu_physical_memory_map_internal Paolo Bonzini
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-05-03 16:49 UTC (permalink / raw)
  To: qemu-devel

Paravirtualized devices (and also some real devices) can assume they
are going to access RAM.  For this reason, provide a fast-path
function with the following properties:

1) it will never allocate a bounce buffer

2) it can be used for read-modify-write operations

3) unlike qemu_get_ram_ptr, it is safe because it recognizes "short" blocks

Patches 3 and 4 use this function for virtio devices and the milkymist
GPU.  The latter is only compile-tested.

Another function checks whether it is possible to split a contiguous
physical address range into multiple subranges, all of which use the
fast path.  I will introduce a use for this function later.

Paolo Bonzini (4):
  exec: extract cpu_physical_memory_map_internal
  exec: introduce cpu_physical_memory_map_fast and
    cpu_physical_memory_map_check
  virtio: use cpu_physical_memory_map_fast
  milkymist: use cpu_physical_memory_map_fast

 cpu-common.h        |    4 ++
 exec.c              |  108 +++++++++++++++++++++++++++++++++++++-------------
 hw/milkymist-tmu2.c |   39 ++++++++++--------
 hw/vhost.c          |   10 ++--
 hw/virtio.c         |    2 +-
 5 files changed, 111 insertions(+), 52 deletions(-)

-- 
1.7.4.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Qemu-devel] [PATCH 1/4] exec: extract cpu_physical_memory_map_internal
  2011-05-03 16:49 [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
@ 2011-05-03 16:49 ` Paolo Bonzini
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 2/4] exec: introduce cpu_physical_memory_map_fast and cpu_physical_memory_map_check Paolo Bonzini
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-05-03 16:49 UTC (permalink / raw)
  To: qemu-devel

This function does all the work for the fast path, and returns
enough information for the slow path to pick up where it left off.
It will later be used by other functions that only take the fast path.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 exec.c |   77 ++++++++++++++++++++++++++++++++++++++++-----------------------
 1 files changed, 49 insertions(+), 28 deletions(-)

diff --git a/exec.c b/exec.c
index a718d74..9b2c9e4 100644
--- a/exec.c
+++ b/exec.c
@@ -3898,16 +3898,9 @@ static void cpu_notify_map_clients(void)
     }
 }
 
-/* Map a physical memory region into a host virtual address.
- * May map a subset of the requested range, given by and returned in *plen.
- * May return NULL if resources needed to perform the mapping are exhausted.
- * Use only for reads OR writes - not for read-modify-write operations.
- * Use cpu_register_map_client() to know when retrying the map operation is
- * likely to succeed.
- */
-void *cpu_physical_memory_map(target_phys_addr_t addr,
-                              target_phys_addr_t *plen,
-                              int is_write)
+static void *cpu_physical_memory_map_internal(target_phys_addr_t addr,
+                                              target_phys_addr_t *plen,
+                                              uintptr_t *pd)
 {
     target_phys_addr_t len = *plen;
     target_phys_addr_t done = 0;
@@ -3915,7 +3908,6 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
     uint8_t *ret = NULL;
     uint8_t *ptr;
     target_phys_addr_t page;
-    unsigned long pd;
     PhysPageDesc *p;
     unsigned long addr1;
 
@@ -3926,26 +3918,16 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
             l = len;
         p = phys_page_find(page >> TARGET_PAGE_BITS);
         if (!p) {
-            pd = IO_MEM_UNASSIGNED;
-        } else {
-            pd = p->phys_offset;
+            *pd = IO_MEM_UNASSIGNED;
+            break;
         }
 
-        if ((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM) {
-            if (done || bounce.buffer) {
-                break;
-            }
-            bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, TARGET_PAGE_SIZE);
-            bounce.addr = addr;
-            bounce.len = l;
-            if (!is_write) {
-                cpu_physical_memory_read(addr, bounce.buffer, l);
-            }
-            ptr = bounce.buffer;
-        } else {
-            addr1 = (pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK);
-            ptr = qemu_get_ram_ptr(addr1);
+        *pd = p->phys_offset;
+        if ((*pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM) {
+            break;
         }
+        addr1 = (*pd & TARGET_PAGE_MASK) + (addr & ~TARGET_PAGE_MASK);
+        ptr = qemu_get_ram_ptr(addr1);
         if (!done) {
             ret = ptr;
         } else if (ret + done != ptr) {
@@ -3960,6 +3942,45 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
     return ret;
 }
 
+/* Map a physical memory region into a host virtual address.
+ * May map a subset of the requested range, given by and returned in *plen.
+ * May return NULL if resources needed to perform the mapping are exhausted.
+ * Use only for reads OR writes - not for read-modify-write operations.
+ * Use cpu_register_map_client() to know when retrying the map operation is
+ * likely to succeed.
+ */
+void *cpu_physical_memory_map(target_phys_addr_t addr,
+                              target_phys_addr_t *plen,
+                              int is_write)
+{
+    target_phys_addr_t page;
+    uintptr_t pd = IO_MEM_UNASSIGNED;
+    void *ret;
+    ret = cpu_physical_memory_map_internal(addr, plen, &pd);
+    if (ret) {
+        return ret;
+    }
+
+    assert((pd & ~TARGET_PAGE_MASK) != IO_MEM_RAM);
+    if (pd == IO_MEM_UNASSIGNED) {
+        return NULL;
+    }
+    if (bounce.buffer) {
+        return NULL;
+    }
+
+    /* Read at most a page into the temporary buffer.  */
+    page = addr & TARGET_PAGE_MASK;
+    bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, TARGET_PAGE_SIZE);
+    bounce.addr = addr;
+    bounce.len = MIN(page + TARGET_PAGE_SIZE - addr, *plen);
+    if (!is_write) {
+        cpu_physical_memory_read(addr, bounce.buffer, bounce.len);
+    }
+    *plen = bounce.len;
+    return bounce.buffer;
+}
+
 /* Unmaps a memory region previously mapped by cpu_physical_memory_map().
  * Will also mark the memory as dirty if is_write == 1.  access_len gives
  * the amount of memory that was actually read or written by the caller.
-- 
1.7.4.4


* [Qemu-devel] [PATCH 2/4] exec: introduce cpu_physical_memory_map_fast and cpu_physical_memory_map_check
  2011-05-03 16:49 [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 1/4] exec: extract cpu_physical_memory_map_internal Paolo Bonzini
@ 2011-05-03 16:49 ` Paolo Bonzini
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 3/4] virtio: use cpu_physical_memory_map_fast Paolo Bonzini
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-05-03 16:49 UTC (permalink / raw)
  To: qemu-devel

Paravirtualized devices (and also some real devices) can assume they
are going to access RAM.  For this reason, provide a fast-path
function with the following properties:

1) it will never allocate a bounce buffer

2) it can be used for read-modify-write operations

3) unlike qemu_get_ram_ptr, it is safe because it recognizes "short" blocks

To use cpu_physical_memory_map_fast for RMW, just pass 1 to is_write
when unmapping.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 cpu-common.h |    4 ++++
 exec.c       |   31 +++++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/cpu-common.h b/cpu-common.h
index 96c02ae..d6e116d 100644
--- a/cpu-common.h
+++ b/cpu-common.h
@@ -80,6 +80,10 @@ static inline void cpu_physical_memory_write(target_phys_addr_t addr,
 void *cpu_physical_memory_map(target_phys_addr_t addr,
                               target_phys_addr_t *plen,
                               int is_write);
+void *cpu_physical_memory_map_fast(target_phys_addr_t addr,
+                                   target_phys_addr_t *plen);
+bool cpu_physical_memory_map_check(target_phys_addr_t addr,
+                                   target_phys_addr_t len);
 void cpu_physical_memory_unmap(void *buffer, target_phys_addr_t len,
                                int is_write, target_phys_addr_t access_len);
 void *cpu_register_map_client(void *opaque, void (*callback)(void *opaque));
diff --git a/exec.c b/exec.c
index 9b2c9e4..2b88c29 100644
--- a/exec.c
+++ b/exec.c
@@ -3944,6 +3944,19 @@ static void *cpu_physical_memory_map_internal(target_phys_addr_t addr,
 
 /* Map a physical memory region into a host virtual address.
  * May map a subset of the requested range, given by and returned in *plen.
+ * May return NULL if extra resources are needed to perform the mapping
+ * (i.e. cpu_physical_memory_map is needed).
+ * It may be used for read-modify-write operations.
+ */
+void *cpu_physical_memory_map_fast(target_phys_addr_t addr,
+                                   target_phys_addr_t *plen)
+{
+    uintptr_t pd;
+    return cpu_physical_memory_map_internal(addr, plen, &pd);
+}
+
+/* Map a physical memory region into a host virtual address.
+ * May map a subset of the requested range, given by and returned in *plen.
  * May return NULL if resources needed to perform the mapping are exhausted.
  * Use only for reads OR writes - not for read-modify-write operations.
  * Use cpu_register_map_client() to know when retrying the map operation is
@@ -3981,6 +3994,24 @@ void *cpu_physical_memory_map(target_phys_addr_t addr,
     return bounce.buffer;
 }
 
+/* Returns true if the entire area between ADDR and ADDR+LEN (inclusive
+ * and exclusive, respectively) can be mapped by cpu_physical_memory_map_fast
+ * (possibly in multiple steps).
+ */
+bool cpu_physical_memory_map_check(target_phys_addr_t addr,
+                                   target_phys_addr_t len)
+{
+    while (len > 0) {
+        target_phys_addr_t l = len;
+        if (!cpu_physical_memory_map_fast(addr, &l)) {
+            return false;
+        }
+        len -= l;
+        addr += l;
+    }
+    return true;
+}
+
 /* Unmaps a memory region previously mapped by cpu_physical_memory_map().
  * Will also mark the memory as dirty if is_write == 1.  access_len gives
  * the amount of memory that was actually read or written by the caller.
-- 
1.7.4.4


* [Qemu-devel] [PATCH 3/4] virtio: use cpu_physical_memory_map_fast
  2011-05-03 16:49 [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 1/4] exec: extract cpu_physical_memory_map_internal Paolo Bonzini
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 2/4] exec: introduce cpu_physical_memory_map_fast and cpu_physical_memory_map_check Paolo Bonzini
@ 2011-05-03 16:49 ` Paolo Bonzini
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 4/4] milkymist: " Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-05-03 16:49 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 hw/vhost.c  |   10 +++++-----
 hw/virtio.c |    2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/vhost.c b/hw/vhost.c
index 14b571d..763ee4c 100644
--- a/hw/vhost.c
+++ b/hw/vhost.c
@@ -283,7 +283,7 @@ static int vhost_verify_ring_mappings(struct vhost_dev *dev,
             continue;
         }
         l = vq->ring_size;
-        p = cpu_physical_memory_map(vq->ring_phys, &l, 1);
+        p = cpu_physical_memory_map_fast(vq->ring_phys, &l);
         if (!p || l != vq->ring_size) {
             fprintf(stderr, "Unable to map ring buffer for ring %d\n", i);
             return -ENOMEM;
@@ -476,21 +476,21 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
 
     s = l = virtio_queue_get_desc_size(vdev, idx);
     a = virtio_queue_get_desc_addr(vdev, idx);
-    vq->desc = cpu_physical_memory_map(a, &l, 0);
+    vq->desc = cpu_physical_memory_map_fast(a, &l);
     if (!vq->desc || l != s) {
         r = -ENOMEM;
         goto fail_alloc_desc;
     }
     s = l = virtio_queue_get_avail_size(vdev, idx);
     a = virtio_queue_get_avail_addr(vdev, idx);
-    vq->avail = cpu_physical_memory_map(a, &l, 0);
+    vq->avail = cpu_physical_memory_map_fast(a, &l);
     if (!vq->avail || l != s) {
         r = -ENOMEM;
         goto fail_alloc_avail;
     }
     vq->used_size = s = l = virtio_queue_get_used_size(vdev, idx);
     vq->used_phys = a = virtio_queue_get_used_addr(vdev, idx);
-    vq->used = cpu_physical_memory_map(a, &l, 1);
+    vq->used = cpu_physical_memory_map_fast(a, &l);
     if (!vq->used || l != s) {
         r = -ENOMEM;
         goto fail_alloc_used;
@@ -498,7 +498,7 @@ static int vhost_virtqueue_init(struct vhost_dev *dev,
 
     vq->ring_size = s = l = virtio_queue_get_ring_size(vdev, idx);
     vq->ring_phys = a = virtio_queue_get_ring_addr(vdev, idx);
-    vq->ring = cpu_physical_memory_map(a, &l, 1);
+    vq->ring = cpu_physical_memory_map_fast(a, &l);
     if (!vq->ring || l != s) {
         r = -ENOMEM;
         goto fail_alloc_ring;
diff --git a/hw/virtio.c b/hw/virtio.c
index 6e8814c..429646a 100644
--- a/hw/virtio.c
+++ b/hw/virtio.c
@@ -372,7 +372,7 @@ void virtqueue_map_sg(struct iovec *sg, target_phys_addr_t *addr,
 
     for (i = 0; i < num_sg; i++) {
         len = sg[i].iov_len;
-        sg[i].iov_base = cpu_physical_memory_map(addr[i], &len, is_write);
+        sg[i].iov_base = cpu_physical_memory_map_fast(addr[i], &len);
         if (sg[i].iov_base == NULL || len != sg[i].iov_len) {
             error_report("virtio: trying to map MMIO memory");
             exit(1);
-- 
1.7.4.4


* [Qemu-devel] [PATCH 4/4] milkymist: use cpu_physical_memory_map_fast
  2011-05-03 16:49 [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
                   ` (2 preceding siblings ...)
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 3/4] virtio: use cpu_physical_memory_map_fast Paolo Bonzini
@ 2011-05-03 16:49 ` Paolo Bonzini
  2011-05-04 21:56   ` Michael Walle
  2011-05-12 14:51 ` [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
  2011-05-12 15:32 ` Avi Kivity
  5 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2011-05-03 16:49 UTC (permalink / raw)
  To: qemu-devel; +Cc: Michael Walle

This patch adds proper unmapping of the memory when the addresses
cross multiple memory blocks, and it also uses a single map_fast
operation for the RMW access to the destination frame buffer.

Cc: Michael Walle <michael@walle.cc>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
        Compile-tested only.

 hw/milkymist-tmu2.c |   39 +++++++++++++++++++++------------------
 1 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/hw/milkymist-tmu2.c b/hw/milkymist-tmu2.c
index 9cebe31..575fa41 100644
--- a/hw/milkymist-tmu2.c
+++ b/hw/milkymist-tmu2.c
@@ -181,9 +181,9 @@ static void tmu2_start(MilkymistTMU2State *s)
     GLXPbuffer pbuffer;
     GLuint texture;
     void *fb;
-    target_phys_addr_t fb_len;
+    target_phys_addr_t fb_len, fb_len_full;
     void *mesh;
-    target_phys_addr_t mesh_len;
+    target_phys_addr_t mesh_len, mesh_len_full;
     float m;
 
     trace_milkymist_tmu2_start();
@@ -205,8 +205,12 @@ static void tmu2_start(MilkymistTMU2State *s)
     /* Read the QEMU source framebuffer into an OpenGL texture */
     glGenTextures(1, &texture);
     glBindTexture(GL_TEXTURE_2D, texture);
-    fb_len = 2*s->regs[R_TEXHRES]*s->regs[R_TEXVRES];
-    fb = cpu_physical_memory_map(s->regs[R_TEXFBUF], &fb_len, 0);
+    fb_len_full = fb_len = 2*s->regs[R_TEXHRES]*s->regs[R_TEXVRES];
+    fb = cpu_physical_memory_map_fast(s->regs[R_TEXFBUF], &fb_len);
+    if (fb_len < fb_len_full) {
+        cpu_physical_memory_unmap(fb, fb_len, 0, 0);
+        fb = NULL;
+    }
     if (fb == NULL) {
         glDeleteTextures(1, &texture);
         glXMakeContextCurrent(s->dpy, None, None, NULL);
@@ -249,8 +253,12 @@ static void tmu2_start(MilkymistTMU2State *s)
     glColor4f(m, m, m, (float)(s->regs[R_ALPHA] + 1) / 64.0f);
 
     /* Read the QEMU dest. framebuffer into the OpenGL framebuffer */
-    fb_len = 2 * s->regs[R_DSTHRES] * s->regs[R_DSTVRES];
-    fb = cpu_physical_memory_map(s->regs[R_DSTFBUF], &fb_len, 0);
+    fb_len_full = fb_len = 2 * s->regs[R_DSTHRES] * s->regs[R_DSTVRES];
+    fb = cpu_physical_memory_map_fast(s->regs[R_DSTFBUF], &fb_len);
+    if (fb_len < fb_len_full) {
+        cpu_physical_memory_unmap(fb, fb_len, 0, 0);
+        fb = NULL;
+    }
     if (fb == NULL) {
         glDeleteTextures(1, &texture);
         glXMakeContextCurrent(s->dpy, None, None, NULL);
@@ -260,7 +268,6 @@ static void tmu2_start(MilkymistTMU2State *s)
 
     glDrawPixels(s->regs[R_DSTHRES], s->regs[R_DSTVRES], GL_RGB,
             GL_UNSIGNED_SHORT_5_6_5, fb);
-    cpu_physical_memory_unmap(fb, fb_len, 0, fb_len);
     glViewport(0, 0, s->regs[R_DSTHRES], s->regs[R_DSTVRES]);
     glMatrixMode(GL_PROJECTION);
     glLoadIdentity();
@@ -268,9 +275,14 @@ static void tmu2_start(MilkymistTMU2State *s)
     glMatrixMode(GL_MODELVIEW);
 
     /* Map the texture */
-    mesh_len = MESH_MAXSIZE*MESH_MAXSIZE*sizeof(struct vertex);
-    mesh = cpu_physical_memory_map(s->regs[R_VERTICESADDR], &mesh_len, 0);
+    mesh_len_full = mesh_len = MESH_MAXSIZE*MESH_MAXSIZE*sizeof(struct vertex);
+    mesh = cpu_physical_memory_map_fast(s->regs[R_VERTICESADDR], &mesh_len);
+    if (mesh_len < mesh_len_full) {
+        cpu_physical_memory_unmap(mesh, mesh_len, 0, 0);
+        mesh = NULL;
+    }
     if (mesh == NULL) {
+        cpu_physical_memory_unmap(fb, fb_len, 0, fb_len);
         glDeleteTextures(1, &texture);
         glXMakeContextCurrent(s->dpy, None, None, NULL);
         glXDestroyPbuffer(s->dpy, pbuffer);
@@ -285,15 +297,6 @@ static void tmu2_start(MilkymistTMU2State *s)
     cpu_physical_memory_unmap(mesh, mesh_len, 0, mesh_len);
 
     /* Write back the OpenGL framebuffer to the QEMU framebuffer */
-    fb_len = 2 * s->regs[R_DSTHRES] * s->regs[R_DSTVRES];
-    fb = cpu_physical_memory_map(s->regs[R_DSTFBUF], &fb_len, 1);
-    if (fb == NULL) {
-        glDeleteTextures(1, &texture);
-        glXMakeContextCurrent(s->dpy, None, None, NULL);
-        glXDestroyPbuffer(s->dpy, pbuffer);
-        return;
-    }
-
     glReadPixels(0, 0, s->regs[R_DSTHRES], s->regs[R_DSTVRES], GL_RGB,
             GL_UNSIGNED_SHORT_5_6_5, fb);
     cpu_physical_memory_unmap(fb, fb_len, 1, fb_len);
-- 
1.7.4.4


* Re: [Qemu-devel] [PATCH 4/4] milkymist: use cpu_physical_memory_map_fast
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 4/4] milkymist: " Paolo Bonzini
@ 2011-05-04 21:56   ` Michael Walle
  0 siblings, 0 replies; 15+ messages in thread
From: Michael Walle @ 2011-05-04 21:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On Tuesday, 3 May 2011 at 18:49:34, Paolo Bonzini wrote:
> This patch adds proper unmapping of the memory when the addresses
> cross multiple memory blocks, and it also uses a single map_fast
> operation for the RMW access to the destination frame buffer.
> 
> Cc: Michael Walle <michael@walle.cc>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael Walle <michael@walle.cc>


* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-05-03 16:49 [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
                   ` (3 preceding siblings ...)
  2011-05-03 16:49 ` [Qemu-devel] [PATCH 4/4] milkymist: " Paolo Bonzini
@ 2011-05-12 14:51 ` Paolo Bonzini
  2011-05-31  9:16   ` Paolo Bonzini
  2011-05-12 15:32 ` Avi Kivity
  5 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2011-05-12 14:51 UTC (permalink / raw)
  Cc: qemu-devel

On 05/03/2011 06:49 PM, Paolo Bonzini wrote:
> Paravirtualized devices (and also some real devices) can assume they
> are going to access RAM.  For this reason, provide a fast-path
> function with the following properties:
>
> 1) it will never allocate a bounce buffer
>
> 2) it can be used for read-modify-write operations
>
> 3) unlike qemu_get_ram_ptr, it is safe because it recognizes "short" blocks
>
> Patches 3 and 4 use this function for virtio devices and the milkymist
> GPU.  The latter is only compile-tested.
>
> Another function checks if it is possible to split a contiguous physical
> address range into multiple subranges, all of which use the fast path.
> I will introduce later a use for this function.
>
> Paolo Bonzini (4):
>    exec: extract cpu_physical_memory_map_internal
>    exec: introduce cpu_physical_memory_map_fast and
>      cpu_physical_memory_map_check
>    virtio: use cpu_physical_memory_map_fast
>    milkymist: use cpu_physical_memory_map_fast
>
>   cpu-common.h        |    4 ++
>   exec.c              |  108 +++++++++++++++++++++++++++++++++++++-------------
>   hw/milkymist-tmu2.c |   39 ++++++++++--------
>   hw/vhost.c          |   10 ++--
>   hw/virtio.c         |    2 +-
>   5 files changed, 111 insertions(+), 52 deletions(-)
>

Ping?

Paolo


* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-05-03 16:49 [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
                   ` (4 preceding siblings ...)
  2011-05-12 14:51 ` [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
@ 2011-05-12 15:32 ` Avi Kivity
  2011-05-13  6:33   ` Paolo Bonzini
  5 siblings, 1 reply; 15+ messages in thread
From: Avi Kivity @ 2011-05-12 15:32 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 05/03/2011 07:49 PM, Paolo Bonzini wrote:
> Paravirtualized devices (and also some real devices) can assume they
> are going to access RAM.  For this reason, provide a fast-path
> function with the following properties:
>
> 1) it will never allocate a bounce buffer
>
> 2) it can be used for read-modify-write operations
>
> 3) unlike qemu_get_ram_ptr, it is safe because it recognizes "short" blocks
>
> Patches 3 and 4 use this function for virtio devices and the milkymist
> GPU.  The latter is only compile-tested.
>
> Another function checks if it is possible to split a contiguous physical
> address range into multiple subranges, all of which use the fast path.
> I will introduce later a use for this function.
>

Out of curiosity, what performance benefit do you see?

For relatively constant mappings (like the ring) we can cache the 
mapping in a structure and invalidate it when the memory map changes 
(using, say, RCU).  That doesn't work for the actual buffers, or for 
indirect mappings.

-- 
error compiling committee.c: too many arguments to function


* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-05-12 15:32 ` Avi Kivity
@ 2011-05-13  6:33   ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-05-13  6:33 UTC (permalink / raw)
  To: Avi Kivity; +Cc: qemu-devel

On 05/12/2011 05:32 PM, Avi Kivity wrote:
> Out of curiosity, what performance benefit do you see?

Zero. :)  That is partly because the only real change is in patch 4/4 
(milkymist), which I only compile-tested.  Everywhere else, using 
cpu_physical_memory_map_fast just makes it clear that we don't want 
bounce buffers.

However, this is just preparatory work for vmw_pvscsi, which will use 
the functions to build its iovecs.  qemu_get_ram_ptr would not really be 
a satisfactory API for that.

Paolo


* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-05-12 14:51 ` [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
@ 2011-05-31  9:16   ` Paolo Bonzini
  2011-06-06 12:27     ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2011-05-31  9:16 UTC (permalink / raw)
  To: qemu-devel

On 05/12/2011 04:51 PM, Paolo Bonzini wrote:
> On 05/03/2011 06:49 PM, Paolo Bonzini wrote:
>> Paravirtualized devices (and also some real devices) can assume they
>> are going to access RAM. [...]
>>
> Ping?

Ping^2?

Paolo


* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-05-31  9:16   ` Paolo Bonzini
@ 2011-06-06 12:27     ` Paolo Bonzini
  2011-06-06 12:56       ` Anthony Liguori
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 12:27 UTC (permalink / raw)
  To: qemu-devel

On 05/31/2011 11:16 AM, Paolo Bonzini wrote:
> On 05/12/2011 04:51 PM, Paolo Bonzini wrote:
>> On 05/03/2011 06:49 PM, Paolo Bonzini wrote:
>>> Paravirtualized devices (and also some real devices) can assume they
>>> are going to access RAM. For this reason, provide a fast-path
>>> function with the following properties:
>>>
>>> 1) it will never allocate a bounce buffer
>>>
>>> 2) it can be used for read-modify-write operations
>>>
>>> 3) unlike qemu_get_ram_ptr, it is safe because it recognizes "short"
>>> blocks
>>>
>>> Patches 3 and 4 use this function for virtio devices and the milkymist
>>> GPU. The latter is only compile-tested.
>>>
>>> Another function checks if it is possible to split a contiguous physical
>>> address range into multiple subranges, all of which use the fast path.
>>> I will introduce later a use for this function.
>>>
>>> Paolo Bonzini (4):
>>> exec: extract cpu_physical_memory_map_internal
>>> exec: introduce cpu_physical_memory_map_fast and
>>> cpu_physical_memory_map_check
>>> virtio: use cpu_physical_memory_map_fast
>>> milkymist: use cpu_physical_memory_map_fast
>>>
>>> cpu-common.h | 4 ++
>>> exec.c | 108 +++++++++++++++++++++++++++++++++++++-------------
>>> hw/milkymist-tmu2.c | 39 ++++++++++--------
>>> hw/vhost.c | 10 ++--
>>> hw/virtio.c | 2 +-
>>> 5 files changed, 111 insertions(+), 52 deletions(-)
>>>
>>
>> Ping?
>
> Ping^2?

Ping^3?

Paolo


* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-06-06 12:27     ` Paolo Bonzini
@ 2011-06-06 12:56       ` Anthony Liguori
  2011-06-06 13:09         ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Anthony Liguori @ 2011-06-06 12:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 06/06/2011 07:27 AM, Paolo Bonzini wrote:
> On 05/31/2011 11:16 AM, Paolo Bonzini wrote:
>> On 05/12/2011 04:51 PM, Paolo Bonzini wrote:
>>> On 05/03/2011 06:49 PM, Paolo Bonzini wrote:
>>>> Paravirtualized devices (and also some real devices) can assume they
>>>> are going to access RAM. [...]
>>>>
>>>
>>> Ping?
>>
>> Ping^2?
>
> Ping^3?

Oh, the patch series basically died for me when I saw:

Avi> What performance benefit does this bring?

Paolo> Zero

Especially given Avi's efforts to introduce a new RAM API, I don't want 
yet another special case to handle.

You're just trying to avoid having to handle map failures, right?  So 
it's not really cpu_physical_memory_map_fast, it's really 
cpu_physical_memory_map_simple?

I'd prefer that a device just treat failures as fatal vs. using a new API.

Regards,

Anthony Liguori

> Paolo
>


* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-06-06 12:56       ` Anthony Liguori
@ 2011-06-06 13:09         ` Paolo Bonzini
  2011-06-06 15:44           ` Anthony Liguori
  0 siblings, 1 reply; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 13:09 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel

On 06/06/2011 02:56 PM, Anthony Liguori wrote:
>
> Oh, the patch series basically died for me when I saw:
>
> Avi> What performance benefit does this bring?
>
> Paolo> Zero

:)

> Especially given Avi's efforts to introduce a new RAM API, I don't want
> yet another special case to handle.

This is not a special case; the existing functions are all mapped onto 
the new cpu_physical_memory_map_internal.  I don't think this is in any 
way related to Avi's RAM API, which is (mostly) for MMIO.

> You're just trying to avoid having to handle map failures, right?

Not just that.  If you had one memory block at, say, 1 GB to 2 GB and 
another at 2 GB to 3 GB, a DMA operation that crosses the two could be 
implemented with cpu_physical_memory_map_fast: you would simply build a 
two-element iovec in two steps, something the current API does not allow.

The patch does not change virtio to do the split, but it would be 
possible to do so.  The reason I'm not doing the virtio change is that I 
know mst has pending changes to virtio and I'd rather avoid the 
conflicts for now.  For vmw_pvscsi, however, I'm going to handle it 
using the new functions.

Paolo


* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-06-06 13:09         ` Paolo Bonzini
@ 2011-06-06 15:44           ` Anthony Liguori
  2011-06-06 15:55             ` Paolo Bonzini
  0 siblings, 1 reply; 15+ messages in thread
From: Anthony Liguori @ 2011-06-06 15:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

On 06/06/2011 08:09 AM, Paolo Bonzini wrote:
> On 06/06/2011 02:56 PM, Anthony Liguori wrote:
>>
>> Oh, the patch series basically died for me when I saw:
>>
>> Avi> What performance benefit does this bring?
>>
>> Paolo> Zero
>
> :)
>
>> Especially given Avi's efforts to introduce a new RAM API, I don't want
>> yet another special case to handle.
>
> This is not a special case, the existing functions are all mapped onto
> the new cpu_physical_memory_map_internal. I don't think this is in any
> way related to Avi's RAM API which is (mostly) for MMIO.
>
>> You're just trying to avoid having to handle map failures, right?
>
> Not just that. If you had a memory block at say 1 GB - 2 GB, and another
> at 2 GB - 3 GB, a DMA operation that crosses the two could be
> implemented with cpu_physical_memory_map_fast; you would simply build a
> two-element iovec in two steps, something the current API does not allow.

You cannot assume RAM blocks are contiguous.  This has nothing to do 
with PV or not PV but how the RAM API works today.

>
> The patch does not change virtio to do the split, but it is possible to
> do that. The reason I'm not doing the virtio change, is that I know mst
> has pending changes to virtio and I'd rather avoid the conflicts for
> now. However, for vmw_pvscsi I'm going to handle it using the new
> functions.

Virtio can handle all of this today because it uses 
cpu_physical_memory_rw for ring access and then calls map for SG 
elements.  SG elements are usually 4k so it's never really an issue to 
get a partial mapping.  We could be more robust about it but in 
practice, it's not a problem.

Regards,

Anthony Liguori

>
> Paolo
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast
  2011-06-06 15:44           ` Anthony Liguori
@ 2011-06-06 15:55             ` Paolo Bonzini
  0 siblings, 0 replies; 15+ messages in thread
From: Paolo Bonzini @ 2011-06-06 15:55 UTC (permalink / raw)
  To: Anthony Liguori; +Cc: qemu-devel

On 06/06/2011 05:44 PM, Anthony Liguori wrote:
>>
>> Not just that. If you had a memory block at say 1 GB - 2 GB, and another
>> at 2 GB - 3 GB, a DMA operation that crosses the two could be
>> implemented with cpu_physical_memory_map_fast; you would simply build a
>> two-element iovec in two steps, something the current API does not allow.
>
> You cannot assume RAM blocks are contiguous.  This has nothing to do
> with PV or not PV but how the RAM API works today.

That's exactly why I said a *two-element* iovec.

> Virtio can handle all of this today because it uses
> cpu_physical_memory_rw for ring access and then calls map for SG
> elements.  SG elements are usually 4k so it's never really an issue to
> get a partial mapping.  We could be more robust about it but in
> practice, it's not a problem.

I know in practice it's not a problem, but I dislike not having an API 
that can deal with it even in theory.  For vmw_pvscsi it takes about 
five lines of code to allow it.

Paolo

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2011-06-06 15:55 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-03 16:49 [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
2011-05-03 16:49 ` [Qemu-devel] [PATCH 1/4] exec: extract cpu_physical_memory_map_internal Paolo Bonzini
2011-05-03 16:49 ` [Qemu-devel] [PATCH 2/4] exec: introduce cpu_physical_memory_map_fast and cpu_physical_memory_map_check Paolo Bonzini
2011-05-03 16:49 ` [Qemu-devel] [PATCH 3/4] virtio: use cpu_physical_memory_map_fast Paolo Bonzini
2011-05-03 16:49 ` [Qemu-devel] [PATCH 4/4] milkymist: " Paolo Bonzini
2011-05-04 21:56   ` Michael Walle
2011-05-12 14:51 ` [Qemu-devel] [PATCH 0/4] introduce cpu_physical_memory_map_fast Paolo Bonzini
2011-05-31  9:16   ` Paolo Bonzini
2011-06-06 12:27     ` Paolo Bonzini
2011-06-06 12:56       ` Anthony Liguori
2011-06-06 13:09         ` Paolo Bonzini
2011-06-06 15:44           ` Anthony Liguori
2011-06-06 15:55             ` Paolo Bonzini
2011-05-12 15:32 ` Avi Kivity
2011-05-13  6:33   ` Paolo Bonzini
