* [PATCH 0/3] qemu-qdisk: Replace grant map by grant copy.
@ 2016-05-27 12:53 Paulina Szubarczyk
From: Paulina Szubarczyk @ 2016-05-27 12:53 UTC (permalink / raw)
  To: xen-devel, roger.pau
  Cc: sstabellini, wei.liu2, George.Dunlap, Paulina Szubarczyk,
	ian.jackson, P.Gawkowski, anthony.perard

Hi, 

This series proposes replacing the grant map operation with grant copy.

I would appreciate opinions on whether the approach is sound or whether I
have made wrong assumptions, and on anything that could be improved or
needs to be changed. If the approach is acceptable, I still need to rethink
batch mode, notification and the mini-os implementation.

In libs/gnttab and libxc an interface is added that invokes the
ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..) system call on the gntdev
device. It is described in detail in the following messages. It is not
implemented for mini-os.
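
As a rough illustration (not part of the series), a caller of the proposed
interface copying one page from a guest grant into a local buffer could
look like the sketch below; the domid and grant reference are made-up
values and error handling is minimal:

    /* Sketch only: copy one page from a guest grant into a local buffer
     * using the xengnttab_copy_grant() call added in patch 1. */
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <xengnttab.h>

    void example_copy_from_guest(void)
    {
        xengnttab_handle *xgt = xengnttab_open(NULL, 0);
        uint16_t domid  = 1;      /* hypothetical frontend domain id */
        uint32_t ref    = 8;      /* hypothetical grant reference */
        uint32_t offset = 0;
        uint32_t len    = 4096;   /* offset + len must stay within a page */
        void *buf = malloc(len);

        /* type != 0: the grant page is the source and the local buffer the
         * destination (GNTCOPY_source_gref in the gntdev segment). */
        if (xengnttab_copy_grant(xgt, 1, &domid, &ref, &buf, &offset, &len, 1))
            fprintf(stderr, "grant copy failed\n");

        free(buf);
        xengnttab_close(xgt);
    }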

The grant map operation is replaced with grant copy in
qemu-xen-dir/hw/block/xen_disk. The implementation is described in the
patch.

For a functional test I attached a device with a qdisk backend to the
guest, mounted it successfully and stored files on it. During creation of
a file system on the device the BLKIF_OP_DISCARD operation seems to fail
(non-zero return value), but it also fails with the original version
because of an error returned from qemu.

I ran fio tests before[0] and after[1] the changes with different iodepth
and block sizes. The test script can be found on my github[2]; essentially,
after a warm-up, each run lasts 60 seconds:
    fio --time_based \
        --clocksource=clock_gettime \
        --rw=randread \
        --random_distribution=pareto:0.9 \
        --size=10g \
        --direct='1' \
        --ioengine=libaio \
        --filename=$DEV \
        --iodepth=$IODEPTH \
        --bs=$BS \
        --name=$NAME \
        --runtime=$RUNTIME >> $FILENAME
Each test was repeated at least three times.

While the results before the changes look coherent to me, after the
changes there are considerable peaks for iodepth = {4,8}.

[0] https://docs.google.com/spreadsheets/d/1n0lraodhF5jlNO-YWNTgl57Aoe3K5S7Qke8YQQkGCDQ/edit?usp=sharing
[1] https://docs.google.com/spreadsheets/d/1E6AMiB8ceJpExL6jWpH9u2yy6DZxzhmDUyFf-eUuJ0c/edit?usp=sharing
    - domU sheets
[2] https://github.com/paulina-szubarczyk/xen-benchmark
    - multitest_with_iodepth.sh


Thanks and regards, 
Paulina


* [PATCH 1/3] libs, gnttab, libxc: Interface for grant copy operation
From: Paulina Szubarczyk @ 2016-05-27 12:53 UTC (permalink / raw)
  To: xen-devel, roger.pau
  Cc: sstabellini, wei.liu2, George.Dunlap, Paulina Szubarczyk,
	ian.jackson, P.Gawkowski, anthony.perard

Implementation of an interface to the grant copy operation, called
through libxc. On Linux an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
system call is invoked. For mini-os the operation is not yet implemented.

* In the file "tools/include/xen-sys/Linux/gntdev.h" added:
  - 'struct ioctl_gntdev_grant_copy_segment'
    The structure is analogous to 'struct gntdev_grant_copy_segment'
    defined in the Linux code in include/uapi/xen/gntdev.h. Typedefs are
    replaced by their underlying types:
      typedef uint16_t domid_t;
      typedef uint32_t grant_ref_t;
    This leads to defining the domids array with type uint16_t in libs,
    differently than in the other functions concerning grant table
    operations in that library.

  - the macro #define IOCTL_GNTDEV_GRANT_COPY

  - 'struct ioctl_gntdev_grant_copy'
    taken from the Linux code as above; a structure aggregating
    'struct gntdev_grant_copy_segment' entries.

* In the file libs/gnttab/linux.c
  - function int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
                              uint32_t count,
                              uint16_t *domids, uint32_t *refs, void
                              **bufs, uint32_t *offset, uint32_t *len,
                              int type, uint32_t notify_offset,
                              evtchn_port_t notify_port)

    The function performs the grant copy operation. It allocates
    'ioctl_gntdev_grant_copy' and 'ioctl_gntdev_grant_copy_segment'
    structures, and the segments are filled from the passed values.

    When @type is non-zero the source of the copy is the guest domain's
    grant pages addressed by @refs and the destination is local memory
    accessed through @bufs; the operation flag is then set to
    'GNTCOPY_source_gref'. The direction is reversed when @type is zero.

    @offset is the offset within the page,
    @len is the amount of data to copy;
    @offset[i] + @len[i] must not exceed XEN_PAGE_SIZE
        - the condition is checked by the gntdev device.

    Notification is not yet implemented.
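
    For illustration, a caller-side sanity check corresponding to the
    @offset/@len constraint above could look like the sketch below (the
    gntdev device enforces the condition anyway; the 4096-byte page size
    is an assumption):

        #include <stdint.h>

        /* Sketch: returns 0 if every segment stays within one page. */
        static int segments_fit_in_page(const uint32_t *offset,
                                        const uint32_t *len, uint32_t count)
        {
            const uint32_t page_size = 4096;   /* XEN_PAGE_SIZE */
            uint32_t i;

            for (i = 0; i < count; i++)
                if (offset[i] + len[i] > page_size)
                    return -1;
            return 0;
        }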
---
 tools/include/xen-sys/Linux/gntdev.h  | 21 ++++++++++
 tools/libs/gnttab/gnttab_core.c       | 12 ++++++
 tools/libs/gnttab/include/xengnttab.h | 18 +++++++++
 tools/libs/gnttab/libxengnttab.map    |  2 +
 tools/libs/gnttab/linux.c             | 72 +++++++++++++++++++++++++++++++++++
 tools/libs/gnttab/minios.c            |  8 ++++
 tools/libs/gnttab/private.h           |  6 +++
 tools/libxc/include/xenctrl_compat.h  |  8 ++++
 tools/libxc/xc_gnttab_compat.c        | 12 ++++++
 9 files changed, 159 insertions(+)

diff --git a/tools/include/xen-sys/Linux/gntdev.h b/tools/include/xen-sys/Linux/gntdev.h
index caf6fb4..0ca07c9 100644
--- a/tools/include/xen-sys/Linux/gntdev.h
+++ b/tools/include/xen-sys/Linux/gntdev.h
@@ -147,4 +147,25 @@ struct ioctl_gntdev_unmap_notify {
 /* Send an interrupt on the indicated event channel */
 #define UNMAP_NOTIFY_SEND_EVENT 0x2
 
+struct ioctl_gntdev_grant_copy_segment {
+    union {
+        void *virt;
+        struct {
+            uint32_t ref;
+            uint16_t offset;
+            uint16_t domid;
+        } foreign;
+    } source, dest;
+    uint16_t len;
+    uint16_t flags;
+    int16_t status;
+};
+
+#define IOCTL_GNTDEV_GRANT_COPY \
+_IOC(_IOC_NONE, 'G', 8, sizeof(struct ioctl_gntdev_grant_copy))
+struct ioctl_gntdev_grant_copy {
+    unsigned int count;
+    struct ioctl_gntdev_grant_copy_segment *segments;
+};
+
 #endif /* __LINUX_PUBLIC_GNTDEV_H__ */
diff --git a/tools/libs/gnttab/gnttab_core.c b/tools/libs/gnttab/gnttab_core.c
index 5d0474d..1e014f8 100644
--- a/tools/libs/gnttab/gnttab_core.c
+++ b/tools/libs/gnttab/gnttab_core.c
@@ -113,6 +113,18 @@ int xengnttab_unmap(xengnttab_handle *xgt, void *start_address, uint32_t count)
     return osdep_gnttab_unmap(xgt, start_address, count);
 }
 
+int xengnttab_copy_grant(xengnttab_handle *xgt,
+                         uint32_t count,
+                         uint16_t *domids,
+                         uint32_t *refs,
+                         void **bufs,
+                         uint32_t *offset, 
+                         uint32_t *len,
+                         int type)
+{
+    return osdep_gnttab_grant_copy(xgt, count, domids, refs, bufs, offset, len, 
+                                   type, -1, -1);
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/gnttab/include/xengnttab.h b/tools/libs/gnttab/include/xengnttab.h
index 0431dcf..923e022 100644
--- a/tools/libs/gnttab/include/xengnttab.h
+++ b/tools/libs/gnttab/include/xengnttab.h
@@ -258,6 +258,24 @@ int xengnttab_unmap(xengnttab_handle *xgt, void *start_address, uint32_t count);
 int xengnttab_set_max_grants(xengnttab_handle *xgt,
                              uint32_t nr_grants);
 
+/**
+ * Copy memory from or to the domains defined in the domids array.
+ * When @type is non-zero, data is copied from the grant pages addressed
+ * by @refs to @bufs; the direction is reversed when @type is zero.
+ *
+ * @offset is the offset on the page 
+ * @len is the amount of data to copy 
+ * @offset[i] + @len[i] should not exceed XEN_PAGE_SIZE
+ */
+int xengnttab_copy_grant(xengnttab_handle *xgt,
+                         uint32_t count,
+                         uint16_t *domids,
+                         uint32_t *refs,
+                         void **bufs,
+                         uint32_t *offset, 
+                         uint32_t *len, 
+                         int type);
+
 /*
  * Grant Sharing Interface (allocating and granting pages to others)
  */
diff --git a/tools/libs/gnttab/libxengnttab.map b/tools/libs/gnttab/libxengnttab.map
index dc737ac..6a94102 100644
--- a/tools/libs/gnttab/libxengnttab.map
+++ b/tools/libs/gnttab/libxengnttab.map
@@ -12,6 +12,8 @@ VERS_1.0 {
 
 		xengnttab_unmap;
 
+		xengnttab_copy_grant;
+		
 		xengntshr_open;
 		xengntshr_close;
 
diff --git a/tools/libs/gnttab/linux.c b/tools/libs/gnttab/linux.c
index 7b0fba4..2b21a9f 100644
--- a/tools/libs/gnttab/linux.c
+++ b/tools/libs/gnttab/linux.c
@@ -235,6 +235,78 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
     return 0;
 }
 
+int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
+                              uint32_t count,
+                              uint16_t *domids, uint32_t *refs, void **bufs,
+                              uint32_t *offset, uint32_t *len, int type, 
+                              uint32_t notify_offset, evtchn_port_t notify_port)
+{
+    int fd = xgt->fd;
+    struct ioctl_gntdev_grant_copy *copy = NULL;
+    struct ioctl_gntdev_grant_copy_segment *seg = NULL;
+    int i, r = 0;
+
+    copy = malloc(sizeof(struct ioctl_gntdev_grant_copy));
+    if(!copy) {
+        r = -1; goto out;
+    }
+
+    seg = calloc(count, sizeof(struct ioctl_gntdev_grant_copy_segment));
+    if(!seg) {
+        r = -1; goto out;
+    }
+
+    copy->segments = seg;
+    copy->count = count;
+
+    for (i = 0; i < count; i++)
+    {
+        seg[i].len = len[i];
+        seg[i].status = 0;
+
+        if(type) 
+        {
+            seg[i].flags = GNTCOPY_source_gref;
+
+            seg[i].source.foreign.domid = domids[i];
+            seg[i].source.foreign.ref = refs[i];
+            seg[i].source.foreign.offset = offset[i];
+            seg[i].dest.virt = bufs[i];
+        } 
+        else
+        {
+            seg[i].flags = GNTCOPY_dest_gref;
+
+            seg[i].dest.foreign.domid = domids[i];
+            seg[i].dest.foreign.ref = refs[i];
+            seg[i].dest.foreign.offset = offset[i];
+            seg[i].source.virt = bufs[i];
+        }
+    }
+    
+    if (ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, copy)) {
+        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
+        r = -1; goto out;
+    }
+
+    for (i = 0; i < count; i++) {
+        if(seg[i].status != GNTST_okay) {
+            GTERROR(xgt->logger, "GRANT COPY failed for segment %d, "
+                    "with status %d\n", i, seg[i].status);
+        }
+    }
+
+    r = 0;
+out:
+    if(seg) 
+        free(seg);
+
+    if(copy)
+        free(copy);
+
+    return r;    
+}
+
 int osdep_gntshr_open(xengntshr_handle *xgs)
 {
     int fd = open(DEVXEN "gntalloc", O_RDWR);
diff --git a/tools/libs/gnttab/minios.c b/tools/libs/gnttab/minios.c
index 7e04174..8c90227 100644
--- a/tools/libs/gnttab/minios.c
+++ b/tools/libs/gnttab/minios.c
@@ -106,6 +106,14 @@ int osdep_gnttab_set_max_grants(xengnttab_handle *xgt, uint32_t count)
     return ret;
 }
 
+int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
+                            uint32_t count,
+                            uint16_t *domids, uint32_t *refs, void **bufs,
+                            uint32_t *mem, uint32_t *len, int type, 
+                            uint32_t notify_offset, evtchn_port_t notify_port)
+{
+    return -1;
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/gnttab/private.h b/tools/libs/gnttab/private.h
index d286c86..098de8b 100644
--- a/tools/libs/gnttab/private.h
+++ b/tools/libs/gnttab/private.h
@@ -23,6 +23,12 @@ void *osdep_gnttab_grant_map(xengnttab_handle *xgt,
 int osdep_gnttab_unmap(xengnttab_handle *xgt,
                        void *start_address,
                        uint32_t count);
+int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
+                            uint32_t count,
+                            uint16_t *domids, uint32_t *refs, void **bufs,
+                            uint32_t *offset, uint32_t *len, int type, 
+                            uint32_t notify_offset, evtchn_port_t notify_port);
+
 int osdep_gntshr_open(xengntshr_handle *xgs);
 int osdep_gntshr_close(xengntshr_handle *xgs);
 
diff --git a/tools/libxc/include/xenctrl_compat.h b/tools/libxc/include/xenctrl_compat.h
index 93ccadb..871d48d 100644
--- a/tools/libxc/include/xenctrl_compat.h
+++ b/tools/libxc/include/xenctrl_compat.h
@@ -104,6 +104,14 @@ int xc_gnttab_munmap(xc_gnttab *xcg,
                      uint32_t count);
 int xc_gnttab_set_max_grants(xc_gnttab *xcg,
                              uint32_t count);
+int xc_gnttab_copy_grant(xc_gnttab *xcg,
+                         uint32_t count,
+                         uint16_t *domids,
+                         uint32_t *refs,
+                         void **bufs,
+                         uint32_t *mem, 
+                         uint32_t *len,
+                         int type);
 
 typedef struct xengntdev_handle xc_gntshr;
 
diff --git a/tools/libxc/xc_gnttab_compat.c b/tools/libxc/xc_gnttab_compat.c
index 6f036d8..888cfc3 100644
--- a/tools/libxc/xc_gnttab_compat.c
+++ b/tools/libxc/xc_gnttab_compat.c
@@ -69,6 +69,18 @@ int xc_gnttab_set_max_grants(xc_gnttab *xcg,
     return xengnttab_set_max_grants(xcg, count);
 }
 
+int xc_gnttab_copy_grant(xc_gnttab *xcg,
+                         uint32_t count,
+                         uint16_t *domids,
+                         uint32_t *refs,
+                         void **bufs,
+                         uint32_t *mem, 
+                         uint32_t *len,
+                         int type)
+{
+    return xengnttab_copy_grant(xcg, count, domids, refs, bufs, mem, len, type);
+}
+
 xc_gntshr *xc_gntshr_open(xentoollog_logger *logger,
                           unsigned open_flags)
 {
-- 
1.9.1



* [PATCH 2/3] qemu-xen-dir/hw/block/xen_disk: Replace grant map by grant copy.
From: Paulina Szubarczyk @ 2016-05-27 12:53 UTC (permalink / raw)
  To: xen-devel, roger.pau
  Cc: sstabellini, wei.liu2, George.Dunlap, Paulina Szubarczyk,
	ian.jackson, P.Gawkowski, anthony.perard

The grant copy operation is split into two phases, which differ between
the 'read' and 'write' operations.

For a 'read' operation the flow is as follows:
1. allocate local buffers for all the segments contained in
   a request
2. fill the request io vectors with the buffers' addresses
3. invoke the read operation in the qemu device
4. on completion, call grant copy
5. free the buffers

The function 'ioreq_read_init' implements steps 1 and 2. It is called
instead of 'ioreq_map' in 'ioreq_runio_qemu_aio', which then continues
unchanged and performs step 3. Steps 4 and 5 are handled in the callback
function 'qemu_aio_complete'. The 'ioreq_read' function implements step 4
and calls the new function 'xc_gnttab_copy_grant' introduced in the first
patch of the series.

For a 'write' operation step 4 happens before step 2: data is first
copied from the calling guest domain and then qemu operates on it. The
'ioreq_write' function is added for that step.

The functions for the grant map operation are removed.
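
For a concrete sense of the per-segment arithmetic used by 'ioreq_write'
and 'ioreq_read', the standalone illustration below shows how offset and
length are derived from a segment (not patch code; it assumes the usual
512-byte sector size held in file_blk):

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t file_blk = 512;                 /* bytes per sector */
        uint8_t first_sect = 0, last_sect = 7;   /* hypothetical segment */

        uint32_t offset = first_sect * file_blk;
        uint32_t len = (last_sect - first_sect + 1) * file_blk;

        /* offset + len must not exceed the 4096-byte page */
        printf("offset=%u len=%u\n", offset, len);   /* offset=0 len=4096 */
        return 0;
    }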
---
 hw/block/xen_disk.c | 419 +++++++++++++++++++++-------------------------------
 1 file changed, 170 insertions(+), 249 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 37e14d1..3e5eefd 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -79,13 +79,12 @@ struct ioreq {
     int                 postsync;
     uint8_t             mapped;
 
-    /* grant mapping */
-    uint32_t            domids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    /* grant copy */
+    uint16_t            domids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
     uint32_t            refs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
     int                 prot;
     void                *page[BLKIF_MAX_SEGMENTS_PER_REQUEST];
     void                *pages;
-    int                 num_unmap;
 
     /* aio status */
     int                 aio_inflight;
@@ -123,13 +122,8 @@ struct XenBlkDev {
     int                 requests_inflight;
     int                 requests_finished;
 
-    /* Persistent grants extension */
+    /* */
     gboolean            feature_discard;
-    gboolean            feature_persistent;
-    GTree               *persistent_gnts;
-    GSList              *persistent_regions;
-    unsigned int        persistent_gnt_count;
-    unsigned int        max_grants;
 
     /* qemu block driver */
     DriveInfo           *dinfo;
@@ -164,46 +158,6 @@ static void ioreq_reset(struct ioreq *ioreq)
     qemu_iovec_reset(&ioreq->v);
 }
 
-static gint int_cmp(gconstpointer a, gconstpointer b, gpointer user_data)
-{
-    uint ua = GPOINTER_TO_UINT(a);
-    uint ub = GPOINTER_TO_UINT(b);
-    return (ua > ub) - (ua < ub);
-}
-
-static void destroy_grant(gpointer pgnt)
-{
-    PersistentGrant *grant = pgnt;
-    XenGnttab gnt = grant->blkdev->xendev.gnttabdev;
-
-    if (xc_gnttab_munmap(gnt, grant->page, 1) != 0) {
-        xen_be_printf(&grant->blkdev->xendev, 0,
-                      "xc_gnttab_munmap failed: %s\n",
-                      strerror(errno));
-    }
-    grant->blkdev->persistent_gnt_count--;
-    xen_be_printf(&grant->blkdev->xendev, 3,
-                  "unmapped grant %p\n", grant->page);
-    g_free(grant);
-}
-
-static void remove_persistent_region(gpointer data, gpointer dev)
-{
-    PersistentRegion *region = data;
-    struct XenBlkDev *blkdev = dev;
-    XenGnttab gnt = blkdev->xendev.gnttabdev;
-
-    if (xc_gnttab_munmap(gnt, region->addr, region->num) != 0) {
-        xen_be_printf(&blkdev->xendev, 0,
-                      "xc_gnttab_munmap region %p failed: %s\n",
-                      region->addr, strerror(errno));
-    }
-    xen_be_printf(&blkdev->xendev, 3,
-                  "unmapped grant region %p with %d pages\n",
-                  region->addr, region->num);
-    g_free(region);
-}
-
 static struct ioreq *ioreq_start(struct XenBlkDev *blkdev)
 {
     struct ioreq *ioreq = NULL;
@@ -314,7 +268,9 @@ static int ioreq_parse(struct ioreq *ioreq)
         ioreq->refs[i]   = ioreq->req.seg[i].gref;
 
         mem = ioreq->req.seg[i].first_sect * blkdev->file_blk;
-        len = (ioreq->req.seg[i].last_sect - ioreq->req.seg[i].first_sect + 1) * blkdev->file_blk;
+        len = (ioreq->req.seg[i].last_sect - ioreq->req.seg[i].first_sect + 1) 
+              * blkdev->file_blk;
+
         qemu_iovec_add(&ioreq->v, (void*)mem, len);
     }
     if (ioreq->start + ioreq->v.size > blkdev->file_size) {
@@ -328,175 +284,151 @@ err:
     return -1;
 }
 
-static void ioreq_unmap(struct ioreq *ioreq)
+static void* get_buffer(void) {
+    void *buf;
+
+    buf = mmap(NULL, 1 << XC_PAGE_SHIFT, PROT_READ | PROT_WRITE, 
+               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+
+    if (unlikely(buf == MAP_FAILED))
+        return NULL;
+
+    return buf;
+}
+
+static int free_buffer(void* buf) {
+    return munmap(buf, 1 << XC_PAGE_SHIFT);
+}
+
+static int free_buffers(void** page, int count) 
+{
+    int i, r = 0;
+
+    for (i = 0; i < count; i++) { 
+        
+        if(free_buffer(page[i])) 
+            r = 1;
+        
+        page[i] = NULL;
+    }
+
+    return r;
+}
+
+static int ioreq_write(struct ioreq *ioreq) 
 {
     XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
-    int i;
+    uint16_t domids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    uint32_t refs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    uint32_t offset[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    uint32_t len[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    void *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    int i, count = 0, rc, r;
+    /* Copy the data for a write operation from guest grant pages addressed
+     * by domids, refs, offset, len to local buffers.
+     *
+     * The buffers are then mapped to the pending request for further
+     * completion.
+     */
 
-    if (ioreq->num_unmap == 0 || ioreq->mapped == 0) {
-        return;
+    if (ioreq->v.niov == 0) {
+        r = 0; goto out;
     }
-    if (batch_maps) {
-        if (!ioreq->pages) {
-            return;
+
+    count = ioreq->v.niov;
+    for (i = 0; i < count; i++) {
+        domids[i] = ioreq->domids[i];
+        refs[i]   = ioreq->refs[i];
+        offset[i] = ioreq->req.seg[i].first_sect * ioreq->blkdev->file_blk;
+        len[i] = (ioreq->req.seg[i].last_sect - ioreq->req.seg[i].first_sect + 1) 
+                  * ioreq->blkdev->file_blk;
+        pages[i]  = get_buffer();
+
+        if(!pages[i]) {
+            xen_be_printf(&ioreq->blkdev->xendev, 0, 
+                          "failed to alloc page, errno %d \n", errno);
+            r = 1; goto out;
         }
-        if (xc_gnttab_munmap(gnt, ioreq->pages, ioreq->num_unmap) != 0) {
-            xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n",
-                          strerror(errno));
+    }
+    rc = xc_gnttab_copy_grant(gnt, count, domids, refs, pages, offset, len, 1);
+
+    if(rc) {
+        xen_be_printf(&ioreq->blkdev->xendev, 0, 
+                      "failed to copy data for write %d \n", rc);
+
+        if(free_buffers(ioreq->page, ioreq->v.niov)) {
+            xen_be_printf(&ioreq->blkdev->xendev, 0, 
+                          "failed to free page, errno %d \n", errno);
         }
-        ioreq->blkdev->cnt_map -= ioreq->num_unmap;
-        ioreq->pages = NULL;
-    } else {
-        for (i = 0; i < ioreq->num_unmap; i++) {
-            if (!ioreq->page[i]) {
-                continue;
-            }
-            if (xc_gnttab_munmap(gnt, ioreq->page[i], 1) != 0) {
-                xen_be_printf(&ioreq->blkdev->xendev, 0, "xc_gnttab_munmap failed: %s\n",
-                              strerror(errno));
-            }
-            ioreq->blkdev->cnt_map--;
-            ioreq->page[i] = NULL;
+        r = 1; goto out;
+    }
+
+    for (i = 0; i < ioreq->v.niov; i++) {
+        ioreq->page[i] = pages[i];
+        ioreq->v.iov[i].iov_base += (uintptr_t)pages[i];
+    }
+
+    r = 0;
+out:
+    return r;
+}
+
+static int ioreq_read_init(struct ioreq *ioreq) 
+{
+    int i;
+
+    if (ioreq->v.niov == 0) {
+        return 0;
+    }
+
+    for (i = 0; i < ioreq->v.niov; i++) {
+        ioreq->page[i] = get_buffer();
+        if(!ioreq->page[i]) {
+            return -1;
         }
+        ioreq->v.iov[i].iov_base += (uintptr_t)ioreq->page[i];
     }
-    ioreq->mapped = 0;
+
+    return 0;
 }
 
-static int ioreq_map(struct ioreq *ioreq)
+
+static int ioreq_read(struct ioreq *ioreq) 
 {
     XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
-    uint32_t domids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    uint16_t domids[BLKIF_MAX_SEGMENTS_PER_REQUEST];
     uint32_t refs[BLKIF_MAX_SEGMENTS_PER_REQUEST];
-    void *page[BLKIF_MAX_SEGMENTS_PER_REQUEST];
-    int i, j, new_maps = 0;
-    PersistentGrant *grant;
-    PersistentRegion *region;
-    /* domids and refs variables will contain the information necessary
-     * to map the grants that are needed to fulfill this request.
-     *
-     * After mapping the needed grants, the page array will contain the
-     * memory address of each granted page in the order specified in ioreq
-     * (disregarding if it's a persistent grant or not).
+    uint32_t offset[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    uint32_t len[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    void *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+    int i, count = 0, rc;
+
+    /* Copy the data from local buffers to guest grant pages addressed by
+     * domids, refs, offset on completion of the read operation.
+     */
 
-    if (ioreq->v.niov == 0 || ioreq->mapped == 1) {
+    if (ioreq->v.niov == 0) {
         return 0;
     }
-    if (ioreq->blkdev->feature_persistent) {
-        for (i = 0; i < ioreq->v.niov; i++) {
-            grant = g_tree_lookup(ioreq->blkdev->persistent_gnts,
-                                    GUINT_TO_POINTER(ioreq->refs[i]));
-
-            if (grant != NULL) {
-                page[i] = grant->page;
-                xen_be_printf(&ioreq->blkdev->xendev, 3,
-                              "using persistent-grant %" PRIu32 "\n",
-                              ioreq->refs[i]);
-            } else {
-                    /* Add the grant to the list of grants that
-                     * should be mapped
-                     */
-                    domids[new_maps] = ioreq->domids[i];
-                    refs[new_maps] = ioreq->refs[i];
-                    page[i] = NULL;
-                    new_maps++;
-            }
-        }
-        /* Set the protection to RW, since grants may be reused later
-         * with a different protection than the one needed for this request
-         */
-        ioreq->prot = PROT_WRITE | PROT_READ;
-    } else {
-        /* All grants in the request should be mapped */
-        memcpy(refs, ioreq->refs, sizeof(refs));
-        memcpy(domids, ioreq->domids, sizeof(domids));
-        memset(page, 0, sizeof(page));
-        new_maps = ioreq->v.niov;
-    }
 
-    if (batch_maps && new_maps) {
-        ioreq->pages = xc_gnttab_map_grant_refs
-            (gnt, new_maps, domids, refs, ioreq->prot);
-        if (ioreq->pages == NULL) {
-            xen_be_printf(&ioreq->blkdev->xendev, 0,
-                          "can't map %d grant refs (%s, %d maps)\n",
-                          new_maps, strerror(errno), ioreq->blkdev->cnt_map);
-            return -1;
-        }
-        for (i = 0, j = 0; i < ioreq->v.niov; i++) {
-            if (page[i] == NULL) {
-                page[i] = ioreq->pages + (j++) * XC_PAGE_SIZE;
-            }
-        }
-        ioreq->blkdev->cnt_map += new_maps;
-    } else if (new_maps)  {
-        for (i = 0; i < new_maps; i++) {
-            ioreq->page[i] = xc_gnttab_map_grant_ref
-                (gnt, domids[i], refs[i], ioreq->prot);
-            if (ioreq->page[i] == NULL) {
-                xen_be_printf(&ioreq->blkdev->xendev, 0,
-                              "can't map grant ref %d (%s, %d maps)\n",
-                              refs[i], strerror(errno), ioreq->blkdev->cnt_map);
-                ioreq->mapped = 1;
-                ioreq_unmap(ioreq);
-                return -1;
-            }
-            ioreq->blkdev->cnt_map++;
-        }
-        for (i = 0, j = 0; i < ioreq->v.niov; i++) {
-            if (page[i] == NULL) {
-                page[i] = ioreq->page[j++];
-            }
-        }
+    count = ioreq->v.niov;
+    for (i = 0; i < count; i++) {
+        domids[i] = ioreq->domids[i];
+        refs[i]   = ioreq->refs[i];
+        offset[i] = ioreq->req.seg[i].first_sect * ioreq->blkdev->file_blk;
+        len[i] = (ioreq->req.seg[i].last_sect - ioreq->req.seg[i].first_sect + 1) 
+                 * ioreq->blkdev->file_blk;
+        pages[i]   = ioreq->v.iov[i].iov_base;
     }
-    if (ioreq->blkdev->feature_persistent && new_maps != 0 &&
-        (!batch_maps || (ioreq->blkdev->persistent_gnt_count + new_maps <=
-        ioreq->blkdev->max_grants))) {
-        /*
-         * If we are using persistent grants and batch mappings only
-         * add the new maps to the list of persistent grants if the whole
-         * area can be persistently mapped.
-         */
-        if (batch_maps) {
-            region = g_malloc0(sizeof(*region));
-            region->addr = ioreq->pages;
-            region->num = new_maps;
-            ioreq->blkdev->persistent_regions = g_slist_append(
-                                            ioreq->blkdev->persistent_regions,
-                                            region);
-        }
-        while ((ioreq->blkdev->persistent_gnt_count < ioreq->blkdev->max_grants)
-              && new_maps) {
-            /* Go through the list of newly mapped grants and add as many
-             * as possible to the list of persistently mapped grants.
-             *
-             * Since we start at the end of ioreq->page(s), we only need
-             * to decrease new_maps to prevent this granted pages from
-             * being unmapped in ioreq_unmap.
-             */
-            grant = g_malloc0(sizeof(*grant));
-            new_maps--;
-            if (batch_maps) {
-                grant->page = ioreq->pages + (new_maps) * XC_PAGE_SIZE;
-            } else {
-                grant->page = ioreq->page[new_maps];
-            }
-            grant->blkdev = ioreq->blkdev;
-            xen_be_printf(&ioreq->blkdev->xendev, 3,
-                          "adding grant %" PRIu32 " page: %p\n",
-                          refs[new_maps], grant->page);
-            g_tree_insert(ioreq->blkdev->persistent_gnts,
-                          GUINT_TO_POINTER(refs[new_maps]),
-                          grant);
-            ioreq->blkdev->persistent_gnt_count++;
-        }
-        assert(!batch_maps || new_maps == 0);
-    }
-    for (i = 0; i < ioreq->v.niov; i++) {
-        ioreq->v.iov[i].iov_base += (uintptr_t)page[i];
+
+    rc = xc_gnttab_copy_grant(gnt, count, domids, refs, pages, offset, len, 0);
+    
+    if(rc) {
+        xen_be_printf(&ioreq->blkdev->xendev, 0, 
+                      "failed to copy data to guest %d \n", rc);
+        return -1;
     }
-    ioreq->mapped = 1;
-    ioreq->num_unmap = new_maps;
+
     return 0;
 }
 
@@ -529,7 +461,23 @@ static void qemu_aio_complete(void *opaque, int ret)
     }
 
     ioreq->status = ioreq->aio_errors ? BLKIF_RSP_ERROR : BLKIF_RSP_OKAY;
-    ioreq_unmap(ioreq);
+    
+    switch(ioreq->req.operation) {
+    case BLKIF_OP_READ: 
+        if(ioreq_read(ioreq)) {
+            xen_be_printf(&ioreq->blkdev->xendev, 0, 
+                          "failed to copy read data to guest\n");
+        }
+    case BLKIF_OP_WRITE:
+        if(free_buffers(ioreq->page, ioreq->v.niov)) {
+            xen_be_printf(&ioreq->blkdev->xendev, 0, 
+                          "failed to free page, errno %d \n", errno);
+        }
+        break;
+    default:
+        break;
+    }
+
     ioreq_finish(ioreq);
     switch (ioreq->req.operation) {
     case BLKIF_OP_WRITE:
@@ -551,8 +499,25 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
 {
     struct XenBlkDev *blkdev = ioreq->blkdev;
 
-    if (ioreq->req.nr_segments && ioreq_map(ioreq) == -1) {
-        goto err_no_map;
+    switch (ioreq->req.operation) {
+    case BLKIF_OP_READ:
+        if (ioreq_read_init(ioreq)) {
+            xen_be_printf(&ioreq->blkdev->xendev, 0, 
+                          "failed to initialize buffers for"
+                          "copy data to guest %d \n", errno);
+            goto out;
+        }
+            break;
+    case BLKIF_OP_WRITE:
+    case BLKIF_OP_FLUSH_DISKCACHE:
+        if (ioreq_write(ioreq)) {
+            xen_be_printf(&ioreq->blkdev->xendev, 0, 
+                          "failed to write data from guest\n");
+            goto out;
+        }
+        break;
+    default:
+        break;
     }
 
     ioreq->aio_inflight++;
@@ -594,16 +559,14 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq)
     }
     default:
         /* unknown operation (shouldn't happen -- parse catches this) */
-        goto err;
+        goto out;
     }
 
     qemu_aio_complete(ioreq, 0);
 
     return 0;
 
-err:
-    ioreq_unmap(ioreq);
-err_no_map:
+out:
     ioreq_finish(ioreq);
     ioreq->status = BLKIF_RSP_ERROR;
     return -1;
@@ -764,11 +727,6 @@ static void blk_alloc(struct XenDevice *xendev)
     if (xen_mode != XEN_EMULATE) {
         batch_maps = 1;
     }
-    if (xc_gnttab_set_max_grants(xendev->gnttabdev,
-            MAX_GRANTS(max_requests, BLKIF_MAX_SEGMENTS_PER_REQUEST)) < 0) {
-        xen_be_printf(xendev, 0, "xc_gnttab_set_max_grants failed: %s\n",
-                      strerror(errno));
-    }
 }
 
 static void blk_parse_discard(struct XenBlkDev *blkdev)
@@ -880,7 +838,7 @@ out_error:
 static int blk_connect(struct XenDevice *xendev)
 {
     struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
-    int pers, index, qflags;
+    int index, qflags;
     bool readonly = true;
 
     /* read-only ? */
@@ -958,11 +916,6 @@ static int blk_connect(struct XenDevice *xendev)
                              &blkdev->xendev.remote_port) == -1) {
         return -1;
     }
-    if (xenstore_read_fe_int(&blkdev->xendev, "feature-persistent", &pers)) {
-        blkdev->feature_persistent = FALSE;
-    } else {
-        blkdev->feature_persistent = !!pers;
-    }
 
     blkdev->protocol = BLKIF_PROTOCOL_NATIVE;
     if (blkdev->xendev.protocol) {
@@ -1006,18 +959,6 @@ static int blk_connect(struct XenDevice *xendev)
     }
     }
 
-    if (blkdev->feature_persistent) {
-        /* Init persistent grants */
-        blkdev->max_grants = max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST;
-        blkdev->persistent_gnts = g_tree_new_full((GCompareDataFunc)int_cmp,
-                                             NULL, NULL,
-                                             batch_maps ?
-                                             (GDestroyNotify)g_free :
-                                             (GDestroyNotify)destroy_grant);
-        blkdev->persistent_regions = NULL;
-        blkdev->persistent_gnt_count = 0;
-    }
-
     xen_be_bind_evtchn(&blkdev->xendev);
 
     xen_be_printf(&blkdev->xendev, 1, "ok: proto %s, ring-ref %d, "
@@ -1043,26 +984,6 @@ static void blk_disconnect(struct XenDevice *xendev)
         blkdev->cnt_map--;
         blkdev->sring = NULL;
     }
-
-    /*
-     * Unmap persistent grants before switching to the closed state
-     * so the frontend can free them.
-     *
-     * In the !batch_maps case g_tree_destroy will take care of unmapping
-     * the grant, but in the batch_maps case we need to iterate over every
-     * region in persistent_regions and unmap it.
-     */
-    if (blkdev->feature_persistent) {
-        g_tree_destroy(blkdev->persistent_gnts);
-        assert(batch_maps || blkdev->persistent_gnt_count == 0);
-        if (batch_maps) {
-            blkdev->persistent_gnt_count = 0;
-            g_slist_foreach(blkdev->persistent_regions,
-                            (GFunc)remove_persistent_region, blkdev);
-            g_slist_free(blkdev->persistent_regions);
-        }
-        blkdev->feature_persistent = false;
-    }
 }
 
 static int blk_free(struct XenDevice *xendev)
-- 
1.9.1



* [PATCH 3/3] qemu-xen-dir/hw/block: Cache local buffers used in grant copy
From: Paulina Szubarczyk @ 2016-05-27 12:53 UTC (permalink / raw)
  To: xen-devel, roger.pau
  Cc: sstabellini, wei.liu2, George.Dunlap, Paulina Szubarczyk,
	ian.jackson, P.Gawkowski, anthony.perard

If there are still pending requests the buffers are not freed but cached
in an array of size max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST.
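
To illustrate the idea, a minimal standalone sketch of such a page-buffer
cache follows (not the patch code; the slot count and error handling are
simplified):

    #include <stddef.h>
    #include <sys/mman.h>

    #define PAGE_SZ     4096
    #define CACHE_SLOTS 32      /* stands in for max_requests * segments */

    static void *cache[CACHE_SLOTS];
    static int cache_free;      /* number of cached, reusable pages */

    /* Hand out a cached page if available, otherwise mmap a new one. */
    static void *get_page(void)
    {
        void *buf;

        if (cache_free > 0)
            return cache[--cache_free];

        buf = mmap(NULL, PAGE_SZ, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        return buf == MAP_FAILED ? NULL : buf;
    }

    /* Return a page to the cache, or unmap it when the cache is full. */
    static void put_page(void *p)
    {
        if (cache_free < CACHE_SLOTS)
            cache[cache_free++] = p;
        else
            munmap(p, PAGE_SZ);
    }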
---
 hw/block/xen_disk.c | 59 ++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 47 insertions(+), 12 deletions(-)

diff --git a/hw/block/xen_disk.c b/hw/block/xen_disk.c
index 3e5eefd..ab1863b 100644
--- a/hw/block/xen_disk.c
+++ b/hw/block/xen_disk.c
@@ -125,6 +125,10 @@ struct XenBlkDev {
     /* */
     gboolean            feature_discard;
 
+    /* request buffer cache */
+    void                **buf_cache;
+    int                 buf_cache_free;
+
     /* qemu block driver */
     DriveInfo           *dinfo;
     BlockBackend        *blk;
@@ -284,11 +288,16 @@ err:
     return -1;
 }
 
-static void* get_buffer(void) {
+static void* get_buffer(struct XenBlkDev *blkdev) {
     void *buf;
 
-    buf = mmap(NULL, 1 << XC_PAGE_SHIFT, PROT_READ | PROT_WRITE, 
+    if(blkdev->buf_cache_free <= 0) {
+        buf = mmap(NULL, 1 << XC_PAGE_SHIFT, PROT_READ | PROT_WRITE, 
                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
+    } else {
+        blkdev->buf_cache_free--;
+        buf = blkdev->buf_cache[blkdev->buf_cache_free];
+    }
 
     if (unlikely(buf == MAP_FAILED))
         return NULL;
@@ -300,21 +309,40 @@ static int free_buffer(void* buf) {
     return munmap(buf, 1 << XC_PAGE_SHIFT);
 }
 
-static int free_buffers(void** page, int count) 
+static int free_buffers(void** page, int count, struct XenBlkDev *blkdev) 
 {
-    int i, r = 0;
+    int i, put_buf_cache = 0, r = 0;
+
+    if (blkdev->more_work && blkdev->requests_inflight < max_requests) {
+        put_buf_cache = max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST
+                        - blkdev->buf_cache_free;
+    }
 
     for (i = 0; i < count; i++) { 
-        
-        if(free_buffer(page[i])) 
-            r = 1;
-        
+        if(put_buf_cache > 0) {
+            blkdev->buf_cache[blkdev->buf_cache_free++] = page[i];
+            put_buf_cache--;
+        } else { 
+            if(free_buffer(page[i])) 
+                r = 1;
+        }
+
         page[i] = NULL;
     }
 
     return r;
 }
 
+static void free_buf_cache(struct XenBlkDev *blkdev) {
+    int i;
+    for(i = 0; i < blkdev->buf_cache_free; i++) {
+        free_buffer(blkdev->buf_cache[i]);
+    }
+
+    blkdev->buf_cache_free = 0;
+    free(blkdev->buf_cache);
+}
+
 static int ioreq_write(struct ioreq *ioreq) 
 {
     XenGnttab gnt = ioreq->blkdev->xendev.gnttabdev;
@@ -342,7 +370,7 @@ static int ioreq_write(struct ioreq *ioreq)
         offset[i] = ioreq->req.seg[i].first_sect * ioreq->blkdev->file_blk;
         len[i] = (ioreq->req.seg[i].last_sect - ioreq->req.seg[i].first_sect + 1) 
                   * ioreq->blkdev->file_blk;
-        pages[i]  = get_buffer();
+        pages[i]  = get_buffer(ioreq->blkdev);
 
         if(!pages[i]) {
             xen_be_printf(&ioreq->blkdev->xendev, 0, 
@@ -356,7 +384,7 @@ static int ioreq_write(struct ioreq *ioreq)
         xen_be_printf(&ioreq->blkdev->xendev, 0, 
                       "failed to copy data for write %d \n", rc);
 
-        if(free_buffers(ioreq->page, ioreq->v.niov)) {
+        if(free_buffers(ioreq->page, ioreq->v.niov, ioreq->blkdev)) {
             xen_be_printf(&ioreq->blkdev->xendev, 0, 
                           "failed to free page, errno %d \n", errno);
         }
@@ -382,7 +410,7 @@ static int ioreq_read_init(struct ioreq *ioreq)
     }
 
     for (i = 0; i < ioreq->v.niov; i++) {
-        ioreq->page[i] = get_buffer();
+        ioreq->page[i] = get_buffer(ioreq->blkdev);
         if(!ioreq->page[i]) {
             return -1;
         }
@@ -469,7 +497,7 @@ static void qemu_aio_complete(void *opaque, int ret)
                           "failed to copy read data to guest\n");
         }
     case BLKIF_OP_WRITE:
-        if(free_buffers(ioreq->page, ioreq->v.niov)) {
+        if(free_buffers(ioreq->page, ioreq->v.niov, ioreq->blkdev)) {
             xen_be_printf(&ioreq->blkdev->xendev, 0, 
                           "failed to free page, errno %d \n", errno);
         }
@@ -936,6 +964,11 @@ static int blk_connect(struct XenDevice *xendev)
     }
     blkdev->cnt_map++;
 
+    /* create buffer cache for grant copy operations*/
+    blkdev->buf_cache_free = 0;
+    blkdev->buf_cache = calloc(max_requests * BLKIF_MAX_SEGMENTS_PER_REQUEST, 
+                               sizeof(void *));
+
     switch (blkdev->protocol) {
     case BLKIF_PROTOCOL_NATIVE:
     {
@@ -972,6 +1005,8 @@ static void blk_disconnect(struct XenDevice *xendev)
 {
     struct XenBlkDev *blkdev = container_of(xendev, struct XenBlkDev, xendev);
 
+    free_buf_cache(blkdev);
+
     if (blkdev->blk) {
         blk_detach_dev(blkdev->blk, blkdev);
         blk_unref(blkdev->blk);
-- 
1.9.1


