* [Qemu-devel] [PATCH v3 0/3] Performance improvements for xen_disk^Wxen-block
@ 2018-12-12 11:16 Paul Durrant
2018-12-12 11:16 ` [Qemu-devel] [PATCH v3 1/3] xen-block: improve batching behaviour Paul Durrant
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: Paul Durrant @ 2018-12-12 11:16 UTC
To: qemu-devel, qemu-block, xen-devel
Cc: Paul Durrant, Anthony Perard, Kevin Wolf, Max Reitz,
Stefan Hajnoczi, Stefano Stabellini
This series is a re-base of Tim's v2 series [1] on top of my series [2].
[1] https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg00243.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02271.html
Tim Smith (3):
xen-block: improve batching behaviour
xen-block: improve response latency
xen-block: avoid repeated memory allocation
hw/block/dataplane/xen-block.c | 105 ++++++++++++++++++++++++-----------------
1 file changed, 62 insertions(+), 43 deletions(-)
---
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
--
2.1.4
* [Qemu-devel] [PATCH v3 1/3] xen-block: improve batching behaviour
2018-12-12 11:16 [Qemu-devel] [PATCH v3 0/3] Performance improvements for xen_disk^Wxen-block Paul Durrant
@ 2018-12-12 11:16 ` Paul Durrant
2018-12-12 11:16 ` [Qemu-devel] [PATCH v3 2/3] xen-block: improve response latency Paul Durrant
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: Paul Durrant @ 2018-12-12 11:16 UTC
To: qemu-devel, qemu-block, xen-devel
Cc: Tim Smith, Paul Durrant, Stefan Hajnoczi, Stefano Stabellini,
Anthony Perard, Kevin Wolf, Max Reitz
From: Tim Smith <tim.smith@citrix.com>
When I/O consists of many small requests, performance is improved by
batching them together in a single io_submit() call. When there are
relatively few requests, the extra overhead is not worth it. This
patch introduces a check to start batching I/O requests via
blk_io_plug()/blk_io_unplug(), with the batch size proportional to the
number of requests that were already in flight at the time we started
reading the ring.
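In outline, the plug/unplug pattern being applied looks like the sketch
below (illustrative only: submit_one() is a hypothetical stand-in for the
per-request submission done by xen_block_do_aio(), and the loop is
simplified relative to the real handler in the diff):

static void submit_with_batching(BlockBackend *blk, XenBlockRequest **reqs,
                                 int nr, int inflight_atstart)
{
    int batched = 0;
    bool plug = inflight_atstart > IO_PLUG_THRESHOLD;

    if (plug) {
        blk_io_plug(blk); /* queue submissions rather than issuing them */
    }
    for (int i = 0; i < nr; i++) {
        submit_one(reqs[i]); /* hypothetical per-request submit */
        if (plug && ++batched >= inflight_atstart) {
            blk_io_unplug(blk); /* flush the accumulated batch */
            blk_io_plug(blk);   /* and start a new one */
            batched = 0;
        }
    }
    if (plug) {
        blk_io_unplug(blk); /* flush whatever remains */
    }
}

Capping each batch at inflight_atstart makes the batch size adaptive:
the more requests were already queued below us, the larger the batches
may grow, up to max_requests.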
Signed-off-by: Tim Smith <tim.smith@citrix.com>
Re-based and commit comment adjusted.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
---
hw/block/dataplane/xen-block.c | 35 +++++++++++++++++++++++++++++++++++
1 file changed, 35 insertions(+)
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index 80df7da..db17ab5 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -528,10 +528,18 @@ static int xen_block_get_request(XenBlockDataPlane *dataplane,
return 0;
}
+/*
+ * Threshold of in-flight requests above which we will start using
+ * blk_io_plug()/blk_io_unplug() to batch requests.
+ */
+#define IO_PLUG_THRESHOLD 1
+
static void xen_block_handle_requests(XenBlockDataPlane *dataplane)
{
RING_IDX rc, rp;
XenBlockRequest *request;
+ int inflight_atstart = dataplane->requests_inflight;
+ int batched = 0;
dataplane->more_work = 0;
@@ -540,6 +548,18 @@ static void xen_block_handle_requests(XenBlockDataPlane *dataplane)
xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
xen_block_send_response_all(dataplane);
+ /*
+ * If there were more than IO_PLUG_THRESHOLD requests in flight
+ * when we got here, this is an indication that the bottleneck
+ * is below us, so it's worth beginning to batch up I/O requests
+ * rather than submitting them immediately. The maximum number
+ * of requests we're willing to batch is the number already in
+ * flight, so it can grow up to max_requests when the bottleneck
+ * is below us.
+ */
+ if (inflight_atstart > IO_PLUG_THRESHOLD) {
+ blk_io_plug(dataplane->blk);
+ }
while (rc != rp) {
/* pull request from ring */
if (RING_REQUEST_CONS_OVERFLOW(&dataplane->rings.common, rc)) {
@@ -585,7 +605,22 @@ static void xen_block_handle_requests(XenBlockDataPlane *dataplane)
continue;
}
+ if (inflight_atstart > IO_PLUG_THRESHOLD &&
+ batched >= inflight_atstart) {
+ blk_io_unplug(dataplane->blk);
+ }
xen_block_do_aio(request);
+ if (inflight_atstart > IO_PLUG_THRESHOLD) {
+ if (batched >= inflight_atstart) {
+ blk_io_plug(dataplane->blk);
+ batched = 0;
+ } else {
+ batched++;
+ }
+ }
+ }
+ if (inflight_atstart > IO_PLUG_THRESHOLD) {
+ blk_io_unplug(dataplane->blk);
}
if (dataplane->more_work &&
--
2.1.4
* [Qemu-devel] [PATCH v3 2/3] xen-block: improve response latency
2018-12-12 11:16 [Qemu-devel] [PATCH v3 0/3] Performance improvements for xen_disk^Wxen-block Paul Durrant
2018-12-12 11:16 ` [Qemu-devel] [PATCH v3 1/3] xen-block: improve batching behaviour Paul Durrant
@ 2018-12-12 11:16 ` Paul Durrant
2018-12-12 11:16 ` [Qemu-devel] [PATCH v3 3/3] xen-block: avoid repeated memory allocation Paul Durrant
2018-12-13 12:24 ` [Qemu-devel] [PATCH v3 0/3] Performance improvements for xen_disk^Wxen-block Anthony PERARD
3 siblings, 0 replies; 5+ messages in thread
From: Paul Durrant @ 2018-12-12 11:16 UTC
To: qemu-devel, qemu-block, xen-devel
Cc: Tim Smith, Paul Durrant, Stefan Hajnoczi, Stefano Stabellini,
Anthony Perard, Kevin Wolf, Max Reitz
From: Tim Smith <tim.smith@citrix.com>
If the I/O ring is full, the guest cannot send any more requests
until some responses are sent. Only sending all available responses
just before checking for new work does not leave much time for the
guest to supply new work, so this will cause stalls if the ring gets
full. Also, not completing reads as soon as possible adds latency
to the guest.
To alleviate that, complete I/O requests as soon as they come back.
xen_block_send_response() already returns a value indicating whether
a notify should be sent, which is all the batching we need.
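In outline, the new completion path looks like this (a simplified sketch
of what the diff below adds to xen_block_complete_aio(); the helper name
complete_and_respond() is hypothetical):

/* Called when the backing AIO finishes: respond to the guest at once. */
static void complete_and_respond(XenBlockRequest *request)
{
    XenBlockDataPlane *dataplane = request->dataplane;
    Error *local_err = NULL;

    if (xen_block_send_response(request)) {
        /* The ring macros tell us whether the frontend needs a kick. */
        xen_device_notify_event_channel(dataplane->xendev,
                                        dataplane->event_channel,
                                        &local_err);
        if (local_err) {
            error_report_err(local_err);
        }
    }
    xen_block_release_request(request); /* straight back to the free list */
}

Because each response is pushed as soon as its AIO completes, ring slots
free up immediately, and the 'finished' list together with its separate
accounting can be removed.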
Signed-off-by: Tim Smith <tim.smith@citrix.com>
Re-based and commit comment adjusted.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
---
hw/block/dataplane/xen-block.c | 56 ++++++++++++++----------------------------
1 file changed, 18 insertions(+), 38 deletions(-)
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index db17ab5..b4ff2e3 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -55,11 +55,9 @@ struct XenBlockDataPlane {
blkif_back_rings_t rings;
int more_work;
QLIST_HEAD(inflight_head, XenBlockRequest) inflight;
- QLIST_HEAD(finished_head, XenBlockRequest) finished;
QLIST_HEAD(freelist_head, XenBlockRequest) freelist;
int requests_total;
int requests_inflight;
- int requests_finished;
unsigned int max_requests;
BlockBackend *blk;
QEMUBH *bh;
@@ -116,12 +114,10 @@ static void xen_block_finish_request(XenBlockRequest *request)
XenBlockDataPlane *dataplane = request->dataplane;
QLIST_REMOVE(request, list);
- QLIST_INSERT_HEAD(&dataplane->finished, request, list);
dataplane->requests_inflight--;
- dataplane->requests_finished++;
}
-static void xen_block_release_request(XenBlockRequest *request, bool finish)
+static void xen_block_release_request(XenBlockRequest *request)
{
XenBlockDataPlane *dataplane = request->dataplane;
@@ -129,11 +125,7 @@ static void xen_block_release_request(XenBlockRequest *request, bool finish)
reset_request(request);
request->dataplane = dataplane;
QLIST_INSERT_HEAD(&dataplane->freelist, request, list);
- if (finish) {
- dataplane->requests_finished--;
- } else {
- dataplane->requests_inflight--;
- }
+ dataplane->requests_inflight--;
}
/*
@@ -248,6 +240,7 @@ static int xen_block_copy_request(XenBlockRequest *request)
}
static int xen_block_do_aio(XenBlockRequest *request);
+static int xen_block_send_response(XenBlockRequest *request);
static void xen_block_complete_aio(void *opaque, int ret)
{
@@ -312,6 +305,18 @@ static void xen_block_complete_aio(void *opaque, int ret)
default:
break;
}
+ if (xen_block_send_response(request)) {
+ Error *local_err = NULL;
+
+ xen_device_notify_event_channel(dataplane->xendev,
+ dataplane->event_channel,
+ &local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ }
+ }
+ xen_block_release_request(request);
+
qemu_bh_schedule(dataplane->bh);
done:
@@ -419,7 +424,7 @@ err:
return -1;
}
-static int xen_block_send_response_one(XenBlockRequest *request)
+static int xen_block_send_response(XenBlockRequest *request)
{
XenBlockDataPlane *dataplane = request->dataplane;
int send_notify = 0;
@@ -474,29 +479,6 @@ static int xen_block_send_response_one(XenBlockRequest *request)
return send_notify;
}
-/* walk finished list, send outstanding responses, free requests */
-static void xen_block_send_response_all(XenBlockDataPlane *dataplane)
-{
- XenBlockRequest *request;
- int send_notify = 0;
-
- while (!QLIST_EMPTY(&dataplane->finished)) {
- request = QLIST_FIRST(&dataplane->finished);
- send_notify += xen_block_send_response_one(request);
- xen_block_release_request(request, true);
- }
- if (send_notify) {
- Error *local_err = NULL;
-
- xen_device_notify_event_channel(dataplane->xendev,
- dataplane->event_channel,
- &local_err);
- if (local_err) {
- error_report_err(local_err);
- }
- }
-}
-
static int xen_block_get_request(XenBlockDataPlane *dataplane,
XenBlockRequest *request, RING_IDX rc)
{
@@ -547,7 +529,6 @@ static void xen_block_handle_requests(XenBlockDataPlane *dataplane)
rp = dataplane->rings.common.sring->req_prod;
xen_rmb(); /* Ensure we see queued requests up to 'rp'. */
- xen_block_send_response_all(dataplane);
/*
* If there were more than IO_PLUG_THRESHOLD requests in flight
* when we got here, this is an indication that the bottleneck
@@ -591,7 +572,7 @@ static void xen_block_handle_requests(XenBlockDataPlane *dataplane)
break;
};
- if (xen_block_send_response_one(request)) {
+ if (xen_block_send_response(request)) {
Error *local_err = NULL;
xen_device_notify_event_channel(dataplane->xendev,
@@ -601,7 +582,7 @@ static void xen_block_handle_requests(XenBlockDataPlane *dataplane)
error_report_err(local_err);
}
}
- xen_block_release_request(request, false);
+ xen_block_release_request(request);
continue;
}
@@ -657,7 +638,6 @@ XenBlockDataPlane *xen_block_dataplane_create(XenDevice *xendev,
dataplane->file_size = blk_getlength(dataplane->blk);
QLIST_INIT(&dataplane->inflight);
- QLIST_INIT(&dataplane->finished);
QLIST_INIT(&dataplane->freelist);
if (iothread) {
--
2.1.4
* [Qemu-devel] [PATCH v3 3/3] xen-block: avoid repeated memory allocation
2018-12-12 11:16 [Qemu-devel] [PATCH v3 0/3] Performance improvements for xen_disk^Wxen-block Paul Durrant
2018-12-12 11:16 ` [Qemu-devel] [PATCH v3 1/3] xen-block: improve batching behaviour Paul Durrant
2018-12-12 11:16 ` [Qemu-devel] [PATCH v3 2/3] xen-block: improve response latency Paul Durrant
@ 2018-12-12 11:16 ` Paul Durrant
2018-12-13 12:24 ` [Qemu-devel] [PATCH v3 0/3] Performance improvements for xen_disk^Wxen-block Anthony PERARD
3 siblings, 0 replies; 5+ messages in thread
From: Paul Durrant @ 2018-12-12 11:16 UTC
To: qemu-devel, qemu-block, xen-devel
Cc: Tim Smith, Paul Durrant, Stefan Hajnoczi, Stefano Stabellini,
Anthony Perard, Kevin Wolf, Max Reitz
From: Tim Smith <tim.smith@citrix.com>
The xen-block dataplane currently allocates memory to hold the data for
each request as that request is used, and frees it afterwards. Because
it requires page-aligned blocks, this interacts poorly with non-page-
aligned allocations and balloons the heap.
Instead, allocate the maximum buffer size the protocol can require,
which is BLKIF_MAX_SEGMENTS_PER_REQUEST (currently 11) pages, when the
request structure is created, and keep that buffer until the request is
destroyed. Since requests are re-used via a free list, this should
actually improve memory usage.
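For scale, an illustrative calculation (assuming the usual 4 KiB
XC_PAGE_SIZE on x86):

/*
 * Worst-case data buffer held per request slot:
 *
 *   BLKIF_MAX_SEGMENTS_PER_REQUEST * XC_PAGE_SIZE
 *     = 11 * 4096
 *     = 45056 bytes (44 KiB)
 *
 * allocated once with qemu_memalign() when the slot is created and
 * re-used for the life of the dataplane, rather than being allocated
 * and freed around every I/O.
 */

That is a fixed 44 KiB per entry on the free list, in exchange for
removing a page-aligned allocation/free pair from every request.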
Signed-off-by: Tim Smith <tim.smith@citrix.com>
Re-based and commit comment adjusted.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
---
hw/block/dataplane/xen-block.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/hw/block/dataplane/xen-block.c b/hw/block/dataplane/xen-block.c
index b4ff2e3..21804d7 100644
--- a/hw/block/dataplane/xen-block.c
+++ b/hw/block/dataplane/xen-block.c
@@ -70,7 +70,6 @@ static void reset_request(XenBlockRequest *request)
memset(&request->req, 0, sizeof(request->req));
request->status = 0;
request->start = 0;
- request->buf = NULL;
request->size = 0;
request->presync = 0;
@@ -95,6 +94,14 @@ static XenBlockRequest *xen_block_start_request(XenBlockDataPlane *dataplane)
/* allocate new struct */
request = g_malloc0(sizeof(*request));
request->dataplane = dataplane;
+ /*
+ * We cannot need more pages per request than this, and since we
+ * re-use requests, allocate the memory once here. It will be freed
+ * in xen_block_dataplane_destroy() when the request list is freed.
+ */
+ request->buf = qemu_memalign(XC_PAGE_SIZE,
+ BLKIF_MAX_SEGMENTS_PER_REQUEST *
+ XC_PAGE_SIZE);
dataplane->requests_total++;
qemu_iovec_init(&request->v, 1);
} else {
@@ -272,14 +279,12 @@ static void xen_block_complete_aio(void *opaque, int ret)
if (ret == 0) {
xen_block_copy_request(request);
}
- qemu_vfree(request->buf);
break;
case BLKIF_OP_WRITE:
case BLKIF_OP_FLUSH_DISKCACHE:
if (!request->req.nr_segments) {
break;
}
- qemu_vfree(request->buf);
break;
default:
break;
@@ -360,12 +365,10 @@ static int xen_block_do_aio(XenBlockRequest *request)
{
XenBlockDataPlane *dataplane = request->dataplane;
- request->buf = qemu_memalign(XC_PAGE_SIZE, request->size);
if (request->req.nr_segments &&
(request->req.operation == BLKIF_OP_WRITE ||
request->req.operation == BLKIF_OP_FLUSH_DISKCACHE) &&
xen_block_copy_request(request)) {
- qemu_vfree(request->buf);
goto err;
}
@@ -665,6 +668,7 @@ void xen_block_dataplane_destroy(XenBlockDataPlane *dataplane)
request = QLIST_FIRST(&dataplane->freelist);
QLIST_REMOVE(request, list);
qemu_iovec_destroy(&request->v);
+ qemu_vfree(request->buf);
g_free(request);
}
--
2.1.4
* Re: [Qemu-devel] [PATCH v3 0/3] Performance improvements for xen_disk^Wxen-block
2018-12-12 11:16 [Qemu-devel] [PATCH v3 0/3] Performance improvements for xen_disk^Wxen-block Paul Durrant
` (2 preceding siblings ...)
2018-12-12 11:16 ` [Qemu-devel] [PATCH v3 3/3] xen-block: avoid repeated memory allocation Paul Durrant
@ 2018-12-13 12:24 ` Anthony PERARD
3 siblings, 0 replies; 5+ messages in thread
From: Anthony PERARD @ 2018-12-13 12:24 UTC
To: Paul Durrant
Cc: qemu-devel, qemu-block, xen-devel, Kevin Wolf, Max Reitz,
Stefan Hajnoczi, Stefano Stabellini
On Wed, Dec 12, 2018 at 11:16:23AM +0000, Paul Durrant wrote:
> This series is a re-base of Tim's v2 series [1] on top of my series [2].
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg00243.html
> [2] https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02271.html
For the series:
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
And I've pushed that here:
https://xenbits.xen.org/gitweb/?p=people/aperard/qemu-dm.git;a=shortlog;h=refs/heads/xen-next
--
Anthony PERARD