* [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages
@ 2013-03-22 14:47 Orit Wasserman
  2013-03-22 14:47 ` [Qemu-devel] [PATCH v5 1/7] Add QemuFileWritevBuffer QemuFileOps Orit Wasserman
                   ` (7 more replies)
  0 siblings, 8 replies; 14+ messages in thread
From: Orit Wasserman @ 2013-03-22 14:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, quintela, Orit Wasserman, pbonzini, chegu_vinod

In migration, all data is copied into a static buffer in QEMUFile; this
hurts network bandwidth and CPU usage, especially with large guests.
We switched to an iovec for storing the different buffers to send (even a
byte-sized field is treated as a buffer), and we use sendmsg to transmit
the iovec. Adjacent iovecs are coalesced to create one bigger buffer
instead of many small ones.

Guest memory pages are no longer copied; they are queued with a new
function, qemu_put_buffer_async.
The page header data and device state data are still copied into the
static buffer. This data consists of many small byte and integer fields,
and the static buffer is used to batch them.
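
As a standalone sketch of the approach (hypothetical helper names, not the
code from this series): buffers are gathered into an iovec and transmitted
with a single sendmsg().

  #include <sys/socket.h>
  #include <sys/types.h>
  #include <sys/uio.h>

  /* Transmit several separately-owned buffers in one system call,
   * instead of first copying them into a staging buffer. */
  static ssize_t send_gathered(int sockfd, struct iovec *iov, int iovcnt)
  {
      struct msghdr msg = {
          .msg_iov    = iov,
          .msg_iovlen = iovcnt,
      };
      /* A real implementation must retry on short writes; QEMU's
       * iov_send() helper takes care of that. */
      return sendmsg(sockfd, &msg, 0);
  }

  /* Usage: a page header and the page itself go out together, with no
   * intermediate copy:
   *
   *   struct iovec iov[2] = {
   *       { .iov_base = header, .iov_len = header_len },
   *       { .iov_base = page,   .iov_len = page_len   },
   *   };
   *   send_gathered(fd, iov, 2);
   */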

git repository: git://github.com/oritwas/qemu.git sendv_v2

Changes from v4:
Return ssize_t from the writev_buffer op.
Address the rest of Eric's review comments.
Squash patch 8 (coalesce adjacent iovecs) into patch 4.

Changes from v3:
Use a variable for iov_size.
Change f->bytes_xfer += 1 to f->bytes_xfer++.
Remove the unneeded "More optimized qemu_put_be64/32/16" patch.
Move code to the right patch.
Rename qemu_put_buffer_no_copy to qemu_put_buffer_async.
Use a function for updating the iovec that detects adjacent iovecs and
coalesces them.

Changes from v2:
Always send the data in the iovec even if writev_buffer is not implemented.
Coalesce adjacent iovecs to create one big buffer from many small adjacent
buffers.

Changes from v1:
Use iov_send for sockets.
Make writev_buffer optional; if it is not implemented, fall back to put_buffer.

Future work: make the number of iovec entries configurable.

Orit Wasserman (7):
  Add QemuFileWritevBuffer QemuFileOps
  Add socket_writev_buffer function
  Update bytes_xfer in qemu_put_byte
  Store the data to send also in iovec
  Use writev ops if available
  Add qemu_put_buffer_async
  Use qemu_put_buffer_async for guest memory pages

 arch_init.c                   |   2 +-
 include/migration/qemu-file.h |  12 +++++
 savevm.c                      | 103 +++++++++++++++++++++++++++++++++++-------
 3 files changed, 100 insertions(+), 17 deletions(-)

-- 
1.7.11.7


* [Qemu-devel] [PATCH v5 1/7] Add QemuFileWritevBuffer QemuFileOps
  2013-03-22 14:47 [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Orit Wasserman
@ 2013-03-22 14:47 ` Orit Wasserman
  2013-03-22 14:47 ` [Qemu-devel] [PATCH v5 2/7] Add socket_writev_buffer function Orit Wasserman
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Orit Wasserman @ 2013-03-22 14:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, quintela, Orit Wasserman, pbonzini, chegu_vinod

This will allow us to write an iovec to the file.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 include/migration/qemu-file.h | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
index df81261..8b8070f 100644
--- a/include/migration/qemu-file.h
+++ b/include/migration/qemu-file.h
@@ -51,11 +51,18 @@ typedef int (QEMUFileCloseFunc)(void *opaque);
  */
 typedef int (QEMUFileGetFD)(void *opaque);
 
+/*
+ * This function writes an iovec to file.
+ */
+typedef ssize_t (QEMUFileWritevBufferFunc)(void *opaque, struct iovec *iov,
+                                           int iovcnt);
+
 typedef struct QEMUFileOps {
     QEMUFilePutBufferFunc *put_buffer;
     QEMUFileGetBufferFunc *get_buffer;
     QEMUFileCloseFunc *close;
     QEMUFileGetFD *get_fd;
+    QEMUFileWritevBufferFunc *writev_buffer;
 } QEMUFileOps;
 
 QEMUFile *qemu_fopen_ops(void *opaque, const QEMUFileOps *ops);
-- 
1.7.11.7


* [Qemu-devel] [PATCH v5 2/7] Add socket_writev_buffer function
  2013-03-22 14:47 [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Orit Wasserman
  2013-03-22 14:47 ` [Qemu-devel] [PATCH v5 1/7] Add QemuFileWritevBuffer QemuFileOps Orit Wasserman
@ 2013-03-22 14:47 ` Orit Wasserman
  2013-03-22 14:47 ` [Qemu-devel] [PATCH v5 3/7] Update bytes_xfer in qemu_put_byte Orit Wasserman
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Orit Wasserman @ 2013-03-22 14:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, quintela, Orit Wasserman, pbonzini, chegu_vinod

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/savevm.c b/savevm.c
index 35c8d1e..21140c4 100644
--- a/savevm.c
+++ b/savevm.c
@@ -39,6 +39,7 @@
 #include "qmp-commands.h"
 #include "trace.h"
 #include "qemu/bitops.h"
+#include "qemu/iov.h"
 
 #define SELF_ANNOUNCE_ROUNDS 5
 
@@ -171,6 +172,19 @@ static void coroutine_fn yield_until_fd_readable(int fd)
     qemu_coroutine_yield();
 }
 
+static ssize_t socket_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
+{
+    QEMUFileSocket *s = opaque;
+    ssize_t len;
+    ssize_t size = iov_size(iov, iovcnt);
+
+    len = iov_send(s->fd, iov, iovcnt, 0, size);
+    if (len < size) {
+        len = -socket_error();
+    }
+    return len;
+}
+
 static int socket_get_fd(void *opaque)
 {
     QEMUFileSocket *s = opaque;
@@ -387,6 +401,7 @@ static const QEMUFileOps socket_read_ops = {
 static const QEMUFileOps socket_write_ops = {
     .get_fd =     socket_get_fd,
     .put_buffer = socket_put_buffer,
+    .writev_buffer = socket_writev_buffer,
     .close =      socket_close
 };
 
-- 
1.7.11.7


* [Qemu-devel] [PATCH v5 3/7] Update bytes_xfer in qemu_put_byte
  2013-03-22 14:47 [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Orit Wasserman
  2013-03-22 14:47 ` [Qemu-devel] [PATCH v5 1/7] Add QemuFileWritevBuffer QemuFileOps Orit Wasserman
  2013-03-22 14:47 ` [Qemu-devel] [PATCH v5 2/7] Add socket_writev_buffer function Orit Wasserman
@ 2013-03-22 14:47 ` Orit Wasserman
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 4/7] Store the data to send also in iovec Orit Wasserman
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Orit Wasserman @ 2013-03-22 14:47 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, quintela, Orit Wasserman, pbonzini, chegu_vinod

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 savevm.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/savevm.c b/savevm.c
index 21140c4..2ba0d85 100644
--- a/savevm.c
+++ b/savevm.c
@@ -648,6 +648,8 @@ void qemu_put_byte(QEMUFile *f, int v)
 
     f->buf[f->buf_index++] = v;
     f->is_write = 1;
+    f->bytes_xfer++;
+
     if (f->buf_index >= IO_BUF_SIZE) {
         qemu_fflush(f);
     }
-- 
1.7.11.7


* [Qemu-devel] [PATCH v5 4/7] Store the data to send also in iovec
  2013-03-22 14:47 [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (2 preceding siblings ...)
  2013-03-22 14:47 ` [Qemu-devel] [PATCH v5 3/7] Update bytes_xfer in qemu_put_byte Orit Wasserman
@ 2013-03-22 14:48 ` Orit Wasserman
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 5/7] Use writev ops if available Orit Wasserman
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Orit Wasserman @ 2013-03-22 14:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, quintela, Orit Wasserman, pbonzini, chegu_vinod

All data is still copied into the static buffer.
Adjacent iovecs are coalesced so we send one big buffer
instead of many small buffers.
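
A self-contained illustration of the coalescing rule (a sketch only, not
part of the patch): two puts that land back to back in one buffer end up
as a single iovec entry.

  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <sys/uio.h>

  static struct iovec iov[4];
  static int iovcnt;

  /* Same rule as add_to_iovec below: extend the previous entry when the
   * new buffer starts exactly where the previous one ended. */
  static void add(const uint8_t *buf, size_t size)
  {
      if (iovcnt > 0 &&
          buf == (uint8_t *)iov[iovcnt - 1].iov_base + iov[iovcnt - 1].iov_len) {
          iov[iovcnt - 1].iov_len += size;
      } else {
          iov[iovcnt].iov_base = (void *)buf;
          iov[iovcnt++].iov_len = size;
      }
  }

  int main(void)
  {
      uint8_t buf[64];
      add(buf, 16);
      add(buf + 16, 16);  /* adjacent: first entry grows to 32 bytes */
      assert(iovcnt == 1 && iov[0].iov_len == 32);
      add(buf + 40, 8);   /* gap: a new entry is started */
      assert(iovcnt == 2 && iov[1].iov_len == 8);
      return 0;
  }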

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 savevm.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/savevm.c b/savevm.c
index 2ba0d85..8dad9ef 100644
--- a/savevm.c
+++ b/savevm.c
@@ -114,6 +114,7 @@ void qemu_announce_self(void)
 /* savevm/loadvm support */
 
 #define IO_BUF_SIZE 32768
+#define MAX_IOV_SIZE MIN(IOV_MAX, 64)
 
 struct QEMUFile {
     const QEMUFileOps *ops;
@@ -129,6 +130,9 @@ struct QEMUFile {
     int buf_size; /* 0 when writing */
     uint8_t buf[IO_BUF_SIZE];
 
+    struct iovec iov[MAX_IOV_SIZE];
+    unsigned int iovcnt;
+
     int last_error;
 };
 
@@ -528,6 +532,7 @@ static void qemu_fflush(QEMUFile *f)
             f->pos += f->buf_index;
         }
         f->buf_index = 0;
+        f->iovcnt = 0;
     }
     if (ret < 0) {
         qemu_file_set_error(f, ret);
@@ -601,6 +606,18 @@ int qemu_fclose(QEMUFile *f)
     return ret;
 }
 
+static void add_to_iovec(QEMUFile *f, const uint8_t *buf, int size)
+{
+    /* check for adjacent buffer and coalesce them */
+    if (f->iovcnt > 0 && buf == f->iov[f->iovcnt - 1].iov_base +
+        f->iov[f->iovcnt - 1].iov_len) {
+        f->iov[f->iovcnt - 1].iov_len += size;
+    } else {
+        f->iov[f->iovcnt].iov_base = (uint8_t *)buf;
+        f->iov[f->iovcnt++].iov_len = size;
+    }
+}
+
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
 {
     int l;
@@ -620,12 +637,13 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
         if (l > size)
             l = size;
         memcpy(f->buf + f->buf_index, buf, l);
+        add_to_iovec(f, f->buf + f->buf_index, l);
         f->is_write = 1;
         f->buf_index += l;
         f->bytes_xfer += l;
         buf += l;
         size -= l;
-        if (f->buf_index >= IO_BUF_SIZE) {
+        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
             qemu_fflush(f);
             if (qemu_file_get_error(f)) {
                 break;
@@ -650,7 +668,9 @@ void qemu_put_byte(QEMUFile *f, int v)
     f->is_write = 1;
     f->bytes_xfer++;
 
-    if (f->buf_index >= IO_BUF_SIZE) {
+    add_to_iovec(f, f->buf + (f->buf_index - 1), 1);
+
+    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
         qemu_fflush(f);
     }
 }
-- 
1.7.11.7


* [Qemu-devel] [PATCH v5 5/7] Use writev ops if available
  2013-03-22 14:47 [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (3 preceding siblings ...)
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 4/7] Store the data to send also in iovec Orit Wasserman
@ 2013-03-22 14:48 ` Orit Wasserman
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 6/7] Add qemu_put_buffer_async Orit Wasserman
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 14+ messages in thread
From: Orit Wasserman @ 2013-03-22 14:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, quintela, Orit Wasserman, pbonzini, chegu_vinod

Update qemu_fflush and stdio_fclose to use the writev_buffer op when it is
available, writing out the buffers stored in the iovec.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
---
 savevm.c | 31 +++++++++++++++++++++++--------
 1 file changed, 23 insertions(+), 8 deletions(-)

diff --git a/savevm.c b/savevm.c
index 8dad9ef..294434b 100644
--- a/savevm.c
+++ b/savevm.c
@@ -293,7 +293,7 @@ static int stdio_fclose(void *opaque)
     QEMUFileStdio *s = opaque;
     int ret = 0;
 
-    if (s->file->ops->put_buffer) {
+    if (s->file->ops->put_buffer || s->file->ops->writev_buffer) {
         int fd = fileno(s->stdio_file);
         struct stat st;
 
@@ -516,20 +516,35 @@ static void qemu_file_set_error(QEMUFile *f, int ret)
     }
 }
 
-/** Flushes QEMUFile buffer
+/**
+ * Flushes QEMUFile buffer
  *
+ * If there is writev_buffer QEMUFileOps it uses it otherwise uses
+ * put_buffer ops.
  */
 static void qemu_fflush(QEMUFile *f)
 {
-    int ret = 0;
+    ssize_t ret = 0;
+    int i = 0;
 
-    if (!f->ops->put_buffer) {
+    if (!f->ops->writev_buffer && !f->ops->put_buffer) {
         return;
     }
-    if (f->is_write && f->buf_index > 0) {
-        ret = f->ops->put_buffer(f->opaque, f->buf, f->pos, f->buf_index);
-        if (ret >= 0) {
-            f->pos += f->buf_index;
+
+    if (f->is_write && f->iovcnt > 0) {
+        if (f->ops->writev_buffer) {
+            ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt);
+            if (ret >= 0) {
+                f->pos += ret;
+            }
+        } else {
+            for (i = 0; i < f->iovcnt && ret >= 0; i++) {
+                ret = f->ops->put_buffer(f->opaque, f->iov[i].iov_base, f->pos,
+                                         f->iov[i].iov_len);
+                if (ret >= 0) {
+                    f->pos += ret;
+                }
+            }
         }
         f->buf_index = 0;
         f->iovcnt = 0;
-- 
1.7.11.7


* [Qemu-devel] [PATCH v5 6/7] Add qemu_put_buffer_async
  2013-03-22 14:47 [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (4 preceding siblings ...)
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 5/7] Use writev ops if available Orit Wasserman
@ 2013-03-22 14:48 ` Orit Wasserman
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages Orit Wasserman
  2013-03-27 21:27 ` [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Eric Blake
  7 siblings, 0 replies; 14+ messages in thread
From: Orit Wasserman @ 2013-03-22 14:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, quintela, Orit Wasserman, pbonzini, chegu_vinod

This allows us to add a buffer to the iovec to be sent without copying it
into the static buffer; the buffer is sent later, when qemu_fflush is
called.
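
A sketch of the resulting contract for callers (hypothetical helper,
assuming the declarations from include/migration/qemu-file.h in this
series): the async variant only records a pointer, so the buffer must stay
valid until the file is flushed.

  #include "migration/qemu-file.h"

  /* The page is only referenced, not copied: it must not be freed or
   * reused before qemu_fflush() has written it out. */
  static void put_page(QEMUFile *f, const uint8_t *hdr, int hdr_len,
                       const uint8_t *page, int page_len)
  {
      qemu_put_buffer(f, hdr, hdr_len);          /* small header: copied */
      qemu_put_buffer_async(f, page, page_len);  /* page: pointer queued */
  }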

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 include/migration/qemu-file.h |  5 +++++
 savevm.c                      | 34 ++++++++++++++++++++++++++--------
 2 files changed, 31 insertions(+), 8 deletions(-)

diff --git a/include/migration/qemu-file.h b/include/migration/qemu-file.h
index 8b8070f..623c434 100644
--- a/include/migration/qemu-file.h
+++ b/include/migration/qemu-file.h
@@ -75,6 +75,11 @@ int qemu_fclose(QEMUFile *f);
 int64_t qemu_ftell(QEMUFile *f);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);
+/*
+ * put_buffer without copying the buffer.
+ * The buffer should be available till it is sent asynchronously.
+ */
+void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, int size);
 
 static inline void qemu_put_ubyte(QEMUFile *f, unsigned int v)
 {
diff --git a/savevm.c b/savevm.c
index 294434b..63ab79b 100644
--- a/savevm.c
+++ b/savevm.c
@@ -633,6 +633,28 @@ static void add_to_iovec(QEMUFile *f, const uint8_t *buf, int size)
     }
 }
 
+void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, int size)
+{
+    if (f->last_error) {
+        return;
+    }
+
+    if (f->is_write == 0 && f->buf_index > 0) {
+        fprintf(stderr,
+                "Attempted to write to buffer while read buffer is not empty\n");
+        abort();
+    }
+
+    add_to_iovec(f, buf, size);
+
+    f->is_write = 1;
+    f->bytes_xfer += size;
+
+    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
+        qemu_fflush(f);
+    }
+}
+
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
 {
     int l;
@@ -652,18 +674,14 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
         if (l > size)
             l = size;
         memcpy(f->buf + f->buf_index, buf, l);
-        add_to_iovec(f, f->buf + f->buf_index, l);
         f->is_write = 1;
         f->buf_index += l;
-        f->bytes_xfer += l;
+        qemu_put_buffer_async(f, f->buf + (f->buf_index - l), l);
+        if (qemu_file_get_error(f)) {
+            break;
+        }
         buf += l;
         size -= l;
-        if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
-            qemu_fflush(f);
-            if (qemu_file_get_error(f)) {
-                break;
-            }
-        }
     }
 }
 
-- 
1.7.11.7


* [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages
  2013-03-22 14:47 [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (5 preceding siblings ...)
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 6/7] Add qemu_put_buffer_async Orit Wasserman
@ 2013-03-22 14:48 ` Orit Wasserman
  2013-04-05 13:44   ` Kevin Wolf
  2013-03-27 21:27 ` [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Eric Blake
  7 siblings, 1 reply; 14+ messages in thread
From: Orit Wasserman @ 2013-03-22 14:48 UTC (permalink / raw)
  To: qemu-devel; +Cc: mst, quintela, Orit Wasserman, pbonzini, chegu_vinod

This removes an unneeded copy of guest memory pages.
For the page header and device state we still copy the data into the
static buffer; the other option, allocating the memory on demand, is
more expensive.

Signed-off-by: Orit Wasserman <owasserm@redhat.com>
---
 arch_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index 98e2bc6..aa56a20 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -481,7 +481,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
             /* XBZRLE overflow or normal page */
             if (bytes_sent == -1) {
                 bytes_sent = save_block_hdr(f, block, offset, cont, RAM_SAVE_FLAG_PAGE);
-                qemu_put_buffer(f, p, TARGET_PAGE_SIZE);
+                qemu_put_buffer_async(f, p, TARGET_PAGE_SIZE);
                 bytes_sent += TARGET_PAGE_SIZE;
                 acct_info.norm_pages++;
             }
-- 
1.7.11.7


* Re: [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages
  2013-03-22 14:47 [Qemu-devel] [PATCH v5 0/7] Migration: Remove copying of guest ram pages Orit Wasserman
                   ` (6 preceding siblings ...)
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages Orit Wasserman
@ 2013-03-27 21:27 ` Eric Blake
  7 siblings, 0 replies; 14+ messages in thread
From: Eric Blake @ 2013-03-27 21:27 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: pbonzini, mst, chegu_vinod, qemu-devel, quintela


On 03/22/2013 08:47 AM, Orit Wasserman wrote:
> In migration, all data is copied into a static buffer in QEMUFile; this
> hurts network bandwidth and CPU usage, especially with large guests.
> We switched to an iovec for storing the different buffers to send (even a
> byte-sized field is treated as a buffer), and we use sendmsg to transmit
> the iovec. Adjacent iovecs are coalesced to create one bigger buffer
> instead of many small ones.
>
> Guest memory pages are no longer copied; they are queued with a new
> function, qemu_put_buffer_async.
> The page header data and device state data are still copied into the
> static buffer. This data consists of many small byte and integer fields,
> and the static buffer is used to batch them.
> 
> git repository: git://github.com/oritwas/qemu.git sendv_v2
> 
> Changes from v4:
> Return ssize_t from the writev_buffer op.
> Address the rest of Eric's review comments.
> Squash patch 8 (coalesce adjacent iovecs) into patch 4.

Series:
Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org




* Re: [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages
  2013-03-22 14:48 ` [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages Orit Wasserman
@ 2013-04-05 13:44   ` Kevin Wolf
  2013-04-05 15:23     ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Kevin Wolf @ 2013-04-05 13:44 UTC (permalink / raw)
  To: Orit Wasserman; +Cc: pbonzini, quintela, chegu_vinod, qemu-devel, mst

On 22.03.2013 15:48, Orit Wasserman wrote:
> This removes an unneeded copy of guest memory pages.
> For the page header and device state we still copy the data into the
> static buffer; the other option, allocating the memory on demand, is
> more expensive.
> 
> Signed-off-by: Orit Wasserman <owasserm@redhat.com>

This seems to have killed savevm performance. I noticed that
qemu-iotests case 007 took forever on my test box (882 seconds instead
of something like 10 seconds). It can be reproduced by this script:

export MALLOC_PERTURB_=11
qemu-img create -f qcow2 -o compat=1.1 test.qcow2 1M
time qemu-system-x86_64 -nographic -hda test.qcow2 -serial none -monitor stdio <<EOF
savevm test
quit
EOF

This used to take about 0.6s for me; after this patch it's around 10s.

Kevin


* Re: [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages
  2013-04-05 13:44   ` Kevin Wolf
@ 2013-04-05 15:23     ` Paolo Bonzini
  2013-04-05 15:39       ` Kevin Wolf
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2013-04-05 15:23 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Orit Wasserman, quintela, chegu_vinod, qemu-devel, mst


On 05/04/2013 15:44, Kevin Wolf wrote:
> This seems to have killed savevm performance. I noticed that
> qemu-iotests case 007 took forever on my test box (882 seconds instead
> of something like 10 seconds). It can be reproduced by this script:
> 
> export MALLOC_PERTURB_=11
> qemu-img create -f qcow2 -o compat=1.1 test.qcow2 1M
> time qemu-system-x86_64 -nographic -hda test.qcow2 -serial none -monitor stdio <<EOF
> savevm test
> quit
> EOF
> 
> This used to take about 0.6s for me; after this patch it's around 10s.

The solution could be to make bdrv_load_vmstate take an iov/iovcnt pair.

Alternatively, you can try the attached patch.  I haven't yet tested it
though, and won't be able to do so today.

Paolo

[-- Attachment #2: savevm-performance.patch --]
[-- Type: text/x-patch, Size: 3446 bytes --]

diff --git a/savevm.c b/savevm.c
index b1d8988..af99d64 100644
--- a/savevm.c
+++ b/savevm.c
@@ -525,27 +525,24 @@ static void qemu_file_set_error(QEMUFile *f, int ret)
 static void qemu_fflush(QEMUFile *f)
 {
     ssize_t ret = 0;
-    int i = 0;
 
     if (!f->ops->writev_buffer && !f->ops->put_buffer) {
         return;
     }
 
-    if (f->is_write && f->iovcnt > 0) {
+    if (f->is_write) {
         if (f->ops->writev_buffer) {
-            ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt);
-            if (ret >= 0) {
-                f->pos += ret;
+            if (f->iovcnt > 0) {
+                ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt);
             }
         } else {
-            for (i = 0; i < f->iovcnt && ret >= 0; i++) {
-                ret = f->ops->put_buffer(f->opaque, f->iov[i].iov_base, f->pos,
-                                         f->iov[i].iov_len);
-                if (ret >= 0) {
-                    f->pos += ret;
-                }
+            if (f->buf_index > 0) {
+                ret = f->ops->put_buffer(f->opaque, f->buf, f->pos, f->buf_index);
             }
         }
+        if (ret >= 0) {
+            f->pos += ret;
+        }
         f->buf_index = 0;
         f->iovcnt = 0;
     }
@@ -631,6 +628,11 @@ static void add_to_iovec(QEMUFile *f, const uint8_t *buf, int size)
         f->iov[f->iovcnt].iov_base = (uint8_t *)buf;
         f->iov[f->iovcnt++].iov_len = size;
     }
+
+    f->is_write = 1;
+    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
+        qemu_fflush(f);
+    }
 }
 
 void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, int size)
@@ -645,13 +647,11 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, int size)
         abort();
     }
 
-    add_to_iovec(f, buf, size);
-
-    f->is_write = 1;
-    f->bytes_xfer += size;
-
-    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
-        qemu_fflush(f);
+    if (f->ops->writev_buffer) {
+        f->bytes_xfer += size;
+        add_to_iovec(f, buf, size);
+    } else {
+        qemu_put_buffer(f, buf, size);
     }
 }
 
@@ -674,9 +674,17 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
         if (l > size)
             l = size;
         memcpy(f->buf + f->buf_index, buf, l);
-        f->is_write = 1;
-        f->buf_index += l;
-        qemu_put_buffer_async(f, f->buf + (f->buf_index - l), l);
+        f->bytes_xfer += size;
+        if (f->ops->writev_buffer) {
+            add_to_iovec(f, f->buf + f->buf_index, l);
+            f->buf_index += l;
+        } else {
+            f->is_write = 1;
+            f->buf_index += l;
+            if (f->buf_index == IO_BUF_SIZE) {
+                qemu_fflush(f);
+            }
+        }
         if (qemu_file_get_error(f)) {
             break;
         }
@@ -697,14 +705,16 @@ void qemu_put_byte(QEMUFile *f, int v)
         abort();
     }
 
-    f->buf[f->buf_index++] = v;
-    f->is_write = 1;
+    f->buf[f->buf_index] = v;
     f->bytes_xfer++;
-
-    add_to_iovec(f, f->buf + (f->buf_index - 1), 1);
-
-    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
-        qemu_fflush(f);
+    if (f->ops->writev_buffer) {
+        add_to_iovec(f, f->buf + f->buf_index, 1);
+        f->buf_index++;
+    } else {
+        f->buf_index++;
+        if (f->buf_index == IO_BUF_SIZE) {
+            qemu_fflush(f);
+        }
     }
 }
 


* Re: [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages
  2013-04-05 15:23     ` Paolo Bonzini
@ 2013-04-05 15:39       ` Kevin Wolf
  2013-04-05 15:42         ` Paolo Bonzini
  0 siblings, 1 reply; 14+ messages in thread
From: Kevin Wolf @ 2013-04-05 15:39 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Orit Wasserman, quintela, chegu_vinod, qemu-devel, mst

On 05.04.2013 17:23, Paolo Bonzini wrote:
> On 05/04/2013 15:44, Kevin Wolf wrote:
> > This seems to have killed savevm performance. I noticed that
> > qemu-iotests case 007 took forever on my test box (882 seconds instead
> > of something like 10 seconds). It can be reproduced by this script:
> > 
> > export MALLOC_PERTURB_=11
> > qemu-img create -f qcow2 -o compat=1.1 test.qcow2 1M
> > time qemu-system-x86_64 -nographic -hda test.qcow2 -serial none -monitor stdio <<EOF
> > savevm test
> > quit
> > EOF
> > 
> > This used to take about 0.6s for me; after this patch it's around 10s.
> 
> The solution could be to make bdrv_load_vmstate take an iov/iovcnt pair.

Ah, so you're saying that instead of linearising the buffer it breaks up
the requests into tiny pieces?

Implementing vectored bdrv_load/save_vmstate should be easy in theory.
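
Such an interface might look like this (hypothetical signatures modeled on
the existing bdrv_save_vmstate; not an API that exists today):

  /* Accept an iovec so each RAM page can be written through without
   * first being linearised into one contiguous buffer. */
  int bdrv_save_vmstate_iov(BlockDriverState *bs, const struct iovec *iov,
                            int iovcnt, int64_t pos);
  int bdrv_load_vmstate_iov(BlockDriverState *bs, struct iovec *iov,
                            int iovcnt, int64_t pos);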

> Alternatively, you can try the attached patch.  I haven't yet tested it
> though, and won't be able to do so today.

Attempted to write to buffer while read buffer is not empty

Program received signal SIGABRT, Aborted.

Kevin


* Re: [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages
  2013-04-05 15:39       ` Kevin Wolf
@ 2013-04-05 15:42         ` Paolo Bonzini
  2013-04-05 15:56           ` Kevin Wolf
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2013-04-05 15:42 UTC (permalink / raw)
  To: Kevin Wolf; +Cc: Orit Wasserman, quintela, chegu_vinod, qemu-devel, mst


On 05/04/2013 17:39, Kevin Wolf wrote:
>> > The solution could be to make bdrv_load_vmstate take an iov/iovcnt pair.
> Ah, so you're saying that instead of linearising the buffer it breaks up
> the requests in tiny pieces?

Only for RAM (header/page/header/page...), because the page comes
straight from the guest memory.  Device state is still buffered and fast.

> Implementing vectored bdrv_load/save_vmstate should be easy in theory.
> 
>> > Alternatively, you can try the attached patch.  I haven't yet tested it
>> > though, and won't be able to do so today.
> Attempted to write to buffer while read buffer is not empty
> 
> Program received signal SIGABRT, Aborted.

Second try.

Paolo

[-- Attachment #2: savevm-performance.patch --]
[-- Type: text/x-patch, Size: 3472 bytes --]

diff --git a/savevm.c b/savevm.c
index b1d8988..5871642 100644
--- a/savevm.c
+++ b/savevm.c
@@ -525,27 +525,24 @@ static void qemu_file_set_error(QEMUFile *f, int ret)
 static void qemu_fflush(QEMUFile *f)
 {
     ssize_t ret = 0;
-    int i = 0;
 
     if (!f->ops->writev_buffer && !f->ops->put_buffer) {
         return;
     }
 
-    if (f->is_write && f->iovcnt > 0) {
+    if (f->is_write) {
         if (f->ops->writev_buffer) {
-            ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt);
-            if (ret >= 0) {
-                f->pos += ret;
+            if (f->iovcnt > 0) {
+                ret = f->ops->writev_buffer(f->opaque, f->iov, f->iovcnt);
             }
         } else {
-            for (i = 0; i < f->iovcnt && ret >= 0; i++) {
-                ret = f->ops->put_buffer(f->opaque, f->iov[i].iov_base, f->pos,
-                                         f->iov[i].iov_len);
-                if (ret >= 0) {
-                    f->pos += ret;
-                }
+            if (f->buf_index > 0) {
+                ret = f->ops->put_buffer(f->opaque, f->buf, f->pos, f->buf_index);
             }
         }
+        if (ret >= 0) {
+            f->pos += ret;
+        }
         f->buf_index = 0;
         f->iovcnt = 0;
     }
@@ -631,6 +628,11 @@ static void add_to_iovec(QEMUFile *f, const uint8_t *buf, int size)
         f->iov[f->iovcnt].iov_base = (uint8_t *)buf;
         f->iov[f->iovcnt++].iov_len = size;
     }
+
+    f->is_write = 1;
+    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
+        qemu_fflush(f);
+    }
 }
 
 void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, int size)
@@ -645,13 +647,11 @@ void qemu_put_buffer_async(QEMUFile *f, const uint8_t *buf, int size)
         abort();
     }
 
-    add_to_iovec(f, buf, size);
-
-    f->is_write = 1;
-    f->bytes_xfer += size;
-
-    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
-        qemu_fflush(f);
+    if (f->ops->writev_buffer) {
+        f->bytes_xfer += size;
+        add_to_iovec(f, buf, size);
+    } else {
+        qemu_put_buffer(f, buf, size);
     }
 }
 
@@ -674,9 +674,17 @@ void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
         if (l > size)
             l = size;
         memcpy(f->buf + f->buf_index, buf, l);
-        f->is_write = 1;
-        f->buf_index += l;
-        qemu_put_buffer_async(f, f->buf + (f->buf_index - l), l);
+        f->bytes_xfer += size;
+        if (f->ops->writev_buffer) {
+            add_to_iovec(f, f->buf + f->buf_index, l);
+            f->buf_index += l;
+        } else {
+            f->is_write = 1;
+            f->buf_index += l;
+            if (f->buf_index == IO_BUF_SIZE) {
+                qemu_fflush(f);
+            }
+        }
         if (qemu_file_get_error(f)) {
             break;
         }
@@ -697,14 +705,17 @@ void qemu_put_byte(QEMUFile *f, int v)
         abort();
     }
 
-    f->buf[f->buf_index++] = v;
-    f->is_write = 1;
+    f->buf[f->buf_index] = v;
     f->bytes_xfer++;
-
-    add_to_iovec(f, f->buf + (f->buf_index - 1), 1);
-
-    if (f->buf_index >= IO_BUF_SIZE || f->iovcnt >= MAX_IOV_SIZE) {
-        qemu_fflush(f);
+    if (f->ops->writev_buffer) {
+        add_to_iovec(f, f->buf + f->buf_index, 1);
+        f->buf_index++;
+    } else {
+        f->is_write = 1;
+        f->buf_index++;
+        if (f->buf_index == IO_BUF_SIZE) {
+            qemu_fflush(f);
+        }
     }
 }
 


* Re: [Qemu-devel] [PATCH v5 7/7] Use qemu_put_buffer_async for guest memory pages
  2013-04-05 15:42         ` Paolo Bonzini
@ 2013-04-05 15:56           ` Kevin Wolf
  0 siblings, 0 replies; 14+ messages in thread
From: Kevin Wolf @ 2013-04-05 15:56 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Orit Wasserman, quintela, chegu_vinod, qemu-devel, mst

On 05.04.2013 17:42, Paolo Bonzini wrote:
> On 05/04/2013 17:39, Kevin Wolf wrote:
> >> > The solution could be to make bdrv_load_vmstate take an iov/iovcnt pair.
> > Ah, so you're saying that instead of linearising the buffer it breaks up
> > the requests in tiny pieces?
> 
> Only for RAM (header/page/header/page...), because the page comes
> straight from the guest memory.  Device state is still buffered and fast.

And quite small in comparison. ;-)

> > Implementing vectored bdrv_load/save_vmstate should be easy in theory.
> > 
> >> > Alternatively, you can try the attached patch.  I haven't yet tested it
> >> > though, and won't be able to do so today.
> > Attempted to write to buffer while read buffer is not empty
> > 
> > Program received signal SIGABRT, Aborted.
> 
> Second try.

Okay, this doesn't seem to crash any more. Now I'm at 2.5s instead of
10s, which is obviously much better, but still worse than the initial
0.6s. The rest of the performance regression seems to come from a
different patch, though, so I guess I should do another bisect.

Kevin

