qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/6] migration: followups for writev patches
@ 2013-03-27 16:36 Paolo Bonzini
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Paolo Bonzini @ 2013-03-27 16:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

This series fixes a few small problems in Orit's writev patches:

1) socket_put_buffer is not needed anymore and can be dropped (patch 1);

2) sendmsg could do a partial write even for a blocking socket, in
case it is interrupted by a signal. This should not happen for
migration, but it is relatively easy to fix in iov_send_recv (patches
2 to 5);

3) recent libvirt will always use fd migration, and thus will not
benefit from the writev speedups.  fd migration also uses FILE*, which
incurs an extra copy.  Patch 6 fixes both issues.

Thanks,

Paolo

Paolo Bonzini (6):
  qemu-file: drop socket_put_buffer
  iov: reorganize iov_send_recv, part 1
  iov: reorganize iov_send_recv, part 2
  iov: reorganize iov_send_recv, part 3
  iov: handle partial writes from sendmsg and recvmsg
  qemu-file: do not use stdio for qemu_fdopen

 savevm.c   | 113 ++++++++++++++++++++++++++++++++++++++++++++++++-------------
 util/iov.c | 104 ++++++++++++++++++++++++++++++++------------------------
 2 files changed, 148 insertions(+), 69 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer
  2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
@ 2013-03-27 16:36 ` Paolo Bonzini
  2013-04-09 13:20   ` Juan Quintela
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1 Paolo Bonzini
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2013-03-27 16:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

It is enough to implement one of socket_writev_buffer and
socket_put_buffer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 savevm.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/savevm.c b/savevm.c
index 406caa9..0415830 100644
--- a/savevm.c
+++ b/savevm.c
@@ -219,18 +219,6 @@ static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
     return len;
 }
 
-static int socket_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
-{
-    QEMUFileSocket *s = opaque;
-    ssize_t len;
-
-    len = qemu_send_full(s->fd, buf, size, 0);
-    if (len < size) {
-        len = -socket_error();
-    }
-    return len;
-}
-
 static int socket_close(void *opaque)
 {
     QEMUFileSocket *s = opaque;
@@ -404,7 +392,6 @@ static const QEMUFileOps socket_read_ops = {
 
 static const QEMUFileOps socket_write_ops = {
     .get_fd =     socket_get_fd,
-    .put_buffer = socket_put_buffer,
     .writev_buffer = socket_writev_buffer,
     .close =      socket_close
 };
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1
  2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
@ 2013-03-27 16:36 ` Paolo Bonzini
  2013-04-09 13:18   ` Juan Quintela
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2 Paolo Bonzini
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2013-03-27 16:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

Once the initial part of the iov is dropped, it is not used anymore.
Modify iov/iovcnt directly instead of adjusting them with the "si"
variable.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/iov.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index 9dae318..adb9a70 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -159,16 +159,22 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
     for (si = 0; si < iov_cnt && offset >= iov[si].iov_len; ++si) {
         offset -= iov[si].iov_len;
     }
+
+    /* si == iov_cnt would only be valid if bytes == 0, which
+     * we already ruled out above.  */
+    assert(si < iov_cnt);
+    iov += si;
+    iov_cnt -= si;
+
     if (offset) {
-        assert(si < iov_cnt);
         /* second, skip `offset' bytes from the (now) first element,
          * undo it on exit */
-        iov[si].iov_base += offset;
-        iov[si].iov_len -= offset;
+        iov[0].iov_base += offset;
+        iov[0].iov_len -= offset;
     }
     /* Find the end position skipping `bytes' bytes: */
     /* first, skip all full-sized elements */
-    for (ei = si; ei < iov_cnt && iov[ei].iov_len <= bytes; ++ei) {
+    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= bytes; ++ei) {
         bytes -= iov[ei].iov_len;
     }
     if (bytes) {
@@ -183,12 +189,12 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
         ++ei;
     }
 
-    ret = do_send_recv(sockfd, iov + si, ei - si, do_send);
+    ret = do_send_recv(sockfd, iov, ei, do_send);
 
     /* Undo the changes above */
     if (offset) {
-        iov[si].iov_base -= offset;
-        iov[si].iov_len += offset;
+        iov[0].iov_base -= offset;
+        iov[0].iov_len += offset;
     }
     if (bytes) {
         iov[ei-1].iov_len += bytes;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2
  2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1 Paolo Bonzini
@ 2013-03-27 16:36 ` Paolo Bonzini
  2013-04-09 13:23   ` Juan Quintela
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3 Paolo Bonzini
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2013-03-27 16:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

Do not touch the "bytes" argument anymore.  Instead, remember the
original length of the last iovec if we touch it, and restore it
afterwards.

This requires undoing the changes in opposite order.  The previous
algorithm didn't care.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/iov.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index adb9a70..110d18e 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -145,7 +145,9 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
                       bool do_send)
 {
     ssize_t ret;
+    size_t orig_len, tail;
     unsigned si, ei;            /* start and end indexes */
+
     if (bytes == 0) {
         /* Catch the do-nothing case early, as otherwise we will pass an
          * empty iovec to sendmsg/recvmsg(), and not all implementations
@@ -174,31 +176,29 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
     }
     /* Find the end position skipping `bytes' bytes: */
     /* first, skip all full-sized elements */
-    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= bytes; ++ei) {
-        bytes -= iov[ei].iov_len;
+    tail = bytes;
+    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= tail; ++ei) {
+        tail -= iov[ei].iov_len;
     }
-    if (bytes) {
-        /* second, fixup the last element, and remember
-         * the length we've cut from the end of it in `bytes' */
-        size_t tail;
+    if (tail) {
+        /* second, fixup the last element, and remember the original
+         * length */
         assert(ei < iov_cnt);
-        assert(iov[ei].iov_len > bytes);
-        tail = iov[ei].iov_len - bytes;
-        iov[ei].iov_len = bytes;
-        bytes = tail;  /* bytes is now equal to the tail size */
-        ++ei;
+        assert(iov[ei].iov_len > tail);
+        orig_len = iov[ei].iov_len;
+        iov[ei++].iov_len = tail;
     }
 
     ret = do_send_recv(sockfd, iov, ei, do_send);
 
     /* Undo the changes above */
+    if (tail) {
+        iov[ei-1].iov_len = orig_len;
+    }
     if (offset) {
         iov[0].iov_base -= offset;
         iov[0].iov_len += offset;
     }
-    if (bytes) {
-        iov[ei-1].iov_len += bytes;
-    }
 
     return ret;
 }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3
  2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
                   ` (2 preceding siblings ...)
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2 Paolo Bonzini
@ 2013-03-27 16:36 ` Paolo Bonzini
  2013-04-09 13:24   ` Juan Quintela
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2013-03-27 16:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

"si" and "ei" are merged in a single variable.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/iov.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index 110d18e..f14ff0b 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -146,7 +146,7 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
 {
     ssize_t ret;
     size_t orig_len, tail;
-    unsigned si, ei;            /* start and end indexes */
+    unsigned niov;
 
     if (bytes == 0) {
         /* Catch the do-nothing case early, as otherwise we will pass an
@@ -158,15 +158,15 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
 
     /* Find the start position, skipping `offset' bytes:
      * first, skip all full-sized vector elements, */
-    for (si = 0; si < iov_cnt && offset >= iov[si].iov_len; ++si) {
-        offset -= iov[si].iov_len;
+    for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
+        offset -= iov[niov].iov_len;
     }
 
-    /* si == iov_cnt would only be valid if bytes == 0, which
+    /* niov == iov_cnt would only be valid if bytes == 0, which
      * we already ruled out above.  */
-    assert(si < iov_cnt);
-    iov += si;
-    iov_cnt -= si;
+    assert(niov < iov_cnt);
+    iov += niov;
+    iov_cnt -= niov;
 
     if (offset) {
         /* second, skip `offset' bytes from the (now) first element,
@@ -177,23 +177,23 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
     /* Find the end position skipping `bytes' bytes: */
     /* first, skip all full-sized elements */
     tail = bytes;
-    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= tail; ++ei) {
-        tail -= iov[ei].iov_len;
+    for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
+        tail -= iov[niov].iov_len;
     }
     if (tail) {
         /* second, fixup the last element, and remember the original
          * length */
-        assert(ei < iov_cnt);
-        assert(iov[ei].iov_len > tail);
-        orig_len = iov[ei].iov_len;
-        iov[ei++].iov_len = tail;
+        assert(niov < iov_cnt);
+        assert(iov[niov].iov_len > tail);
+        orig_len = iov[niov].iov_len;
+        iov[niov++].iov_len = tail;
     }
 
-    ret = do_send_recv(sockfd, iov, ei, do_send);
+    ret = do_send_recv(sockfd, iov, niov, do_send);
 
     /* Undo the changes above */
     if (tail) {
-        iov[ei-1].iov_len = orig_len;
+        iov[niov-1].iov_len = orig_len;
     }
     if (offset) {
         iov[0].iov_base -= offset;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg
  2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
                   ` (3 preceding siblings ...)
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3 Paolo Bonzini
@ 2013-03-27 16:36 ` Paolo Bonzini
  2013-04-09 13:27   ` Juan Quintela
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
  2013-04-03  8:08 ` [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Orit Wasserman
  6 siblings, 1 reply; 17+ messages in thread
From: Paolo Bonzini @ 2013-03-27 16:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

Partial writes can still happen in sendmsg and recvmsg, if a
signal is received in the middle of a write.  To handle this,
retry the operation with a new offset/bytes pair.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
        I suggest using "git show -b" to review this one...

 util/iov.c | 102 +++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 55 insertions(+), 47 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index f14ff0b..0ecfc4c 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -144,63 +144,71 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
                       size_t offset, size_t bytes,
                       bool do_send)
 {
+    ssize_t total = 0;
     ssize_t ret;
     size_t orig_len, tail;
     unsigned niov;
 
-    if (bytes == 0) {
-        /* Catch the do-nothing case early, as otherwise we will pass an
-         * empty iovec to sendmsg/recvmsg(), and not all implementations
-         * accept this.
-         */
-        return 0;
-    }
-
-    /* Find the start position, skipping `offset' bytes:
-     * first, skip all full-sized vector elements, */
-    for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
-        offset -= iov[niov].iov_len;
-    }
+    while (bytes > 0) {
+        /* Find the start position, skipping `offset' bytes:
+         * first, skip all full-sized vector elements, */
+        for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
+            offset -= iov[niov].iov_len;
+        }
 
-    /* niov == iov_cnt would only be valid if bytes == 0, which
-     * we already ruled out above.  */
-    assert(niov < iov_cnt);
-    iov += niov;
-    iov_cnt -= niov;
-
-    if (offset) {
-        /* second, skip `offset' bytes from the (now) first element,
-         * undo it on exit */
-        iov[0].iov_base += offset;
-        iov[0].iov_len -= offset;
-    }
-    /* Find the end position skipping `bytes' bytes: */
-    /* first, skip all full-sized elements */
-    tail = bytes;
-    for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
-        tail -= iov[niov].iov_len;
-    }
-    if (tail) {
-        /* second, fixup the last element, and remember the original
-         * length */
+        /* niov == iov_cnt would only be valid if bytes == 0, which
+         * we already ruled out in the loop condition.  */
         assert(niov < iov_cnt);
-        assert(iov[niov].iov_len > tail);
-        orig_len = iov[niov].iov_len;
-        iov[niov++].iov_len = tail;
-    }
+        iov += niov;
+        iov_cnt -= niov;
+
+        if (offset) {
+            /* second, skip `offset' bytes from the (now) first element,
+             * undo it on exit */
+            iov[0].iov_base += offset;
+            iov[0].iov_len -= offset;
+        }
+        /* Find the end position skipping `bytes' bytes: */
+        /* first, skip all full-sized elements */
+        tail = bytes;
+        for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
+            tail -= iov[niov].iov_len;
+        }
+        if (tail) {
+            /* second, fixup the last element, and remember the original
+             * length */
+            assert(niov < iov_cnt);
+            assert(iov[niov].iov_len > tail);
+            orig_len = iov[niov].iov_len;
+            iov[niov++].iov_len = tail;
+        }
 
-    ret = do_send_recv(sockfd, iov, niov, do_send);
+        ret = do_send_recv(sockfd, iov, niov, do_send);
 
-    /* Undo the changes above */
-    if (tail) {
-        iov[niov-1].iov_len = orig_len;
-    }
-    if (offset) {
-        iov[0].iov_base -= offset;
-        iov[0].iov_len += offset;
+        /* Undo the changes above before checking for errors */
+        if (tail) {
+            iov[niov-1].iov_len = orig_len;
+        }
+        if (offset) {
+            iov[0].iov_base -= offset;
+            iov[0].iov_len += offset;
+        }
+
+        if (ret < 0) {
+            assert(errno != EINTR);
+            if (errno == EAGAIN && total > 0) {
+                return total;
+            }
+            return -1;
+        }
+
+        /* Prepare for the next iteration */
+        offset += ret;
+        total += ret;
+        bytes -= ret;
     }
 
-    return ret;
+    return total;
 }
 
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen
  2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
                   ` (4 preceding siblings ...)
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
@ 2013-03-27 16:36 ` Paolo Bonzini
  2013-04-09 13:30   ` Juan Quintela
  2013-04-16 21:09   ` Anthony Liguori
  2013-04-03  8:08 ` [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Orit Wasserman
  6 siblings, 2 replies; 17+ messages in thread
From: Paolo Bonzini @ 2013-03-27 16:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

This uses system calls directly for Unix file descriptors, so that the
efficient writev_buffer can be used.  Pay attention to the possibility
of partial writes in writev.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 savevm.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 89 insertions(+), 11 deletions(-)

diff --git a/savevm.c b/savevm.c
index 0415830..8eb5aab 100644
--- a/savevm.c
+++ b/savevm.c
@@ -356,9 +356,93 @@ static const QEMUFileOps stdio_file_write_ops = {
     .close =      stdio_fclose
 };
 
+static ssize_t unix_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
+{
+    QEMUFileSocket *s = opaque;
+    ssize_t len, offset;
+    ssize_t size = iov_size(iov, iovcnt);
+    ssize_t total = 0;
+
+    assert(iovcnt > 0);
+    offset = 0;
+    while (size > 0) {
+        /* Find the next start position; skip all full-sized vector elements  */
+        while (offset >= iov[0].iov_len) {
+            offset -= iov[0].iov_len;
+            iov++, iovcnt--;
+        }
+
+        /* skip `offset' bytes from the (now) first element, undo it on exit */
+        assert(iovcnt > 0);
+        iov[0].iov_base += offset;
+        iov[0].iov_len -= offset;
+
+        do {
+            len = writev(s->fd, iov, iovcnt);
+        } while (len == -1 && errno == EINTR);
+        if (len == -1) {
+            return -errno;
+        }
+
+        /* Undo the changes above */
+        iov[0].iov_base -= offset;
+        iov[0].iov_len += offset;
+
+        /* Prepare for the next iteration */
+        offset += len;
+        total += len;
+        size -= len;
+    }
+
+    return total;
+}
+
+static int unix_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
+{
+    QEMUFileSocket *s = opaque;
+    ssize_t len;
+
+    for (;;) {
+        len = read(s->fd, buf, size);
+        if (len != -1) {
+            break;
+        }
+        if (errno == EAGAIN) {
+            yield_until_fd_readable(s->fd);
+        } else if (errno != EINTR) {
+            break;
+        }
+    }
+
+    if (len == -1) {
+        len = -errno;
+    }
+    return len;
+}
+
+static int unix_close(void *opaque)
+{
+    QEMUFileSocket *s = opaque;
+    close(s->fd);
+    g_free(s);
+    return 0;
+}
+
+static const QEMUFileOps unix_read_ops = {
+    .get_fd =     socket_get_fd,
+    .get_buffer = unix_get_buffer,
+    .close =      unix_close
+};
+
+static const QEMUFileOps unix_write_ops = {
+    .get_fd =     socket_get_fd,
+    .writev_buffer = unix_writev_buffer,
+    .close =      unix_close
+};
+
 QEMUFile *qemu_fdopen(int fd, const char *mode)
 {
-    QEMUFileStdio *s;
+    QEMUFileSocket *s;
 
     if (mode == NULL ||
 	(mode[0] != 'r' && mode[0] != 'w') ||
@@ -367,21 +451,15 @@ QEMUFile *qemu_fdopen(int fd, const char *mode)
         return NULL;
     }
 
-    s = g_malloc0(sizeof(QEMUFileStdio));
-    s->stdio_file = fdopen(fd, mode);
-    if (!s->stdio_file)
-        goto fail;
+    s = g_malloc0(sizeof(QEMUFileSocket));
+    s->fd = fd;
 
     if(mode[0] == 'r') {
-        s->file = qemu_fopen_ops(s, &stdio_file_read_ops);
+        s->file = qemu_fopen_ops(s, &unix_read_ops);
     } else {
-        s->file = qemu_fopen_ops(s, &stdio_file_write_ops);
+        s->file = qemu_fopen_ops(s, &unix_write_ops);
     }
     return s->file;
-
-fail:
-    g_free(s);
-    return NULL;
 }
 
 static const QEMUFileOps socket_read_ops = {
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] migration: followups for writev patches
  2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
                   ` (5 preceding siblings ...)
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
@ 2013-04-03  8:08 ` Orit Wasserman
  2013-04-16 15:41   ` Anthony Liguori
  6 siblings, 1 reply; 17+ messages in thread
From: Orit Wasserman @ 2013-04-03  8:08 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, quintela

On 03/27/2013 06:36 PM, Paolo Bonzini wrote:
> This series fixes a few small problems in Orit's writev patches:
> 
> 1) socket_put_buffer is not needed anymore and can be dropped (patch 1);
> 
> 2) sendmsg could do a partial write even for a blocking socket, in
> case it is interrupted by a signal. This should not happen for
> migration, but it is relatively easy to fix in iov_send_recv (patches
> 2 to 5);
> 
> 3) recent libvirt will always use fd migration, and thus will not
> benefit from the writev speedups.  fd migration also uses FILE*, which
> incurs an extra copy.  Patch 6 fixes both issues.
> 
> Thanks,
> 
> Paolo
> 
> Paolo Bonzini (6):
>   qemu-file: drop socket_put_buffer
>   iov: reorganize iov_send_recv, part 1
>   iov: reorganize iov_send_recv, part 2
>   iov: reorganize iov_send_recv, part 3
>   iov: handle partial writes from sendmsg and recvmsg
>   qemu-file: do not use stdio for qemu_fdopen
> 
>  savevm.c   | 113 ++++++++++++++++++++++++++++++++++++++++++++++++-------------
>  util/iov.c | 104 ++++++++++++++++++++++++++++++++------------------------
>  2 files changed, 148 insertions(+), 69 deletions(-)
> 
Series
Reviewed-by: Orit Wasserman <owasserm@redhat.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1 Paolo Bonzini
@ 2013-04-09 13:18   ` Juan Quintela
  0 siblings, 0 replies; 17+ messages in thread
From: Juan Quintela @ 2013-04-09 13:18 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: owasserm, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> Once the initial part of the iov is dropped, it is not used anymore.
> Modify iov/iovcnt directly instead of adjusting them with the "si"
> variable.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>


I would have preffered that iov was used as const,  but as the current
code already modifies it,  nothing wrong here.

(coping the iov shouldn't have been such a big problem and we would
remove the need to fix the offsets at the end).

But as said,  code is better than what we had.

Reviewed-by: Juan Quintela <quintela@redhat.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
@ 2013-04-09 13:20   ` Juan Quintela
  0 siblings, 0 replies; 17+ messages in thread
From: Juan Quintela @ 2013-04-09 13:20 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: owasserm, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> It is enough to implement one of socket_writev_buffer and
> socket_put_buffer.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2 Paolo Bonzini
@ 2013-04-09 13:23   ` Juan Quintela
  0 siblings, 0 replies; 17+ messages in thread
From: Juan Quintela @ 2013-04-09 13:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: owasserm, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> Do not touch the "bytes" argument anymore.  Instead, remember the
> original length of the last iovec if we touch it, and restore it
> afterwards.
>
> This requires undoing the changes in opposite order.  The previous
> algorithm didn't care.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3 Paolo Bonzini
@ 2013-04-09 13:24   ` Juan Quintela
  0 siblings, 0 replies; 17+ messages in thread
From: Juan Quintela @ 2013-04-09 13:24 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: owasserm, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> "si" and "ei" are merged in a single variable.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
@ 2013-04-09 13:27   ` Juan Quintela
  0 siblings, 0 replies; 17+ messages in thread
From: Juan Quintela @ 2013-04-09 13:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: owasserm, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> Partial writes can still happen in sendmsg and recvmsg, if a
> signal is received in the middle of a write.  To handle this,
> retry the operation with a new offset/bytes pair.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
@ 2013-04-09 13:30   ` Juan Quintela
  2013-04-16 21:09   ` Anthony Liguori
  1 sibling, 0 replies; 17+ messages in thread
From: Juan Quintela @ 2013-04-09 13:30 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: owasserm, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> This uses system calls directly for Unix file descriptors, so that the
> efficient writev_buffer can be used.  Pay attention to the possibility
> of partial writes in writev.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

> +static ssize_t unix_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
> +{
> +    QEMUFileSocket *s = opaque;
> +    ssize_t len, offset;
> +    ssize_t size = iov_size(iov, iovcnt);
> +    ssize_t total = 0;
> +
> +    assert(iovcnt > 0);
> +    offset = 0;
> +    while (size > 0) {
> +        /* Find the next start position; skip all full-sized vector elements  */
> +        while (offset >= iov[0].iov_len) {
> +            offset -= iov[0].iov_len;
> +            iov++, iovcnt--;
> +        }
> +
> +        /* skip `offset' bytes from the (now) first element, undo it on exit */
> +        assert(iovcnt > 0);
> +        iov[0].iov_base += offset;
> +        iov[0].iov_len -= offset;
> +
> +        do {
> +            len = writev(s->fd, iov, iovcnt);
> +        } while (len == -1 && errno == EINTR);
> +        if (len == -1) {
> +            return -errno;
> +        }
> +
> +        /* Undo the changes above */
> +        iov[0].iov_base -= offset;
> +        iov[0].iov_len += offset;
> +
> +        /* Prepare for the next iteration */
> +        offset += len;
> +        total += len;
> +        size -= len;
> +    }
> +
> +    return total;
> +}

This code is very similar to the one in the iov_send_recv(),  but I
can't think on a trivial way to share it :p

Later,  Juan.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 0/6] migration: followups for writev patches
  2013-04-03  8:08 ` [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Orit Wasserman
@ 2013-04-16 15:41   ` Anthony Liguori
  0 siblings, 0 replies; 17+ messages in thread
From: Anthony Liguori @ 2013-04-16 15:41 UTC (permalink / raw)
  To: Orit Wasserman, Paolo Bonzini; +Cc: qemu-devel, quintela

Orit Wasserman <owasserm@redhat.com> writes:

> On 03/27/2013 06:36 PM, Paolo Bonzini wrote:
>> This series fixes a few small problems in Orit's writev patches:
>> 
>> 1) socket_put_buffer is not needed anymore and can be dropped (patch 1);
>> 
>> 2) sendmsg could do a partial write even for a blocking socket, in
>> case it is interrupted by a signal. This should not happen for
>> migration, but it is relatively easy to fix in iov_send_recv (patches
>> 2 to 5);
>> 
>> 3) recent libvirt will always use fd migration, and thus will not
>> benefit from the writev speedups.  fd migration also uses FILE*, which
>> incurs an extra copy.  Patch 6 fixes both issues.
>> 
>> Thanks,
>> 
>> Paolo
>> 
>> Paolo Bonzini (6):
>>   qemu-file: drop socket_put_buffer
>>   iov: reorganize iov_send_recv, part 1
>>   iov: reorganize iov_send_recv, part 2
>>   iov: reorganize iov_send_recv, part 3
>>   iov: handle partial writes from sendmsg and recvmsg
>>   qemu-file: do not use stdio for qemu_fdopen
>> 
>>  savevm.c   | 113 ++++++++++++++++++++++++++++++++++++++++++++++++-------------
>>  util/iov.c | 104 ++++++++++++++++++++++++++++++++------------------------
>>  2 files changed, 148 insertions(+), 69 deletions(-)
>> 
> Series
> Reviewed-by: Orit Wasserman <owasserm@redhat.com>

Per Paolo, I will process this through my queue.

Regards,

Anthony Liguori

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
  2013-04-09 13:30   ` Juan Quintela
@ 2013-04-16 21:09   ` Anthony Liguori
  1 sibling, 0 replies; 17+ messages in thread
From: Anthony Liguori @ 2013-04-16 21:09 UTC (permalink / raw)
  To: Paolo Bonzini, qemu-devel; +Cc: owasserm, quintela

Paolo Bonzini <pbonzini@redhat.com> writes:

> This uses system calls directly for Unix file descriptors, so that the
> efficient writev_buffer can be used.  Pay attention to the possibility
> of partial writes in writev.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Needs rebasing:

  CC    i386-softmmu/savevm.o
/home/aliguori/git/qemu/savevm.c:439:5: error: initialization from incompatible pointer type [-Werror]
/home/aliguori/git/qemu/savevm.c:439:5: error: (near initialization for ‘unix_write_ops.writev_buffer’) [-Werror]
cc1: all warnings being treated as errors
make[1]: *** [savevm.o] Error 1
make: *** [subdir-i386-softmmu] Error 2
d13ed415c0ab9f7cf29226022f8c24d1b8dc794d is the first bad commit
commit d13ed415c0ab9f7cf29226022f8c24d1b8dc794d
Author: Paolo Bonzini <pbonzini@redhat.com>
Date:   Wed Mar 27 17:36:32 2013 +0100

    qemu-file: do not use stdio for qemu_fdopen

Regards,

Anthony Liguori

> ---
>  savevm.c | 100 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 89 insertions(+), 11 deletions(-)
>
> diff --git a/savevm.c b/savevm.c
> index 0415830..8eb5aab 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -356,9 +356,93 @@ static const QEMUFileOps stdio_file_write_ops = {
>      .close =      stdio_fclose
>  };
>  
> +static ssize_t unix_writev_buffer(void *opaque, struct iovec *iov, int iovcnt)
> +{
> +    QEMUFileSocket *s = opaque;
> +    ssize_t len, offset;
> +    ssize_t size = iov_size(iov, iovcnt);
> +    ssize_t total = 0;
> +
> +    assert(iovcnt > 0);
> +    offset = 0;
> +    while (size > 0) {
> +        /* Find the next start position; skip all full-sized vector elements  */
> +        while (offset >= iov[0].iov_len) {
> +            offset -= iov[0].iov_len;
> +            iov++, iovcnt--;
> +        }
> +
> +        /* skip `offset' bytes from the (now) first element, undo it on exit */
> +        assert(iovcnt > 0);
> +        iov[0].iov_base += offset;
> +        iov[0].iov_len -= offset;
> +
> +        do {
> +            len = writev(s->fd, iov, iovcnt);
> +        } while (len == -1 && errno == EINTR);
> +        if (len == -1) {
> +            return -errno;
> +        }
> +
> +        /* Undo the changes above */
> +        iov[0].iov_base -= offset;
> +        iov[0].iov_len += offset;
> +
> +        /* Prepare for the next iteration */
> +        offset += len;
> +        total += len;
> +        size -= len;
> +    }
> +
> +    return total;
> +}
> +
> +static int unix_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
> +{
> +    QEMUFileSocket *s = opaque;
> +    ssize_t len;
> +
> +    for (;;) {
> +        len = read(s->fd, buf, size);
> +        if (len != -1) {
> +            break;
> +        }
> +        if (errno == EAGAIN) {
> +            yield_until_fd_readable(s->fd);
> +        } else if (errno != EINTR) {
> +            break;
> +        }
> +    }
> +
> +    if (len == -1) {
> +        len = -errno;
> +    }
> +    return len;
> +}
> +
> +static int unix_close(void *opaque)
> +{
> +    QEMUFileSocket *s = opaque;
> +    close(s->fd);
> +    g_free(s);
> +    return 0;
> +}
> +
> +static const QEMUFileOps unix_read_ops = {
> +    .get_fd =     socket_get_fd,
> +    .get_buffer = unix_get_buffer,
> +    .close =      unix_close
> +};
> +
> +static const QEMUFileOps unix_write_ops = {
> +    .get_fd =     socket_get_fd,
> +    .writev_buffer = unix_writev_buffer,
> +    .close =      unix_close
> +};
> +
>  QEMUFile *qemu_fdopen(int fd, const char *mode)
>  {
> -    QEMUFileStdio *s;
> +    QEMUFileSocket *s;
>  
>      if (mode == NULL ||
>  	(mode[0] != 'r' && mode[0] != 'w') ||
> @@ -367,21 +451,15 @@ QEMUFile *qemu_fdopen(int fd, const char *mode)
>          return NULL;
>      }
>  
> -    s = g_malloc0(sizeof(QEMUFileStdio));
> -    s->stdio_file = fdopen(fd, mode);
> -    if (!s->stdio_file)
> -        goto fail;
> +    s = g_malloc0(sizeof(QEMUFileSocket));
> +    s->fd = fd;
>  
>      if(mode[0] == 'r') {
> -        s->file = qemu_fopen_ops(s, &stdio_file_read_ops);
> +        s->file = qemu_fopen_ops(s, &unix_read_ops);
>      } else {
> -        s->file = qemu_fopen_ops(s, &stdio_file_write_ops);
> +        s->file = qemu_fopen_ops(s, &unix_write_ops);
>      }
>      return s->file;
> -
> -fail:
> -    g_free(s);
> -    return NULL;
>  }
>  
>  static const QEMUFileOps socket_read_ops = {
> -- 
> 1.8.1.4

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3
  2013-04-17  9:46 [Qemu-devel] [PULL (rebased) " Paolo Bonzini
@ 2013-04-17  9:46 ` Paolo Bonzini
  0 siblings, 0 replies; 17+ messages in thread
From: Paolo Bonzini @ 2013-04-17  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

"si" and "ei" are merged in a single variable.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/iov.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index 110d18e..f14ff0b 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -146,7 +146,7 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
 {
     ssize_t ret;
     size_t orig_len, tail;
-    unsigned si, ei;            /* start and end indexes */
+    unsigned niov;
 
     if (bytes == 0) {
         /* Catch the do-nothing case early, as otherwise we will pass an
@@ -158,15 +158,15 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
 
     /* Find the start position, skipping `offset' bytes:
      * first, skip all full-sized vector elements, */
-    for (si = 0; si < iov_cnt && offset >= iov[si].iov_len; ++si) {
-        offset -= iov[si].iov_len;
+    for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
+        offset -= iov[niov].iov_len;
     }
 
-    /* si == iov_cnt would only be valid if bytes == 0, which
+    /* niov == iov_cnt would only be valid if bytes == 0, which
      * we already ruled out above.  */
-    assert(si < iov_cnt);
-    iov += si;
-    iov_cnt -= si;
+    assert(niov < iov_cnt);
+    iov += niov;
+    iov_cnt -= niov;
 
     if (offset) {
         /* second, skip `offset' bytes from the (now) first element,
@@ -177,23 +177,23 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
     /* Find the end position skipping `bytes' bytes: */
     /* first, skip all full-sized elements */
     tail = bytes;
-    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= tail; ++ei) {
-        tail -= iov[ei].iov_len;
+    for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
+        tail -= iov[niov].iov_len;
     }
     if (tail) {
         /* second, fixup the last element, and remember the original
          * length */
-        assert(ei < iov_cnt);
-        assert(iov[ei].iov_len > tail);
-        orig_len = iov[ei].iov_len;
-        iov[ei++].iov_len = tail;
+        assert(niov < iov_cnt);
+        assert(iov[niov].iov_len > tail);
+        orig_len = iov[niov].iov_len;
+        iov[niov++].iov_len = tail;
     }
 
-    ret = do_send_recv(sockfd, iov, ei, do_send);
+    ret = do_send_recv(sockfd, iov, niov, do_send);
 
     /* Undo the changes above */
     if (tail) {
-        iov[ei-1].iov_len = orig_len;
+        iov[niov-1].iov_len = orig_len;
     }
     if (offset) {
         iov[0].iov_base -= offset;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2013-04-17  9:47 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
2013-03-27 16:36 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
2013-04-09 13:20   ` Juan Quintela
2013-03-27 16:36 ` [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1 Paolo Bonzini
2013-04-09 13:18   ` Juan Quintela
2013-03-27 16:36 ` [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2 Paolo Bonzini
2013-04-09 13:23   ` Juan Quintela
2013-03-27 16:36 ` [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3 Paolo Bonzini
2013-04-09 13:24   ` Juan Quintela
2013-03-27 16:36 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
2013-04-09 13:27   ` Juan Quintela
2013-03-27 16:36 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
2013-04-09 13:30   ` Juan Quintela
2013-04-16 21:09   ` Anthony Liguori
2013-04-03  8:08 ` [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Orit Wasserman
2013-04-16 15:41   ` Anthony Liguori
  -- strict thread matches above, loose matches on Subject: below --
2013-04-17  9:46 [Qemu-devel] [PULL (rebased) " Paolo Bonzini
2013-04-17  9:46 ` [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3 Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).