qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg
  2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
@ 2013-03-27 16:36 ` Paolo Bonzini
  2013-04-09 13:27   ` Juan Quintela
  0 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2013-03-27 16:36 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

Partial writes can still happen in sendmsg and recvmsg, if a
signal is received in the middle of a write.  To handle this,
retry the operation with a new offset/bytes pair.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
        I suggest using "git show -b" to review this one...

 util/iov.c | 102 +++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 55 insertions(+), 47 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index f14ff0b..0ecfc4c 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -144,63 +144,71 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
                       size_t offset, size_t bytes,
                       bool do_send)
 {
+    ssize_t total = 0;
     ssize_t ret;
     size_t orig_len, tail;
     unsigned niov;
 
-    if (bytes == 0) {
-        /* Catch the do-nothing case early, as otherwise we will pass an
-         * empty iovec to sendmsg/recvmsg(), and not all implementations
-         * accept this.
-         */
-        return 0;
-    }
-
-    /* Find the start position, skipping `offset' bytes:
-     * first, skip all full-sized vector elements, */
-    for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
-        offset -= iov[niov].iov_len;
-    }
+    while (bytes > 0) {
+        /* Find the start position, skipping `offset' bytes:
+         * first, skip all full-sized vector elements, */
+        for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
+            offset -= iov[niov].iov_len;
+        }
 
-    /* niov == iov_cnt would only be valid if bytes == 0, which
-     * we already ruled out above.  */
-    assert(niov < iov_cnt);
-    iov += niov;
-    iov_cnt -= niov;
-
-    if (offset) {
-        /* second, skip `offset' bytes from the (now) first element,
-         * undo it on exit */
-        iov[0].iov_base += offset;
-        iov[0].iov_len -= offset;
-    }
-    /* Find the end position skipping `bytes' bytes: */
-    /* first, skip all full-sized elements */
-    tail = bytes;
-    for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
-        tail -= iov[niov].iov_len;
-    }
-    if (tail) {
-        /* second, fixup the last element, and remember the original
-         * length */
+        /* niov == iov_cnt would only be valid if bytes == 0, which
+         * we already ruled out in the loop condition.  */
         assert(niov < iov_cnt);
-        assert(iov[niov].iov_len > tail);
-        orig_len = iov[niov].iov_len;
-        iov[niov++].iov_len = tail;
-    }
+        iov += niov;
+        iov_cnt -= niov;
+
+        if (offset) {
+            /* second, skip `offset' bytes from the (now) first element,
+             * undo it on exit */
+            iov[0].iov_base += offset;
+            iov[0].iov_len -= offset;
+        }
+        /* Find the end position skipping `bytes' bytes: */
+        /* first, skip all full-sized elements */
+        tail = bytes;
+        for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
+            tail -= iov[niov].iov_len;
+        }
+        if (tail) {
+            /* second, fixup the last element, and remember the original
+             * length */
+            assert(niov < iov_cnt);
+            assert(iov[niov].iov_len > tail);
+            orig_len = iov[niov].iov_len;
+            iov[niov++].iov_len = tail;
+        }
 
-    ret = do_send_recv(sockfd, iov, niov, do_send);
+        ret = do_send_recv(sockfd, iov, niov, do_send);
 
-    /* Undo the changes above */
-    if (tail) {
-        iov[niov-1].iov_len = orig_len;
-    }
-    if (offset) {
-        iov[0].iov_base -= offset;
-        iov[0].iov_len += offset;
+        /* Undo the changes above before checking for errors */
+        if (tail) {
+            iov[niov-1].iov_len = orig_len;
+        }
+        if (offset) {
+            iov[0].iov_base -= offset;
+            iov[0].iov_len += offset;
+        }
+
+        if (ret < 0) {
+            assert(errno != EINTR);
+            if (errno == EAGAIN && total > 0) {
+                return total;
+            }
+            return -1;
+        }
+
+        /* Prepare for the next iteration */
+        offset += ret;
+        total += ret;
+        bytes -= ret;
     }
 
-    return ret;
+    return total;
 }
 
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg
  2013-03-27 16:36 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
@ 2013-04-09 13:27   ` Juan Quintela
  0 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2013-04-09 13:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: owasserm, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> Partial writes can still happen in sendmsg and recvmsg, if a
> signal is received in the middle of a write.  To handle this,
> retry the operation with a new offset/bytes pair.
>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PULL (rebased) 0/6] migration: followups for writev patches
@ 2013-04-17  9:46 Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
                   ` (5 more replies)
  0 siblings, 6 replies; 10+ messages in thread
From: Paolo Bonzini @ 2013-04-17  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

Anthony,

as agreed off-list I'm sending you a pull request for the rebased patches,
adding some new migration code but mostly touching iov.c.

The following changes since commit e0a83fc2c1582dc8d4453849852ebe6c258b7c3a:

  qom: do nothing on unparent of object without parent (2013-04-16 16:10:21 -0500)

are available in the git repository at:

  git://github.com/bonzini/qemu.git migration-writev

for you to fetch changes up to e9d8fbf53a33983c81d67d18e1baa914eb16cdea:

  qemu-file: do not use stdio for qemu_fdopen (2013-04-17 11:44:22 +0200)

----------------------------------------------------------------
Paolo Bonzini (6):
      qemu-file: drop socket_put_buffer
      iov: reorganize iov_send_recv, part 1
      iov: reorganize iov_send_recv, part 2
      iov: reorganize iov_send_recv, part 3
      iov: handle partial writes from sendmsg and recvmsg
      qemu-file: do not use stdio for qemu_fdopen

 savevm.c   | 114 ++++++++++++++++++++++++++++++++++++++++++++++++-------------
 util/iov.c | 104 +++++++++++++++++++++++++++++++------------------------
 2 files changed, 149 insertions(+), 69 deletions(-)
-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer
  2013-04-17  9:46 [Qemu-devel] [PULL (rebased) 0/6] migration: followups for writev patches Paolo Bonzini
@ 2013-04-17  9:46 ` Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1 Paolo Bonzini
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2013-04-17  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

It is enough to implement one of socket_writev_buffer and
socket_put_buffer.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 savevm.c | 13 -------------
 1 file changed, 13 deletions(-)

diff --git a/savevm.c b/savevm.c
index 53515cb..ffabbff 100644
--- a/savevm.c
+++ b/savevm.c
@@ -219,18 +219,6 @@ static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
     return len;
 }
 
-static int socket_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, int size)
-{
-    QEMUFileSocket *s = opaque;
-    ssize_t len;
-
-    len = qemu_send_full(s->fd, buf, size, 0);
-    if (len < size) {
-        len = -socket_error();
-    }
-    return len;
-}
-
 static int socket_close(void *opaque)
 {
     QEMUFileSocket *s = opaque;
@@ -404,7 +392,6 @@ static const QEMUFileOps socket_read_ops = {
 
 static const QEMUFileOps socket_write_ops = {
     .get_fd =     socket_get_fd,
-    .put_buffer = socket_put_buffer,
     .writev_buffer = socket_writev_buffer,
     .close =      socket_close
 };
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1
  2013-04-17  9:46 [Qemu-devel] [PULL (rebased) 0/6] migration: followups for writev patches Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
@ 2013-04-17  9:46 ` Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2 Paolo Bonzini
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2013-04-17  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

Once the initial part of the iov is dropped, it is not used anymore.
Modify iov/iovcnt directly instead of adjusting them with the "si"
variable.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/iov.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index 9dae318..adb9a70 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -159,16 +159,22 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
     for (si = 0; si < iov_cnt && offset >= iov[si].iov_len; ++si) {
         offset -= iov[si].iov_len;
     }
+
+    /* si == iov_cnt would only be valid if bytes == 0, which
+     * we already ruled out above.  */
+    assert(si < iov_cnt);
+    iov += si;
+    iov_cnt -= si;
+
     if (offset) {
-        assert(si < iov_cnt);
         /* second, skip `offset' bytes from the (now) first element,
          * undo it on exit */
-        iov[si].iov_base += offset;
-        iov[si].iov_len -= offset;
+        iov[0].iov_base += offset;
+        iov[0].iov_len -= offset;
     }
     /* Find the end position skipping `bytes' bytes: */
     /* first, skip all full-sized elements */
-    for (ei = si; ei < iov_cnt && iov[ei].iov_len <= bytes; ++ei) {
+    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= bytes; ++ei) {
         bytes -= iov[ei].iov_len;
     }
     if (bytes) {
@@ -183,12 +189,12 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
         ++ei;
     }
 
-    ret = do_send_recv(sockfd, iov + si, ei - si, do_send);
+    ret = do_send_recv(sockfd, iov, ei, do_send);
 
     /* Undo the changes above */
     if (offset) {
-        iov[si].iov_base -= offset;
-        iov[si].iov_len += offset;
+        iov[0].iov_base -= offset;
+        iov[0].iov_len += offset;
     }
     if (bytes) {
         iov[ei-1].iov_len += bytes;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2
  2013-04-17  9:46 [Qemu-devel] [PULL (rebased) 0/6] migration: followups for writev patches Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1 Paolo Bonzini
@ 2013-04-17  9:46 ` Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3 Paolo Bonzini
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2013-04-17  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

Do not touch the "bytes" argument anymore.  Instead, remember the
original length of the last iovec if we touch it, and restore it
afterwards.

This requires undoing the changes in opposite order.  The previous
algorithm didn't care.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/iov.c | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index adb9a70..110d18e 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -145,7 +145,9 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
                       bool do_send)
 {
     ssize_t ret;
+    size_t orig_len, tail;
     unsigned si, ei;            /* start and end indexes */
+
     if (bytes == 0) {
         /* Catch the do-nothing case early, as otherwise we will pass an
          * empty iovec to sendmsg/recvmsg(), and not all implementations
@@ -174,31 +176,29 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
     }
     /* Find the end position skipping `bytes' bytes: */
     /* first, skip all full-sized elements */
-    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= bytes; ++ei) {
-        bytes -= iov[ei].iov_len;
+    tail = bytes;
+    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= tail; ++ei) {
+        tail -= iov[ei].iov_len;
     }
-    if (bytes) {
-        /* second, fixup the last element, and remember
-         * the length we've cut from the end of it in `bytes' */
-        size_t tail;
+    if (tail) {
+        /* second, fixup the last element, and remember the original
+         * length */
         assert(ei < iov_cnt);
-        assert(iov[ei].iov_len > bytes);
-        tail = iov[ei].iov_len - bytes;
-        iov[ei].iov_len = bytes;
-        bytes = tail;  /* bytes is now equal to the tail size */
-        ++ei;
+        assert(iov[ei].iov_len > tail);
+        orig_len = iov[ei].iov_len;
+        iov[ei++].iov_len = tail;
     }
 
     ret = do_send_recv(sockfd, iov, ei, do_send);
 
     /* Undo the changes above */
+    if (tail) {
+        iov[ei-1].iov_len = orig_len;
+    }
     if (offset) {
         iov[0].iov_base -= offset;
         iov[0].iov_len += offset;
     }
-    if (bytes) {
-        iov[ei-1].iov_len += bytes;
-    }
 
     return ret;
 }
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3
  2013-04-17  9:46 [Qemu-devel] [PULL (rebased) 0/6] migration: followups for writev patches Paolo Bonzini
                   ` (2 preceding siblings ...)
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2 Paolo Bonzini
@ 2013-04-17  9:46 ` Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
  5 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2013-04-17  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

"si" and "ei" are merged in a single variable.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/iov.c | 30 +++++++++++++++---------------
 1 file changed, 15 insertions(+), 15 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index 110d18e..f14ff0b 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -146,7 +146,7 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
 {
     ssize_t ret;
     size_t orig_len, tail;
-    unsigned si, ei;            /* start and end indexes */
+    unsigned niov;
 
     if (bytes == 0) {
         /* Catch the do-nothing case early, as otherwise we will pass an
@@ -158,15 +158,15 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
 
     /* Find the start position, skipping `offset' bytes:
      * first, skip all full-sized vector elements, */
-    for (si = 0; si < iov_cnt && offset >= iov[si].iov_len; ++si) {
-        offset -= iov[si].iov_len;
+    for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
+        offset -= iov[niov].iov_len;
     }
 
-    /* si == iov_cnt would only be valid if bytes == 0, which
+    /* niov == iov_cnt would only be valid if bytes == 0, which
      * we already ruled out above.  */
-    assert(si < iov_cnt);
-    iov += si;
-    iov_cnt -= si;
+    assert(niov < iov_cnt);
+    iov += niov;
+    iov_cnt -= niov;
 
     if (offset) {
         /* second, skip `offset' bytes from the (now) first element,
@@ -177,23 +177,23 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
     /* Find the end position skipping `bytes' bytes: */
     /* first, skip all full-sized elements */
     tail = bytes;
-    for (ei = 0; ei < iov_cnt && iov[ei].iov_len <= tail; ++ei) {
-        tail -= iov[ei].iov_len;
+    for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
+        tail -= iov[niov].iov_len;
     }
     if (tail) {
         /* second, fixup the last element, and remember the original
          * length */
-        assert(ei < iov_cnt);
-        assert(iov[ei].iov_len > tail);
-        orig_len = iov[ei].iov_len;
-        iov[ei++].iov_len = tail;
+        assert(niov < iov_cnt);
+        assert(iov[niov].iov_len > tail);
+        orig_len = iov[niov].iov_len;
+        iov[niov++].iov_len = tail;
     }
 
-    ret = do_send_recv(sockfd, iov, ei, do_send);
+    ret = do_send_recv(sockfd, iov, niov, do_send);
 
     /* Undo the changes above */
     if (tail) {
-        iov[ei-1].iov_len = orig_len;
+        iov[niov-1].iov_len = orig_len;
     }
     if (offset) {
         iov[0].iov_base -= offset;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg
  2013-04-17  9:46 [Qemu-devel] [PULL (rebased) 0/6] migration: followups for writev patches Paolo Bonzini
                   ` (3 preceding siblings ...)
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3 Paolo Bonzini
@ 2013-04-17  9:46 ` Paolo Bonzini
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
  5 siblings, 0 replies; 10+ messages in thread
From: Paolo Bonzini @ 2013-04-17  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

Partial writes can still happen in sendmsg and recvmsg, if a
signal is received in the middle of a write.  To handle this,
retry the operation with a new offset/bytes pair.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 util/iov.c | 102 +++++++++++++++++++++++++++++++++----------------------------
 1 file changed, 55 insertions(+), 47 deletions(-)

diff --git a/util/iov.c b/util/iov.c
index f14ff0b..d32226d 100644
--- a/util/iov.c
+++ b/util/iov.c
@@ -144,63 +144,71 @@ ssize_t iov_send_recv(int sockfd, struct iovec *iov, unsigned iov_cnt,
                       size_t offset, size_t bytes,
                       bool do_send)
 {
+    ssize_t total = 0;
     ssize_t ret;
     size_t orig_len, tail;
     unsigned niov;
 
-    if (bytes == 0) {
-        /* Catch the do-nothing case early, as otherwise we will pass an
-         * empty iovec to sendmsg/recvmsg(), and not all implementations
-         * accept this.
-         */
-        return 0;
-    }
-
-    /* Find the start position, skipping `offset' bytes:
-     * first, skip all full-sized vector elements, */
-    for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
-        offset -= iov[niov].iov_len;
-    }
+    while (bytes > 0) {
+        /* Find the start position, skipping `offset' bytes:
+         * first, skip all full-sized vector elements, */
+        for (niov = 0; niov < iov_cnt && offset >= iov[niov].iov_len; ++niov) {
+            offset -= iov[niov].iov_len;
+        }
 
-    /* niov == iov_cnt would only be valid if bytes == 0, which
-     * we already ruled out above.  */
-    assert(niov < iov_cnt);
-    iov += niov;
-    iov_cnt -= niov;
-
-    if (offset) {
-        /* second, skip `offset' bytes from the (now) first element,
-         * undo it on exit */
-        iov[0].iov_base += offset;
-        iov[0].iov_len -= offset;
-    }
-    /* Find the end position skipping `bytes' bytes: */
-    /* first, skip all full-sized elements */
-    tail = bytes;
-    for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
-        tail -= iov[niov].iov_len;
-    }
-    if (tail) {
-        /* second, fixup the last element, and remember the original
-         * length */
+        /* niov == iov_cnt would only be valid if bytes == 0, which
+         * we already ruled out in the loop condition.  */
         assert(niov < iov_cnt);
-        assert(iov[niov].iov_len > tail);
-        orig_len = iov[niov].iov_len;
-        iov[niov++].iov_len = tail;
-    }
+        iov += niov;
+        iov_cnt -= niov;
+
+        if (offset) {
+            /* second, skip `offset' bytes from the (now) first element,
+             * undo it on exit */
+            iov[0].iov_base += offset;
+            iov[0].iov_len -= offset;
+        }
+        /* Find the end position skipping `bytes' bytes: */
+        /* first, skip all full-sized elements */
+        tail = bytes;
+        for (niov = 0; niov < iov_cnt && iov[niov].iov_len <= tail; ++niov) {
+            tail -= iov[niov].iov_len;
+        }
+        if (tail) {
+            /* second, fixup the last element, and remember the original
+             * length */
+            assert(niov < iov_cnt);
+            assert(iov[niov].iov_len > tail);
+            orig_len = iov[niov].iov_len;
+            iov[niov++].iov_len = tail;
+        }
 
-    ret = do_send_recv(sockfd, iov, niov, do_send);
+        ret = do_send_recv(sockfd, iov, niov, do_send);
 
-    /* Undo the changes above */
-    if (tail) {
-        iov[niov-1].iov_len = orig_len;
-    }
-    if (offset) {
-        iov[0].iov_base -= offset;
-        iov[0].iov_len += offset;
+        /* Undo the changes above before checking for errors */
+        if (tail) {
+            iov[niov-1].iov_len = orig_len;
+        }
+        if (offset) {
+            iov[0].iov_base -= offset;
+            iov[0].iov_len += offset;
+        }
+
+        if (ret < 0) {
+            assert(errno != EINTR);
+            if (errno == EAGAIN && total > 0) {
+                return total;
+            }
+            return -1;
+        }
+
+        /* Prepare for the next iteration */
+        offset += ret;
+        total += ret;
+        bytes -= ret;
     }
 
-    return ret;
+    return total;
 }
 
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen
  2013-04-17  9:46 [Qemu-devel] [PULL (rebased) 0/6] migration: followups for writev patches Paolo Bonzini
                   ` (4 preceding siblings ...)
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
@ 2013-04-17  9:46 ` Paolo Bonzini
  2013-04-20 18:53   ` Blue Swirl
  5 siblings, 1 reply; 10+ messages in thread
From: Paolo Bonzini @ 2013-04-17  9:46 UTC (permalink / raw)
  To: qemu-devel; +Cc: owasserm, quintela

This uses system calls directly for Unix file descriptors, so that the
efficient writev_buffer can be used.  Pay attention to the possibility
of partial writes in writev.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Orit Wassermann <owasserm@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 savevm.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 90 insertions(+), 11 deletions(-)

diff --git a/savevm.c b/savevm.c
index ffabbff..31dcce9 100644
--- a/savevm.c
+++ b/savevm.c
@@ -356,9 +356,94 @@ static const QEMUFileOps stdio_file_write_ops = {
     .close =      stdio_fclose
 };
 
+static ssize_t unix_writev_buffer(void *opaque, struct iovec *iov, int iovcnt,
+                                  int64_t pos)
+{
+    QEMUFileSocket *s = opaque;
+    ssize_t len, offset;
+    ssize_t size = iov_size(iov, iovcnt);
+    ssize_t total = 0;
+
+    assert(iovcnt > 0);
+    offset = 0;
+    while (size > 0) {
+        /* Find the next start position; skip all full-sized vector elements  */
+        while (offset >= iov[0].iov_len) {
+            offset -= iov[0].iov_len;
+            iov++, iovcnt--;
+        }
+
+        /* skip `offset' bytes from the (now) first element, undo it on exit */
+        assert(iovcnt > 0);
+        iov[0].iov_base += offset;
+        iov[0].iov_len -= offset;
+
+        do {
+            len = writev(s->fd, iov, iovcnt);
+        } while (len == -1 && errno == EINTR);
+        if (len == -1) {
+            return -errno;
+        }
+
+        /* Undo the changes above */
+        iov[0].iov_base -= offset;
+        iov[0].iov_len += offset;
+
+        /* Prepare for the next iteration */
+        offset += len;
+        total += len;
+        size -= len;
+    }
+
+    return total;
+}
+
+static int unix_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
+{
+    QEMUFileSocket *s = opaque;
+    ssize_t len;
+
+    for (;;) {
+        len = read(s->fd, buf, size);
+        if (len != -1) {
+            break;
+        }
+        if (errno == EAGAIN) {
+            yield_until_fd_readable(s->fd);
+        } else if (errno != EINTR) {
+            break;
+        }
+    }
+
+    if (len == -1) {
+        len = -errno;
+    }
+    return len;
+}
+
+static int unix_close(void *opaque)
+{
+    QEMUFileSocket *s = opaque;
+    close(s->fd);
+    g_free(s);
+    return 0;
+}
+
+static const QEMUFileOps unix_read_ops = {
+    .get_fd =     socket_get_fd,
+    .get_buffer = unix_get_buffer,
+    .close =      unix_close
+};
+
+static const QEMUFileOps unix_write_ops = {
+    .get_fd =     socket_get_fd,
+    .writev_buffer = unix_writev_buffer,
+    .close =      unix_close
+};
+
 QEMUFile *qemu_fdopen(int fd, const char *mode)
 {
-    QEMUFileStdio *s;
+    QEMUFileSocket *s;
 
     if (mode == NULL ||
 	(mode[0] != 'r' && mode[0] != 'w') ||
@@ -367,21 +452,15 @@ QEMUFile *qemu_fdopen(int fd, const char *mode)
         return NULL;
     }
 
-    s = g_malloc0(sizeof(QEMUFileStdio));
-    s->stdio_file = fdopen(fd, mode);
-    if (!s->stdio_file)
-        goto fail;
+    s = g_malloc0(sizeof(QEMUFileSocket));
+    s->fd = fd;
 
     if(mode[0] == 'r') {
-        s->file = qemu_fopen_ops(s, &stdio_file_read_ops);
+        s->file = qemu_fopen_ops(s, &unix_read_ops);
     } else {
-        s->file = qemu_fopen_ops(s, &stdio_file_write_ops);
+        s->file = qemu_fopen_ops(s, &unix_write_ops);
     }
     return s->file;
-
-fail:
-    g_free(s);
-    return NULL;
 }
 
 static const QEMUFileOps socket_read_ops = {
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen
  2013-04-17  9:46 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
@ 2013-04-20 18:53   ` Blue Swirl
  0 siblings, 0 replies; 10+ messages in thread
From: Blue Swirl @ 2013-04-20 18:53 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Orit Wasserman, Stefan Weil, qemu-devel, quintela@redhat.com

On Wed, Apr 17, 2013 at 9:46 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> This uses system calls directly for Unix file descriptors, so that the
> efficient writev_buffer can be used.  Pay attention to the possibility
> of partial writes in writev.
>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
> Reviewed-by: Orit Wassermann <owasserm@redhat.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  savevm.c | 101 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------
>  1 file changed, 90 insertions(+), 11 deletions(-)
>
> diff --git a/savevm.c b/savevm.c
> index ffabbff..31dcce9 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -356,9 +356,94 @@ static const QEMUFileOps stdio_file_write_ops = {
>      .close =      stdio_fclose
>  };
>
> +static ssize_t unix_writev_buffer(void *opaque, struct iovec *iov, int iovcnt,
> +                                  int64_t pos)
> +{
> +    QEMUFileSocket *s = opaque;
> +    ssize_t len, offset;
> +    ssize_t size = iov_size(iov, iovcnt);
> +    ssize_t total = 0;
> +
> +    assert(iovcnt > 0);
> +    offset = 0;
> +    while (size > 0) {
> +        /* Find the next start position; skip all full-sized vector elements  */
> +        while (offset >= iov[0].iov_len) {
> +            offset -= iov[0].iov_len;
> +            iov++, iovcnt--;
> +        }
> +
> +        /* skip `offset' bytes from the (now) first element, undo it on exit */
> +        assert(iovcnt > 0);
> +        iov[0].iov_base += offset;
> +        iov[0].iov_len -= offset;
> +
> +        do {
> +            len = writev(s->fd, iov, iovcnt);

This breaks mingw32 build:
  LINK  i386-softmmu/qemu-system-i386.exe
savevm.o: In function `unix_writev_buffer':
/src/qemu/savevm.c:382: undefined reference to `_writev'
collect2: error: ld returned 1 exit status

> +        } while (len == -1 && errno == EINTR);
> +        if (len == -1) {
> +            return -errno;
> +        }
> +
> +        /* Undo the changes above */
> +        iov[0].iov_base -= offset;
> +        iov[0].iov_len += offset;
> +
> +        /* Prepare for the next iteration */
> +        offset += len;
> +        total += len;
> +        size -= len;
> +    }
> +
> +    return total;
> +}
> +
> +static int unix_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
> +{
> +    QEMUFileSocket *s = opaque;
> +    ssize_t len;
> +
> +    for (;;) {
> +        len = read(s->fd, buf, size);
> +        if (len != -1) {
> +            break;
> +        }
> +        if (errno == EAGAIN) {
> +            yield_until_fd_readable(s->fd);
> +        } else if (errno != EINTR) {
> +            break;
> +        }
> +    }
> +
> +    if (len == -1) {
> +        len = -errno;
> +    }
> +    return len;
> +}
> +
> +static int unix_close(void *opaque)
> +{
> +    QEMUFileSocket *s = opaque;
> +    close(s->fd);
> +    g_free(s);
> +    return 0;
> +}
> +
> +static const QEMUFileOps unix_read_ops = {
> +    .get_fd =     socket_get_fd,
> +    .get_buffer = unix_get_buffer,
> +    .close =      unix_close
> +};
> +
> +static const QEMUFileOps unix_write_ops = {
> +    .get_fd =     socket_get_fd,
> +    .writev_buffer = unix_writev_buffer,
> +    .close =      unix_close
> +};
> +
>  QEMUFile *qemu_fdopen(int fd, const char *mode)
>  {
> -    QEMUFileStdio *s;
> +    QEMUFileSocket *s;
>
>      if (mode == NULL ||
>         (mode[0] != 'r' && mode[0] != 'w') ||
> @@ -367,21 +452,15 @@ QEMUFile *qemu_fdopen(int fd, const char *mode)
>          return NULL;
>      }
>
> -    s = g_malloc0(sizeof(QEMUFileStdio));
> -    s->stdio_file = fdopen(fd, mode);
> -    if (!s->stdio_file)
> -        goto fail;
> +    s = g_malloc0(sizeof(QEMUFileSocket));
> +    s->fd = fd;
>
>      if(mode[0] == 'r') {
> -        s->file = qemu_fopen_ops(s, &stdio_file_read_ops);
> +        s->file = qemu_fopen_ops(s, &unix_read_ops);
>      } else {
> -        s->file = qemu_fopen_ops(s, &stdio_file_write_ops);
> +        s->file = qemu_fopen_ops(s, &unix_write_ops);
>      }
>      return s->file;
> -
> -fail:
> -    g_free(s);
> -    return NULL;
>  }
>
>  static const QEMUFileOps socket_read_ops = {
> --
> 1.8.1.4
>
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2013-04-20 18:54 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-04-17  9:46 [Qemu-devel] [PULL (rebased) 0/6] migration: followups for writev patches Paolo Bonzini
2013-04-17  9:46 ` [Qemu-devel] [PATCH 1/6] qemu-file: drop socket_put_buffer Paolo Bonzini
2013-04-17  9:46 ` [Qemu-devel] [PATCH 2/6] iov: reorganize iov_send_recv, part 1 Paolo Bonzini
2013-04-17  9:46 ` [Qemu-devel] [PATCH 3/6] iov: reorganize iov_send_recv, part 2 Paolo Bonzini
2013-04-17  9:46 ` [Qemu-devel] [PATCH 4/6] iov: reorganize iov_send_recv, part 3 Paolo Bonzini
2013-04-17  9:46 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
2013-04-17  9:46 ` [Qemu-devel] [PATCH 6/6] qemu-file: do not use stdio for qemu_fdopen Paolo Bonzini
2013-04-20 18:53   ` Blue Swirl
  -- strict thread matches above, loose matches on Subject: below --
2013-03-27 16:36 [Qemu-devel] [PATCH 0/6] migration: followups for writev patches Paolo Bonzini
2013-03-27 16:36 ` [Qemu-devel] [PATCH 5/6] iov: handle partial writes from sendmsg and recvmsg Paolo Bonzini
2013-04-09 13:27   ` Juan Quintela

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).