qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08
@ 2018-01-08 15:31 Eric Blake
  2018-01-08 15:31 ` [Qemu-devel] [PULL 1/3] nbd/server: Implement sparse reads atop structured reply Eric Blake
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Eric Blake @ 2018-01-08 15:31 UTC (permalink / raw)
  To: qemu-devel

The following changes since commit 799044b6a3a0fc63e1e020e4d9266786a2dc7a0b:

  Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2018-01-08 13:44:01 +0000)

are available in the Git repository at:

  git://repo.or.cz/qemu/ericb.git tags/pull-nbd-2018-01-08

for you to fetch changes up to c4365735a7d38f4355c6f77e6670d3972315f7c2:

  block/nbd: fix segmentation fault when .desc is not null-terminated (2018-01-08 09:12:23 -0600)

Careful readers may note that I'm abondoning my earlier pull
request from 20 Dec:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg04517.html
This pull request includes the first two patches from that one,
but the second half of that request has a v2 posted by Vladimir,
which I'm letting bake a bit longer rather than including today.

----------------------------------------------------------------
nbd patches for 2018-01-08

- Eric Blake: 0/2 Optimize sparse reads over NBD
- Murilo Opsfelder Araujo: block/nbd: fix segmentation fault when .desc is not null-terminated

----------------------------------------------------------------
Eric Blake (2):
      nbd/server: Implement sparse reads atop structured reply
      nbd/server: Optimize final chunk of sparse read

Murilo Opsfelder Araujo (1):
      block/nbd: fix segmentation fault when .desc is not null-terminated

 block/nbd.c      |  1 +
 nbd/server.c     | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 nbd/trace-events |  1 +
 3 files changed, 78 insertions(+), 3 deletions(-)

-- 
2.14.3

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PULL 1/3] nbd/server: Implement sparse reads atop structured reply
  2018-01-08 15:31 [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08 Eric Blake
@ 2018-01-08 15:31 ` Eric Blake
  2018-01-08 15:31 ` [Qemu-devel] [PULL 2/3] nbd/server: Optimize final chunk of sparse read Eric Blake
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Blake @ 2018-01-08 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, open list:Network Block Dev...

The reason that NBD added structured reply in the first place was
to allow for efficient reads of sparse files, by allowing the
reply to include chunks to quickly communicate holes to the client
without sending lots of zeroes over the wire.  Time to implement
this in the server; our client can already read such data.

We can only skip holes insofar as the block layer can query them;
and only if the client is okay with a fragmented request (if a
client requests NBD_CMD_FLAG_DF and the entire read is a hole, we
could technically return a single NBD_REPLY_TYPE_OFFSET_HOLE, but
that's a fringe case not worth catering to here).  Sadly, the
control flow is a bit wonkier than I would have preferred, but
it was minimally invasive to have a split in the action between
a fragmented read (handled directly where we recognize
NBD_CMD_READ with the right conditions, and sending multiple
chunks) vs. a single read (handled at the end of nbd_trip, for
both simple and structured replies, when we know there is only
one thing being read).  Likewise, I didn't make any effort to
optimize the final chunk of a fragmented read to set the
NBD_REPLY_FLAG_DONE, but unconditionally send that as a separate
NBD_REPLY_TYPE_NONE.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171107030912.23930-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 nbd/server.c     | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 nbd/trace-events |  1 +
 2 files changed, 76 insertions(+), 3 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index 92c0fdd03b..be7310cb41 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1303,6 +1303,7 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client,
                                                     uint64_t offset,
                                                     void *data,
                                                     size_t size,
+                                                    bool final,
                                                     Error **errp)
 {
     NBDStructuredReadData chunk;
@@ -1313,13 +1314,73 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client,

     assert(size);
     trace_nbd_co_send_structured_read(handle, offset, data, size);
-    set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_OFFSET_DATA,
-                 handle, sizeof(chunk) - sizeof(chunk.h) + size);
+    set_be_chunk(&chunk.h, final ? NBD_REPLY_FLAG_DONE : 0,
+                 NBD_REPLY_TYPE_OFFSET_DATA, handle,
+                 sizeof(chunk) - sizeof(chunk.h) + size);
     stq_be_p(&chunk.offset, offset);

     return nbd_co_send_iov(client, iov, 2, errp);
 }

+static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
+                                                uint64_t handle,
+                                                uint64_t offset,
+                                                uint8_t *data,
+                                                size_t size,
+                                                Error **errp)
+{
+    int ret = 0;
+    NBDExport *exp = client->exp;
+    size_t progress = 0;
+
+    while (progress < size) {
+        int64_t pnum;
+        int status = bdrv_block_status_above(blk_bs(exp->blk), NULL,
+                                             offset + progress,
+                                             size - progress, &pnum, NULL,
+                                             NULL);
+
+        if (status < 0) {
+            error_setg_errno(errp, -status, "unable to check for holes");
+            return status;
+        }
+        assert(pnum && pnum <= size - progress);
+        if (status & BDRV_BLOCK_ZERO) {
+            NBDStructuredReadHole chunk;
+            struct iovec iov[] = {
+                {.iov_base = &chunk, .iov_len = sizeof(chunk)},
+            };
+
+            trace_nbd_co_send_structured_read_hole(handle, offset + progress,
+                                                   pnum);
+            set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_OFFSET_HOLE,
+                         handle, sizeof(chunk) - sizeof(chunk.h));
+            stq_be_p(&chunk.offset, offset + progress);
+            stl_be_p(&chunk.length, pnum);
+            ret = nbd_co_send_iov(client, iov, 1, errp);
+        } else {
+            ret = blk_pread(exp->blk, offset + progress + exp->dev_offset,
+                            data + progress, pnum);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret, "reading from file failed");
+                break;
+            }
+            ret = nbd_co_send_structured_read(client, handle, offset + progress,
+                                              data + progress, pnum, false,
+                                              errp);
+        }
+
+        if (ret < 0) {
+            break;
+        }
+        progress += pnum;
+    }
+    if (!ret) {
+        ret = nbd_co_send_structured_done(client, handle, errp);
+    }
+    return ret;
+}
+
 static int coroutine_fn nbd_co_send_structured_error(NBDClient *client,
                                                      uint64_t handle,
                                                      uint32_t error,
@@ -1481,6 +1542,16 @@ static coroutine_fn void nbd_trip(void *opaque)
             }
         }

+        if (client->structured_reply && !(request.flags & NBD_CMD_FLAG_DF)) {
+            ret = nbd_co_send_sparse_read(req->client, request.handle,
+                                          request.from, req->data, request.len,
+                                          &local_err);
+            if (ret < 0) {
+                goto reply;
+            }
+            goto done;
+        }
+
         ret = blk_pread(exp->blk, request.from + exp->dev_offset,
                         req->data, request.len);
         if (ret < 0) {
@@ -1561,7 +1632,8 @@ reply:
         } else if (reply_data_len) {
             ret = nbd_co_send_structured_read(req->client, request.handle,
                                               request.from, req->data,
-                                              reply_data_len, &local_err);
+                                              reply_data_len, true,
+                                              &local_err);
         } else {
             ret = nbd_co_send_structured_done(req->client, request.handle,
                                               &local_err);
diff --git a/nbd/trace-events b/nbd/trace-events
index 92568edce5..2b8268ce8c 100644
--- a/nbd/trace-events
+++ b/nbd/trace-events
@@ -57,6 +57,7 @@ nbd_blk_aio_detach(const char *name, void *ctx) "Export %s: Detaching clients fr
 nbd_co_send_simple_reply(uint64_t handle, uint32_t error, const char *errname, int len) "Send simple reply: handle = %" PRIu64 ", error = %" PRIu32 " (%s), len = %d"
 nbd_co_send_structured_done(uint64_t handle) "Send structured reply done: handle = %" PRIu64
 nbd_co_send_structured_read(uint64_t handle, uint64_t offset, void *data, size_t size) "Send structured read data reply: handle = %" PRIu64 ", offset = %" PRIu64 ", data = %p, len = %zu"
+nbd_co_send_structured_read_hole(uint64_t handle, uint64_t offset, size_t size) "Send structured read hole reply: handle = %" PRIu64 ", offset = %" PRIu64 ", len = %zu"
 nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'"
 nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)"
 nbd_co_receive_request_payload_received(uint64_t handle, uint32_t len) "Payload received: handle = %" PRIu64 ", len = %" PRIu32
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PULL 2/3] nbd/server: Optimize final chunk of sparse read
  2018-01-08 15:31 [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08 Eric Blake
  2018-01-08 15:31 ` [Qemu-devel] [PULL 1/3] nbd/server: Implement sparse reads atop structured reply Eric Blake
@ 2018-01-08 15:31 ` Eric Blake
  2018-01-08 15:31 ` [Qemu-devel] [PULL 3/3] block/nbd: fix segmentation fault when .desc is not null-terminated Eric Blake
  2018-01-09 16:28 ` [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08 Peter Maydell
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Blake @ 2018-01-08 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, open list:Network Block Dev...

If we are careful to handle 0-length read requests correctly,
we can optimize our sparse read to send the NBD_REPLY_FLAG_DONE
bit on our last OFFSET_DATA or OFFSET_HOLE chunk rather than
needing a separate chunk.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171107030912.23930-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
---
 nbd/server.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index be7310cb41..e443b3cf5c 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -1339,12 +1339,14 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                                              offset + progress,
                                              size - progress, &pnum, NULL,
                                              NULL);
+        bool final;

         if (status < 0) {
             error_setg_errno(errp, -status, "unable to check for holes");
             return status;
         }
         assert(pnum && pnum <= size - progress);
+        final = progress + pnum == size;
         if (status & BDRV_BLOCK_ZERO) {
             NBDStructuredReadHole chunk;
             struct iovec iov[] = {
@@ -1353,7 +1355,8 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,

             trace_nbd_co_send_structured_read_hole(handle, offset + progress,
                                                    pnum);
-            set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_OFFSET_HOLE,
+            set_be_chunk(&chunk.h, final ? NBD_REPLY_FLAG_DONE : 0,
+                         NBD_REPLY_TYPE_OFFSET_HOLE,
                          handle, sizeof(chunk) - sizeof(chunk.h));
             stq_be_p(&chunk.offset, offset + progress);
             stl_be_p(&chunk.length, pnum);
@@ -1366,7 +1369,7 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
                 break;
             }
             ret = nbd_co_send_structured_read(client, handle, offset + progress,
-                                              data + progress, pnum, false,
+                                              data + progress, pnum, final,
                                               errp);
         }

@@ -1375,9 +1378,6 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
         }
         progress += pnum;
     }
-    if (!ret) {
-        ret = nbd_co_send_structured_done(client, handle, errp);
-    }
     return ret;
 }

@@ -1542,7 +1542,8 @@ static coroutine_fn void nbd_trip(void *opaque)
             }
         }

-        if (client->structured_reply && !(request.flags & NBD_CMD_FLAG_DF)) {
+        if (client->structured_reply && !(request.flags & NBD_CMD_FLAG_DF) &&
+            request.len) {
             ret = nbd_co_send_sparse_read(req->client, request.handle,
                                           request.from, req->data, request.len,
                                           &local_err);
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [Qemu-devel] [PULL 3/3] block/nbd: fix segmentation fault when .desc is not null-terminated
  2018-01-08 15:31 [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08 Eric Blake
  2018-01-08 15:31 ` [Qemu-devel] [PULL 1/3] nbd/server: Implement sparse reads atop structured reply Eric Blake
  2018-01-08 15:31 ` [Qemu-devel] [PULL 2/3] nbd/server: Optimize final chunk of sparse read Eric Blake
@ 2018-01-08 15:31 ` Eric Blake
  2018-01-09 16:28 ` [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08 Peter Maydell
  3 siblings, 0 replies; 5+ messages in thread
From: Eric Blake @ 2018-01-08 15:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Murilo Opsfelder Araujo, qemu-stable, Paolo Bonzini, Kevin Wolf,
	Max Reitz, open list:Network Block Dev...

From: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com>

The find_desc_by_name() from util/qemu-option.c relies on the .name not being
NULL to call strcmp(). This check becomes unsafe when the list is not
NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can
result in segmentation fault when strcmp() tries to access an invalid memory:

    #0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6
    #1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166
    #2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026
    #3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406
    #4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135
    #5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395

>From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing an
invalid memory:

    >>> p desc[5]
    $8 = {
      name = 0x1037f098 "R27A",
      type = 1561964883,
      help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>,
      def_value_str = 0x2 <error: Cannot access memory at address 0x2>
    }
    >>> p desc[6]
    $9 = {
      name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001",
      type = 272101528,
      help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>,
      def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9>
    }

This patch fixes the segmentation fault in strcmp() by adding a NULL element at
the end of nbd_runtime_opts.desc list, which is the common practice to most of
other structs like runtime_opts in block/null.c. Thus, the desc[i].name != NULL
check becomes safe because it will not evaluate to true when .desc list reached
its end.

Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Buglink: https://bugs.launchpad.net/qemu/+bug/1727259
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com>
Message-Id: <20180105133241.14141-2-muriloo@linux.vnet.ibm.com>
CC: qemu-stable@nongnu.org
Fixes: 7ccc44fd7d1dfa62c4d6f3a680df809d6e7068ce
Signed-off-by: Eric Blake <eblake@redhat.com>
---
 block/nbd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/nbd.c b/block/nbd.c
index a50d24b50a..8b8ba56cdd 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -388,6 +388,7 @@ static QemuOptsList nbd_runtime_opts = {
             .type = QEMU_OPT_STRING,
             .help = "ID of the TLS credentials to use",
         },
+        { /* end of list */ }
     },
 };

-- 
2.14.3

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08
  2018-01-08 15:31 [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08 Eric Blake
                   ` (2 preceding siblings ...)
  2018-01-08 15:31 ` [Qemu-devel] [PULL 3/3] block/nbd: fix segmentation fault when .desc is not null-terminated Eric Blake
@ 2018-01-09 16:28 ` Peter Maydell
  3 siblings, 0 replies; 5+ messages in thread
From: Peter Maydell @ 2018-01-09 16:28 UTC (permalink / raw)
  To: Eric Blake; +Cc: QEMU Developers

On 8 January 2018 at 15:31, Eric Blake <eblake@redhat.com> wrote:
> The following changes since commit 799044b6a3a0fc63e1e020e4d9266786a2dc7a0b:
>
>   Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging (2018-01-08 13:44:01 +0000)
>
> are available in the Git repository at:
>
>   git://repo.or.cz/qemu/ericb.git tags/pull-nbd-2018-01-08
>
> for you to fetch changes up to c4365735a7d38f4355c6f77e6670d3972315f7c2:
>
>   block/nbd: fix segmentation fault when .desc is not null-terminated (2018-01-08 09:12:23 -0600)
>
> Careful readers may note that I'm abondoning my earlier pull
> request from 20 Dec:
> https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg04517.html
> This pull request includes the first two patches from that one,
> but the second half of that request has a v2 posted by Vladimir,
> which I'm letting bake a bit longer rather than including today.
>
> ----------------------------------------------------------------
> nbd patches for 2018-01-08
>
> - Eric Blake: 0/2 Optimize sparse reads over NBD
> - Murilo Opsfelder Araujo: block/nbd: fix segmentation fault when .desc is not null-terminated
>
> ----------------------------------------------------------------
> Eric Blake (2):
>       nbd/server: Implement sparse reads atop structured reply
>       nbd/server: Optimize final chunk of sparse read
>
> Murilo Opsfelder Araujo (1):
>       block/nbd: fix segmentation fault when .desc is not null-terminated

Applied, thanks.

-- PMM

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-01-09 16:28 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-01-08 15:31 [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08 Eric Blake
2018-01-08 15:31 ` [Qemu-devel] [PULL 1/3] nbd/server: Implement sparse reads atop structured reply Eric Blake
2018-01-08 15:31 ` [Qemu-devel] [PULL 2/3] nbd/server: Optimize final chunk of sparse read Eric Blake
2018-01-08 15:31 ` [Qemu-devel] [PULL 3/3] block/nbd: fix segmentation fault when .desc is not null-terminated Eric Blake
2018-01-09 16:28 ` [Qemu-devel] [PULL 0/3] NBD patches through 2018-01-08 Peter Maydell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).