From: Eric Blake <eblake@redhat.com>
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, libguestfs@redhat.com,
	nbd@other.debian.org,
	Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Subject: [PATCH v2 08/15] nbd/server: Support 64-bit block status
Date: Mon, 14 Nov 2022 16:48:41 -0600
Message-ID: <20221114224848.2186298-9-eblake@redhat.com>
In-Reply-To: <20221114224848.2186298-1-eblake@redhat.com>

The previous patch handled extended headers by truncating large block
status requests from the client back to 32 bits.  But this is not
ideal: in cases where we can truly determine the status of the entire
image quickly (for example, when reporting the entire image as
non-sparse because we lack the ability to probe for holes), forcing
the client to iterate through 4G chunks costs more network traffic
than just reporting the entire image at once.  For ease of
implementation, if extended headers were negotiated, we now always
send 64-bit block status replies, even when the result could have fit
in the older 32-bit block status chunk (clients supporting extended
headers have to be prepared for either chunk type, so temporarily
reverting this patch proves whether a client is compliant).
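
As a rough sketch (the authoritative definitions come from the
earlier patch adding extended header types; the field widths here are
inferred from the conversion code below), the two extent layouts
differ only in width:

  /* per-extent payload of NBD_REPLY_TYPE_BLOCK_STATUS (compact) */
  typedef struct NBDExtent {
      uint32_t length;  /* big-endian on the wire */
      uint32_t flags;   /* NBD_STATE_* bits */
  } NBDExtent;

  /* per-extent payload of NBD_REPLY_TYPE_BLOCK_STATUS_EXT (extended) */
  typedef struct NBDExtentExt {
      uint64_t length;  /* big-endian on the wire */
      uint64_t flags;   /* upper 32 bits currently always sent as zero */
  } NBDExtentExt;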

For now, none of the metadata contexts that we know how to export
populates more than 32 bits of information, so we don't have to worry
about NBD_REP_ERR_EXT_HEADER_REQD or filtering during handshake, and
we always send all zeroes for the upper 32 bits of status during
NBD_CMD_BLOCK_STATUS.
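
A stand-alone illustration (not part of the patch) of why the upper
half is guaranteed to be zero: nbd_extent_array_add still takes a
32-bit flags value, and assigning it to the wider 64-bit field
zero-extends it in C:

  #include <assert.h>
  #include <stdint.h>

  int main(void)
  {
      uint32_t flags32 = 0x3;      /* e.g. NBD_STATE_HOLE | NBD_STATE_ZERO */
      uint64_t flags64 = flags32;  /* implicit zero-extension */

      assert((flags64 >> 32) == 0);
      return 0;
  }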

Note that we previously had some interesting size-juggling on call
chains, such as:

nbd_co_send_block_status(uint32_t length)
-> blockstatus_to_extents(uint32_t bytes)
  -> bdrv_block_status_above(bytes, &uint64_t num)
  -> nbd_extent_array_add(uint64_t num)
    -> store num in 32-bit length

But we were lucky that it never overflowed: bdrv_block_status_above
never sets num larger than bytes, and we had previously been capping
'bytes' at 32 bits (either by the protocol, or in the previous patch
with an explicit truncation).  This patch adds assertions to ensure
we continue to avoid overflowing 32 bits for a narrow client, while
fully utilizing 64 bits all the way through when the client
understands that.
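
As a toy model of the new merge rule (simplified from the
nbd_extent_array_add hunk below; the function name and shape are
mine, not the patch's): a narrow client must never see a single
extent longer than UINT32_MAX, while an extended client may coalesce
without limit.

  #include <assert.h>
  #include <stdbool.h>
  #include <stdint.h>

  static bool can_merge(uint64_t prev_len, uint64_t add_len, bool extended)
  {
      uint64_t sum = prev_len + add_len;

      assert(sum >= add_len);  /* would trip on uint64_t wraparound */
      return sum <= UINT32_MAX || extended;
  }

  int main(void)
  {
      /* 3 GiB + 2 GiB: split for a narrow client, merged when extended */
      assert(!can_merge(3ULL << 30, 2ULL << 30, false));
      assert(can_merge(3ULL << 30, 2ULL << 30, true));
      return 0;
  }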

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 nbd/server.c | 86 +++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 59 insertions(+), 27 deletions(-)

diff --git a/nbd/server.c b/nbd/server.c
index b46655b4d8..f21f8098c1 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -2145,20 +2145,30 @@ static int coroutine_fn nbd_co_send_sparse_read(NBDClient *client,
 }

 typedef struct NBDExtentArray {
-    NBDExtent *extents;
+    union {
+        NBDStructuredMeta id;
+        NBDStructuredMetaExt meta;
+    };
+    union {
+        NBDExtent *narrow;
+        NBDExtentExt *extents;
+    };
     unsigned int nb_alloc;
     unsigned int count;
     uint64_t total_length;
+    bool extended; /* Whether 64-bit extents are allowed */
     bool can_add;
     bool converted_to_be;
 } NBDExtentArray;

-static NBDExtentArray *nbd_extent_array_new(unsigned int nb_alloc)
+static NBDExtentArray *nbd_extent_array_new(unsigned int nb_alloc,
+                                            bool extended)
 {
     NBDExtentArray *ea = g_new0(NBDExtentArray, 1);

     ea->nb_alloc = nb_alloc;
-    ea->extents = g_new(NBDExtent, nb_alloc);
+    ea->extents = g_new(NBDExtentExt, nb_alloc);
+    ea->extended = extended;
     ea->can_add = true;

     return ea;
@@ -2172,17 +2182,37 @@ static void nbd_extent_array_free(NBDExtentArray *ea)
 G_DEFINE_AUTOPTR_CLEANUP_FUNC(NBDExtentArray, nbd_extent_array_free)

 /* Further modifications of the array after conversion are abandoned */
-static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
+static void nbd_extent_array_convert_to_be(NBDExtentArray *ea,
+                                           uint32_t context_id,
+                                           struct iovec *iov)
 {
     int i;

     assert(!ea->converted_to_be);
+    assert(iov[0].iov_base == &ea->meta);
+    assert(iov[1].iov_base == ea->extents);
     ea->can_add = false;
     ea->converted_to_be = true;

-    for (i = 0; i < ea->count; i++) {
-        ea->extents[i].flags = cpu_to_be32(ea->extents[i].flags);
-        ea->extents[i].length = cpu_to_be32(ea->extents[i].length);
+    stl_be_p(&ea->meta.context_id, context_id);
+    if (ea->extended) {
+        stl_be_p(&ea->meta.count, ea->count);
+        for (i = 0; i < ea->count; i++) {
+            ea->extents[i].length = cpu_to_be64(ea->extents[i].length);
+            ea->extents[i].flags = cpu_to_be64(ea->extents[i].flags);
+        }
+        iov[0].iov_len = sizeof(ea->meta);
+        iov[1].iov_len = ea->count * sizeof(ea->extents[0]);
+    } else {
+        /* Conversion reduces memory usage, order of iteration matters */
+        for (i = 0; i < ea->count; i++) {
+            assert(ea->extents[i].length <= UINT32_MAX);
+            assert((uint32_t) ea->extents[i].flags == ea->extents[i].flags);
+            ea->narrow[i].length = cpu_to_be32(ea->extents[i].length);
+            ea->narrow[i].flags = cpu_to_be32(ea->extents[i].flags);
+        }
+        iov[0].iov_len = sizeof(ea->id);
+        iov[1].iov_len = ea->count * sizeof(ea->narrow[0]);
     }
 }

@@ -2196,19 +2226,23 @@ static void nbd_extent_array_convert_to_be(NBDExtentArray *ea)
  * would result in an incorrect range reported to the client)
  */
 static int nbd_extent_array_add(NBDExtentArray *ea,
-                                uint32_t length, uint32_t flags)
+                                uint64_t length, uint32_t flags)
 {
     assert(ea->can_add);

     if (!length) {
         return 0;
     }
+    if (!ea->extended) {
+        assert(length <= UINT32_MAX);
+    }

     /* Extend previous extent if flags are the same */
     if (ea->count > 0 && flags == ea->extents[ea->count - 1].flags) {
-        uint64_t sum = (uint64_t)length + ea->extents[ea->count - 1].length;
+        uint64_t sum = length + ea->extents[ea->count - 1].length;

-        if (sum <= UINT32_MAX) {
+        assert(sum >= length);
+        if (sum <= UINT32_MAX || ea->extended) {
             ea->extents[ea->count - 1].length = sum;
             ea->total_length += length;
             return 0;
@@ -2221,7 +2255,7 @@ static int nbd_extent_array_add(NBDExtentArray *ea,
     }

     ea->total_length += length;
-    ea->extents[ea->count] = (NBDExtent) {.length = length, .flags = flags};
+    ea->extents[ea->count] = (NBDExtentExt) {.length = length, .flags = flags};
     ea->count++;

     return 0;
@@ -2288,21 +2322,20 @@ static int nbd_co_send_extents(NBDClient *client, NBDRequest *request,
                                bool last, uint32_t context_id, Error **errp)
 {
     NBDReply hdr;
-    NBDStructuredMeta chunk;
     struct iovec iov[] = {
         {.iov_base = &hdr},
-        {.iov_base = &chunk, .iov_len = sizeof(chunk)},
-        {.iov_base = ea->extents, .iov_len = ea->count * sizeof(ea->extents[0])}
+        {.iov_base = &ea->meta},
+        {.iov_base = ea->extents}
     };

-    nbd_extent_array_convert_to_be(ea);
+    nbd_extent_array_convert_to_be(ea, context_id, &iov[1]);

     trace_nbd_co_send_extents(request->handle, ea->count, context_id,
                               ea->total_length, last);
     set_be_chunk(client, &iov[0], last ? NBD_REPLY_FLAG_DONE : 0,
-                 NBD_REPLY_TYPE_BLOCK_STATUS,
+                 client->extended_headers ? NBD_REPLY_TYPE_BLOCK_STATUS_EXT
+                 : NBD_REPLY_TYPE_BLOCK_STATUS,
                  request, iov[1].iov_len + iov[2].iov_len);
-    stl_be_p(&chunk.context_id, context_id);

     return nbd_co_send_iov(client, iov, 3, errp);
 }
@@ -2310,13 +2343,14 @@ static int nbd_co_send_extents(NBDClient *client, NBDRequest *request,
 /* Get block status from the exported device and send it to the client */
 static int nbd_co_send_block_status(NBDClient *client, NBDRequest *request,
                                     BlockDriverState *bs, uint64_t offset,
-                                    uint32_t length, bool dont_fragment,
+                                    uint64_t length, bool dont_fragment,
                                     bool last, uint32_t context_id,
                                     Error **errp)
 {
     int ret;
     unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
-    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
+    g_autoptr(NBDExtentArray) ea =
+        nbd_extent_array_new(nb_extents, client->extended_headers);

     if (context_id == NBD_META_ID_BASE_ALLOCATION) {
         ret = blockstatus_to_extents(bs, offset, length, ea);
@@ -2343,7 +2377,8 @@ static void bitmap_to_extents(BdrvDirtyBitmap *bitmap,
     bdrv_dirty_bitmap_lock(bitmap);

     for (start = offset;
-         bdrv_dirty_bitmap_next_dirty_area(bitmap, start, end, INT32_MAX,
+         bdrv_dirty_bitmap_next_dirty_area(bitmap, start, end,
+                                           es->extended ? INT64_MAX : INT32_MAX,
                                            &dirty_start, &dirty_count);
          start = dirty_start + dirty_count)
     {
@@ -2365,11 +2400,12 @@ static void bitmap_to_extents(BdrvDirtyBitmap *bitmap,

 static int nbd_co_send_bitmap(NBDClient *client, NBDRequest *request,
                               BdrvDirtyBitmap *bitmap, uint64_t offset,
-                              uint32_t length, bool dont_fragment, bool last,
+                              uint64_t length, bool dont_fragment, bool last,
                               uint32_t context_id, Error **errp)
 {
     unsigned int nb_extents = dont_fragment ? 1 : NBD_MAX_BLOCK_STATUS_EXTENTS;
-    g_autoptr(NBDExtentArray) ea = nbd_extent_array_new(nb_extents);
+    g_autoptr(NBDExtentArray) ea =
+        nbd_extent_array_new(nb_extents, client->extended_headers);

     bitmap_to_extents(bitmap, offset, length, ea);

@@ -2665,11 +2701,7 @@ static coroutine_fn int nbd_handle_request(NBDClient *client,
             return nbd_send_generic_reply(client, request, -EINVAL,
                                           "need non-zero length", errp);
         }
-        if (request->len > UINT32_MAX) {
-            /* For now, truncate our response to a 32 bit window */
-            request->len = QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
-                                           client->check_align ?: 1);
-        }
+        assert(client->extended_headers || request->len <= UINT32_MAX);
         if (client->export_meta.count) {
             bool dont_fragment = request->flags & NBD_CMD_FLAG_REQ_ONE;
             int contexts_remaining = client->export_meta.count;
-- 
2.38.1

Thread overview: 65+ messages
2022-11-14 22:41 [cross-project PATCH v2] NBD 64-bit extensions Eric Blake
2022-11-14 22:46 ` [PATCH v2 0/6] NBD spec changes for " Eric Blake
2022-11-14 22:46   ` [PATCH v2 1/6] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length Eric Blake
2022-12-16 19:32     ` Vladimir Sementsov-Ogievskiy
2023-03-03 22:17       ` Eric Blake
2023-03-05  8:41         ` Wouter Verhelst
2023-03-06  8:48           ` [Libguestfs] " Laszlo Ersek
2023-03-06 13:48           ` Nir Soffer
2022-11-14 22:46   ` [PATCH v2 2/6] spec: Tweak description of maximum block size Eric Blake
2022-12-16 20:22     ` Vladimir Sementsov-Ogievskiy
2023-03-03 22:20       ` Eric Blake
2023-02-21 15:21     ` Wouter Verhelst
2023-03-03 22:26       ` Eric Blake
2023-03-05  8:45         ` Wouter Verhelst
2022-11-14 22:46   ` [PATCH v2 3/6] spec: Add NBD_OPT_EXTENDED_HEADERS Eric Blake
2022-12-19 18:26     ` Vladimir Sementsov-Ogievskiy
2023-02-22  9:49     ` Wouter Verhelst
2023-03-03 22:36       ` Eric Blake
2023-03-05  8:49         ` Wouter Verhelst
2022-11-14 22:46   ` [PATCH v2 4/6] spec: Allow 64-bit block status results Eric Blake
2022-11-14 22:46   ` [PATCH v2 5/6] spec: Introduce NBD_FLAG_BLOCK_STATUS_PAYLOAD Eric Blake
2023-02-22 10:05     ` Wouter Verhelst
2023-03-03 22:40       ` Eric Blake
2023-03-05  8:50         ` Wouter Verhelst
2022-11-14 22:46   ` [PATCH v2 6/6] RFC: spec: Introduce NBD_REPLY_TYPE_OFFSET_HOLE_EXT Eric Blake
2022-11-14 22:48 ` [PATCH v2 00/15] qemu patches for 64-bit NBD extensions Eric Blake
2022-11-14 22:48   ` [PATCH v2 01/15] nbd/client: Add safety check on chunk payload length Eric Blake
2022-11-14 22:48   ` [PATCH v2 02/15] nbd/server: Prepare for alternate-size headers Eric Blake
2022-11-14 22:48   ` [PATCH v2 03/15] nbd: Prepare for 64-bit request effect lengths Eric Blake
2022-11-14 22:48   ` [PATCH v2 04/15] nbd: Add types for extended headers Eric Blake
2022-11-14 22:48   ` [PATCH v2 05/15] nbd/server: Refactor handling of request payload Eric Blake
2022-11-14 22:48   ` [PATCH v2 06/15] nbd/server: Refactor to pass full request around Eric Blake
2022-11-14 22:48   ` [PATCH v2 07/15] nbd/server: Initial support for extended headers Eric Blake
2022-11-14 22:48   ` Eric Blake [this message]
2022-11-14 22:48   ` [PATCH v2 09/15] nbd/client: " Eric Blake
2022-11-14 22:48   ` [PATCH v2 10/15] nbd/client: Accept 64-bit block status chunks Eric Blake
2022-11-14 22:48   ` [PATCH v2 11/15] nbd/client: Request extended headers during negotiation Eric Blake
2022-11-14 22:48   ` [PATCH v2 12/15] nbd/server: Prepare for per-request filtering of BLOCK_STATUS Eric Blake
2022-11-14 22:48   ` [PATCH v2 13/15] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Eric Blake
2022-11-14 22:48   ` [PATCH v2 14/15] RFC: nbd/client: Accept 64-bit hole chunks Eric Blake
2022-11-14 22:48   ` [PATCH v2 15/15] RFC: nbd/server: Send 64-bit hole chunk Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 00/23] libnbd 64-bit NBD extensions Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 01/23] block_status: Refactor array storage Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 02/23] internal: Refactor layout of replies in sbuf Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 03/23] protocol: Add definitions for extended headers Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 04/23] states: Prepare to send 64-bit requests Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 05/23] states: Prepare to receive 64-bit replies Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 06/23] states: Break deadlock if server goofs on extended replies Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 07/23] generator: Add struct nbd_extent in prep for 64-bit extents Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 08/23] block_status: Track 64-bit extents internally Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 09/23] block_status: Accept 64-bit extents during block status Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 10/23] api: Add [aio_]nbd_block_status_64 Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 11/23] api: Add several functions for controlling extended headers Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 12/23] copy: Update nbdcopy to use 64-bit block status Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 13/23] dump: Update nbddump " Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 14/23] info: Expose extended-headers support through nbdinfo Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 15/23] info: Update nbdinfo --map to use 64-bit block status Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 16/23] examples: Update copy-libev " Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 17/23] ocaml: Add example for 64-bit extents Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 18/23] generator: Actually request extended headers Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 19/23] api: Add nbd_[aio_]opt_extended_headers() Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 20/23] interop: Add test of 64-bit block status Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 21/23] api: Add nbd_can_block_status_payload() Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 22/23] api: Add nbd_[aio_]block_status_filter() Eric Blake
2022-11-14 22:51   ` [libnbd PATCH v2 23/23] RFC: pread: Accept 64-bit holes Eric Blake
