qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: qemu-devel@nongnu.org
Cc: "Richard W . M . Jones" <rjones@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>, Max Reitz <mreitz@redhat.com>,
	"open list:Network Block Dev..." <qemu-block@nongnu.org>
Subject: [Qemu-devel] [PULL 11/14] nbd/client: Support qemu-img convert from unaligned size
Date: Mon,  1 Apr 2019 09:09:00 -0500	[thread overview]
Message-ID: <20190401140903.19186-12-eblake@redhat.com> (raw)
In-Reply-To: <20190401140903.19186-1-eblake@redhat.com>

If an NBD server advertises a size that is not a multiple of a sector,
the block layer rounds up that size, even though we set info.size to
the exact byte value sent by the server. The block layer then proceeds
to let us read or query block status on the hole that it added past
EOF, which the NBD server is unlikely to be happy with. Fortunately,
qemu as a server never advertizes an unaligned size, so we generally
don't run into this problem; but the nbdkit server makes it easy to
test:

$ printf %1000d 1 > f1
$ ~/nbdkit/nbdkit -fv file f1 & pid=$!
$ qemu-img convert -f raw nbd://localhost:10809 f2
$ kill $pid
$ qemu-img compare f1 f2

Pre-patch, the server attempts a 1024-byte read, which nbdkit
rightfully rejects as going beyond its advertised 1000 byte size; the
conversion fails and the output files differ (not even the first
sector is copied, because qemu-img does not follow ddrescue's habit of
trying smaller reads to get as much information as possible in spite
of errors). Post-patch, the client's attempts to read (and query block
status, for new enough nbdkit) are properly truncated to the server's
length, with sane handling of the hole the block layer forced on
us. Although f2 ends up as a larger file (1024 bytes instead of 1000),
qemu-img compare shows the two images to have identical contents for
display to the guest.

I didn't add iotests coverage since I didn't want to add a dependency
on nbdkit in iotests. I also did NOT patch write, trim, or write
zeroes - these commands continue to fail (usually with ENOSPC, but
whatever the server chose), because we really can't write to the end
of the file, and because 'qemu-img convert' is the most common case
where we care about being tolerant (which is read-only). Perhaps we
could truncate the request if the client is writing zeros to the tail,
but that seems like more work, especially if the block layer is fixed
in 4.1 to track byte-accurate sizing (in which case this patch would
be reverted as unnecessary).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190329042750.14704-5-eblake@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
---
 block/nbd-client.c | 39 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/block/nbd-client.c b/block/nbd-client.c
index 3edb508f668..409c2171bc3 100644
--- a/block/nbd-client.c
+++ b/block/nbd-client.c
@@ -848,6 +848,25 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset,
     if (!bytes) {
         return 0;
     }
+    /*
+     * Work around the fact that the block layer doesn't do
+     * byte-accurate sizing yet - if the read exceeds the server's
+     * advertised size because the block layer rounded size up, then
+     * truncate the request to the server and tail-pad with zero.
+     */
+    if (offset >= client->info.size) {
+        assert(bytes < BDRV_SECTOR_SIZE);
+        qemu_iovec_memset(qiov, 0, 0, bytes);
+        return 0;
+    }
+    if (offset + bytes > client->info.size) {
+        uint64_t slop = offset + bytes - client->info.size;
+
+        assert(slop < BDRV_SECTOR_SIZE);
+        qemu_iovec_memset(qiov, bytes - slop, 0, slop);
+        request.len -= slop;
+    }
+
     ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         return ret;
@@ -966,7 +985,8 @@ int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs,
         .from = offset,
         .len = MIN(MIN_NON_ZERO(QEMU_ALIGN_DOWN(INT_MAX,
                                                 bs->bl.request_alignment),
-                                client->info.max_block), bytes),
+                                client->info.max_block),
+                   MIN(bytes, client->info.size - offset)),
         .flags = NBD_CMD_FLAG_REQ_ONE,
     };

@@ -977,6 +997,23 @@ int coroutine_fn nbd_client_co_block_status(BlockDriverState *bs,
         return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
     }

+    /*
+     * Work around the fact that the block layer doesn't do
+     * byte-accurate sizing yet - if the status request exceeds the
+     * server's advertised size because the block layer rounded size
+     * up, we truncated the request to the server (above), or are
+     * called on just the hole.
+     */
+    if (offset >= client->info.size) {
+        *pnum = bytes;
+        assert(bytes < BDRV_SECTOR_SIZE);
+        /* Intentionally don't report offset_valid for the hole */
+        return BDRV_BLOCK_ZERO;
+    }
+
+    if (client->info.min_block) {
+        assert(QEMU_IS_ALIGNED(request.len, client->info.min_block));
+    }
     ret = nbd_co_send_request(bs, &request, NULL);
     if (ret < 0) {
         return ret;
-- 
2.20.1

  parent reply	other threads:[~2019-04-01 14:09 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-04-01 14:08 [Qemu-devel] [PULL 00/14] NBD patches for 4.0-rc2 Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 01/14] qemu-img: Report bdrv_block_status failures Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 02/14] nbd: Tolerate some server non-compliance in NBD_CMD_BLOCK_STATUS Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 03/14] nbd: Don't lose server's error to NBD_CMD_BLOCK_STATUS Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 04/14] nbd: Permit simple " Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 05/14] qemu-img: Gracefully shutdown when map can't finish Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 06/14] nbd-client: Work around server BLOCK_STATUS misalignment at EOF Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 07/14] iotests: Add 241 to test NBD on unaligned images Eric Blake
2019-04-10 17:45   ` [Qemu-devel] [Qemu-block] " Max Reitz
2019-04-10 17:45     ` Max Reitz
2019-04-10 18:01     ` Eric Blake
2019-04-10 18:01       ` Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 08/14] nbd/client: Lower min_block for block-status, unaligned size Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 09/14] nbd/client: Report offsets in bdrv_block_status Eric Blake
2019-04-01 14:08 ` [Qemu-devel] [PULL 10/14] nbd/client: Reject inaccessible tail of inconsistent server Eric Blake
2019-04-02 22:40   ` Eric Blake
2019-04-01 14:09 ` Eric Blake [this message]
2019-04-01 14:09 ` [Qemu-devel] [PULL 12/14] block: Add bdrv_get_request_alignment() Eric Blake
2019-04-01 14:09 ` [Qemu-devel] [PULL 13/14] nbd/server: Advertise actual minimum block size Eric Blake
2019-04-01 14:09 ` [Qemu-devel] [PULL 14/14] nbd/client: Trace server noncompliance on structured reads Eric Blake
2019-04-02  5:29 ` [Qemu-devel] [PULL 00/14] NBD patches for 4.0-rc2 Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190401140903.19186-12-eblake@redhat.com \
    --to=eblake@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rjones@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).