From: Stefan Hajnoczi <stefanha@redhat.com>
To: qemu-devel@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>,
Eric Blake <eblake@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>
Subject: [Qemu-devel] [PULL v2 01/25] block: Fragment reads to max transfer length
Date: Wed, 20 Jul 2016 17:20:58 +0100
Message-ID: <1469031682-21863-2-git-send-email-stefanha@redhat.com>
In-Reply-To: <1469031682-21863-1-git-send-email-stefanha@redhat.com>
From: Eric Blake <eblake@redhat.com>
Drivers should be able to rely on the block layer honoring the
max transfer length, rather than needing to return -EINVAL
(iscsi) or manually fragment things (nbd). This patch adds
the fragmentation in the block layer, after requests have been
aligned (fragmenting before alignment would lead to multiple
unaligned requests, rather than just the head and tail).
On success, it was previously unclear whether the return value
was zero or the length read, and fragmenting could introduce
yet other non-zero values if we returned the last length read.
Since at least some callers are sloppy and expect only zero on
success, it is easiest to just guarantee 0.
[Fix uninitialized ret local variable in bdrv_aligned_preadv().
--Stefan]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 1468607524-19021-2-git-send-email-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
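Reviewer-style note on the ordering argument in the commit message: the following is a minimal standalone sketch, not QEMU code, of why fragmenting after alignment keeps every fragment aligned. All function names here are hypothetical, and the sketch assumes max_transfer is a non-zero multiple of align (the patch gets this via QEMU_ALIGN_DOWN).

```c
#include <assert.h>
#include <stdint.h>

/* Count the fragments that have an unaligned start or end when the range
 * [start, end) is cut into pieces of at most max_transfer bytes.
 * Hypothetical helper for illustration only. */
unsigned count_unaligned(uint64_t start, uint64_t end,
                         uint64_t max_transfer, uint64_t align)
{
    unsigned n = 0;

    /* Assumption: the limit is a non-zero multiple of the alignment */
    assert(max_transfer > 0 && max_transfer % align == 0);

    while (start < end) {
        uint64_t len = end - start < max_transfer ? end - start
                                                  : max_transfer;

        if (start % align || (start + len) % align) {
            n++;
        }
        start += len;
    }
    return n;
}

/* Fragment the caller's byte range as-is, before any alignment */
unsigned unaligned_if_fragmented_first(uint64_t offset, uint64_t bytes,
                                       uint64_t max_transfer, uint64_t align)
{
    return count_unaligned(offset, offset + bytes, max_transfer, align);
}

/* Widen the range to alignment boundaries first, then fragment */
unsigned unaligned_if_aligned_first(uint64_t offset, uint64_t bytes,
                                    uint64_t max_transfer, uint64_t align)
{
    uint64_t start = offset - offset % align;
    uint64_t end = (offset + bytes + align - 1) / align * align;

    return count_unaligned(start, end, max_transfer, align);
}
```

For a 3000000-byte request at offset 612 with align 512 and a 1 MiB limit, fragmenting first yields three fragments that all start at an unaligned offset, while aligning first yields none; only the original head and tail need special handling by the caller.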
 block/io.c | 57 +++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 35 insertions(+), 22 deletions(-)
diff --git a/block/io.c b/block/io.c
index cfda714..c3574a4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -971,21 +971,25 @@ err:

 /*
  * Forwards an already correctly aligned request to the BlockDriver. This
- * handles copy on read and zeroing after EOF; any other features must be
- * implemented by the caller.
+ * handles copy on read, zeroing after EOF, and fragmentation of large
+ * reads; any other features must be implemented by the caller.
  */
 static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
     BdrvTrackedRequest *req, int64_t offset, unsigned int bytes,
     int64_t align, QEMUIOVector *qiov, int flags)
 {
     int64_t total_bytes, max_bytes;
-    int ret;
+    int ret = 0;
+    uint64_t bytes_remaining = bytes;
+    int max_transfer;

     assert(is_power_of_2(align));
     assert((offset & (align - 1)) == 0);
     assert((bytes & (align - 1)) == 0);
     assert(!qiov || bytes == qiov->size);
     assert((bs->open_flags & BDRV_O_NO_IO) == 0);
+    max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
+                                   align);

     /* TODO: We would need a per-BDS .supported_read_flags and
      * potential fallback support, if we ever implement any read flags
@@ -1024,7 +1028,7 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
         }
     }

-    /* Forward the request to the BlockDriver */
+    /* Forward the request to the BlockDriver, possibly fragmenting it */
     total_bytes = bdrv_getlength(bs);
     if (total_bytes < 0) {
         ret = total_bytes;
@@ -1032,30 +1036,39 @@ static int coroutine_fn bdrv_aligned_preadv(BlockDriverState *bs,
     }

     max_bytes = ROUND_UP(MAX(0, total_bytes - offset), align);
-    if (bytes <= max_bytes) {
+    if (bytes <= max_bytes && bytes <= max_transfer) {
         ret = bdrv_driver_preadv(bs, offset, bytes, qiov, 0);
-    } else if (max_bytes > 0) {
-        QEMUIOVector local_qiov;
-
-        qemu_iovec_init(&local_qiov, qiov->niov);
-        qemu_iovec_concat(&local_qiov, qiov, 0, max_bytes);
-
-        ret = bdrv_driver_preadv(bs, offset, max_bytes, &local_qiov, 0);
-
-        qemu_iovec_destroy(&local_qiov);
-    } else {
-        ret = 0;
+        goto out;
     }

-    /* Reading beyond end of file is supposed to produce zeroes */
-    if (ret == 0 && total_bytes < offset + bytes) {
-        uint64_t zero_offset = MAX(0, total_bytes - offset);
-        uint64_t zero_bytes = offset + bytes - zero_offset;
-        qemu_iovec_memset(qiov, zero_offset, 0, zero_bytes);
+    while (bytes_remaining) {
+        int num;
+
+        if (max_bytes) {
+            QEMUIOVector local_qiov;
+
+            num = MIN(bytes_remaining, MIN(max_bytes, max_transfer));
+            assert(num);
+            qemu_iovec_init(&local_qiov, qiov->niov);
+            qemu_iovec_concat(&local_qiov, qiov, bytes - bytes_remaining, num);
+
+            ret = bdrv_driver_preadv(bs, offset + bytes - bytes_remaining,
+                                     num, &local_qiov, 0);
+            max_bytes -= num;
+            qemu_iovec_destroy(&local_qiov);
+        } else {
+            num = bytes_remaining;
+            ret = qemu_iovec_memset(qiov, bytes - bytes_remaining, 0,
+                                    bytes_remaining);
+        }
+        if (ret < 0) {
+            goto out;
+        }
+        bytes_remaining -= num;
     }

 out:
-    return ret;
+    return ret < 0 ? ret : 0;
 }

 /*
--
2.7.4
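
For readers who want to see the loop in isolation, here is a rough standalone sketch of the fragment-then-zero-fill pattern from the last hunk. It is not the QEMU implementation: SketchDev, sketch_driver_read() and sketch_read() are hypothetical stand-ins, a flat buffer replaces QEMUIOVector, and alignment (including the ROUND_UP of max_bytes) is left out. The stand-in driver deliberately returns a non-zero length on success to show why the wrapper normalizes success to 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical stand-in for a block device; not a QEMU type */
typedef struct {
    const uint8_t *data;
    uint64_t length;        /* bytes of valid data, i.e. the EOF position */
    uint64_t max_transfer;  /* largest request the stand-in driver accepts */
    unsigned calls;         /* number of driver reads issued so far */
} SketchDev;

/* Stand-in driver read: rejects oversized or past-EOF requests and
 * returns the length read (non-zero) on success. */
int sketch_driver_read(SketchDev *dev, uint64_t offset, uint64_t bytes,
                       uint8_t *buf)
{
    if (bytes > dev->max_transfer || offset + bytes > dev->length) {
        return -EINVAL;
    }
    memcpy(buf, dev->data + offset, bytes);
    dev->calls++;
    return (int)bytes;
}

/* Read [offset, offset + bytes) in fragments of at most max_transfer,
 * zero-filling anything past EOF. Returns 0 on success, negative errno
 * on failure, mirroring the guarantee described in the commit message. */
int sketch_read(SketchDev *dev, uint64_t offset, uint64_t bytes,
                uint8_t *buf)
{
    uint64_t remaining = bytes;
    uint64_t readable = offset < dev->length ? dev->length - offset : 0;

    /* Assumption: a zero limit has already been mapped to "no limit" */
    assert(dev->max_transfer > 0);

    while (remaining) {
        uint64_t done = bytes - remaining;
        uint64_t num;

        if (readable) {
            int ret;

            num = SKETCH_MIN(remaining,
                             SKETCH_MIN(readable, dev->max_transfer));
            ret = sketch_driver_read(dev, offset + done, num, buf + done);
            if (ret < 0) {
                return ret;
            }
            readable -= num;
        } else {
            /* Past EOF: the rest of the request reads as zeroes */
            num = remaining;
            memset(buf + done, 0, num);
        }
        remaining -= num;
    }
    return 0;
}
```

With a 10-byte device and a 4-byte limit, a 12-byte read at offset 2 should issue two driver reads of 4 bytes each and then zero the final 4 bytes of the buffer, returning 0 rather than the last fragment length.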
Thread overview: 29+ messages
2016-07-20 16:20 [Qemu-devel] [PULL v2 00/25] Block patches Stefan Hajnoczi
2016-07-20 16:20 ` Stefan Hajnoczi [this message]
2016-07-20 16:20 ` [Qemu-devel] [PULL v2 02/25] raw_bsd: Don't advertise flags not supported by protocol layer Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 03/25] block: Fragment writes to max transfer length Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 04/25] nbd: Rely on block layer to break up large requests Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 05/25] nbd: Drop unused offset parameter Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 06/25] iscsi: Rely on block layer to break up large requests Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 07/25] block: Convert bdrv_co_discard() to byte-based Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 08/25] block: Convert bdrv_discard() " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 09/25] block: Switch BlockRequest " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 10/25] block: Convert bdrv_aio_discard() " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 11/25] block: Convert BB interface to byte-based discards Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 12/25] raw-posix: Switch paio_submit() to byte-based Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 13/25] rbd: Switch rbd_start_aio() " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 14/25] block: Convert .bdrv_aio_discard() " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 15/25] block: Add .bdrv_co_pdiscard() driver callback Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 16/25] blkreplay: Switch .bdrv_co_discard() to byte-based Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 17/25] gluster: " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 18/25] iscsi: " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 19/25] nbd: " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 20/25] qcow2: " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 21/25] raw_bsd: " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 22/25] sheepdog: " Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 23/25] block: Kill .bdrv_co_discard() Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 24/25] nbd: Convert to byte-based interface Stefan Hajnoczi
2016-07-20 16:21 ` [Qemu-devel] [PULL v2 25/25] raw_bsd: " Stefan Hajnoczi
2016-07-20 21:03 ` [Qemu-devel] [PULL v2 00/25] Block patches Peter Maydell
2016-07-21 9:58 ` Stefan Hajnoczi
2016-07-21 10:47 ` Peter Maydell