From: Eric Blake <eblake@redhat.com>
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, kwolf@redhat.com,
Stefan Hajnoczi <stefanha@redhat.com>,
Fam Zheng <famz@redhat.com>, Max Reitz <mreitz@redhat.com>
Subject: [Qemu-devel] [PATCH v2 13/13] block: Enforce non-zero bl.max_transfer
Date: Wed, 14 Nov 2018 20:03:34 -0600 [thread overview]
Message-ID: <20181115020334.1189829-14-eblake@redhat.com> (raw)
In-Reply-To: <20181115020334.1189829-1-eblake@redhat.com>
The raw format driver and the filter drivers default to picking
up the same limits as what they wrap, and I've audited that they
are otherwise simple enough in their passthrough to be 64-bit
clean; it's not worth changing their .bdrv_refresh_limits to
advertise anything different. Previous patches changed all
remaining byte-based format and protocol drivers to supply an
explicit max_transfer consistent with an audit (or lack thereof)
of their code. And for drivers still using sector-based
callbacks (gluster, iscsi, parallels, qed, replication, sheepdog,
ssh, vhdx), we can easily supply a 2G default limit while waiting
for a later per-driver conversion and auditing while moving to
byte-based callbacks.
With that, we can assert that max_transfer is now always set (either
by sector-based default, by merge with a backing layer, or by
explicit settings), and enforce that it be non-zero by the end
of bdrv_refresh_limits(). This in turn lets us simplify portions
of the block layer to use MIN() instead of MIN_NON_ZERO(), while
still fragmenting things to no larger than 2G chunks. Also, we no
longer need to rewrite the driver's bl.max_transfer to a value below
2G. There should be no semantic change from this patch, although
it does open the door if a future patch wants to let the block layer
choose a larger value than 2G while still honoring bl.max_transfer
for fragmentation.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
[hmm - in sending this email, I realize that I didn't specifically
check whether quorum always picks up a sane limit from its children,
since it is byte-based but lacks a .bdrv_refresh_limits]
include/block/block_int.h | 10 +++++-----
block/io.c | 25 ++++++++++++-------------
2 files changed, 17 insertions(+), 18 deletions(-)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index b19d94dac5f..410553bb0cc 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -548,11 +548,11 @@ typedef struct BlockLimits {
uint32_t opt_transfer;
/* Maximal transfer length in bytes. Need not be power of 2, but
- * must be multiple of opt_transfer and bl.request_alignment, or 0
- * for no 32-bit limit. The block layer fragments all actions
- * below 2G, so setting this value to anything larger than INT_MAX
- * implies that the driver has been audited for 64-bit
- * cleanness. */
+ * must be larger than opt_transfer and request_alignment (the
+ * block layer rounds it down as needed). The block layer
+ * fragments all actions below 2G, so setting this value to
+ * anything larger than INT_MAX implies that the driver has been
+ * audited for 64-bit cleanness. */
uint64_t max_transfer;
/* memory alignment, in bytes so that no bounce buffer is needed */
diff --git a/block/io.c b/block/io.c
index 4911a1123eb..0024be0bf31 100644
--- a/block/io.c
+++ b/block/io.c
@@ -129,6 +129,9 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
/* Default alignment based on whether driver has byte interface */
bs->bl.request_alignment = (drv->bdrv_co_preadv ||
drv->bdrv_aio_preadv) ? 1 : 512;
+ /* Default cap at 2G for drivers without byte interface */
+ if (bs->bl.request_alignment == 1)
+ bs->bl.max_transfer = BDRV_REQUEST_MAX_BYTES;
/* Take some limits from the children as a default */
if (bs->file) {
@@ -160,12 +163,11 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
drv->bdrv_refresh_limits(bs, errp);
}
- /* Clamp max_transfer to 2G */
- if (bs->bl.max_transfer > UINT32_MAX) {
- bs->bl.max_transfer = QEMU_ALIGN_DOWN(BDRV_REQUEST_MAX_BYTES,
- MAX(bs->bl.opt_transfer,
- bs->bl.request_alignment));
- }
+ /* Check that we got a maximum cap, and round it down as needed */
+ assert(bs->bl.max_transfer);
+ bs->bl.max_transfer = QEMU_ALIGN_DOWN(bs->bl.max_transfer,
+ MAX(bs->bl.opt_transfer,
+ bs->bl.request_alignment));
}
/**
@@ -1145,8 +1147,7 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
int64_t cluster_bytes;
size_t skip_bytes;
int ret;
- int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
- BDRV_REQUEST_MAX_BYTES);
+ int max_transfer = MIN(bs->bl.max_transfer, BDRV_REQUEST_MAX_BYTES);
unsigned int progress = 0;
if (!drv) {
@@ -1286,8 +1287,7 @@ static int coroutine_fn bdrv_aligned_preadv(BdrvChild *child,
assert((bytes & (align - 1)) == 0);
assert(!qiov || bytes == qiov->size);
assert((bs->open_flags & BDRV_O_NO_IO) == 0);
- max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
- align);
+ max_transfer = QEMU_ALIGN_DOWN(MIN(bs->bl.max_transfer, INT_MAX), align);
/* TODO: We would need a per-BDS .supported_read_flags and
* potential fallback support, if we ever implement any read flags
@@ -1460,7 +1460,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
bs->bl.request_alignment);
- int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer, MAX_BOUNCE_BUFFER);
+ int max_transfer = MIN(bs->bl.max_transfer, MAX_BOUNCE_BUFFER);
if (!drv) {
return -ENOMEDIUM;
@@ -1668,8 +1668,7 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
assert((offset & (align - 1)) == 0);
assert((bytes & (align - 1)) == 0);
assert(!qiov || bytes == qiov->size);
- max_transfer = QEMU_ALIGN_DOWN(MIN_NON_ZERO(bs->bl.max_transfer, INT_MAX),
- align);
+ max_transfer = QEMU_ALIGN_DOWN(MIN(bs->bl.max_transfer, INT_MAX), align);
ret = bdrv_co_write_req_prepare(child, offset, bytes, req, flags);
--
2.17.2
next prev parent reply other threads:[~2018-11-15 2:04 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-15 2:03 [Qemu-devel] [PATCH v2 for-4.0 00/13] block: byte-based blocking read/write Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 01/13] qcow2: Prefer byte-based calls into bs->file Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 02/13] vdi: Switch to byte-based calls Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 03/13] vvfat: " Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 04/13] block: Removed unused sector-based blocking I/O Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 05/13] block: Switch to 64-bit bl.max_transfer Eric Blake
2018-11-15 15:45 ` Kevin Wolf
2018-11-15 16:28 ` Eric Blake
2018-11-16 15:32 ` Kevin Wolf
2018-11-16 15:54 ` Eric Blake
2018-11-16 16:32 ` Kevin Wolf
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 06/13] blkdebug: Audit for read/write 64-bit cleanness Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 07/13] blklogwrites: " Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 08/13] crypto: " Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 09/13] RFC: crypto: Rely on block layer for fragmentation Eric Blake
2018-11-15 16:05 ` Kevin Wolf
2018-11-15 18:31 ` Daniel P. Berrangé
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 10/13] file-posix: Audit for read/write 64-bit cleanness Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 11/13] qcow2: " Eric Blake
2018-11-15 2:03 ` [Qemu-devel] [PATCH v2 12/13] block: Document need for audit of " Eric Blake
2018-11-15 2:03 ` Eric Blake [this message]
2018-11-15 16:24 ` [Qemu-devel] [PATCH v2 13/13] block: Enforce non-zero bl.max_transfer Kevin Wolf
2018-11-15 16:34 ` Eric Blake
2018-11-15 9:02 ` [Qemu-devel] [PATCH v2 for-4.0 00/13] block: byte-based blocking read/write no-reply
2018-11-15 13:09 ` Eric Blake
2018-11-15 9:04 ` no-reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20181115020334.1189829-14-eblake@redhat.com \
--to=eblake@redhat.com \
--cc=famz@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).