From: Eric Blake <eblake@redhat.com>
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, kwolf@redhat.com, famz@redhat.com,
stefanha@redhat.com, Max Reitz <mreitz@redhat.com>,
Ronnie Sahlberg <ronniesahlberg@gmail.com>,
Paolo Bonzini <pbonzini@redhat.com>, Peter Lieven <pl@kamp.de>,
"Michael S. Tsirkin" <mst@redhat.com>
Subject: [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based
Date: Thu, 23 Jun 2016 16:37:19 -0600 [thread overview]
Message-ID: <1466721446-27737-16-git-send-email-eblake@redhat.com> (raw)
In-Reply-To: <1466721446-27737-1-git-send-email-eblake@redhat.com>
Sector-based limits are awkward to think about; in our on-going
quest to move to byte-based interfaces, convert max_transfer_length
and opt_transfer_length. Rename them (dropping the _length suffix)
so that the compiler will help us catch the change in semantics
across any rebased code, and improve the documentation. Use unsigned
values, so that we don't have to worry about negative values and
so that bit-twiddling is easier; however, we are still constrained
by 2^31 of signed int in most APIs.
When a value comes from an external source (iscsi and raw-posix),
sanitize the results to ensure that opt_transfer is a power of 2.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
v3: rebase to scsi limits fix, sanitize rather than assert on external
values
v2: Hoist unrelated cleanups earlier, use name 'max_transfer' in more
places, tweak iscsi calculations
---
include/block/block_int.h | 11 +++++++----
include/sysemu/block-backend.h | 2 +-
block/block-backend.c | 10 +++++-----
block/io.c | 23 +++++++++++------------
block/iscsi.c | 23 +++++++++++++++--------
block/nbd.c | 2 +-
block/raw-posix.c | 4 ++--
hw/block/virtio-blk.c | 9 +++++----
hw/scsi/scsi-generic.c | 12 ++++++------
qemu-img.c | 8 ++++----
10 files changed, 57 insertions(+), 47 deletions(-)
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 2057156..7d2b152 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -338,11 +338,14 @@ typedef struct BlockLimits {
* power of 2, and less than max_pwrite_zeroes if that is set */
uint32_t pwrite_zeroes_alignment;
- /* optimal transfer length in sectors */
- int opt_transfer_length;
+ /* optimal transfer length in bytes (must be power of 2, and
+ * multiple of bs->request_alignment), or 0 if no preferred size */
+ uint32_t opt_transfer;
- /* maximal transfer length in sectors */
- int max_transfer_length;
+ /* maximal transfer length in bytes (need not be power of 2, but
+ * should be multiple of opt_transfer), or 0 for no 32-bit limit.
+ * For now, anything larger than INT_MAX is clamped down. */
+ uint32_t max_transfer;
/* memory alignment so that no bounce buffer is needed */
size_t min_mem_alignment;
diff --git a/include/sysemu/block-backend.h b/include/sysemu/block-backend.h
index c04af8e..2469a1c 100644
--- a/include/sysemu/block-backend.h
+++ b/include/sysemu/block-backend.h
@@ -170,7 +170,7 @@ bool blk_is_available(BlockBackend *blk);
void blk_lock_medium(BlockBackend *blk, bool locked);
void blk_eject(BlockBackend *blk, bool eject_flag);
int blk_get_flags(BlockBackend *blk);
-int blk_get_max_transfer_length(BlockBackend *blk);
+uint32_t blk_get_max_transfer(BlockBackend *blk);
int blk_get_max_iov(BlockBackend *blk);
void blk_set_guest_block_size(BlockBackend *blk, int align);
void *blk_try_blockalign(BlockBackend *blk, size_t size);
diff --git a/block/block-backend.c b/block/block-backend.c
index 1fb070b..e042544 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -1303,16 +1303,16 @@ int blk_get_flags(BlockBackend *blk)
}
}
-/* Returns the maximum transfer length, in sectors; guaranteed nonzero */
-int blk_get_max_transfer_length(BlockBackend *blk)
+/* Returns the maximum transfer length, in bytes; guaranteed nonzero */
+uint32_t blk_get_max_transfer(BlockBackend *blk)
{
BlockDriverState *bs = blk_bs(blk);
- int max = 0;
+ uint32_t max = 0;
if (bs) {
- max = bs->bl.max_transfer_length;
+ max = bs->bl.max_transfer;
}
- return MIN_NON_ZERO(max, BDRV_REQUEST_MAX_SECTORS);
+ return MIN_NON_ZERO(max, INT_MAX);
}
int blk_get_max_iov(BlockBackend *blk)
diff --git a/block/io.c b/block/io.c
index 323e822..8ca9d43 100644
--- a/block/io.c
+++ b/block/io.c
@@ -88,8 +88,8 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
error_propagate(errp, local_err);
return;
}
- bs->bl.opt_transfer_length = bs->file->bs->bl.opt_transfer_length;
- bs->bl.max_transfer_length = bs->file->bs->bl.max_transfer_length;
+ bs->bl.opt_transfer = bs->file->bs->bl.opt_transfer;
+ bs->bl.max_transfer = bs->file->bs->bl.max_transfer;
bs->bl.min_mem_alignment = bs->file->bs->bl.min_mem_alignment;
bs->bl.opt_mem_alignment = bs->file->bs->bl.opt_mem_alignment;
bs->bl.max_iov = bs->file->bs->bl.max_iov;
@@ -107,12 +107,10 @@ void bdrv_refresh_limits(BlockDriverState *bs, Error **errp)
error_propagate(errp, local_err);
return;
}
- bs->bl.opt_transfer_length =
- MAX(bs->bl.opt_transfer_length,
- bs->backing->bs->bl.opt_transfer_length);
- bs->bl.max_transfer_length =
- MIN_NON_ZERO(bs->bl.max_transfer_length,
- bs->backing->bs->bl.max_transfer_length);
+ bs->bl.opt_transfer = MAX(bs->bl.opt_transfer,
+ bs->backing->bs->bl.opt_transfer);
+ bs->bl.max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
+ bs->backing->bs->bl.max_transfer);
bs->bl.opt_mem_alignment =
MAX(bs->bl.opt_mem_alignment,
bs->backing->bs->bl.opt_mem_alignment);
@@ -1156,7 +1154,8 @@ int coroutine_fn bdrv_co_readv(BlockDriverState *bs, int64_t sector_num,
return bdrv_co_do_readv(bs, sector_num, nb_sectors, qiov, 0);
}
-#define MAX_WRITE_ZEROES_BOUNCE_BUFFER 32768
+/* Maximum buffer for write zeroes fallback, in bytes */
+#define MAX_WRITE_ZEROES_BOUNCE_BUFFER (32768 << BDRV_SECTOR_BITS)
static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count, BdrvRequestFlags flags)
@@ -1214,7 +1213,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
if (ret == -ENOTSUP) {
/* Fall back to bounce buffer if write zeroes is unsupported */
- int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer_length,
+ int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
MAX_WRITE_ZEROES_BOUNCE_BUFFER);
BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
@@ -1225,7 +1224,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
write_flags &= ~BDRV_REQ_FUA;
need_flush = true;
}
- num = MIN(num, max_xfer_len << BDRV_SECTOR_BITS);
+ num = MIN(num, max_transfer);
iov.iov_len = num;
if (iov.iov_base == NULL) {
iov.iov_base = qemu_try_blockalign(bs, num);
@@ -1242,7 +1241,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
/* Keep bounce buffer around if it is big enough for all
* all future requests.
*/
- if (num < max_xfer_len << BDRV_SECTOR_BITS) {
+ if (num < max_transfer) {
qemu_vfree(iov.iov_base);
iov.iov_base = NULL;
}
diff --git a/block/iscsi.c b/block/iscsi.c
index b4661c9..368687d 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -473,9 +473,10 @@ iscsi_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
return -EINVAL;
}
- if (bs->bl.max_transfer_length && nb_sectors > bs->bl.max_transfer_length) {
+ if (bs->bl.max_transfer &&
+ nb_sectors << BDRV_SECTOR_BITS > bs->bl.max_transfer) {
error_report("iSCSI Error: Write of %d sectors exceeds max_xfer_len "
- "of %d sectors", nb_sectors, bs->bl.max_transfer_length);
+ "of %" PRIu32 " bytes", nb_sectors, bs->bl.max_transfer);
return -EINVAL;
}
@@ -650,9 +651,10 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
return -EINVAL;
}
- if (bs->bl.max_transfer_length && nb_sectors > bs->bl.max_transfer_length) {
+ if (bs->bl.max_transfer &&
+ nb_sectors << BDRV_SECTOR_BITS > bs->bl.max_transfer) {
error_report("iSCSI Error: Read of %d sectors exceeds max_xfer_len "
- "of %d sectors", nb_sectors, bs->bl.max_transfer_length);
+ "of %" PRIu32 " bytes", nb_sectors, bs->bl.max_transfer);
return -EINVAL;
}
@@ -1707,7 +1709,7 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
* iscsi_open(): iscsi targets don't change their limits. */
IscsiLun *iscsilun = bs->opaque;
- uint32_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;
+ uint64_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;
bs->request_alignment = iscsilun->block_size;
@@ -1715,7 +1717,9 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
max_xfer_len = MIN(max_xfer_len, iscsilun->bl.max_xfer_len);
}
- bs->bl.max_transfer_length = sector_limits_lun2qemu(max_xfer_len, iscsilun);
+ if (max_xfer_len * iscsilun->block_size < INT_MAX) {
+ bs->bl.max_transfer = max_xfer_len * iscsilun->block_size;
+ }
if (iscsilun->lbp.lbpu) {
if (iscsilun->bl.max_unmap < 0xffffffff) {
@@ -1738,8 +1742,11 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
} else {
bs->bl.pwrite_zeroes_alignment = iscsilun->block_size;
}
- bs->bl.opt_transfer_length =
- sector_limits_lun2qemu(iscsilun->bl.opt_xfer_len, iscsilun);
+ if (iscsilun->bl.opt_xfer_len &&
+ iscsilun->bl.opt_xfer_len < INT_MAX / iscsilun->block_size) {
+ bs->bl.opt_transfer = pow2floor(iscsilun->bl.opt_xfer_len *
+ iscsilun->block_size);
+ }
}
/* Note that this will not re-establish a connection with an iSCSI target - it
diff --git a/block/nbd.c b/block/nbd.c
index bf67c8a..f5511ea 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -363,7 +363,7 @@ static int nbd_co_flush(BlockDriverState *bs)
static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl.max_discard = NBD_MAX_SECTORS;
- bs->bl.max_transfer_length = NBD_MAX_SECTORS;
+ bs->bl.max_transfer = NBD_MAX_BUFFER_SIZE;
}
static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
diff --git a/block/raw-posix.c b/block/raw-posix.c
index bef7a67..8da2f94 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -745,8 +745,8 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
if (!fstat(s->fd, &st)) {
if (S_ISBLK(st.st_mode)) {
int ret = hdev_get_max_transfer_length(s->fd);
- if (ret >= 0) {
- bs->bl.max_transfer_length = ret;
+ if (ret > 0 && ret <= BDRV_REQUEST_MAX_SECTORS) {
+ bs->bl.max_transfer = pow2floor(ret << BDRV_SECTOR_BITS);
}
}
}
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 1d2792e..5000c27 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -382,7 +382,7 @@ static int multireq_compare(const void *a, const void *b)
void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
{
int i = 0, start = 0, num_reqs = 0, niov = 0, nb_sectors = 0;
- int max_xfer_len;
+ uint32_t max_transfer;
int64_t sector_num = 0;
if (mrb->num_reqs == 1) {
@@ -391,7 +391,7 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
return;
}
- max_xfer_len = blk_get_max_transfer_length(mrb->reqs[0]->dev->blk);
+ max_transfer = blk_get_max_transfer(mrb->reqs[0]->dev->blk);
qsort(mrb->reqs, mrb->num_reqs, sizeof(*mrb->reqs),
&multireq_compare);
@@ -407,8 +407,9 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
*/
if (sector_num + nb_sectors != req->sector_num ||
niov > blk_get_max_iov(blk) - req->qiov.niov ||
- req->qiov.size / BDRV_SECTOR_SIZE > max_xfer_len ||
- nb_sectors > max_xfer_len - req->qiov.size / BDRV_SECTOR_SIZE) {
+ req->qiov.size > max_transfer ||
+ nb_sectors > (max_transfer -
+ req->qiov.size) / BDRV_SECTOR_SIZE) {
submit_requests(blk, mrb, start, num_reqs, niov);
num_reqs = 0;
}
diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 0cb8568..7a588a7 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -225,14 +225,14 @@ static void scsi_read_complete(void * opaque, int ret)
if (s->type == TYPE_DISK &&
r->req.cmd.buf[0] == INQUIRY &&
r->req.cmd.buf[2] == 0xb0) {
- uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk) /
- (s->blocksize / BDRV_SECTOR_SIZE);
+ uint32_t max_transfer =
+ blk_get_max_transfer(s->conf.blk) / s->blocksize;
- assert(max_xfer_len);
- stl_be_p(&r->buf[8], max_xfer_len);
+ assert(max_transfer);
+ stl_be_p(&r->buf[8], max_transfer);
/* Also take care of the opt xfer len. */
- if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
- stl_be_p(&r->buf[12], max_xfer_len);
+ if (ldl_be_p(&r->buf[12]) > max_transfer) {
+ stl_be_p(&r->buf[12], max_transfer);
}
}
scsi_req_data(&r->req, len);
diff --git a/qemu-img.c b/qemu-img.c
index 14e2661..cf9876d 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2078,13 +2078,13 @@ static int img_convert(int argc, char **argv)
}
out_bs = blk_bs(out_blk);
- /* increase bufsectors from the default 4096 (2M) if opt_transfer_length
+ /* increase bufsectors from the default 4096 (2M) if opt_transfer
* or discard_alignment of the out_bs is greater. Limit to 32768 (16MB)
* as maximum. */
bufsectors = MIN(32768,
- MAX(bufsectors, MAX(out_bs->bl.opt_transfer_length,
- out_bs->bl.discard_alignment))
- );
+ MAX(bufsectors,
+ MAX(out_bs->bl.opt_transfer >> BDRV_SECTOR_BITS,
+ out_bs->bl.discard_alignment)));
if (skip_create) {
int64_t output_sectors = blk_nb_sectors(out_blk);
--
2.5.5
next prev parent reply other threads:[~2016-06-23 22:37 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-23 22:37 [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 01/22] block: Tighter assertions on bdrv_aligned_pwritev() Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 02/22] block: Document supported flags during bdrv_aligned_preadv() Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 03/22] block: Fix harmless off-by-one in bdrv_aligned_preadv() Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 04/22] nbd: Allow larger requests Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 05/22] nbd: Advertise realistic limits to block layer Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 06/22] iscsi: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 07/22] scsi: Advertise limits by blocksize, not 512 Eric Blake
2016-06-24 5:22 ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 08/22] block: Give nonzero result to blk_get_max_transfer_length() Eric Blake
2016-06-24 5:24 ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 09/22] blkdebug: Set request_alignment during .bdrv_refresh_limits() Eric Blake
2016-06-24 5:42 ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 10/22] iscsi: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 11/22] qcow2: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 12/22] raw-win32: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 13/22] block: " Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 14/22] block: Set default request_alignment during bdrv_refresh_limits() Eric Blake
2016-06-23 22:37 ` Eric Blake [this message]
2016-06-24 6:06 ` [Qemu-devel] [PATCH v3 15/22] block: Switch transfer length bounds to byte-based Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 16/22] block: Wording tweaks to write zeroes limits Eric Blake
2016-06-24 6:12 ` Fam Zheng
2016-06-24 14:10 ` Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 17/22] block: Switch discard length bounds to byte-based Eric Blake
2016-06-24 6:43 ` Fam Zheng
2016-06-24 14:15 ` Eric Blake
2016-06-24 14:29 ` Kevin Wolf
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 18/22] block: Drop raw_refresh_limits() Eric Blake
2016-06-24 6:44 ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 19/22] block: Split bdrv_merge_limits() from bdrv_refresh_limits() Eric Blake
2016-06-24 6:48 ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 20/22] block: Move request_alignment into BlockLimit Eric Blake
2016-06-24 7:07 ` Fam Zheng
2016-06-24 13:45 ` Kevin Wolf
2016-06-24 14:17 ` Eric Blake
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 21/22] block: Fix error message style Eric Blake
2016-06-24 7:09 ` Fam Zheng
2016-06-23 22:37 ` [Qemu-devel] [PATCH v3 22/22] block: Use bool as appropriate for BDS members Eric Blake
2016-06-24 7:12 ` Fam Zheng
2016-06-24 14:32 ` [Qemu-devel] [PATCH v3 00/22] Byte-based block limits Kevin Wolf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1466721446-27737-16-git-send-email-eblake@redhat.com \
--to=eblake@redhat.com \
--cc=famz@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=mst@redhat.com \
--cc=pbonzini@redhat.com \
--cc=pl@kamp.de \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=ronniesahlberg@gmail.com \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).