Re: [Qemu-devel] [PATCH 3/5] block: Switch transfer length bounds to byte-based

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Eric Blake <eblake@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, mreitz@redhat.com, qemu-block@nongnu.org,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Fam Zheng <famz@redhat.com>,
	Ronnie Sahlberg <ronniesahlberg@gmail.com>,
	Paolo Bonzini <pbonzini@redhat.com>, Peter Lieven <pl@kamp.de>,
	"Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 3/5] block: Switch transfer length bounds to byte-based
Date: Sat, 11 Jun 2016 16:06:00 -0600	[thread overview]
Message-ID: <575C8B48.7030909@redhat.com> (raw)
In-Reply-To: <20160607124522.GH4684@noname.str.redhat.com>

[-- Attachment #1: Type: text/plain, Size: 6887 bytes --]

On 06/07/2016 06:45 AM, Kevin Wolf wrote:
> Am 03.06.2016 um 19:03 hat Eric Blake geschrieben:
>> Sector-based limits are awkward to think about; in our on-going
>> quest to move to byte-based interfaces, convert max_transfer_length
>> and opt_transfer_length.  Rename them (dropping the _length suffix)
>> so that the compiler will help us catch the change in semantics
>> across any rebased code.  Improve the documentation, and guarantee
>> that blk_get_max_transfer() returns a non-zero value.  Use unsigned
>> values, so that we don't have to worry about negative values and
>> so that bit-twiddling is easier; however, we are still constrained
>> by 2^31 of signed int in most APIs.
>>
>> Of note: the iscsi calculations use a 64-bit number internally,
>> but the final result is guaranteed to fit in 32 bits.  NBD is
>> fixed to advertise the maximum length of 32M that the qemu and
>> kernel servers enforce, rather than a number of sectors that
>> would overflow int when converted back to bytes.  scsi-generic
>> now advertises a maximum always, rather than just when the
>> underlying device advertised a maximum.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
> 
>> @@ -1177,7 +1176,7 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
>>
>>          if (ret == -ENOTSUP) {
>>              /* Fall back to bounce buffer if write zeroes is unsupported */
>> -            int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer_length,
>> +            int max_xfer_len = MIN_NON_ZERO(bs->bl.max_transfer,
>>                                              MAX_WRITE_ZEROES_BOUNCE_BUFFER);
> 
> You could consider renaming the variable to max_transfer to keep it
> consistent with the BlockLimits field name and to make it even more
> obvious that you converted all uses.

Good idea.

>> @@ -1706,13 +1708,14 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
>>       * iscsi_open(): iscsi targets don't change their limits. */
>>
>>      IscsiLun *iscsilun = bs->opaque;
>> -    uint32_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;
>> +    uint64_t max_xfer_len = iscsilun->use_16_for_rw ? 0xffffffff : 0xffff;
>>
>>      if (iscsilun->bl.max_xfer_len) {
>>          max_xfer_len = MIN(max_xfer_len, iscsilun->bl.max_xfer_len);
>>      }
>>
>> -    bs->bl.max_transfer_length = sector_limits_lun2qemu(max_xfer_len, iscsilun);
>> +    bs->bl.max_transfer = MIN(BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS,
>> +                              max_xfer_len * iscsilun->block_size);
> 
> Why don't you simply use 0 when you defined it as no limit?
> 
> If we make some drivers put 0 there and others INT_MAX, chances are that
> we'll introduce places where we fail to handle 0 correctly.

So if I'm understanding correctly, we want something like:

if (max_xfer_len * iscsilun->block_size > INT_MAX) {
    bs->bl.max_transfer = 0;
} else {
    bs->bl.max_transfer = max_xfer_len * iscsilun->block_size;
}

and make sure that 0 continues to mean 'no signed 32-bit limit'.

>> +++ b/block/nbd.c
>> @@ -363,7 +363,7 @@ static int nbd_co_flush(BlockDriverState *bs)
>>  static void nbd_refresh_limits(BlockDriverState *bs, Error **errp)
>>  {
>>      bs->bl.max_discard = UINT32_MAX >> BDRV_SECTOR_BITS;
>> -    bs->bl.max_transfer_length = UINT32_MAX >> BDRV_SECTOR_BITS;
>> +    bs->bl.max_transfer = NBD_MAX_BUFFER_SIZE;
>>  }
> 
> This introduces a semantic difference and should therefore be a separate
> patch. Here, it should become UINT32_MAX through mechanical conversion.
> 
> Or actually, it can't because that's not a power of two. So maybe have
> the NBD patch first...

Will split.  I don't see any problem with a max_transfer that is not a
power of two, as long as it is a multiple of request_alignment (iscsi is
a case in point - the max_transfer can be 0xffff blocks).

> 
>>  static int nbd_co_discard(BlockDriverState *bs, int64_t sector_num,
>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>> index ce2e20f..f3bd5ef 100644
>> --- a/block/raw-posix.c
>> +++ b/block/raw-posix.c
>> @@ -752,7 +752,7 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
>>          if (S_ISBLK(st.st_mode)) {
>>              int ret = hdev_get_max_transfer_length(s->fd);
>>              if (ret >= 0) {
>> -                bs->bl.max_transfer_length = ret;
>> +                bs->bl.max_transfer = ret << BDRV_SECTOR_BITS;
> 
> I assume that we don't care about overflows here because in practice the
> values are very low? Should we check (or maybe better just cap) it
> anyway?

Will add a check.

>> @@ -391,8 +391,9 @@ void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
>>          return;
>>      }
>>
>> -    max_xfer_len = blk_get_max_transfer_length(mrb->reqs[0]->dev->blk);
>> -    max_xfer_len = MIN_NON_ZERO(max_xfer_len, BDRV_REQUEST_MAX_SECTORS);
>> +    max_xfer_len = blk_get_max_transfer(mrb->reqs[0]->dev->blk);
>> +    assert(max_xfer_len &&
>> +           max_xfer_len >> BDRV_SECTOR_BITS <= BDRV_REQUEST_MAX_SECTORS);
> 
> Why can we assert this here? The comment for BlockLimits.max_xfer_len
> doesn't document any maximum. Of course, as I already said above, it
> might not happen in practice, but INT_MAX + 1 is theoretically valid and
> would fail the assertion.

As part of this patch, blk_get_max_transfer() guarantees that its result
is non-zero and no larger than INT_MAX.


>> +++ b/hw/scsi/scsi-generic.c
>> @@ -225,13 +225,13 @@ static void scsi_read_complete(void * opaque, int ret)
>>      if (s->type == TYPE_DISK &&
>>          r->req.cmd.buf[0] == INQUIRY &&
>>          r->req.cmd.buf[2] == 0xb0) {
>> -        uint32_t max_xfer_len = blk_get_max_transfer_length(s->conf.blk);
>> -        if (max_xfer_len) {
>> -            stl_be_p(&r->buf[8], max_xfer_len);
>> -            /* Also take care of the opt xfer len. */
>> -            if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
>> -                stl_be_p(&r->buf[12], max_xfer_len);
>> -            }
>> +        uint32_t max_xfer_len =
>> +            blk_get_max_transfer(s->conf.blk) >> BDRV_SECTOR_BITS;
>> +
>> +        stl_be_p(&r->buf[8], max_xfer_len);
>> +        /* Also take care of the opt xfer len. */
>> +        if (ldl_be_p(&r->buf[12]) > max_xfer_len) {
>> +            stl_be_p(&r->buf[12], max_xfer_len);
>>          }
>>      }
> 
> This is another hidden semantic change. Can we have a separate
> scsi-generic patch first that changes the handling for the
> max_transfer == 0 case and only then make the change in
> blk_get_max_transfer() as pure refactoring without a change in
> behaviour?

Will split.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

next prev parent reply	other threads:[~2016-06-11 22:06 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-03 17:03 [Qemu-devel] [PATCH 0/5] Byte-based block limits Eric Blake
2016-06-03 17:03 ` [Qemu-devel] [PATCH 1/5] block: Tighter assertions on bdrv_aligned_preadv() Eric Blake
2016-06-07 12:15   ` Kevin Wolf
2016-06-03 17:03 ` [Qemu-devel] [PATCH 2/5] block: Honor flags during bdrv_aligned_preadv() Eric Blake
2016-06-07 12:12   ` Kevin Wolf
2016-06-11 21:43     ` Eric Blake
2016-06-03 17:03 ` [Qemu-devel] [PATCH 3/5] block: Switch transfer length bounds to byte-based Eric Blake
2016-06-07 12:45   ` Kevin Wolf
2016-06-11 22:06     ` Eric Blake [this message]
2016-06-14  8:20       ` Kevin Wolf
2016-06-03 17:03 ` [Qemu-devel] [PATCH 4/5] block: Switch discard " Eric Blake
2016-06-07 13:12   ` Kevin Wolf
2016-06-03 17:03 ` [Qemu-devel] [PATCH 5/5] block: Move request_alignment into BlockLimit Eric Blake
2016-06-03 17:49   ` Eric Blake
2016-06-03 21:43     ` Eric Blake
2016-06-07 10:08       ` Kevin Wolf
2016-06-07 11:04         ` Paolo Bonzini
2016-06-07 11:24           ` Kevin Wolf
2016-06-14  4:39         ` Eric Blake
2016-06-14  8:05           ` Kevin Wolf
2016-06-14 14:47             ` Eric Blake
2016-06-14 15:30               ` Kevin Wolf
2016-06-07 13:19   ` Kevin Wolf
2016-06-03 23:06 ` [Qemu-devel] [PATCH 6/5] block: Fix harmless off-by-one in bdrv_aligned_preadv() Eric Blake
2016-06-07 13:47   ` Kevin Wolf
2016-06-03 23:13 ` [Qemu-devel] [PATCH 7/5] block: Refactor zero_beyond_eof hack " Eric Blake
2016-06-07 13:49   ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=575C8B48.7030909@redhat.com \
    --to=eblake@redhat.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=mst@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pl@kamp.de \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=ronniesahlberg@gmail.com \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.