qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org,
	Stefan Hajnoczi <stefanha@redhat.com>,
	Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 09/13] qed: Convert to bdrv_co_pwrite_zeroes()
Date: Wed, 25 May 2016 08:28:41 -0600	[thread overview]
Message-ID: <5745B699.1040901@redhat.com> (raw)
In-Reply-To: <20160525140746.GL4815@noname.redhat.com>

[-- Attachment #1: Type: text/plain, Size: 4086 bytes --]

On 05/25/2016 08:07 AM, Kevin Wolf wrote:
> Am 25.05.2016 um 00:25 hat Eric Blake geschrieben:
>> Another step on our continuing quest to switch to byte-based
>> interfaces.
>>
>> Kill an abuse of the comma operator while at it (fortunately,
>> the semantics were still right).
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>  block/qed.c | 25 +++++++++++++------------
>>  1 file changed, 13 insertions(+), 12 deletions(-)
>>

>> @@ -1443,10 +1443,10 @@ static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
>>
>>      /* Refuse if there are untouched backing file sectors */

The comment wasn't very helpful, so I may rewort it, too
(s/untouched/unaligned/, or something like that)

>>      if (bs->backing) {
>> -        if (qed_offset_into_cluster(s, sector_num * BDRV_SECTOR_SIZE) != 0) {
>> +        if (qed_offset_into_cluster(s, offset) != 0) {
>>              return -ENOTSUP;
>>          }
>> -        if (qed_offset_into_cluster(s, nb_sectors * BDRV_SECTOR_SIZE) != 0) {
>> +        if (qed_offset_into_cluster(s, count) != 0) {
>>              return -ENOTSUP;
>>          }
>>      }
> 
> Unaligned requests are only emulated if there is no backing file...
> 
>> @@ -1454,12 +1454,13 @@ static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
>>      /* Zero writes start without an I/O buffer.  If a buffer becomes necessary
>>       * then it will be allocated during request processing.
>>       */
>> -    iov.iov_base = NULL,
>> -    iov.iov_len  = nb_sectors * BDRV_SECTOR_SIZE,
>> +    iov.iov_base = NULL;
>> +    iov.iov_len = count;
>>
>>      qemu_iovec_init_external(&qiov, &iov, 1);
>> -    blockacb = qed_aio_setup(bs, sector_num, &qiov, nb_sectors,
>> -                             qed_co_write_zeroes_cb, &cb,
>> +    blockacb = qed_aio_setup(bs, offset >> BDRV_SECTOR_BITS, &qiov,
>> +                             count >> BDRV_SECTOR_BITS,
> 
> ...so offset and count can still be unaligned here and we end up zeroing
> out the wrong part of the sector. I guess we need to return -ENOTSUP for
> all sub-sector requests, even without a backing file.

Hmm. Wouldn't it be nicer if we could guarantee that blk_pwrite_zeroes()
will never call bdrv_co_pwrite_zeroes() with less than
request_alignment?  That is, if the block layer takes care of
read-modify-write for any unaligned byte offset less than
request_alignment, then the driver only has to worry about sector
alignment.  Except qed.c doesn't seem to set request_alignment, but is
just relying on io.c currently setting it to MAX(BDRV_SECTOR_SIZE,
bs->bl.request_alignment) everywhere. (And the fact that
request_alignment is a sibling rather than a member to BlockLimits bl is
awkward.)

Maybe we want three limits in BlockLimits, rather than two: the current
max_pwrite_zeroes does a good job at saying how small blk_pwrite_zeroes
must fragment large requests, and pwrite_zeroes_alignment does a good
job at saying how large a request must be to potentially punch a hole,
but at least in the case of qcow2, where we want to optimize a partial
write to potentially zeroing an entire cluster, we still want to limit
things to sector boundaries when checking for whether the rest of the
cluster already reads as zeroes, whether or not we also want to support
request_alignment of 1 instead of 512.

There are other drivers that I touched in this series that were relying
on the fact that the block layer currently guarantees sector alignment,
and maybe they should be setting request_alignment, or maybe we want to
add yet another BlockLimit member.  So even if we want normal read/write
to allow request_alignment of 1 in the case where we don't need the
block layer to do a read-modify-write, I'm still wondering whether we
want the write_zeroes engine to have a different minimum alignment and
ALWAYS hand off to normal read-modify-write for anything smaller.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

  reply	other threads:[~2016-05-25 14:28 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-24 22:25 [Qemu-devel] [PATCH 00/13] Kill sector-based write_zeroes Eric Blake
2016-05-24 22:25 ` [Qemu-devel] [PATCH 01/13] block: Rename blk_write_zeroes() Eric Blake
2016-05-24 22:25 ` [Qemu-devel] [PATCH 02/13] block: Track write zero limits in bytes Eric Blake
2016-05-25 10:30   ` Kevin Wolf
2016-05-25 11:21     ` Eric Blake
2016-05-24 22:25 ` [Qemu-devel] [PATCH 03/13] block: Add .bdrv_co_pwrite_zeroes() Eric Blake
2016-05-25 13:02   ` Kevin Wolf
2016-05-24 22:25 ` [Qemu-devel] [PATCH 04/13] block: Switch bdrv_write_zeroes() to byte interface Eric Blake
2016-05-25 13:18   ` Kevin Wolf
2016-05-24 22:25 ` [Qemu-devel] [PATCH 05/13] iscsi: Convert to bdrv_co_pwrite_zeroes() Eric Blake
2016-05-25 13:34   ` Kevin Wolf
2016-06-01 16:33     ` Eric Blake
2016-05-24 22:25 ` [Qemu-devel] [PATCH 06/13] qcow2: " Eric Blake
2016-05-25 13:53   ` Kevin Wolf
2016-05-24 22:25 ` [Qemu-devel] [PATCH 07/13] blkreplay: " Eric Blake
2016-05-25 13:54   ` Kevin Wolf
2016-05-24 22:25 ` [Qemu-devel] [PATCH 08/13] gluster: " Eric Blake
2016-05-25 13:57   ` Kevin Wolf
2016-05-24 22:25 ` [Qemu-devel] [PATCH 09/13] qed: " Eric Blake
2016-05-25 14:07   ` Kevin Wolf
2016-05-25 14:28     ` Eric Blake [this message]
2016-05-25 15:06       ` Kevin Wolf
2016-05-24 22:25 ` [Qemu-devel] [PATCH 10/13] raw-posix: " Eric Blake
2016-05-25 14:20   ` Kevin Wolf
2016-05-24 22:25 ` [Qemu-devel] [PATCH 11/13] raw_bsd: " Eric Blake
2016-05-25 14:20   ` Kevin Wolf
2016-05-24 22:25 ` [Qemu-devel] [PATCH 12/13] vmdk: " Eric Blake
2016-05-25 14:23   ` Kevin Wolf
2016-05-25 14:35     ` Eric Blake
2016-05-24 22:25 ` [Qemu-devel] [PATCH 13/13] block: Kill bdrv_co_write_zeroes() Eric Blake
2016-05-25 14:24   ` Kevin Wolf
2016-05-25 11:02 ` [Qemu-devel] [PATCH 00/13] Kill sector-based write_zeroes Kevin Wolf
2016-06-01 15:35 ` Kevin Wolf
2016-06-01 15:38   ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5745B699.1040901@redhat.com \
    --to=eblake@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).