From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:40967) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1b5Znp-0000WT-5r for qemu-devel@nongnu.org; Wed, 25 May 2016 10:28:54 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1b5Znm-0004q4-TT for qemu-devel@nongnu.org; Wed, 25 May 2016 10:28:52 -0400 References: <1464128732-12667-1-git-send-email-eblake@redhat.com> <1464128732-12667-10-git-send-email-eblake@redhat.com> <20160525140746.GL4815@noname.redhat.com> From: Eric Blake Message-ID: <5745B699.1040901@redhat.com> Date: Wed, 25 May 2016 08:28:41 -0600 MIME-Version: 1.0 In-Reply-To: <20160525140746.GL4815@noname.redhat.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="Td6FNJONFPbxlkMxcRM9kd2FnWPhae2Lf" Subject: Re: [Qemu-devel] [PATCH 09/13] qed: Convert to bdrv_co_pwrite_zeroes() List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Kevin Wolf Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, Stefan Hajnoczi , Max Reitz This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --Td6FNJONFPbxlkMxcRM9kd2FnWPhae2Lf Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On 05/25/2016 08:07 AM, Kevin Wolf wrote: > Am 25.05.2016 um 00:25 hat Eric Blake geschrieben: >> Another step on our continuing quest to switch to byte-based >> interfaces. >> >> Kill an abuse of the comma operator while at it (fortunately, >> the semantics were still right). >> >> Signed-off-by: Eric Blake >> --- >> block/qed.c | 25 +++++++++++++------------ >> 1 file changed, 13 insertions(+), 12 deletions(-) >> >> @@ -1443,10 +1443,10 @@ static int coroutine_fn bdrv_qed_co_write_zero= es(BlockDriverState *bs, >> >> /* Refuse if there are untouched backing file sectors */ The comment wasn't very helpful, so I may rewort it, too (s/untouched/unaligned/, or something like that) >> if (bs->backing) { >> - if (qed_offset_into_cluster(s, sector_num * BDRV_SECTOR_SIZE)= !=3D 0) { >> + if (qed_offset_into_cluster(s, offset) !=3D 0) { >> return -ENOTSUP; >> } >> - if (qed_offset_into_cluster(s, nb_sectors * BDRV_SECTOR_SIZE)= !=3D 0) { >> + if (qed_offset_into_cluster(s, count) !=3D 0) { >> return -ENOTSUP; >> } >> } >=20 > Unaligned requests are only emulated if there is no backing file... >=20 >> @@ -1454,12 +1454,13 @@ static int coroutine_fn bdrv_qed_co_write_zero= es(BlockDriverState *bs, >> /* Zero writes start without an I/O buffer. If a buffer becomes = necessary >> * then it will be allocated during request processing. >> */ >> - iov.iov_base =3D NULL, >> - iov.iov_len =3D nb_sectors * BDRV_SECTOR_SIZE, >> + iov.iov_base =3D NULL; >> + iov.iov_len =3D count; >> >> qemu_iovec_init_external(&qiov, &iov, 1); >> - blockacb =3D qed_aio_setup(bs, sector_num, &qiov, nb_sectors, >> - qed_co_write_zeroes_cb, &cb, >> + blockacb =3D qed_aio_setup(bs, offset >> BDRV_SECTOR_BITS, &qiov,= >> + count >> BDRV_SECTOR_BITS, >=20 > ...so offset and count can still be unaligned here and we end up zeroin= g > out the wrong part of the sector. I guess we need to return -ENOTSUP fo= r > all sub-sector requests, even without a backing file. Hmm. Wouldn't it be nicer if we could guarantee that blk_pwrite_zeroes() will never call bdrv_co_pwrite_zeroes() with less than request_alignment? That is, if the block layer takes care of read-modify-write for any unaligned byte offset less than request_alignment, then the driver only has to worry about sector alignment. Except qed.c doesn't seem to set request_alignment, but is just relying on io.c currently setting it to MAX(BDRV_SECTOR_SIZE, bs->bl.request_alignment) everywhere. (And the fact that request_alignment is a sibling rather than a member to BlockLimits bl is awkward.) Maybe we want three limits in BlockLimits, rather than two: the current max_pwrite_zeroes does a good job at saying how small blk_pwrite_zeroes must fragment large requests, and pwrite_zeroes_alignment does a good job at saying how large a request must be to potentially punch a hole, but at least in the case of qcow2, where we want to optimize a partial write to potentially zeroing an entire cluster, we still want to limit things to sector boundaries when checking for whether the rest of the cluster already reads as zeroes, whether or not we also want to support request_alignment of 1 instead of 512. There are other drivers that I touched in this series that were relying on the fact that the block layer currently guarantees sector alignment, and maybe they should be setting request_alignment, or maybe we want to add yet another BlockLimit member. So even if we want normal read/write to allow request_alignment of 1 in the case where we don't need the block layer to do a read-modify-write, I'm still wondering whether we want the write_zeroes engine to have a different minimum alignment and ALWAYS hand off to normal read-modify-write for anything smaller. --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --Td6FNJONFPbxlkMxcRM9kd2FnWPhae2Lf Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJXRbaZAAoJEKeha0olJ0Nqd28H/iHzpDb+DayV2IDgRMpGc85c RuSp+vUvTA9p9BqQtlZaf6MuKOPEwzQ4O5M0r+ophp1fDbOajwpXqPvPMjabgbiN NOXgHBnzpEMRF2jHRE3CwS+gjjDthmHtDKmRGhiutbLDMijM9fLbFVUyiSlTQcs0 ubvehgzkpGBNnC0xzr8mvsp3CeAiWo0vnW1rSUIVMwvboN9qea0/2s/M+vtCuFwX f968SJ8Qn6KtxesHaqcvDwgiAbpP4ys4I0XVCnqU0M/0Tkqd+RBjwZ1DjPJDCyB6 k0iqnvh00oIWFh87XBKJYZNd2g5RAAffsNHNarRQ09HEHFthssHOkdQgN3Tn56Y= =RVks -----END PGP SIGNATURE----- --Td6FNJONFPbxlkMxcRM9kd2FnWPhae2Lf--