From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51937) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1bPWwU-0001rP-5o for qemu-devel@nongnu.org; Tue, 19 Jul 2016 11:28:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1bPWwR-0005Ue-Ty for qemu-devel@nongnu.org; Tue, 19 Jul 2016 11:28:17 -0400 References: <1468901281-22858-1-git-send-email-eblake@redhat.com> <1468901281-22858-14-git-send-email-eblake@redhat.com> <20160719062131.GG18103@ad.usersys.redhat.com> From: Eric Blake Message-ID: <578E4708.5080308@redhat.com> Date: Tue, 19 Jul 2016 09:28:08 -0600 MIME-Version: 1.0 In-Reply-To: <20160719062131.GG18103@ad.usersys.redhat.com> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="I4PUccVluUDVwopqrMvk3ie6oJlj8LkD0" Subject: Re: [Qemu-devel] [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Fam Zheng Cc: qemu-devel@nongnu.org, Kevin Wolf , pbonzini@redhat.com, qemu-block@nongnu.org, Max Reitz , "nbd-general@lists.sourceforge.net" This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --I4PUccVluUDVwopqrMvk3ie6oJlj8LkD0 From: Eric Blake To: Fam Zheng Cc: qemu-devel@nongnu.org, Kevin Wolf , pbonzini@redhat.com, qemu-block@nongnu.org, Max Reitz , "nbd-general@lists.sourceforge.net" Message-ID: <578E4708.5080308@redhat.com> Subject: Re: [Qemu-devel] [PATCH v5 13/14] nbd: Implement NBD_CMD_WRITE_ZEROES on server References: <1468901281-22858-1-git-send-email-eblake@redhat.com> <1468901281-22858-14-git-send-email-eblake@redhat.com> <20160719062131.GG18103@ad.usersys.redhat.com> In-Reply-To: <20160719062131.GG18103@ad.usersys.redhat.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable [adding nbd list] On 07/19/2016 12:21 AM, Fam Zheng wrote: > On Mon, 07/18 22:08, Eric Blake wrote: >> Upstream NBD protocol recently added the ability to efficiently >> write zeroes without having to send the zeroes over the wire, >> along with a flag to control whether the client wants a hole. >> >> Signed-off-by: Eric Blake >> >> --- >> @@ -1235,6 +1242,37 @@ static void nbd_trip(void *opaque) >> } >> break; >> >> + case NBD_CMD_WRITE_ZEROES: >> + TRACE("Request type is WRITE_ZEROES"); >> + >> + if (exp->nbdflags & NBD_FLAG_READ_ONLY) { >> + TRACE("Server is read-only, return error"); >> + reply.error =3D EROFS; >> + goto error_reply; >> + } >> + >> + TRACE("Writing to device"); >> + >> + flags =3D 0; >> + if (request.flags & NBD_CMD_FLAG_FUA) { >> + flags |=3D BDRV_REQ_FUA; >> + } >> + if (!(request.flags & NBD_CMD_FLAG_NO_HOLE)) { >> + flags |=3D BDRV_REQ_MAY_UNMAP; >=20 > If I'm reading the NBD proto.md correctly, this is not enough if > NBD_CMD_FLAG_NO_HOLE is specified. We probably need to use a zeroed buf= fer with > blk_pwrite, or pass a new flag (BDRV_RED_NO_HOLE) to blk_pwrite_zeroes = to > enforce the bdrv_driver_pwritev() branch in bdrv_co_do_pwrite_zeroes().= If that's how you read it, then my proposal to proto.md needs updating. I specifically wrote the proposal to be as close as possible to the existing qemu semantics, except that we negated the sense of the bit because we wanted to allow the bit value of 0 to allow the server the most flexibility in performing optimizations. The code here (and in 14/14 on the client side) is merely catering to the fact that the bit has opposite sense in the two projects. That is, the rules in qemu are: MAY_UNMAP =3D=3D 0 : must write zeroes MAY_UNMAP =3D=3D 1 : may optimize if supported (where reads will see 0), = but must write zeroes if not while the rules in NBD are: FLAG_NO_HOLE =3D=3D 1 : must write zeroes FLAG_NO_HOLE =3D=3D 0 : may optimize if supported (where reads will see 0= ), but must write zeroes if not Or another way of putting it: in qemu, the ability to punch holes was added after the fact (default of no holes being 0 due to back-compat), where prior to its addition full allocation was the only option, and most callers have to worry about passing MAY_UNMAP when they care about optimal use of storage; while in NBD we want to allow the server the freedom to have optimal usage of storage by default, but need a way to specifically ask for full allocation. If you think the NBD flag is poorly named, we have not yet committed to the NBD_CMD_WRITE_EXTENSIONS documentation yet, and are free to patch proto.md to choose a better name and/or wording to better describe what we actually mean on the NBD side of things. --=20 Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org --I4PUccVluUDVwopqrMvk3ie6oJlj8LkD0 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQEcBAEBCAAGBQJXjkcIAAoJEKeha0olJ0NqwO0H/ixI5uDWXAeixpkkJrGy3jQj R1WviuOYlmQkWAVrMJ32Z7w94oAWnpS26MOxZCLh8KKWUb9fP2hEpJJPksbgSujj MbOFnnsvcmWODA/K4+5V16ARsZFTtlAG9gKnYHK2/FNzRGmAOCFZG4RCH8TFuvM+ NneFPhtS/rEvubuKnwHog85AkQx9MqdEOJ1H5XW3J0+A9uaPnoXYkAp2dae2VyBB 5WaKUmIqD7eWmYxtkUqlKeDtRB8PVQrvDMZyErwFGKTtVXlKzwmdjAoe2J0yl4e7 Wy2aecwc+9eKyMDlaPCUToF96DILF+Im4W2VsDWvGe3TRaylXX2/1N9E1yugct4= =h3X/ -----END PGP SIGNATURE----- --I4PUccVluUDVwopqrMvk3ie6oJlj8LkD0--