From: "Denis V. Lunev"
To: Roman Kagan, Eric Blake, nbd-general@lists.sourceforge.net, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command
Date: Thu, 18 Feb 2016 13:36:27 +0300
Message-ID: <56C59EAB.9090605@openvz.org>
In-Reply-To: <20160218091857.GA12337@rkaganb.sw.ru>
References: <1455732653-3106-1-git-send-email-den@openvz.org> <56C4DF07.9020806@redhat.com> <20160218091857.GA12337@rkaganb.sw.ru>

On 02/18/2016 12:18 PM, Roman Kagan wrote:
> On Wed, Feb 17, 2016 at 01:58:47PM -0700, Eric Blake wrote:
>> On 02/17/2016 11:10 AM, Denis V. Lunev wrote:
>>> @@ -446,6 +448,11 @@ The following request types exist:
>>>      about the contents of the export affected by this command, until
>>>      overwriting it again with `NBD_CMD_WRITE`.
>>>
>>> +* `NBD_CMD_WRITE_ZEROES` (6)
>>> +
>>> +    A request to write zeroes. The command is functional equivalent of
>>> +    the NBD_WRITE_COMMAND but without payload sent through the channel.
>> This lets us push holes during writes.
>> Do we have the converse
>> operation, that is, an easy way to query if a block of data will read as
>> all zeroes, and therefore the client can bypass reading that portion of
>> the disk (in other words, an equivalent to lseek(SEEK_HOLE/SEEK_DATA))?
> The spec doesn't have anything like that.
>
> OTOH, unlike the write case, where you have all the information and just
> choose whether to send normal write or zero write, the extra round-trip
> of a separate SEEK_HOLE/SEEK_DATA request may lead to actually degrading
> the overall throughput.
>
> Rather it may be a better idea to add something like sparse read where
> the server would, instead of sending the full length of data in the
> response payload, send a smarter variable-length package with a
> scatter-gather list or a bitmap of used blocks in the beginning, and let
> the client decode it and fill the gaps with zeros.
>
> Roman.

Ah, I see. That approach is harder, but it would also be useful for
reading backup dirty bitmaps. However, it would make the protocol more
complex and would require more effort at the specification stage.

I'd rather start with the current change, which is simple enough and is
a step in the right direction, and after that continue with READ2 or
whatever the command ends up being called.

Den
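
For illustration only, the sparse-read idea quoted above (an extent list up
front, with the client filling the gaps with zeros) could be sketched roughly
as follows. The reply layout, the function names encode_sparse and
decode_sparse, and the 4-byte block size are all assumptions made up for this
sketch; none of them come from the NBD specification or from this thread.

```python
import struct

# Hypothetical reply layout (NOT part of the NBD spec), all big-endian:
#   u32 extent_count
#   extent_count * (u32 offset, u32 length)
#   data bytes of every extent, concatenated in order


def encode_sparse(buf: bytes, block: int = 4) -> bytes:
    """Server side: describe only the non-zero blocks of buf."""
    extents = []
    for off in range(0, len(buf), block):
        chunk = buf[off:off + block]
        if not any(chunk):
            continue  # all-zero block: leave a gap
        if extents and extents[-1][0] + extents[-1][1] == off:
            # Adjacent to the previous extent: extend it.
            extents[-1] = (extents[-1][0], extents[-1][1] + len(chunk))
        else:
            extents.append((off, len(chunk)))
    header = struct.pack(">I", len(extents))
    header += b"".join(struct.pack(">II", o, n) for o, n in extents)
    return header + b"".join(buf[o:o + n] for o, n in extents)


def decode_sparse(reply: bytes, length: int) -> bytes:
    """Client side: rebuild the full buffer, gaps read as zeros."""
    (count,) = struct.unpack_from(">I", reply, 0)
    pos = 4
    extents = []
    for _ in range(count):
        extents.append(struct.unpack_from(">II", reply, pos))
        pos += 8
    out = bytearray(length)  # starts zero-filled
    for off, n in extents:
        out[off:off + n] = reply[pos:pos + n]
        pos += n
    return bytes(out)
```

With such a layout a mostly-zero region costs only the header plus the
non-zero bytes on the wire, and no separate SEEK_HOLE/SEEK_DATA round-trip
is needed. The price is exactly the extra protocol complexity mentioned
above.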