From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:51972)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1aWPCm-0005z0-Qh for qemu-devel@nongnu.org; Thu, 18 Feb 2016 09:05:23 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from )
    id 1aWPCj-0000Tk-Kp for qemu-devel@nongnu.org; Thu, 18 Feb 2016 09:05:16 -0500
Received: from mx2.parallels.com ([199.115.105.18]:33585)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1aWPCj-0000Tf-Eu for qemu-devel@nongnu.org; Thu, 18 Feb 2016 09:05:13 -0500
References: <1455732653-3106-1-git-send-email-den@openvz.org> <56C4DF07.9020806@redhat.com> <20160218121409.GD12470@redhat.com>
From: "Denis V. Lunev"
Message-ID: <56C5CF8E.7070402@openvz.org>
Date: Thu, 18 Feb 2016 17:05:02 +0300
MIME-Version: 1.0
In-Reply-To: <20160218121409.GD12470@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command
To: "Daniel P. Berrange", Eric Blake
Cc: nbd-general@lists.sourceforge.net, qemu-devel@nongnu.org

On 02/18/2016 03:14 PM, Daniel P. Berrange wrote:
> On Wed, Feb 17, 2016 at 01:58:47PM -0700, Eric Blake wrote:
>> On 02/17/2016 11:10 AM, Denis V. Lunev wrote:
>>> This patch proposes a new command to reduce the amount of data passed
>>> through the wire when it is known that the data is all zeroes. This
>>> functionality is generally useful for mirroring or backup operations.
>>>
>>> Currently available NBD_CMD_TRIM command can not be used as the
>>> specification explicitely says that "a client MUST NOT make any
>> s/explicitely/explicitly/
>>
>>> assumptions about the contents of the export affected by this
>>> [NBD_CMD_TRIM] command, until overwriting it again with `NBD_CMD_WRITE`"
>>>
>>> Particular use case could be the following:
>>>
>>> QEMU project uses own implementation of NBD server to transfer data
>>> in between different instances of QEMU. Typically we tranfer VM virtual
>> s/tranfer/transfer/
>>
>>> disks over this channel. VM virtual disks are sparse and thus the
>>> efficiency of backup and mirroring operations could be improved a lot.
>>>
>>> Signed-off-by: Denis V. Lunev
>>> ---
>>> doc/proto.md | 7 +++++++
>>> 1 file changed, 7 insertions(+)
>>>
>>> diff --git a/doc/proto.md b/doc/proto.md
>>> index 43065b7..c94751a 100644
>>> --- a/doc/proto.md
>>> +++ b/doc/proto.md
>>> @@ -241,6 +241,8 @@ immediately after the global flags field in oldstyle negotiation:
>>> schedule I/O accesses as for a rotational medium
>>> - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
>>> `NBD_CMD_TRIM` commands
>>> +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
>>> + supports `NBD_CMD_WRITE_ZEROES` commands
>>>
>>> ##### Client flags
>>>
>>> @@ -446,6 +448,11 @@ The following request types exist:
>>> about the contents of the export affected by this command, until
>>> overwriting it again with `NBD_CMD_WRITE`.
>>>
>>> +* `NBD_CMD_WRITE_ZEROES` (6)
>>> +
>>> + A request to write zeroes. The command is functional equivalent of
>>> + the NBD_WRITE_COMMAND but without payload sent through the channel.
>> This lets us push holes during writes. Do we have the converse
>> operation, that is, an easy way to query if a block of data will read as
>> all zeroes, and therefore the client can bypass reading that portion of
>> the disk (in other words, an equivalent to lseek(SEEK_HOLE/SEEK_DATA))?
> Stefan has suggested that we add a command to the NBD spec that
> implements the SCSI Get LBA Status command. This lets clients
> query the allocation bitmap for the device, which would serve
> this purpose.
>
> https://lists.gnu.org/archive/html/qemu-devel/2016-02/msg03582.html
>
> In that thread he talks about it being a way to serve up the dirty
> bitmap for live backup scenario, but in regular usage it obviously
> provides the normal allocation bitmap
>
>
> Regards,
> Daniel

But in this case we should allow querying the information for more than
one block at once. We will also have to either agree between the client
and the server on the granularity of the request, or specify the
granularity along with the range in the call.

Den
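
To make the second option concrete, here is a rough sketch of a query that
carries the granularity along with the range. Nothing here is part of the
NBD spec: the struct `alloc_query`, its field layout, the helper names, and
the LSB-first bit order of the reply are all assumptions made up for
illustration, and `disk` is just an in-memory stand-in for the export. A
chunk is reported as "data" if it contains any nonzero byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical request layout; NOT defined by the NBD specification. */
struct alloc_query {
    uint64_t offset;      /* start of the queried range, in bytes */
    uint32_t length;      /* length of the queried range, in bytes */
    uint32_t granularity; /* bytes described by one bit of the reply */
};

/* Number of reply bits needed to cover the range (last chunk may be short). */
static uint64_t alloc_query_bits(const struct alloc_query *q)
{
    assert(q->granularity != 0);
    return ((uint64_t)q->length + q->granularity - 1) / q->granularity;
}

/* Size of the reply bitmap in bytes. */
static uint64_t alloc_query_bitmap_bytes(const struct alloc_query *q)
{
    return (alloc_query_bits(q) + 7) / 8;
}

/*
 * Fill the reply bitmap: bit i (LSB first within each byte, an assumption)
 * is set when chunk i of the range holds at least one nonzero byte.
 */
static void alloc_query_fill(const struct alloc_query *q,
                             const uint8_t *disk, size_t disk_size,
                             uint8_t *bitmap)
{
    uint64_t end = q->offset + q->length;
    uint64_t bits = alloc_query_bits(q);

    assert(end <= disk_size);
    memset(bitmap, 0, (size_t)alloc_query_bitmap_bytes(q));

    for (uint64_t i = 0; i < bits; i++) {
        uint64_t pos = q->offset + i * q->granularity;
        uint64_t stop = pos + q->granularity;

        if (stop > end) {
            stop = end; /* clamp the final, possibly partial, chunk */
        }
        for (; pos < stop; pos++) {
            if (disk[pos] != 0) {
                bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
                break;
            }
        }
    }
}
```

With a 4096-byte range and a granularity of 512, the whole answer fits in a
single byte, so one round trip describes eight blocks. If the granularity
were negotiated once instead, the `granularity` field would simply move out
of the request.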