From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43816)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1aWgsH-0004uj-7f for qemu-devel@nongnu.org;
	Fri, 19 Feb 2016 03:57:18 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1aWgsE-0008B1-1R for qemu-devel@nongnu.org;
	Fri, 19 Feb 2016 03:57:17 -0500
Received: from mx2.parallels.com ([199.115.105.18]:52192)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1aWgsD-0008AM-Rw for qemu-devel@nongnu.org;
	Fri, 19 Feb 2016 03:57:13 -0500
Message-ID: <56C6D8D5.50600@virtuozzo.com>
Date: Fri, 19 Feb 2016 11:56:53 +0300
From: Vladimir Sementsov-Ogievskiy
MIME-Version: 1.0
References: <1455732653-3106-1-git-send-email-den@openvz.org>
	<56C4DF07.9020806@redhat.com> <20160218091857.GA12337@rkaganb.sw.ru>
	<56C5F2E3.1090102@redhat.com> <56C5FDF4.4060101@openvz.org>
	<56C6C049.7060105@openvz.org>
In-Reply-To: <56C6C049.7060105@openvz.org>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] SUMMARY: Re: [RFC 1/1] nbd (specification): add
	NBD_CMD_WRITE_ZEROES command
To: "Denis V. Lunev", Eric Blake, Roman Kagan,
	nbd-general@lists.sourceforge.net, qemu-devel@nongnu.org,
	Stefan Hajnoczi, "Daniel P. Berrange", Fam Zheng

On 19.02.2016 10:12, Denis V. Lunev wrote:
> On 02/18/2016 08:23 PM, Denis V. Lunev wrote:
>> On 02/18/2016 07:35 PM, Eric Blake wrote:
>>> On 02/18/2016 02:18 AM, Roman Kagan wrote:
>>>> On Wed, Feb 17, 2016 at 01:58:47PM -0700, Eric Blake wrote:
>>>>> On 02/17/2016 11:10 AM, Denis V. Lunev wrote:
>>>>>> @@ -446,6 +448,11 @@ The following request types exist:
>>>>>>      about the contents of the export affected by this command, until
>>>>>>      overwriting it again with `NBD_CMD_WRITE`.
>>>>>> +* `NBD_CMD_WRITE_ZEROES` (6)
>>>>>> +
>>>>>> +    A request to write zeroes. The command is functional equivalent of
>>>>>> +    the NBD_WRITE_COMMAND but without payload sent through the channel.
>>>>> This lets us push holes during writes. Do we have the converse
>>>>> operation, that is, an easy way to query if a block of data will
>>>>> read as all zeroes, and therefore the client can bypass reading that
>>>>> portion of the disk (in other words, an equivalent to
>>>>> lseek(SEEK_HOLE/SEEK_DATA))?
>>>> The spec doesn't have anything like that.
>>>>
>>>> OTOH, unlike the write case, where you have all the information and
>>>> just choose whether to send normal write or zero write, the extra
>>>> round-trip of a separate SEEK_HOLE/SEEK_DATA request may lead to
>>>> actually degrading the overall throughput.
>>>>
>>>> Rather it may be a better idea to add something like sparse read where
>>>> the server would, instead of sending the full length of data in the
>>>> response payload, send a smarter variable-length package with a
>>>> scatter-gather list or a bitmap of used blocks in the beginning, and
>>>> let the client decode it and fill the gaps with zeros.
>>> Sure, that would work too, and sounds nicer. Either way, the point is
>>> that we should strongly consider improving the NBD protocol to allow
>>> more efficient handling of sparse files, in both the push and in the
>>> pull direction. Qemu already has a desire to use both directions of
>>> improvements, but there are more programs, both clients and servers,
>>> outside of qemu, that could benefit from such protocol improvements.
>>>
>> OK
>>
>> Here is a short summary of features which seem necessary from QEMU
>> point of view:
>> - ability to avoid sending zeroes during write operation. The proposal
>>   comes in the thread-starter letter
>> - ability to request block status (allocated/not allocated) from
>>   server. This seems interesting to preserve "sparseness" of the
>>   transferring data
>> - ability to skip zeroes during read operation, i.e. something like
>>   READ2 command which will return vector of chunks as a reply
>>
>> All 3 features seem usable for generic NBD use-cases and not only for
>> QEMU.
>>
>> If there are no objections I'll sum this up and come up with a
>> specification draft.
>>
>> Den
>>
>> P.S. I have added here all parties which have participated in
>> conversation in different threads on QEMU side.
>
> interesting point from a verbal discussion with one of my friends.
> Protocol level compression could eliminate the necessity to
> think about zeroes in channel either from read or from write
> points of view and will also reduce the amount of data to
> transfer.
>
> Den

Compression is worse than separate commands, because after decompression
we will have to write or somehow test these zeroes again.

-- 
Best regards,
Vladimir