From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:52500)
 by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aWSIP-0003WC-Sd
 for qemu-devel@nongnu.org; Thu, 18 Feb 2016 12:23:19 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from ) id 1aWSIM-0007vA-Lo
 for qemu-devel@nongnu.org; Thu, 18 Feb 2016 12:23:17 -0500
Received: from mx2.parallels.com ([199.115.105.18]:33472)
 by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aWSIM-0007sr-FT
 for qemu-devel@nongnu.org; Thu, 18 Feb 2016 12:23:14 -0500
References: <1455732653-3106-1-git-send-email-den@openvz.org>
 <56C4DF07.9020806@redhat.com> <20160218091857.GA12337@rkaganb.sw.ru>
 <56C5F2E3.1090102@redhat.com>
From: "Denis V. Lunev"
Message-ID: <56C5FDF4.4060101@openvz.org>
Date: Thu, 18 Feb 2016 20:23:00 +0300
MIME-Version: 1.0
In-Reply-To: <56C5F2E3.1090102@redhat.com>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: [Qemu-devel] SUMMARY: Re: [RFC 1/1] nbd (specification): add
 NBD_CMD_WRITE_ZEROES command
To: Eric Blake, Roman Kagan, nbd-general@lists.sourceforge.net,
 qemu-devel@nongnu.org, Stefan Hajnoczi, "Daniel P. Berrange",
 Vladimir Sementsov-Ogievskiy, Roman Kagan, Fam Zheng

On 02/18/2016 07:35 PM, Eric Blake wrote:
> On 02/18/2016 02:18 AM, Roman Kagan wrote:
>> On Wed, Feb 17, 2016 at 01:58:47PM -0700, Eric Blake wrote:
>>> On 02/17/2016 11:10 AM, Denis V. Lunev wrote:
>>>> @@ -446,6 +448,11 @@ The following request types exist:
>>>>      about the contents of the export affected by this command, until
>>>>      overwriting it again with `NBD_CMD_WRITE`.
>>>>
>>>> +* `NBD_CMD_WRITE_ZEROES` (6)
>>>> +
>>>> +    A request to write zeroes. The command is functional equivalent of
>>>> +    the NBD_WRITE_COMMAND but without payload sent through the channel.
>>> This lets us push holes during writes.
>>> Do we have the converse
>>> operation, that is, an easy way to query if a block of data will read as
>>> all zeroes, and therefore the client can bypass reading that portion of
>>> the disk (in other words, an equivalent to lseek(SEEK_HOLE/SEEK_DATA))?
>> The spec doesn't have anything like that.
>>
>> OTOH, unlike the write case, where you have all the information and just
>> choose whether to send normal write or zero write, the extra round-trip
>> of a separate SEEK_HOLE/SEEK_DATA request may lead to actually degrading
>> the overall throughput.
>>
>> Rather it may be a better idea to add something like sparse read where
>> the server would, instead of sending the full length of data in the
>> response payload, send a smarter variable-length package with a
>> scatter-gather list or a bitmap of used blocks in the beginning, and let
>> the client decode it and fill the gaps with zeros.
> Sure, that would work too, and sounds nicer. Either way, the point is
> that we should strongly consider improving the NBD protocol to allow
> more efficient handling of sparse files, in both the push and in the
> pull direction. Qemu already has a desire to use both directions of
> improvements, but there are more programs, both clients and servers,
> outside of qemu, that could benefit from such protocol improvements.
>
OK

Here is a short summary of the features which seem necessary from QEMU's
point of view:
- the ability to avoid sending zeroes during a write operation. The
  proposal comes in the thread-starter letter
- the ability to request block status (allocated/not allocated) from the
  server. This seems useful for preserving the "sparseness" of the
  transferred data
- the ability to skip zeroes during a read operation, i.e. something like
  a READ2 command which will return a vector of chunks as a reply

All 3 features seem usable for generic NBD use-cases and not only for QEMU.
If there are no objections I'll sum this up and come up with a
specification draft.

Den

P.S.
I have added here all parties who have participated in the conversation
in different threads on the QEMU side.