From: Paolo Bonzini
Date: Fri, 4 Mar 2016 15:03:26 +0100
Message-ID: <56D995AE.1080805@redhat.com>
In-Reply-To: <20160304095413.GC4366@noname.redhat.com>
Subject: Re: [Qemu-devel] [Nbd] [RFC 1/1] nbd (specification): add NBD_CMD_WRITE_ZEROES command
To: Kevin Wolf, Wouter Verhelst
Cc: "nbd-general@lists.sourceforge.net", "Denis V. Lunev", "qemu-devel@nongnu.org", Alex Bligh

On 04/03/2016 10:54, Kevin Wolf wrote:
>>> - pls write the following amount of zeroes in either way (even calling
>>>   write directly), i.e. ensure that the data is zeroed and the space on
>>>   the file system is allocated for that.
>>
>> IOW, you *don't* want to have a sparse file in that case? Or do I
>> misunderstand things here?
> I think what we're looking for is more like "zero out this area, feel
> free to use whatever method is most efficient to achieve that".
>
> So if the server knows that the backing store supports an efficient way
> to write zeros (e.g. FALLOC_FL_ZERO_RANGE), it will use that. Otherwise,
> if TRIM works and we know that the result is zeroed space instead of
> undefined contents, the server is free to use it. And if even that
> fails, it just falls back to an explicit write of a zeroed buffer.
>
> If we want, we can give the client a little more control about whether
> or not discarding in the process is allowed (or maybe even preferred).
> qemu's interface for writing zeros has a BDRV_REQ_MAY_UNMAP flag, for
> example.

NBD-wise, I think the TRIM command is good as it is, and
NBD_CMD_WRITE_ZEROES should be added like Den is doing.

It also makes sense to use trimming to implement NBD_CMD_WRITE_ZEROES,
but it should be explicitly requested by the user. For this, my
suggestion is that NBD_CMD_WRITE_ZEROES should have an NBD_FLAG_TRY_TRIM
flag in bit 16. If specified, the backend can use a zero-writing
mechanism that trims, _but_ it must ensure that the bytes read as zero.
If it cannot ensure that, it must not trim and it should instead do a
full write.

This is similar to the SCSI command WRITE SAME (when the command payload
is all zeroes). Like Kevin said, it also happens to map nicely to the
QEMU block device layer.

Paolo
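
To make the fallback order concrete, here is a rough sketch in C of how a
server might pick a zeroing method. The struct, the enum and
pick_zero_method() are made up for illustration, and the flag value assumes
that "bit 16" means 1 << 16 in a 32-bit command flags word. None of this is
taken from the NBD spec or from the QEMU sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of the proposed flag: bit 16 of a 32-bit flags word. */
#define NBD_FLAG_TRY_TRIM (1u << 16)

/* Hypothetical description of what the backing store can do. */
struct backend_caps {
    bool has_zero_range;   /* efficient zeroing, e.g. FALLOC_FL_ZERO_RANGE */
    bool has_trim;         /* TRIM/discard is available */
    bool trim_reads_zero;  /* trimmed ranges are guaranteed to read as zero */
};

enum zero_method { ZERO_RANGE, ZERO_TRIM, ZERO_WRITE };

/*
 * Choose how to serve a write-zeroes request:
 *   1. use the efficient zeroing primitive if there is one;
 *   2. trim only if the client asked for it AND the result reads as zero;
 *   3. otherwise write a zeroed buffer explicitly.
 */
static enum zero_method pick_zero_method(uint32_t cmd_flags,
                                         const struct backend_caps *caps)
{
    assert(caps != NULL);

    if (caps->has_zero_range)
        return ZERO_RANGE;

    if ((cmd_flags & NBD_FLAG_TRY_TRIM) &&
        caps->has_trim && caps->trim_reads_zero)
        return ZERO_TRIM;

    return ZERO_WRITE;
}
```

Trying the efficient zeroing primitive before trimming follows the order
Kevin describes; a server could just as well prefer trimming whenever the
flag is set, as long as it never trims without the reads-as-zero guarantee.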