From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44928) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aimTW-0005Kn-0G for qemu-devel@nongnu.org; Wed, 23 Mar 2016 13:21:46 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aimTS-0006hb-Ji for qemu-devel@nongnu.org; Wed, 23 Mar 2016 13:21:41 -0400 Received: from barbershop.grep.be ([89.106.240.122]:56097) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aimTS-0006g9-Ag for qemu-devel@nongnu.org; Wed, 23 Mar 2016 13:21:38 -0400 Date: Wed, 23 Mar 2016 18:21:16 +0100 From: Wouter Verhelst Message-ID: <20160323172116.GA2467@grep.be> References: <1458742562-30624-1-git-send-email-den@openvz.org> <1458742562-30624-2-git-send-email-den@openvz.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1458742562-30624-2-git-send-email-den@openvz.org> Subject: Re: [Qemu-devel] [Nbd] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Denis V. Lunev" Cc: nbd-general@lists.sourceforge.net, Kevin Wolf , qemu-devel@nongnu.org, Stefan Hajnoczi , Paolo Bonzini Hi, On Wed, Mar 23, 2016 at 05:16:01PM +0300, Denis V. Lunev wrote: > From: Pavel Borzenkov > > There exist some cases when a client knows that the data it is going to > write is all zeroes. Such cases include mirroring or backing up a device > implemented by a sparse file. > > With current NBD command set, the client has to issue NBD_CMD_WRITE > command with zeroed payload and transfer these zero bytes through the > wire. The server has to write the data onto disk, effectively denying > the sparseness. > > To remedy this, the patch adds WRITE_ZEROES extension with one new > NBD_CMD_WRITE_ZEROES command. > > Signed-off-by: Pavel Borzenkov > Reviewed-by: Roman Kagan > Signed-off-by: Denis V. Lunev > CC: Wouter Verhelst > CC: Paolo Bonzini > CC: Kevin Wolf > CC: Stefan Hajnoczi > --- > doc/proto.md | 44 ++++++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 44 insertions(+) > > diff --git a/doc/proto.md b/doc/proto.md > index 463ef8a..cda213c 100644 > --- a/doc/proto.md > +++ b/doc/proto.md > @@ -241,6 +241,8 @@ immediately after the global flags field in oldstyle negotiation: > schedule I/O accesses as for a rotational medium > - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports > `NBD_CMD_TRIM` commands > +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server > + supports `NBD_CMD_WRITE_ZEROES` commands > > ##### Client flags > > @@ -471,6 +473,10 @@ The following request types exist: > about the contents of the export affected by this command, until > overwriting it again with `NBD_CMD_WRITE`. > > +* `NBD_CMD_WRITE_ZEROES` (6) > + > + Defined by the experimental `WRITE_ZEROES` extension; see below. > + > * Other requests > > Some third-party implementations may require additional protocol > @@ -594,6 +600,44 @@ option reply type. > message if they do not also send it as a reply to the > `NBD_OPT_SELECT` message. > > +### `WRITE_ZEROES` extension > + > +There exist some cases when a client knows that the data it is going to write > +is all zeroes. Such cases include mirroring or backing up a device implemented > +by a sparse file. With current NBD command set, the client has to issue > +`NBD_CMD_WRITE` command with zeroed payload and transfer these zero bytes > +through the wire. The server has to write the data onto disk, effectively > +denying the sparseness. > + > +To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds > +one new command with two command flags. > + > +* `NBD_CMD_WRITE_ZEROES` (6) > + > + A write request with no payload. Length and offset define the location > + and amount of data to be zeroed. > + > + The server MUST zero out the data on disk, and then send the reply > + message. The server MAY send the reply message before the data has > + reached permanent storage. > + > + If the `NBD_FLAG_SEND_FUA` flag ("Force Unit Access") was set in the > + export flags field, the client MAY set the flag `NBD_CMD_FLAG_FUA` (bit 0) > + in the command flags field. If this flag was set, the server MUST NOT send > + the reply until it has ensured that the newly-zeroed data has reached > + permanent storage. > + > + If the flag `NBD_CMD_FLAG_MAY_TRIM` (bit 1) was set by the client in the > + command flags field, the server MAY use trimming to zero out the area, > + but it MUST ensure that the data reads back as zero. > + > + If an error occurs, the server SHOULD set the appropriate error code > + in the error field. The server MAY then close the connection. > + > +The server SHOULD return `ENOSPC` if it receives a write zeroes request > +including one or more sectors beyond the size of the device. It SHOULD > +return `EPERM` if it receives a write zeroes request on a read-only export. > + So, the semantics of your proposed WRITE_ZEROES are exactly the same as the WRITE command, except that no payload is sent? In that case, I think it's slightly more sensible if we don't add a new command, but instead just have an NBD_CMD_FLAG_ZEROES added to the WRITE command instead. After all, they're going to be (mostly) the same anyway. Did you propose a separate command for a specific reason that I'm missing (or forgetting), or is that just an oversight? -- < ron> I mean, the main *practical* problem with C++, is there's like a dozen people in the world who think they really understand all of its rules, and pretty much all of them are just lying to themselves too. -- #debian-devel, OFTC, 2016-02-12