From mboxrd@z Thu Jan 1 00:00:00 1970
References: <1459546186-17002-1-git-send-email-eblake@redhat.com>
From: "Denis V. Lunev"
Message-ID: <56FF76E3.1090504@openvz.org>
Date: Sat, 2 Apr 2016 10:38:11 +0300
MIME-Version: 1.0
In-Reply-To: <1459546186-17002-1-git-send-email-eblake@redhat.com>
Content-Type: text/plain; charset="windows-1252"; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH] doc: Flip bit sense for allowing trim during WRITE_ZEROES
To: Eric Blake, nbd-general@lists.sourceforge.net
Cc: qemu-devel@nongnu.org, alex@alex.org.uk, pborzenkov@virtuozzo.com

On 04/02/2016 12:29 AM, Eric Blake wrote:
> Rather than requiring allocation by default and allowing trims
> only on request during WRITE_ZEROES, it seems like a better
> default is to allow server optimizations by default and require
> full allocation by specific request. Since WRITE_ZEROES is
> experimental and has not yet been implemented, we can flip the
> sense of the command flag so that the default flags of all 0s
> is the most efficient, and flags (whether _FUA or _NO_HOLE) are
> added to make the operation have tighter constraints but
> possibly slower execution.
>
> The name NO_HOLE is also slightly nicer at expressing the fact
> that there is no dependency on whether NBD_CMD_TRIM is supported.
>
> Also tweak a couple of formatting issues for consistency (for
> example, only reserve a bit number in one place).
>
> Signed-off-by: Eric Blake
> ---
>  doc/proto.md | 30 +++++++++++++++++-------------
>  1 file changed, 17 insertions(+), 13 deletions(-)
>
> diff --git a/doc/proto.md b/doc/proto.md
> index 758a661..725af4d 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -288,10 +288,10 @@ immediately after the handshake flags field in oldstyle negotiation:
>    schedule I/O accesses as for a rotational medium
>  - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
>    `NBD_CMD_TRIM` commands
> -- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
> -  supports `NBD_CMD_WRITE_ZEROES` commands
> -- bit 7, `NBD_FLAG_SEND_DF`; defined by the `STRUCTURED_REPLY` extension;
> -  see below.
> +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; defined by the experimental
> +  `WRITE_ZEROES` extension; see below.
> +- bit 7, `NBD_FLAG_SEND_DF`; defined by the experimental `STRUCTURED_REPLY`
> +  extension; see below.
>
>  Clients SHOULD ignore unknown flags.
>
> @@ -483,7 +483,7 @@ valid may depend on negotiation during the handshake phase.
>    `NBD_CMD_WRITE_ZEROES` commands. SHOULD be set to 1 if the client requires
>    "Force Unit Access" mode of operation. MUST NOT be set unless transmission
>    flags included `NBD_FLAG_SEND_FUA`.
> -- bit 1, `NBD_CMD_MAY_TRIM`; defined by the experimental `WRITE_ZEROES`
> +- bit 1, `NBD_CMD_NO_HOLE`; defined by the experimental `WRITE_ZEROES`
>    extension; see below.
>  - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
>    extension; see below
> @@ -736,7 +736,7 @@ losing the sparseness.
>  To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
>  one new command and one new command flag.
>
> -* `NBD_CMD_WRITE_ZEROES` (6)
> +* `NBD_CMD_WRITE_ZEROES`
>
>      A write request with no payload. Length and offset define the location
>      and amount of data to be zeroed.
> @@ -753,9 +753,13 @@ one new command and one new command flag.
>      If this flag was set, the server MUST NOT send the reply until it has
>      ensured that the newly-zeroed data has reached permanent storage.
>
> -    If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
> -    flags field, the server MAY use trimming to zero out the area, but it
> -    MUST ensure that the data reads back as zero.
> +    By default, the server MAY use trimming to zero out the area, even
> +    if it did not advertise `NBD_FLAG_SEND_TRIM`; but it MUST ensure
> +    that the data reads back as zero. However, the client MAY set the
> +    command flag `NBD_CMD_FLAG_NO_HOLE` to inform the server that the
> +    area MUST be fully provisioned, ensuring that future writes to the
> +    same area will not cause fragmentation or cause failure due to
> +    insufficient space.
>
>      If an error occurs, the server SHOULD set the appropriate error code
>      in the error field. The server MAY then close the connection.
> @@ -766,9 +770,9 @@ return `EPERM` if it receives a write zeroes request on a read-only export.
>
>  The extension adds the following new command flag:
>
> -- bit 1, `NBD_CMD_FLAG_MAY_TRIM`; valid during `NBD_CMD_WRITE_ZEROES`.
> -  SHOULD be set to 1 if the client allows the server to use trim to perform
> -  the requested operation. The client MAY send `NBD_CMD_FLAG_MAY_TRIM` even
> +- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
> +  SHOULD be set to 1 if the client wants to ensure that the server does
> +  not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
>    if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
>
>  ### `STRUCTURED_REPLY` extension
> @@ -893,7 +897,7 @@ error, and alters the reply to the `NBD_CMD_READ` request.
>      were sent earlier in the structured reply, the server SHOULD NOT
>      send multiple distinct offsets that lie within the bounds of a
>      single content chunk. Valid as a reply to `NBD_CMD_READ`,
> -    `NBD_CMD_WRITE`, and `NBD_CMD_TRIM`.
> +    `NBD_CMD_WRITE`, `NBD_CMD_WRITE_ZEROES`, and `NBD_CMD_TRIM`.
>
>      The payload is structured as:
>

I do not see a problem with this change. It looks correct to me.
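
For illustration only, here is a minimal sketch of how a server might read the command flags once the bit sense is flipped. The bit positions follow the command flag list quoted above (bit 0 for FUA, bit 1 for NO_HOLE). The function name `plan_write_zeroes` and the returned dictionary are hypothetical and do not come from the patch or from any existing NBD implementation.

```python
# Bit positions as listed in the quoted proto.md command flags.
NBD_CMD_FLAG_FUA = 1 << 0      # bit 0
NBD_CMD_FLAG_NO_HOLE = 1 << 1  # bit 1 (was MAY_TRIM before this patch)


def plan_write_zeroes(cmd_flags: int) -> dict:
    """Hypothetical helper: decide how to serve NBD_CMD_WRITE_ZEROES.

    With the flipped sense, flags == 0 is the least constrained case.
    """
    return {
        # Default: the server MAY trim, provided the range reads back
        # as zero. NO_HOLE forbids that and demands full provisioning.
        "may_trim": not (cmd_flags & NBD_CMD_FLAG_NO_HOLE),
        # FUA: hold the reply until the zeroes are on permanent storage.
        "flush_before_reply": bool(cmd_flags & NBD_CMD_FLAG_FUA),
    }
```

With this reading, each flag that is set only tightens the constraints, which matches the intent stated in the commit message: all-zero flags give the server the most freedom.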