From: "Denis V. Lunev" <den@openvz.org>
To: nbd-general@lists.sourceforge.net, qemu-devel@nongnu.org
Cc: Kevin Wolf <kwolf@redhat.com>, Alex Bligh <alex@alex.org.uk>,
Pavel Borzenkov <pborzenkov@virtuozzo.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>, Wouter Verhelst <w@uter.be>,
den@openvz.org
Subject: [Qemu-devel] [PATCH v2 1/1] NBD proto: add WRITE_ZEROES extension
Date: Thu, 31 Mar 2016 16:02:05 +0300
Message-ID: <1459429325-16350-1-git-send-email-den@openvz.org>
From: Pavel Borzenkov <pborzenkov@virtuozzo.com>
In some cases a client knows that the data it is about to write is all
zeroes. Examples include mirroring or backing up a device implemented
by a sparse file.

With the current NBD command set, the client has to issue an NBD_CMD_WRITE
command with a zeroed payload and transfer these zero bytes over the
wire. The server then has to write the data to disk, effectively losing
the sparseness.

To remedy this, this patch adds the WRITE_ZEROES extension with one new
command, NBD_CMD_WRITE_ZEROES.

Signed-off-by: Pavel Borzenkov <pborzenkov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Wouter Verhelst <w@uter.be>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Alex Bligh <alex@alex.org.uk>
CC: Eric Blake <eblake@redhat.com>
---
v2:
- rebased on master
- explicitly state that the client must not send NBD_CMD_WRITE_ZEROES if
support for it wasn't negotiated with the server;
- add the new command flag's description in a format suitable for moving
to the "Command flags" section.
doc/proto.md | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
1 file changed, 60 insertions(+), 4 deletions(-)
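
For illustration only, here is a minimal sketch of how a client might encode
the proposed request. It assumes the existing NBD request header layout
(32-bit magic, 16-bit command flags, 16-bit type, 64-bit handle, 64-bit
offset, 32-bit length, all big-endian) and the command and flag numbers
proposed in this patch. The function name `pack_write_zeroes` and the use of
an integer handle are hypothetical and not part of any implementation.

```python
import struct

NBD_REQUEST_MAGIC = 0x25609513   # request magic of the existing protocol
NBD_CMD_WRITE_ZEROES = 6         # command number proposed by this patch
NBD_CMD_FLAG_FUA = 1 << 0        # existing command flag, bit 0
NBD_CMD_FLAG_MAY_TRIM = 1 << 1   # command flag proposed by this patch, bit 1


def pack_write_zeroes(handle, offset, length, fua=False, may_trim=False):
    """Build a write-zeroes request header (hypothetical helper).

    The request carries no payload: offset and length alone describe
    the region to be zeroed.
    """
    flags = 0
    if fua:
        flags |= NBD_CMD_FLAG_FUA
    if may_trim:
        flags |= NBD_CMD_FLAG_MAY_TRIM
    # Network byte order: magic, command flags, type, handle, offset, length.
    return struct.pack(">IHHQQI", NBD_REQUEST_MAGIC, flags,
                       NBD_CMD_WRITE_ZEROES, handle, offset, length)
```

The resulting header is 28 bytes regardless of `length`, which is the point
of the extension: zeroing a large region costs one header on the wire rather
than a payload of the same size.
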
diff --git a/doc/proto.md b/doc/proto.md
index c1e05c5..a574563 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -261,6 +261,8 @@ immediately after the handshake flags field in oldstyle negotiation:
schedule I/O accesses as for a rotational medium
- bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
`NBD_CMD_TRIM` commands
+- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
+ supports `NBD_CMD_WRITE_ZEROES` commands
Clients SHOULD ignore unknown flags.
@@ -444,10 +446,13 @@ affects a particular command. Clients MUST NOT set a command flag bit
that is not documented for the particular command; and whether a flag is
valid may depend on negotiation during the handshake phase.
-- bit 0, `NBD_CMD_FLAG_FUA`; valid during `NBD_CMD_WRITE`. SHOULD be
- set to 1 if the client requires "Force Unit Access" mode of
- operation. MUST NOT be set unless transmission flags included
- `NBD_FLAG_SEND_FUA`.
+- bit 0, `NBD_CMD_FLAG_FUA`; valid during `NBD_CMD_WRITE` and
+ `NBD_CMD_WRITE_ZEROES` commands. SHOULD be set to 1 if the client requires
+ "Force Unit Access" mode of operation. MUST NOT be set unless transmission
+ flags included `NBD_FLAG_SEND_FUA`.
+
+- bit 1, `NBD_CMD_FLAG_MAY_TRIM`; defined by the experimental `WRITE_ZEROES`
+ extension; see below.
#### Request types
@@ -523,6 +528,10 @@ The following request types exist:
A client MUST NOT send a trim request unless `NBD_FLAG_SEND_TRIM`
was set in the transmission flags field.
+* `NBD_CMD_WRITE_ZEROES` (6)
+
+ Defined by the experimental `WRITE_ZEROES` extension; see below.
+
* Other requests
Some third-party implementations may require additional protocol
@@ -654,6 +663,53 @@ option reply type.
message if they do not also send it as a reply to the
`NBD_OPT_SELECT` message.
+### `WRITE_ZEROES` extension
+
+There are cases in which a client knows that the data it is going to write
+is all zeroes. Such cases include mirroring or backing up a device implemented
+by a sparse file. With the current NBD command set, the client has to issue
+an `NBD_CMD_WRITE` command with a zeroed payload and transfer these zero bytes
+over the wire. The server then has to write the data to disk, effectively
+losing the sparseness.
+
+To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
+one new command and one new command flag.
+
+* `NBD_CMD_WRITE_ZEROES` (6)
+
+ A write request with no payload. Length and offset define the location
+ and amount of data to be zeroed.
+
+ The server MUST zero out the data on disk, and then send the reply
+ message. The server MAY send the reply message before the data has
+ reached permanent storage.
+
+ A client MUST NOT send a write zeroes request unless
+ `NBD_FLAG_SEND_WRITE_ZEROES` was set in the transmission flags field.
+
+ If the `NBD_FLAG_SEND_FUA` flag was set in the transmission flags field,
+ the client MAY set the flag `NBD_CMD_FLAG_FUA` in the command flags field.
+ If this flag was set, the server MUST NOT send the reply until it has
+ ensured that the newly-zeroed data has reached permanent storage.
+
+ If the flag `NBD_CMD_FLAG_MAY_TRIM` was set by the client in the command
+ flags field, the server MAY use trimming to zero out the area, but it
+ MUST ensure that the data reads back as zero.
+
+ If an error occurs, the server SHOULD set the appropriate error code
+ in the error field. The server MAY then close the connection.
+
+The server SHOULD return `ENOSPC` if it receives a write zeroes request
+including one or more sectors beyond the size of the device. It SHOULD
+return `EPERM` if it receives a write zeroes request on a read-only export.
+
+The extension adds the following new command flag:
+
+- bit 1, `NBD_CMD_FLAG_MAY_TRIM`; valid during `NBD_CMD_WRITE_ZEROES`.
+ SHOULD be set to 1 if the client allows the server to use trim to perform
+ the requested operation. The client MAY send `NBD_CMD_FLAG_MAY_TRIM` even
+ if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
+
## About this file
This file tries to document the NBD protocol as it is currently
--
2.1.4
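
As a rough sketch of the negotiation and error rules described in the patch
above, the checks could look like the following. The transmission flag bit
positions follow the patch and the existing protocol document, and the error
values are the numeric codes NBD uses for `EPERM` and `ENOSPC`. The function
names, the order of the server-side checks, and the rejection of undocumented
flag bits in one helper are assumptions made for illustration; the patch does
not prescribe them.

```python
NBD_FLAG_SEND_FUA = 1 << 3            # existing transmission flag, bit 3
NBD_FLAG_SEND_WRITE_ZEROES = 1 << 6   # transmission flag proposed by the patch

NBD_CMD_FLAG_FUA = 1 << 0
NBD_CMD_FLAG_MAY_TRIM = 1 << 1        # command flag proposed by the patch

NBD_EPERM = 1     # NBD wire value for EPERM
NBD_ENOSPC = 28   # NBD wire value for ENOSPC


def client_may_send(transmission_flags, cmd_flags):
    """Return True if the client is allowed to send this write-zeroes request."""
    # The command itself must have been advertised by the server.
    if not (transmission_flags & NBD_FLAG_SEND_WRITE_ZEROES):
        return False
    # FUA may only be set if the server advertised NBD_FLAG_SEND_FUA.
    if (cmd_flags & NBD_CMD_FLAG_FUA) and not (transmission_flags & NBD_FLAG_SEND_FUA):
        return False
    # Only documented flags may be set; MAY_TRIM needs no NBD_FLAG_SEND_TRIM.
    if cmd_flags & ~(NBD_CMD_FLAG_FUA | NBD_CMD_FLAG_MAY_TRIM):
        return False
    return True


def server_check(export_size, read_only, offset, length):
    """Return 0 if the request is acceptable, else an NBD error value.

    Checking read-only before the range is an arbitrary choice here.
    """
    if read_only:
        return NBD_EPERM
    if offset + length > export_size:
        return NBD_ENOSPC
    return 0
```

For example, `client_may_send(NBD_FLAG_SEND_WRITE_ZEROES, NBD_CMD_FLAG_MAY_TRIM)`
is allowed even though `NBD_FLAG_SEND_TRIM` is not set, matching the last
sentence of the flag description.
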