* [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
@ 2016-04-04 14:15 Eric Blake
2016-04-04 14:47 ` Denis V. Lunev
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Eric Blake @ 2016-04-04 14:15 UTC (permalink / raw)
To: nbd-general; +Cc: den, qemu-devel, alex, pborzenkov
qemu already has an existing server implementation option that will
explicitly search the payload of NBD_CMD_WRITE for large blocks of
zeroes, and punch holes in the underlying file. For old clients
that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
workaround to keep the server's destination file approximately as
sparse as the client's source. However, for new clients that know
how to explicitly request holes, it is unnecessary overhead; and
can lead to the server punching a hole and risking fragmentation or
future ENOSPC even when the client explicitly wanted to write
zeroes rather than a hole. So it makes sense to let the new
NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
doc/proto.md | 33 +++++++++++++++++++++++++++------
1 file changed, 27 insertions(+), 6 deletions(-)
diff --git a/doc/proto.md b/doc/proto.md
index 35a3266..fb97217 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -737,8 +737,14 @@ by a sparse file. With current NBD command set, the client has to issue
through the wire. The server has to write the data onto disk, effectively
losing the sparseness.
-To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
-one new command and one new command flag.
+To remedy this, a `WRITE_ZEROES` extension is envisioned. This
+extension adds one new transmission flag, one new command, and one new
+command flag; and refines an existing command.
+
+* `NBD_FLAG_SEND_WRITE_ZEROES`
+
+ The server SHOULD set this transmission flag to 1 if the
+ `NBD_CMD_WRITE_ZEROES` request is supported.
* `NBD_CMD_WRITE_ZEROES`
@@ -772,12 +778,27 @@ The server SHOULD return `ENOSPC` if it receives a write zeroes request
including one or more sectors beyond the size of the device. It SHOULD
return `EPERM` if it receives a write zeroes request on a read-only export.
+* `NBD_CMD_WRITE`
+
+ By default, the server MAY search for large contiguous blocks of
+ all zero content, and use trimming to zero out those portions of
+ the write, even if it did not advertise `NBD_FLAG_SEND_TRIM`; but
+ it MUST ensure that any trimmed areas of data read back as zero.
+ However, the client MAY set the command flag
+ `NBD_CMD_FLAG_NO_HOLE` to inform the server that the entire
+ written area MUST be fully provisioned, ensuring that future
+ writes to the same area will not cause fragmentation or cause
+ failure due to insufficient space. Clients SHOULD NOT set this
+ flag unless the server advertised `NBD_FLAG_SEND_WRITE_ZEROES` in
+ the transmisison flags.
+
The extension adds the following new command flag:
-- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
- SHOULD be set to 1 if the client wants to ensure that the server does
- not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
- if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
+- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE` and
+ `NBD_CMD_WRITE_ZEROES`. SHOULD be set to 1 if the client wants to
+ ensure that the server does not create a hole. The client MAY send
+ `NBD_CMD_FLAG_NO_HOLE` even if `NBD_FLAG_SEND_TRIM` was not set in
+ the transmission flags field.
### `STRUCTURED_REPLY` extension
--
2.5.5
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
2016-04-04 14:15 [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE Eric Blake
@ 2016-04-04 14:47 ` Denis V. Lunev
2016-04-04 15:00 ` Eric Blake
2016-04-04 15:16 ` Alex Bligh
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Denis V. Lunev @ 2016-04-04 14:47 UTC (permalink / raw)
To: Eric Blake, nbd-general; +Cc: qemu-devel, alex, pborzenkov
On 04/04/2016 05:15 PM, Eric Blake wrote:
> qemu already has an existing server implementation option that will
> explicitly search the payload of NBD_CMD_WRITE for large blocks of
> zeroes, and punch holes in the underlying file. For old clients
> that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
> workaround to keep the server's destination file approximately as
> sparse as the client's source. However, for new clients that know
> how to explicitly request holes, it is unnecessary overhead; and
> can lead to the server punching a hole and risking fragmentation or
> future ENOSPC even when the client explicitly wanted to write
> zeroes rather than a hole. So it makes sense to let the new
> NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
what behaviour do you expect for QCOW2 file?
We should fully provision that image as far as I could understand,
i.e. allocate data blocks with zero content.
I think that this would work.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
2016-04-04 14:47 ` Denis V. Lunev
@ 2016-04-04 15:00 ` Eric Blake
0 siblings, 0 replies; 9+ messages in thread
From: Eric Blake @ 2016-04-04 15:00 UTC (permalink / raw)
To: Denis V. Lunev, nbd-general; +Cc: qemu-devel, alex, pborzenkov
[-- Attachment #1: Type: text/plain, Size: 2707 bytes --]
On 04/04/2016 08:47 AM, Denis V. Lunev wrote:
> On 04/04/2016 05:15 PM, Eric Blake wrote:
>> qemu already has an existing server implementation option that will
>> explicitly search the payload of NBD_CMD_WRITE for large blocks of
>> zeroes, and punch holes in the underlying file. For old clients
>> that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
>> workaround to keep the server's destination file approximately as
>> sparse as the client's source. However, for new clients that know
>> how to explicitly request holes, it is unnecessary overhead; and
>> can lead to the server punching a hole and risking fragmentation or
>> future ENOSPC even when the client explicitly wanted to write
>> zeroes rather than a hole. So it makes sense to let the new
>> NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
>>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> what behaviour do you expect for QCOW2 file?
> We should fully provision that image as far as I could understand,
> i.e. allocate data blocks with zero content.
For an old client (that doesn't know how to use WRITE_ZEROES), the
client will send ALL data via CMD_WRITE; and then the server pays
attention to its command-line flags on whether to parse for zeroes
(which is already tri-state, between always allocate, use WRITE_SAME
where possible, and TRIM where possible). The result is that the server
can be commanded at startup to make its file sparse, even when the
client did not have a sparse file to start with.
For a new client, the client will always use WRITE_ZEROES when it is
okay with the server punching holes, and WRITE_ZEROES+NO_HOLE when it
wants to compress the network stream but still force allocation.
Therefore, the server spending time looking for zeroes during WRITE is
wasted effort, and the client should always send WRITE+NO_HOLE for data
that is not a hole on the client's side of things (even if that content
is all zeroes). The result is that the server can now create a file
with the same sparseness pattern as the client, assuming that client and
server have the same granularity of hole sizing.
With a qcow2 file as the underlying file of the qemu nbd server, the
behavior then depends on whether the qcow2 file is version 2 (no
efficient ways to represent all zeroes, but there ARE some shortcuts
that can be taken when you know the file is backed by something else
that is all zero) or version 3 (where there is an explicit and efficient
way to mark an all-zero cluster).
>
> I think that this would work.
>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
2016-04-04 14:15 [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE Eric Blake
2016-04-04 14:47 ` Denis V. Lunev
@ 2016-04-04 15:16 ` Alex Bligh
2016-04-04 22:15 ` [Qemu-devel] [PATCH v2] " Eric Blake
2016-04-05 22:51 ` [Qemu-devel] [PATCH] " Paolo Bonzini
3 siblings, 0 replies; 9+ messages in thread
From: Alex Bligh @ 2016-04-04 15:16 UTC (permalink / raw)
To: Eric Blake
Cc: nbd-general@lists.sourceforge.net, den, qemu-devel@nongnu.org,
Alex Bligh, pborzenkov
On 4 Apr 2016, at 15:15, Eric Blake <eblake@redhat.com> wrote:
> qemu already has an existing server implementation option that will
> explicitly search the payload of NBD_CMD_WRITE for large blocks of
> zeroes, and punch holes in the underlying file. For old clients
> that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
> workaround to keep the server's destination file approximately as
> sparse as the client's source. However, for new clients that know
> how to explicitly request holes, it is unnecessary overhead; and
> can lead to the server punching a hole and risking fragmentation or
> future ENOSPC even when the client explicitly wanted to write
> zeroes rather than a hole. So it makes sense to let the new
> NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alex Bligh <alex@alex.org.uk>
+1
Alex
> ---
> doc/proto.md | 33 +++++++++++++++++++++++++++------
> 1 file changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/doc/proto.md b/doc/proto.md
> index 35a3266..fb97217 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -737,8 +737,14 @@ by a sparse file. With current NBD command set, the client has to issue
> through the wire. The server has to write the data onto disk, effectively
> losing the sparseness.
>
> -To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
> -one new command and one new command flag.
> +To remedy this, a `WRITE_ZEROES` extension is envisioned. This
> +extension adds one new transmission flag, one new command, and one new
> +command flag; and refines an existing command.
> +
> +* `NBD_FLAG_SEND_WRITE_ZEROES`
> +
> + The server SHOULD set this transmission flag to 1 if the
> + `NBD_CMD_WRITE_ZEROES` request is supported.
>
> * `NBD_CMD_WRITE_ZEROES`
>
> @@ -772,12 +778,27 @@ The server SHOULD return `ENOSPC` if it receives a write zeroes request
> including one or more sectors beyond the size of the device. It SHOULD
> return `EPERM` if it receives a write zeroes request on a read-only export.
>
> +* `NBD_CMD_WRITE`
> +
> + By default, the server MAY search for large contiguous blocks of
> + all zero content, and use trimming to zero out those portions of
> + the write, even if it did not advertise `NBD_FLAG_SEND_TRIM`; but
> + it MUST ensure that any trimmed areas of data read back as zero.
> + However, the client MAY set the command flag
> + `NBD_CMD_FLAG_NO_HOLE` to inform the server that the entire
> + written area MUST be fully provisioned, ensuring that future
> + writes to the same area will not cause fragmentation or cause
> + failure due to insufficient space. Clients SHOULD NOT set this
> + flag unless the server advertised `NBD_FLAG_SEND_WRITE_ZEROES` in
> + the transmisison flags.
> +
> The extension adds the following new command flag:
>
> -- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
> - SHOULD be set to 1 if the client wants to ensure that the server does
> - not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
> - if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
> +- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE` and
> + `NBD_CMD_WRITE_ZEROES`. SHOULD be set to 1 if the client wants to
> + ensure that the server does not create a hole. The client MAY send
> + `NBD_CMD_FLAG_NO_HOLE` even if `NBD_FLAG_SEND_TRIM` was not set in
> + the transmission flags field.
>
> ### `STRUCTURED_REPLY` extension
>
> --
> 2.5.5
>
--
Alex Bligh
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH v2] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
2016-04-04 14:15 [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE Eric Blake
2016-04-04 14:47 ` Denis V. Lunev
2016-04-04 15:16 ` Alex Bligh
@ 2016-04-04 22:15 ` Eric Blake
2016-04-05 9:38 ` [Qemu-devel] [Nbd] " Markus Pargmann
2016-04-05 22:51 ` [Qemu-devel] [PATCH] " Paolo Bonzini
3 siblings, 1 reply; 9+ messages in thread
From: Eric Blake @ 2016-04-04 22:15 UTC (permalink / raw)
To: nbd-general; +Cc: den, qemu-devel, alex, pborzenkov
qemu already has an existing server implementation option that will
explicitly search the payload of NBD_CMD_WRITE for large blocks of
zeroes, and punch holes in the underlying file. For old clients
that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
workaround to keep the server's destination file approximately as
sparse as the client's source. However, for new clients that know
how to explicitly request holes, it is unnecessary overhead; and
can lead to the server punching a hole and risking fragmentation or
future ENOSPC even when the client explicitly wanted to write
zeroes rather than a hole. So it makes sense to let the new
NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
Signed-off-by: Eric Blake <eblake@redhat.com>
---
v2: fix some typos, add a sentence about server MUST support
NBD_CMD_FLAG_NO_HOLE if it advertises NBD_FLAG_SEND_WRITE_ZEROES
doc/proto.md | 38 ++++++++++++++++++++++++++++++--------
1 file changed, 30 insertions(+), 8 deletions(-)
diff --git a/doc/proto.md b/doc/proto.md
index 35a3266..bca0525 100644
--- a/doc/proto.md
+++ b/doc/proto.md
@@ -487,7 +487,7 @@ valid may depend on negotiation during the handshake phase.
`NBD_CMD_WRITE_ZEROES` commands. SHOULD be set to 1 if the client requires
"Force Unit Access" mode of operation. MUST NOT be set unless transmission
flags included `NBD_FLAG_SEND_FUA`.
-- bit 1, `NBD_CMD_NO_HOLE`; defined by the experimental `WRITE_ZEROES`
+- bit 1, `NBD_CMD_FLAG_NO_HOLE`; defined by the experimental `WRITE_ZEROES`
extension; see below.
- bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
extension; see below
@@ -574,7 +574,7 @@ The following request types exist:
After issuing this command, a client MUST NOT make any assumptions
about the contents of the export affected by this command, until
- overwriting it again with `NBD_CMD_WRITE`.
+ overwriting it again with `NBD_CMD_WRITE` or `NBD_CMD_WRITE_ZEROES`.
A client MUST NOT send a trim request unless `NBD_FLAG_SEND_TRIM`
was set in the transmission flags field.
@@ -737,8 +737,14 @@ by a sparse file. With current NBD command set, the client has to issue
through the wire. The server has to write the data onto disk, effectively
losing the sparseness.
-To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
-one new command and one new command flag.
+To remedy this, a `WRITE_ZEROES` extension is envisioned. This
+extension adds one new transmission flag, one new command, and one new
+command flag; and refines an existing command.
+
+* `NBD_FLAG_SEND_WRITE_ZEROES`
+
+ The server SHOULD set this transmission flag to 1 if the
+ `NBD_CMD_WRITE_ZEROES` request is supported.
* `NBD_CMD_WRITE_ZEROES`
@@ -772,12 +778,28 @@ The server SHOULD return `ENOSPC` if it receives a write zeroes request
including one or more sectors beyond the size of the device. It SHOULD
return `EPERM` if it receives a write zeroes request on a read-only export.
+* `NBD_CMD_WRITE`
+
+ By default, the server MAY search for large contiguous blocks of
+ all zero content, and use trimming to zero out those portions of
+ the write, even if it did not advertise `NBD_FLAG_SEND_TRIM`; but
+ it MUST ensure that any trimmed areas of data read back as zero.
+ However, the client MAY set the command flag
+ `NBD_CMD_FLAG_NO_HOLE` to inform the server that the entire
+ written area MUST be fully provisioned, ensuring that future
+ writes to the same area will not cause fragmentation or cause
+ failure due to insufficient space. Clients SHOULD NOT set this
+ flag unless the server advertised `NBD_FLAG_SEND_WRITE_ZEROES` in
+ the transmission flags.
+
The extension adds the following new command flag:
-- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
- SHOULD be set to 1 if the client wants to ensure that the server does
- not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
- if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
+- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE` and
+ `NBD_CMD_WRITE_ZEROES`. SHOULD be set to 1 if the client wants to
+ ensure that the server does not create a hole. The client MAY send
+ `NBD_CMD_FLAG_NO_HOLE` even if `NBD_FLAG_SEND_TRIM` was not set in
+ the transmission flags field. The server MUST support the use of
+ this flag if it advertises `NBD_FLAG_SEND_WRITE_ZEROES`.
### `STRUCTURED_REPLY` extension
--
2.5.5
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [Nbd] [PATCH v2] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
2016-04-04 22:15 ` [Qemu-devel] [PATCH v2] " Eric Blake
@ 2016-04-05 9:38 ` Markus Pargmann
2016-04-05 16:43 ` Eric Blake
0 siblings, 1 reply; 9+ messages in thread
From: Markus Pargmann @ 2016-04-05 9:38 UTC (permalink / raw)
To: nbd-general; +Cc: den, qemu-devel, pborzenkov
[-- Attachment #1: Type: text/plain, Size: 5480 bytes --]
Hi,
On Monday 04 April 2016 16:15:43 Eric Blake wrote:
> qemu already has an existing server implementation option that will
> explicitly search the payload of NBD_CMD_WRITE for large blocks of
> zeroes, and punch holes in the underlying file. For old clients
> that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
> workaround to keep the server's destination file approximately as
> sparse as the client's source. However, for new clients that know
> how to explicitly request holes, it is unnecessary overhead; and
> can lead to the server punching a hole and risking fragmentation or
> future ENOSPC even when the client explicitly wanted to write
> zeroes rather than a hole. So it makes sense to let the new
> NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
From the commit message it sounds like this is only for new clients
supporting WRITE_ZEROES because for those we don't want to search
through all the data of normal WRITEs. If you don't need to set this for
each WRITE individually perhaps we could move it to the negotiation
part?
Best Regards,
Markus
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>
> v2: fix some typos, add a sentence about server MUST support
> NBD_CMD_FLAG_NO_HOLE if it advertises NBD_FLAG_SEND_WRITE_ZEROES
>
> doc/proto.md | 38 ++++++++++++++++++++++++++++++--------
> 1 file changed, 30 insertions(+), 8 deletions(-)
>
> diff --git a/doc/proto.md b/doc/proto.md
> index 35a3266..bca0525 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -487,7 +487,7 @@ valid may depend on negotiation during the handshake phase.
> `NBD_CMD_WRITE_ZEROES` commands. SHOULD be set to 1 if the client requires
> "Force Unit Access" mode of operation. MUST NOT be set unless transmission
> flags included `NBD_FLAG_SEND_FUA`.
> -- bit 1, `NBD_CMD_NO_HOLE`; defined by the experimental `WRITE_ZEROES`
> +- bit 1, `NBD_CMD_FLAG_NO_HOLE`; defined by the experimental `WRITE_ZEROES`
> extension; see below.
> - bit 2, `NBD_CMD_FLAG_DF`; defined by the experimental `STRUCTURED_REPLY`
> extension; see below
> @@ -574,7 +574,7 @@ The following request types exist:
>
> After issuing this command, a client MUST NOT make any assumptions
> about the contents of the export affected by this command, until
> - overwriting it again with `NBD_CMD_WRITE`.
> + overwriting it again with `NBD_CMD_WRITE` or `NBD_CMD_WRITE_ZEROES`.
>
> A client MUST NOT send a trim request unless `NBD_FLAG_SEND_TRIM`
> was set in the transmission flags field.
> @@ -737,8 +737,14 @@ by a sparse file. With current NBD command set, the client has to issue
> through the wire. The server has to write the data onto disk, effectively
> losing the sparseness.
>
> -To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
> -one new command and one new command flag.
> +To remedy this, a `WRITE_ZEROES` extension is envisioned. This
> +extension adds one new transmission flag, one new command, and one new
> +command flag; and refines an existing command.
> +
> +* `NBD_FLAG_SEND_WRITE_ZEROES`
> +
> + The server SHOULD set this transmission flag to 1 if the
> + `NBD_CMD_WRITE_ZEROES` request is supported.
>
> * `NBD_CMD_WRITE_ZEROES`
>
> @@ -772,12 +778,28 @@ The server SHOULD return `ENOSPC` if it receives a write zeroes request
> including one or more sectors beyond the size of the device. It SHOULD
> return `EPERM` if it receives a write zeroes request on a read-only export.
>
> +* `NBD_CMD_WRITE`
> +
> + By default, the server MAY search for large contiguous blocks of
> + all zero content, and use trimming to zero out those portions of
> + the write, even if it did not advertise `NBD_FLAG_SEND_TRIM`; but
> + it MUST ensure that any trimmed areas of data read back as zero.
> + However, the client MAY set the command flag
> + `NBD_CMD_FLAG_NO_HOLE` to inform the server that the entire
> + written area MUST be fully provisioned, ensuring that future
> + writes to the same area will not cause fragmentation or cause
> + failure due to insufficient space. Clients SHOULD NOT set this
> + flag unless the server advertised `NBD_FLAG_SEND_WRITE_ZEROES` in
> + the transmission flags.
> +
> The extension adds the following new command flag:
>
> -- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
> - SHOULD be set to 1 if the client wants to ensure that the server does
> - not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
> - if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
> +- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE` and
> + `NBD_CMD_WRITE_ZEROES`. SHOULD be set to 1 if the client wants to
> + ensure that the server does not create a hole. The client MAY send
> + `NBD_CMD_FLAG_NO_HOLE` even if `NBD_FLAG_SEND_TRIM` was not set in
> + the transmission flags field. The server MUST support the use of
> + this flag if it advertises `NBD_FLAG_SEND_WRITE_ZEROES`.
>
> ### `STRUCTURED_REPLY` extension
>
>
--
Pengutronix e.K. | |
Industrial Linux Solutions | http://www.pengutronix.de/ |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [Nbd] [PATCH v2] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
2016-04-05 9:38 ` [Qemu-devel] [Nbd] " Markus Pargmann
@ 2016-04-05 16:43 ` Eric Blake
2016-04-05 20:45 ` Wouter Verhelst
0 siblings, 1 reply; 9+ messages in thread
From: Eric Blake @ 2016-04-05 16:43 UTC (permalink / raw)
To: Markus Pargmann, nbd-general; +Cc: den, qemu-devel, pborzenkov
[-- Attachment #1: Type: text/plain, Size: 3599 bytes --]
On 04/05/2016 03:38 AM, Markus Pargmann wrote:
> Hi,
>
> On Monday 04 April 2016 16:15:43 Eric Blake wrote:
>> qemu already has an existing server implementation option that will
>> explicitly search the payload of NBD_CMD_WRITE for large blocks of
>> zeroes, and punch holes in the underlying file. For old clients
>> that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
>> workaround to keep the server's destination file approximately as
>> sparse as the client's source. However, for new clients that know
>> how to explicitly request holes, it is unnecessary overhead; and
>> can lead to the server punching a hole and risking fragmentation or
>> future ENOSPC even when the client explicitly wanted to write
>> zeroes rather than a hole. So it makes sense to let the new
>> NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
>
> From the commit message it sounds like this is only for new clients
> supporting WRITE_ZEROES because for those we don't want to search
> through all the data of normal WRITEs. If you don't need to set this for
> each WRITE individually perhaps we could move it to the negotiation
> part?
Interesting idea. So we'd add a new NBD_OPT_XXX that lets the server
know that "I plan on using WRITE_ZEROS and TRIM as the only places where
I want you to trim, so you can avoid scanning for zeroes in WRITE"; the
server replies with NBD_REP_ACK if it understands the client (in which
case the server _should_ be advertising NBD_FLAG_SEND_WRITE_ZEROES
and/or NBD_FLAG_SEND_TRIM), and with NBD_REP_ERR_UNSUP if it is too old
(the server may still advertise TRIM, but probably should not advertise
WRITE_ZEROES - we are still early enough that we could mandate that any
server that supports WRITE_ZEROES also supports the new NBD_OPT_XXX).
The client then knows that either the server will be efficient with
WRITE and the client uses WRITE_ZEROES and TRIM as desired, or that the
server is old and the client is stuck with WRITE anyways (and whether
the server trims or not is beyond the client's control).
Meanwhile, the new server can unconditionally advertise SEND_TRIM and
SEND_WRITE_ZEROES, whether or not a client uses NBD_OPT_XXX. If it is
an old client connecting and no NBD_OPT_XXX is sent, chances are the
client is also too old to ever use WRITE_ZEROES, so the server should
feel free to apply its policy on whether to scan for zeroes in WRITE (in
qemu's case, the server policy is set via command line options,
precisely to cater to the scenario where we WANT the server to scan
zeroes to make up for the client being unable to pass sparse regions
efficiently vs. cases where the scanning is deemed too expensive); but
if the client DID send NBD_OPT_XXX, the server SHOULD NOT punch holes
during WRITE, and should not waste time scanning, no matter what the
command line policy permitted.
This also helps the case of clients divided between userspace and
kernel: the way I wrote the proposal, the kernel has to pay attention to
NBD_FLAG_SEND_WRITE_ZEROES, and if present, add NBD_CMD_FLAG_NO_HOLE to
every write. But with your proposal of option negotiation (done in
userspace), the default of WRITE is now the most efficient on new
servers, and unchanged for old servers, so the kernel doesn't have to do
anything different.
Does the idea of a new NBD_OPT_ make enough sense to write that up
rather than mandating the use of NBD_CMD_FLAG_NO_HOLE with WRITE?
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [Nbd] [PATCH v2] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
2016-04-05 16:43 ` Eric Blake
@ 2016-04-05 20:45 ` Wouter Verhelst
0 siblings, 0 replies; 9+ messages in thread
From: Wouter Verhelst @ 2016-04-05 20:45 UTC (permalink / raw)
To: Eric Blake; +Cc: Markus Pargmann, nbd-general, pborzenkov, qemu-devel, den
On Tue, Apr 05, 2016 at 10:43:14AM -0600, Eric Blake wrote:
> On 04/05/2016 03:38 AM, Markus Pargmann wrote:
> > Hi,
> >
> > On Monday 04 April 2016 16:15:43 Eric Blake wrote:
> >> qemu already has an existing server implementation option that will
> >> explicitly search the payload of NBD_CMD_WRITE for large blocks of
> >> zeroes, and punch holes in the underlying file. For old clients
> >> that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
> >> workaround to keep the server's destination file approximately as
> >> sparse as the client's source. However, for new clients that know
> >> how to explicitly request holes, it is unnecessary overhead; and
> >> can lead to the server punching a hole and risking fragmentation or
> >> future ENOSPC even when the client explicitly wanted to write
> >> zeroes rather than a hole. So it makes sense to let the new
> >> NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
> >
> > From the commit message it sounds like this is only for new clients
> > supporting WRITE_ZEROES because for those we don't want to search
> > through all the data of normal WRITEs. If you don't need to set this for
> > each WRITE individually perhaps we could move it to the negotiation
> > part?
>
> Interesting idea. So we'd add a new NBD_OPT_XXX that lets the server
> know that "I plan on using WRITE_ZEROS and TRIM as the only places where
> I want you to trim, so you can avoid scanning for zeroes in WRITE"; the
> server replies with NBD_REP_ACK if it understands the client (in which
> case the server _should_ be advertising NBD_FLAG_SEND_WRITE_ZEROES
> and/or NBD_FLAG_SEND_TRIM), and with NBD_REP_ERR_UNSUP if it is too old
> (the server may still advertise TRIM, but probably should not advertise
> WRITE_ZEROES - we are still early enough that we could mandate that any
> server that supports WRITE_ZEROES also supports the new NBD_OPT_XXX).
Certainly.
However, I think a server should be allowed to reply to this
NBD_OPT_NO_AUTO_HOLE (or whatever we end up calling it) with
NBD_REP_ERR_POLICY -- i.e., it understands the request, but server-side
configuration forbids it to heed it.
This kind of stuff is *always* a trade-off. Someone low on diskspace
might want to force their server to scan for zeroes, in the
understanding that things might break.
[...]
> Does the idea of a new NBD_OPT_ make enough sense to write that up
> rather than mandating the use of NBD_CMD_FLAG_NO_HOLE with WRITE?
Yeah, it does to me. The client shouldn't have to care much about this
kind of stuff.
--
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
people in the world who think they really understand all of its rules,
and pretty much all of them are just lying to themselves too.
-- #debian-devel, OFTC, 2016-02-12
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE
2016-04-04 14:15 [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE Eric Blake
` (2 preceding siblings ...)
2016-04-04 22:15 ` [Qemu-devel] [PATCH v2] " Eric Blake
@ 2016-04-05 22:51 ` Paolo Bonzini
3 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2016-04-05 22:51 UTC (permalink / raw)
To: Eric Blake, nbd-general@lists.sourceforge.net
Cc: Denis V. Lunev, qemu-devel, Pavel Borzenkov
On 04/04/2016 16:15, Eric Blake wrote:
> qemu already has an existing server implementation option that will
> explicitly search the payload of NBD_CMD_WRITE for large blocks of
> zeroes, and punch holes in the underlying file. For old clients
> that don't know how to use the new NBD_CMD_WRITE_ZEROES, this is a
> workaround to keep the server's destination file approximately as
> sparse as the client's source.
I don't think this is the case; the flag is explicitly meant to override
the client.
Paolo
> However, for new clients that know
> how to explicitly request holes, it is unnecessary overhead; and
> can lead to the server punching a hole and risking fragmentation or
> future ENOSPC even when the client explicitly wanted to write
> zeroes rather than a hole. So it makes sense to let the new
> NBD_CMD_FLAG_NO_HOLE work for WRITE as well as WRITE_ZEROES.
>
> Signed-off-by: Eric Blake <eblake-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
> ---
> doc/proto.md | 33 +++++++++++++++++++++++++++------
> 1 file changed, 27 insertions(+), 6 deletions(-)
>
> diff --git a/doc/proto.md b/doc/proto.md
> index 35a3266..fb97217 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -737,8 +737,14 @@ by a sparse file. With current NBD command set, the client has to issue
> through the wire. The server has to write the data onto disk, effectively
> losing the sparseness.
>
> -To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
> -one new command and one new command flag.
> +To remedy this, a `WRITE_ZEROES` extension is envisioned. This
> +extension adds one new transmission flag, one new command, and one new
> +command flag; and refines an existing command.
> +
> +* `NBD_FLAG_SEND_WRITE_ZEROES`
> +
> + The server SHOULD set this transmission flag to 1 if the
> + `NBD_CMD_WRITE_ZEROES` request is supported.
>
> * `NBD_CMD_WRITE_ZEROES`
>
> @@ -772,12 +778,27 @@ The server SHOULD return `ENOSPC` if it receives a write zeroes request
> including one or more sectors beyond the size of the device. It SHOULD
> return `EPERM` if it receives a write zeroes request on a read-only export.
>
> +* `NBD_CMD_WRITE`
> +
> + By default, the server MAY search for large contiguous blocks of
> + all zero content, and use trimming to zero out those portions of
> + the write, even if it did not advertise `NBD_FLAG_SEND_TRIM`; but
> + it MUST ensure that any trimmed areas of data read back as zero.
> + However, the client MAY set the command flag
> + `NBD_CMD_FLAG_NO_HOLE` to inform the server that the entire
> + written area MUST be fully provisioned, ensuring that future
> + writes to the same area will not cause fragmentation or cause
> + failure due to insufficient space. Clients SHOULD NOT set this
> + flag unless the server advertised `NBD_FLAG_SEND_WRITE_ZEROES` in
> + the transmisison flags.
> +
> The extension adds the following new command flag:
>
> -- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE_ZEROES`.
> - SHOULD be set to 1 if the client wants to ensure that the server does
> - not create a hole. The client MAY send `NBD_CMD_FLAG_NO_HOLE` even
> - if `NBD_FLAG_SEND_TRIM` was not set in the transmission flags field.
> +- `NBD_CMD_FLAG_NO_HOLE`; valid during `NBD_CMD_WRITE` and
> + `NBD_CMD_WRITE_ZEROES`. SHOULD be set to 1 if the client wants to
> + ensure that the server does not create a hole. The client MAY send
> + `NBD_CMD_FLAG_NO_HOLE` even if `NBD_FLAG_SEND_TRIM` was not set in
> + the transmission flags field.
>
> ### `STRUCTURED_REPLY` extension
>
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2016-04-05 22:51 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-04-04 14:15 [Qemu-devel] [PATCH] doc: Allow NBD_CMD_FLAG_NO_HOLE during NBD_CMD_WRITE Eric Blake
2016-04-04 14:47 ` Denis V. Lunev
2016-04-04 15:00 ` Eric Blake
2016-04-04 15:16 ` Alex Bligh
2016-04-04 22:15 ` [Qemu-devel] [PATCH v2] " Eric Blake
2016-04-05 9:38 ` [Qemu-devel] [Nbd] " Markus Pargmann
2016-04-05 16:43 ` Eric Blake
2016-04-05 20:45 ` Wouter Verhelst
2016-04-05 22:51 ` [Qemu-devel] [PATCH] " Paolo Bonzini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).