From: Wouter Verhelst <w@uter.be>
To: Eric Blake <eblake@redhat.com>
Cc: nbd-general@lists.sourceforge.net, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [Nbd] [PATCH 3/1] doc: Propose Structured Replies extension
Date: Tue, 29 Mar 2016 09:33:15 +0200 [thread overview]
Message-ID: <20160329073315.GC22386@grep.be> (raw)
In-Reply-To: <1459223796-28474-2-git-send-email-eblake@redhat.com>
Hi Eric,
After applying some of the other outstanding patches, this one doesn't apply
anymore. Can you rebase?
On Mon, Mar 28, 2016 at 09:56:36PM -0600, Eric Blake wrote:
> The existing transmission phase protocol is difficult to sniff,
> because correct interpretation of the server stream requires
> context from the client stream (or risks false positives if
> data payloads happen to contain the protocol magic numbers). It
> also prohibits the ability to do short reads, or to return a
> read error without also sending length bytes of data.
>
> Remedy this by adding a new global/client flag negotiation,
> which affects the response of the NBD_CMD_READ command, and sets
> forth rules for how future command responses must behave when
> they carry a data payload.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> doc/proto.md | 123 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 123 insertions(+)
>
> diff --git a/doc/proto.md b/doc/proto.md
> index 44579fc..f687e3e 100644
> --- a/doc/proto.md
> +++ b/doc/proto.md
> @@ -209,6 +209,10 @@ same value for handle as was sent by the client for each request that
> the server is replying to, so that the client may correlate which
> request is receiving a response.
>
> +Note that it is impossible to tell by reading just the server traffic
> +whether a data field will be present. To remedy this, the experimental
> +`Structured Reply` extension has been introduced; see below.
> +
> ## Values
>
> This section describes the value and meaning of constants (other than
> @@ -231,6 +235,8 @@ the first magic number.
> - bit 1, `NBD_FLAG_NO_ZEROES`; if set, and if the client replies with
> `NBD_FLAG_C_NO_ZEROES` in the client flags field, the server MUST NOT
> send the 124 bytes of zero at the end of the negotiation.
> +- bit 2, `NBD_FLAG_STRUCTURED_REPLY`; defined by the experimental
> + `Structured Reply` extension; see below.
>
> The server MUST NOT set any other flags, and SHOULD NOT change behaviour
> unless the client responds with a corresponding flag. The server MUST
> @@ -265,6 +271,8 @@ receiving the global flags from the server.
> - bit 1, `NBD_FLAG_C_NO_ZEROES`; MUST NOT be set if the server did not
> set `NBD_FLAG_NO_ZEROES`. If set, the server MUST NOT send the 124
> bytes of zeroes at the end of the negotiation.
> +- bit 2, `NBD_FLAG_C_STRUCTURED_REPLY`; defined by the experimental
> + `Structured Reply` extension; see below.
>
> Clients SHOULD NOT set any other flags; the server MUST drop the
> connection if the client sets an unknown flag, or a flag that does
> @@ -435,6 +443,10 @@ The following request types exist:
> signalling no error), the server MUST immediately close the
> connection; it MUST NOT send any further data to the client.
>
> + The experimental `Structured Reply` extension changes the set of
> + valid replies, in part to allow recovery after a partial read; see
> + below.
> +
> * `NBD_CMD_WRITE` (1)
>
> A write request. Length and offset define the location and amount of
> @@ -609,6 +621,117 @@ option reply type.
> message if they do not also send it as a reply to the
> `NBD_OPT_SELECT` message.
>
> +### `Structured Reply` extension
> +
> +Some major downsides of the default reply to `NBD_CMD_READ` is that it
> +is not possible to support partial reads (the command must succeed or
> +fail as a whole, and len bytes of data must be sent even on an error,
> +unless the connection is closed); nor is it possible to decode the
> +server traffic without also knowing what pending read requests were
> +sent by the client.
> +
> +To remedy this, a `Structured Reply` extension is envisioned. This
> +extension adds a global flag, a client flag, a new reply type during
> +the transmission phase, and alters the set of valid replies to an
> +existing command.
> +
> +* `NBD_FLAG_STRUCTURED_REPLY` (bit 2)
> +
> + This is a global flag; if set, and if the client replies with
> + `NBD_FLAG_C_STRUCTURED_REPLY` in the client flags field, the server
> + MUST use structured replies to the `NBD_CMD_READ` command.
> +
> +* `NBD_FLAG_C_STRUCTURED_REPLY` (bit 2)
> +
> + This is a client flag; MUST NOT be set if the server did not set
> + `NBD_FLAG_STRUCTURED_REPLY`. If set, the server must use structured
> + replies to the `NBD_CMD_READ` command.
> +
> +* Transmission phase
> +
> + The transmission phase includes a third message type: the structured
> + reply. If `NBD_FLAG_C_STRUCTURED_REPLY` was negotiated, then the
> + normal server reply will never contain a data payload, and all
> + server replies that send data will be in the following form:
> +
> + S: 32 bits, 0x668e33ef, magic (`NBD_STRUCTURED_REPLY_MAGIC`)
> + S: 32 bits, error
> + S: 64 bits, handle
> + S: 32 bits, length of payload (unsigned)
> + S: *length* bytes of payload data
> +
> + If `NBD_FLAG_C_STRUCTURED_REPLY` was not negotiated, then the normal
> + server reply with a data payload will be used for `NBD_CMD_READ`;
> + but any other replies with a data payload will still use a
> + structured reply (that is, only `NBD_CMD_READ` is allowed to send
> + data in the non-structured form, and negotiating
> + `NBD_FLAG_C_STRUCTURED_REPLY` only affects replies to
> + `NBD_CMD_READ`). This implies that any new commands that require
> + data in the reply should be gated by their own new global and client
> + flag. A server MAY refuse to allow a client that negotiates a
> + command that requires a structured reply, but does not also
> + negotiate `NBD_FLAG_C_STRUCTURED_REPLY`.
> +
> + In the generic form, the length field of a structured response MAY
> + be zero if there is no data payload, and the error field may be set
> + regardless of the length field (although where possible, the server
> + SHOULD use a length of zero when setting the error field). However,
> + particular commands may document additional restrictions regarding
> + what forms a valid response (for example, a structured response to
> + `NBD_CMD_READ` MUST NOT set the error field, and MUST have a
> + non-zero length of at least 9).
> +
> +* `NBD_CMD_READ`
> +
> + If `NBD_FLAG_C_STRUCTURED_REPLY` was not negotiated, then a read
> + request MUST always be answered by a single response (with magic
> + 0x67446698 `NBD_REPLY_MAGIC`); the response MUST include length
> + bytes of data according to the client's request, although those
> + bytes MAY be invalid if an error is returned, and the connection
> + MUST if an error occurs after a header claiming no error.
> +
> + Conversely, if the `NBD_FLAG_C_STRUCTURED_REPLY` was negotiated, the
> + response MUST be a sequence of zero or more structured replies (with
> + magic 0x668e33ef `NBD_STRUCTURED_REPLY_MAGIC`), followed by a
> + concluding normal response (0x67446698 `NBD_REPLY_MAGIC`), where the
> + final response MUST NOT have a data payload; all responses in the
> + sequence use the same handle from the client. The payload of each
> + intermediate structured reply, called a read chunk, MUST be of the
> + following form:
> +
> + S: 64 bits, offset (unsigned)
> + S: (*length* - 8) bytes of data
> +
> + Note that the amount of data returned in a read chunk is determined
> + by the length field of the structured reply, and not the original
> + length of the client's request. If the server sends a single read
> + chunk for a successful read, the server's length will be the
> + client's length plus 8, because the server must account for the
> + offset field in its reply. Similarly, a successful client request
> + for a read of 2^32-8 or more bytes MUST be split into at least two
> + read chunks, so that the length field does not suffer from overflow.
> +
> + The server MUST ensure that each read chunk lies within the original
> + offset and length of the original client request, MUST NOT send read
> + chunks that would cover the same offset more than once, and MUST
> + send at least one byte of data in addition to the offset field of
> + each read chunk. The server MAY send read chunks out of order, and
> + may interleave other responses between read replies. The server
> + MUST NOT set the error field of a read chunk; if an error occurs, it
> + MAY immediately end the sequence of structured response messages,
> + MUST send the error in the concluding normal response, and SHOULD
> + keep the connection open. The final non-structured response MUST
> + set an error unless the sum of data sent by all read chunks totals
> + the original client length request.
> +
> + The client SHOULD immediately close the connection if it detects
> + that the server has sent an offset more than once (whether or not
> + the overlapping data claimed to have the same contents), or if
> + receives the concluding normal reply without an error set but
> + without all bytes covered by read chunk(s). A future extension may
> + add a command flag that would allow the server to skip read chunks
> + for portions of the file that read as all zeroes.
> +
> ## About this file
>
> This file tries to document the NBD protocol as it is currently
> --
> 2.5.5
>
>
> ------------------------------------------------------------------------------
> Transform Data into Opportunity.
> Accelerate data analysis in your applications with
> Intel Data Analytics Acceleration Library.
> Click to learn more.
> http://pubads.g.doubleclick.net/gampad/clk?id=278785471&iu=/4140
> _______________________________________________
> Nbd-general mailing list
> Nbd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nbd-general
>
--
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
people in the world who think they really understand all of its rules,
and pretty much all of them are just lying to themselves too.
-- #debian-devel, OFTC, 2016-02-12
next prev parent reply other threads:[~2016-03-29 7:33 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-28 13:59 [Qemu-devel] [PATCH] doc: Mention proper use of handle Eric Blake
2016-03-29 3:56 ` [Qemu-devel] [PATCH 2/1] doc: More details on flag negotiation Eric Blake
2016-03-29 3:56 ` [Qemu-devel] [PATCH 3/1] doc: Propose Structured Replies extension Eric Blake
2016-03-29 7:33 ` Wouter Verhelst [this message]
2016-03-29 8:24 ` [Qemu-devel] [Nbd] " Alex Bligh
2016-03-29 14:21 ` Eric Blake
2016-03-29 14:37 ` Alex Bligh
2016-03-29 15:12 ` Eric Blake
2016-03-29 16:37 ` Wouter Verhelst
2016-03-29 17:34 ` Alex Bligh
2016-03-29 17:45 ` Eric Blake
2016-03-29 18:03 ` Wouter Verhelst
2016-03-29 18:07 ` Eric Blake
2016-03-29 18:19 ` Wouter Verhelst
2016-03-29 18:25 ` Eric Blake
2016-03-29 18:09 ` Alex Bligh
2016-03-29 17:53 ` Wouter Verhelst
2016-03-29 18:23 ` Eric Blake
2016-03-29 18:51 ` Wouter Verhelst
2016-03-29 19:06 ` Wouter Verhelst
2016-03-29 19:39 ` Alex Bligh
2016-03-29 20:00 ` Eric Blake
2016-03-29 20:18 ` Alex Bligh
2016-03-29 20:44 ` Alex Bligh
2016-03-29 21:05 ` Wouter Verhelst
2016-03-29 22:05 ` Alex Bligh
2016-03-29 22:45 ` Wouter Verhelst
2016-03-29 22:53 ` Alex Bligh
2016-03-29 7:11 ` [Qemu-devel] [PATCH] doc: Mention proper use of handle Wouter Verhelst
2016-03-29 13:59 ` Eric Blake
2016-03-29 23:00 ` [Qemu-devel] [PATCH v2 0/3] NBD Structured Read Eric Blake
2016-03-29 23:00 ` [Qemu-devel] [PATCH v2 1/3] NBD proto: add "Command flags" section Eric Blake
2016-03-29 23:00 ` [Qemu-devel] [PATCH v2 2/3] doc: Mention proper use of handle Eric Blake
2016-03-29 23:01 ` [Qemu-devel] [PATCH v2 3/3] doc: Propose Structured Read extension Eric Blake
2016-03-29 23:29 ` Eric Blake
2016-03-30 6:50 ` Alex Bligh
2016-03-30 17:45 ` Eric Blake
2016-03-30 19:51 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-30 20:54 ` Eric Blake
2016-03-30 21:26 ` Wouter Verhelst
2016-03-30 22:48 ` [Qemu-devel] " Alex Bligh
2016-03-30 20:44 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-30 8:09 ` [Qemu-devel] [Nbd] [PATCH v2 0/3] NBD Structured Read Wouter Verhelst
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160329073315.GC22386@grep.be \
--to=w@uter.be \
--cc=eblake@redhat.com \
--cc=nbd-general@lists.sourceforge.net \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).