From: Wouter Verhelst <w@uter.be>
To: Eric Blake <eblake@redhat.com>
Cc: nbd@other.debian.org, qemu-devel@nongnu.org,
qemu-block@nongnu.org, libguestfs@redhat.com
Subject: Re: [PATCH v2 2/6] spec: Tweak description of maximum block size
Date: Sun, 5 Mar 2023 10:45:15 +0200 [thread overview]
Message-ID: <ZARWm7/vh0gMdjEC@pc220518.home.grep.be> (raw)
In-Reply-To: <20230303222653.xivvmvilswu3lii3@redhat.com>
On Fri, Mar 03, 2023 at 04:26:53PM -0600, Eric Blake wrote:
> On Tue, Feb 21, 2023 at 05:21:37PM +0200, Wouter Verhelst wrote:
> > Hi Eric,
> >
> > Busy days, busy times. Sorry about the insane delays here.
>
> No problem; I've been tackling other things in the meantime too, so
> this extension has taken far longer than I planned for more reasons
> than just slow review time.
>
> >
> > On Mon, Nov 14, 2022 at 04:46:51PM -0600, Eric Blake wrote:
> > > Commit 9f30fedb improved the spec to allow non-payload requests that
> > > exceed any advertised maximum block size. Take this one step further
> > > by permitting the server to use NBD_EOVERFLOW as a hint to the client
> > > when a request is oversize (while permitting NBD_EINVAL for
> > > back-compat), and by rewording the text to explicitly call out that
> > > what is being advertised is the maximum payload length, not maximum
> > > block size. This becomes more important when we add 64-bit
> > > extensions, where it becomes possible to extend `NBD_CMD_BLOCK_STATUS`
> > > to have both an effect length (how much of the image does the client
> > > want status on - may be larger than 32 bits) and an optional payload
> > > length (a way to filter the response to a subset of negotiated
> > > metadata contexts). In the shorter term, it means that a server may
> > > (but not must) accept a read request larger than the maximum block
> > > size if it can use structured replies to keep each chunk of the
> > > response under the maximum payload limits.
> > > ---
> > > doc/proto.md | 127 +++++++++++++++++++++++++++++----------------------
> > > 1 file changed, 73 insertions(+), 54 deletions(-)
> > >
> > > diff --git a/doc/proto.md b/doc/proto.md
> > > index 8f08583..53c334a 100644
> > > --- a/doc/proto.md
> > > +++ b/doc/proto.md
> > > @@ -745,8 +745,8 @@ text unless the client insists on TLS.
> > >
> > > During transmission phase, several operations are constrained by the
> > > export size sent by the final `NBD_OPT_EXPORT_NAME` or `NBD_OPT_GO`,
> > > -as well as by three block size constraints defined here (minimum,
> > > -preferred, and maximum).
> > > +as well as by three block size constraints defined here (minimum
> > > +block, preferred block, and maximum payload).
> > >
> > > If a client can honour server block size constraints (as set out below
> > > and under `NBD_INFO_BLOCK_SIZE`), it SHOULD announce this during the
> > > @@ -772,15 +772,15 @@ learn the server's constraints without committing to them.
> > >
> > > If block size constraints have not been advertised or agreed on
> > > externally, then a server SHOULD support a default minimum block size
> > > -of 1, a preferred block size of 2^12 (4,096), and a maximum block size
> > > -that is effectively unlimited (0xffffffff, or the export size if that
> > > -is smaller), while a client desiring maximum interoperability SHOULD
> > > -constrain its requests to a minimum block size of 2^9 (512), and limit
> > > -`NBD_CMD_READ` and `NBD_CMD_WRITE` commands to a maximum block size of
> > > -2^25 (33,554,432). A server that wants to enforce block sizes other
> > > -than the defaults specified here MAY refuse to go into transmission
> > > -phase with a client that uses `NBD_OPT_EXPORT_NAME` (via a hard
> > > -disconnect) or which uses `NBD_OPT_GO` without requesting
> > > +of 1, a preferred block size of 2^12 (4,096), and a maximum block
> > > +payload size that is at least 2^25 (33,554,432) (even if the export
> > > +size is smaller); while a client desiring maximum interoperability
> > > +SHOULD constrain its requests to a minimum block size of 2^9 (512),
> > > +and limit `NBD_CMD_READ` and `NBD_CMD_WRITE` commands to a maximum
> > > +block size of 2^25 (33,554,432). A server that wants to enforce block
> > > +sizes other than the defaults specified here MAY refuse to go into
> > > +transmission phase with a client that uses `NBD_OPT_EXPORT_NAME` (via
> > > +a hard disconnect) or which uses `NBD_OPT_GO` without requesting
> >
> > This does more than what the commit message says: it also limits payload
> > size from 0xffffffff to 2^25. We already have a "A server that desires
> > maximum interoperability" clause that mentions the 2^25, so I'm not
> > entirely sure why we need to restrict that for the default cause.
> >
> > Remember, apart from specifying how something should be implemented, the
> > spec also documents current and historic behavior. I am probably
> > convinced that it makes more sense to steer people towards limiting to
> > 2^25, but it should be done in such a way that servers which accept a
> > 0xffffffff block size are not suddenly noncompliant. I don't think this
> > does that.
>
> I'll have to play more with the wording here. A client shouldn't send
> larger write requests to a server without knowing the server will
> accept it (because traditional servers disconnect instead - the
> alternative is to read the client's entire payload, and the larger the
> payload is, the longer that takes - the client is starving the server
> from serving other clients by making it chew through data).
Well, in the case of a preforking server (like nbd-server), that isn't
really the case, the client is instead only starving itself. But yeah.
> But sending larger read requests MAY be tolerated by a server that
> sends a graceful error message ("you requested more than I'm willing
> to send, but my error response is short so the connection can stay
> open"), even if several historical implementations have also hung up
> at that point (at least qemu implements a malloc() call for both read
> and writes; reads populate the buffer from the client, writes
> populate the buffer to send back to the client - the connection is
> aborted when the malloc() is not attempted because it exceeds 2^25
> bytes).
Yeah, nbd-server does that too, except for the abortion. At least intentionally
so; again, since it's a preforking server, the worst that can happen here is
that it gets an ENOMEM, at which point it will just die, which causes the
client to lose its connection and nothing else.
> But I also see your point about not declaring a server non-compliant
> merely because it allows for a larger payload, nor a client that sends
> larger payloads to a known-happy server that accepts those payloads.
> nbdkit historically chose 64M as its limits instead of qemu's 32M.
Right, thanks.
--
w@uter.{be,co.za}
wouter@{grep.be,fosdem.org,debian.org}
I will have a Tin-Actinium-Potassium mixture, thanks.
next prev parent reply other threads:[~2023-03-05 8:45 UTC|newest]
Thread overview: 65+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-11-14 22:41 [cross-project PATCH v2] NBD 64-bit extensions Eric Blake
2022-11-14 22:46 ` [PATCH v2 0/6] NBD spec changes for " Eric Blake
2022-11-14 22:46 ` [PATCH v2 1/6] spec: Recommend cap on NBD_REPLY_TYPE_BLOCK_STATUS length Eric Blake
2022-12-16 19:32 ` Vladimir Sementsov-Ogievskiy
2023-03-03 22:17 ` Eric Blake
2023-03-05 8:41 ` Wouter Verhelst
2023-03-06 8:48 ` [Libguestfs] " Laszlo Ersek
2023-03-06 13:48 ` Nir Soffer
2022-11-14 22:46 ` [PATCH v2 2/6] spec: Tweak description of maximum block size Eric Blake
2022-12-16 20:22 ` Vladimir Sementsov-Ogievskiy
2023-03-03 22:20 ` Eric Blake
2023-02-21 15:21 ` Wouter Verhelst
2023-03-03 22:26 ` Eric Blake
2023-03-05 8:45 ` Wouter Verhelst [this message]
2022-11-14 22:46 ` [PATCH v2 3/6] spec: Add NBD_OPT_EXTENDED_HEADERS Eric Blake
2022-12-19 18:26 ` Vladimir Sementsov-Ogievskiy
2023-02-22 9:49 ` Wouter Verhelst
2023-03-03 22:36 ` Eric Blake
2023-03-05 8:49 ` Wouter Verhelst
2022-11-14 22:46 ` [PATCH v2 4/6] spec: Allow 64-bit block status results Eric Blake
2022-11-14 22:46 ` [PATCH v2 5/6] spec: Introduce NBD_FLAG_BLOCK_STATUS_PAYLOAD Eric Blake
2023-02-22 10:05 ` Wouter Verhelst
2023-03-03 22:40 ` Eric Blake
2023-03-05 8:50 ` Wouter Verhelst
2022-11-14 22:46 ` [PATCH v2 6/6] RFC: spec: Introduce NBD_REPLY_TYPE_OFFSET_HOLE_EXT Eric Blake
2022-11-14 22:48 ` [PATCH v2 00/15] qemu patches for 64-bit NBD extensions Eric Blake
2022-11-14 22:48 ` [PATCH v2 01/15] nbd/client: Add safety check on chunk payload length Eric Blake
2022-11-14 22:48 ` [PATCH v2 02/15] nbd/server: Prepare for alternate-size headers Eric Blake
2022-11-14 22:48 ` [PATCH v2 03/15] nbd: Prepare for 64-bit request effect lengths Eric Blake
2022-11-14 22:48 ` [PATCH v2 04/15] nbd: Add types for extended headers Eric Blake
2022-11-14 22:48 ` [PATCH v2 05/15] nbd/server: Refactor handling of request payload Eric Blake
2022-11-14 22:48 ` [PATCH v2 06/15] nbd/server: Refactor to pass full request around Eric Blake
2022-11-14 22:48 ` [PATCH v2 07/15] nbd/server: Initial support for extended headers Eric Blake
2022-11-14 22:48 ` [PATCH v2 08/15] nbd/server: Support 64-bit block status Eric Blake
2022-11-14 22:48 ` [PATCH v2 09/15] nbd/client: Initial support for extended headers Eric Blake
2022-11-14 22:48 ` [PATCH v2 10/15] nbd/client: Accept 64-bit block status chunks Eric Blake
2022-11-14 22:48 ` [PATCH v2 11/15] nbd/client: Request extended headers during negotiation Eric Blake
2022-11-14 22:48 ` [PATCH v2 12/15] nbd/server: Prepare for per-request filtering of BLOCK_STATUS Eric Blake
2022-11-14 22:48 ` [PATCH v2 13/15] nbd/server: Add FLAG_PAYLOAD support to CMD_BLOCK_STATUS Eric Blake
2022-11-14 22:48 ` [PATCH v2 14/15] RFC: nbd/client: Accept 64-bit hole chunks Eric Blake
2022-11-14 22:48 ` [PATCH v2 15/15] RFC: nbd/server: Send 64-bit hole chunk Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 00/23] libnbd 64-bit NBD extensions Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 01/23] block_status: Refactor array storage Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 02/23] internal: Refactor layout of replies in sbuf Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 03/23] protocol: Add definitions for extended headers Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 04/23] states: Prepare to send 64-bit requests Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 05/23] states: Prepare to receive 64-bit replies Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 06/23] states: Break deadlock if server goofs on extended replies Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 07/23] generator: Add struct nbd_extent in prep for 64-bit extents Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 08/23] block_status: Track 64-bit extents internally Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 09/23] block_status: Accept 64-bit extents during block status Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 10/23] api: Add [aio_]nbd_block_status_64 Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 11/23] api: Add several functions for controlling extended headers Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 12/23] copy: Update nbdcopy to use 64-bit block status Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 13/23] dump: Update nbddump " Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 14/23] info: Expose extended-headers support through nbdinfo Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 15/23] info: Update nbdinfo --map to use 64-bit block status Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 16/23] examples: Update copy-libev " Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 17/23] ocaml: Add example for 64-bit extents Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 18/23] generator: Actually request extended headers Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 19/23] api: Add nbd_[aio_]opt_extended_headers() Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 20/23] interop: Add test of 64-bit block status Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 21/23] api: Add nbd_can_block_status_payload() Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 22/23] api: Add nbd_[aio_]block_status_filter() Eric Blake
2022-11-14 22:51 ` [libnbd PATCH v2 23/23] RFC: pread: Accept 64-bit holes Eric Blake
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZARWm7/vh0gMdjEC@pc220518.home.grep.be \
--to=w@uter.be \
--cc=eblake@redhat.com \
--cc=libguestfs@redhat.com \
--cc=nbd@other.debian.org \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).