From: Eric Blake <eblake@redhat.com>
To: Quentin Casasnovas <quentin.casasnovas@oracle.com>,
qemu-devel <qemu-devel@nongnu.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
qemu-trivial@nongnu.org, qemu-stable@nongnu.org,
"nbd-general@lists.sourceforge.net"
<nbd-general@lists.sourceforge.net>,
qemu block <qemu-block@nongnu.org>
Subject: Re: [Qemu-trivial] [Qemu-devel] [PATCH] nbd: fix trim/discard commands with a length bigger than NBD_MAX_BUFFER_SIZE
Date: Tue, 10 May 2016 08:01:00 -0600
Message-ID: <5731E99C.3000108@redhat.com>
In-Reply-To: <1462524302-15558-1-git-send-email-quentin.casasnovas@oracle.com>
[adding nbd-devel, qemu-block]
On 05/06/2016 02:45 AM, Quentin Casasnovas wrote:
> When running fstrim on a filesystem mounted through qemu-nbd with
> --discard=on, fstrim would fail with I/O errors:
>
> $ fstrim /k/spl/ice/
> fstrim: /k/spl/ice/: FITRIM ioctl failed: Input/output error
>
> and qemu-nbd was spitting these:
>
> nbd.c:nbd_co_receive_request():L1232: len (94621696) is larger than max len (33554432)
>
>
> The length of the request seems huge but this is really just the filesystem
> telling the block device driver that "this length should be trimmed", and,
> unlike for a NBD_CMD_READ or NBD_CMD_WRITE, we'll not try to read/write
> that amount of data from/to the NBD socket. It is thus safe to remove the
> length check for a NBD_CMD_TRIM.
>
> I've confirmed this with both the protocol documentation at:
>
> https://github.com/yoe/nbd/blob/master/doc/proto.md
Hmm. The current wording of the experimental block size additions does
NOT allow the client to send an NBD_CMD_TRIM with a size larger than the
maximum NBD_CMD_WRITE:

https://github.com/yoe/nbd/blob/extension-info/doc/proto.md#block-size-constraints

Maybe we should revisit that in the spec, and/or advertise yet another
block size (since the maximum size for a trim and/or write_zeroes
request may indeed be different from the maximum size for a read/write).

But since the kernel is the one sending the large length request, and
since you are right that this is not a denial-of-service in the amount
of data being sent in a single NBD message, I definitely agree that qemu
would be wise, as a matter of quality of implementation, to allow the
larger size for maximum interoperability, even if it exceeds advertised
limits. That is, when no limits are advertised, we should handle
everything possible that is not so large as to be construed a
denial-of-service, and an NBD_CMD_TRIM carries no payload, so it is not
large in that sense. When limits ARE advertised, a client that violates
them is out of spec, but we can still be liberal and respond
successfully to such a client rather than rejecting it outright.

So I think this patch is headed in the right direction.
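
To make the reasoning above concrete, here is a minimal sketch of the
policy I have in mind: bound the length only for commands whose length
describes bytes that actually cross the socket. This is not the patch
below (which exempts NBD_CMD_TRIM specifically) and not existing qemu
code. The helper names are hypothetical, and the constant values are
assumptions for illustration; the limit matches the "max len" in the log
message quoted earlier.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for this sketch; 32 MiB matches "max len (33554432)". */
#define NBD_MAX_BUFFER_SIZE   (32u * 1024u * 1024u)
#define NBD_CMD_MASK_COMMAND  0x0000ffffu
enum { NBD_CMD_READ = 0, NBD_CMD_WRITE = 1, NBD_CMD_TRIM = 4 };

/* Hypothetical helper: does 'len' describe data sent over the socket? */
static bool sketch_cmd_has_payload(uint32_t type)
{
    uint32_t cmd = type & NBD_CMD_MASK_COMMAND;

    return cmd == NBD_CMD_READ || cmd == NBD_CMD_WRITE;
}

/*
 * Hypothetical helper: returns 0 if the length is acceptable, or
 * -EINVAL if a payload-carrying command exceeds the buffer limit.
 * Commands without a payload (such as a trim) are not bounded here.
 */
static int sketch_check_request_len(uint32_t type, uint32_t len)
{
    if (sketch_cmd_has_payload(type) && len > NBD_MAX_BUFFER_SIZE) {
        return -EINVAL;
    }
    return 0;
}
```

With this shape, the 94621696-byte trim from the report would be
accepted, while a write of the same length would still be rejected.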
>
> and looking at the kernel side implementation of the nbd device
> (drivers/block/nbd.c) where it only sends the request header with no data
> for a NBD_CMD_TRIM.
>
> With this fix in, I am now able to run fstrim on my qcow2 images and keep
> them small (or at least keep their size proportional to the amount of data
> present on them).
>
> Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: <qemu-devel@nongnu.org>
> CC: <qemu-stable@nongnu.org>
> CC: <qemu-trivial@nongnu.org>
This is NOT trivial material and should not go in through that tree.
However, I concur that it qualifies for a backport on a stable branch.
> ---
> nbd.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/nbd.c b/nbd.c
> index b3d9654..e733669 100644
> --- a/nbd.c
> +++ b/nbd.c
> @@ -1209,6 +1209,11 @@ static ssize_t nbd_co_send_reply(NBDRequest *req, struct nbd_reply *reply,
>      return rc;
>  }
>
> +static bool nbd_should_check_request_size(const struct nbd_request *request)
> +{
> +    return (request->type & NBD_CMD_MASK_COMMAND) != NBD_CMD_TRIM;
> +}
> +
>  static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *request)
>  {
>      NBDClient *client = req->client;
> @@ -1227,7 +1232,8 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *reque
>          goto out;
>      }
>
> -    if (request->len > NBD_MAX_BUFFER_SIZE) {
> +    if (nbd_should_check_request_size(request) &&
> +        request->len > NBD_MAX_BUFFER_SIZE) {
I'd rather sort out the implications of this on the NBD protocol before
taking anything into qemu. We've got time on our hands, so let's use it
to get this right. (That, and I have several pending patches that
conflict with this as part of adding WRITE_ZEROES and INFO_BLOCK_SIZE
support, so it may be easier to resubmit this fix on top of my
pending patches.)
>          LOG("len (%u) is larger than max len (%u)",
>              request->len, NBD_MAX_BUFFER_SIZE);
>          rc = -EINVAL;
>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org