From: Quentin Casasnovas <quentin.casasnovas@oracle.com>
To: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Eric Blake <eblake@redhat.com>, Alex Bligh <alex@alex.org.uk>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
"qemu-trivial@nongnu.org" <qemu-trivial@nongnu.org>,
Paolo Bonzini <pbonzini@redhat.com>,
"nbd-general@lists.sourceforge.net"
<nbd-general@lists.sourceforge.net>,
"qemu-stable@nongnu.org" <qemu-stable@nongnu.org>,
qemu block <qemu-block@nongnu.org>
Subject: Re: [Qemu-trivial] [Nbd] [Qemu-devel] [PATCH] nbd: fix trim/discard commands with a length bigger than NBD_MAX_BUFFER_SIZE
Date: Tue, 10 May 2016 18:33:09 +0200
Message-ID: <20160510163309.GF28315@chrystal.uk.oracle.com>
In-Reply-To: <20160510155444.GC28315@chrystal.uk.oracle.com>
On Tue, May 10, 2016 at 05:54:44PM +0200, Quentin Casasnovas wrote:
> On Tue, May 10, 2016 at 09:46:36AM -0600, Eric Blake wrote:
> > On 05/10/2016 09:41 AM, Alex Bligh wrote:
> > >
> > > On 10 May 2016, at 16:29, Eric Blake <eblake@redhat.com> wrote:
> > >
> > >> So the kernel is currently one of the clients that does NOT honor block
> > >> sizes, and as such, servers should be prepared for ANY size up to
> > >> UINT_MAX (other than DoS handling).
> > >
> > > Interesting followup question:
> > >
> > > If the kernel does not fragment TRIM requests at all (in the
> > > same way it fragments read and write requests), I suspect
> > > something bad may happen with TRIM requests over 2^31
> > > in size (particularly over 2^32 in size), as the length
> > > field in nbd only has 32 bits.
> > >
> > > Whether it supports block size constraints or not, it is
> > > going to need to do *some* breaking up of requests.
> >
> > Does anyone have an easy way to cause the kernel to request a trim
> > operation that large on a > 4G export? I'm not familiar enough with
> > EXT4 operation to know what file system operations you can run to
> > ultimately indirectly create a file system trim operation that large.
> > But maybe there is something simpler - does the kernel let you use the
> > fallocate(2) syscall operation with FALLOC_FL_PUNCH_HOLE or
> > FALLOC_FL_ZERO_RANGE on an fd backed by an NBD device?
> >
>
> It was fairly reproducible here, we just used a random qcow2 image with
> some Debian minimal system pre-installed, mounted that qcow2 image through
> qemu-nbd then compiled a whole kernel inside it. Then you can make clean
> and run fstrim on the mount point. I'm assuming you can go faster than
> that by just writing a big file to the qcow2 image mounted without -o
> discard, delete the big file, then remount with -o discard + run fstrim.
>
Looks like there's an easier way:
$ qemu-img create -f qcow2 foo.qcow2 10G
$ qemu-nbd --discard=on -c /dev/nbd0 foo.qcow2
$ mkfs.ext4 /dev/nbd0
mke2fs 1.42.13 (17-May-2015)
Discarding device blocks: failed - Input/output error
Creating filesystem with 2621440 4k blocks and 655360 inodes
Filesystem UUID: 25aeb51f-0dea-4c1d-8b65-61f6bcdf97e9
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632
Allocating group tables: done
Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
Notice the "Discarding device blocks: failed - Input/output error" line. I
bet that is mkfs.ext4 trying to trim all blocks prior to writing the
filesystem and getting an I/O error while doing so. I haven't verified it
is the same problem, but if it isn't, simply mount the resulting filesystem
and run fstrim on it:
$ mount -o discard /dev/nbd0 /tmp/foo
$ fstrim /tmp/foo
fstrim: /tmp/foo: FITRIM ioctl failed: Input/output error
Quentin
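
As a rough illustration of the request splitting discussed in the quoted
text above (the NBD length field has only 32 bits, so a large trim has to
be broken up somewhere), here is a minimal Python sketch that cuts one big
discard into requests no longer than a chosen cap. The helper name
split_discard and the 32 MiB cap are assumptions made for this example and
are not taken from the kernel or QEMU sources. The device size comes from
the mkfs.ext4 output above (2621440 blocks of 4k).

```python
# Hypothetical helper for illustration; not kernel or QEMU code.
UINT32_MAX = 2**32 - 1  # largest value a 32-bit length field can hold


def split_discard(offset, length, max_len):
    """Yield (offset, length) pairs covering [offset, offset + length),
    each at most max_len bytes long."""
    if not 0 < max_len <= UINT32_MAX:
        raise ValueError("max_len must fit in a 32-bit length field")
    end = offset + length
    while offset < end:
        chunk = min(max_len, end - offset)
        yield offset, chunk
        offset += chunk


# Size of the 10G image above: 2621440 blocks of 4 KiB each.
device_bytes = 2621440 * 4096

# Assumed per-request cap of 32 MiB, chosen only for this example.
max_len = 32 * 1024 * 1024

chunks = list(split_discard(0, device_bytes, max_len))
print(len(chunks), "requests, largest", max(n for _, n in chunks), "bytes")
```

With these numbers a whole-device discard is 10737418240 bytes, which does
not fit in 32 bits as a single request, and the sketch turns it into 320
requests of 32 MiB each.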