From: "Richard W.M. Jones" <rjones@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
Alberto Garcia <berto@igalia.com>,
"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
QEMU <qemu-devel@nongnu.org>, Max Reitz <mreitz@redhat.com>,
"nbd@other.debian.org" <nbd@other.debian.org>,
"libguestfs@redhat.com" <libguestfs@redhat.com>
Subject: Re: Cross-project NBD extension proposal: NBD_INFO_INIT_STATE
Date: Mon, 10 Feb 2020 22:52:55 +0000
Message-ID: <20200210225255.GJ3888@redhat.com>
In-Reply-To: <cc6e1e2e-d3a9-c498-354b-d382b5623ca0@redhat.com>

On Mon, Feb 10, 2020 at 04:29:53PM -0600, Eric Blake wrote:
> On 2/10/20 4:12 PM, Richard W.M. Jones wrote:
> >On Mon, Feb 10, 2020 at 03:37:20PM -0600, Eric Blake wrote:
> >>For now, only 2 of those 16 bits are defined: NBD_INIT_SPARSE (the
> >>image has at least one hole) and NBD_INIT_ZERO (the image reads
> >>completely as zero); the two bits are orthogonal and can be set
> >>independently, although it is easy enough to see completely sparse
> >>files with both bits set.
> >
> >I think I'm confused about the exact meaning of NBD_INIT_SPARSE. Do
> >you really mean the whole image is sparse; or (as you seem to have
> >said above) that there exists a hole somewhere in the image but we're
> >not saying where it is and there can be non-sparse parts of the image?
>
> As implemented:
>
> NBD_INIT_SPARSE - there is at least one hole somewhere (allocation
> would be required to write to that part of the file), but there may
> be allocated data elsewhere in the image. Most disk images will fit
> this definition (for example, it is very common to have a hole
> between the MBR or GPT and the first partition containing a file
> system, or for file systems themselves to be sparse within the
> larger block device).

I think I'm still confused about why this particular flag would be
useful for clients (I can completely understand why clients need
NBD_INIT_ZERO).

But anyway ... could a flag indicating that the whole image is sparse
be useful, either as well as NBD_INIT_SPARSE or instead of it? You
could use it to avoid an initial disk trim, which is something that
mke2fs does:

https://github.com/tytso/e2fsprogs/blob/0670fc20df4a4bbbeb0edb30d82628ea30a80598/misc/mke2fs.c#L2768

and which is painfully slow over NBD for very large devices because of
the 32-bit limit on request sizes - try doing mke2fs on a 1E nbdkit
memory disk some time.

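
To put a rough number on the request-size point, here is a small
arithmetic sketch. It assumes the best case where a single trim request
may cover up to UINT32_MAX bytes; real clients and servers often use a
smaller maximum, which only makes the count larger. The function name
requests_needed() is made up for this illustration and is not part of
any NBD library.

```c
#include <assert.h>
#include <stdint.h>

/* Number of requests needed to cover `size` bytes when a single request
 * can carry at most `max_len` bytes (rounds up for a partial tail). */
uint64_t requests_needed(uint64_t size, uint64_t max_len)
{
    assert(max_len > 0);
    return size / max_len + (size % max_len != 0);
}
```

For a 1 EiB (2^60 byte) device, requests_needed(1ULL << 60, UINT32_MAX)
comes to 268435457, so roughly 268 million round trips for one full
trim, even in the best case.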
> NBD_INIT_ZERO - all bytes read as zero.
>
> The combination NBD_INIT_SPARSE|NBD_INIT_ZERO is common (generally,
> if you use lseek(SEEK_DATA) to prove the entire image reads as
> zeroes, you also know the entire image is sparse), but NBD_INIT_ZERO
> in isolation is also possible (especially with the qcow2 proposal of
> a persistent autoclear bit, where even with a fully preallocated
> qcow2 image you still know it reads as zeroes but there are no
> holes). But you are also right that for servers that can advertise
> both bits efficiently, NBD_INIT_SPARSE in isolation may be more
> common than NBD_INIT_SPARSE|NBD_INIT_ZERO (the former for most disk
> images, the latter only for a freshly-created image that happens to
> create with zero initialization).
>
> What's more, in my patches, I did NOT patch qemu to set or consume
> INIT_SPARSE; so far, it only sets/consumes INIT_ZERO. Of course, if
> we can find a reason WHY qemu should track whether a qcow2 image is
> fully-allocated, by demonstrating a qemu-img algorithm that becomes
> easier for knowing if an image is sparse (even if our justification
> is: "when copying an image, I want to know if the _source_ is
> sparse, to know whether I have to bend over backwards to preallocate
> the destination"), then using that in qemu makes sense for my v2
> patches. But for v1, my only justification was "when copying an
> image, I can skip holes in the source if I know the _destination_
> already reads as zeroes", which only needed INIT_ZERO.
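
The v1 justification quoted above can be sketched as follows. This is
only an illustration over in-memory buffers, not qemu-img's actual copy
loop; struct extent and copy_extents() are hypothetical names, and the
extent list is assumed to be supplied by the caller (for example from a
block-status query).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of one run of the source image. */
struct extent {
    size_t offset;  /* byte offset into the image */
    size_t length;  /* length of the run in bytes */
    bool zero;      /* true if the run reads as zeroes */
};

/* Copy src to dst extent by extent and return the number of bytes
 * actually written to dst.  When dst_init_zero is true the destination
 * is known to read as zeroes already, so zero extents are skipped;
 * otherwise they must be written explicitly. */
size_t copy_extents(uint8_t *dst, const uint8_t *src,
                    const struct extent *ext, size_t n, bool dst_init_zero)
{
    size_t written = 0;

    assert(dst != NULL && src != NULL && (ext != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        if (ext[i].zero) {
            if (dst_init_zero)
                continue;               /* nothing to do for this run */
            memset(dst + ext[i].offset, 0, ext[i].length);
        } else {
            memcpy(dst + ext[i].offset, src + ext[i].offset, ext[i].length);
        }
        written += ext[i].length;
    }
    return written;
}
```

Only the destination's INIT_ZERO state is consulted here, which matches
the point that this use case does not need INIT_SPARSE.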
>
> Some of the nbdkit patches demonstrate the some-vs.-all nature of
> the two bits; for example, in the split plugin, I initialize
> h->init_sparse = false; h->init_zero = true; then in a loop over
> each file change h->init_sparse to true if at least one file was
> sparse, and change h->init_zero to false if at least one file had
> non-zero contents.
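
The some-vs.-all aggregation described for the split plugin can be
sketched roughly like this. The names init_sparse and init_zero come
from the description above; struct file_state, struct init_state and
combine_init_state() are assumptions made for this illustration and are
not taken from the nbdkit source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-file summary (not nbdkit's actual structure). */
struct file_state {
    bool sparse;  /* file has at least one hole */
    bool zero;    /* file reads entirely as zeroes */
};

/* Combined state for the concatenation of all files. */
struct init_state {
    bool init_sparse;
    bool init_zero;
};

struct init_state combine_init_state(const struct file_state *files, size_t n)
{
    /* Start from "no holes seen" and "everything reads as zero". */
    struct init_state h = { .init_sparse = false, .init_zero = true };

    assert(files != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (files[i].sparse)
            h.init_sparse = true;   /* "some": one sparse file is enough */
        if (!files[i].zero)
            h.init_zero = false;    /* "all": one non-zero file breaks it */
    }
    return h;
}
```

init_sparse behaves as a logical OR across the files, while init_zero
behaves as a logical AND.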
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html