From: Kevin Wolf <kwolf@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: nbd-general@lists.sourceforge.net, xieyingtai@huawei.com,
subo7@huawei.com, qemu-block@nongnu.org, eric.fangyi@huawei.com,
qemu-devel@nongnu.org, pbonzini@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension
Date: Wed, 7 Dec 2016 17:35:08 +0100 [thread overview]
Message-ID: <20161207163508.GJ4773@noname.str.redhat.com> (raw)
In-Reply-To: <12602b60-df5f-b365-c34a-80c37610a1fc@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 3522 bytes --]
Am 07.12.2016 um 16:50 hat Eric Blake geschrieben:
> On 12/07/2016 04:44 AM, Kevin Wolf wrote:
> >> Just because the NBD spec describes the bit does NOT require that
> >> servers HAVE to set the bit on all images that are all zeroes. It is
> >> perfectly compliant if the server never advertises the bit.
> >
> > True, but if no server exists that would actually make use of the
> > feature, it's kind of useless to include it in the spec.
>
> No server, or no client?
Well, you need both to make use of the feature.
> I think qemu can be a client fairly easily: if the server advertises
> the bit, then the client can set .bdrv_has_zero_init() and avoid
> rewriting any sector of a file that is already known to be zero.
Yes, the client part makes sense to me and should be easy. Are you going
to write the patches yourself?
> > The interesting case is probably where the image is
> > created externally with qemu-img before it's exported either with
> > qemu-nbd or the builtin server, and then we use it as a mirror target.
> >
> > Even in the rare cases where both image creation and the NBD server are
> > in the same process, bdrv_create() doesn't work on a BlockDriverState,
> > but just on a filename. So even then you would have to do hacks like
> > remembering file names between create and the first open or something
> > like that.
>
> Or add in a driver-specific callback that checks if a file is provably
> all-zeroes; for the raw file driver, check if lseek(SEEK_DATA) finds
> nothing, for the qcow2 driver, check for no backing files, and no L1
> table entries.
In theory, there should be a common efficient abstraction for these two:
bool bs_is_zeroed(BlockDriverState *bs)
{
int pum;
ret = bdrv_get_block_status(bs, 0, bs->total_sectors, *pnum, NULL);
return (ret == 0 && pnum == bs->total_sectors);
}
For raw this does the lseek() you mentioned, and for qcow2 it will look
at the L1 table and one L2 table (the first one that exists). This is a
bit more expensive than just looking at the L1 table, but so minimally
that it's far from justifying a new command or flag in a protocol.
Now, in practice, this doesn't work because bdrv_get_block_status() can
return early even if the contiguous area is longer that what it
returns. This is something that we should possibly fix sometime in qemu,
but it's also independent from NBD (we can loop in the NBD server to
give the desired semantics).
So we are already going to export a block status querying interface to
NBD. If we require there that the whole contiguous area of the same
status is described and the server can't just shorten it, then we
get the very same thing without a new flag.
> >> Another option on the NBD server side is to create a server option -
> >> when firing up a server to serve a particular file as an export, the
> >> user can explicitly tell the server to advertise the bit because the
> >> user has side knowledge that the file was just created (and then the
> >> burden of misbehavior is on the user if they mistakenly request the
> >> advertisement when it is not true).
> >
> > Maybe that's the only practical approach.
>
> But it's still a viable approach, and therefore this bit definition in
> the NBD protocol is worthwhile.
If it adds something that we can't easily get otherwise, and if someone
volunteers to write the patches, then yes. I'm not completely convinced
yet on that.
Kevin
[-- Attachment #2: Type: application/pgp-signature, Size: 836 bytes --]
next prev parent reply other threads:[~2016-12-07 16:35 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-05 23:42 [Qemu-devel] [PATCH] doc: Propose NBD_FLAG_INIT_ZEROES extension Eric Blake
2016-12-06 8:46 ` [Qemu-devel] [Nbd] " Alex Bligh
2016-12-06 10:30 ` Alex Bligh
2016-12-06 9:25 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2016-12-06 10:29 ` [Qemu-devel] [Nbd] " Alex Bligh
2016-12-06 15:21 ` [Qemu-devel] " Eric Blake
2016-12-07 10:44 ` Kevin Wolf
2016-12-07 15:50 ` Eric Blake
2016-12-07 16:35 ` Kevin Wolf [this message]
2016-12-06 10:18 ` [Qemu-devel] " Stefan Hajnoczi
2016-12-12 18:12 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-12-13 7:38 ` [Qemu-devel] [Qemu-block] " Kevin Wolf
2016-12-13 12:18 ` [Qemu-devel] [Nbd] [Qemu-block] " Wouter Verhelst
2016-12-13 13:19 ` Alex Bligh
2016-12-13 22:36 ` Eric Blake
2016-12-14 8:22 ` Wouter Verhelst
2016-12-14 16:37 ` Eric Blake
2016-12-14 16:54 ` Alex Bligh
2016-12-14 19:58 ` Wouter Verhelst
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161207163508.GJ4773@noname.str.redhat.com \
--to=kwolf@redhat.com \
--cc=eblake@redhat.com \
--cc=eric.fangyi@huawei.com \
--cc=nbd-general@lists.sourceforge.net \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=subo7@huawei.com \
--cc=xieyingtai@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).