From: Asias He <asias@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>,
MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Subject: Re: [Qemu-devel] [PATCH] block: Produce zeros when protocols reading beyond end of file
Date: Mon, 19 Aug 2013 14:36:14 +0800
Message-ID: <20130819063614.GA14101@hj.localdomain>
In-Reply-To: <CAJSP0QX7HRXZe5_b1KF=N0=q8igH4auQnT4mfRXYKtbvyeGLoA@mail.gmail.com>
On Fri, Aug 16, 2013 at 10:41:36AM +0200, Stefan Hajnoczi wrote:
> On Mon, Aug 5, 2013 at 10:11 AM, Asias He <asias@redhat.com> wrote:
> > From: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
> >
> > Asias was debugging an issue creating qcow2 images on top of
> > non-file protocols. It boils down to this example using NBD:
> >
> > $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
> >
> > Notice the open -g option to set bs->growable. This means you can
> > read/write beyond end of file. Reading beyond end of file is supposed
> > to produce zeroes.
> >
> > We rely on this behavior in qcow2_create2() during qcow2 image
> > creation. We create a new file and then write the qcow2 header
> > structure using bdrv_pwrite(). Since QCowHeader is not a multiple of
> > sector size, block.c first uses bdrv_read() on the empty file to fetch
> > the first sector (should be all zeroes).
> >
> > Here is the output from the qemu-io NBD example above:
> >
> > $ qemu-io -c 'open -g nbd+unix:///?socket=/tmp/nbd.sock' -c 'read -v 0 512'
> > 00000000: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ................
> > 00000010: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ................
> > 00000020: ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ................
> > ...
> >
> > We are not zeroing the buffer! As a result qcow2 image creation on top
> > of protocols is not guaranteed to work even when file creation is
> > supported by the protocol.
> >
> > Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
> > Signed-off-by: Asias He <asias@redhat.com>
> > ---
> > block.c | 30 +++++++++++++++++++++++++++++-
> > 1 file changed, 29 insertions(+), 1 deletion(-)
> >
> > diff --git a/block.c b/block.c
> > index 01b66d8..deaf0a0 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -2544,7 +2544,35 @@ static int coroutine_fn bdrv_co_do_readv(BlockDriverState *bs,
> > }
> > }
> >
> > - ret = drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
> > + if (!bs->drv->protocol_name) {
> > + ret = drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
> > + } else {
> > + /* NBD doesn't support reading beyond end of file. */
> > + int64_t len, total_sectors, max_nb_sectors;
> > +
> > + len = bdrv_getlength(bs);
> > + if (len < 0) {
> > + ret = len;
> > + goto out;
> > + }
> > +
> > + total_sectors = len >> BDRV_SECTOR_BITS;
> > + max_nb_sectors = MAX(0, total_sectors - sector_num);
> > + if (max_nb_sectors > 0) {
> > + ret = drv->bdrv_co_readv(bs, sector_num,
> > + MIN(nb_sectors, max_nb_sectors), qiov);
> > + } else {
> > + ret = 0;
> > + }
> > +
> > + /* Reading beyond end of file is supposed to produce zeroes */
> > + if (ret == 0 && total_sectors < sector_num + nb_sectors) {
> > + size_t offset = MAX(0, total_sectors - sector_num);
> > + size_t bytes = (sector_num + nb_sectors - offset) *
> > + BDRV_SECTOR_SIZE;
> > + qemu_iovec_memset(qiov, offset * BDRV_SECTOR_SIZE, 0, bytes);
> > + }
> > + }
>
> This patch breaks qemu-iotests ./check -qcow2 022. This happens
> because qcow2 temporarily sets ->growable = 1 for vmstate accesses
> (which are stored beyond the end of regular image data).
I am a bit confused. This is from the other mail:
"""
> > I think it would break qcow2_load_vmstate(), which is basically a
> > bdrv_pread() after the end of the image.
>
> I see, then only protocols have to zero the buffer? In case of
> protocols, I think bdrv_getlength() returns the underlying file
> length, so qcow2_load_vmstate() would be a bdrv_pread() within the
> result of bdrv_getlength().
Limiting it to protocols solves the problem, I think.
"""
And in v1 of this patch, Kevin wanted a bs->growable check instead of the
protocol_name one.
"""
> - ret = drv->bdrv_co_readv(bs, sector_num, nb_sectors, qiov);
> + if (!bs->drv->protocol_name) {
I think !bs->growable is the right check.
Checking for the protocol name is always a hack and most times wrong.
"""
After switching back to the protocol_name check, the ./check -qcow2 022
test passes.
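
To make the intended semantics concrete, here is a minimal standalone
sketch of the clamp-and-zero-fill logic the patch is after. It is not the
block.c code: struct fake_image, backend_read() and read_zero_beyond_eof()
are made-up names, the buffer is flat rather than a QEMUIOVector, and
SECTOR_SIZE is assumed to match BDRV_SECTOR_SIZE (512).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512 /* assumed equal to BDRV_SECTOR_SIZE */

/* Hypothetical stand-in for a protocol backend: a flat in-memory image. */
struct fake_image {
    const uint8_t *data;
    int64_t total_sectors;
};

/* Hypothetical backend read; like NBD, it cannot read past end of file. */
static int backend_read(const struct fake_image *img, int64_t sector_num,
                        int64_t nb_sectors, uint8_t *buf)
{
    assert(sector_num >= 0 && nb_sectors >= 0);
    assert(sector_num + nb_sectors <= img->total_sectors);
    memcpy(buf, img->data + sector_num * SECTOR_SIZE,
           (size_t)(nb_sectors * SECTOR_SIZE));
    return 0;
}

/* Read what exists from the backend and zero the part beyond end of file. */
static int read_zero_beyond_eof(const struct fake_image *img,
                                int64_t sector_num, int64_t nb_sectors,
                                uint8_t *buf)
{
    /* Number of requested sectors that lie inside the file. */
    int64_t in_file = img->total_sectors - sector_num;
    int ret = 0;

    if (in_file < 0) {
        in_file = 0;
    }
    if (in_file > nb_sectors) {
        in_file = nb_sectors;
    }

    if (in_file > 0) {
        ret = backend_read(img, sector_num, in_file, buf);
    }

    /* The tail is relative to the request buffer, not to the file. */
    if (ret == 0 && in_file < nb_sectors) {
        memset(buf + in_file * SECTOR_SIZE, 0,
               (size_t)((nb_sectors - in_file) * SECTOR_SIZE));
    }
    return ret;
}
```

The zeroed tail is measured from the start of the request buffer, so its
length is nb_sectors minus the in-file part. An absolute sector number only
gives the same result when sector_num is 0.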
> static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf,
> int64_t pos, int size)
> {
> BDRVQcowState *s = bs->opaque;
> int growable = bs->growable;
> int ret;
>
> BLKDBG_EVENT(bs->file, BLKDBG_VMSTATE_LOAD);
> bs->growable = 1;
> ret = bdrv_pread(bs, qcow2_vm_state_offset(s) + pos, buf, size);
> bs->growable = growable;
>
> return ret;
> }
>
> Please *always* run qemu-iotests before submitting block patches:
> http://qemu-project.org/Documentation/QemuIoTests
OK.
> A simple but ugly way to fix this is for block.c to also have a
> ->zero_beyond_eof flag which enables the behavior you are adding.
> qcow2_load_vmstate() would disable ->zero_beyond_eof temporarily in
> addition to enabling ->growable.
I am wondering why the ->growable logic was introduced in the first
place. Adding yet another flag of this kind looks really ugly ;(
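
For reference, a rough sketch of how the suggested flag could be saved and
restored around the vmstate read, mirroring what qcow2_load_vmstate()
already does for ->growable. Everything here is a mock: struct mock_bs,
mock_pread() and mock_load_vmstate() are invented for illustration, the
zero_beyond_eof field only follows the name proposed above, and the vmstate
area is assumed to start right at the reported image length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mock of the relevant BlockDriverState fields (not the QEMU struct). */
struct mock_bs {
    bool growable;          /* allow access beyond the reported length */
    bool zero_beyond_eof;   /* proposed flag: reads past length give zeros */
    int64_t length;         /* reported image length in bytes */
    int64_t backing_len;    /* real amount of stored data, vmstate included */
    const uint8_t *backing;
};

/* Mock byte-granularity read honouring both flags. */
static int mock_pread(struct mock_bs *bs, int64_t offset, uint8_t *buf,
                      int size)
{
    int64_t limit = bs->zero_beyond_eof ? bs->length : bs->backing_len;
    int64_t i;

    assert(offset >= 0 && size >= 0);
    if (!bs->growable && offset + size > bs->length) {
        return -1; /* beyond end of file and not growable */
    }
    for (i = 0; i < size; i++) {
        int64_t pos = offset + i;
        buf[i] = pos < limit ? bs->backing[pos] : 0;
    }
    return 0;
}

/* Temporarily enable growable and disable zero_beyond_eof, then restore. */
static int mock_load_vmstate(struct mock_bs *bs, int64_t pos, uint8_t *buf,
                             int size)
{
    bool growable = bs->growable;
    bool zero_beyond_eof = bs->zero_beyond_eof;
    int ret;

    bs->growable = true;
    bs->zero_beyond_eof = false;
    ret = mock_pread(bs, bs->length + pos, buf, size);
    bs->growable = growable;
    bs->zero_beyond_eof = zero_beyond_eof;

    return ret;
}
```

With zero_beyond_eof left on, the same read would return zeros instead of
the stored vmstate, which is the failure mode test 022 catches.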
> It's not easy to call the internal qcow2_co_readv() from
> qcow2_load_vmstate() because the vmstate functions are byte
> granularity (like bdrv_pread()) while .bdrv_co_readv() is
> sector-granularity.
>
> Stefan
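
To illustrate the granularity mismatch, a small sketch of serving a
byte-granularity read on top of a sector-granularity callback by going
through a bounce buffer. This is only an illustration of the general
technique, not the bdrv_pread() implementation: sector_read_fn,
pread_via_sectors() and pattern_read() are hypothetical names, and
SECTOR_SIZE is assumed to be 512.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SECTOR_SIZE 512 /* assumed equal to BDRV_SECTOR_SIZE */

/* Hypothetical sector-granularity read callback. */
typedef int (*sector_read_fn)(void *opaque, int64_t sector_num,
                              int nb_sectors, uint8_t *buf);

/* Byte-granularity read built on whole-sector reads. */
static int pread_via_sectors(sector_read_fn read_sectors, void *opaque,
                             int64_t offset, uint8_t *buf, int size)
{
    int64_t first, last;
    int nb_sectors, ret;
    uint8_t *bounce;

    assert(offset >= 0 && size >= 0);
    if (size == 0) {
        return 0;
    }

    /* Smallest sector range covering [offset, offset + size). */
    first = offset / SECTOR_SIZE;
    last = (offset + size - 1) / SECTOR_SIZE;
    nb_sectors = (int)(last - first + 1);

    bounce = malloc((size_t)nb_sectors * SECTOR_SIZE);
    if (!bounce) {
        return -1;
    }

    ret = read_sectors(opaque, first, nb_sectors, bounce);
    if (ret == 0) {
        /* Copy out only the bytes the caller asked for. */
        memcpy(buf, bounce + (offset - first * SECTOR_SIZE), (size_t)size);
    }
    free(bounce);
    return ret;
}

/* Example backend: each byte holds the low 8 bits of its own offset. */
static int pattern_read(void *opaque, int64_t sector_num, int nb_sectors,
                        uint8_t *buf)
{
    int64_t base = sector_num * SECTOR_SIZE;
    int64_t i;

    (void)opaque;
    for (i = 0; i < (int64_t)nb_sectors * SECTOR_SIZE; i++) {
        buf[i] = (uint8_t)((base + i) & 0xff);
    }
    return 0;
}
```

A 4-byte read at offset 510 spans two sectors, so the helper reads both
and copies out the 4 bytes that straddle the boundary.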
--
Asias