qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: berto@igalia.com, qemu-devel@nongnu.org, qemu-block@nongnu.org,
	mreitz@redhat.com
Subject: Re: [PATCH v6 04/10] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate
Date: Fri, 24 Apr 2020 14:17:12 +0200	[thread overview]
Message-ID: <20200424121712.GA4921@linux.fritz.box> (raw)
In-Reply-To: <436f15ce-da79-7016-6478-c5ab8642ee25@virtuozzo.com>

Am 24.04.2020 um 08:16 hat Vladimir Sementsov-Ogievskiy geschrieben:
> 23.04.2020 18:01, Kevin Wolf wrote:
> > If BDRV_REQ_ZERO_WRITE is set and we're extending the image, calling
> > qcow2_cluster_zeroize() with flags=0 does the right thing: It doesn't
> > undo any previous preallocation, but just adds the zero flag to all
> > relevant L2 entries. If an external data file is in use, a write_zeroes
> > request to the data file is made instead.
> > 
> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > Reviewed-by: Max Reitz <mreitz@redhat.com>
> > ---
> >   block/qcow2-cluster.c |  2 +-
> >   block/qcow2.c         | 33 +++++++++++++++++++++++++++++++++
> >   2 files changed, 34 insertions(+), 1 deletion(-)
> > 
> > diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
> > index 17f1363279..4b5fc8c4a7 100644
> > --- a/block/qcow2-cluster.c
> > +++ b/block/qcow2-cluster.c
> > @@ -1795,7 +1795,7 @@ int qcow2_cluster_zeroize(BlockDriverState *bs, uint64_t offset,
> >       /* Caller must pass aligned values, except at image end */
> >       assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
> >       assert(QEMU_IS_ALIGNED(end_offset, s->cluster_size) ||
> > -           end_offset == bs->total_sectors << BDRV_SECTOR_BITS);
> > +           end_offset >= bs->total_sectors << BDRV_SECTOR_BITS);
> >       /* The zero flag is only supported by version 3 and newer */
> >       if (s->qcow_version < 3) {
> > diff --git a/block/qcow2.c b/block/qcow2.c
> > index 9cfbdfc939..ad621fe404 100644
> > --- a/block/qcow2.c
> > +++ b/block/qcow2.c
> > @@ -1726,6 +1726,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options,
> >       bs->supported_zero_flags = header.version >= 3 ?
> >                                  BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK : 0;
> > +    bs->supported_truncate_flags = BDRV_REQ_ZERO_WRITE;
> >       /* Repair image if dirty */
> >       if (!(flags & (BDRV_O_CHECK | BDRV_O_INACTIVE)) && !bs->read_only &&
> > @@ -4214,6 +4215,38 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
> >           g_assert_not_reached();
> >       }
> > +    if ((flags & BDRV_REQ_ZERO_WRITE) && offset > old_length) {
> > +        uint64_t zero_start = QEMU_ALIGN_UP(old_length, s->cluster_size);
> > +
> > +        /*
> > +         * Use zero clusters as much as we can. qcow2_cluster_zeroize()
> > +         * requires a cluster-aligned start. The end may be unaligned if it is
> > +         * at the end of the image (which it is here).
> > +         */
> > +        ret = qcow2_cluster_zeroize(bs, zero_start, offset - zero_start, 0);
> > +        if (ret < 0) {
> > +            error_setg_errno(errp, -ret, "Failed to zero out new clusters");
> > +            goto fail;
> > +        }
> > +
> > +        /* Write explicit zeros for the unaligned head */
> > +        if (zero_start > old_length) {
> > +            uint8_t *buf = qemu_blockalign0(bs, s->cluster_size);
> 
> Why not s/s->cluster_size/zero_start - old_length/? We may save a lot
> of pages if cluster_size is large.

Ok.

> > +            QEMUIOVector qiov;
> > +            qemu_iovec_init_buf(&qiov, buf, zero_start - old_length);
> > +
> > +            qemu_co_mutex_unlock(&s->lock);
> 
> We are in intermediate state here. Is it safe to unlock? Anything may
> happen, up to another truncate..

I don't think it's worse than unlocking during normal writes, which we
have been doing for a long time. I don't see anything that would corrupt
any internal state.

Of course, doing truncate in parallel with other operations around EOF
has somewhat undefined semantics because the order could be anything.
But if you don't want to get undefined results, you just shouldn't make
such requests. It's similar to reading and writing the same area at the
same time.

> > +            ret = qcow2_co_pwritev_part(bs, old_length, qiov.size, &qiov, 0, 0);
> 
> Better not do it if this cluster is already ZERO.. But it may be a
> later patch to improve that.

I doubt it's worth writing code to optimise a corner case inside a
corner case.

> > +            qemu_co_mutex_lock(&s->lock);
> > +
> > +            qemu_vfree(buf);
> > +            if (ret < 0) {
> > +                error_setg_errno(errp, -ret, "Failed to zero out the new area");
> > +                goto fail;
> > +            }
> > +        }
> > +    }
> > +
> >       if (prealloc != PREALLOC_MODE_OFF) {
> >           /* Flush metadata before actually changing the image size */
> >           ret = qcow2_write_caches(bs);

Kevin



  reply	other threads:[~2020-04-24 12:18 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-23 15:01 [PATCH v6 00/10] block: Fix resize (extending) of short overlays Kevin Wolf
2020-04-23 15:01 ` [PATCH v6 01/10] block: Add flags to BlockDriver.bdrv_co_truncate() Kevin Wolf
2020-04-23 15:01 ` [PATCH v6 02/10] block: Add flags to bdrv(_co)_truncate() Kevin Wolf
2020-04-23 15:01 ` [PATCH v6 03/10] block-backend: Add flags to blk_truncate() Kevin Wolf
2020-04-23 15:01 ` [PATCH v6 04/10] qcow2: Support BDRV_REQ_ZERO_WRITE for truncate Kevin Wolf
2020-04-23 15:18   ` Eric Blake
2020-04-23 15:48     ` Kevin Wolf
2020-04-24  6:16   ` Vladimir Sementsov-Ogievskiy
2020-04-24 12:17     ` Kevin Wolf [this message]
2020-04-24 14:16       ` Eric Blake
2020-04-24 14:27       ` Vladimir Sementsov-Ogievskiy
2020-04-23 15:01 ` [PATCH v6 05/10] raw-format: " Kevin Wolf
2020-04-24  6:21   ` Vladimir Sementsov-Ogievskiy
2020-04-23 15:01 ` [PATCH v6 06/10] file-posix: " Kevin Wolf
2020-04-23 15:01 ` [PATCH v6 07/10] block: truncate: Don't make backing file data visible Kevin Wolf
2020-04-23 15:21   ` Eric Blake
2020-04-23 17:59   ` Max Reitz
2020-04-24  6:45   ` Vladimir Sementsov-Ogievskiy
2020-04-23 15:01 ` [PATCH v6 08/10] iotests: Filter testfiles out in filter_img_info() Kevin Wolf
2020-04-24  6:51   ` Vladimir Sementsov-Ogievskiy
2020-04-23 15:01 ` [PATCH v6 09/10] iotests: Test committing to short backing file Kevin Wolf
2020-04-24  8:53   ` Vladimir Sementsov-Ogievskiy
2020-04-23 15:01 ` [PATCH v6 10/10] qcow2: Forward ZERO_WRITE flag for full preallocation Kevin Wolf
2020-04-23 15:36   ` Eric Blake
2020-04-23 16:04     ` Kevin Wolf
2020-04-23 16:15       ` Eric Blake
2020-04-23 18:05   ` Max Reitz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200424121712.GA4921@linux.fritz.box \
    --to=kwolf@redhat.com \
    --cc=berto@igalia.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).