From: Eric Blake <eblake@redhat.com>
To: "Denis V. Lunev" <den@openvz.org>,
qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: vsementsov@virtuozzo.com, Stefan Hajnoczi <stefanha@redhat.com>,
Fam Zheng <famz@redhat.com>, Kevin Wolf <kwolf@redhat.com>,
Max Reitz <mreitz@redhat.com>, Jeff Cody <jcody@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 4/9] mirror: efficiently zero out target
Date: Tue, 14 Jun 2016 21:00:27 -0600 [thread overview]
Message-ID: <5760C4CB.6080703@redhat.com> (raw)
In-Reply-To: <1465917916-22348-5-git-send-email-den@openvz.org>
[-- Attachment #1: Type: text/plain, Size: 4086 bytes --]
On 06/14/2016 09:25 AM, Denis V. Lunev wrote:
> With a bdrv_co_write_zeroes method on a target BDS zeroes will not be placed
> into the wire. Thus the target could be very efficiently zeroed out. This
> is should be done with the largest chunk possible.
>
> This improves the performance of the live migration of the empty disk by
> 150 times if NBD supports write_zeroes.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Fam Zheng <famz@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>
> CC: Jeff Cody <jcody@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> ---
> block/mirror.c | 32 +++++++++++++++++++++++++++++++-
> 1 file changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/block/mirror.c b/block/mirror.c
> index c7b3639..c2f8773 100644
> --- a/block/mirror.c
> +++ b/block/mirror.c
> @@ -21,6 +21,7 @@
> #include "qemu/ratelimit.h"
> #include "qemu/bitmap.h"
>
> +#define MIRROR_ZERO_CHUNK (3u << (29 - BDRV_SECTOR_BITS)) /* 1.5 Gb */
Probably nicer to track this in bytes. And do you really want a
hard-coded arbitrary limit, or is it better to live with
MIN_NON_ZERO(target_bs->bl.max_pwrite_zeroes, INT_MAX)?
> @@ -512,7 +513,8 @@ static int mirror_dirty_init(MirrorBlockJob *s)
>
> end = s->bdev_length / BDRV_SECTOR_SIZE;
>
> - if (base == NULL && !bdrv_has_zero_init(target_bs)) {
> + if (base == NULL && !bdrv_has_zero_init(target_bs) &&
> + target_bs->drv->bdrv_co_write_zeroes == NULL) {
Indentation is off, although if checkpatch.pl doesn't complain I guess
it doesn't matter that much.
Why should you care whether the target_bs->drv implements a callback?
Can't you just rely on the normal bdrv_*() functions to do the dirty
work of picking the most efficient implementation without you having to
bypass the block layer? In fact, isn't that the whole goal of
bdrv_make_zero() - why not call that instead of reimplementing it?
Patch needs rebasing - we've redone this into bdrv_co_pwrite_zeroes and
a byte interface, since upstream commit c1499a5e.
> bdrv_set_dirty_bitmap(s->dirty_bitmap, 0, end);
> return 0;
> }
> @@ -546,6 +548,34 @@ static int mirror_dirty_init(MirrorBlockJob *s)
> }
> sector_num += n;
> }
> +
> + if (base != NULL || bdrv_has_zero_init(target_bs)) {
You're now repeating the conditional that used to be 'bool
mark_all_dirty' (well, this is !mark_all_dirty); is it worth keeping the
simpler bool around?
> + /* no need to zero out entire disk */
> + return 0;
> + }
> +
> + for (sector_num = 0; sector_num < end; ) {
> + int nb_sectors = MIN(MIRROR_ZERO_CHUNK, end - sector_num);
Why limit yourself to 1.5G? It's either too small for what you can
really do, or too large for what the device permits. See my above
comment about MIN_NON_ZERO.
> + int64_t now = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> +
> + if (now - last_pause_ns > SLICE_TIME) {
> + last_pause_ns = now;
> + block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, 0);
> + }
> +
> + if (block_job_is_cancelled(&s->common)) {
> + return -EINTR;
> + }
> +
> + if (s->in_flight == MAX_IN_FLIGHT) {
> + trace_mirror_yield(s, s->in_flight, s->buf_free_count, -1);
> + mirror_wait_for_io(s);
> + continue;
> + }
Hmm - I guess your mirror yield points are why you couldn't just
directly use bdrv_make_zero(); but is that something where some code
refactoring can share more code rather than duplicating it?
> +
> + mirror_do_zero_or_discard(s, sector_num, nb_sectors, false);
> + sector_num += nb_sectors;
> + }
> return 0;
> }
>
>
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]
next prev parent reply other threads:[~2016-06-15 3:00 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-14 15:25 [Qemu-devel] [PATCH 0/9] major rework of drive-mirror Denis V. Lunev
2016-06-14 15:25 ` [Qemu-devel] [PATCH 1/9] mirror: fix calling of blk_aio_pwritev/blk_aio_preadv Denis V. Lunev
2016-06-14 22:48 ` Eric Blake
2016-06-14 15:25 ` [Qemu-devel] [PATCH 2/9] mirror: create mirror_dirty_init helper for mirror_run Denis V. Lunev
2016-06-15 2:29 ` Eric Blake
2016-06-14 15:25 ` [Qemu-devel] [PATCH 3/9] mirror: optimize dirty bitmap filling in mirror_run a bit Denis V. Lunev
2016-06-15 2:36 ` Eric Blake
2016-06-15 8:41 ` Denis V. Lunev
2016-06-15 12:25 ` Eric Blake
2016-06-14 15:25 ` [Qemu-devel] [PATCH 4/9] mirror: efficiently zero out target Denis V. Lunev
2016-06-15 3:00 ` Eric Blake [this message]
2016-06-15 8:46 ` Denis V. Lunev
2016-06-15 12:34 ` Eric Blake
2016-06-15 13:18 ` Denis V. Lunev
2016-07-06 14:33 ` Denis V. Lunev
2016-06-14 15:25 ` [Qemu-devel] [PATCH 5/9] mirror: improve performance of mirroring of empty disk Denis V. Lunev
2016-06-15 3:20 ` Eric Blake
2016-06-15 9:19 ` Stefan Hajnoczi
2016-06-15 10:37 ` Denis V. Lunev
2016-06-16 10:10 ` Stefan Hajnoczi
2016-06-17 2:53 ` Eric Blake
2016-06-17 13:56 ` Stefan Hajnoczi
2016-06-14 15:25 ` [Qemu-devel] [PATCH 6/9] block: pass qiov into before_write notifier Denis V. Lunev
2016-06-15 4:07 ` Eric Blake
2016-06-15 9:21 ` Stefan Hajnoczi
2016-06-15 9:24 ` Denis V. Lunev
2016-06-15 9:22 ` Stefan Hajnoczi
2016-06-14 15:25 ` [Qemu-devel] [PATCH 7/9] mirror: allow to save buffer for QEMUIOVector in MirrorOp Denis V. Lunev
2016-06-15 4:11 ` Eric Blake
2016-06-14 15:25 ` [Qemu-devel] [PATCH 8/9] mirror: use synch scheme for drive mirror Denis V. Lunev
2016-06-15 4:18 ` Eric Blake
2016-06-15 8:52 ` Denis V. Lunev
2016-06-15 9:48 ` Stefan Hajnoczi
2016-06-14 15:25 ` [Qemu-devel] [PATCH 9/9] mirror: replace bdrv_dirty_bitmap with plain hbitmap Denis V. Lunev
2016-06-15 9:06 ` [Qemu-devel] [PATCH 0/9] major rework of drive-mirror Kevin Wolf
2016-06-15 9:34 ` Denis V. Lunev
2016-06-15 10:25 ` Kevin Wolf
2016-06-15 10:44 ` Denis V. Lunev
2016-06-15 9:50 ` Stefan Hajnoczi
2016-06-15 11:09 ` Dr. David Alan Gilbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5760C4CB.6080703@redhat.com \
--to=eblake@redhat.com \
--cc=den@openvz.org \
--cc=famz@redhat.com \
--cc=jcody@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.