From: Max Reitz <mreitz@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
qemu-block@nongnu.org
Cc: fam@euphon.net, kwolf@redhat.com, wencongyang2@huawei.com,
xiechanglong.d@gmail.com, qemu-devel@nongnu.org,
armbru@redhat.com, jsnow@redhat.com, stefanha@redhat.com,
den@openvz.org
Subject: Re: [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers
Date: Wed, 28 Aug 2019 21:50:51 +0200 [thread overview]
Message-ID: <bcfb962a-0ca1-7f3f-1db4-5098c66a08da@redhat.com> (raw)
In-Reply-To: <20190826161312.489398-14-vsementsov@virtuozzo.com>
[-- Attachment #1.1: Type: text/plain, Size: 8464 bytes --]
On 26.08.19 18:13, Vladimir Sementsov-Ogievskiy wrote:
> Drop write notifiers and use filter node instead.
>
> = Changes =
>
> 1. add filter-node-name argument for backup qmp api. We have to do it
> in this commit, as 257 needs to be fixed.
I feel a bit bad about it not being an implicit node. But I know you
don’t like them, they’re probably just broken altogether and because
libvirt doesn’t use backup (yet), probably nobody cares.
> 2. there no move write notifiers here, so is_write_notifier parameter
s/there/there are/, I suppose?
> is dropped from block-copy paths.
>
> 3. Intersecting requests handling changed, now synchronization between
> backup-top, backup and guest writes are all done in block/block-copy.c
> and works as follows:
>
> On copy operation, we work only with dirty areas. If bits are dirty it
> means that there are no requests intersecting with this area. We clear
> dirty bits and take bdrv range lock (bdrv_co_try_lock) on this area to
> prevent further operations from interaction with guest (only with
> guest, as neither backup nor backup-top will touch non-dirty area). If
> copy-operation failed we set dirty bits back together with releasing
> the lock.
>
> The actual difference with old scheme is that on guest writes we
> don't lock the whole region but only dirty-parts, and to be more
> precise: only dirty-part we are currently operate on. In old scheme
> guest write to non-dirty area (which may be safely ignored by backup)
> may wait for intersecting request, touching some other area which is
> dirty.
>
> 4. To sync with in-flight requests at job finish we now have drained
> removing of the filter, we don't need rw-lock.
>
> = Notes =
>
> Note the consequence of three objects appearing: backup-top, backup job
> and block-copy-state:
>
> 1. We want to insert backup-top before job creation, to behave similar
> with mirror and commit, where job is started upon filter.
>
> 2. We also have to create block-copy-state after filter injection, as
> we don't want it's source child be replaced by fitler. Instead we want
s/it's/its/, s/filter/filter/ (although “fitler” does have an amusing
ring to it)
> to keep BCS.source to be real source node, as we want to use
> bdrv_co_try_lock in CBW operations and it can't be used on filter, as
> on filter we already have in-flight (write) request from upper layer.
Reasonable, even more so as I suppose BCS.source should eventually be a
BdrvChild of backup-top.
What looks wrong is that the sync_bitmap is created on the source (“bs”
in backup_job_create()), but backup_cleanup_sync_bitmap() still assumes
it is on blk_bs(job->common.blk) (which is no longer true).
> So, we firstly create inject backup-top, then create job and BCS. BCS
> is the latest just to not create extra variable for it. Finally we set
> bcs for backup-top filter.
>
> = Iotest changes =
>
> 56: op-blocker doesn't shot now, as we set it on source, but then check
s/shot/show/?
> on filter, when trying to start second backup, so error caught in
> test_dismiss_collision is changed. It's OK anyway, as this test-case
> seems to test that after some collision we can dismiss first job and
> successfully start the second one. So, the change is that collision is
> changed from op-blocker to file-posix locks.
Well, but the op blocker belongs to the source, which should be covered
by internal permissions. The fact that it now shows up as a file-posix
error thus shows that the conflict also moves from the source to the
target. It’s OK because we actually don’t have a conflict on the source.
But I wonder whether it would be better for test_dismiss_collision() to
do a blockdev-backup instead where we can see the collision on the target.
Hm. On second thought, why do we even get a conflict on the target?
block-copy does share the WRITE permission for it...
> However, it's obvious now that we'd better drop this op-blocker at all
> and add a test-case for two backups from one node (to different
> destinations) actually works. But not in these series.
>
> 257: The test wants to emulate guest write during backup. They should
> go to filter node, not to original source node, of course. Therefore we
> need to specify filter node name and use it.
>
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
> ---
> qapi/block-core.json | 8 +-
> include/block/block-copy.h | 2 +-
> include/block/block_int.h | 1 +
> block/backup-top.c | 14 +-
> block/backup.c | 113 +++-----------
> block/block-copy.c | 55 ++++---
> block/replication.c | 2 +-
> blockdev.c | 1 +
> tests/qemu-iotests/056 | 2 +-
> tests/qemu-iotests/257 | 7 +-
> tests/qemu-iotests/257.out | 306 ++++++++++++++++++-------------------
> 11 files changed, 230 insertions(+), 281 deletions(-)
Nice.
I don’t have much to object (for some reason, I feel a bit uneasy about
dropping all the request waiting code, but I suppose that is indeed
taken care of with the range locks now).
[...]
> diff --git a/block/backup.c b/block/backup.c
> index d927c63e5a..259a165405 100644
> --- a/block/backup.c
> +++ b/block/backup.c
[...]
> @@ -552,6 +480,9 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
> backup_clean(&job->common.job);
> job_early_fail(&job->common.job);
> }
> + if (backup_top) {
> + bdrv_backup_top_drop(backup_top);
> + }
This order is weird because backup-top still has a pointer to the BCS,
but we have already freed it at this point.
>
> return NULL;
> }
> diff --git a/block/block-copy.c b/block/block-copy.c
> index 6828c46ba0..f3102ec3ff 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -98,27 +98,32 @@ fail:
> * error occurred, return a negative error number
> */
> static int coroutine_fn block_copy_with_bounce_buffer(
> - BlockCopyState *s, int64_t start, int64_t end, bool is_write_notifier,
> + BlockCopyState *s, int64_t start, int64_t end,
> bool *error_is_read, void **bounce_buffer)
> {
> int ret;
> - int nbytes;
> - int read_flags = is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0;
> + int nbytes = MIN(s->cluster_size, s->len - start);
> + void *lock = bdrv_co_try_lock(blk_bs(s->source), start, nbytes);
> +
> + /*
> + * Function must be called only on full-dirty region, so nobody touching or
> + * touched these bytes. Therefore, we must successfully get lock.
Which is OK because backup-top does not share the WRITE permission. But
it is organized a bit weirdly, because it’s actually block-copy here
that relies on the WRITE permission not to be shared; which it itself
cannot do because backup-top needs it.
This of course results from the fact that backup-top and block-copy
should really use the same object to access the source, namely the
backing BdrvChild of backup-top.
Maybe there should be a comment somewhere that points this out?
> + */
> + assert(lock);
[...]
> @@ -128,13 +133,16 @@ static int coroutine_fn block_copy_with_bounce_buffer(
> if (error_is_read) {
> *error_is_read = false;
> }
> - goto fail;
> + goto out;
> }
>
> - return nbytes;
> -fail:
> - bdrv_set_dirty_bitmap(s->copy_bitmap, start, s->cluster_size);
> - return ret;
> +out:
> + if (ret < 0) {
> + bdrv_set_dirty_bitmap(s->copy_bitmap, start, s->cluster_size);
> + }
> + bdrv_co_unlock(lock);
> +
> + return ret < 0 ? ret : nbytes;
>
I feel like it would still be simpler to keep the separate fail path and
just duplicate the bdrv_co_unlock(lock) call.
[...]
> diff --git a/blockdev.c b/blockdev.c
> index fbef6845c8..f89e48fc79 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -3601,6 +3601,7 @@ static BlockJob *do_backup_common(BackupCommon *backup,
> job = backup_job_create(backup->job_id, bs, target_bs, backup->speed,
> backup->sync, bmap, backup->bitmap_mode,
> backup->compress,
> + backup->filter_node_name,
Is this guaranteed to be NULL when not specified by the user?
Max
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
next prev parent reply other threads:[~2019-08-28 20:36 UTC|newest]
Thread overview: 39+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-26 16:12 [Qemu-devel] [PATCH v9 00/13] backup-top filter driver for backup Vladimir Sementsov-Ogievskiy
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 01/13] block/backup: fix backup_cow_with_offload for last cluster Vladimir Sementsov-Ogievskiy
2019-08-28 14:08 ` Max Reitz
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 02/13] block/backup: split shareable copying part from backup_do_cow Vladimir Sementsov-Ogievskiy
2019-08-28 14:22 ` Max Reitz
2019-08-28 14:27 ` Vladimir Sementsov-Ogievskiy
2019-08-28 14:32 ` Max Reitz
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 03/13] block/backup: introduce BlockCopyState Vladimir Sementsov-Ogievskiy
2019-08-28 15:59 ` Max Reitz
2019-08-29 10:52 ` Vladimir Sementsov-Ogievskiy
2019-09-02 16:09 ` Max Reitz
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 04/13] block/backup: adjust block-copy functions style Vladimir Sementsov-Ogievskiy
2019-08-28 16:06 ` Max Reitz
2019-08-29 11:25 ` Vladimir Sementsov-Ogievskiy
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 05/13] block: move block_copy from block/backup.c to separate file Vladimir Sementsov-Ogievskiy
2019-08-28 16:16 ` Max Reitz
2019-08-29 12:00 ` Vladimir Sementsov-Ogievskiy
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 06/13] block: teach bdrv_debug_breakpoint skip filters with backing Vladimir Sementsov-Ogievskiy
2019-08-28 16:21 ` Max Reitz
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 07/13] iotests: prepare 124 and 257 bitmap querying for backup-top filter Vladimir Sementsov-Ogievskiy
2019-08-28 16:40 ` Max Reitz
2019-08-29 13:22 ` Vladimir Sementsov-Ogievskiy
2019-09-02 16:10 ` Max Reitz
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 08/13] iotests: 257: drop unused Drive.device field Vladimir Sementsov-Ogievskiy
2019-08-28 16:50 ` Max Reitz
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 09/13] iotests: 257: drop device_add Vladimir Sementsov-Ogievskiy
2019-08-28 16:54 ` Max Reitz
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 10/13] block/io: refactor wait_serialising_requests Vladimir Sementsov-Ogievskiy
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 11/13] block: add lock/unlock range functions Vladimir Sementsov-Ogievskiy
2019-08-28 17:02 ` Max Reitz
2019-08-29 9:17 ` Vladimir Sementsov-Ogievskiy
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 12/13] block: introduce backup-top filter driver Vladimir Sementsov-Ogievskiy
2019-08-28 17:52 ` Max Reitz
2019-08-26 16:13 ` [Qemu-devel] [PATCH v9 13/13] block/backup: use backup-top instead of write notifiers Vladimir Sementsov-Ogievskiy
2019-08-28 19:50 ` Max Reitz [this message]
2019-08-29 14:55 ` Vladimir Sementsov-Ogievskiy
2019-09-02 16:34 ` Max Reitz
2019-09-03 8:06 ` Vladimir Sementsov-Ogievskiy
2019-09-09 9:12 ` Max Reitz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=bcfb962a-0ca1-7f3f-1db4-5098c66a08da@redhat.com \
--to=mreitz@redhat.com \
--cc=armbru@redhat.com \
--cc=den@openvz.org \
--cc=fam@euphon.net \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
--cc=wencongyang2@huawei.com \
--cc=xiechanglong.d@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).