qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, famz@redhat.com,
	stefanha@redhat.com, mreitz@redhat.com, jcody@redhat.com,
	eblake@redhat.com, jsnow@redhat.com, den@openvz.org
Subject: Re: [Qemu-devel] [PATCH 2/2] block/backup: fix fleecing scheme: use serialized writes
Date: Wed, 4 Jul 2018 12:39:18 +0200	[thread overview]
Message-ID: <20180704103918.GC4334@localhost.localdomain> (raw)
In-Reply-To: <20180703180751.243496-3-vsementsov@virtuozzo.com>

Am 03.07.2018 um 20:07 hat Vladimir Sementsov-Ogievskiy geschrieben:
> Fleecing scheme works as follows: we want a kind of temporary snapshot
> of active drive A. We create temporary image B, with B->backing = A.
> Then we start backup(sync=none) from A to B. From this point, B reads
> as point-in-time snapshot of A (A continues to be active drive,
> accepting guest IO).
> 
> This scheme needs some additional synchronization between reads from B
> and backup COW operations, otherwise, the following situation is
> theoretically possible:
> 
> (assume B is qcow2, client is NBD client, reading from B)
> 
> 1. client starts reading and take qcow2 mutex in qcow2_co_preadv, and
>    goes up to l2 table loading (assume cache miss)
> 
> 2) guest write => backup COW => qcow2 write =>
>    try to take qcow2 mutex => waiting
> 
> 3. l2 table loaded, we see that cluster is UNALLOCATED, go to
>    "case QCOW2_CLUSTER_UNALLOCATED" and unlock mutex before
>    bdrv_co_preadv(bs->backing, ...)
> 
> 4) aha, mutex unlocked, backup COW continues, and we finally finish
>    guest write and change cluster in our active disk A
> 
> 5. actually, do bdrv_co_preadv(bs->backing, ...) and read
>    _new updated_ data.
> 
> To avoid this, let's make all COW writes serializing, to not intersect
> with reads from B.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

I think this should work, even though we can still improve it a bit.

First, I don't think we need to disable copy offloading as long as we
add BDRV_REQ_SERIALISING support to bdrv_co_copy_range_internal(). The
thing that's a bit strange there is that there is only a single flags
parameter for both source and target. We may need to split that so
that we can pass BDRV_REQ_NO_SERIALISING for the source and
BDRV_REQ_SERIALISING for the target.

> +    /* Detect image fleecing */
> +    fleecing = sync_mode == MIRROR_SYNC_MODE_NONE && target->backing->bs == bs;

The other thing is that I'm not convinced of the condition here. This is
the narrowest thinkable condition to recognise the exact fleecing setup
we have in mind right now. However, it is not the condition to flag all
setups that are affected by the same problem.

I don't think sync_mode actually makes a meaningful difference here. If
someone wants to create a full backup and already access it while the
job is still in progress, they will still want it to be consistent.

It also doesn't make a difference whether the fleecing overlay has the
source directly attached as a backing node or whether there is a filter
node between them.

So while I'm not sure whether it really covers all interesting cases,
this condition would already be a lot better:

    fleecing = bdrv_chain_contains(target, bs);

Maybe rename the variable to serialise_target_writes because it's no
longer restricted to the exact fleecing case.

> +    if (fleecing) {
> +        if (compress) {
> +            error_setg(errp, "Image fleecing doesn't support compressed mode.");
> +            return NULL;
> +        }
> +    }

Why not fleecing && compress instead of a nested if? And why is
fleecing even at odds with compression?

Kevin

  reply	other threads:[~2018-07-04 10:40 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-07-03 18:07 [Qemu-devel] [PATCH 0/2] fix image fleecing Vladimir Sementsov-Ogievskiy
2018-07-03 18:07 ` [Qemu-devel] [PATCH 1/2] block: add BDRV_REQ_SERIALISING flag Vladimir Sementsov-Ogievskiy
2018-07-04 14:44   ` Vladimir Sementsov-Ogievskiy
2018-07-04 15:08     ` Kevin Wolf
2018-07-04 16:11       ` Vladimir Sementsov-Ogievskiy
2018-07-04 16:20         ` Kevin Wolf
2018-07-04 16:36           ` Vladimir Sementsov-Ogievskiy
2018-07-04 17:06             ` Vladimir Sementsov-Ogievskiy
2018-07-05  8:34               ` Kevin Wolf
2018-07-03 18:07 ` [Qemu-devel] [PATCH 2/2] block/backup: fix fleecing scheme: use serialized writes Vladimir Sementsov-Ogievskiy
2018-07-04 10:39   ` Kevin Wolf [this message]
2018-07-04 12:31     ` Vladimir Sementsov-Ogievskiy
2018-07-04 13:20     ` Vladimir Sementsov-Ogievskiy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180704103918.GC4334@localhost.localdomain \
    --to=kwolf@redhat.com \
    --cc=den@openvz.org \
    --cc=eblake@redhat.com \
    --cc=famz@redhat.com \
    --cc=jcody@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=vsementsov@virtuozzo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).