From: "Denis V. Lunev" <den@openvz.org>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org,
vsementsov@virtuozzo.com, Stefan Hajnoczi <stefanha@redhat.com>,
Fam Zheng <famz@redhat.com>, Max Reitz <mreitz@redhat.com>,
Jeff Cody <jcody@redhat.com>, Eric Blake <eblake@redhat.com>,
pbonzini@redhat.com
Subject: Re: [Qemu-devel] [PATCH 0/9] major rework of drive-mirror
Date: Wed, 15 Jun 2016 13:44:44 +0300 [thread overview]
Message-ID: <5761319C.2010705@openvz.org> (raw)
In-Reply-To: <20160615102515.GF4566@noname.redhat.com>
On 06/15/2016 01:25 PM, Kevin Wolf wrote:
> Am 15.06.2016 um 11:34 hat Denis V. Lunev geschrieben:
>> On 06/15/2016 12:06 PM, Kevin Wolf wrote:
>>> The second big thing is that I don't want to see new users of the
>>> notifiers in I/O functions. Let's try if we can't add a filter
>>> BlockDriver instead. Then we'd add an option to set the filter node-name
>>> in the mirror QMP command so that the management tool is aware of the
>>> node and can refer to it.
>> this will be much more difficult to implement at my experience.
> I agree that it will be more difficult, though I'm not sure whether it's
> much more.
>
>> Can you share more details why filters are bad?
> Basically, notifiers (and other code for supporting other features in
> the common I/O path) just don't fit the block layer design very well.
> They are more hacked on instead of designed in.
>
> This shows in different ways in different places, so I can't fully tell
> you the problems that using notifiers in the mirror code would cause
> (which is a problem in itself), but for example the copy-on-read code
> that has been added to the core block layer instead of a separate filter
> driver had trouble with ignoring its own internal requests that it made
> to the BDS, which was hacked around with a new request flag, i.e. making
> the core block layer even more complicated.
>
>
> An example that I can think of with respect to mirroring is taking a
> snapshot during the operation. Op blockers prevent this from happening
> at the moment, but it seems to be a very reasonable thing for a user to
> want. Let me use the filter node approach to visualise this:
>
> (raw-posix) (qcow2)
> source_file <- source <- mirror <- virtio-blk
> |
> target_file <- target <----+
>
> There are two ways to create a snapshot: Either you put it logically
> between the mirror and virtio-blk, so that only source (now a backing
> file), but no new writes will be mirrored. This is easy in both
> approaches, but maybe the less commonly wanted thing.
>
> The other option is putting the new overlay between source and mirror:
>
> (raw-posix) (qcow2)
> source_file <- source <- snap <- mirror <- virtio-blk
> |
> target_file <- target <------------+
>
> With the mirror intercept being its own BDS node, making this change is
> very easy and doesn't involve any code that is specific to mirroring.
>
> If it doesn't have a filter driver and uses notifiers, however, the
> mirror is directly tied to the source BDS, and now it doesn't just work,
> but you need extra code that explicitly moves the notifier from the
> source BDS to the snap BDS. And you need such code not only for
> mirroring, but for everything that potentially hooks into the I/O
> functions.
>
>
> Maybe these two examples give you an idea why I want to use the concepts
> that are designed into the block layer with much thought rather than
> hacked on notifiers. Notifiers may be simpler to implement in the first
> place, but they lead to a less consistent and harder to maintain block
> layer in the long run.
ok. great explanation, thank you
>>> If we don't do this now, we'll have to introduce it later and can't be
>>> sure that the management tool knows about it. This would complicate
>>> things quite a bit because we would have to make sure that the added
>>> node stays invisible to the management tool.
>>>
>>>
>>> I think these two things are the big architectural questions. The rest
>>> is hopefully more or less implementation details.
>> I completely agree with you.
>>
>> We have the following choices:
>> 1. implement parameter to use 'active'/'passive' mode from the very
>> beginning
>> 2. switch to 'active' mode upon receiving "block-job-complete"
>> command unconditionally
>> 3. switch to 'active' mode upon receiving "block-job-complete"
>> command with proper parameter
>> 4. switch to 'active' mode after timeout (I personally do not like
>> this option)
>>
>> I think that choices 1 and 3 do not contradict each other and
>> could be implemented to gather.
> I think we definitely want 1. and some way to switch after the fact.
> That you suggest three different conditions for doing the switch
> suggests that it is policy and doesn't belong in qemu, but in the
> management tools. So how about complementing 1. with 5.?
>
> 5. Provide a QMP command to switch between active and passive mode
reasonable. looks OK to me.
Den
next prev parent reply other threads:[~2016-06-15 10:45 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-14 15:25 [Qemu-devel] [PATCH 0/9] major rework of drive-mirror Denis V. Lunev
2016-06-14 15:25 ` [Qemu-devel] [PATCH 1/9] mirror: fix calling of blk_aio_pwritev/blk_aio_preadv Denis V. Lunev
2016-06-14 22:48 ` Eric Blake
2016-06-14 15:25 ` [Qemu-devel] [PATCH 2/9] mirror: create mirror_dirty_init helper for mirror_run Denis V. Lunev
2016-06-15 2:29 ` Eric Blake
2016-06-14 15:25 ` [Qemu-devel] [PATCH 3/9] mirror: optimize dirty bitmap filling in mirror_run a bit Denis V. Lunev
2016-06-15 2:36 ` Eric Blake
2016-06-15 8:41 ` Denis V. Lunev
2016-06-15 12:25 ` Eric Blake
2016-06-14 15:25 ` [Qemu-devel] [PATCH 4/9] mirror: efficiently zero out target Denis V. Lunev
2016-06-15 3:00 ` Eric Blake
2016-06-15 8:46 ` Denis V. Lunev
2016-06-15 12:34 ` Eric Blake
2016-06-15 13:18 ` Denis V. Lunev
2016-07-06 14:33 ` Denis V. Lunev
2016-06-14 15:25 ` [Qemu-devel] [PATCH 5/9] mirror: improve performance of mirroring of empty disk Denis V. Lunev
2016-06-15 3:20 ` Eric Blake
2016-06-15 9:19 ` Stefan Hajnoczi
2016-06-15 10:37 ` Denis V. Lunev
2016-06-16 10:10 ` Stefan Hajnoczi
2016-06-17 2:53 ` Eric Blake
2016-06-17 13:56 ` Stefan Hajnoczi
2016-06-14 15:25 ` [Qemu-devel] [PATCH 6/9] block: pass qiov into before_write notifier Denis V. Lunev
2016-06-15 4:07 ` Eric Blake
2016-06-15 9:21 ` Stefan Hajnoczi
2016-06-15 9:24 ` Denis V. Lunev
2016-06-15 9:22 ` Stefan Hajnoczi
2016-06-14 15:25 ` [Qemu-devel] [PATCH 7/9] mirror: allow to save buffer for QEMUIOVector in MirrorOp Denis V. Lunev
2016-06-15 4:11 ` Eric Blake
2016-06-14 15:25 ` [Qemu-devel] [PATCH 8/9] mirror: use synch scheme for drive mirror Denis V. Lunev
2016-06-15 4:18 ` Eric Blake
2016-06-15 8:52 ` Denis V. Lunev
2016-06-15 9:48 ` Stefan Hajnoczi
2016-06-14 15:25 ` [Qemu-devel] [PATCH 9/9] mirror: replace bdrv_dirty_bitmap with plain hbitmap Denis V. Lunev
2016-06-15 9:06 ` [Qemu-devel] [PATCH 0/9] major rework of drive-mirror Kevin Wolf
2016-06-15 9:34 ` Denis V. Lunev
2016-06-15 10:25 ` Kevin Wolf
2016-06-15 10:44 ` Denis V. Lunev [this message]
2016-06-15 9:50 ` Stefan Hajnoczi
2016-06-15 11:09 ` Dr. David Alan Gilbert
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5761319C.2010705@openvz.org \
--to=den@openvz.org \
--cc=eblake@redhat.com \
--cc=famz@redhat.com \
--cc=jcody@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).