From: Max Reitz <mreitz@redhat.com>
To: John Snow <jsnow@redhat.com>,
Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>,
qemu-devel@nongnu.org, qemu-block@nongnu.org
Cc: kwolf@redhat.com, den@openvz.org, armbru@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH] blockdev-backup: enable non-root nodes for backup
Date: Mon, 11 Dec 2017 18:05:09 +0100
Message-ID: <f3918f16-774b-4c73-7df4-8ec3e757ae3c@redhat.com>
In-Reply-To: <23523fec-96b1-f50b-6937-85daea8a832a@redhat.com>
On 2017-12-11 17:47, John Snow wrote:
>
>
> On 12/11/2017 11:31 AM, Max Reitz wrote:
>> On 2017-12-08 18:09, John Snow wrote:
>>>
>>>
>>> On 12/08/2017 09:30 AM, Max Reitz wrote:
>>>> On 2017-12-05 01:48, John Snow wrote:
>>>>>
>>>>>
>>>>> On 12/04/2017 05:21 PM, Max Reitz wrote:
>>>>>> On 2017-12-04 23:15, John Snow wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 12/01/2017 02:41 PM, Max Reitz wrote:
>>>>>>>> ((By the way, I don't suppose that's how it should work... But I don't
>>>>>>>> suppose that we want propagation of dirtying towards the BDS roots, do
>>>>>>>> we? :-/))
>>>>>>>
>>>>>>> I have never really satisfactorily explained to myself what bitmaps on
>>>>>>> intermediate nodes truly represent or mean.
>>>>>>>
>>>>>>> The simple case is "This layer itself serviced a write request."
>>>>>>>
>>>>>>> If that information is not necessarily meaningful, I'm not sure that's a
>>>>>>> problem except in configuration.
>>>>>>>
>>>>>>>
>>>>>>> ...Now, if you wanted to talk about bitmaps that associate with a
>>>>>>> Backend instead of a Node...
>>>>>>
>>>>>> But it's not about bitmaps on intermediate nodes, quite the opposite.
>>>>>> It's about bitmaps on roots but write requests happening on intermediate
>>>>>> nodes.
>>>>>>
>>>>>
>>>>> Oh, I see what you're saying. It magically doesn't really change my
>>>>> opinion, by coincidence!
>>>>>
>>>>>> Say you have a node I and two filter nodes A and B using it (and they
>>>>>> are OK with shared writers). There is a dirty bitmap on A.
>>>>>>
>>>>>> Now when a write request goes through B, I will obviously have changed,
>>>>>> and because A and B are filters, so will A. But the dirty bitmap on A
>>>>>> will still be clean.
>>>>>>
>>>>>> My example was that when you run a mirror over A, you won't see dirtying
>>>>>> from B. So you can't e.g. add a throttle driver between a mirror job
>>>>>> and the node you want to mirror, because the dirty bitmap on the
>>>>>> throttle driver will not be affected by accesses to the actual node.
>>>>>>
>>>>>> Max
>>>>>>
>>>>>
>>>>> Well, in this case I would say that a root BDS is not really any
>>>>> different from an intermediate one and can't really know what's going on
>>>>> in the world outside.
>>>>>
>>>>> At least, I think that's how we model it right now -- we pretend that we
>>>>> can record the activity of an entire drive graph by putting the bitmap
>>>>> on the root-most node we can get a hold of and assuming that all writes
>>>>> are going to go through us.
>>>>
>>>> Well, yeah, I know we do. But I consider this counter-intuitive and if
>>>> something is counter-intuitive it's often a bug.
>>>>
>>>>> Clearly this is increasingly false the more we modularise the block graph.
>>>>>
>>>>>
>>>>> *uhm*
>>>>>
>>>>>
>>>>> I would say that a bitmap attached to a BlockBackend should behave in
>>>>> the way you say: writes to any children should change the bitmap here.
>>>>>
>>>>> bitmaps attached to nodes shouldn't worry about such things.
>>>>
>>>> Do we have bitmaps attached to BlockBackends? I sure hope not.
>>>>
>>>> We should not have any interface that requires the use of BlockBackends
>>>> by now. If we do, that's something that has to be fixed.
>>>>
>>>> Max
>>>>
>>>
>>> I'm not sure what the right paradigm is anymore, then.
>>>
>>> A node is just a node, but something has to represent the "drive" as far
>>> as the device model sees it. I thought that *was* the BlockBackend, but
>>> is it not?
>>
>> Yes, and on the other side the BB represents the device model for the
>> block layer. But the thing is that the user should be blissfully
>> unaware... Or do you want to make bitmaps attachable to guest devices
>> (through the QOM path or ID) instead?
>>
>
> OK, sure -- the user can specify a device model to attach it to instead
> of a node. They don't have to be aware of the BB itself.
>
> The implementation though, I imagine it associates with that BB.

But that would be a whole new implementation...

>> (The block layer would then internally translate that to a BB. But
>> that's a bad internal interface because the bitmap is still attached to
>> a BDS, and it's a bad external interface because currently you can
>> attach bitmaps to nodes and only to nodes...)
>
> What if the type of bitmap we want to track trans-node changes was not
> attached to a BDS? That'd be one way to obviously discriminate between
> "This tracks tree-wide changes" and "This tracks node-local changes."

A new type of bitmap? :-/

> Implementation wise I don't have any actual thought as to how this could
> possibly be efficient. Maybe a bitmap reference at each BDS that is a
> child of that particular BB?
>
> On attach, the BDS gets a set-only reference to that bitmap.
> On detach, we remove the reference.
>
> Then, any writes anywhere in the tree will coagulate in one place. It
> may or may not be particularly true or correct, because a write down the
> tree doesn't necessarily mean the visible data has changed at the top
> layer, but I don't think we have anything institutionally set in place
> to determine if we are changing visible data up-graph with a write
> down-graph.

Hmmm... The first thing to clarify is whether we want two types of
bitmaps. I don't think there is much use to node-local bitmaps, all
bitmaps should track every dirtying of their associated node (wherever
it comes from).

However, if that is too much of a performance penalty... Then we
probably do have to distinguish between the two so that users only add
tree-wide bitmaps when they need them.

OTOH, I guess that in the common case it's not a performance penalty at
all, if done right. Usually, a node you attach a bitmap to will not
have child nodes that are written to by other nodes. So in the common
case your tree-wide bitmap is just a plain local bitmap and thus just as
efficient.

And if some child node is indeed written to by some other node... I
think you always want a tree-wide bitmap anyway.

So I think all bitmaps should be tree-wide and the fact that they
currently are not is a bug.
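
To make the difference concrete, here is a rough toy model in Python. It
is only a sketch and not QEMU code: Node, DirtyBitmap, path_down,
write_local and write_tree_wide are all made-up names, and it assumes
every parent is a filter, so that a change to a child is always visible
through its parents. It rebuilds the A/B/I graph from earlier in the
thread and compares node-local tracking with tree-wide tracking.

```python
# Toy model only. Node, DirtyBitmap and both write functions are
# hypothetical names for illustration; none of this is QEMU's
# block-layer API.

class DirtyBitmap:
    def __init__(self, name):
        self.name = name
        self.dirty = set()  # indices of dirty clusters

    def set_dirty(self, cluster):
        self.dirty.add(cluster)


class Node:
    def __init__(self, name, child=None):
        self.name = name
        self.child = child   # a single child is enough for filter chains
        self.parents = []    # filled in as parents attach to this node
        self.bitmaps = []
        if child is not None:
            child.parents.append(self)


def path_down(node):
    """Nodes a write entering at `node` passes through, top to bottom."""
    path = []
    while node is not None:
        path.append(node)
        node = node.child
    return path


def write_local(node, cluster):
    """Node-local tracking: only bitmaps on the request's own path get dirtied."""
    for n in path_down(node):
        for bm in n.bitmaps:
            bm.set_dirty(cluster)


def write_tree_wide(node, cluster):
    """Tree-wide tracking: also dirty bitmaps on every ancestor of a written node.

    Assumes every parent is a filter, so a change below is visible above.
    """
    pending = path_down(node)
    seen = set()
    while pending:
        n = pending.pop()
        if id(n) in seen:
            continue
        seen.add(id(n))
        for bm in n.bitmaps:
            bm.set_dirty(cluster)
        pending.extend(n.parents)


# The graph from earlier in the thread: filters A and B on top of node I,
# with a dirty bitmap on A only.
I = Node("I")
A = Node("A", child=I)
B = Node("B", child=I)
bitmap_a = DirtyBitmap("bitmap-on-A")
A.bitmaps.append(bitmap_a)

write_local(B, 7)
missed = set(bitmap_a.dirty)   # the write through B is not recorded on A

write_tree_wide(B, 7)
caught = set(bitmap_a.dirty)   # with upward propagation it is
```

When no other parent writes to a child, the upward walk only visits the
nodes on the write path plus whatever sits above the entry node, which
is why I would not expect a penalty in the common case. The sketch
ignores the case John raised, where a write down the tree does not
change the data visible at the top.
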
Max