From: Manos Pitsidianakis <el13635@mail.ntua.gr>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
qemu-block <qemu-block@nongnu.org>,
qemu-devel <qemu-devel@nongnu.org>,
Alberto Garcia <berto@igalia.com>, John Snow <jsnow@redhat.com>,
Jeff Cody <jcody@redhat.com>
Subject: Re: [Qemu-devel] [RFC] block-insert-node and block-job-delete
Date: Tue, 1 Aug 2017 16:57:56 +0300 [thread overview]
Message-ID: <20170801135756.haskbcc3a7jv6lms@postretch> (raw)
In-Reply-To: <20170801135036.GC4257@localhost.localdomain>
[-- Attachment #1: Type: text/plain, Size: 6095 bytes --]
On Tue, Aug 01, 2017 at 03:50:36PM +0200, Kevin Wolf wrote:
>Am 31.07.2017 um 19:30 hat Manos Pitsidianakis geschrieben:
>> On Fri, Jul 28, 2017 at 02:08:43PM +0200, Kevin Wolf wrote:
>> > Am 27.07.2017 um 12:07 hat Stefan Hajnoczi geschrieben:
>> > > On Wed, Jul 26, 2017 at 09:23:20PM +0300, Manos Pitsidianakis wrote:
>> > > > On Wed, Jul 26, 2017 at 04:12:21PM +0100, Stefan Hajnoczi wrote:
>> > > > > On Wed, Jul 26, 2017 at 05:19:24PM +0300, Manos Pitsidianakis wrote:
>> > > > > > This proposal follows a discussion we had with Kevin and Stefan on filter
>> > > > > > node management.
>> > > > > >
>> > > > > > With block filter drivers arises a need to configure filter nodes on runtime
>> > > > > > with QMP on live graphs. A problem with doing live graph modifications is
>> > > > > > that some block jobs modify the graph when they are done and don't operate
>> > > > > > under any synchronisation, resulting in a race condition if we try to insert
>> > > > > > a filter node in the place of an existing edge.
>> > > > >
>> > > > > But maybe you are thinking about higher level race conditions between
>> > > > > QMP commands. Can you give an example of the race?
>> > > >
>> > > > Exactly, synchronisation between the QMP/user actions. Here's an example,
>> > > > correct me if I'm wrong:
>> > > >
>> > > > User issues a block-commit to this tree to commit A onto B:
>> > > > BB -> A -> B
>> > > >
>> > > > This inserts the dummy mirror node at the top since this is a mirror job
>> > > > (active commit)
>> > > > BB -> dummy -> A -> B
>> > > >
>> > > > User wants to add a filter node F below the BB. First we create the node:
>> > > > BB -> dummy -> A -> B
>> > > > F /
>> > > >
>> > > > Commit finishes before we do block-insert-node, dummy and A is removed from
>> > > > the BB's chain, mirror_exit calls bdrv_replace_node()
>> > > >
>> > > > BB ------------> B
>> > > > F -> /
>> > > >
>> > > > Now inserting F under BB will error since dummy does not exist any more.
>> > >
>> > > I see the diagrams but the flow isn't clear without the user's QMP
>> > > commands.
>> > >
>> > > F is created using blockdev-add? What are the parameters and how does
>> > > it know about dummy? I think this is an interesting question in itself
>> > > because dummy is a transient internal node that QMP users don't
>> > > necessarily know about.
>> >
>> > We can expect that users of block-insert-node also know about block job
>> > filter nodes, simply because the former is newer than the latter.
>> >
>> > (I also don't like the name "dummy", this makes it sound like it's not
>> > really a proper node. In reality, there is little reason to treat it
>> > specially.)
>> >
>> > > What is the full block-insert-node command?
>> > >
>> > > > The proposal doesn't guard against the following:
>> > > > User issues a block-commit to this tree to commit B onto C:
>> > > > BB -> A -> B -> C
>> > > >
>> > > > The dummy node (commit's top bs is A):
>> > > > BB -> A -> dummy -> B -> C
>> > > >
>> > > > blockdev-add of a filter we wish to eventually be between A and C:
>> > > > BB -> A -> dummy -> B -> C
>> > > > F/
>> > > >
>> > > > if commit finishes before we issue block-insert-node (commit_complete()
>> > > > calls bdrv_set_backing_hd() which only touches A)
>> > > > BB -> A --------------> C
>> > > > F -> dummy -> B /
>> > > > resulting in error when issuing block-insert-node,
>> > > >
>> > > > else if we set the job to manual-delete and insert F:
>> > > > BB -> A -> F -> dummy -> B -> C
>> > > > deleting the job will result in F being lost since the job's top bs wasn't
>> > > > updated.
>> > >
>> > > manual-delete is a solution for block jobs. The write threshold
>> > > feature is a plain QMP command that in the future will create a
>> > > transient internal node (before write notifier).
>> >
>> > Yes, that becomes a legacy QMP command then. The new way is blockdev-add
>> > and block-insert-node for write threshold, too.
>> >
>> > Apart from that, a write threshold node never disappears by itself, it
>> > is only inserted or removed in the context of a QMP command. That makes
>> > it a lot less dangerous than automatic block completion.
>> >
>> > > I'm not sure it makes sense to turn the write threshold feature into a
>> > > block job. I guess it could work, but it seems a little unnatural and
>> > > maybe there will be other features that use transient internal nodes.
>> >
>> > Yeah, no reason to do so. Just a manually created filter node.
>> >
>>
>> Can you explain what you have in mind? The current workflow is using
>> block-set-write-threshold on the targetted bs. If we want write threshold to
>> be on node level we would want a new filter driver so that it can take
>> options special to write-threshold.
>
>Yes, I think that's what we want.
>
>> Unless we make the notify filter be a write threshold by default, and
>> when using it internally by the backup job, disable the threshold and
>> add our backup notifier to the node's list. Of course the current
>> write threshold code could be refactored into a new driver so that it
>> doesn't have to rely on notifiers.
>
>As discussed on IRC, we shouldn't implement a bad interface (which
>notifiers are, they bypass normal block device configuration) in a
>different way, but we should replace it by something that fits better in
>the block layer design.
>
>In other words, don't add magic to provide the same notifier functions
>as we had, but convert their callers to have a block node of their own
>(i.e. one backup_top driver, one write-treshold driver, etc.).
>
>> The way I've currently done the conversion is to add an implicit
>> filter when enabling the threshold, just like in backup jobs.
>
>This is fine, but it should be specifically a write-threshold filter
>that a user can also manually insert whereever they like. The old QMP
>interfaces would be considered legacy commands then, because
>blockdev-add and blockdev-insert-node can provide the same thing.
>
>Kevin
Alright thanks, this will make it easier.
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
next prev parent reply other threads:[~2017-08-01 13:58 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-07-26 14:19 [Qemu-devel] [RFC] block-insert-node and block-job-delete Manos Pitsidianakis
2017-07-26 15:12 ` Stefan Hajnoczi
2017-07-26 18:23 ` Manos Pitsidianakis
2017-07-27 10:07 ` Stefan Hajnoczi
2017-07-28 12:08 ` Kevin Wolf
2017-07-31 14:53 ` Stefan Hajnoczi
2017-07-31 17:30 ` Manos Pitsidianakis
2017-08-01 13:50 ` Kevin Wolf
2017-08-01 13:57 ` Manos Pitsidianakis [this message]
2017-07-27 22:09 ` John Snow
2017-07-28 8:49 ` Manos Pitsidianakis
2017-07-28 11:55 ` Kevin Wolf
2017-08-02 10:47 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170801135756.haskbcc3a7jv6lms@postretch \
--to=el13635@mail.ntua.gr \
--cc=berto@igalia.com \
--cc=jcody@redhat.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).