qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Benoît Canet" <benoit.canet@irqsave.net>
To: "Benoît Canet" <benoit.canet@irqsave.net>
Cc: kwolf@redhat.com, jcody@redhat.com, Fam Zheng <famz@redhat.com>,
	qemu-devel@nongnu.org, stefanha@redhat.com
Subject: Re: [Qemu-devel] [PATCH] block: Make op blocker recursive
Date: Sat, 21 Jun 2014 17:40:28 +0200	[thread overview]
Message-ID: <20140621154028.GC2108@irqsave.net> (raw)
In-Reply-To: <20140621153911.GB2108@irqsave.net>

The Saturday 21 Jun 2014 à 17:39:11 (+0200), Benoît Canet wrote :
> The Saturday 21 Jun 2014 à 23:15:19 (+0800), Fam Zheng wrote :
> > On Sat, 06/21 12:45, Benoît Canet wrote:
> > > The Saturday 21 Jun 2014 à 16:53:58 (+0800), Fam Zheng wrote :
> > > > On Fri, 06/20 09:30, Eric Blake wrote:
> > > > > On 06/19/2014 11:01 PM, Fam Zheng wrote:
> > > > > > On Thu, 06/19 22:20, Benoît Canet wrote:
> > > > > >> The Thursday 19 Jun 2014 à 14:13:20 (-0600), Eric Blake wrote :
> > > > > >>> On 06/19/2014 02:01 PM, Benoît Canet wrote:
> > > > > >>>> As the code will start to operate on arbitratry nodes we need the op blocker
> > > > > >>>
> > > > > >>> s/arbitratry/arbitrary/
> > > > > >>>
> > > > > >>>> to recursively block or unblock whole BDS subtrees.
> > > > > > 
> > > > > > I don't get the reason, can you elaborate?
> > > > > 
> > > > > Consider what happens if I have:
> > > > > 
> > > > > base <- snap1 <- active
> > > > > 
> > > > > then I start a fleecing NBD server on the state as it was at snap1:
> > > > > 
> > > > > base <- snap1 <- active
> > > > >               \- fleecing
> > > > > 
> > > > > then I do a blockpull into active:
> > > > > 
> > > > > base <- snap1 <- fleecing
> > > > > active
> > > > > 
> > > > > at this point, base and snap1 are no longer tied to active, but they
> > > > > STILL must be protected from operations that would modify their contents
> > > > > in a way that would break the fleecing operation.  The solution we are
> > > > > looking at is making BDS blockers recursive to every element of the
> > > > > chain, not just the top-level device.
> > > > 
> > > > This would already have been protected by backing blocker of fleecing target.
> > > > 
> > > > > 
> > > > > Another example: consider:
> > > > > 
> > > > > base <- snap1 <- active
> > > > > 
> > > > > then someone uses Jeff's proposed new change-backing-file QMP command to
> > > > > rewrite the snap1 metadata to point to base via a relative name instead
> > > > > of an absolute name.  It shouldn't matter whether active is blocked, but
> > > > > only whether snap1 is blocked.  But to know if snap1 is blocked, we have
> > > > > to propagate the blockers of active down recursively to its backing files.
> > > > 
> > > > Why do we need to block changging of metadata? I think this operation is safe
> > > > in most cases.
> > > > 
> > > > Correct me if I'm missing anything, but even if snap1 _is_ blocked, it would be
> > > > because snap1 is serving as backing of active. In this case, the actual blocker
> > > > should be active->backing_blocker.
> > > > 
> > > > > 
> > > > > >> What would be a cleaner solution ?
> > > > > > 
> > > > > > What is the question to solve?
> > > > > 
> > > > > I think Jeff's idea is on target - rather than blocking by operation, we
> > > > > should instead be blocking on access patterns (various operations
> > > > > trigger several access patterns):
> > > > > https://lists.gnu.org/archive/html/qemu-devel/2014-06/msg04752.html
> > > > > 
> > > > > Jeff's initial list included:
> > > > > 
> > > > > > So if I think of operations that are done on block devices from a
> > > > > > block job, and chuck them into categories, I think we have:
> > > > > > 
> > > > > > 1) Read of guest-visible data
> > > > > > 2) Write of guest-visible data
> > > > > > 3) Read of host-visible data (e.g. image file metadata)
> > > > > > 4) Write of host-visible data (e.g. image file metadata, such as
> > > > > > the backing-file)
> > > > > > 5) Block chain manipulations (e.g. movement of a BDS, change to r/w
> > > > > > instead of r/o, etc..)
> > > > > > 6) I/O attribute changes (e.g. throttling, etc..)
> > > > 
> > > > Most operations looks safe to me, given the way how IOThreads and coroutine
> > > > work now. It's only the chain manpulations in long running block jobs that are
> > > > exclusive, and by nature it should be checked per chain.  Can we set some op
> > > > blockers on the bottom BDS and check it each time, to prevent user from
> > > > starting a second chain manipulator?
> > > 
> > > I don't know if bottom BDS locking is any good because some driver like quorum
> > > have multiple childs.
> > > Locking everytime the root (top) of the tree seems a feasible solution indeed.
> > 
> > Quorom doesn't change the convensions of backing chain, so each child belongs
> > to its own backing chain, and that chain has a deterministic top and bottom.
> > 
> > Blocking flag on bottom saves us from adding reverse looking up (->overlay
> > pointer), because we already have the ->backing_hd pointer in BDS.
> 
> I like the consequence that when a loop is formed like in commit's drive-mirror
> run code and must be unlocked the bottom locked BDS will act as a guard to prevent
> unlock loop cycling.
> 
> We still have the issue of unlocking the bottom BDS when a subtree is detached
> from the graphs by a swap. (It does happen in my drive-mirror arbitrary node
> replacement series).
> 
> From my understanding the unlocking of the root BDS is done by drive_mirror_complete
> while the mirror code tries to unref the orphaned subtree _before_ drive_mirror_complete
> is called.

One fixe to my sentence:
s/drive_mirror_complete/block_job_complete/


> 
> So the bottom BDS would be unrefed before being unlocked.
> 
> Best regards
> 
> Benoît
> 
> > 
> > Fam
> > 

  reply	other threads:[~2014-06-21 15:40 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-19 20:01 [Qemu-devel] [PATCH] Recursive blocker Benoît Canet
2014-06-19 20:01 ` [Qemu-devel] [PATCH] block: Make op blocker recursive Benoît Canet
2014-06-19 20:13   ` Eric Blake
2014-06-19 20:20     ` Benoît Canet
2014-06-19 20:26       ` Eric Blake
2014-06-19 20:36         ` Benoît Canet
2014-06-20  5:01       ` Fam Zheng
2014-06-20 15:30         ` Eric Blake
2014-06-21  8:53           ` Fam Zheng
2014-06-21 10:45             ` Benoît Canet
2014-06-21 15:15               ` Fam Zheng
2014-06-21 15:39                 ` Benoît Canet
2014-06-21 15:40                   ` Benoît Canet [this message]
2014-06-23  4:32                     ` Fam Zheng
2014-06-23  5:17                       ` Benoît Canet
2014-06-23  7:07                         ` Fam Zheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140621154028.GC2108@irqsave.net \
    --to=benoit.canet@irqsave.net \
    --cc=famz@redhat.com \
    --cc=jcody@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).