qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Krempa <pkrempa@redhat.com>, svc-armband <armband@enea.com>,
	Markus Armbruster <armbru@redhat.com>,
	Jeff Cody <jcody@redhat.com>,
	Ciprian Barbu <Ciprian.Barbu@enea.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	Max Reitz <mreitz@redhat.com>,
	Alexandru Avadanii <Alexandru.Avadanii@enea.com>
Subject: Re: [Qemu-devel] nbd: Possible regression in 2.9 RCs
Date: Thu, 6 Apr 2017 11:03:47 +0200	[thread overview]
Message-ID: <20170406090347.GD4341@noname.redhat.com> (raw)
In-Reply-To: <20170406084824.GA4341@noname.redhat.com>

Am 06.04.2017 um 10:48 hat Kevin Wolf geschrieben:
> Am 05.04.2017 um 23:13 hat Paolo Bonzini geschrieben:
> > On 05/04/2017 13:01, Kevin Wolf wrote:
> > > Am 04.04.2017 um 17:09 hat Paolo Bonzini geschrieben:
> > >> On 04/04/2017 16:53, Kevin Wolf wrote:
> > >>>> The big question is how this fits into release management.  We have
> > >>>> another important regression from the op blocker work and only a week
> > >>>> to go before the last rc.  Are we going to delay 2.9 arbitrarily?  Are
> > >>>> we going to shorten the 2.10 development period correspondingly?  (I
> > >>>> vote yes and yes, FWIW).
> > >>> Which is the other regression?
> > >>
> > >> The assertion failure for snapshot_blkdev with iothreads.
> > > 
> > > Ah, right, I keep forgetting that this started appearing with the op
> > > blocker series because the failure mode is completely different, so it
> > > seems to have been a latent bug somewhere else that was uncovered by it.
> > > 
> > > If we're sure that the change of the order in bdrv_append() is what
> > > caused the bug to appear, we can just undo that for 2.9, at the cost of
> > > a messed up graph in the error case when bdrv_set_backing_hd() fails
> > > (because we have no way to undo bdrv_replace_node()).
> > 
> > I don't know if that is enough to fix all of the issues, but the bug is
> > easy to reproduce.
> > 
> > The issue is the lack of understanding of what node movement does to
> > quiesce_counter.  The invariant is that children cannot have a lower
> > quiesce_counter than parents, I think (paths in the graph can only join
> > in the children direction, right?).
> 
> Maybe I'm missing something, but I think this isn't true at all. Drains
> are propagated to the parents, so that this specific node doesn't
> receive new requests, but not to the children. The assumption is that
> children don't do anything anyway without requests from their parents,
> so they are effectively quiesced even with quiesce_counter == 0.
> 
> So if anything, the invariant should be the exact opposite: Parents
> cannot have a lower quiesce_counter than their children.
> 
> I think the exact thing that the quiesce_counter of a node is expected
> to be is the number of paths from itself to an explicitly drained node
> in the directed block driver graph (counting one path if it is
> explicitly drained itself). A path counts multiple times if a node is
> explicitly drained multiple times.
> 
> > Is it checked, and are there violations already?  Maybe we need a
> > get_quiesce_counter method in BdrvChildRole, to cover BlockBackend's
> > quiesce_counter?  Then we can use that information to adjust the
> > quiesce_counter when nodes move in the graph.
> 
> We would need that if we had a downwards propagation and if a
> BlockBackend could be drained, but as it stands, I don't see what could
> be missing from bdrv_replace_child_noperm() (well, except that I think
> your patch is right to avoid calling drained_end/begin if both nodes
> were drained because new requests could sneak in this way in theory).

Actually, to get this part completely right, we also need to drain the
BlockBackend _before_ attaching the new BDS. Otherwise, if the old BDS
wasn't quiesced, but the new one is, the BdrvChildRole.drained_begin()
callback could send requests to the already drained new BDS.

Kevin

> > The block layer has good tests, but as the internal logic grows more
> > complex we should probably have more C level tests.  I'm constantly
> > impressed by the amount of tricky cases that test-replication.c catches
> > in the block job code.
> 
> Never really noticed test-replication specifically catching things when
> I worked on the op blockers code which changed a lot around block jobs,
> but that we should consider this type of tests more often is probably a
> good point.
> 
> Kevin

  reply	other threads:[~2017-04-06  9:04 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-31 16:03 [Qemu-devel] nbd: Possible regression in 2.9 RCs Ciprian Barbu
2017-03-31 17:32 ` Ciprian Barbu
2017-03-31 17:36   ` Ciprian Barbu
2017-03-31 17:43 ` Max Reitz
2017-03-31 17:49   ` Ciprian Barbu
2017-03-31 17:57     ` Max Reitz
2017-03-31 19:07       ` Alexandru Avadanii
2017-04-03 18:52     ` Kashyap Chamarthy
2017-04-04  8:07       ` ciprian.barbu
2017-04-03  8:15   ` Kevin Wolf
2017-04-03 12:39     ` Max Reitz
2017-04-03 13:00       ` Kevin Wolf
2017-04-03 13:50         ` Peter Krempa
2017-04-04 12:16           ` Kevin Wolf
2017-04-04 13:51             ` Eric Blake
2017-04-04 14:04             ` Paolo Bonzini
2017-04-04 14:53               ` Kevin Wolf
2017-04-04 15:09                 ` Paolo Bonzini
2017-04-05 11:01                   ` Kevin Wolf
2017-04-05 21:13                     ` Paolo Bonzini
2017-04-06  8:48                       ` Kevin Wolf
2017-04-06  9:03                         ` Kevin Wolf [this message]
2017-04-03 19:48         ` Eric Blake
2017-04-03 19:44       ` Eric Blake
2017-04-04  8:17         ` ciprian.barbu
2017-04-04 11:00           ` Paolo Bonzini
2017-04-04 11:14             ` Daniel P. Berrange
2017-04-03 12:51     ` Peter Krempa
2017-04-04 13:19       ` Kevin Wolf
2017-04-04 13:27         ` Peter Krempa
2017-04-04 13:54           ` Kevin Wolf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170406090347.GD4341@noname.redhat.com \
    --to=kwolf@redhat.com \
    --cc=Alexandru.Avadanii@enea.com \
    --cc=Ciprian.Barbu@enea.com \
    --cc=armband@enea.com \
    --cc=armbru@redhat.com \
    --cc=jcody@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=pkrempa@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).