All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff Cody <jcody@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	qemu-block@nongnu.org, mreitz@redhat.com, stefanha@redhat.com,
	famz@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-2.11 1/4] Revert "coroutine: abort if we try to schedule or enter a pending coroutine"
Date: Tue, 28 Nov 2017 12:33:10 -0500	[thread overview]
Message-ID: <20171128173310.GI25110@localhost.localdomain> (raw)
In-Reply-To: <20171128171931.GJ3703@localhost.localdomain>

On Tue, Nov 28, 2017 at 06:19:31PM +0100, Kevin Wolf wrote:
> Am 28.11.2017 um 18:01 hat Paolo Bonzini geschrieben:
> > On 28/11/2017 17:37, Kevin Wolf wrote:
> > >>
> > >> It can also conflict badly with another aio_co_schedule().  Your patch
> > >> here removes the assertion in this case, and patch 3 makes it easier to
> > >> get into the situation where two aio_co_schedule()s conflict with each
> > >> other.
> > > I don't see how they conflict. If the second aio_co_schedule() comes
> > > before the coroutine is actually entered, they are effectively simply
> > > merged into a single one. Which is exactly what was intended.
> > 
> > Suppose you have
> > 
> > 	ctx->scheduled_coroutines
> > 	  |
> > 	  '---> co
> >                 |
> >                 '---> NULL
> > 
> > Then aio_co_schedule(ctx, co) does QSLIST_INSERT_HEAD_ATOMIC, which is
> > 
> > 	/* On entry, ctx->scheduled_coroutines == co.  */
> > 	co->next = ctx->scheduled_coroutines
> > 	change ctx->scheduled_coroutines from co->next to co
> > 
> > This results in a loop:
> > 	
> > 	ctx->scheduled_coroutines
> > 	  |
> > 	  '---> co <-.
> >                 |    |
> >                 '----'
> > 
> > This is an obvious hang due to list corruption.  In other more
> > complicated cases it can skip a wakeup, which is a more subtle kind of
> > hang.  You can also get memory corruption if the coroutine terminates
> > (and is freed) before aio_co_schedule executes.
> 
> Okay, I can see that. I thought you were talking about the logic
> introduced by this series, but you're actually talking about preexisting
> behaviour which limits the usefulness of the patches.
> 
> > Basically, once you do aio_co_schedule or aio_co_wake the coroutine is
> > not any more yours.  It's owned by the context that will run it and you
> > should not do anything with it.
> > 
> > The same is true for co_aio_sleep_ns.  Just because in practice it works
> > (and it seemed like a clever idea at the time) it doesn't mean it *is* a
> > good idea...
> 
> Well, but that poses the question how you can implement any coroutine
> that can be reentered from more than one place. Not being able to do
> that would be a severe restriction.
> 
> For example, take quorum, which handles requests by spawning a coroutine
> for every child and then yielding until acb->count == s->num_children.
> This means that the main request coroutine can be reentered from the
> child request coroutines in any order and timing.
> 
> What saves us currently is that they are all in the same AioContext, so
> we won't actually end up in aio_co_schedule(), but I'm sure that sooner
> or later we'll find cases where we're waiting for several workers that
> are spread across different I/O threads.
> 
> I don't think "don't do that" is a good answer to this.
> 
> And yes, reentering co_aio_sleep_ns() early is a real requirement, too.
> The test case that used speed=1 and would have waited a few hours before
> actually cancelling the job is an extreme example, but even just
> delaying cancellation for a second is bad if this is a second of
> migration downtime.
> 

At least this last part should be doable; reentering co_aio_sleep_ns()
really should mean that we want to short-circuit the QEMU timer for the
sleep cb.  Any reason we can't allow a reenter by removing the timer for
that sleep, and then doing an aio_co_wake?

  reply	other threads:[~2017-11-28 17:33 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-28 15:43 [Qemu-devel] [PATCH for-2.11 0/4] Fix qemu-iotests failures Kevin Wolf
2017-11-28 15:43 ` [Qemu-devel] [PATCH for-2.11 1/4] Revert "coroutine: abort if we try to schedule or enter a pending coroutine" Kevin Wolf
2017-11-28 16:00   ` Jeff Cody
2017-11-28 16:18   ` Paolo Bonzini
2017-11-28 16:37     ` Kevin Wolf
2017-11-28 17:01       ` Paolo Bonzini
2017-11-28 17:19         ` Kevin Wolf
2017-11-28 17:33           ` Jeff Cody [this message]
2017-11-28 17:35           ` Paolo Bonzini
2017-11-28 15:43 ` [Qemu-devel] [PATCH for-2.11 2/4] Revert "blockjob: do not allow coroutine double entry or entry-after-completion" Kevin Wolf
2017-11-28 16:00   ` Jeff Cody
2017-11-28 15:43 ` [Qemu-devel] [PATCH for-2.11 3/4] coroutine: Cancel aio_co_schedule() on direct entry Kevin Wolf
2017-11-28 16:09   ` Jeff Cody
2017-11-28 16:14   ` Paolo Bonzini
2017-11-28 16:28     ` Kevin Wolf
2017-11-28 16:42       ` Jeff Cody
2017-11-28 16:51         ` Paolo Bonzini
2017-11-28 17:09           ` Jeff Cody
2017-11-28 17:14             ` Paolo Bonzini
2017-11-28 17:03         ` Kevin Wolf
2017-11-28 16:45       ` Paolo Bonzini
2017-11-28 16:30   ` Fam Zheng
2017-11-28 16:46   ` Eric Blake
2017-11-28 15:43 ` [Qemu-devel] [PATCH for-2.11 4/4] block: Expect graph changes in bdrv_parent_drained_begin/end Kevin Wolf
2017-11-28 16:10   ` Jeff Cody
2017-11-28 15:56 ` [Qemu-devel] [PATCH for-2.11 0/4] Fix qemu-iotests failures Jeff Cody
2017-11-28 16:00 ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171128173310.GI25110@localhost.localdomain \
    --to=jcody@redhat.com \
    --cc=famz@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.