From: Jeff Cody <jcody@redhat.com>
To: Stefan Hajnoczi <stefanha@gmail.com>
Cc: qemu-devel@nongnu.org, kwolf@redhat.com, famz@redhat.com,
	qemu-block@nongnu.org, mreitz@redhat.com, stefanha@redhat.com,
	pbonzini@redhat.com, jsnow@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] [PATCH 1/5] blockjob: do not allow coroutine double entry or entry-after-completion
Date: Mon, 20 Nov 2017 08:36:19 -0500
Message-ID: <20171120133619.GA32161@localhost.localdomain>
In-Reply-To: <20171120111653.GB4516@stefanha-x1.localdomain>

On Mon, Nov 20, 2017 at 11:16:53AM +0000, Stefan Hajnoczi wrote:
> On Sun, Nov 19, 2017 at 09:46:42PM -0500, Jeff Cody wrote:
> > --- a/blockjob.c
> > +++ b/blockjob.c
> > @@ -291,10 +291,10 @@ void block_job_start(BlockJob *job)
> >  {
> >      assert(job && !block_job_started(job) && job->paused &&
> >             job->driver && job->driver->start);
> > -    job->co = qemu_coroutine_create(block_job_co_entry, job);
> >      job->pause_count--;
> >      job->busy = true;
> >      job->paused = false;
> > +    job->co = qemu_coroutine_create(block_job_co_entry, job);
> >      bdrv_coroutine_enter(blk_bs(job->blk), job->co);
> >  }
> >  
> 
> This hunk makes no difference.  The coroutine is only entered by
> bdrv_coroutine_enter() so the order of job field initialization doesn't
> matter.
> 

It likely makes no difference with the current code (unless there is a
latent bug). However, I made the change to protect against the following
scenario, which, perhaps to your point, would be a bug in any case:

1. job->co = qemu_coroutine_create()

    * Now block_job_started() returns true, as it just checks for job->co

2. Another thread calls block_job_enter(), before we call
   bdrv_coroutine_enter().

    * block_job_enter() checks job->busy and block_job_started() to
      determine if coroutine entry is allowed.  Without this change, these
      checks could pass and coroutine entry could occur.

    * I don't think this can happen in the current code, but the above hunk
      change is still correct, and would protect against such an
      occurrence.
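
To make the window concrete, here is a rough sketch in plain C.  It is a
simplified model, not the QEMU code: MockJob, mock_entry_allowed() and the
two window_open_*() helpers are hypothetical names, and the check only
mirrors the two conditions described above (job->co set, job->busy clear).
The "racing caller" is simulated by evaluating the check at the point
where another thread could run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for BlockJob; only the fields discussed here. */
typedef struct MockJob {
    void *co;         /* non-NULL once the coroutine has been created */
    bool busy;
    bool paused;
    int pause_count;
} MockJob;

/* Models the check described above: entry is allowed only if the job
 * has started (co is set) and is not marked busy. */
static bool mock_entry_allowed(const MockJob *job)
{
    return job->co != NULL && !job->busy;
}

/* Old ordering: create the coroutine first, then set the fields.
 * Returns true if a racing caller could pass the check in between. */
bool window_open_create_first(void)
{
    static int dummy_co;  /* placeholder for a real coroutine object */
    MockJob job = { .co = NULL, .busy = false,
                    .paused = true, .pause_count = 1 };

    job.co = &dummy_co;
    bool racy = mock_entry_allowed(&job);  /* racing caller looks here */
    job.pause_count--;
    job.busy = true;
    job.paused = false;
    return racy;
}

/* New ordering: set the fields first, create the coroutine last.
 * The check is sampled both before and after creation. */
bool window_open_create_last(void)
{
    static int dummy_co;
    MockJob job = { .co = NULL, .busy = false,
                    .paused = true, .pause_count = 1 };

    job.pause_count--;
    job.busy = true;
    job.paused = false;
    bool before = mock_entry_allowed(&job);  /* co is still NULL */
    job.co = &dummy_co;
    bool after = mock_entry_allowed(&job);   /* busy is already true */
    return before || after;
}
```

In this model the check can pass only with the create-first ordering; with
the coroutine created last, a caller sees either "not started" or "busy" at
every point.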


I guess the question is whether it is worth doing, to try to prevent that
sort of buggy behavior. My thought was "yes" because:

    A) there is no penalty in doing it this way

    B) while a bug, double entry like this can lead to memory and/or
    data corruption, and the checks for co->caller et al. might not
    catch it.  This is particularly true if the coroutine exits
    (COROUTINE_TERMINATE) before the re-entry.

But maybe if we are concerned about that we should figure out a way to
abort() instead.  Of course, that makes allowing recursive coroutines more
difficult in the future.


> > @@ -797,11 +797,14 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
> >          return;
> >      }
> >  
> > -    job->busy = false;
> > +    /* We need to leave job->busy set here, because when we have
> > +     * put a coroutine to 'sleep', we have scheduled it to run in
> > +     * the future.  We cannot enter that same coroutine again before
> > +     * it wakes and runs, otherwise we risk double-entry or entry after
> > +     * completion. */
> >      if (!block_job_should_pause(job)) {
> >          co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
> >      }
> > -    job->busy = true;
> >  
> >      block_job_pause_point(job);
> 
> This leaves a stale doc comment in include/block/blockjob_int.h:
> 
>   /**
>    * block_job_sleep_ns:
>    * @job: The job that calls the function.
>    * @clock: The clock to sleep on.
>    * @ns: How many nanoseconds to stop for.
>    *
>    * Put the job to sleep (assuming that it wasn't canceled) for @ns
>    * nanoseconds.  Canceling the job will interrupt the wait immediately.
>                    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>    */

I didn't catch the doc comment; that should be changed as well.

>   void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns);
> 
> This raises questions about the ability to cancel sleep:
> 
> 1. Does something depend on cancelling sleep?
> 

Not that I can tell.  The advantage is that you don't have to wait for the
timer, so something like qmp_block_job_cancel() will cancel sooner.

But trying to do that is obviously broken with the current coroutine
implementation.

> 2. Did cancellation work properly in commit
>    4513eafe928ff47486f4167c28d364c72b5ff7e3 ("block: add
>    block_job_sleep_ns") and was it broken afterwards?
> 

With iothreads, the answer is complicated.  It was broken for a while for
other reasons.

It broke after using aio_co_wake() in the sleep timer cb (commit
2f47da5f7f), which added the ability to schedule a coroutine if the timer
callback was called from the wrong AioContext.

Prior to that it "worked" in that the segfault was not present.

But even bisecting back to 2f47da5f7f was not straightforward, because
attempting the stream/cancel with iothreads would not even work until
c324fd0 (so I only bisected back as far as c324fd0 would cleanly apply).

And it is tricky to say if it "works" or not, because it is racy.  What may
have appeared to work may be more attributable to luck and timing.

If the coroutine is going to run at a future time, we cannot enter it
beforehand.  We risk the coroutine not even existing when the timer does run
the sleeping coroutine.  At the very least, early entry with the current
code would require a way to delete the associated timer.
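
As a rough illustration of that hazard, here is a small model in plain C.
It is only a sketch under stated assumptions, not the QEMU implementation:
MockSleepJob and the mock_*() helpers are hypothetical, the "timer" is a
flag fired by hand, and the job is assumed to finish the next time it runs.
It compares clearing the busy flag while sleeping (the old behaviour in the
hunk above) with leaving it set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a sleeping job; not a QEMU structure. */
typedef struct MockSleepJob {
    bool busy;           /* guard checked before an external entry */
    bool timer_pending;  /* a wakeup has been scheduled */
    bool terminated;     /* coroutine finished; must not run again */
    int entries;         /* valid entries */
    int bad_entries;     /* entries attempted after termination */
} MockSleepJob;

/* Enter the modelled coroutine.  Assumption: it runs to completion on
 * its next entry, so any later entry is an entry-after-completion. */
static void mock_enter(MockSleepJob *job)
{
    if (job->terminated) {
        job->bad_entries++;
        return;
    }
    job->entries++;
    job->terminated = true;
}

/* External request (e.g. a cancel) that honours the busy guard. */
static void mock_request_enter(MockSleepJob *job)
{
    if (!job->busy) {
        mock_enter(job);
    }
}

/* Timer callback: enters whatever it was armed for, unconditionally. */
static void mock_timer_fire(MockSleepJob *job)
{
    if (job->timer_pending) {
        job->timer_pending = false;
        mock_enter(job);
    }
}

/* Returns the number of entries that hit an already-finished job. */
int simulate_sleep(bool clear_busy_while_sleeping)
{
    MockSleepJob job = { .busy = true };

    job.timer_pending = true;          /* job goes to sleep */
    if (clear_busy_while_sleeping) {
        job.busy = false;              /* old behaviour */
    }
    mock_request_enter(&job);          /* early entry while asleep */
    mock_timer_fire(&job);             /* timer still fires later */
    return job.bad_entries;
}
```

With the flag cleared, the early entry finishes the job and the timer then
enters a coroutine that no longer exists in any useful sense; with the flag
left set, only the timer enters it.  Supporting early entry safely would
additionally need the pending timer to be deleted, which this model does
not attempt.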

> It is possible to fix the recursive coroutine entry without losing sleep
> cancellation.  Whether it's worth the trouble depends on the answers to
> the above questions.
> 

I contemplated the same thing.

At least for 2.11, fixing recursive coroutine entry is probably more than we
want to do.

Long term, my opinion is that we should fix it, because preventing it
becomes more difficult. It is easy to miss something that might cause a
recursive entry in code reviews, and since it can be racy, casual testing
may often miss it as well.

Jeff
