From: Jeff Cody <jcody@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, mreitz@redhat.com,
    stefanha@redhat.com, famz@redhat.com, kwolf@redhat.com
Subject: Re: [Qemu-devel] [PATCH 3/5] coroutines: abort if we try to enter a still-sleeping coroutine
Date: Mon, 20 Nov 2017 18:08:37 -0500
Message-ID: <20171120230837.GI5399@localhost.localdomain>
In-Reply-To: <82e388ce-cbbf-7e59-6433-6e109e426512@redhat.com>

On Mon, Nov 20, 2017 at 11:47:09PM +0100, Paolo Bonzini wrote:
> On 20/11/2017 23:35, Jeff Cody wrote:
> >> Is this a different "state" (in Stefan's parlance) than scheduled? In
> >> practice both means that someone may call qemu_(aio_)coroutine_enter
> >> concurrently, so you'd better not do it yourself.
> >>
> > It is slightly different; it is from sleeping with a timer via
> > co_aio_sleep_ns and waking via co_sleep_cb. Whereas the 'co->scheduled' is
> > specifically from being scheduled for a specific AioContext, via
> > aio_co_schedule().
>
> Right; however, that would only make a difference if we allowed
> canceling a co_aio_sleep_ns. Since we don't want that, they have the
> same transitions.
>
> > In practice, 'co->schedule' and 'co->sleeping' certainly rhyme, at the very
> > least.
> >
> > But having them separate will put the abort closer to where the problem lies,
> > so it should make debugging a bit easier if we hit it.
>
> What do you mean by closer? It would print a slightly more informative
> message, but the message is in qemu_aio_coroutine_enter in both cases.
>

Sorry, sloppy wording; I meant what you said above: the error message is
more informative, so by tracking down where co->sleeping is set, the
developer gets closer to where the problem lies.

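To make the "more informative message" point concrete, here is a minimal standalone sketch of an enter-side check that reports which state forbids entry. It is not QEMU code: EnterMockCo, EnterBlocker, enter_mock_check() and enter_mock_reason() are hypothetical names for illustration, and the struct models only the two flags under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result type: why a coroutine must not be entered. */
typedef enum EnterBlocker {
    ENTER_OK = 0,
    ENTER_BLOCKED_SCHEDULED,
    ENTER_BLOCKED_SLEEPING,
} EnterBlocker;

/* Hypothetical stand-in for a coroutine; only the two flags are modelled. */
typedef struct EnterMockCo {
    bool scheduled; /* queued for an AioContext */
    bool sleeping;  /* waiting on a sleep timer */
} EnterMockCo;

/* Report the state, if any, that makes entering this coroutine invalid. */
static EnterBlocker enter_mock_check(const EnterMockCo *co)
{
    assert(co != NULL);
    if (co->scheduled) {
        return ENTER_BLOCKED_SCHEDULED;
    }
    if (co->sleeping) {
        return ENTER_BLOCKED_SLEEPING;
    }
    return ENTER_OK;
}

/* Map the result to a message; returns NULL when entry is allowed. */
static const char *enter_mock_reason(EnterBlocker b)
{
    switch (b) {
    case ENTER_BLOCKED_SCHEDULED:
        return "coroutine is scheduled";
    case ENTER_BLOCKED_SLEEPING:
        return "coroutine is sleeping";
    default:
        return NULL;
    }
}
```

With separate flags the check happens in the same place either way; the only gain is that the message names the flag the developer should go looking for.
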
> In fact, unifying co->scheduled and co->sleeping means that you can
> easily abort when co_aio_sleep_ns is called on a scheduled coroutine, like
>
>     /* This is valid. */
>     aio_co_schedule(qemu_get_current_aio_context(),
>                     qemu_coroutine_self());
>
>     /* But only if there was a qemu_coroutine_yield here. */
>     co_aio_sleep_ns(qemu_get_current_aio_context(), 1000);
>

That is true. But we could also check (co->sleeping || co->scheduled) in
co_aio_sleep_ns() as well.

Hmm... not checking co->sleeping in co_aio_sleep_ns() is a bug in my
patch. We don't want to schedule a coroutine on two different timers,
either.

So what do you think about adding this to the patch:

@@ -34,6 +36,7 @@ void coroutine_fn co_aio_sleep_ns(AioContext *ctx, QEMUClockType type,
     CoSleepCB sleep_cb = {
         .co = qemu_coroutine_self(),
     };
+    if (sleep_cb.co->sleeping == 1 || sleep_cb.co->scheduled == 1) {
+        fprintf(stderr, "Cannot sleep a co-routine that is already sleeping "
+                        "or scheduled\n");
+        abort();
+    }
+    sleep_cb.co->sleeping = 1;
     sleep_cb.ts = aio_timer_new(ctx, type, SCALE_NS, co_sleep_cb, &sleep_cb);
     timer_mod(sleep_cb.ts, qemu_clock_get_ns(type) + ns);
     qemu_coroutine_yield();
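
For anyone who wants to play with the state transitions outside of QEMU, here is a rough standalone model of the guard above. It is only a sketch under assumptions: SleepMockCo, sleep_mock_begin() and sleep_mock_wake() are hypothetical names, the function returns false where the hunk would abort(), and clearing the flag on wake-up is my assumption about what the timer callback side would do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a coroutine; only the two flags are modelled. */
typedef struct SleepMockCo {
    bool scheduled; /* queued for an AioContext */
    bool sleeping;  /* already armed on a sleep timer */
} SleepMockCo;

/*
 * Try to put the coroutine to sleep.
 * Returns true and sets 'sleeping' on success; returns false, leaving the
 * flags untouched, if it is already sleeping or scheduled. The proposed
 * hunk would abort() in the failing case instead.
 */
static bool sleep_mock_begin(SleepMockCo *co)
{
    assert(co != NULL);
    if (co->sleeping || co->scheduled) {
        return false;
    }
    co->sleeping = true;
    return true;
}

/* Assumed wake-up path: the timer fired, so the coroutine may sleep again. */
static void sleep_mock_wake(SleepMockCo *co)
{
    assert(co != NULL);
    co->sleeping = false;
}
```

A second sleep_mock_begin() before sleep_mock_wake() fails, which is the "two different timers" case; a coroutine with 'scheduled' set fails on the first call.
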
Jeff