From: Fam Zheng <famz@redhat.com>
To: Anton Nefedov <anton.nefedov@virtuozzo.com>
Cc: Alberto Garcia <berto@igalia.com>,
qemu-devel@nongnu.org, kwolf@redhat.com, qemu-block@nongnu.org,
mreitz@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)
Date: Thu, 9 Nov 2017 14:05:26 +0800 [thread overview]
Message-ID: <20171109060526.GA30991@lemon> (raw)
In-Reply-To: <84b1b069-642e-2b33-7723-70edce3aea43@virtuozzo.com>
On Wed, 11/08 18:50, Anton Nefedov wrote:
>
>
> On 8/11/2017 5:45 PM, Alberto Garcia wrote:
> > On Tue 07 Nov 2017 05:19:41 PM CET, Anton Nefedov wrote:
> > > BlockBackend gets deleted by another job's stream_complete(), deferred
> > > to the main loop, so the fact that the job is put to sleep by
> > > bdrv_drain_all_begin() doesn't really stop it from execution.
> >
> > I was debugging this a bit, and the block_job_defer_to_main_loop() call
> > happens _after_ all jobs have been paused, so I think that when the BDS
> > is drained then stream_run() finishes the last iteration without
> > checking if it's paused.
> >
> > Without your patch (i.e. with a smaller STREAM_BUFFER_SIZE) then I
> > assume that the function would have to continue looping and
> > block_job_sleep_ns() would make the job coroutine yield, effectively
> > pausing the job and preventing the crash.
> >
> > I can fix the crash by adding block_job_pause_point(&s->common) at the
> > end of stream_run() (where the 'out' label is).
> >
> > I'm thinking that perhaps we should add the pause point directly to
> > block_job_defer_to_main_loop(), to prevent any block job from running
> > the exit function when it's paused.
> >
>
> Is it possible that the exit function is already deferred when the jobs
> are being paused? (even though it's at least less likely to happen)
>
> Then should we flush the bottom halves somehow in addition to putting
> the jobs to sleep? And also then it all probably has to happen before
> bdrv_reopen_queue()
Or we can stash away the BH temporarily during pause period:
---
diff --git a/blockjob.c b/blockjob.c
index 3a0c49137e..7058ff7ae1 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -127,6 +127,9 @@ void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
static void block_job_pause(BlockJob *job)
{
job->pause_count++;
+ if (job->defer_to_main_loop_bh) {
+ qemu_bh_cancel(job->defer_to_main_loop_bh);
+ }
}
static void block_job_resume(BlockJob *job)
@@ -136,6 +139,9 @@ static void block_job_resume(BlockJob *job)
if (job->pause_count) {
return;
}
+ if (job->defer_to_main_loop_bh) {
+ qemu_bh_schedule(job->defer_to_main_loop_bh);
+ }
block_job_enter(job);
}
@@ -900,6 +906,9 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
/* Prevent race with block_job_defer_to_main_loop() */
aio_context_acquire(data->aio_context);
+ qemu_bh_delete(data->job->defer_to_main_loop_bh);
+ data->job->defer_to_main_loop_bh = NULL;
+
/* Fetch BDS AioContext again, in case it has changed */
aio_context = blk_get_aio_context(data->job->blk);
if (aio_context != data->aio_context) {
@@ -927,7 +936,9 @@ void block_job_defer_to_main_loop(BlockJob *job,
data->fn = fn;
data->opaque = opaque;
job->deferred_to_main_loop = true;
+ assert(!job->defer_to_main_loop_bh);
+ job->defer_to_main_loop_bh = aio_bh_new(qemu_get_aio_context(),
+ block_job_defer_to_main_loop_bh, data);
- aio_bh_schedule_oneshot(qemu_get_aio_context(),
- block_job_defer_to_main_loop_bh, data);
+ qemu_bh_schedule(job->defer_to_main_loop_bh);
}
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 67c0968fa5..28d29b3be6 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -97,6 +97,11 @@ typedef struct BlockJob {
*/
bool deferred_to_main_loop;
+ /**
+ * The main loop BH for "defer to main loop" operation.
+ */
+ QEMUBH *defer_to_main_loop_bh;
+
/** Element of the list of block jobs */
QLIST_ENTRY(BlockJob) job_list;
next prev parent reply other threads:[~2017-11-09 6:05 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-07 16:19 [Qemu-devel] segfault in parallel blockjobs (iotest 30) Anton Nefedov
2017-11-08 13:58 ` [Qemu-devel] [Qemu-block] " Alberto Garcia
2017-11-08 14:45 ` Alberto Garcia
2017-11-08 15:50 ` Anton Nefedov
2017-11-09 6:05 ` Fam Zheng [this message]
2017-11-09 10:03 ` Alberto Garcia
2017-11-09 16:26 ` Alberto Garcia
2017-11-10 3:02 ` Fam Zheng
2017-11-15 15:42 ` Alberto Garcia
2017-11-15 16:31 ` Anton Nefedov
2017-11-16 15:54 ` Alberto Garcia
2017-11-16 16:09 ` Anton Nefedov
2017-11-21 12:51 ` Alberto Garcia
2017-11-21 15:18 ` Anton Nefedov
2017-11-21 15:31 ` Alberto Garcia
2017-11-22 0:13 ` John Snow
2017-11-22 12:55 ` Alberto Garcia
2017-11-22 15:58 ` John Snow
2017-11-16 21:56 ` John Snow
2017-11-17 16:16 ` Alberto Garcia
2017-11-22 15:05 ` Alberto Garcia
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171109060526.GA30991@lemon \
--to=famz@redhat.com \
--cc=anton.nefedov@virtuozzo.com \
--cc=berto@igalia.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.