qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Fam Zheng <famz@redhat.com>
To: Anton Nefedov <anton.nefedov@virtuozzo.com>
Cc: Alberto Garcia <berto@igalia.com>,
	qemu-devel@nongnu.org, kwolf@redhat.com, qemu-block@nongnu.org,
	mreitz@redhat.com
Subject: Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)
Date: Thu, 9 Nov 2017 14:05:26 +0800	[thread overview]
Message-ID: <20171109060526.GA30991@lemon> (raw)
In-Reply-To: <84b1b069-642e-2b33-7723-70edce3aea43@virtuozzo.com>

On Wed, 11/08 18:50, Anton Nefedov wrote:
> 
> 
> On 8/11/2017 5:45 PM, Alberto Garcia wrote:
> > On Tue 07 Nov 2017 05:19:41 PM CET, Anton Nefedov wrote:
> > > BlockBackend gets deleted by another job's stream_complete(), deferred
> > > to the main loop, so the fact that the job is put to sleep by
> > > bdrv_drain_all_begin() doesn't really stop it from execution.
> > 
> > I was debugging this a bit, and the block_job_defer_to_main_loop() call
> > happens _after_ all jobs have been paused, so I think that when the BDS
> > is drained then stream_run() finishes the last iteration without
> > checking if it's paused.
> > 
> > Without your patch (i.e. with a smaller STREAM_BUFFER_SIZE) then I
> > assume that the function would have to continue looping and
> > block_job_sleep_ns() would make the job coroutine yield, effectively
> > pausing the job and preventing the crash.
> > 
> > I can fix the crash by adding block_job_pause_point(&s->common) at the
> > end of stream_run() (where the 'out' label is).
> > 
> > I'm thinking that perhaps we should add the pause point directly to
> > block_job_defer_to_main_loop(), to prevent any block job from running
> > the exit function when it's paused.
> > 
> 
> Is it possible that the exit function is already deferred when the jobs
> are being paused? (even though it's at least less likely to happen)
> 
> Then should we flush the bottom halves somehow in addition to putting
> the jobs to sleep? And also then it all probably has to happen before
> bdrv_reopen_queue()

Or we can stash away the BH temporarily during pause period:

---

diff --git a/blockjob.c b/blockjob.c
index 3a0c49137e..7058ff7ae1 100644
--- a/blockjob.c
+++ b/blockjob.c
@@ -127,6 +127,9 @@ void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
 static void block_job_pause(BlockJob *job)
 {
     job->pause_count++;
+    if (job->defer_to_main_loop_bh) {
+        qemu_bh_cancel(job->defer_to_main_loop_bh);
+    }
 }
 
 static void block_job_resume(BlockJob *job)
@@ -136,6 +139,9 @@ static void block_job_resume(BlockJob *job)
     if (job->pause_count) {
         return;
     }
+    if (job->defer_to_main_loop_bh) {
+        qemu_bh_schedule(job->defer_to_main_loop_bh);
+    }
     block_job_enter(job);
 }
 
@@ -900,6 +906,9 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
     /* Prevent race with block_job_defer_to_main_loop() */
     aio_context_acquire(data->aio_context);
 
+    qemu_bh_delete(data->job->defer_to_main_loop_bh);
+    data->job->defer_to_main_loop_bh = NULL;
+
     /* Fetch BDS AioContext again, in case it has changed */
     aio_context = blk_get_aio_context(data->job->blk);
     if (aio_context != data->aio_context) {
@@ -927,7 +936,9 @@ void block_job_defer_to_main_loop(BlockJob *job,
     data->fn = fn;
     data->opaque = opaque;
     job->deferred_to_main_loop = true;
+    assert(!job->defer_to_main_loop_bh);
+    job->defer_to_main_loop_bh = aio_bh_new(qemu_get_aio_context(),
+                                 block_job_defer_to_main_loop_bh, data);
 
-    aio_bh_schedule_oneshot(qemu_get_aio_context(),
-                            block_job_defer_to_main_loop_bh, data);
+    qemu_bh_schedule(job->defer_to_main_loop_bh);
 }
diff --git a/include/block/blockjob.h b/include/block/blockjob.h
index 67c0968fa5..28d29b3be6 100644
--- a/include/block/blockjob.h
+++ b/include/block/blockjob.h
@@ -97,6 +97,11 @@ typedef struct BlockJob {
      */
     bool deferred_to_main_loop;
 
+    /**
+     * The main loop BH for "defer to main loop" operation.
+     */
+    QEMUBH *defer_to_main_loop_bh;
+
     /** Element of the list of block jobs */
     QLIST_ENTRY(BlockJob) job_list;
 

  reply	other threads:[~2017-11-09  6:05 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-07 16:19 [Qemu-devel] segfault in parallel blockjobs (iotest 30) Anton Nefedov
2017-11-08 13:58 ` [Qemu-devel] [Qemu-block] " Alberto Garcia
2017-11-08 14:45 ` Alberto Garcia
2017-11-08 15:50   ` Anton Nefedov
2017-11-09  6:05     ` Fam Zheng [this message]
2017-11-09 10:03       ` Alberto Garcia
2017-11-09 16:26   ` Alberto Garcia
2017-11-10  3:02     ` Fam Zheng
2017-11-15 15:42       ` Alberto Garcia
2017-11-15 16:31         ` Anton Nefedov
2017-11-16 15:54           ` Alberto Garcia
2017-11-16 16:09             ` Anton Nefedov
2017-11-21 12:51               ` Alberto Garcia
2017-11-21 15:18                 ` Anton Nefedov
2017-11-21 15:31                   ` Alberto Garcia
2017-11-22  0:13                     ` John Snow
2017-11-22 12:55                   ` Alberto Garcia
2017-11-22 15:58                     ` John Snow
2017-11-16 21:56             ` John Snow
2017-11-17 16:16               ` Alberto Garcia
2017-11-22 15:05 ` Alberto Garcia

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171109060526.GA30991@lemon \
    --to=famz@redhat.com \
    --cc=anton.nefedov@virtuozzo.com \
    --cc=berto@igalia.com \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).