All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kevin Wolf <kwolf@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-block@nongnu.org, mreitz@redhat.com,
	peter.maydell@linaro.org, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH] test-replication: Lock AioContext around blk_unref()
Date: Mon, 1 Oct 2018 18:03:48 +0200	[thread overview]
Message-ID: <20181001160348.GB4445@localhost.localdomain> (raw)
In-Reply-To: <9a0e30f2-90d0-11ba-b5db-3e2df18590cc@redhat.com>

Am 01.10.2018 um 17:40 hat Paolo Bonzini geschrieben:
> On 01/10/2018 16:32, Kevin Wolf wrote:
> > Recently, the test case has started failing because some job related
> > functions want to drop the AioContext lock even though it hasn't been
> > taken:
> > 
> >     (gdb) bt
> >     #0  0x00007f51c067c9fb in raise () from /lib64/libc.so.6
> >     #1  0x00007f51c067e77d in abort () from /lib64/libc.so.6
> >     #2  0x0000558c9d5dde7b in error_exit (err=<optimized out>, msg=msg@entry=0x558c9d6fe120 <__func__.18373> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
> >     #3  0x0000558c9d6b5263 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x558c9f3999a0, file=file@entry=0x558c9d6fd36f "util/async.c", line=line@entry=516) at util/qemu-thread-posix.c:96
> >     #4  0x0000558c9d6b0565 in aio_context_release (ctx=ctx@entry=0x558c9f399940) at util/async.c:516
> >     #5  0x0000558c9d5eb3da in job_completed_txn_abort (job=0x558c9f68e640) at job.c:738
> >     #6  0x0000558c9d5eb227 in job_finish_sync (job=0x558c9f68e640, finish=finish@entry=0x558c9d5eb8d0 <job_cancel_err>, errp=errp@entry=0x0) at job.c:986
> >     #7  0x0000558c9d5eb8ee in job_cancel_sync (job=<optimized out>) at job.c:941
> >     #8  0x0000558c9d64d853 in replication_close (bs=<optimized out>) at block/replication.c:148
> >     #9  0x0000558c9d5e5c9f in bdrv_close (bs=0x558c9f41b020) at block.c:3420
> >     #10 bdrv_delete (bs=0x558c9f41b020) at block.c:3629
> >     #11 bdrv_unref (bs=0x558c9f41b020) at block.c:4685
> >     #12 0x0000558c9d62a3f3 in blk_remove_bs (blk=blk@entry=0x558c9f42a7c0) at block/block-backend.c:783
> >     #13 0x0000558c9d62a667 in blk_delete (blk=0x558c9f42a7c0) at block/block-backend.c:402
> >     #14 blk_unref (blk=0x558c9f42a7c0) at block/block-backend.c:457
> >     #15 0x0000558c9d5dfcea in test_secondary_stop () at tests/test-replication.c:478
> >     #16 0x00007f51c1f13178 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
> >     #17 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
> >     #18 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
> >     #19 0x00007f51c1f13552 in g_test_run_suite () from /lib64/libglib-2.0.so.0
> >     #20 0x00007f51c1f13571 in g_test_run () from /lib64/libglib-2.0.so.0
> >     #21 0x0000558c9d5de31f in main (argc=<optimized out>, argv=<optimized out>) at tests/test-replication.c:581
> > 
> > It is yet unclear whether this should really be considered a bug in the
> > test case or whether blk_unref() should work for callers that haven't
> > taken the AioContext lock, but in order to fix the build tests quickly,
> > just take the AioContext lock around blk_unref().
> 
> Given the backtrace, I think bdrv_close should be taking the AioContext
> lock instead of blockdev_close_all_bdrv_states.

Conversely, that would mean that calling bdrv_unref() with the
AioContext lock held is a bug (because close callbacks can involve
AIO_WAIT_WHILE()). I'm not sure if that's very practical.

Of course, there will probably be a lot of callers to fix either way
after we define whether to hold the lock for bdrv_unref() or not. Either
you need to add locking to the places where it's missing or you need to
drop the locks in all other places.

I was leaning towards requiring the lock for bdrv_unref() (and
therefore blk_unref()).

Kevin

  reply	other threads:[~2018-10-01 16:04 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-01 14:32 [Qemu-devel] [PATCH] test-replication: Lock AioContext around blk_unref() Kevin Wolf
2018-10-01 15:40 ` Paolo Bonzini
2018-10-01 16:03   ` Kevin Wolf [this message]
2018-10-01 16:13     ` Paolo Bonzini

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181001160348.GB4445@localhost.localdomain \
    --to=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.