From: Sergio Lopez <slp@redhat.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: Sergio Lopez <slp@redhat.com>,
qemu-block@nongnu.org, qemu-devel@nongnu.org,
stefanha@redhat.com, mreitz@redhat.com
Subject: Re: [Qemu-devel] [PATCH] virtio-blk: dataplane: release AioContext before blk_set_aio_context
Date: Wed, 06 Mar 2019 13:47:19 +0100
Message-ID: <87imwwdvag.fsf@redhat.com>
In-Reply-To: <20190301155002.GE5861@localhost.localdomain>
Kevin Wolf writes:
> Am 01.03.2019 um 14:47 hat Sergio Lopez geschrieben:
>> >> Otherwise, we can simply add an extra condition at
>> >> child_job_drained_poll(), before the drv->drained_poll(), to return
>> >> true if the job isn't yet paused.
>> >
>> > Yes, I think something like this is the right fix.
>>
>> Fixing this has uncovered another issue also triggered by issuing
>> 'block_commit' and 'device_del' consecutively. At the end, mirror_run()
>> calls bdrv_drained_begin(), which is scheduled for later (via
>> bdrv_co_yield_to_drain()) as the mirror job is running in a coroutine.
>>
>> At the same time, the guest requests the device to be unplugged, which
>> leads to blk_unref()->blk_drain()->bdrv_do_drained_begin(). When the
>> latter reaches BDRV_POLL_WHILE, the bdrv_drained_begin scheduled above
>> is run, which also runs BDRV_POLL_WHILE, leading to the thread getting
>> stuck in aio_poll().
>>
>> Is it really safe scheduling a bdrv_drained_begin() with poll == true?
>
> I don't see what the problem would be with it in theory. Once the node
> becomes idle, both the inner and the outer BDRV_POLL_WHILE() should
> return.
>
> The question with such hangs is usually, what is the condition that made
> bdrv_drain_poll() return true, and why aren't we making progress so that
> it would become false. With iothreads, it could also be that the
> condition has actually already changed, but aio_wait_kick() wasn't
> called, so aio_poll() isn't woken up.
Turns out we can't restrict child_job_drained_poll() to signal
completion only if the job has already been effectively paused or
cancelled, as we may reach this point from job_finish_sync().
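
To make the constraint concrete, here is a toy model in C. It is not QEMU
code: ToyJob, toy_drained_poll() and toy_poll_until_settled() are
hypothetical names, and the scenario is only one possible reading of the
job_finish_sync() path, namely that the job is never paused while the
poll callback is being evaluated. Under that assumption, a "must be
paused" check keeps the poll condition true forever.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified job state; this is not QEMU's Job struct. */
typedef struct ToyJob {
    bool paused;     /* job has actually reached a pause point */
    bool completed;  /* job has finished */
    int in_flight;   /* requests the job still has pending */
} ToyJob;

/*
 * Returns true while a drain would have to keep polling.
 * require_paused models the extra condition discussed in this thread.
 */
static bool toy_drained_poll(const ToyJob *job, bool require_paused)
{
    if (job->completed) {
        return false;
    }
    if (require_paused && !job->paused) {
        return true;  /* keep polling until the job is really paused */
    }
    return job->in_flight > 0;
}

/*
 * Evaluates the poll condition up to max_iters times. The job state is
 * never changed here, which stands in for "nobody pauses the job".
 * Returns the iteration at which the condition became false, or -1 if
 * it never did (the toy equivalent of a hang).
 */
static int toy_poll_until_settled(const ToyJob *job, bool require_paused,
                                  int max_iters)
{
    assert(max_iters > 0);
    for (int i = 0; i < max_iters; i++) {
        if (!toy_drained_poll(job, require_paused)) {
            return i;
        }
    }
    return -1;
}
```

With an idle job that is neither paused nor completed (in_flight == 0),
toy_poll_until_settled(&job, false, 1000) settles immediately, while
toy_poll_until_settled(&job, true, 1000) never settles and returns -1.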
Do you think it's worth continuing to try to make bdrv_drained_begin()
return only when the related jobs are completely paused, or should we
just use AIO_WAIT_WHILE in block_job_detach_aio_context() as previously
suggested?
Thanks,
Sergio (slp).