From: Max Reitz <mreitz@redhat.com>
To: Fam Zheng <famz@redhat.com>, Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>,
qemu-devel@nongnu.org, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-2.11] block: Keep strong reference when draining all BDS
Date: Fri, 10 Nov 2017 16:23:17 +0100
Message-ID: <d0e58edd-39cf-2bb5-61bd-442825b82ff9@redhat.com>
In-Reply-To: <20171110133207.GA9235@lemon.Home>
On 2017-11-10 14:32, Fam Zheng wrote:
> On Fri, 11/10 14:17, Kevin Wolf wrote:
>> On 10.11.2017 at 03:45, Fam Zheng wrote:
>>> On Thu, 11/09 21:43, Max Reitz wrote:
>>>> Draining a BDS may lead to graph modifications, which in turn may result
>>>> in it and other BDS being stripped of their current references. If
>>>> bdrv_drain_all_begin() and bdrv_drain_all_end() do not keep strong
>>>> references themselves, the BDS they are trying to drain (or undrain) may
>>>> disappear right under their feet -- or, more specifically, under the
>>>> feet of BDRV_POLL_WHILE() in bdrv_drain_recurse().
>>>>
>>>> This fixes an occasional hang of iotest 194.
>>>>
>>>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>>>> ---
>>>> block/io.c | 47 ++++++++++++++++++++++++++++++++++++++++++++---
>>>> 1 file changed, 44 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/block/io.c b/block/io.c
>>>> index 3d5ef2cabe..a0a2833e8e 100644
>>>> --- a/block/io.c
>>>> +++ b/block/io.c
>>>> @@ -340,7 +340,10 @@ void bdrv_drain_all_begin(void)
>>>>      bool waited = true;
>>>>      BlockDriverState *bs;
>>>>      BdrvNextIterator it;
>>>> -    GSList *aio_ctxs = NULL, *ctx;
>>>> +    GSList *aio_ctxs = NULL, *ctx, *bs_list = NULL, *bs_list_entry;
>>>> +
>>>> +    /* Must be called from the main loop */
>>>> +    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>>>>
>>>>      block_job_pause_all();
>>>>
>>>> @@ -355,6 +358,12 @@ void bdrv_drain_all_begin(void)
>>>>          if (!g_slist_find(aio_ctxs, aio_context)) {
>>>>              aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
>>>>          }
>>>> +
>>>> +        /* Keep a strong reference to all root BDS and copy them into
>>>> +         * an own list because draining them may lead to graph
>>>> +         * modifications. */
>>>> +        bdrv_ref(bs);
>>>> +        bs_list = g_slist_prepend(bs_list, bs);
>>>>      }
>>>>
>>>>      /* Note that completion of an asynchronous I/O operation can trigger any
>>>> @@ -370,7 +379,11 @@ void bdrv_drain_all_begin(void)
>>>>              AioContext *aio_context = ctx->data;
>>>>
>>>>              aio_context_acquire(aio_context);
>>>> -            for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
>>>> +            for (bs_list_entry = bs_list; bs_list_entry;
>>>> +                 bs_list_entry = bs_list_entry->next)
>>>> +            {
>>>> +                bs = bs_list_entry->data;
>>>> +
>>>>                  if (aio_context == bdrv_get_aio_context(bs)) {
>>>>                      waited |= bdrv_drain_recurse(bs, true);
>>>>                  }
>>>> @@ -379,24 +392,52 @@ void bdrv_drain_all_begin(void)
>>>>          }
>>>>      }
>>>>
>>>> +    for (bs_list_entry = bs_list; bs_list_entry;
>>>> +         bs_list_entry = bs_list_entry->next)
>>>> +    {
>>>> +        bdrv_unref(bs_list_entry->data);
>>>> +    }
>>>> +
>>>>      g_slist_free(aio_ctxs);
>>>> +    g_slist_free(bs_list);
>>>>  }
>>>>
>>>>  void bdrv_drain_all_end(void)
>>>>  {
>>>>      BlockDriverState *bs;
>>>>      BdrvNextIterator it;
>>>> +    GSList *bs_list = NULL, *bs_list_entry;
>>>> +
>>>> +    /* Must be called from the main loop */
>>>> +    assert(qemu_get_current_aio_context() == qemu_get_aio_context());
>>>>
>>>> +    /* Keep a strong reference to all root BDS and copy them into an
>>>> +     * own list because draining them may lead to graph modifications.
>>>> +     */
>>>>      for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
>>>> -        AioContext *aio_context = bdrv_get_aio_context(bs);
>>>> +        bdrv_ref(bs);
>>>> +        bs_list = g_slist_prepend(bs_list, bs);
>>>> +    }
>>>> +
>>>> +    for (bs_list_entry = bs_list; bs_list_entry;
>>>> +         bs_list_entry = bs_list_entry->next)
>>>> +    {
>>>> +        AioContext *aio_context;
>>>> +
>>>> +        bs = bs_list_entry->data;
>>>> +        aio_context = bdrv_get_aio_context(bs);
>>>>
>>>>          aio_context_acquire(aio_context);
>>>>          aio_enable_external(aio_context);
>>>>          bdrv_parent_drained_end(bs);
>>>>          bdrv_drain_recurse(bs, false);
>>>>          aio_context_release(aio_context);
>>>> +
>>>> +        bdrv_unref(bs);
>>>>      }
>>>>
>>>> +    g_slist_free(bs_list);
>>>> +
>>>>      block_job_resume_all();
>>>>  }
>>>
>>> Would it be better to put the references into BdrvNextIterator and introduce
>>> bdrv_next_iterator_destroy() to free them? You'll need to touch all callers
>>> because it is not C++, but it secures all the rest, which seem vulnerable to
>>> the same pattern, for example the aio_poll() in iothread_stop_all().
>>
>> You could automatically free the references when bdrv_next() returns
>> NULL. Then you need an explicit bdrv_next_iterator_destroy() only for
>> callers that stop iterating halfway through the list.
>
> Yes, good idea.

But bdrv_unref() is safe only in the main loop. Without having checked,
I'm not sure whether all callers of bdrv_next() are running in the main
loop.

I'd rather introduce a bdrv_next_safe() which is used by those callers
which need it.
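
For illustration only, the bdrv_next_safe() idea could look roughly like
the toy model below. None of this is QEMU code: ToyNode, toy_ref(),
toy_unref(), toy_first_safe(), toy_next_safe() and toy_iter_destroy() are
hypothetical stand-ins for BlockDriverState, bdrv_ref(), bdrv_unref() and
the proposed iterator. The sketch assumes a single thread (the "main loop")
and shows only the reference handling: the iterator keeps a strong
reference to the node it returned last and drops it when advancing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted BlockDriverState (hypothetical, not QEMU). */
typedef struct ToyNode {
    int refcnt;
    int id;
    struct ToyNode *next;       /* link in the global list of all nodes */
} ToyNode;

static ToyNode *toy_all_nodes;  /* head of the global list */
static int toy_live_nodes;      /* number of nodes not yet freed */

static ToyNode *toy_new(int id)
{
    ToyNode *n = calloc(1, sizeof(*n));
    assert(n);
    n->refcnt = 1;              /* reference held by the creator */
    n->id = id;
    n->next = toy_all_nodes;
    toy_all_nodes = n;
    toy_live_nodes++;
    return n;
}

static void toy_ref(ToyNode *n)
{
    n->refcnt++;
}

static void toy_unref(ToyNode *n)
{
    ToyNode **p;

    if (!n || --n->refcnt > 0) {
        return;
    }
    /* Last reference gone: unlink from the global list and free */
    for (p = &toy_all_nodes; *p != n; p = &(*p)->next) {
    }
    *p = n->next;
    free(n);
    toy_live_nodes--;
}

typedef struct ToySafeIter {
    ToyNode *cur;               /* node the iterator holds a reference to */
} ToySafeIter;

static ToyNode *toy_next_safe(ToySafeIter *it)
{
    ToyNode *old = it->cur;
    ToyNode *next = old ? old->next : toy_all_nodes;

    if (next) {
        toy_ref(next);
    }
    it->cur = next;
    toy_unref(old);             /* may free old, so old->next was read first */
    return next;
}

static ToyNode *toy_first_safe(ToySafeIter *it)
{
    it->cur = NULL;
    return toy_next_safe(it);
}

/* Only needed by callers that stop iterating before the end of the list */
static void toy_iter_destroy(ToySafeIter *it)
{
    toy_unref(it->cur);
    it->cur = NULL;
}

/* The loop body drops the creator's reference to the node being visited */
static int toy_demo(void)
{
    ToySafeIter it;
    ToyNode *n, *victim;
    int visited = 0;

    toy_new(1);
    victim = toy_new(2);
    toy_new(3);

    for (n = toy_first_safe(&it); n; n = toy_next_safe(&it)) {
        if (n == victim) {
            toy_unref(victim);  /* would free n without the iterator's ref */
        }
        visited += (n->id > 0); /* n is still valid here */
    }
    toy_iter_destroy(&it);      /* no-op after a complete iteration */
    return visited;
}
```

In this model the node dropped inside the loop body is freed only when the
iterator advances past it. Whether the unref in the advance step is safe
outside the main loop is exactly the open question above, and the toy does
not answer it.
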
>> Do you actually need to keep references to all BDSes in the whole list
>> while using the iterator or would it be enough to just keep a reference
>> to the current one?
>
> To fix the bug we now see, I think keeping the current one is enough, but I
> think implementing it just like this patch is also good as future-proofing: we
> cannot know what will be wedged into the nested aio_poll()s over time (and yes,
> we should really reduce the number of them.)
I don't really want to think about whether it's safe to only keep a
reference to the current BDS. I can't imagine any case where destroying
one root BDS leads to destroying another, but I'd rather be safe and not
have to think about it. (Unless there is an important reason to only
keep a strong reference to the current one.)
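
As a rough illustration of the difference, the toy model below takes a
reference to every node up front, the way the patch does with bs_list, and
then lets the loop body drop the last other reference to a different node.
Everything here is hypothetical: SnapNode, snap_ref(), snap_unref(),
snap_take() and snap_release() are made-up stand-ins, not QEMU functions,
and a plain array replaces the GSList.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a refcounted root BDS (hypothetical, not QEMU). */
typedef struct SnapNode {
    int refcnt;
    int id;
    struct SnapNode *next;      /* link in the global list of all nodes */
} SnapNode;

static SnapNode *snap_all_nodes;
static int snap_live_nodes;

static SnapNode *snap_new(int id)
{
    SnapNode *n = calloc(1, sizeof(*n));
    assert(n);
    n->refcnt = 1;              /* reference held by the creator */
    n->id = id;
    n->next = snap_all_nodes;
    snap_all_nodes = n;
    snap_live_nodes++;
    return n;
}

static void snap_ref(SnapNode *n)
{
    n->refcnt++;
}

static void snap_unref(SnapNode *n)
{
    SnapNode **p;

    if (--n->refcnt > 0) {
        return;
    }
    /* Last reference gone: unlink from the global list and free */
    for (p = &snap_all_nodes; *p != n; p = &(*p)->next) {
    }
    *p = n->next;
    free(n);
    snap_live_nodes--;
}

/* Reference every node and copy the pointers into a private array */
static SnapNode **snap_take(size_t *count)
{
    SnapNode *p, **snap;
    size_t n = 0, i = 0;

    for (p = snap_all_nodes; p; p = p->next) {
        n++;
    }
    snap = calloc(n ? n : 1, sizeof(*snap));
    assert(snap);
    for (p = snap_all_nodes; p; p = p->next) {
        snap_ref(p);
        snap[i++] = p;
    }
    *count = n;
    return snap;
}

static void snap_release(SnapNode **snap, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        snap_unref(snap[i]);
    }
    free(snap);
}

/* Visiting the first node drops the creator's reference to another one */
static int snap_demo(void)
{
    SnapNode *other = snap_new(1);
    SnapNode *first, **snap;
    size_t count, i;
    int visited = 0;

    snap_new(2);
    first = snap_new(3);        /* newest node sits at the list head */

    snap = snap_take(&count);
    for (i = 0; i < count; i++) {
        if (snap[i] == first) {
            snap_unref(other);  /* side effect hits a node not yet visited */
        }
        visited += (snap[i]->id > 0);   /* still valid: snapshot holds a ref */
    }
    snap_release(snap, count);  /* 'other' is actually freed here */
    return visited;
}
```

With a reference to only the current node, the later visit to 'other' in
this toy would touch freed memory. Whether such a side effect can really
happen between root BDSes is the part left unanswered above.
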
Max