From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Kevin Wolf <kwolf@redhat.com>
Cc: "libvir-list @ redhat . com" <libvir-list@redhat.com>,
qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH] qcow2: Fix race in cache invalidation
Date: Fri, 26 Sep 2014 00:05:57 +1000
Message-ID: <54242145.6070808@ozlabs.ru>
In-Reply-To: <20140925123944.GK4667@noname.redhat.com>
On 09/25/2014 10:39 PM, Kevin Wolf wrote:
> Am 25.09.2014 um 14:29 hat Alexey Kardashevskiy geschrieben:
>> On 09/25/2014 08:20 PM, Kevin Wolf wrote:
>>> Am 25.09.2014 um 11:55 hat Alexey Kardashevskiy geschrieben:
>>>> Right. Cool. So is the patch below what was suggested? I am double-checking
>>>> because it does not solve the original issue: the bottom half is called
>>>> first and then nbd_trip() crashes in qcow2_co_flush_to_os().
>>>>
>>>> diff --git a/block.c b/block.c
>>>> index d06dd51..1e6dfd1 100644
>>>> --- a/block.c
>>>> +++ b/block.c
>>>> @@ -5037,20 +5037,22 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
>>>>      if (local_err) {
>>>>          error_propagate(errp, local_err);
>>>>          return;
>>>>      }
>>>>
>>>>      ret = refresh_total_sectors(bs, bs->total_sectors);
>>>>      if (ret < 0) {
>>>>          error_setg_errno(errp, -ret, "Could not refresh total sector count");
>>>>          return;
>>>>      }
>>>> +
>>>> +    bdrv_drain_all();
>>>>  }
>>>
>>> Try moving the bdrv_drain_all() call to the top of the function (at
>>> least it must be called before bs->drv->bdrv_invalidate_cache).
>>
>>
>> Ok, I did. Did not help.
>>
>>
>>>
>>>> +static QEMUBH *migration_complete_bh;
>>>> +static void process_incoming_migration_complete(void *opaque);
>>>> +
>>>>  static void process_incoming_migration_co(void *opaque)
>>>>  {
>>>>      QEMUFile *f = opaque;
>>>> -    Error *local_err = NULL;
>>>>      int ret;
>>>>
>>>>      ret = qemu_loadvm_state(f);
>>>>      qemu_fclose(f);
>>>
>>> Paolo suggested moving everything starting from here, but as far as I
>>> can tell, leaving the next few lines here shouldn't hurt.
>>
>>
>> Ouch. I was looking at the wrong qcow2_fclose() all this time :)
>> Anyway, what you suggested did not help:
>> bdrv_co_flush() calls qemu_coroutine_yield() while this BH is being
>> executed, and the situation is still the same.
>
> Hm, do you have a backtrace? The idea with the BH was that it would be
> executed _outside_ coroutine context and therefore wouldn't be able to
> yield. If it's still executed in coroutine context, it would be
> interesting to see who that caller is.
Like this?
process_incoming_migration_complete
bdrv_invalidate_cache_all
bdrv_drain_all
aio_dispatch
node->io_read (which is nbd_read)
nbd_trip
bdrv_co_flush
[...]
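
To make the nesting in that call chain easier to see, here is a toy model of
it. This is only a sketch and not QEMU code: every toy_* name and the struct
are hypothetical, chosen to mirror the frames in the backtrace. It assumes
that draining polls the event loop once and that a client request is already
waiting on the fd; under those assumptions the request handler starts while
the BH is still on the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names exist in QEMU. */
struct toy_loop {
    bool request_pending;   /* a client request is waiting on the fd */
    int  bh_depth;          /* > 0 while the completion BH is running */
    int  nested_requests;   /* requests started from inside the BH */
};

/* Stands in for node->io_read (nbd_read), which would go on to nbd_trip(). */
static void toy_nbd_read(struct toy_loop *loop)
{
    loop->request_pending = false;
    if (loop->bh_depth > 0) {
        /* The request begins while the BH has not returned yet. */
        loop->nested_requests++;
    }
}

/* Stands in for aio_dispatch(): runs the handler of any ready fd. */
static void toy_aio_dispatch(struct toy_loop *loop)
{
    if (loop->request_pending) {
        toy_nbd_read(loop);
    }
}

/* Stands in for bdrv_drain_all(): draining polls the event loop. */
static void toy_drain_all(struct toy_loop *loop)
{
    toy_aio_dispatch(loop);
}

/* Stands in for process_incoming_migration_complete(). */
static void toy_migration_complete_bh(struct toy_loop *loop)
{
    assert(loop != NULL);
    loop->bh_depth++;
    toy_drain_all(loop);    /* may dispatch a new request, see above */
    loop->bh_depth--;
}
```

Calling toy_migration_complete_bh() on a loop with request_pending set leaves
nested_requests at 1; with no request pending it stays at 0. The model says
nothing about why the real code then fails in bdrv_co_flush(); it only shows
that the drain itself can start a request from within the BH.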
--
Alexey