From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: Kevin Wolf <kwolf@redhat.com>
Cc: "libvir-list @ redhat . com" <libvir-list@redhat.com>,
qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com>,
Stefan Hajnoczi <stefanha@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH] qcow2: Fix race in cache invalidation
Date: Thu, 25 Sep 2014 22:29:36 +1000
Message-ID: <54240AB0.508@ozlabs.ru>
In-Reply-To: <20140925102027.GH4667@noname.redhat.com>
On 09/25/2014 08:20 PM, Kevin Wolf wrote:
> On 25.09.2014 at 11:55, Alexey Kardashevskiy wrote:
>> Right. Cool. So is the patch below what was suggested? I am double-checking
>> because it does not solve the original issue: the bottom half is called
>> first and then nbd_trip() crashes in qcow2_co_flush_to_os().
>>
>> diff --git a/block.c b/block.c
>> index d06dd51..1e6dfd1 100644
>> --- a/block.c
>> +++ b/block.c
>> @@ -5037,20 +5037,22 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
>>      if (local_err) {
>>          error_propagate(errp, local_err);
>>          return;
>>      }
>>
>>      ret = refresh_total_sectors(bs, bs->total_sectors);
>>      if (ret < 0) {
>>          error_setg_errno(errp, -ret, "Could not refresh total sector count");
>>          return;
>>      }
>> +
>> +    bdrv_drain_all();
>>  }
>
> Try moving the bdrv_drain_all() call to the top of the function (at
> least it must be called before bs->drv->bdrv_invalidate_cache).
OK, I did. It did not help.
>
>> +static QEMUBH *migration_complete_bh;
>> +static void process_incoming_migration_complete(void *opaque);
>> +
>>  static void process_incoming_migration_co(void *opaque)
>>  {
>>      QEMUFile *f = opaque;
>> -    Error *local_err = NULL;
>>      int ret;
>>
>>      ret = qemu_loadvm_state(f);
>>      qemu_fclose(f);
>
> Paolo suggested moving everything starting from here, but as far as I
> can tell, leaving the next few lines here shouldn't hurt.
Ouch. I was looking at the wrong qcow2_fclose() all this time :)
Anyway, what you suggested did not help:
bdrv_co_flush() calls qemu_coroutine_yield() while this BH is being
executed and the situation is still the same.
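
A minimal sketch of that interleaving, as a toy model in plain C. It is
not QEMU code: ToyState, toy_flush_resume(), toy_invalidate_cache() and
toy_run() are hypothetical names, and the model assumes that invalidating
the cache reinitialises the state that holds the lock.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for per-image driver state; not a QEMU type. */
typedef struct {
    bool locked;
} ToyState;

typedef enum { STEP_START, STEP_YIELDED, STEP_DONE } ToyStep;

/* A "coroutine" modelled as an explicit state machine. */
typedef struct {
    ToyState *s;
    ToyStep step;
    bool unlock_ok;   /* did the unlock find the lock still held? */
} ToyFlush;

/* Run the toy flush up to its next yield point. */
void toy_flush_resume(ToyFlush *f)
{
    switch (f->step) {
    case STEP_START:
        f->s->locked = true;         /* take the lock ...            */
        f->step = STEP_YIELDED;      /* ... and yield while holding it */
        break;
    case STEP_YIELDED:
        f->unlock_ok = f->s->locked; /* a real mutex would assert here */
        f->s->locked = false;
        f->step = STEP_DONE;
        break;
    case STEP_DONE:
        break;
    }
}

/* Assumption: invalidation wipes the state, including the lock. */
void toy_invalidate_cache(ToyState *s)
{
    memset(s, 0, sizeof(*s));
}

/* Returns true if the flush released a lock that was still held. */
bool toy_run(bool invalidate_during_yield)
{
    ToyState s = { .locked = false };
    ToyFlush f = { .s = &s, .step = STEP_START, .unlock_ok = false };

    toy_flush_resume(&f);            /* flush starts, then yields */
    if (invalidate_during_yield) {
        toy_invalidate_cache(&s);    /* the BH runs in the gap */
    }
    toy_flush_resume(&f);            /* flush resumes and unlocks */
    return f.unlock_ok;
}
```

In this model toy_run(false) returns true and toy_run(true) returns
false. The second case is the point where a real mutex would fail its
"must be locked" check.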
>
>>      free_xbzrle_decoded_buf();
>>      if (ret < 0) {
>>          error_report("load of migration failed: %s", strerror(-ret));
>>          exit(EXIT_FAILURE);
>>      }
>>      qemu_announce_self();
>>
>>      bdrv_clear_incoming_migration_all();
>> +
>> +    migration_complete_bh = aio_bh_new(qemu_get_aio_context(),
>> +                                       process_incoming_migration_complete,
>> +                                       NULL);
>> +    qemu_bh_schedule(migration_complete_bh);
>> +}
>> +
>> +static void process_incoming_migration_complete(void *opaque)
>> +{
>> +    Error *local_err = NULL;
>> +
>>      /* Make sure all file formats flush their mutable metadata */
>>      bdrv_invalidate_cache_all(&local_err);
>>      if (local_err) {
>>          qerror_report_err(local_err);
>>          error_free(local_err);
>>          exit(EXIT_FAILURE);
>>      }
>>
>>      if (autostart) {
>>          vm_start();
>>      } else {
>>          runstate_set(RUN_STATE_PAUSED);
>>      }
>> +    qemu_bh_delete(migration_complete_bh);
>> +    migration_complete_bh = NULL;
>>  }
>
> That part looks good to me. I hope moving bdrv_drain_all() does the
> trick, otherwise there's something wrong with our reasoning.
>
> Kevin
>
--
Alexey