From: Hailiang Zhang <zhang.zhanghailiang@huawei.com>
To: Kashyap Chamarthy <kchamart@redhat.com>, qemu-devel@nongnu.org
Cc: xuquan8@huawei.com, qemu-block@nongnu.org
Subject: Re: [Qemu-devel] [QEMU-2.8] Source QEMU crashes with: "bdrv_co_pwritev: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed"
Date: Sat, 22 Apr 2017 17:23:49 +0800
Message-ID: <58FB2125.7020603@huawei.com>
In-Reply-To: <20170421204123.bdftoeuko6lwox3x@eukaryote>
Hi,
I think the patch below can fix your problem:
[PATCH 2/4] qmp-cont: invalidate on RUN_STATE_PRELAUNCH
https://patchwork.kernel.org/patch/9591885/
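For context, the rough idea of that patch (a sketch only, see the link
above for the exact change) is to extend the check in qmp_cont() so that
RUN_STATE_PRELAUNCH, the state a VM lands in after a reset following a
cancelled migration, also triggers re-invalidation of the image caches:

    /* Sketch, not the exact upstream patch: also regain control of
     * the images when continuing from RUN_STATE_PRELAUNCH. */
    if (runstate_check(RUN_STATE_FINISH_MIGRATE) ||
        runstate_check(RUN_STATE_POSTMIGRATE) ||
        runstate_check(RUN_STATE_PRELAUNCH)) {
        bdrv_invalidate_cache_all(&local_err);
        if (local_err) {
            error_propagate(errp, local_err);
            return;
        }
    }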
Actually, we encountered the same problem in our tests; we fixed it with the following patch:
From 0e4d6d706afd9909b5fd71536b45c58af60892f8 Mon Sep 17 00:00:00 2001
From: zhanghailiang <zhang.zhanghailiang@huawei.com>
Date: Tue, 21 Mar 2017 09:44:36 +0800
Subject: [PATCH] migration: Re-activate block devices whenever migration is cancelled
In commit 1d2acc3162d9c7772510c973f446353fbdd1f9a8 we tried to fix the bug
'bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed',
which occurred while cancelling migration.
But that fix did not cover all the cases; our tests caught one that
slipped past it: when libvirtd cancels the migration of a VM that is
shutting down, it sends a 'system_reset' command first, followed by
'cont'. Once the VM resumes running, it triggers the assertion above,
because we never regained control of the VM's block devices.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Hongyang Yang <yanghongyang@huawei.com>
---
block.c | 12 +++++++++++-
include/block/block.h | 1 +
include/migration/migration.h | 3 ---
migration/migration.c | 7 +------
qmp.c | 4 +---
5 files changed, 14 insertions(+), 13 deletions(-)
diff --git a/block.c b/block.c
index 6e906ec..c2555b0 100644
--- a/block.c
+++ b/block.c
@@ -3875,6 +3875,13 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
}
}
+static bool is_inactivated;
+
+bool bdrv_is_inactivated(void)
+{
+ return is_inactivated;
+}
+
void bdrv_invalidate_cache_all(Error **errp)
{
BlockDriverState *bs;
@@ -3892,6 +3899,7 @@ void bdrv_invalidate_cache_all(Error **errp)
return;
}
}
+ is_inactivated = false;
}
static int bdrv_inactivate_recurse(BlockDriverState *bs,
@@ -3948,7 +3956,9 @@ out:
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
aio_context_release(bdrv_get_aio_context(bs));
}
-
+ if (!ret) {
+ is_inactivated = true;
+ }
return ret;
}
diff --git a/include/block/block.h b/include/block/block.h
index 5149260..f77b57f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -365,6 +365,7 @@ int bdrv_co_ioctl(BlockDriverState *bs, int req, void *buf);
void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp);
void bdrv_invalidate_cache_all(Error **errp);
int bdrv_inactivate_all(void);
+bool bdrv_is_inactivated(void);
/* Ensure contents are flushed to disk. */
int bdrv_flush(BlockDriverState *bs);
diff --git a/include/migration/migration.h b/include/migration/migration.h
index 5720c88..a9a2071 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -183,9 +183,6 @@ struct MigrationState
/* Flag set once the migration thread is running (and needs joining) */
bool migration_thread_running;
- /* Flag set once the migration thread called bdrv_inactivate_all */
- bool block_inactive;
-
/* Queue of outstanding page requests from the destination */
QemuMutex src_page_req_mutex;
QSIMPLEQ_HEAD(src_page_requests, MigrationSrcPageRequest) src_page_requests;
diff --git a/migration/migration.c b/migration/migration.c
index 54060f7..7c3d593 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1015,14 +1015,12 @@ static void migrate_fd_cancel(MigrationState *s)
if (s->state == MIGRATION_STATUS_CANCELLING && f) {
qemu_file_shutdown(f);
}
- if (s->state == MIGRATION_STATUS_CANCELLING && s->block_inactive) {
+ if (bdrv_is_inactivated()) {
Error *local_err = NULL;
bdrv_invalidate_cache_all(&local_err);
if (local_err) {
error_report_err(local_err);
- } else {
- s->block_inactive = false;
}
}
}
@@ -1824,7 +1822,6 @@ static void migration_completion(MigrationState *s, int current_active_state,
if (ret >= 0) {
qemu_file_set_rate_limit(s->to_dst_file, INT64_MAX);
qemu_savevm_state_complete_precopy(s->to_dst_file, false);
- s->block_inactive = true;
}
}
qemu_mutex_unlock_iothread();
@@ -1879,8 +1876,6 @@ fail_invalidate:
bdrv_invalidate_cache_all(&local_err);
if (local_err) {
error_report_err(local_err);
- } else {
- s->block_inactive = false;
}
qemu_mutex_unlock_iothread();
}
diff --git a/qmp.c b/qmp.c
index fa82b59..147923b 100644
--- a/qmp.c
+++ b/qmp.c
@@ -197,9 +197,7 @@ void qmp_cont(Error **errp)
/* Continuing after completed migration. Images have been inactivated to
* allow the destination to take control. Need to get control back now. */
- if (runstate_check(RUN_STATE_FINISH_MIGRATE) ||
- runstate_check(RUN_STATE_POSTMIGRATE))
- {
+ if (bdrv_is_inactivated()) {
bdrv_invalidate_cache_all(&local_err);
if (local_err) {
error_propagate(errp, local_err);
--
1.8.3.1
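To make the lifecycle this patch enforces easier to see, here is a toy
model in plain C (my own illustration, not QEMU code): migration
inactivates the images, and every path that resumes the guest must
re-invalidate the caches first, otherwise a guest write trips the
BDRV_O_INACTIVE assertion from your report:

    #include <assert.h>
    #include <stdbool.h>
    #include <stdio.h>

    static bool inactive;              /* models BDRV_O_INACTIVE */

    /* Migration completion hands the images over to the destination. */
    static void inactivate_all(void) { inactive = true; }

    /* Re-activation: the source regains control of the images. */
    static void invalidate_all(void) { inactive = false; }

    static void guest_write(void)
    {
        assert(!inactive);             /* bdrv_co_pwritev()'s check */
        puts("write ok");
    }

    int main(void)
    {
        inactivate_all();   /* migration reached completion */
        /* migration cancelled; libvirt sends system_reset, then cont */
        invalidate_all();   /* with this patch, 'cont' re-activates */
        guest_write();      /* no assertion failure */
        return 0;
    }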
Is this helpful for you?
Thanks,
Hailiang
On 2017/4/22 4:41, Kashyap Chamarthy wrote:
> I have seen a bunch of reports about this assertion error (on source
> QEMU). [At least I recall Greg Kurz mentioning this a week or so ago on
> #qemu, OFTC.]
>
> I just noticed this crash in upstream OpenStack CI environment. This
> seems to occur (only intermittently, though) during live migration
> without shared-storage.
>
> [I've attached the complete source and destination QEMU command-line in
> a separate plain text file.]
>
> But here are the errors from source and destination QEMU:
>
> Source QEMU:
> -----------------------------------------------------------------------
> [...]
> 2017-04-21 13:54:08.505+0000: initiating migration
> qemu-system-x86_64: /build/qemu-5OJ39u/qemu-2.8+dfsg/block/io.c:1514: bdrv_co_pwritev: Assertion `!(bs->open_flags & BDRV_O_INACTIVE)' failed.
> 2017-04-21 13:54:08.791+0000: shutting down, reason=crashed
> -----------------------------------------------------------------------
>
> Destination QEMU:
> -----------------------------------------------------------------------
> [...]
> /build/qemu-5OJ39u/qemu-2.8+dfsg/nbd/server.c:nbd_receive_request():L710: read failed
> 2017-04-21 13:54:08.792+0000: shutting down, reason=failed
> 2017-04-21T13:54:08.793259Z qemu-system-x86_64: terminating on signal 15 from pid 12160 (/usr/sbin/libvirtd)
> -----------------------------------------------------------------------
>
> Any hints as to how to deal with this in QEMU 2.8?
>
> FWIW, the upstream OpenStack CI only very recently moved to QEMU 2.8, so
> it is unlikely that the CI env will move to the just-released 2.9
> anytime soon. (But there's work in progress to create a CI job that
> tests with QEMU 2.9.)
>