qemu-devel.nongnu.org archive mirror
* [Qemu-devel] ping RE: question: I found a qemu crash about migration
@ 2017-09-28  7:38 wangjie (P)
  2017-09-28 17:01 ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 8+ messages in thread
From: wangjie (P) @ 2017-09-28  7:38 UTC
  To: qemu-devel@nongnu.org, kwolf@redhat.com, pbonzini@redhat.com
  Cc: fuweiwei (C), eblake@redhat.com, kchamart@redhat.com,
	dgilbert@redhat.com, famz@redhat.com

[-- Attachment #1: Type: text/plain, Size: 3032 bytes --]

Ping?

From: wangjie (P)
Sent: Tuesday, September 26, 2017 9:10 PM
To: qemu-devel@nongnu.org; kwolf@redhat.com; pbonzini@redhat.com
Cc: wangjie (P) <wangjie88@huawei.com>; fuweiwei (C) <fuweiwei2@huawei.com>; eblake@redhat.com; kchamart@redhat.com; dgilbert@redhat.com; famz@redhat.com; Wubin (H) <wu.wubin@huawei.com>
Subject: question: I found a qemu crash about migration

Hi,

When I use qemuMigrationRun to migrate both memory and storage, with some I/O pressure in the VM and iothreads configured, QEMU hits the following assertion (I am using the current qemu master branch):
"bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed"

I reviewed the code and inspected the core dump with gdb. I think the following case can trigger the assertion:

Case:

migration_thread()
      migration_completion() ----------> last iteration of memory migration
            vm_stop_force_state() --------------> stops the VM and calls bdrv_drain_all(); however, gdb on the core file shows that the dirty bitmap count of drive-mirror is not 0 and that 16 mirror I/O requests are in flight
                  bdrv_inactivate_all() ----------------> inactivates the images and sets the INACTIVE flag
      -> bdrv_co_do_pwritev() --------------> mirror I/O handled afterwards triggers the assertion `!(bs->open_flags & 0x0800)' and QEMU crashes

As shown above, migration_completion() calls bdrv_inactivate_all() to inactivate the images while mirror_run has not finished (it still has dirty clusters). Any I/O that mirror_run issues afterwards triggers the assertion: "bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed".

It seems that memory migration and storage mirroring run independently, and the relative order in which the two finish is not deterministic.

How can I solve this problem? Should we avoid setting the INACTIVE flag on the drive-mirror BlockDriverState?
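
To make the question concrete, here is a minimal sketch of what "do not set INACTIVE for the drive-mirror node" could look like in a toy model. Everything here is an assumption for illustration: struct sketch_node, its used_by_mirror field and sketch_inactivate_all() are hypothetical and do not exist in QEMU. Only the flag value 0x0800 is taken from the assertion message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag value taken from the assertion message. */
#define SKETCH_O_INACTIVE 0x0800

/* Hypothetical node record; QEMU has no "used_by_mirror" field. */
struct sketch_node {
    int open_flags;
    bool used_by_mirror;  /* assumption: node is still written by a mirror job */
};

/*
 * Sets the inactive flag on every node except those still used by a
 * mirror job. Returns the number of nodes that were skipped.
 */
static size_t sketch_inactivate_all(struct sketch_node *nodes, size_t n)
{
    size_t skipped = 0;

    assert(nodes != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].used_by_mirror) {
            skipped++;  /* leave this node writable for the mirror job */
            continue;
        }
        nodes[i].open_flags |= SKETCH_O_INACTIVE;
    }
    return skipped;
}
```

The sketch only shows the mechanics of skipping a node. Whether it is safe to leave such a node active during the migration handoff is exactly the open question, and the sketch does not answer it.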

QEMU crash backtrace:
(gdb) bt
#0  0x00007f6b6e2a71d7 in raise () from /usr/lib64/libc.so.6
#1  0x00007f6b6e2a88c8 in abort () from /usr/lib64/libc.so.6
#2  0x00007f6b6e2a0146 in __assert_fail_base () from /usr/lib64/libc.so.6
#3  0x00007f6b6e2a01f2 in __assert_fail () from /usr/lib64/libc.so.6
#4  0x00000000007b9211 in bdrv_co_pwritev (child=<optimized out>, offset=offset@entry=7034896384, bytes=bytes@entry=65536,
    qiov=qiov@entry=0x7f69cc09b068, flags=0) at block/io.c:1536
#5  0x00000000007a6f02 in blk_co_pwritev (blk=0x2f92750, offset=7034896384, bytes=65536, qiov=0x7f69cc09b068,
    flags=<optimized out>) at block/block_backend.c:851
#6  0x00000000007a6fc1 in blk_aio_write_entry (opaque=0x301dad0) at block/block_backend.c:1043
#7  0x0000000000835e2a in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine_ucontext.c:79
#8  0x00007f6b6e2b8cf0 in ?? () from /usr/lib64/libc.so.6
#9  0x00007f6a1bcfc780 in ?? ()
#10 0x0000000000000000 in ?? ()

I also see that mirror_run has not finished; the gdb output is as follows:
[cid:image001.png@01D3386F.DBC9FF10]


Source VM QEMU log:

[cid:image002.png@01D3386F.DBC9FF10]

[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 61384 bytes --]

[-- Attachment #3: image002.png --]
[-- Type: image/png, Size: 153960 bytes --]


Thread overview: 8+ messages
2017-09-28  7:38 [Qemu-devel] ping RE: question: I found a qemu crash about migration wangjie (P)
2017-09-28 17:01 ` Dr. David Alan Gilbert
2017-09-29  9:35   ` Kevin Wolf
2017-09-29 19:06     ` Dr. David Alan Gilbert
2017-09-29 20:44       ` Kevin Wolf
2017-10-09 11:55         ` Dr. David Alan Gilbert
2017-10-11  1:34           ` wangjie (P)
2017-10-11  8:28             ` Kevin Wolf
