From: Balamuruhan S <bala24@linux.vnet.ibm.com>
To: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH for-3.0 0/9] migration: postcopy recovery unit test, bug fixes
Date: Fri, 6 Jul 2018 17:15:09 +0530
Message-ID: <20180706114509.GC16585@localhost.localdomain>
In-Reply-To: <20180706105658.GB2661@work-vm>
On Fri, Jul 06, 2018 at 11:56:59AM +0100, Dr. David Alan Gilbert wrote:
> * Dr. David Alan Gilbert (dgilbert@redhat.com) wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > Based-on: <20180627132246.5576-1-peterx@redhat.com>
> > >
> > > Based on the series to unbreak postcopy:
> > > Subject: [PATCH v3 0/4] migation: unbreak postcopy recovery
> > > Message-Id: <20180627132246.5576-1-peterx@redhat.com>
> > >
> > > This series introduce a new postcopy recovery test. The new test
> > > actually helped me to identify two bugs there so fix them as well
> > > before 3.0 release.
> > >
> > > Patch 1: a trivial cleanup for existing postcopy ram load, which I
> > > found a bit confusing during debugging the problem.
> > >
> > > Patch 2-3: two bug fixes that address different issues. Please see
> > > the commit log for more information.
> > >
> > > Patch 4-9: add the postcopy recovery unit test.
> > >
> > > Please review. Thanks,
> >
> > Queued
>
> Hi Peter,
> There's a problem in there somewhere; I'm getting
> an intermittent failure of the test if I run a make check -j 8 on my
> laptop. Just running two copies of tests/migration-test in parallel
> sometimes triggers it (but not if I turn on QTEST_LOG!).
> But it's always failing with:
>
> ERROR:/home/dgilbert/git/migpull/tests/migration-test.c:373:migrate_recover: assertion failed: (qdict_haskey(rsp, "return"))
Hi Peter and Dave,
I have tested postcopy migration pause/recover after applying this
patchset on upstream QEMU.
Observation 1:
We lose the target after triggering migrate_pause.
source
# ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic -vga none \
-machine pseries -m 64G,slots=128,maxmem=128G -smp 16,maxcpus=32 \
-device virtio-blk-pci,drive=rootdisk -drive \
file=/home/bala/sharing/hostos-ppc64le.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \
-monitor telnet:127.0.0.1:1235,server,nowait -net nic,model=virtio \
-net user -redir tcp:2001::22
qemu-system-ppc64: Detected IO failure for postcopy. Migration paused.
source Monitor
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate_set_parameter max-postcopy-bandwidth 4096
(qemu) migrate -d tcp:127.0.0.1:4444
(qemu) migrate_start_postcopy
(qemu) migrate_pause
(qemu) info migrate
globals:
store-global-state: on
only-migratable: off
send-configuration: on
send-section-footer: on
decompress-error-check: on
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off
zero-blocks: off compress: off events: off postcopy-ram: on x-colo: off
release-ram: off block: off return-path: off pause-before-switchover:
off x-multifd: off dirty-bitmaps: off postcopy-blocktime: off
late-block-activate: off
Migration status: postcopy-paused
total time: 371289 milliseconds
expected downtime: 656414 milliseconds
setup: 93 milliseconds
transferred ram: 690856 kbytes
throughput: 46.65 mbps
remaining ram: 3716864 kbytes
total ram: 67109120 kbytes
duplicate: 16631167 pages
skipped: 0 pages
normal: 135905 pages
normal bytes: 543620 kbytes
dirty sync count: 2
page size: 4 kbytes
multifd bytes: 0 kbytes
dirty pages rate: 626209 pages
postcopy request count: 395
The source remains in the postcopy-paused state because the target is lost.
target
# ppc64-softmmu/qemu-system-ppc64 --enable-kvm --nographic -vga none \
-machine pseries -m 64G,slots=128,maxmem=128G -smp 16,maxcpus=32 \
-device virtio-blk-pci,drive=rootdisk -drive \
file=/home/bala/sharing/hostos-ppc64le.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \
-monitor telnet:127.0.0.1:1235,server,nowait -net nic,model=virtio \
-net user -redir tcp:2001::22 -incoming tcp:127.0.0.1:4444
Error observed
check_section_footer: Read section footer failed: -5
qemu-system-ppc64: postcopy_ram_listen_thread: loadvm failed: -22
[ 188.815436] Unable to handle kernel paging request for instruction fetch
Target Monitor
(qemu) migrate_set_capability postcopy-ram on
Observation 2:
Unlike the error Dave observed in qtest, the test hangs for me while
waiting for the migration to complete, and the source remains in the
migration-paused state.
# time QTEST_QEMU_BINARY=./ppc64-softmmu/qemu-system-ppc64 \
./tests/migration-test
/ppc64/migration/deprecated: OK
/ppc64/migration/bad_dest: OK
/ppc64/migration/postcopy/unix: OK
/ppc64/migration/postcopy/recovery: ^C
real 21m55.176s
user 2m28.800s
sys 4m55.980s
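A hang like the one above can be bounded with coreutils timeout(1), so
the run ends by itself instead of needing ^C. This is only a minimal
sketch: the helper name run_bounded and the 300-second limit in the
usage comment are assumptions for illustration, not part of the patchset.

```shell
#!/bin/sh
# Sketch: run a command under a time limit.
# run_bounded is a hypothetical helper; timeout(1) is from GNU coreutils
# and exits with status 124 when the limit is reached.
run_bounded() {
    limit="$1"
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}

# Usage (paths as in the run above; 300 is an assumed limit):
# run_bounded 300 env QTEST_QEMU_BINARY=./ppc64-softmmu/qemu-system-ppc64 \
#     ./tests/migration-test
```

An exit status of 124 then distinguishes a hang from an assertion
failure such as the one Dave reported.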
-- Bala
>
> Dave
>
> > > Peter Xu (9):
> > > migration: simplify check to use qemu file buffer
> > > migration: loosen recovery check when load vm
> > > migration: fix incorrect bitmap size calculation
> > > tests: introduce migrate_postcopy_* helpers
> > > tests: allow migrate() to take extra flags
> > > tests: introduce migrate_query*() helpers
> > > tests: introduce wait_for_migration_status()
> > > tests: add postcopy recovery test
> > > tests: hide stderr for postcopy recovery test
> > >
> > > migration/ram.c | 21 +++--
> > > migration/savevm.c | 16 ++--
> > > tests/migration-test.c | 198 ++++++++++++++++++++++++++++++++---------
> > > 3 files changed, 176 insertions(+), 59 deletions(-)
> > >
> > > --
> > > 2.17.1
> > >
> > >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>