qemu-devel.nongnu.org archive mirror
* [Qemu-devel] [PATCH v2 0/1] Fix block migration bug
@ 2014-12-30 10:04 Vladimir Sementsov-Ogievskiy
  2014-12-30 10:04 ` [Qemu-devel] [PATCH v2 1/1] migration/block: fix pending() return value Vladimir Sementsov-Ogievskiy
  0 siblings, 1 reply; 3+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2014-12-30 10:04 UTC
  To: qemu-devel; +Cc: amit.shah, den, vsementsov, quintela

v2:
  - rebase to master
  - fix typos in description

Because .save_live_pending() in block migration returns a wrong
value, migration can finish before the whole disk is transferred.
This happens when the migration process is fast enough, for
example when source and destination are on the same host.

It's easy to test this with the following:

bug.sh
=====================================================================
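#!/bin/sh
# Usage: ./bug.sh <disk size> <write offset>, e.g. ./bug.sh 1G 1022M

size=$1
addr=$2

# Remove leftovers from a previous run (-f: no error if they are missing).
rm -f /tmp/fifo-mig /tmp/a /tmp/b /tmp/sock-mig

# Create the source (a) and destination (b) images.
./qemu-img create -f qcow2 /tmp/a "$size"
./qemu-img create -f qcow2 /tmp/b "$size"

# Write a 512-byte 0x22 pattern at the given offset of the source image.
./qemu-io -c "write -P 0x22 $addr 512" /tmp/a

mkfifo /tmp/fifo-mig

# Destination: wait for the incoming migration stream from the fifo.
./x86_64-softmmu/qemu-system-x86_64 -drive file=/tmp/b,id=disk\
    -qmp unix:/tmp/sock-mig,server,nowait\
    -incoming "exec: cat /tmp/fifo-mig" &

# Source: start block migration into the fifo, then quit.
# printf is used because echo's handling of '\n' differs between shells.
printf 'migrate -b exec:cat>/tmp/fifo-mig\nquit\n' |\
./x86_64-softmmu/qemu-system-x86_64 -drive file=/tmp/a,id=disk\
    -monitor stdio

# Stop the destination via QMP and give it time to exit.
./scripts/qmp/qmp --path=/tmp/sock-mig quit
sleep 3

# Verify that the pattern arrived in the destination image.
echo checking
./qemu-io -c "read -P 0x22 $addr 512" /tmp/b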
=====================================================================

For './bug.sh 1G 1M' the qemu-io check succeeds, but for
'./bug.sh 1G 1022M' it fails with the 'Pattern verification
failed' status.
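
Why a fast transfer triggers the problem can be sketched with a toy
model of the migration loop. This is not QEMU code. It assumes that one
block is sent per iteration, that nothing is in flight when pending is
computed, that the caller keeps iterating only while pending is
non-zero and not below max_size, and that max_size grows with the
measured bandwidth. The name blocks_sent() and the BLOCK_SIZE value
are chosen for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed chunk size for this model only. */
#define BLOCK_SIZE (1ULL << 20)

/*
 * Toy model of the bulk phase: returns how many blocks were sent before
 * the loop decided that migration could complete.
 */
static uint64_t blocks_sent(uint64_t total_blocks, uint64_t max_size,
                            bool patched)
{
    uint64_t sent = 0;
    bool bulk_completed = false;

    assert(total_blocks > 0);

    for (;;) {
        /* One iteration: send the next block, or notice that none is left. */
        if (sent < total_blocks) {
            sent++;
        } else {
            bulk_completed = true;
        }

        /* Fast transfer: nothing is in flight when pending is computed. */
        uint64_t pending = 0;

        if (!bulk_completed) {
            if (patched) {
                /* Condition as in patch 1/1. */
                if (pending <= max_size) {
                    pending = max_size + BLOCK_SIZE;
                }
            } else {
                /* Condition before the patch. */
                if (pending == 0) {
                    pending = BLOCK_SIZE;
                }
            }
        }

        /* Assumed caller check: stop once pending is zero or below max_size. */
        if (pending == 0 || pending < max_size) {
            break;
        }
    }
    return sent;
}
```

In this model, blocks_sent(1024, 32 * BLOCK_SIZE, false) stops after a
single block, while the same call with patched = true, or with a
max_size below BLOCK_SIZE (a slow link), sends all 1024 blocks. The
model does not try to reproduce the exact offsets used in the test
above; it only shows the tail being left unsent.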

The following patch fixes this bug.

Vladimir Sementsov-Ogievskiy (1):
  migration/block: fix pending() return value

 migration/block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
1.9.1


* [Qemu-devel] [PATCH v2 1/1] migration/block: fix pending() return value
  2014-12-30 10:04 [Qemu-devel] [PATCH v2 0/1] Fix block migration bug Vladimir Sementsov-Ogievskiy
@ 2014-12-30 10:04 ` Vladimir Sementsov-Ogievskiy
  2015-01-02 16:23   ` Stefan Hajnoczi
  0 siblings, 1 reply; 3+ messages in thread
From: Vladimir Sementsov-Ogievskiy @ 2014-12-30 10:04 UTC
  To: qemu-devel; +Cc: amit.shah, den, vsementsov, quintela

Because .save_live_pending() in migration/block.c returns a wrong
value, migration can finish before the whole disk is transferred. This
happens when the migration process is fast enough, for example when
source and destination are on the same host.

If we return something < max_size during the bulk phase, the tail of
the device is never transferred. Currently the bulk phase has "set
pending to BLOCK_SIZE if it is zero", but there is no guarantee that
the returned value is not < max_size.

The correct approach is to return a value above max_size, for example
max_size+1, while we are in the bulk phase.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
---
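
A minimal standalone sketch of the condition change in the diff below,
not QEMU code. The BLOCK_SIZE value is assumed for illustration, and
keeps_iterating() is a hypothetical stand-in for the caller's check,
based on the description above (a bulk-phase value below max_size ends
the migration). The helper names pending_before() and pending_after()
are also made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only. */
#define BLOCK_SIZE (1ULL << 20)

/* Bulk-phase adjustment before the patch: only a zero value is bumped. */
static uint64_t pending_before(uint64_t pending, uint64_t max_size,
                               bool bulk_completed)
{
    (void)max_size; /* the old condition ignores max_size */
    if (pending == 0 && !bulk_completed) {
        pending = BLOCK_SIZE;
    }
    return pending;
}

/* Adjustment after the patch: any value not above max_size is bumped. */
static uint64_t pending_after(uint64_t pending, uint64_t max_size,
                              bool bulk_completed)
{
    if (pending <= max_size && !bulk_completed) {
        pending = max_size + BLOCK_SIZE;
    }
    return pending;
}

/* Hypothetical caller check: iterate while pending is non-zero and
 * not below max_size; otherwise the migration is completed. */
static bool keeps_iterating(uint64_t pending, uint64_t max_size)
{
    return pending != 0 && pending >= max_size;
}
```

With max_size = 32 * BLOCK_SIZE and nothing pending during the bulk
phase, pending_before() yields BLOCK_SIZE, which fails the check, while
pending_after() yields 33 * BLOCK_SIZE, which passes it. Once
bulk_completed is true, both helpers return pending unchanged.
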
 migration/block.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration/block.c b/migration/block.c
index 74d9eb1..2e92605 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -765,8 +765,8 @@ static uint64_t block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
                        block_mig_state.read_done * BLOCK_SIZE;
 
     /* Report at least one block pending during bulk phase */
-    if (pending == 0 && !block_mig_state.bulk_completed) {
-        pending = BLOCK_SIZE;
+    if (pending <= max_size && !block_mig_state.bulk_completed) {
+        pending = max_size + BLOCK_SIZE;
     }
     blk_mig_unlock();
     qemu_mutex_unlock_iothread();
-- 
1.9.1


* Re: [Qemu-devel] [PATCH v2 1/1] migration/block: fix pending() return value
  2014-12-30 10:04 ` [Qemu-devel] [PATCH v2 1/1] migration/block: fix pending() return value Vladimir Sementsov-Ogievskiy
@ 2015-01-02 16:23   ` Stefan Hajnoczi
  0 siblings, 0 replies; 3+ messages in thread
From: Stefan Hajnoczi @ 2015-01-02 16:23 UTC
  To: Vladimir Sementsov-Ogievskiy
  Cc: amit.shah, den, qemu-stable, qemu-devel, quintela


On Tue, Dec 30, 2014 at 01:04:16PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Because .save_live_pending() in migration/block.c returns a wrong
> value, migration can finish before the whole disk is transferred. This
> happens when the migration process is fast enough, for example when
> source and destination are on the same host.
> 
> If we return something < max_size during the bulk phase, the tail of
> the device is never transferred. Currently the bulk phase has "set
> pending to BLOCK_SIZE if it is zero", but there is no guarantee that
> the returned value is not < max_size.
> 
> The correct approach is to return a value above max_size, for example
> max_size+1, while we are in the bulk phase.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
> ---
>  migration/block.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Yikes, this is a nasty bug.  CCing qemu-stable.

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

