qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH 0/7] Handle errors during migration
@ 2011-09-20 13:24 Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 1/7] migration: simplify state assignmente Juan Quintela
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 13:24 UTC (permalink / raw)
  To: qemu-devel

Hi

This patch series contains error handling for migration.

After this series are applied, migrate_cancel after one error don't
hang.  And we add some error checking left and right.

This is the error handling patches that were on the middle of my
migration-cleanup of some months ago.  migration_cancel fix has been
added.

Later, Juan.

Juan Quintela (5):
  migration: simplify state assignmente
  migration: only flush when there are no errors
  migration: Check that migration is active before cancel it
  migration: If there is one error, it makes no sense to continue
  migration: qemu_savevm_iterate has three return values

Yoshiaki Tamura (2):
  savevm: avoid qemu_savevm_state_iterate() to return 1 when qemu file
    has error.
  migration: add error handling to migrate_fd_put_notify().

 buffered_file.c  |   19 +++++++++++--------
 buffered_file.h  |    2 +-
 hw/hw.h          |    4 ++--
 migration-exec.c |    6 +++---
 migration-fd.c   |    4 ++--
 migration-tcp.c  |    4 ++--
 migration-unix.c |    4 ++--
 migration.c      |   33 ++++++++++++++++++---------------
 migration.h      |    4 ++--
 savevm.c         |   27 +++++++++++++++------------
 10 files changed, 58 insertions(+), 49 deletions(-)

-- 
1.7.6.2

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 1/7] migration: simplify state assignmente
  2011-09-20 13:24 [Qemu-devel] [PATCH 0/7] Handle errors during migration Juan Quintela
@ 2011-09-20 13:24 ` Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors Juan Quintela
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 13:24 UTC (permalink / raw)
  To: qemu-devel

Once there, make sure that if we already know that there is one error, just
call migration_fd_cleanup() with the ERROR state.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration.c |   11 +++++------
 1 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/migration.c b/migration.c
index f5959b4..9a93e3b 100644
--- a/migration.c
+++ b/migration.c
@@ -370,7 +370,6 @@ void migrate_fd_put_ready(void *opaque)

     DPRINTF("iterate\n");
     if (qemu_savevm_state_iterate(s->mon, s->file) == 1) {
-        int state;
         int old_vm_running = vm_running;

         DPRINTF("done iterating\n");
@@ -380,17 +379,17 @@ void migrate_fd_put_ready(void *opaque)
             if (old_vm_running) {
                 vm_start();
             }
-            state = MIG_STATE_ERROR;
-        } else {
-            state = MIG_STATE_COMPLETED;
+            s->state = MIG_STATE_ERROR;
         }
         if (migrate_fd_cleanup(s) < 0) {
             if (old_vm_running) {
                 vm_start();
             }
-            state = MIG_STATE_ERROR;
+            s->state = MIG_STATE_ERROR;
+        }
+        if (s->state == MIG_STATE_ACTIVE) {
+            s->state = MIG_STATE_COMPLETED;
         }
-        s->state = state;
         notifier_list_notify(&migration_state_notifiers, NULL);
     }
 }
-- 
1.7.6.2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors
  2011-09-20 13:24 [Qemu-devel] [PATCH 0/7] Handle errors during migration Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 1/7] migration: simplify state assignmente Juan Quintela
@ 2011-09-20 13:24 ` Juan Quintela
  2011-09-20 14:25   ` Daniel P. Berrange
  2011-09-20 15:48   ` Paolo Bonzini
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 3/7] migration: Check that migration is active before cancel it Juan Quintela
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 13:24 UTC (permalink / raw)
  To: qemu-devel

If we have one error while migrating, and then we issuse a
"migrate_cancel" command, guest hang.  Fix it for flushing only when
migration is in MIG_STATE_ACTIVE.  In case of error of cancellation,
don't flush.

We had an infinite loop at buffered_close()

        while (!s->has_error && s->buffer_size) {
            buffered_flush(s);
            if (s->freeze_output)
                s->wait_for_unfreeze(s);
        }

There was no errors, there were things to send, and connection was
broken.  send() returns -EAGAIN, so we freezed output, but we
unfreeze_output and try again.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c  |   17 ++++++++++-------
 buffered_file.h  |    2 +-
 hw/hw.h          |    4 ++--
 migration-exec.c |    6 +++---
 migration-fd.c   |    4 ++--
 migration-tcp.c  |    4 ++--
 migration-unix.c |    4 ++--
 migration.c      |    6 +++---
 migration.h      |    4 ++--
 savevm.c         |   20 +++++++++++---------
 10 files changed, 38 insertions(+), 33 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 486af57..5ba3d19 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -166,20 +166,23 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
     return offset;
 }

-static int buffered_close(void *opaque)
+static int buffered_close(void *opaque, bool flush)
 {
     QEMUFileBuffered *s = opaque;
     int ret;

     DPRINTF("closing\n");

-    while (!s->has_error && s->buffer_size) {
-        buffered_flush(s);
-        if (s->freeze_output)
-            s->wait_for_unfreeze(s);
+    if (flush) {
+        while (!s->has_error && s->buffer_size) {
+            buffered_flush(s);
+            if (s->freeze_output) {
+                s->wait_for_unfreeze(s);
+            }
+        }
     }

-    ret = s->close(s->opaque);
+    ret = s->close(s->opaque, flush);

     qemu_del_timer(s->timer);
     qemu_free_timer(s->timer);
@@ -233,7 +236,7 @@ static void buffered_rate_tick(void *opaque)
     QEMUFileBuffered *s = opaque;

     if (s->has_error) {
-        buffered_close(s);
+        buffered_close(s, false);
         return;
     }

diff --git a/buffered_file.h b/buffered_file.h
index 98d358b..16162ec 100644
--- a/buffered_file.h
+++ b/buffered_file.h
@@ -19,7 +19,7 @@
 typedef ssize_t (BufferedPutFunc)(void *opaque, const void *data, size_t size);
 typedef void (BufferedPutReadyFunc)(void *opaque);
 typedef void (BufferedWaitForUnfreezeFunc)(void *opaque);
-typedef int (BufferedCloseFunc)(void *opaque);
+typedef int (BufferedCloseFunc)(void *opaque, bool flush);

 QEMUFile *qemu_fopen_ops_buffered(void *opaque, size_t xfer_limit,
                                   BufferedPutFunc *put_buffer,
diff --git a/hw/hw.h b/hw/hw.h
index a124da9..129de0e 100644
--- a/hw/hw.h
+++ b/hw/hw.h
@@ -28,7 +28,7 @@ typedef int (QEMUFileGetBufferFunc)(void *opaque, uint8_t *buf,
                                     int64_t pos, int size);

 /* Close a file and return an error code */
-typedef int (QEMUFileCloseFunc)(void *opaque);
+typedef int (QEMUFileCloseFunc)(void *opaque, bool flush);

 /* Called to determine if the file has exceeded it's bandwidth allocation.  The
  * bandwidth capping is a soft limit, not a hard limit.
@@ -55,7 +55,7 @@ QEMUFile *qemu_popen(FILE *popen_file, const char *mode);
 QEMUFile *qemu_popen_cmd(const char *command, const char *mode);
 int qemu_stdio_fd(QEMUFile *f);
 void qemu_fflush(QEMUFile *f);
-int qemu_fclose(QEMUFile *f);
+int qemu_fclose(QEMUFile *f, bool flush);
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size);
 void qemu_put_byte(QEMUFile *f, int v);

diff --git a/migration-exec.c b/migration-exec.c
index 2cfb6f2..5fe09e3 100644
--- a/migration-exec.c
+++ b/migration-exec.c
@@ -42,12 +42,12 @@ static int file_write(FdMigrationState *s, const void * buf, size_t size)
     return write(s->fd, buf, size);
 }

-static int exec_close(FdMigrationState *s)
+static int exec_close(FdMigrationState *s, bool flush)
 {
     int ret = 0;
     DPRINTF("exec_close\n");
     if (s->opaque) {
-        ret = qemu_fclose(s->opaque);
+        ret = qemu_fclose(s->opaque, flush);
         s->opaque = NULL;
         s->fd = -1;
         if (ret != -1 &&
@@ -123,7 +123,7 @@ static void exec_accept_incoming_migration(void *opaque)

     process_incoming_migration(f);
     qemu_set_fd_handler2(qemu_stdio_fd(f), NULL, NULL, NULL, NULL);
-    qemu_fclose(f);
+    qemu_fclose(f, false);
 }

 int exec_start_incoming_migration(const char *command)
diff --git a/migration-fd.c b/migration-fd.c
index aee690a..3908850 100644
--- a/migration-fd.c
+++ b/migration-fd.c
@@ -40,7 +40,7 @@ static int fd_write(FdMigrationState *s, const void * buf, size_t size)
     return write(s->fd, buf, size);
 }

-static int fd_close(FdMigrationState *s)
+static int fd_close(FdMigrationState *s, bool flush)
 {
     DPRINTF("fd_close\n");
     if (s->fd != -1) {
@@ -106,7 +106,7 @@ static void fd_accept_incoming_migration(void *opaque)

     process_incoming_migration(f);
     qemu_set_fd_handler2(qemu_stdio_fd(f), NULL, NULL, NULL, NULL);
-    qemu_fclose(f);
+    qemu_fclose(f, false);
 }

 int fd_start_incoming_migration(const char *infd)
diff --git a/migration-tcp.c b/migration-tcp.c
index c431e03..cb061f6 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -38,7 +38,7 @@ static int socket_write(FdMigrationState *s, const void * buf, size_t size)
     return send(s->fd, buf, size, 0);
 }

-static int tcp_close(FdMigrationState *s)
+static int tcp_close(FdMigrationState *s, bool flush)
 {
     DPRINTF("tcp_close\n");
     if (s->fd != -1) {
@@ -160,7 +160,7 @@ static void tcp_accept_incoming_migration(void *opaque)
     }

     process_incoming_migration(f);
-    qemu_fclose(f);
+    qemu_fclose(f, false);
 out:
     close(c);
 out2:
diff --git a/migration-unix.c b/migration-unix.c
index 6dc985d..bafb855 100644
--- a/migration-unix.c
+++ b/migration-unix.c
@@ -38,7 +38,7 @@ static int unix_write(FdMigrationState *s, const void * buf, size_t size)
     return write(s->fd, buf, size);
 }

-static int unix_close(FdMigrationState *s)
+static int unix_close(FdMigrationState *s, bool flush)
 {
     DPRINTF("unix_close\n");
     if (s->fd != -1) {
@@ -168,7 +168,7 @@ static void unix_accept_incoming_migration(void *opaque)
     }

     process_incoming_migration(f);
-    qemu_fclose(f);
+    qemu_fclose(f, false);
 out:
     qemu_set_fd_handler2(s, NULL, NULL, NULL, NULL);
     close(s);
diff --git a/migration.c b/migration.c
index 9a93e3b..15d001e 100644
--- a/migration.c
+++ b/migration.c
@@ -288,7 +288,7 @@ int migrate_fd_cleanup(FdMigrationState *s)

     if (s->file) {
         DPRINTF("closing file\n");
-        if (qemu_fclose(s->file) != 0) {
+        if (qemu_fclose(s->file, s->state == MIG_STATE_ACTIVE) != 0) {
             ret = -1;
         }
         s->file = NULL;
@@ -449,7 +449,7 @@ void migrate_fd_wait_for_unfreeze(void *opaque)
     } while (ret == -1 && (s->get_error(s)) == EINTR);
 }

-int migrate_fd_close(void *opaque)
+int migrate_fd_close(void *opaque, bool flush)
 {
     FdMigrationState *s = opaque;

@@ -457,7 +457,7 @@ int migrate_fd_close(void *opaque)
         monitor_resume(s->mon);
     }
     qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
-    return s->close(s);
+    return s->close(s, flush);
 }

 void add_migration_state_change_notifier(Notifier *notify)
diff --git a/migration.h b/migration.h
index 050c56c..27a1f25 100644
--- a/migration.h
+++ b/migration.h
@@ -46,7 +46,7 @@ struct FdMigrationState
     Monitor *mon;
     int state;
     int (*get_error)(struct FdMigrationState*);
-    int (*close)(struct FdMigrationState*);
+    int (*close)(struct FdMigrationState*, bool flush);
     int (*write)(struct FdMigrationState*, const void *, size_t);
     void *opaque;
 };
@@ -128,7 +128,7 @@ void migrate_fd_release(MigrationState *mig_state);

 void migrate_fd_wait_for_unfreeze(void *opaque);

-int migrate_fd_close(void *opaque);
+int migrate_fd_close(void *opaque, bool flush);

 static inline FdMigrationState *migrate_to_fms(MigrationState *mig_state)
 {
diff --git a/savevm.c b/savevm.c
index 1feaa70..a793137 100644
--- a/savevm.c
+++ b/savevm.c
@@ -203,7 +203,7 @@ static int socket_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
     return len;
 }

-static int socket_close(void *opaque)
+static int socket_close(void *opaque, bool flush)
 {
     QEMUFileSocket *s = opaque;
     g_free(s);
@@ -229,7 +229,7 @@ static int stdio_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
     return bytes;
 }

-static int stdio_pclose(void *opaque)
+static int stdio_pclose(void *opaque, bool flush)
 {
     QEMUFileStdio *s = opaque;
     int ret;
@@ -238,7 +238,7 @@ static int stdio_pclose(void *opaque)
     return ret;
 }

-static int stdio_fclose(void *opaque)
+static int stdio_fclose(void *opaque, bool flush)
 {
     QEMUFileStdio *s = opaque;
     fclose(s->stdio_file);
@@ -389,7 +389,7 @@ static int block_get_buffer(void *opaque, uint8_t *buf, int64_t pos, int size)
     return bdrv_load_vmstate(opaque, buf, pos, size);
 }

-static int bdrv_fclose(void *opaque)
+static int bdrv_fclose(void *opaque, bool flush)
 {
     return 0;
 }
@@ -471,12 +471,14 @@ static void qemu_fill_buffer(QEMUFile *f)
         f->has_error = 1;
 }

-int qemu_fclose(QEMUFile *f)
+int qemu_fclose(QEMUFile *f, bool flush)
 {
     int ret = 0;
-    qemu_fflush(f);
+    if (flush) {
+        qemu_fflush(f);
+    }
     if (f->close)
-        ret = f->close(f->opaque);
+        ret = f->close(f->opaque, flush);
     g_free(f);
     return ret;
 }
@@ -1980,7 +1982,7 @@ void do_savevm(Monitor *mon, const QDict *qdict)
     }
     ret = qemu_savevm_state(mon, f);
     vm_state_size = qemu_ftell(f);
-    qemu_fclose(f);
+    qemu_fclose(f, true);
     if (ret < 0) {
         monitor_printf(mon, "Error %d while writing VM\n", ret);
         goto the_end;
@@ -2077,7 +2079,7 @@ int load_vmstate(const char *name)
     qemu_system_reset(VMRESET_SILENT);
     ret = qemu_loadvm_state(f);

-    qemu_fclose(f);
+    qemu_fclose(f, false);
     if (ret < 0) {
         error_report("Error %d while loading VM state", ret);
         return ret;
-- 
1.7.6.2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 3/7] migration: Check that migration is active before cancel it
  2011-09-20 13:24 [Qemu-devel] [PATCH 0/7] Handle errors during migration Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 1/7] migration: simplify state assignmente Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors Juan Quintela
@ 2011-09-20 13:24 ` Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 4/7] savevm: avoid qemu_savevm_state_iterate() to return 1 when qemu file has error Juan Quintela
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 13:24 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/migration.c b/migration.c
index 15d001e..c56d29c 100644
--- a/migration.c
+++ b/migration.c
@@ -132,9 +132,9 @@ int do_migrate_cancel(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
     MigrationState *s = current_migration;

-    if (s)
+    if (s && s->get_status(s) == MIG_STATE_ACTIVE) {
         s->cancel(s);
-
+    }
     return 0;
 }

-- 
1.7.6.2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 4/7] savevm: avoid qemu_savevm_state_iterate() to return 1 when qemu file has error.
  2011-09-20 13:24 [Qemu-devel] [PATCH 0/7] Handle errors during migration Juan Quintela
                   ` (2 preceding siblings ...)
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 3/7] migration: Check that migration is active before cancel it Juan Quintela
@ 2011-09-20 13:24 ` Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 5/7] migration: add error handling to migrate_fd_put_notify() Juan Quintela
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 13:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: Yoshiaki Tamura

From: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>

When qemu on the receiver gets killed during live migration, if debug
is turned on, migrate_fd_put_ready() says,

migration: done iterating

and proceeds.  The reason was qemu_savevm_state_iterate() returning 1
even when qemu file has error.  This patch checks
qemu_file_has_error() before returning 1/0, and avoids
migrate_fd_put_ready() to proceed in case of error.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 savevm.c |    7 ++++---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/savevm.c b/savevm.c
index a793137..af0d0e7 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1531,14 +1531,15 @@ int qemu_savevm_state_iterate(Monitor *mon, QEMUFile *f)
         }
     }

-    if (ret)
-        return 1;
-
     if (qemu_file_has_error(f)) {
         qemu_savevm_state_cancel(mon, f);
         return -EIO;
     }

+    if (ret) {
+        return 1;
+    }
+
     return 0;
 }

-- 
1.7.6.2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 5/7] migration: add error handling to migrate_fd_put_notify().
  2011-09-20 13:24 [Qemu-devel] [PATCH 0/7] Handle errors during migration Juan Quintela
                   ` (3 preceding siblings ...)
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 4/7] savevm: avoid qemu_savevm_state_iterate() to return 1 when qemu file has error Juan Quintela
@ 2011-09-20 13:24 ` Juan Quintela
  2011-09-20 15:49   ` Paolo Bonzini
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 6/7] migration: If there is one error, it makes no sense to continue Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 7/7] migration: qemu_savevm_iterate has three return values Juan Quintela
  6 siblings, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 13:24 UTC (permalink / raw)
  To: qemu-devel; +Cc: Yoshiaki Tamura

From: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>

Although migrate_fd_put_buffer() sets MIG_STATE_ERROR if it failed,
since migrate_fd_put_notify() isn't checking error of underlying
QEMUFile, those resources are kept open.  This patch checks it and
calls migrate_fd_error() in case of error.

Signed-off-by: Yoshiaki Tamura <tamura.yoshiaki@lab.ntt.co.jp>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/migration.c b/migration.c
index c56d29c..7f8928a 100644
--- a/migration.c
+++ b/migration.c
@@ -312,6 +312,9 @@ void migrate_fd_put_notify(void *opaque)

     qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
     qemu_file_put_notify(s->file);
+    if (qemu_file_has_error(s->file)) {
+        migrate_fd_error(s);
+    }
 }

 ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
@@ -328,9 +331,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)

     if (ret == -EAGAIN) {
         qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s);
-    } else if (ret < 0) {
-        s->state = MIG_STATE_ERROR;
-        notifier_list_notify(&migration_state_notifiers, NULL);
     }

     return ret;
-- 
1.7.6.2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 6/7] migration: If there is one error, it makes no sense to continue
  2011-09-20 13:24 [Qemu-devel] [PATCH 0/7] Handle errors during migration Juan Quintela
                   ` (4 preceding siblings ...)
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 5/7] migration: add error handling to migrate_fd_put_notify() Juan Quintela
@ 2011-09-20 13:24 ` Juan Quintela
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 7/7] migration: qemu_savevm_iterate has three return values Juan Quintela
  6 siblings, 0 replies; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 13:24 UTC (permalink / raw)
  To: qemu-devel


Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 5ba3d19..10d14f9 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -197,7 +197,7 @@ static int buffered_rate_limit(void *opaque)
     QEMUFileBuffered *s = opaque;

     if (s->has_error)
-        return 0;
+        return -1;

     if (s->freeze_output)
         return 1;
-- 
1.7.6.2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [Qemu-devel] [PATCH 7/7] migration: qemu_savevm_iterate has three return values
  2011-09-20 13:24 [Qemu-devel] [PATCH 0/7] Handle errors during migration Juan Quintela
                   ` (5 preceding siblings ...)
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 6/7] migration: If there is one error, it makes no sense to continue Juan Quintela
@ 2011-09-20 13:24 ` Juan Quintela
  2011-09-22  9:50   ` Wayne Xia
  6 siblings, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 13:24 UTC (permalink / raw)
  To: qemu-devel

We were retrying when there was one error, entering a loop.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/migration.c b/migration.c
index 7f8928a..0baed23 100644
--- a/migration.c
+++ b/migration.c
@@ -362,6 +362,7 @@ void migrate_fd_connect(FdMigrationState *s)
 void migrate_fd_put_ready(void *opaque)
 {
     FdMigrationState *s = opaque;
+    int ret;

     if (s->state != MIG_STATE_ACTIVE) {
         DPRINTF("put_ready returning because of non-active state\n");
@@ -369,7 +370,10 @@ void migrate_fd_put_ready(void *opaque)
     }

     DPRINTF("iterate\n");
-    if (qemu_savevm_state_iterate(s->mon, s->file) == 1) {
+    ret = qemu_savevm_state_iterate(s->mon, s->file);
+    if (ret == -1) {
+        migrate_fd_error(s);
+    } else if (ret == 1) {
         int old_vm_running = vm_running;

         DPRINTF("done iterating\n");
-- 
1.7.6.2

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors Juan Quintela
@ 2011-09-20 14:25   ` Daniel P. Berrange
  2011-09-20 14:47     ` Juan Quintela
  2011-09-20 15:48   ` Paolo Bonzini
  1 sibling, 1 reply; 13+ messages in thread
From: Daniel P. Berrange @ 2011-09-20 14:25 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

On Tue, Sep 20, 2011 at 03:24:41PM +0200, Juan Quintela wrote:
> If we have one error while migrating, and then we issuse a
> "migrate_cancel" command, guest hang.  Fix it for flushing only when
> migration is in MIG_STATE_ACTIVE.  In case of error of cancellation,
> don't flush.
> 
> We had an infinite loop at buffered_close()
> 
>         while (!s->has_error && s->buffer_size) {
>             buffered_flush(s);
>             if (s->freeze_output)
>                 s->wait_for_unfreeze(s);
>         }
> 
> There was no errors, there were things to send, and connection was
> broken.  send() returns -EAGAIN, so we freezed output, but we
> unfreeze_output and try again.

I posted a couple of alternative approaches to fixing this
hang problem

http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg03248.html

My second approach of checking the migration state in migrate_fd_put_buffer()
seems like it would be worthwhile, even with your patch as an additional
safety net against bad code.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors
  2011-09-20 14:25   ` Daniel P. Berrange
@ 2011-09-20 14:47     ` Juan Quintela
  0 siblings, 0 replies; 13+ messages in thread
From: Juan Quintela @ 2011-09-20 14:47 UTC (permalink / raw)
  To: Daniel P. Berrange; +Cc: qemu-devel

"Daniel P. Berrange" <berrange@redhat.com> wrote:
> On Tue, Sep 20, 2011 at 03:24:41PM +0200, Juan Quintela wrote:
>> If we have one error while migrating, and then we issuse a
>> "migrate_cancel" command, guest hang.  Fix it for flushing only when
>> migration is in MIG_STATE_ACTIVE.  In case of error of cancellation,
>> don't flush.
>> 
>> We had an infinite loop at buffered_close()
>> 
>>         while (!s->has_error && s->buffer_size) {
>>             buffered_flush(s);
>>             if (s->freeze_output)
>>                 s->wait_for_unfreeze(s);
>>         }
>> 
>> There was no errors, there were things to send, and connection was
>> broken.  send() returns -EAGAIN, so we freezed output, but we
>> unfreeze_output and try again.
>
> I posted a couple of alternative approaches to fixing this
> hang problem
>
> http://lists.nongnu.org/archive/html/qemu-devel/2011-08/msg03248.html
>
> My second approach of checking the migration state in migrate_fd_put_buffer()
> seems like it would be worthwhile, even with your patch as an additional
> safety net against bad code.

We can add that there, but in my tests, the s->write() was returning
correctly an error (or -EAGAIN).  The problem was that we were not
exiting when we didn't needed to.

I agree that we can have *both* tests.  I will add your patch to my
series.

Thanks for the fast review.

Later, Juan.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors Juan Quintela
  2011-09-20 14:25   ` Daniel P. Berrange
@ 2011-09-20 15:48   ` Paolo Bonzini
  1 sibling, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2011-09-20 15:48 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

On 09/20/2011 03:24 PM, Juan Quintela wrote:
> If we have one error while migrating, and then we issuse a
> "migrate_cancel" command, guest hang.  Fix it for flushing only when
> migration is in MIG_STATE_ACTIVE.  In case of error of cancellation,
> don't flush.
>
> We had an infinite loop at buffered_close()
>
>          while (!s->has_error && s->buffer_size) {
>              buffered_flush(s);
>              if (s->freeze_output)
>                  s->wait_for_unfreeze(s);
>          }
>
> There was no errors, there were things to send, and connection was
> broken.  send() returns -EAGAIN, so we freezed output, but we
> unfreeze_output and try again.
>
> Signed-off-by: Juan Quintela<quintela@redhat.com>

I don't like the idea of adding an extra argument to fix a bug 
elsewhere.  I don't consider this even a safety net, since it is relying 
anyway on the migration code setting the new argument correctly:

diff --git a/migration.c b/migration.c
index 9a93e3b..15d001e 100644
--- a/migration.c
+++ b/migration.c
@@ -288,7 +288,7 @@ int migrate_fd_cleanup(FdMigrationState *s)

      if (s->file) {
          DPRINTF("closing file\n");
-        if (qemu_fclose(s->file) != 0) {
+        if (qemu_fclose(s->file, s->state == MIG_STATE_ACTIVE) != 0) {
              ret = -1;
          }
          s->file = NULL;

Dan's patch is the right fix, this one can be dropped altogether.

If anything, you may consider making wait_for_unfreeze return an error 
code, and set has_error when wait_for_unfreeze returns an error. 
Alternatively, merge QEMUFile and BufferedFile's has_error flags, and 
use qemu_file_set_error when migration is canceled.

Paolo

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 5/7] migration: add error handling to migrate_fd_put_notify().
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 5/7] migration: add error handling to migrate_fd_put_notify() Juan Quintela
@ 2011-09-20 15:49   ` Paolo Bonzini
  0 siblings, 0 replies; 13+ messages in thread
From: Paolo Bonzini @ 2011-09-20 15:49 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel, Yoshiaki Tamura

On 09/20/2011 03:24 PM, Juan Quintela wrote:
>
> diff --git a/migration.c b/migration.c
> index c56d29c..7f8928a 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -312,6 +312,9 @@ void migrate_fd_put_notify(void *opaque)
>
>       qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
>       qemu_file_put_notify(s->file);
> +    if (qemu_file_has_error(s->file)) {
> +        migrate_fd_error(s);
> +    }

Note this is testing QEMUFile's has_error, not QEMUBufferedFile's. 
Merging the two flags would be a good idea.

Paolo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [Qemu-devel] [PATCH 7/7] migration: qemu_savevm_iterate has three return values
  2011-09-20 13:24 ` [Qemu-devel] [PATCH 7/7] migration: qemu_savevm_iterate has three return values Juan Quintela
@ 2011-09-22  9:50   ` Wayne Xia
  0 siblings, 0 replies; 13+ messages in thread
From: Wayne Xia @ 2011-09-22  9:50 UTC (permalink / raw)
  To: qemu-devel

于 2011-9-20 21:24, Juan Quintela 写道:
> We were retrying when there was one error, entering a loop.
> 
> Signed-off-by: Juan Quintela<quintela@redhat.com>
> ---
>   migration.c |    6 +++++-
>   1 files changed, 5 insertions(+), 1 deletions(-)
> 
> diff --git a/migration.c b/migration.c
> index 7f8928a..0baed23 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -362,6 +362,7 @@ void migrate_fd_connect(FdMigrationState *s)
>   void migrate_fd_put_ready(void *opaque)
>   {
>       FdMigrationState *s = opaque;
> +    int ret;
> 
>       if (s->state != MIG_STATE_ACTIVE) {
>           DPRINTF("put_ready returning because of non-active state\n");
> @@ -369,7 +370,10 @@ void migrate_fd_put_ready(void *opaque)
>       }
> 
>       DPRINTF("iterate\n");
> -    if (qemu_savevm_state_iterate(s->mon, s->file) == 1) {
> +    ret = qemu_savevm_state_iterate(s->mon, s->file);
> +    if (ret == -1) {
> +        migrate_fd_error(s);
> +    } else if (ret == 1) {
>           int old_vm_running = vm_running;
> 
>           DPRINTF("done iterating\n");
Maybe macro could be used in the situation that more than 2 possible
types of value may return.


-- 
Best Regards

Wayne Xia
mail:xiawenc@linux.vnet.ibm.com
tel:86-010-82450803

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2011-09-22  9:51 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-09-20 13:24 [Qemu-devel] [PATCH 0/7] Handle errors during migration Juan Quintela
2011-09-20 13:24 ` [Qemu-devel] [PATCH 1/7] migration: simplify state assignmente Juan Quintela
2011-09-20 13:24 ` [Qemu-devel] [PATCH 2/7] migration: only flush when there are no errors Juan Quintela
2011-09-20 14:25   ` Daniel P. Berrange
2011-09-20 14:47     ` Juan Quintela
2011-09-20 15:48   ` Paolo Bonzini
2011-09-20 13:24 ` [Qemu-devel] [PATCH 3/7] migration: Check that migration is active before cancel it Juan Quintela
2011-09-20 13:24 ` [Qemu-devel] [PATCH 4/7] savevm: avoid qemu_savevm_state_iterate() to return 1 when qemu file has error Juan Quintela
2011-09-20 13:24 ` [Qemu-devel] [PATCH 5/7] migration: add error handling to migrate_fd_put_notify() Juan Quintela
2011-09-20 15:49   ` Paolo Bonzini
2011-09-20 13:24 ` [Qemu-devel] [PATCH 6/7] migration: If there is one error, it makes no sense to continue Juan Quintela
2011-09-20 13:24 ` [Qemu-devel] [PATCH 7/7] migration: qemu_savevm_iterate has three return values Juan Quintela
2011-09-22  9:50   ` Wayne Xia

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).