[Qemu-devel] [RFC 00/14] Migration thread

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [RFC 00/14] Migration thread
@ 2012-09-21 14:08 Juan Quintela
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 01/14] split MRU ram list Juan Quintela
                   ` (13 more replies)
  0 siblings, 14 replies; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

Hi

This is work in progress on top of the previous migration series just sent.

- Introduces a thread for migration instead of using a timer and callback
- remove the writting to the fd from the iothread lock
- make the writes synchronous
- Introduce a new pending method that returns how many bytes are pending for
  one save live section
- last patch just shows printfs to see where the time is being spent
  on the migration complete phase.
  (yes it pollutes all uses of stop on the monitor)

So far I have found that we spent a lot of time on bdrv_flush_all() It
can take from 1ms to 600ms (yes, it is not a typo).  That dwarfs the
migration default downtime time (30ms).

Stop all vcpus:

- it works now (after the changes on qemu_cpu_is_vcpu on the previous
  series) caveat is that the time that brdv_flush_all() takes is
  "unpredictable".  Any silver bullets?

  Paolo suggested to call for migration completion phase:

  bdrv_aio_flush_all();
  Sent the dirty pages;
  bdrv_drain_all()
  brdv_flush_all()
  another round through the bitmap in case that completions have
  changed some page

  Paolo, did I get it right?
  Any other suggestion?

- migrate_cancel() is not properly implemented (as in the film that we
  take no locks, ...)

- expected_downtime is not calculated.

  I am about to merge migrate_fd_put_ready & buffered_thread() and
  that would make trivial to calculate.

It outputs something like:

wakeup_request 0
time cpu_disable_ticks 0
time pause_all_vcpus 1
time runstate_set 1
time vmstate_notify 2
time bdrv_drain_all 2
time flush device /dev/disk/by-path/ip-192.168.10.200:3260-iscsi-iqn.2010-12.org.trasno:iscsi.lvm-lun-1: 3
time flush device : 3
time flush device : 3
time flush device : 3
time bdrv_flush_all 5
time monitor_protocol_event 5
vm_stop 2 5
synchronize_all_states 1
migrate RAM 37
migrate rest devices 1
complete without error 3a 44
completed 45
end completed stage 45

As you can see, we estimate that we can sent all pending data in 30ms,
it took 37ms to send the RAM (that is what we calculate).  So
estimation is quite good.

What it gives me lots of variation is on the line with device name of "time flush device".
That is what varies between 1ms to 600ms

This is in a completely idle guest.  I am running:

	while (1) {
		uint64_t delay;

		if (gettimeofday(&t0, NULL) != 0)
			perror("gettimeofday 1");
		if (usleep(ms2us(10)) != 0)
			perror("usleep");
		if (gettimeofday(&t1, NULL) != 0)
			perror("gettimeofday 2");

		t1.tv_usec -= t0.tv_usec;
		if (t1.tv_usec < 0) {
			t1.tv_usec += 1000000;
			t1.tv_sec--;
		}
		t1.tv_sec -= t0.tv_sec;

		delay = t1.tv_sec * 1000 + t1.tv_usec/1000;

		if (delay > 100)
			printf("delay of %ld ms\n", delay);
       }

To see the latency inside the guest (i.e. ask for a 10ms sleep, and see how long it takes).


[root@d1 ~]# ./timer 
delay of 161 ms
delay of 135 ms
delay of 143 ms
delay of 132 ms
delay of 131 ms
delay of 141 ms
delay of 113 ms
delay of 119 ms
delay of 114 ms


But that values are independent of migration.  Without even starting
the migration, idle guest doing nothing, we get it sometimes.

Comments?



Thanks, Juan.


The following changes since commit 4bce0b88b10ed790ad3669ce4ff61c945cd655eb:

  cpus: create qemu_cpu_is_vcpu() (2012-09-21 10:43:10 +0200)

are available in the git repository at:

  http://repo.or.cz/r/qemu/quintela.git migration-thread-v3

for you to fetch changes up to 0e0f8dfd9fc308b790e55ceca5c2c193e1802417:

  migration: print times for end phase (2012-09-21 11:52:20 +0200)

Juan Quintela (11):
  buffered_file: Move from using a timer to use a thread
  migration: make qemu_fopen_ops_buffered() return void
  migration: stop all cpus correctly
  migration: make writes blocking
  migration: remove unfreeze logic
  migration: take finer locking
  buffered_file: Unfold the trick to restart generating migration data
  buffered_file: don't flush on put buffer
  buffered_file: unfold buffered_append in buffered_put_buffer
  savevm: New save live migration method: pending
  migration: print times for end phase

Paolo Bonzini (1):
  split MRU ram list

Umesh Deshpande (2):
  add a version number to ram_list
  protect the ramlist with a separate mutex

 arch_init.c       |  62 +++++++++++++-------------
 block-migration.c |  49 +++++---------------
 block.c           |   6 +++
 buffered_file.c   | 130 +++++++++++++++++++++++++-----------------------------
 buffered_file.h   |   2 +-
 cpu-all.h         |  13 +++++-
 cpus.c            |  17 +++++++
 exec.c            |  43 +++++++++++++++---
 migration-exec.c  |   2 -
 migration-fd.c    |   6 ---
 migration-tcp.c   |   2 +-
 migration-unix.c  |   2 -
 migration.c       | 108 +++++++++++++++++++--------------------------
 migration.h       |   4 +-
 qemu-file.h       |   5 ---
 savevm.c          |  37 +++++++++++++---
 sysemu.h          |   1 +
 vmstate.h         |   1 +
 18 files changed, 255 insertions(+), 235 deletions(-)

-- 
1.7.11.4

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 01/14] split MRU ram list
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 14:38   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 02/14] add a version number to ram_list Juan Quintela
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini

From: Paolo Bonzini <pbonzini@redhat.com>

Outside the execution threads the normal, non-MRU-ized order of
the RAM blocks should always be enough.  So manage two separate
lists, which will have separate locking rules.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 cpu-all.h |  4 +++-
 exec.c    | 16 +++++++++++-----
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index db25f73..3e67019 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -487,8 +487,9 @@ typedef struct RAMBlock {
     ram_addr_t offset;
     ram_addr_t length;
     uint32_t flags;
-    char idstr[256];
+    QLIST_ENTRY(RAMBlock) next_mru;
     QLIST_ENTRY(RAMBlock) next;
+    char idstr[256];
 #if defined(__linux__) && !defined(TARGET_S390X)
     int fd;
 #endif
@@ -496,6 +497,7 @@ typedef struct RAMBlock {

 typedef struct RAMList {
     uint8_t *phys_dirty;
+    QLIST_HEAD(, RAMBlock) blocks_mru;
     QLIST_HEAD(, RAMBlock) blocks;
 } RAMList;
 extern RAMList ram_list;
diff --git a/exec.c b/exec.c
index ad2cc2e..0db2beb 100644
--- a/exec.c
+++ b/exec.c
@@ -112,7 +112,10 @@ static uint8_t *code_gen_ptr;
 int phys_ram_fd;
 static int in_migration;

-RAMList ram_list = { .blocks = QLIST_HEAD_INITIALIZER(ram_list.blocks) };
+RAMList ram_list = {
+    .blocks = QLIST_HEAD_INITIALIZER(ram_list.blocks),
+    .blocks_mru = QLIST_HEAD_INITIALIZER(ram_list.blocks_mru)
+};

 static MemoryRegion *system_memory;
 static MemoryRegion *system_io;
@@ -2578,6 +2581,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     new_block->length = size;

     QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
+    QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);

     ram_list.phys_dirty = g_realloc(ram_list.phys_dirty,
                                        last_ram_offset() >> TARGET_PAGE_BITS);
@@ -2605,6 +2609,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (addr == block->offset) {
             QLIST_REMOVE(block, next);
+            QLIST_REMOVE(block, next_mru);
             g_free(block);
             return;
         }
@@ -2618,6 +2623,7 @@ void qemu_ram_free(ram_addr_t addr)
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (addr == block->offset) {
             QLIST_REMOVE(block, next);
+            QLIST_REMOVE(block, next_mru);
             if (block->flags & RAM_PREALLOC_MASK) {
                 ;
             } else if (mem_path) {
@@ -2723,12 +2729,12 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
 {
     RAMBlock *block;

-    QLIST_FOREACH(block, &ram_list.blocks, next) {
+    QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
         if (addr - block->offset < block->length) {
             /* Move this entry to to start of the list.  */
             if (block != QLIST_FIRST(&ram_list.blocks)) {
-                QLIST_REMOVE(block, next);
-                QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
+                QLIST_REMOVE(block, next_mru);
+                QLIST_INSERT_HEAD(&ram_list.blocks_mru, block, next_mru);
             }
             if (xen_enabled()) {
                 /* We need to check if the requested address is in the RAM
@@ -2823,7 +2829,7 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
         return 0;
     }

-    QLIST_FOREACH(block, &ram_list.blocks, next) {
+    QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
         /* This case append when the block is not mapped. */
         if (block->host == NULL) {
             continue;
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 02/14] add a version number to ram_list
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 01/14] split MRU ram list Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 14:42   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 03/14] protect the ramlist with a separate mutex Juan Quintela
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Umesh Deshpande

From: Umesh Deshpande <udeshpan@redhat.com>

This will be used to detect if last_block might have become invalid
across different calls to ram_save_live.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c | 7 ++++++-
 cpu-all.h   | 1 +
 exec.c      | 4 ++++
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/arch_init.c b/arch_init.c
index d96e888..eb33fdd 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -336,6 +336,7 @@ static RAMBlock *last_block;
 static ram_addr_t last_offset;
 static unsigned long *migration_bitmap;
 static uint64_t migration_dirty_pages;
+static uint32_t last_version;

 static inline bool migration_bitmap_test_and_reset_dirty(MemoryRegion *mr,
                                                          ram_addr_t offset)
@@ -406,7 +407,6 @@ static void migration_bitmap_sync(void)
     }
 }

-
 /*
  * ram_save_block: Writes a page of memory to the stream f
  *
@@ -558,6 +558,7 @@ static void reset_ram_globals(void)
 {
     last_block = NULL;
     last_offset = 0;
+    last_version = ram_list.version;
     sort_ram_list();
 }

@@ -613,6 +614,10 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     uint64_t expected_downtime;
     MigrationState *s = migrate_get_current();

+    if (ram_list.version != last_version) {
+        reset_ram_globals();
+    }
+
     bytes_transferred_last = bytes_transferred;
     bwidth = qemu_get_clock_ns(rt_clock);

diff --git a/cpu-all.h b/cpu-all.h
index 3e67019..6576229 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -497,6 +497,7 @@ typedef struct RAMBlock {

 typedef struct RAMList {
     uint8_t *phys_dirty;
+    uint32_t version;
     QLIST_HEAD(, RAMBlock) blocks_mru;
     QLIST_HEAD(, RAMBlock) blocks;
 } RAMList;
diff --git a/exec.c b/exec.c
index 0db2beb..e9d1509 100644
--- a/exec.c
+++ b/exec.c
@@ -2583,6 +2583,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
     QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);

+    ram_list.version++;
+
     ram_list.phys_dirty = g_realloc(ram_list.phys_dirty,
                                        last_ram_offset() >> TARGET_PAGE_BITS);
     memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
@@ -2610,6 +2612,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
         if (addr == block->offset) {
             QLIST_REMOVE(block, next);
             QLIST_REMOVE(block, next_mru);
+            ram_list.version++;
             g_free(block);
             return;
         }
@@ -2624,6 +2627,7 @@ void qemu_ram_free(ram_addr_t addr)
         if (addr == block->offset) {
             QLIST_REMOVE(block, next);
             QLIST_REMOVE(block, next_mru);
+            ram_list.version++;
             if (block->flags & RAM_PREALLOC_MASK) {
                 ;
             } else if (mem_path) {
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 03/14] protect the ramlist with a separate mutex
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 01/14] split MRU ram list Juan Quintela
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 02/14] add a version number to ram_list Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 14:44   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 04/14] buffered_file: Move from using a timer to use a thread Juan Quintela
                   ` (10 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Umesh Deshpande

From: Umesh Deshpande <udeshpan@redhat.com>

Add the new mutex that protects shared state between ram_save_live
and the iothread.  If the iothread mutex has to be taken together
with the ramlist mutex, the iothread shall always be _outside_.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c |  9 ++++++++-
 cpu-all.h   |  8 ++++++++
 exec.c      | 23 +++++++++++++++++++++--
 3 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index eb33fdd..0d963b4 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -553,7 +553,6 @@ static void ram_migration_cancel(void *opaque)
     migration_end();
 }

-
 static void reset_ram_globals(void)
 {
     last_block = NULL;
@@ -573,6 +572,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
     bitmap_set(migration_bitmap, 1, ram_pages);
     migration_dirty_pages = ram_pages;

+    qemu_mutex_lock_ramlist();
     migration_bitmap_sync();
     bytes_transferred = 0;
     reset_ram_globals();
@@ -600,6 +600,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
         qemu_put_be64(f, block->length);
     }

+    qemu_mutex_unlock_ramlist();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);

     return 0;
@@ -614,6 +615,8 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
     uint64_t expected_downtime;
     MigrationState *s = migrate_get_current();

+    qemu_mutex_lock_ramlist();
+
     if (ram_list.version != last_version) {
         reset_ram_globals();
     }
@@ -662,6 +665,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         bwidth = 0.000001;
     }

+    qemu_mutex_unlock_ramlist();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);

     expected_downtime = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
@@ -682,6 +686,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
 {
     migration_bitmap_sync();

+    qemu_mutex_lock_ramlist();
+
     /* try transferring iterative blocks of memory */

     /* flush all remaining blocks regardless of rate limiting */
@@ -697,6 +703,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     }
     memory_global_dirty_log_stop();

+    qemu_mutex_unlock_ramlist();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);

     g_free(migration_bitmap);
diff --git a/cpu-all.h b/cpu-all.h
index 6576229..2b0a640 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -22,6 +22,7 @@
 #include "qemu-common.h"
 #include "qemu-tls.h"
 #include "cpu-common.h"
+#include "qemu-thread.h"

 /* some important defines:
  *
@@ -487,7 +488,9 @@ typedef struct RAMBlock {
     ram_addr_t offset;
     ram_addr_t length;
     uint32_t flags;
+    /* Protected by the iothread lock.  */
     QLIST_ENTRY(RAMBlock) next_mru;
+    /* Protected by the ramlist lock.  */
     QLIST_ENTRY(RAMBlock) next;
     char idstr[256];
 #if defined(__linux__) && !defined(TARGET_S390X)
@@ -496,9 +499,12 @@ typedef struct RAMBlock {
 } RAMBlock;

 typedef struct RAMList {
+    QemuMutex mutex;
+    /* Protected by the iothread lock.  */
     uint8_t *phys_dirty;
     uint32_t version;
     QLIST_HEAD(, RAMBlock) blocks_mru;
+    /* Protected by the ramlist lock.  */
     QLIST_HEAD(, RAMBlock) blocks;
 } RAMList;
 extern RAMList ram_list;
@@ -520,6 +526,8 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
 #endif /* !CONFIG_USER_ONLY */

 ram_addr_t last_ram_offset(void);
+void qemu_mutex_lock_ramlist(void);
+void qemu_mutex_unlock_ramlist(void);

 int cpu_memory_rw_debug(CPUArchState *env, target_ulong addr,
                         uint8_t *buf, int len, int is_write);
diff --git a/exec.c b/exec.c
index e9d1509..3a8a4dd 100644
--- a/exec.c
+++ b/exec.c
@@ -637,6 +637,7 @@ bool tcg_enabled(void)

 void cpu_exec_init_all(void)
 {
+    qemu_mutex_init(&ram_list.mutex);
 #if !defined(CONFIG_USER_ONLY)
     memory_map_init();
     io_mem_init();
@@ -2364,6 +2365,16 @@ static long gethugepagesize(const char *path)
     return fs.f_bsize;
 }

+void qemu_mutex_lock_ramlist(void)
+{
+    qemu_mutex_lock(&ram_list.mutex);
+}
+
+void qemu_mutex_unlock_ramlist(void)
+{
+    qemu_mutex_unlock(&ram_list.mutex);
+}
+
 static void *file_ram_alloc(RAMBlock *block,
                             ram_addr_t memory,
                             const char *path)
@@ -2519,6 +2530,7 @@ void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev)
     }
     pstrcat(new_block->idstr, sizeof(new_block->idstr), name);

+    qemu_mutex_lock_ramlist();
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (block != new_block && !strcmp(block->idstr, new_block->idstr)) {
             fprintf(stderr, "RAMBlock \"%s\" already registered, abort!\n",
@@ -2526,6 +2538,7 @@ void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev)
             abort();
         }
     }
+    qemu_mutex_unlock_ramlist();
 }

 static int memory_try_enable_merging(void *addr, size_t len)
@@ -2549,6 +2562,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     size = TARGET_PAGE_ALIGN(size);
     new_block = g_malloc0(sizeof(*new_block));

+    qemu_mutex_lock_ramlist();
     new_block->mr = mr;
     new_block->offset = find_ram_offset(size);
     if (host) {
@@ -2584,6 +2598,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
     QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);

     ram_list.version++;
+    qemu_mutex_unlock_ramlist();

     ram_list.phys_dirty = g_realloc(ram_list.phys_dirty,
                                        last_ram_offset() >> TARGET_PAGE_BITS);
@@ -2608,21 +2623,24 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
 {
     RAMBlock *block;

+    qemu_mutex_lock_ramlist();
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (addr == block->offset) {
             QLIST_REMOVE(block, next);
             QLIST_REMOVE(block, next_mru);
             ram_list.version++;
             g_free(block);
-            return;
+            break;
         }
     }
+    qemu_mutex_unlock_ramlist();
 }

 void qemu_ram_free(ram_addr_t addr)
 {
     RAMBlock *block;

+    qemu_mutex_lock_ramlist();
     QLIST_FOREACH(block, &ram_list.blocks, next) {
         if (addr == block->offset) {
             QLIST_REMOVE(block, next);
@@ -2653,9 +2671,10 @@ void qemu_ram_free(ram_addr_t addr)
 #endif
             }
             g_free(block);
-            return;
+            break;
         }
     }
+    qemu_mutex_unlock_ramlist();

 }

-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 04/14] buffered_file: Move from using a timer to use a thread
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (2 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 03/14] protect the ramlist with a separate mutex Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 15:29   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 05/14] migration: make qemu_fopen_ops_buffered() return void Juan Quintela
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

We still protect everything except the wait with the iothread lock.
But we moved from a timer to a thread.  Steps one by one.

We also need to detect when we have finished with a variable "complete".

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c | 58 +++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 36 insertions(+), 22 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 318d0f0..61985a9 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -18,6 +18,7 @@
 #include "qemu-timer.h"
 #include "qemu-char.h"
 #include "buffered_file.h"
+#include "qemu-thread.h"

 //#define DEBUG_BUFFERED_FILE

@@ -31,7 +32,8 @@ typedef struct QEMUFileBuffered
     uint8_t *buffer;
     size_t buffer_size;
     size_t buffer_capacity;
-    QEMUTimer *timer;
+    QemuThread thread;
+    bool complete;
 } QEMUFileBuffered;

 #ifdef DEBUG_BUFFERED_FILE
@@ -159,11 +161,8 @@ static int buffered_close(void *opaque)
     if (ret >= 0) {
         ret = ret2;
     }
-    qemu_del_timer(s->timer);
-    qemu_free_timer(s->timer);
-    g_free(s->buffer);
-    g_free(s);
-
+    ret = migrate_fd_close(s->migration_state);
+    s->complete = true;
     return ret;
 }

@@ -214,23 +213,38 @@ static int64_t buffered_get_rate_limit(void *opaque)
     return s->xfer_limit;
 }

-static void buffered_rate_tick(void *opaque)
+/* 10ms  xfer_limit is the limit that we should write each 10ms */
+#define BUFFER_DELAY 100
+
+static void *buffered_file_thread(void *opaque)
 {
     QEMUFileBuffered *s = opaque;
+    int64_t expire_time = qemu_get_clock_ms(rt_clock) + BUFFER_DELAY;

-    if (qemu_file_get_error(s->file)) {
-        buffered_close(s);
-        return;
-    }
-
-    qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
-
-    if (s->freeze_output)
-        return;
-
-    s->bytes_xfer = 0;
+    while (true) {
+        int64_t current_time = qemu_get_clock_ms(rt_clock);

-    buffered_put_buffer(s, NULL, 0, 0);
+        if (s->complete) {
+            break;
+        }
+        if (s->freeze_output) {
+            continue;
+        }
+        if (current_time >= expire_time) {
+            s->bytes_xfer = 0;
+            expire_time = current_time + BUFFER_DELAY;
+        }
+        if (s->bytes_xfer >= s->xfer_limit) {
+            /* usleep expects microseconds */
+            usleep((expire_time - current_time)*1000);
+        }
+        qemu_mutex_lock_iothread();
+        buffered_put_buffer(s, NULL, 0, 0);
+        qemu_mutex_unlock_iothread();
+    }
+    g_free(s->buffer);
+    g_free(s);
+    return NULL;
 }

 QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state)
@@ -241,15 +255,15 @@ QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state)

     s->migration_state = migration_state;
     s->xfer_limit = migration_state->bandwidth_limit / 10;
+    s->complete = false;

     s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
                              buffered_close, buffered_rate_limit,
                              buffered_set_rate_limit,
 			     buffered_get_rate_limit);

-    s->timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
-
-    qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
+    qemu_thread_create(&s->thread, buffered_file_thread, s,
+                       QEMU_THREAD_DETACHED);

     return s->file;
 }
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 05/14] migration: make qemu_fopen_ops_buffered() return void
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (3 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 04/14] buffered_file: Move from using a timer to use a thread Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 14:57   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 06/14] migration: stop all cpus correctly Juan Quintela
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

We want the file assignment to happen before the thread is created to
avoid locking, so we just do it before creating the thread.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c | 13 ++++++-------
 buffered_file.h |  2 +-
 migration.c     |  2 +-
 migration.h     |  1 +
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 61985a9..6c3d057 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -33,7 +33,6 @@ typedef struct QEMUFileBuffered
     size_t buffer_size;
     size_t buffer_capacity;
     QemuThread thread;
-    bool complete;
 } QEMUFileBuffered;

 #ifdef DEBUG_BUFFERED_FILE
@@ -162,7 +161,7 @@ static int buffered_close(void *opaque)
         ret = ret2;
     }
     ret = migrate_fd_close(s->migration_state);
-    s->complete = true;
+    s->migration_state->complete = true;
     return ret;
 }

@@ -224,7 +223,7 @@ static void *buffered_file_thread(void *opaque)
     while (true) {
         int64_t current_time = qemu_get_clock_ms(rt_clock);

-        if (s->complete) {
+        if (s->migration_state->complete) {
             break;
         }
         if (s->freeze_output) {
@@ -247,7 +246,7 @@ static void *buffered_file_thread(void *opaque)
     return NULL;
 }

-QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state)
+void qemu_fopen_ops_buffered(MigrationState *migration_state)
 {
     QEMUFileBuffered *s;

@@ -255,15 +254,15 @@ QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state)

     s->migration_state = migration_state;
     s->xfer_limit = migration_state->bandwidth_limit / 10;
-    s->complete = false;
+    s->migration_state->complete = false;

     s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
                              buffered_close, buffered_rate_limit,
                              buffered_set_rate_limit,
 			     buffered_get_rate_limit);

+    migration_state->file = s->file;
+
     qemu_thread_create(&s->thread, buffered_file_thread, s,
                        QEMU_THREAD_DETACHED);
-
-    return s->file;
 }
diff --git a/buffered_file.h b/buffered_file.h
index ef010fe..8a246fd 100644
--- a/buffered_file.h
+++ b/buffered_file.h
@@ -17,6 +17,6 @@
 #include "hw/hw.h"
 #include "migration.h"

-QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state);
+void qemu_fopen_ops_buffered(MigrationState *migration_state);

 #endif
diff --git a/migration.c b/migration.c
index 2c29d04..550c5c2 100644
--- a/migration.c
+++ b/migration.c
@@ -429,7 +429,7 @@ void migrate_fd_connect(MigrationState *s)
     int ret;

     s->state = MIG_STATE_ACTIVE;
-    s->file = qemu_fopen_ops_buffered(s);
+    qemu_fopen_ops_buffered(s);

     DPRINTF("beginning savevm\n");
     ret = qemu_savevm_state_begin(s->file, &s->params);
diff --git a/migration.h b/migration.h
index 1c3e9b7..a63c5d5 100644
--- a/migration.h
+++ b/migration.h
@@ -45,6 +45,7 @@ struct MigrationState
     int64_t dirty_pages_rate;
     bool enabled_capabilities[MIGRATION_CAPABILITY_MAX];
     int64_t xbzrle_cache_size;
+    bool complete;
 };

 void process_incoming_migration(QEMUFile *f);
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 06/14] migration: stop all cpus correctly
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (4 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 05/14] migration: make qemu_fopen_ops_buffered() return void Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 15:20   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 07/14] migration: make writes blocking Juan Quintela
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

You can only stop all cpus from the iothread or an vcpu.  As we want
to do it from the migration_thread, we need to do this dance with the
botton handlers.

This patch is a request for ideas.  I can move this function to cpus.c, but
wondered if there is an easy way of doing this?

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration.c | 29 +++++++++++++++++------------
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/migration.c b/migration.c
index 550c5c2..ee0519e 100644
--- a/migration.c
+++ b/migration.c
@@ -20,6 +20,7 @@
 #include "sysemu.h"
 #include "block.h"
 #include "qemu_socket.h"
+#include "qemu-thread.h"
 #include "block-migration.h"
 #include "qmp-commands.h"

@@ -320,11 +321,22 @@ ssize_t migrate_fd_put_buffer(MigrationState *s, const void *data,
 void migrate_fd_put_ready(MigrationState *s)
 {
     int ret;
+    static bool first_time = true;

     if (s->state != MIG_STATE_ACTIVE) {
         DPRINTF("put_ready returning because of non-active state\n");
         return;
     }
+    if (first_time) {
+        first_time = false;
+        DPRINTF("beginning savevm\n");
+        ret = qemu_savevm_state_begin(s->file, &s->params);
+        if (ret < 0) {
+            DPRINTF("failed, %d\n", ret);
+            migrate_fd_error(s);
+            return;
+        }
+    }

     DPRINTF("iterate\n");
     ret = qemu_savevm_state_iterate(s->file);
@@ -337,7 +349,11 @@ void migrate_fd_put_ready(MigrationState *s)
         DPRINTF("done iterating\n");
         start_time = qemu_get_clock_ms(rt_clock);
         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
-        vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
+        if (old_vm_running) {
+            vm_stop(RUN_STATE_FINISH_MIGRATE);
+        } else {
+            vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
+        }

         if (qemu_savevm_state_complete(s->file) < 0) {
             migrate_fd_error(s);
@@ -426,19 +442,8 @@ bool migration_has_failed(MigrationState *s)

 void migrate_fd_connect(MigrationState *s)
 {
-    int ret;
-
     s->state = MIG_STATE_ACTIVE;
     qemu_fopen_ops_buffered(s);
-
-    DPRINTF("beginning savevm\n");
-    ret = qemu_savevm_state_begin(s->file, &s->params);
-    if (ret < 0) {
-        DPRINTF("failed, %d\n", ret);
-        migrate_fd_error(s);
-        return;
-    }
-    migrate_fd_put_ready(s);
 }

 static MigrationState *migrate_init(const MigrationParams *params)
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 07/14] migration: make writes blocking
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (5 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 06/14] migration: stop all cpus correctly Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 15:31   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 08/14] migration: remove unfreeze logic Juan Quintela
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

Move all the writes to the migration_thread, and make writings
blocking.  Notice that are still using the iothread for everything
that we do.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration-exec.c |  2 --
 migration-fd.c   |  6 ------
 migration-tcp.c  |  2 +-
 migration-unix.c |  2 --
 migration.c      | 19 -------------------
 qemu-file.h      |  5 -----
 savevm.c         |  5 -----
 7 files changed, 1 insertion(+), 40 deletions(-)

diff --git a/migration-exec.c b/migration-exec.c
index 6c97db9..908f22e 100644
--- a/migration-exec.c
+++ b/migration-exec.c
@@ -76,8 +76,6 @@ int exec_start_outgoing_migration(MigrationState *s, const char *command)
         goto err_after_open;
     }

-    socket_set_nonblock(s->fd);
-
     s->opaque = qemu_popen(f, "w");

     s->close = exec_close;
diff --git a/migration-fd.c b/migration-fd.c
index 50138ed..e5be972 100644
--- a/migration-fd.c
+++ b/migration-fd.c
@@ -81,11 +81,6 @@ int fd_start_outgoing_migration(MigrationState *s, const char *fdname)
         goto err_after_get_fd;
     }

-    if (fcntl(s->fd, F_SETFL, O_NONBLOCK) == -1) {
-        DPRINTF("Unable to set nonblocking mode on file descriptor\n");
-        goto err_after_open;
-    }
-
     s->get_error = fd_errno;
     s->write = fd_write;
     s->close = fd_close;
@@ -93,7 +88,6 @@ int fd_start_outgoing_migration(MigrationState *s, const char *fdname)
     migrate_fd_connect(s);
     return 0;

-err_after_open:
     close(s->fd);
 err_after_get_fd:
     return -1;
diff --git a/migration-tcp.c b/migration-tcp.c
index ac891c3..942f199 100644
--- a/migration-tcp.c
+++ b/migration-tcp.c
@@ -88,7 +88,7 @@ int tcp_start_outgoing_migration(MigrationState *s, const char *host_port,
     s->write = socket_write;
     s->close = tcp_close;

-    s->fd = inet_connect(host_port, false, &in_progress, errp);
+    s->fd = inet_connect(host_port, true, &in_progress, errp);
     if (error_is_set(errp)) {
         migrate_fd_error(s);
         return -1;
diff --git a/migration-unix.c b/migration-unix.c
index 169de88..bb41bac 100644
--- a/migration-unix.c
+++ b/migration-unix.c
@@ -96,8 +96,6 @@ int unix_start_outgoing_migration(MigrationState *s, const char *path)
         return -errno;
     }

-    socket_set_nonblock(s->fd);
-
     do {
         ret = connect(s->fd, (struct sockaddr *)&addr, sizeof(addr));
         if (ret == -1) {
diff --git a/migration.c b/migration.c
index ee0519e..a8b2f4a 100644
--- a/migration.c
+++ b/migration.c
@@ -247,8 +247,6 @@ static int migrate_fd_cleanup(MigrationState *s)
 {
     int ret = 0;

-    qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
-
     if (s->file) {
         DPRINTF("closing file\n");
         ret = qemu_fclose(s->file);
@@ -283,18 +281,6 @@ static void migrate_fd_completed(MigrationState *s)
     notifier_list_notify(&migration_state_notifiers, s);
 }

-static void migrate_fd_put_notify(void *opaque)
-{
-    MigrationState *s = opaque;
-    int ret;
-
-    qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
-    ret = qemu_file_put_notify(s->file);
-    if (ret) {
-        migrate_fd_error(s);
-    }
-}
-
 ssize_t migrate_fd_put_buffer(MigrationState *s, const void *data,
                               size_t size)
 {
@@ -311,10 +297,6 @@ ssize_t migrate_fd_put_buffer(MigrationState *s, const void *data,
     if (ret == -1)
         ret = -(s->get_error(s));

-    if (ret == -EAGAIN) {
-        qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s);
-    }
-
     return ret;
 }

@@ -410,7 +392,6 @@ int migrate_fd_wait_for_unfreeze(MigrationState *s)

 int migrate_fd_close(MigrationState *s)
 {
-    qemu_set_fd_handler2(s->fd, NULL, NULL, NULL, NULL);
     return s->close(s);
 }

diff --git a/qemu-file.h b/qemu-file.h
index 9c8985b..e88892c 100644
--- a/qemu-file.h
+++ b/qemu-file.h
@@ -104,11 +104,6 @@ int64_t qemu_file_set_rate_limit(QEMUFile *f, int64_t new_rate);
 int64_t qemu_file_get_rate_limit(QEMUFile *f);
 int qemu_file_get_error(QEMUFile *f);

-/* Try to send any outstanding data.  This function is useful when output is
- * halted due to rate limiting or EAGAIN errors occur as it can be used to
- * resume output. */
-int qemu_file_put_notify(QEMUFile *f);
-
 static inline void qemu_put_be64s(QEMUFile *f, const uint64_t *pv)
 {
     qemu_put_be64(f, *pv);
diff --git a/savevm.c b/savevm.c
index 2ea1fa6..ef44a78 100644
--- a/savevm.c
+++ b/savevm.c
@@ -523,11 +523,6 @@ int qemu_fclose(QEMUFile *f)
     return ret;
 }

-int qemu_file_put_notify(QEMUFile *f)
-{
-    return f->put_buffer(f->opaque, NULL, 0, 0);
-}
-
 void qemu_put_buffer(QEMUFile *f, const uint8_t *buf, int size)
 {
     int l;
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 08/14] migration: remove unfreeze logic
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (6 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 07/14] migration: make writes blocking Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 14:58   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 09/14] migration: take finer locking Juan Quintela
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

Now that we have a thread, and blocking writes, we don't need it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c | 24 +-----------------------
 migration.c     | 23 -----------------------
 migration.h     |  1 -
 3 files changed, 1 insertion(+), 47 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 6c3d057..2c9859b 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -26,7 +26,6 @@ typedef struct QEMUFileBuffered
 {
     MigrationState *migration_state;
     QEMUFile *file;
-    int freeze_output;
     size_t bytes_xfer;
     size_t xfer_limit;
     uint8_t *buffer;
@@ -70,13 +69,6 @@ static int buffered_flush(QEMUFileBuffered *s)

         ret = migrate_fd_put_buffer(s->migration_state, s->buffer + offset,
                                     s->buffer_size - offset);
-        if (ret == -EAGAIN) {
-            DPRINTF("backend not ready, freezing\n");
-            ret = 0;
-            s->freeze_output = 1;
-            break;
-        }
-
         if (ret <= 0) {
             DPRINTF("error flushing data, %zd\n", ret);
             break;
@@ -110,9 +102,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
         return error;
     }

-    DPRINTF("unfreezing output\n");
-    s->freeze_output = 0;
-
     if (size > 0) {
         DPRINTF("buffering %d bytes\n", size - offset);
         buffered_append(s, buf, size);
@@ -126,7 +115,7 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in

     if (pos == 0 && size == 0) {
         DPRINTF("file is ready\n");
-        if (!s->freeze_output && s->bytes_xfer < s->xfer_limit) {
+        if (s->bytes_xfer < s->xfer_limit) {
             DPRINTF("notifying client\n");
             migrate_fd_put_ready(s->migration_state);
         }
@@ -148,12 +137,6 @@ static int buffered_close(void *opaque)
         if (ret < 0) {
             break;
         }
-        if (s->freeze_output) {
-            ret = migrate_fd_wait_for_unfreeze(s->migration_state);
-            if (ret < 0) {
-                break;
-            }
-        }
     }

     ret2 = migrate_fd_close(s->migration_state);
@@ -180,8 +163,6 @@ static int buffered_rate_limit(void *opaque)
     if (ret) {
         return ret;
     }
-    if (s->freeze_output)
-        return 1;

     if (s->bytes_xfer > s->xfer_limit)
         return 1;
@@ -226,9 +207,6 @@ static void *buffered_file_thread(void *opaque)
         if (s->migration_state->complete) {
             break;
         }
-        if (s->freeze_output) {
-            continue;
-        }
         if (current_time >= expire_time) {
             s->bytes_xfer = 0;
             expire_time = current_time + BUFFER_DELAY;
diff --git a/migration.c b/migration.c
index a8b2f4a..29ee710 100644
--- a/migration.c
+++ b/migration.c
@@ -367,29 +367,6 @@ static void migrate_fd_cancel(MigrationState *s)
     migrate_fd_cleanup(s);
 }

-int migrate_fd_wait_for_unfreeze(MigrationState *s)
-{
-    int ret;
-
-    DPRINTF("wait for unfreeze\n");
-    if (s->state != MIG_STATE_ACTIVE)
-        return -EINVAL;
-
-    do {
-        fd_set wfds;
-
-        FD_ZERO(&wfds);
-        FD_SET(s->fd, &wfds);
-
-        ret = select(s->fd + 1, NULL, &wfds, NULL, NULL);
-    } while (ret == -1 && (s->get_error(s)) == EINTR);
-
-    if (ret == -1) {
-        return -s->get_error(s);
-    }
-    return 0;
-}
-
 int migrate_fd_close(MigrationState *s)
 {
     return s->close(s);
diff --git a/migration.h b/migration.h
index a63c5d5..505f792 100644
--- a/migration.h
+++ b/migration.h
@@ -82,7 +82,6 @@ void migrate_fd_connect(MigrationState *s);
 ssize_t migrate_fd_put_buffer(MigrationState *s, const void *data,
                               size_t size);
 void migrate_fd_put_ready(MigrationState *s);
-int migrate_fd_wait_for_unfreeze(MigrationState *s);
 int migrate_fd_close(MigrationState *s);

 void add_migration_state_change_notifier(Notifier *notify);
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 09/14] migration: take finer locking
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (7 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 08/14] migration: remove unfreeze logic Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 15:33   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 10/14] buffered_file: Unfold the trick to restart generating migration data Juan Quintela
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

Instead of locking the whole migration_thread inside loop, just lock
migration_fd_put_notify, that is what interacts with the rest of the
world.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c | 2 --
 migration.c     | 5 +++++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 2c9859b..22b973f 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -215,9 +215,7 @@ static void *buffered_file_thread(void *opaque)
             /* usleep expects microseconds */
             usleep((expire_time - current_time)*1000);
         }
-        qemu_mutex_lock_iothread();
         buffered_put_buffer(s, NULL, 0, 0);
-        qemu_mutex_unlock_iothread();
     }
     g_free(s->buffer);
     g_free(s);
diff --git a/migration.c b/migration.c
index 29ee710..82c2663 100644
--- a/migration.c
+++ b/migration.c
@@ -305,8 +305,10 @@ void migrate_fd_put_ready(MigrationState *s)
     int ret;
     static bool first_time = true;

+    qemu_mutex_lock_iothread();
     if (s->state != MIG_STATE_ACTIVE) {
         DPRINTF("put_ready returning because of non-active state\n");
+        qemu_mutex_unlock_iothread();
         return;
     }
     if (first_time) {
@@ -316,6 +318,7 @@ void migrate_fd_put_ready(MigrationState *s)
         if (ret < 0) {
             DPRINTF("failed, %d\n", ret);
             migrate_fd_error(s);
+            qemu_mutex_unlock_iothread();
             return;
         }
     }
@@ -351,6 +354,8 @@ void migrate_fd_put_ready(MigrationState *s)
             }
         }
     }
+    qemu_mutex_unlock_iothread();
+
 }

 static void migrate_fd_cancel(MigrationState *s)
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 10/14] buffered_file: Unfold the trick to restart generating migration data
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (8 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 09/14] migration: take finer locking Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 15:00   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 11/14] buffered_file: don't flush on put buffer Juan Quintela
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

This was needed before due to the way that the callbacks worked.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 22b973f..f4a788d 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -113,14 +113,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
         return error;
     }

-    if (pos == 0 && size == 0) {
-        DPRINTF("file is ready\n");
-        if (s->bytes_xfer < s->xfer_limit) {
-            DPRINTF("notifying client\n");
-            migrate_fd_put_ready(s->migration_state);
-        }
-    }
-
     return size;
 }

@@ -215,8 +207,15 @@ static void *buffered_file_thread(void *opaque)
             /* usleep expects microseconds */
             usleep((expire_time - current_time)*1000);
         }
-        buffered_put_buffer(s, NULL, 0, 0);
+        buffered_flush(s);
+
+        DPRINTF("file is ready\n");
+        if (s->bytes_xfer < s->xfer_limit) {
+            DPRINTF("notifying client\n");
+            migrate_fd_put_ready(s->migration_state);
+        }
     }
+
     g_free(s->buffer);
     g_free(s);
     return NULL;
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 11/14] buffered_file: don't flush on put buffer
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (9 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 10/14] buffered_file: Unfold the trick to restart generating migration data Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 15:34   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 12/14] buffered_file: unfold buffered_append in buffered_put_buffer Juan Quintela
                   ` (2 subsequent siblings)
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

We call buffered_put_buffer with iothread held, and buffered_flush() does
synchronous writes.  We only want to do the synchronous writes outside.

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index f4a788d..929b911 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -107,12 +107,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
         buffered_append(s, buf, size);
     }

-    error = buffered_flush(s);
-    if (error < 0) {
-        DPRINTF("buffered flush error. bailing: %s\n", strerror(-error));
-        return error;
-    }
-
     return size;
 }

-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 12/14] buffered_file: unfold buffered_append in buffered_put_buffer
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (10 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 11/14] buffered_file: don't flush on put buffer Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 17:07   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 13/14] savevm: New save live migration method: pending Juan Quintela
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 14/14] migration: print times for end phase Juan Quintela
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

It was the only user, and now buffered_put_buffer just do the append

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 buffered_file.c | 33 ++++++++++++++-------------------
 1 file changed, 14 insertions(+), 19 deletions(-)

diff --git a/buffered_file.c b/buffered_file.c
index 929b911..1b261ec 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -42,22 +42,6 @@ typedef struct QEMUFileBuffered
     do { } while (0)
 #endif

-static void buffered_append(QEMUFileBuffered *s,
-                            const uint8_t *buf, size_t size)
-{
-    if (size > (s->buffer_capacity - s->buffer_size)) {
-        DPRINTF("increasing buffer capacity from %zu by %zu\n",
-                s->buffer_capacity, size + 1024);
-
-        s->buffer_capacity += size + 1024;
-
-        s->buffer = g_realloc(s->buffer, s->buffer_capacity);
-    }
-
-    memcpy(s->buffer + s->buffer_size, buf, size);
-    s->buffer_size += size;
-}
-
 static int buffered_flush(QEMUFileBuffered *s)
 {
     size_t offset = 0;
@@ -102,11 +86,22 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
         return error;
     }

-    if (size > 0) {
-        DPRINTF("buffering %d bytes\n", size - offset);
-        buffered_append(s, buf, size);
+    if (size <= 0) {
+        return size;
     }

+    if (size > (s->buffer_capacity - s->buffer_size)) {
+        DPRINTF("increasing buffer capacity from %zu by %zu\n",
+                s->buffer_capacity, size + 1024);
+
+        s->buffer_capacity += size + 1024;
+
+        s->buffer = g_realloc(s->buffer, s->buffer_capacity);
+    }
+
+    memcpy(s->buffer + s->buffer_size, buf, size);
+    s->buffer_size += size;
+
     return size;
 }

-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 13/14] savevm: New save live migration method: pending
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (11 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 12/14] buffered_file: unfold buffered_append in buffered_put_buffer Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 15:11   ` Paolo Bonzini
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 14/14] migration: print times for end phase Juan Quintela
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

Code just now does (simplified for clarity)

    if (qemu_savevm_state_iterate(s->file) == 1) {
       vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
       qemu_savevm_state_complete(s->file);
    }

Problem here is that qemu_savevm_state_iterate() returns 1 when it
knows that remaining memory to sent takes less than max downtime.

But this means that we could end spending 2x max_downtime, one
downtime in qemu_savevm_iterate, and the other in
qemu_savevm_state_complete.

Changed code to:

    pending_size = qemu_savevm_state_pending(s->file, max_size);
    DPRINTF("pending size %lu max %lu\n", pending_size, max_size);
    if (pending_size >= max_size) {
        ret = qemu_savevm_state_iterate(s->file);
     } else {
        vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
        qemu_savevm_state_complete(s->file);
     }

So what we do is: at current network speed, we calculate the maximum
number of bytes we can sent: max_size.

Then we ask every save_live section how much they have pending.  If
they are less than max_size, we move to complete phase, otherwise we
do an iterate one.

This makes things much simpler, because now individual sections don't
have to caluclate the bandwidth (it was implossible to do right from
there).

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 arch_init.c       | 48 ++++++++++++++++++------------------------------
 block-migration.c | 49 ++++++++++---------------------------------------
 buffered_file.c   | 25 ++++++++++++++++++-------
 migration.c       | 22 +++++++++++++++-------
 migration.h       |  2 +-
 savevm.c          | 19 +++++++++++++++++++
 sysemu.h          |  1 +
 vmstate.h         |  1 +
 8 files changed, 83 insertions(+), 84 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 0d963b4..55bfd2a 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -608,12 +608,9 @@ static int ram_save_setup(QEMUFile *f, void *opaque)

 static int ram_save_iterate(QEMUFile *f, void *opaque)
 {
-    uint64_t bytes_transferred_last;
-    double bwidth = 0;
     int ret;
     int i;
-    uint64_t expected_downtime;
-    MigrationState *s = migrate_get_current();
+    int64_t t0;

     qemu_mutex_lock_ramlist();

@@ -621,9 +618,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         reset_ram_globals();
     }

-    bytes_transferred_last = bytes_transferred;
-    bwidth = qemu_get_clock_ns(rt_clock);
-
+    t0 = qemu_get_clock_ns(rt_clock);
     i = 0;
     while ((ret = qemu_file_rate_limit(f)) == 0) {
         int bytes_sent;
@@ -641,7 +636,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
            iterations
         */
         if ((i & 63) == 0) {
-            uint64_t t1 = (qemu_get_clock_ns(rt_clock) - bwidth) / 1000000;
+            uint64_t t1 = (qemu_get_clock_ns(rt_clock) - t0) / 1000000;
             if (t1 > MAX_WAIT) {
                 DPRINTF("big wait: %" PRIu64 " milliseconds, %d iterations\n",
                         t1, i);
@@ -655,31 +650,10 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
         return ret;
     }

-    bwidth = qemu_get_clock_ns(rt_clock) - bwidth;
-    bwidth = (bytes_transferred - bytes_transferred_last) / bwidth;
-
-    /* if we haven't transferred anything this round, force
-     * expected_downtime to a very high value, but without
-     * crashing */
-    if (bwidth == 0) {
-        bwidth = 0.000001;
-    }
-
     qemu_mutex_unlock_ramlist();
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);

-    expected_downtime = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
-    DPRINTF("ram_save_live: expected(%" PRIu64 ") <= max(" PRIu64 ")?\n",
-            expected_downtime, migrate_max_downtime());
-
-    if (expected_downtime <= migrate_max_downtime()) {
-        migration_bitmap_sync();
-        expected_downtime = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
-        s->expected_downtime = expected_downtime / 1000000; /* ns -> ms */
-
-        return expected_downtime <= migrate_max_downtime();
-    }
-    return 0;
+    return i;
 }

 static int ram_save_complete(QEMUFile *f, void *opaque)
@@ -712,6 +686,19 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
     return 0;
 }

+static uint64_t ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
+{
+    uint64_t remaining_size;
+
+    remaining_size = ram_save_remaining() * TARGET_PAGE_SIZE;
+
+    if (remaining_size < max_size) {
+        migration_bitmap_sync();
+        remaining_size = ram_save_remaining() * TARGET_PAGE_SIZE;
+    }
+    return remaining_size;
+}
+
 static int load_xbzrle(QEMUFile *f, ram_addr_t addr, void *host)
 {
     int ret, rc = 0;
@@ -897,6 +884,7 @@ SaveVMHandlers savevm_ram_handlers = {
     .save_live_setup = ram_save_setup,
     .save_live_iterate = ram_save_iterate,
     .save_live_complete = ram_save_complete,
+    .save_live_pending = ram_save_pending,
     .load_state = ram_load,
     .cancel = ram_migration_cancel,
 };
diff --git a/block-migration.c b/block-migration.c
index 9f94733..d0b3edc 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -77,9 +77,7 @@ typedef struct BlkMigState {
     int64_t total_sector_sum;
     int prev_progress;
     int bulk_completed;
-    long double total_time;
     long double prev_time_offset;
-    int reads;
 } BlkMigState;

 static BlkMigState block_mig_state;
@@ -132,12 +130,6 @@ uint64_t blk_mig_bytes_total(void)
     return sum << BDRV_SECTOR_BITS;
 }

-static inline long double compute_read_bwidth(void)
-{
-    assert(block_mig_state.total_time != 0);
-    return (block_mig_state.reads / block_mig_state.total_time) * BLOCK_SIZE;
-}
-
 static int bmds_aio_inflight(BlkMigDevState *bmds, int64_t sector)
 {
     int64_t chunk = sector / (int64_t)BDRV_SECTORS_PER_DIRTY_CHUNK;
@@ -191,8 +183,6 @@ static void blk_mig_read_cb(void *opaque, int ret)

     blk->ret = ret;

-    block_mig_state.reads++;
-    block_mig_state.total_time += (curr_time - block_mig_state.prev_time_offset);
     block_mig_state.prev_time_offset = curr_time;

     QSIMPLEQ_INSERT_TAIL(&block_mig_state.blk_list, blk, entry);
@@ -310,8 +300,6 @@ static void init_blk_migration(QEMUFile *f)
     block_mig_state.total_sector_sum = 0;
     block_mig_state.prev_progress = -1;
     block_mig_state.bulk_completed = 0;
-    block_mig_state.total_time = 0;
-    block_mig_state.reads = 0;

     bdrv_iterate(init_blk_migration_it, NULL);
 }
@@ -493,32 +481,6 @@ static int64_t get_remaining_dirty(void)
     return dirty * BLOCK_SIZE;
 }

-static int is_stage2_completed(void)
-{
-    int64_t remaining_dirty;
-    long double bwidth;
-
-    if (block_mig_state.bulk_completed == 1) {
-
-        remaining_dirty = get_remaining_dirty();
-        if (remaining_dirty == 0) {
-            return 1;
-        }
-
-        bwidth = compute_read_bwidth();
-
-        if ((remaining_dirty / bwidth) <=
-            migrate_max_downtime()) {
-            /* finish stage2 because we think that we can finish remaining work
-               below max_downtime */
-
-            return 1;
-        }
-    }
-
-    return 0;
-}
-
 static void blk_mig_cleanup(void)
 {
     BlkMigDevState *bmds;
@@ -617,7 +579,7 @@ static int block_save_iterate(QEMUFile *f, void *opaque)

     qemu_put_be64(f, BLK_MIG_FLAG_EOS);

-    return is_stage2_completed();
+    return 0;
 }

 static int block_save_complete(QEMUFile *f, void *opaque)
@@ -656,6 +618,14 @@ static int block_save_complete(QEMUFile *f, void *opaque)
     return 0;
 }

+static uint64_t block_save_pending(QEMUFile *f, void *opaque, uint64_t max_size)
+{
+
+    DPRINTF("Enter save live pending  %ld\n", get_remaining_dirty());
+
+    return get_remaining_dirty();
+}
+
 static int block_load(QEMUFile *f, void *opaque, int version_id)
 {
     static int banner_printed;
@@ -752,6 +722,7 @@ SaveVMHandlers savevm_block_handlers = {
     .save_live_setup = block_save_setup,
     .save_live_iterate = block_save_iterate,
     .save_live_complete = block_save_complete,
+    .save_live_pending = block_save_pending,
     .load_state = block_load,
     .cancel = block_migration_cancel,
     .is_active = block_is_active,
diff --git a/buffered_file.c b/buffered_file.c
index 1b261ec..ca39e7f 100644
--- a/buffered_file.c
+++ b/buffered_file.c
@@ -174,13 +174,15 @@ static int64_t buffered_get_rate_limit(void *opaque)
     return s->xfer_limit;
 }

-/* 10ms  xfer_limit is the limit that we should write each 10ms */
+/* 100ms  xfer_limit is the limit that we should write each 100ms */
 #define BUFFER_DELAY 100

 static void *buffered_file_thread(void *opaque)
 {
     QEMUFileBuffered *s = opaque;
-    int64_t expire_time = qemu_get_clock_ms(rt_clock) + BUFFER_DELAY;
+    int64_t initial_time = qemu_get_clock_ms(rt_clock);
+    int64_t max_size = 0;
+    bool last_round = false;

     while (true) {
         int64_t current_time = qemu_get_clock_ms(rt_clock);
@@ -188,20 +190,29 @@ static void *buffered_file_thread(void *opaque)
         if (s->migration_state->complete) {
             break;
         }
-        if (current_time >= expire_time) {
+        if (current_time >= initial_time + BUFFER_DELAY) {
+            uint64_t transferred_bytes = s->bytes_xfer;
+            uint64_t time_spent = current_time - initial_time;
+            double bandwidth = transferred_bytes / time_spent;
+            max_size = bandwidth * migrate_max_downtime() / 1000000;
+
+            DPRINTF("transferred %" PRIu64 " time_spent %" PRIu64
+                    " bandwidth %g max_size %" PRId64 "\n",
+                    transferred_bytes, time_spent, bandwidth, max_size);
+
             s->bytes_xfer = 0;
-            expire_time = current_time + BUFFER_DELAY;
+            initial_time = current_time;
         }
-        if (s->bytes_xfer >= s->xfer_limit) {
+        if (!last_round && (s->bytes_xfer >= s->xfer_limit)) {
             /* usleep expects microseconds */
-            usleep((expire_time - current_time)*1000);
+            usleep((initial_time + BUFFER_DELAY - current_time)*1000);
         }
         buffered_flush(s);

         DPRINTF("file is ready\n");
         if (s->bytes_xfer < s->xfer_limit) {
             DPRINTF("notifying client\n");
-            migrate_fd_put_ready(s->migration_state);
+            last_round = migrate_fd_put_ready(s->migration_state, max_size);
         }
     }

diff --git a/migration.c b/migration.c
index 82c2663..8054a77 100644
--- a/migration.c
+++ b/migration.c
@@ -300,16 +300,18 @@ ssize_t migrate_fd_put_buffer(MigrationState *s, const void *data,
     return ret;
 }

-void migrate_fd_put_ready(MigrationState *s)
+bool migrate_fd_put_ready(MigrationState *s, uint64_t max_size)
 {
     int ret;
     static bool first_time = true;
+    uint64_t pending_size;
+    bool last_round = false;

     qemu_mutex_lock_iothread();
     if (s->state != MIG_STATE_ACTIVE) {
         DPRINTF("put_ready returning because of non-active state\n");
         qemu_mutex_unlock_iothread();
-        return;
+        return false;
     }
     if (first_time) {
         first_time = false;
@@ -319,15 +321,19 @@ void migrate_fd_put_ready(MigrationState *s)
             DPRINTF("failed, %d\n", ret);
             migrate_fd_error(s);
             qemu_mutex_unlock_iothread();
-            return;
+            return false;
         }
     }

     DPRINTF("iterate\n");
-    ret = qemu_savevm_state_iterate(s->file);
-    if (ret < 0) {
-        migrate_fd_error(s);
-    } else if (ret == 1) {
+    pending_size = qemu_savevm_state_pending(s->file, max_size);
+    DPRINTF("pending size %lu max %lu\n", pending_size, max_size);
+    if (pending_size >= max_size) {
+        ret = qemu_savevm_state_iterate(s->file);
+        if (ret < 0) {
+            migrate_fd_error(s);
+        }
+    } else {
         int old_vm_running = runstate_is_running();
         int64_t start_time, end_time;

@@ -353,9 +359,11 @@ void migrate_fd_put_ready(MigrationState *s)
                 vm_start();
             }
         }
+        last_round = true;
     }
     qemu_mutex_unlock_iothread();

+    return last_round;
 }

 static void migrate_fd_cancel(MigrationState *s)
diff --git a/migration.h b/migration.h
index 505f792..1f2baed 100644
--- a/migration.h
+++ b/migration.h
@@ -81,7 +81,7 @@ void migrate_fd_connect(MigrationState *s);

 ssize_t migrate_fd_put_buffer(MigrationState *s, const void *data,
                               size_t size);
-void migrate_fd_put_ready(MigrationState *s);
+bool migrate_fd_put_ready(MigrationState *s, uint64_t max_size);
 int migrate_fd_close(MigrationState *s);

 void add_migration_state_change_notifier(Notifier *notify);
diff --git a/savevm.c b/savevm.c
index ef44a78..bdc70c2 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1681,6 +1681,25 @@ int qemu_savevm_state_complete(QEMUFile *f)
     return qemu_file_get_error(f);
 }

+uint64_t qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size)
+{
+    SaveStateEntry *se;
+    uint64_t ret = 0;
+
+    QTAILQ_FOREACH(se, &savevm_handlers, entry) {
+        if (!se->ops || !se->ops->save_live_pending) {
+            continue;
+        }
+        if (se->ops && se->ops->is_active) {
+            if (!se->ops->is_active(se->opaque)) {
+                continue;
+            }
+        }
+        ret += se->ops->save_live_pending(f, se->opaque, max_size);
+    }
+    return ret;
+}
+
 void qemu_savevm_state_cancel(QEMUFile *f)
 {
     SaveStateEntry *se;
diff --git a/sysemu.h b/sysemu.h
index 65552ac..c07ac0a 100644
--- a/sysemu.h
+++ b/sysemu.h
@@ -83,6 +83,7 @@ int qemu_savevm_state_begin(QEMUFile *f,
 int qemu_savevm_state_iterate(QEMUFile *f);
 int qemu_savevm_state_complete(QEMUFile *f);
 void qemu_savevm_state_cancel(QEMUFile *f);
+uint64_t qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size);
 int qemu_loadvm_state(QEMUFile *f);

 /* SLIRP */
diff --git a/vmstate.h b/vmstate.h
index c9c320e..241d962 100644
--- a/vmstate.h
+++ b/vmstate.h
@@ -35,6 +35,7 @@ typedef struct SaveVMHandlers {
     int (*save_live_setup)(QEMUFile *f, void *opaque);
     int (*save_live_iterate)(QEMUFile *f, void *opaque);
     int (*save_live_complete)(QEMUFile *f, void *opaque);
+    uint64_t (*save_live_pending)(QEMUFile *f, void *opaque, uint64_t max_size);
     void (*cancel)(void *opaque);
     LoadStateHandler *load_state;
     bool (*is_active)(void *opaque);
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [Qemu-devel] [PATCH 14/14] migration: print times for end phase
  2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
                   ` (12 preceding siblings ...)
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 13/14] savevm: New save live migration method: pending Juan Quintela
@ 2012-09-21 14:08 ` Juan Quintela
  2012-09-21 15:13   ` Paolo Bonzini
  13 siblings, 1 reply; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 14:08 UTC (permalink / raw)
  To: qemu-devel

Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 block.c     |  6 ++++++
 cpus.c      | 17 +++++++++++++++++
 migration.c | 10 +++++++++-
 savevm.c    | 13 +++++++++++++
 4 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index e78039b..d6e7a59 100644
--- a/block.c
+++ b/block.c
@@ -2269,9 +2269,15 @@ int bdrv_get_flags(BlockDriverState *bs)
 void bdrv_flush_all(void)
 {
     BlockDriverState *bs;
+    int64_t start_time, end_time;
+
+    start_time = qemu_get_clock_ms(rt_clock);

     QTAILQ_FOREACH(bs, &bdrv_states, list) {
         bdrv_flush(bs);
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("time flush device %s: %ld\n", bs->filename,
+               end_time - start_time);
     }
 }

diff --git a/cpus.c b/cpus.c
index 1b7061a..0ccc1e0 100644
--- a/cpus.c
+++ b/cpus.c
@@ -439,14 +439,31 @@ int cpu_is_stopped(CPUArchState *env)

 static void do_vm_stop(RunState state)
 {
+    int64_t start_time, end_time;
+
     if (runstate_is_running()) {
+        start_time = qemu_get_clock_ms(rt_clock);
         cpu_disable_ticks();
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("time cpu_disable_ticks %ld\n", end_time - start_time);
         pause_all_vcpus();
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("time pause_all_vcpus %ld\n", end_time - start_time);
         runstate_set(state);
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("time runstate_set %ld\n", end_time - start_time);
         vm_state_notify(0, state);
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("time vmstate_notify %ld\n", end_time - start_time);
         bdrv_drain_all();
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("time bdrv_drain_all %ld\n", end_time - start_time);
         bdrv_flush_all();
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("time bdrv_flush_all %ld\n", end_time - start_time);
         monitor_protocol_event(QEVENT_STOP, NULL);
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("time monitor_protocol_event %ld\n", end_time - start_time);
     }
 }

diff --git a/migration.c b/migration.c
index 8054a77..9c62614 100644
--- a/migration.c
+++ b/migration.c
@@ -340,18 +340,24 @@ bool migrate_fd_put_ready(MigrationState *s, uint64_t max_size)
         DPRINTF("done iterating\n");
         start_time = qemu_get_clock_ms(rt_clock);
         qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("wakeup_request %ld\n", end_time - start_time);
         if (old_vm_running) {
             vm_stop(RUN_STATE_FINISH_MIGRATE);
         } else {
             vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
         }
-
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("vm_stop 2 %ld\n", end_time - start_time);
         if (qemu_savevm_state_complete(s->file) < 0) {
             migrate_fd_error(s);
         } else {
+            end_time = qemu_get_clock_ms(rt_clock);
+            printf("complete without error 3a %ld\n", end_time - start_time);
             migrate_fd_completed(s);
         }
         end_time = qemu_get_clock_ms(rt_clock);
+        printf("completed %ld\n", end_time - start_time);
         s->total_time = end_time - s->total_time;
         s->downtime = end_time - start_time;
         if (s->state != MIG_STATE_COMPLETED) {
@@ -359,6 +365,8 @@ bool migrate_fd_put_ready(MigrationState *s, uint64_t max_size)
                 vm_start();
             }
         }
+        end_time = qemu_get_clock_ms(rt_clock);
+        printf("end completed stage %ld\n", end_time - start_time);
         last_round = true;
     }
     qemu_mutex_unlock_iothread();
diff --git a/savevm.c b/savevm.c
index bdc70c2..ba399f1 100644
--- a/savevm.c
+++ b/savevm.c
@@ -1629,9 +1629,14 @@ int qemu_savevm_state_complete(QEMUFile *f)
 {
     SaveStateEntry *se;
     int ret;
+    int64_t t1;
+    int64_t t0 = qemu_get_clock_ms(rt_clock);

     cpu_synchronize_all_states();
+    t1 = qemu_get_clock_ms(rt_clock);
+    printf("synchronize_all_states %ld\n", t1 - t0);

+    t0 = t1;
     QTAILQ_FOREACH(se, &savevm_handlers, entry) {
         if (!se->ops || !se->ops->save_live_complete) {
             continue;
@@ -1652,6 +1657,11 @@ int qemu_savevm_state_complete(QEMUFile *f)
             return ret;
         }
     }
+    t1 = qemu_get_clock_ms(rt_clock);
+
+    printf("migrate RAM %ld\n", t1 - t0);
+
+    t0 = t1;

     QTAILQ_FOREACH(se, &savevm_handlers, entry) {
         int len;
@@ -1676,6 +1686,9 @@ int qemu_savevm_state_complete(QEMUFile *f)
         trace_savevm_section_end(se->section_id);
     }

+    t1 = qemu_get_clock_ms(rt_clock);
+
+    printf("migrate rest devices %ld\n", t1 - t0);
     qemu_put_byte(f, QEMU_VM_EOF);

     return qemu_file_get_error(f);
-- 
1.7.11.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 01/14] split MRU ram list
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 01/14] split MRU ram list Juan Quintela
@ 2012-09-21 14:38   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 14:38 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> From: Paolo Bonzini <pbonzini@redhat.com>
> 
> Outside the execution threads the normal, non-MRU-ized order of
> the RAM blocks should always be enough.  So manage two separate
> lists, which will have separate locking rules.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  cpu-all.h |  4 +++-
>  exec.c    | 16 +++++++++++-----
>  2 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/cpu-all.h b/cpu-all.h
> index db25f73..3e67019 100644
> --- a/cpu-all.h
> +++ b/cpu-all.h
> @@ -487,8 +487,9 @@ typedef struct RAMBlock {
>      ram_addr_t offset;
>      ram_addr_t length;
>      uint32_t flags;
> -    char idstr[256];
> +    QLIST_ENTRY(RAMBlock) next_mru;
>      QLIST_ENTRY(RAMBlock) next;
> +    char idstr[256];
>  #if defined(__linux__) && !defined(TARGET_S390X)
>      int fd;
>  #endif
> @@ -496,6 +497,7 @@ typedef struct RAMBlock {
> 
>  typedef struct RAMList {
>      uint8_t *phys_dirty;
> +    QLIST_HEAD(, RAMBlock) blocks_mru;
>      QLIST_HEAD(, RAMBlock) blocks;
>  } RAMList;
>  extern RAMList ram_list;
> diff --git a/exec.c b/exec.c
> index ad2cc2e..0db2beb 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -112,7 +112,10 @@ static uint8_t *code_gen_ptr;
>  int phys_ram_fd;
>  static int in_migration;
> 
> -RAMList ram_list = { .blocks = QLIST_HEAD_INITIALIZER(ram_list.blocks) };
> +RAMList ram_list = {
> +    .blocks = QLIST_HEAD_INITIALIZER(ram_list.blocks),
> +    .blocks_mru = QLIST_HEAD_INITIALIZER(ram_list.blocks_mru)
> +};
> 
>  static MemoryRegion *system_memory;
>  static MemoryRegion *system_io;
> @@ -2578,6 +2581,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>      new_block->length = size;
> 
>      QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
> +    QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);
> 
>      ram_list.phys_dirty = g_realloc(ram_list.phys_dirty,
>                                         last_ram_offset() >> TARGET_PAGE_BITS);
> @@ -2605,6 +2609,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
>      QLIST_FOREACH(block, &ram_list.blocks, next) {
>          if (addr == block->offset) {
>              QLIST_REMOVE(block, next);
> +            QLIST_REMOVE(block, next_mru);
>              g_free(block);
>              return;
>          }
> @@ -2618,6 +2623,7 @@ void qemu_ram_free(ram_addr_t addr)
>      QLIST_FOREACH(block, &ram_list.blocks, next) {
>          if (addr == block->offset) {
>              QLIST_REMOVE(block, next);
> +            QLIST_REMOVE(block, next_mru);
>              if (block->flags & RAM_PREALLOC_MASK) {
>                  ;
>              } else if (mem_path) {
> @@ -2723,12 +2729,12 @@ void *qemu_get_ram_ptr(ram_addr_t addr)
>  {
>      RAMBlock *block;
> 
> -    QLIST_FOREACH(block, &ram_list.blocks, next) {
> +    QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
>          if (addr - block->offset < block->length) {
>              /* Move this entry to to start of the list.  */
>              if (block != QLIST_FIRST(&ram_list.blocks)) {
> -                QLIST_REMOVE(block, next);
> -                QLIST_INSERT_HEAD(&ram_list.blocks, block, next);
> +                QLIST_REMOVE(block, next_mru);
> +                QLIST_INSERT_HEAD(&ram_list.blocks_mru, block, next_mru);
>              }
>              if (xen_enabled()) {
>                  /* We need to check if the requested address is in the RAM
> @@ -2823,7 +2829,7 @@ int qemu_ram_addr_from_host(void *ptr, ram_addr_t *ram_addr)
>          return 0;
>      }
> 
> -    QLIST_FOREACH(block, &ram_list.blocks, next) {
> +    QLIST_FOREACH(block, &ram_list.blocks_mru, next_mru) {
>          /* This case append when the block is not mapped. */
>          if (block->host == NULL) {
>              continue;
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 02/14] add a version number to ram_list
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 02/14] add a version number to ram_list Juan Quintela
@ 2012-09-21 14:42   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 14:42 UTC (permalink / raw)
  To: Juan Quintela; +Cc: Umesh Deshpande, qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> From: Umesh Deshpande <udeshpan@redhat.com>
> 
> This will be used to detect if last_block might have become invalid
> across different calls to ram_save_live.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  arch_init.c | 7 ++++++-
>  cpu-all.h   | 1 +
>  exec.c      | 4 ++++
>  3 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index d96e888..eb33fdd 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -336,6 +336,7 @@ static RAMBlock *last_block;
>  static ram_addr_t last_offset;
>  static unsigned long *migration_bitmap;
>  static uint64_t migration_dirty_pages;
> +static uint32_t last_version;
> 
>  static inline bool migration_bitmap_test_and_reset_dirty(MemoryRegion *mr,
>                                                           ram_addr_t offset)
> @@ -406,7 +407,6 @@ static void migration_bitmap_sync(void)
>      }
>  }
> 
> -
>  /*
>   * ram_save_block: Writes a page of memory to the stream f
>   *
> @@ -558,6 +558,7 @@ static void reset_ram_globals(void)
>  {
>      last_block = NULL;
>      last_offset = 0;
> +    last_version = ram_list.version;
>      sort_ram_list();
>  }
> 
> @@ -613,6 +614,10 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      uint64_t expected_downtime;
>      MigrationState *s = migrate_get_current();
> 
> +    if (ram_list.version != last_version) {
> +        reset_ram_globals();
> +    }
> +
>      bytes_transferred_last = bytes_transferred;
>      bwidth = qemu_get_clock_ns(rt_clock);
> 

Assuming ram_save_iterate() will take the RAM lock throughout, that's ok.

> diff --git a/cpu-all.h b/cpu-all.h
> index 3e67019..6576229 100644
> --- a/cpu-all.h
> +++ b/cpu-all.h
> @@ -497,6 +497,7 @@ typedef struct RAMBlock {
> 
>  typedef struct RAMList {
>      uint8_t *phys_dirty;
> +    uint32_t version;
>      QLIST_HEAD(, RAMBlock) blocks_mru;
>      QLIST_HEAD(, RAMBlock) blocks;
>  } RAMList;
> diff --git a/exec.c b/exec.c
> index 0db2beb..e9d1509 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2583,6 +2583,8 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>      QLIST_INSERT_HEAD(&ram_list.blocks, new_block, next);
>      QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);
> 
> +    ram_list.version++;
> +
>      ram_list.phys_dirty = g_realloc(ram_list.phys_dirty,
>                                         last_ram_offset() >> TARGET_PAGE_BITS);
>      memset(ram_list.phys_dirty + (new_block->offset >> TARGET_PAGE_BITS),
> @@ -2610,6 +2612,7 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
>          if (addr == block->offset) {
>              QLIST_REMOVE(block, next);
>              QLIST_REMOVE(block, next_mru);
> +            ram_list.version++;
>              g_free(block);
>              return;
>          }
> @@ -2624,6 +2627,7 @@ void qemu_ram_free(ram_addr_t addr)
>          if (addr == block->offset) {
>              QLIST_REMOVE(block, next);
>              QLIST_REMOVE(block, next_mru);
> +            ram_list.version++;
>              if (block->flags & RAM_PREALLOC_MASK) {
>                  ;
>              } else if (mem_path) {
> 

Reviewed-by: Paolo Bonzini <pbonzini@gnu.org>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 03/14] protect the ramlist with a separate mutex
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 03/14] protect the ramlist with a separate mutex Juan Quintela
@ 2012-09-21 14:44   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 14:44 UTC (permalink / raw)
  To: Juan Quintela; +Cc: Umesh Deshpande, qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> From: Umesh Deshpande <udeshpan@redhat.com>
> 
> Add the new mutex that protects shared state between ram_save_live
> and the iothread.  If the iothread mutex has to be taken together
> with the ramlist mutex, the iothread shall always be _outside_.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Umesh Deshpande <udeshpan@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  arch_init.c |  9 ++++++++-
>  cpu-all.h   |  8 ++++++++
>  exec.c      | 23 +++++++++++++++++++++--
>  3 files changed, 37 insertions(+), 3 deletions(-)
> 
> diff --git a/arch_init.c b/arch_init.c
> index eb33fdd..0d963b4 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -553,7 +553,6 @@ static void ram_migration_cancel(void *opaque)
>      migration_end();
>  }
> 
> -
>  static void reset_ram_globals(void)
>  {
>      last_block = NULL;
> @@ -573,6 +572,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>      bitmap_set(migration_bitmap, 1, ram_pages);
>      migration_dirty_pages = ram_pages;
> 
> +    qemu_mutex_lock_ramlist();
>      migration_bitmap_sync();
>      bytes_transferred = 0;
>      reset_ram_globals();
> @@ -600,6 +600,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>          qemu_put_be64(f, block->length);
>      }
> 
> +    qemu_mutex_unlock_ramlist();
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> 
>      return 0;
> @@ -614,6 +615,8 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>      uint64_t expected_downtime;
>      MigrationState *s = migrate_get_current();
> 
> +    qemu_mutex_lock_ramlist();
> +
>      if (ram_list.version != last_version) {
>          reset_ram_globals();
>      }
> @@ -662,6 +665,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>          bwidth = 0.000001;
>      }
> 
> +    qemu_mutex_unlock_ramlist();
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> 
>      expected_downtime = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
> @@ -682,6 +686,8 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>  {
>      migration_bitmap_sync();
> 
> +    qemu_mutex_lock_ramlist();
> +
>      /* try transferring iterative blocks of memory */
> 
>      /* flush all remaining blocks regardless of rate limiting */
> @@ -697,6 +703,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
>      }
>      memory_global_dirty_log_stop();
> 
> +    qemu_mutex_unlock_ramlist();
>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> 
>      g_free(migration_bitmap);
> diff --git a/cpu-all.h b/cpu-all.h
> index 6576229..2b0a640 100644
> --- a/cpu-all.h
> +++ b/cpu-all.h
> @@ -22,6 +22,7 @@
>  #include "qemu-common.h"
>  #include "qemu-tls.h"
>  #include "cpu-common.h"
> +#include "qemu-thread.h"
> 
>  /* some important defines:
>   *
> @@ -487,7 +488,9 @@ typedef struct RAMBlock {
>      ram_addr_t offset;
>      ram_addr_t length;
>      uint32_t flags;
> +    /* Protected by the iothread lock.  */
>      QLIST_ENTRY(RAMBlock) next_mru;
> +    /* Protected by the ramlist lock.  */
>      QLIST_ENTRY(RAMBlock) next;
>      char idstr[256];
>  #if defined(__linux__) && !defined(TARGET_S390X)
> @@ -496,9 +499,12 @@ typedef struct RAMBlock {
>  } RAMBlock;
> 
>  typedef struct RAMList {
> +    QemuMutex mutex;
> +    /* Protected by the iothread lock.  */
>      uint8_t *phys_dirty;
>      uint32_t version;
>      QLIST_HEAD(, RAMBlock) blocks_mru;
> +    /* Protected by the ramlist lock.  */
>      QLIST_HEAD(, RAMBlock) blocks;
>  } RAMList;
>  extern RAMList ram_list;
> @@ -520,6 +526,8 @@ void dump_exec_info(FILE *f, fprintf_function cpu_fprintf);
>  #endif /* !CONFIG_USER_ONLY */
> 
>  ram_addr_t last_ram_offset(void);
> +void qemu_mutex_lock_ramlist(void);
> +void qemu_mutex_unlock_ramlist(void);
> 
>  int cpu_memory_rw_debug(CPUArchState *env, target_ulong addr,
>                          uint8_t *buf, int len, int is_write);
> diff --git a/exec.c b/exec.c
> index e9d1509..3a8a4dd 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -637,6 +637,7 @@ bool tcg_enabled(void)
> 
>  void cpu_exec_init_all(void)
>  {
> +    qemu_mutex_init(&ram_list.mutex);
>  #if !defined(CONFIG_USER_ONLY)
>      memory_map_init();
>      io_mem_init();
> @@ -2364,6 +2365,16 @@ static long gethugepagesize(const char *path)
>      return fs.f_bsize;
>  }
> 
> +void qemu_mutex_lock_ramlist(void)
> +{
> +    qemu_mutex_lock(&ram_list.mutex);
> +}
> +
> +void qemu_mutex_unlock_ramlist(void)
> +{
> +    qemu_mutex_unlock(&ram_list.mutex);
> +}
> +
>  static void *file_ram_alloc(RAMBlock *block,
>                              ram_addr_t memory,
>                              const char *path)
> @@ -2519,6 +2530,7 @@ void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev)
>      }
>      pstrcat(new_block->idstr, sizeof(new_block->idstr), name);
> 
> +    qemu_mutex_lock_ramlist();
>      QLIST_FOREACH(block, &ram_list.blocks, next) {
>          if (block != new_block && !strcmp(block->idstr, new_block->idstr)) {
>              fprintf(stderr, "RAMBlock \"%s\" already registered, abort!\n",
> @@ -2526,6 +2538,7 @@ void qemu_ram_set_idstr(ram_addr_t addr, const char *name, DeviceState *dev)
>              abort();
>          }
>      }
> +    qemu_mutex_unlock_ramlist();
>  }
> 
>  static int memory_try_enable_merging(void *addr, size_t len)
> @@ -2549,6 +2562,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>      size = TARGET_PAGE_ALIGN(size);
>      new_block = g_malloc0(sizeof(*new_block));
> 
> +    qemu_mutex_lock_ramlist();
>      new_block->mr = mr;
>      new_block->offset = find_ram_offset(size);
>      if (host) {
> @@ -2584,6 +2598,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
>      QLIST_INSERT_HEAD(&ram_list.blocks_mru, new_block, next_mru);
> 
>      ram_list.version++;
> +    qemu_mutex_unlock_ramlist();
> 
>      ram_list.phys_dirty = g_realloc(ram_list.phys_dirty,
>                                         last_ram_offset() >> TARGET_PAGE_BITS);
> @@ -2608,21 +2623,24 @@ void qemu_ram_free_from_ptr(ram_addr_t addr)
>  {
>      RAMBlock *block;
> 
> +    qemu_mutex_lock_ramlist();
>      QLIST_FOREACH(block, &ram_list.blocks, next) {
>          if (addr == block->offset) {
>              QLIST_REMOVE(block, next);
>              QLIST_REMOVE(block, next_mru);
>              ram_list.version++;
>              g_free(block);
> -            return;
> +            break;
>          }
>      }
> +    qemu_mutex_unlock_ramlist();
>  }
> 
>  void qemu_ram_free(ram_addr_t addr)
>  {
>      RAMBlock *block;
> 
> +    qemu_mutex_lock_ramlist();
>      QLIST_FOREACH(block, &ram_list.blocks, next) {
>          if (addr == block->offset) {
>              QLIST_REMOVE(block, next);
> @@ -2653,9 +2671,10 @@ void qemu_ram_free(ram_addr_t addr)
>  #endif
>              }
>              g_free(block);
> -            return;
> +            break;
>          }
>      }
> +    qemu_mutex_unlock_ramlist();
> 
>  }
> 

All reads and writes of ram_list.version, and the "next" list are
protected by the new mutex.  The next_mru list is still protected by the
BQL.  The BQL is always taken outside the ram_list mutex.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 05/14] migration: make qemu_fopen_ops_buffered() return void
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 05/14] migration: make qemu_fopen_ops_buffered() return void Juan Quintela
@ 2012-09-21 14:57   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 14:57 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> We want the file assignment to happen before the thread is created to
> avoid locking, so we just do it before creating the thread.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  buffered_file.c | 13 ++++++-------
>  buffered_file.h |  2 +-
>  migration.c     |  2 +-
>  migration.h     |  1 +
>  4 files changed, 9 insertions(+), 9 deletions(-)
> 
> diff --git a/buffered_file.c b/buffered_file.c
> index 61985a9..6c3d057 100644
> --- a/buffered_file.c
> +++ b/buffered_file.c
> @@ -33,7 +33,6 @@ typedef struct QEMUFileBuffered
>      size_t buffer_size;
>      size_t buffer_capacity;
>      QemuThread thread;
> -    bool complete;

Not a separate patch? :)  (Just kidding).

>  } QEMUFileBuffered;
> 
>  #ifdef DEBUG_BUFFERED_FILE
> @@ -162,7 +161,7 @@ static int buffered_close(void *opaque)
>          ret = ret2;
>      }
>      ret = migrate_fd_close(s->migration_state);
> -    s->complete = true;
> +    s->migration_state->complete = true;
>      return ret;
>  }
> 
> @@ -224,7 +223,7 @@ static void *buffered_file_thread(void *opaque)
>      while (true) {
>          int64_t current_time = qemu_get_clock_ms(rt_clock);
> 
> -        if (s->complete) {
> +        if (s->migration_state->complete) {
>              break;
>          }
>          if (s->freeze_output) {
> @@ -247,7 +246,7 @@ static void *buffered_file_thread(void *opaque)
>      return NULL;
>  }
> 
> -QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state)
> +void qemu_fopen_ops_buffered(MigrationState *migration_state)
>  {
>      QEMUFileBuffered *s;
> 
> @@ -255,15 +254,15 @@ QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state)
> 
>      s->migration_state = migration_state;
>      s->xfer_limit = migration_state->bandwidth_limit / 10;
> -    s->complete = false;
> +    s->migration_state->complete = false;
> 
>      s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
>                               buffered_close, buffered_rate_limit,
>                               buffered_set_rate_limit,
>  			     buffered_get_rate_limit);
> 
> +    migration_state->file = s->file;
> +
>      qemu_thread_create(&s->thread, buffered_file_thread, s,
>                         QEMU_THREAD_DETACHED);
> -
> -    return s->file;
>  }
> diff --git a/buffered_file.h b/buffered_file.h
> index ef010fe..8a246fd 100644
> --- a/buffered_file.h
> +++ b/buffered_file.h
> @@ -17,6 +17,6 @@
>  #include "hw/hw.h"
>  #include "migration.h"
> 
> -QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state);
> +void qemu_fopen_ops_buffered(MigrationState *migration_state);
> 
>  #endif
> diff --git a/migration.c b/migration.c
> index 2c29d04..550c5c2 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -429,7 +429,7 @@ void migrate_fd_connect(MigrationState *s)
>      int ret;
> 
>      s->state = MIG_STATE_ACTIVE;
> -    s->file = qemu_fopen_ops_buffered(s);
> +    qemu_fopen_ops_buffered(s);
> 
>      DPRINTF("beginning savevm\n");
>      ret = qemu_savevm_state_begin(s->file, &s->params);
> diff --git a/migration.h b/migration.h
> index 1c3e9b7..a63c5d5 100644
> --- a/migration.h
> +++ b/migration.h
> @@ -45,6 +45,7 @@ struct MigrationState
>      int64_t dirty_pages_rate;
>      bool enabled_capabilities[MIGRATION_CAPABILITY_MAX];
>      int64_t xbzrle_cache_size;
> +    bool complete;
>  };
> 
>  void process_incoming_migration(QEMUFile *f);
> 

Consider a rename of qemu_fopen_ops_buffered, buffered_file.*, and
QEMUFileBuffered at the end of the series (e.g. migration_start_thread,
migration-thread.*, QEMUFileMigration).

Paolo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 08/14] migration: remove unfreeze logic
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 08/14] migration: remove unfreeze logic Juan Quintela
@ 2012-09-21 14:58   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 14:58 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> Now that we have a thread, and blocking writes, we don't need it.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  buffered_file.c | 24 +-----------------------
>  migration.c     | 23 -----------------------
>  migration.h     |  1 -
>  3 files changed, 1 insertion(+), 47 deletions(-)
> 
> diff --git a/buffered_file.c b/buffered_file.c
> index 6c3d057..2c9859b 100644
> --- a/buffered_file.c
> +++ b/buffered_file.c
> @@ -26,7 +26,6 @@ typedef struct QEMUFileBuffered
>  {
>      MigrationState *migration_state;
>      QEMUFile *file;
> -    int freeze_output;
>      size_t bytes_xfer;
>      size_t xfer_limit;
>      uint8_t *buffer;
> @@ -70,13 +69,6 @@ static int buffered_flush(QEMUFileBuffered *s)
> 
>          ret = migrate_fd_put_buffer(s->migration_state, s->buffer + offset,
>                                      s->buffer_size - offset);
> -        if (ret == -EAGAIN) {
> -            DPRINTF("backend not ready, freezing\n");
> -            ret = 0;
> -            s->freeze_output = 1;
> -            break;
> -        }
> -
>          if (ret <= 0) {
>              DPRINTF("error flushing data, %zd\n", ret);
>              break;
> @@ -110,9 +102,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
>          return error;
>      }
> 
> -    DPRINTF("unfreezing output\n");
> -    s->freeze_output = 0;
> -
>      if (size > 0) {
>          DPRINTF("buffering %d bytes\n", size - offset);
>          buffered_append(s, buf, size);
> @@ -126,7 +115,7 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
> 
>      if (pos == 0 && size == 0) {
>          DPRINTF("file is ready\n");
> -        if (!s->freeze_output && s->bytes_xfer < s->xfer_limit) {
> +        if (s->bytes_xfer < s->xfer_limit) {
>              DPRINTF("notifying client\n");
>              migrate_fd_put_ready(s->migration_state);
>          }
> @@ -148,12 +137,6 @@ static int buffered_close(void *opaque)
>          if (ret < 0) {
>              break;
>          }
> -        if (s->freeze_output) {
> -            ret = migrate_fd_wait_for_unfreeze(s->migration_state);
> -            if (ret < 0) {
> -                break;
> -            }
> -        }
>      }
> 
>      ret2 = migrate_fd_close(s->migration_state);
> @@ -180,8 +163,6 @@ static int buffered_rate_limit(void *opaque)
>      if (ret) {
>          return ret;
>      }
> -    if (s->freeze_output)
> -        return 1;
> 
>      if (s->bytes_xfer > s->xfer_limit)
>          return 1;
> @@ -226,9 +207,6 @@ static void *buffered_file_thread(void *opaque)
>          if (s->migration_state->complete) {
>              break;
>          }
> -        if (s->freeze_output) {
> -            continue;
> -        }
>          if (current_time >= expire_time) {
>              s->bytes_xfer = 0;
>              expire_time = current_time + BUFFER_DELAY;
> diff --git a/migration.c b/migration.c
> index a8b2f4a..29ee710 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -367,29 +367,6 @@ static void migrate_fd_cancel(MigrationState *s)
>      migrate_fd_cleanup(s);
>  }
> 
> -int migrate_fd_wait_for_unfreeze(MigrationState *s)
> -{
> -    int ret;
> -
> -    DPRINTF("wait for unfreeze\n");
> -    if (s->state != MIG_STATE_ACTIVE)
> -        return -EINVAL;
> -
> -    do {
> -        fd_set wfds;
> -
> -        FD_ZERO(&wfds);
> -        FD_SET(s->fd, &wfds);
> -
> -        ret = select(s->fd + 1, NULL, &wfds, NULL, NULL);
> -    } while (ret == -1 && (s->get_error(s)) == EINTR);
> -
> -    if (ret == -1) {
> -        return -s->get_error(s);
> -    }
> -    return 0;
> -}
> -
>  int migrate_fd_close(MigrationState *s)
>  {
>      return s->close(s);
> diff --git a/migration.h b/migration.h
> index a63c5d5..505f792 100644
> --- a/migration.h
> +++ b/migration.h
> @@ -82,7 +82,6 @@ void migrate_fd_connect(MigrationState *s);
>  ssize_t migrate_fd_put_buffer(MigrationState *s, const void *data,
>                                size_t size);
>  void migrate_fd_put_ready(MigrationState *s);
> -int migrate_fd_wait_for_unfreeze(MigrationState *s);
>  int migrate_fd_close(MigrationState *s);
> 
>  void add_migration_state_change_notifier(Notifier *notify);
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 10/14] buffered_file: Unfold the trick to restart generating migration data
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 10/14] buffered_file: Unfold the trick to restart generating migration data Juan Quintela
@ 2012-09-21 15:00   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 15:00 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> This was needed before due to the way that the callbacks worked.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  buffered_file.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/buffered_file.c b/buffered_file.c
> index 22b973f..f4a788d 100644
> --- a/buffered_file.c
> +++ b/buffered_file.c
> @@ -113,14 +113,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
>          return error;
>      }
> 
> -    if (pos == 0 && size == 0) {
> -        DPRINTF("file is ready\n");
> -        if (s->bytes_xfer < s->xfer_limit) {
> -            DPRINTF("notifying client\n");
> -            migrate_fd_put_ready(s->migration_state);
> -        }
> -    }
> -
>      return size;
>  }
> 
> @@ -215,8 +207,15 @@ static void *buffered_file_thread(void *opaque)
>              /* usleep expects microseconds */
>              usleep((expire_time - current_time)*1000);
>          }
> -        buffered_put_buffer(s, NULL, 0, 0);
> +        buffered_flush(s);
> +
> +        DPRINTF("file is ready\n");
> +        if (s->bytes_xfer < s->xfer_limit) {
> +            DPRINTF("notifying client\n");
> +            migrate_fd_put_ready(s->migration_state);
> +        }
>      }
> +
>      g_free(s->buffer);
>      g_free(s);
>      return NULL;
> 

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 13/14] savevm: New save live migration method: pending
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 13/14] savevm: New save live migration method: pending Juan Quintela
@ 2012-09-21 15:11   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 15:11 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> Code just now does (simplified for clarity)
> 
>     if (qemu_savevm_state_iterate(s->file) == 1) {
>        vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>        qemu_savevm_state_complete(s->file);
>     }
> 
> Problem here is that qemu_savevm_state_iterate() returns 1 when it
> knows that remaining memory to sent takes less than max downtime.
> 
> But this means that we could end spending 2x max_downtime, one
> downtime in qemu_savevm_iterate, and the other in
> qemu_savevm_state_complete.

Technically, the one in qemu_savevm_iterate may not translate into
actual downtime if the guest does not need I/O, as it is "just" stolen
time.  But this is just a nit in the commit message.  Very good catch!

Another nit:

> 
> +uint64_t qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size)
> +{
> +    SaveStateEntry *se;
> +    uint64_t ret = 0;
> +
> +    QTAILQ_FOREACH(se, &savevm_handlers, entry) {
> +        if (!se->ops || !se->ops->save_live_pending) {
> +            continue;
> +        }
> +        if (se->ops && se->ops->is_active) {
> +            if (!se->ops->is_active(se->opaque)) {
> +                continue;
> +            }
> +        }
> +        ret += se->ops->save_live_pending(f, se->opaque, max_size);
> +    }
> +    return ret;
> +}

Here you may want to pass max_size - ret for the last argument of
se->ops->save_live_pending.  But in practice this loop will be executed
exactly once, so:

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Paolo

> Changed code to:
> 
>     pending_size = qemu_savevm_state_pending(s->file, max_size);
>     DPRINTF("pending size %lu max %lu\n", pending_size, max_size);
>     if (pending_size >= max_size) {
>         ret = qemu_savevm_state_iterate(s->file);
>      } else {
>         vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>         qemu_savevm_state_complete(s->file);
>      }
> 
> So what we do is: at current network speed, we calculate the maximum
> number of bytes we can sent: max_size.
> 
> Then we ask every save_live section how much they have pending.  If
> they are less than max_size, we move to complete phase, otherwise we
> do an iterate one.
> 
> This makes things much simpler, because now individual sections don't
> have to caluclate the bandwidth (it was implossible to do right from
> there).

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 14/14] migration: print times for end phase
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 14/14] migration: print times for end phase Juan Quintela
@ 2012-09-21 15:13   ` Paolo Bonzini
  2012-09-21 15:27     ` Juan Quintela
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 15:13 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Not sure you want these in master. :)

Paolo

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  block.c     |  6 ++++++
>  cpus.c      | 17 +++++++++++++++++
>  migration.c | 10 +++++++++-
>  savevm.c    | 13 +++++++++++++
>  4 files changed, 45 insertions(+), 1 deletion(-)
> 
> diff --git a/block.c b/block.c
> index e78039b..d6e7a59 100644
> --- a/block.c
> +++ b/block.c
> @@ -2269,9 +2269,15 @@ int bdrv_get_flags(BlockDriverState *bs)
>  void bdrv_flush_all(void)
>  {
>      BlockDriverState *bs;
> +    int64_t start_time, end_time;
> +
> +    start_time = qemu_get_clock_ms(rt_clock);
> 
>      QTAILQ_FOREACH(bs, &bdrv_states, list) {
>          bdrv_flush(bs);
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("time flush device %s: %ld\n", bs->filename,
> +               end_time - start_time);
>      }
>  }
> 
> diff --git a/cpus.c b/cpus.c
> index 1b7061a..0ccc1e0 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -439,14 +439,31 @@ int cpu_is_stopped(CPUArchState *env)
> 
>  static void do_vm_stop(RunState state)
>  {
> +    int64_t start_time, end_time;
> +
>      if (runstate_is_running()) {
> +        start_time = qemu_get_clock_ms(rt_clock);
>          cpu_disable_ticks();
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("time cpu_disable_ticks %ld\n", end_time - start_time);
>          pause_all_vcpus();
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("time pause_all_vcpus %ld\n", end_time - start_time);
>          runstate_set(state);
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("time runstate_set %ld\n", end_time - start_time);
>          vm_state_notify(0, state);
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("time vmstate_notify %ld\n", end_time - start_time);
>          bdrv_drain_all();
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("time bdrv_drain_all %ld\n", end_time - start_time);
>          bdrv_flush_all();
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("time bdrv_flush_all %ld\n", end_time - start_time);
>          monitor_protocol_event(QEVENT_STOP, NULL);
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("time monitor_protocol_event %ld\n", end_time - start_time);
>      }
>  }
> 
> diff --git a/migration.c b/migration.c
> index 8054a77..9c62614 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -340,18 +340,24 @@ bool migrate_fd_put_ready(MigrationState *s, uint64_t max_size)
>          DPRINTF("done iterating\n");
>          start_time = qemu_get_clock_ms(rt_clock);
>          qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("wakeup_request %ld\n", end_time - start_time);
>          if (old_vm_running) {
>              vm_stop(RUN_STATE_FINISH_MIGRATE);
>          } else {
>              vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>          }
> -
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("vm_stop 2 %ld\n", end_time - start_time);
>          if (qemu_savevm_state_complete(s->file) < 0) {
>              migrate_fd_error(s);
>          } else {
> +            end_time = qemu_get_clock_ms(rt_clock);
> +            printf("complete without error 3a %ld\n", end_time - start_time);
>              migrate_fd_completed(s);
>          }
>          end_time = qemu_get_clock_ms(rt_clock);
> +        printf("completed %ld\n", end_time - start_time);
>          s->total_time = end_time - s->total_time;
>          s->downtime = end_time - start_time;
>          if (s->state != MIG_STATE_COMPLETED) {
> @@ -359,6 +365,8 @@ bool migrate_fd_put_ready(MigrationState *s, uint64_t max_size)
>                  vm_start();
>              }
>          }
> +        end_time = qemu_get_clock_ms(rt_clock);
> +        printf("end completed stage %ld\n", end_time - start_time);
>          last_round = true;
>      }
>      qemu_mutex_unlock_iothread();
> diff --git a/savevm.c b/savevm.c
> index bdc70c2..ba399f1 100644
> --- a/savevm.c
> +++ b/savevm.c
> @@ -1629,9 +1629,14 @@ int qemu_savevm_state_complete(QEMUFile *f)
>  {
>      SaveStateEntry *se;
>      int ret;
> +    int64_t t1;
> +    int64_t t0 = qemu_get_clock_ms(rt_clock);
> 
>      cpu_synchronize_all_states();
> +    t1 = qemu_get_clock_ms(rt_clock);
> +    printf("synchronize_all_states %ld\n", t1 - t0);
> 
> +    t0 = t1;
>      QTAILQ_FOREACH(se, &savevm_handlers, entry) {
>          if (!se->ops || !se->ops->save_live_complete) {
>              continue;
> @@ -1652,6 +1657,11 @@ int qemu_savevm_state_complete(QEMUFile *f)
>              return ret;
>          }
>      }
> +    t1 = qemu_get_clock_ms(rt_clock);
> +
> +    printf("migrate RAM %ld\n", t1 - t0);
> +
> +    t0 = t1;
> 
>      QTAILQ_FOREACH(se, &savevm_handlers, entry) {
>          int len;
> @@ -1676,6 +1686,9 @@ int qemu_savevm_state_complete(QEMUFile *f)
>          trace_savevm_section_end(se->section_id);
>      }
> 
> +    t1 = qemu_get_clock_ms(rt_clock);
> +
> +    printf("migrate rest devices %ld\n", t1 - t0);
>      qemu_put_byte(f, QEMU_VM_EOF);
> 
>      return qemu_file_get_error(f);
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 06/14] migration: stop all cpus correctly
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 06/14] migration: stop all cpus correctly Juan Quintela
@ 2012-09-21 15:20   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 15:20 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> You can only stop all cpus from the iothread or an vcpu.  As we want
> to do it from the migration_thread, we need to do this dance with the
> botton handlers.
> 
> This patch is a request for ideas.  I can move this function to cpus.c, but
> wondered if there is an easy way of doing this?

Obsolete commit message.

Paolo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 14/14] migration: print times for end phase
  2012-09-21 15:13   ` Paolo Bonzini
@ 2012-09-21 15:27     ` Juan Quintela
  0 siblings, 0 replies; 33+ messages in thread
From: Juan Quintela @ 2012-09-21 15:27 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> Not sure you want these in master. :)
>
> Paolo

/me points to PATCH 00 O;-)

> - last patch just shows printfs to see where the time is being spent
>   on the migration complete phase.
>  (yes it pollutes all uses of stop on the monitor)

There are other users that are having long downtimes (with/without
migration thread).  I just want to know where, and then will start
pointing fingers O;-)

Later, Juan.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 04/14] buffered_file: Move from using a timer to use a thread
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 04/14] buffered_file: Move from using a timer to use a thread Juan Quintela
@ 2012-09-21 15:29   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 15:29 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> We still protect everything except the wait with the iothread lock.
> But we moved from a timer to a thread.  Steps one by one.
> 
> We also need to detect when we have finished with a variable "complete".
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  buffered_file.c | 58 +++++++++++++++++++++++++++++++++++----------------------
>  1 file changed, 36 insertions(+), 22 deletions(-)
> 
> diff --git a/buffered_file.c b/buffered_file.c
> index 318d0f0..61985a9 100644
> --- a/buffered_file.c
> +++ b/buffered_file.c
> @@ -18,6 +18,7 @@
>  #include "qemu-timer.h"
>  #include "qemu-char.h"
>  #include "buffered_file.h"
> +#include "qemu-thread.h"
> 
>  //#define DEBUG_BUFFERED_FILE
> 
> @@ -31,7 +32,8 @@ typedef struct QEMUFileBuffered
>      uint8_t *buffer;
>      size_t buffer_size;
>      size_t buffer_capacity;
> -    QEMUTimer *timer;
> +    QemuThread thread;
> +    bool complete;
>  } QEMUFileBuffered;
> 
>  #ifdef DEBUG_BUFFERED_FILE
> @@ -159,11 +161,8 @@ static int buffered_close(void *opaque)
>      if (ret >= 0) {
>          ret = ret2;
>      }
> -    qemu_del_timer(s->timer);
> -    qemu_free_timer(s->timer);
> -    g_free(s->buffer);
> -    g_free(s);
> -
> +    ret = migrate_fd_close(s->migration_state);
> +    s->complete = true;
>      return ret;
>  }
> 
> @@ -214,23 +213,38 @@ static int64_t buffered_get_rate_limit(void *opaque)
>      return s->xfer_limit;
>  }
> 
> -static void buffered_rate_tick(void *opaque)
> +/* 10ms  xfer_limit is the limit that we should write each 10ms */
> +#define BUFFER_DELAY 100
> +
> +static void *buffered_file_thread(void *opaque)
>  {
>      QEMUFileBuffered *s = opaque;
> +    int64_t expire_time = qemu_get_clock_ms(rt_clock) + BUFFER_DELAY;
> 
> -    if (qemu_file_get_error(s->file)) {
> -        buffered_close(s);
> -        return;
> -    }
> -
> -    qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
> -
> -    if (s->freeze_output)
> -        return;
> -
> -    s->bytes_xfer = 0;
> +    while (true) {
> +        int64_t current_time = qemu_get_clock_ms(rt_clock);
> 
> -    buffered_put_buffer(s, NULL, 0, 0);
> +        if (s->complete) {
> +            break;
> +        }
> +        if (s->freeze_output) {
> +            continue;
> +        }
> +        if (current_time >= expire_time) {
> +            s->bytes_xfer = 0;
> +            expire_time = current_time + BUFFER_DELAY;
> +        }
> +        if (s->bytes_xfer >= s->xfer_limit) {
> +            /* usleep expects microseconds */
> +            usleep((expire_time - current_time)*1000);

usleep is broken under IIRC Solaris (it uses signals and conflicts with
qemu-timer.c).  Please use g_usleep instead.

Bonus points for converting other users (hw/xenfb.c, qemu-ga.c,
tests/test-iov.c).

> +        }
> +        qemu_mutex_lock_iothread();
> +        buffered_put_buffer(s, NULL, 0, 0);
> +        qemu_mutex_unlock_iothread();
> +    }
> +    g_free(s->buffer);
> +    g_free(s);

I think freeing "s" here is incorrect.  Instead, you should "join" the
thread after setting s->complete = true, and move the g_free(s) to
buffered_close.

> +    return NULL;
>  }
> 
>  QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state)
> @@ -241,15 +255,15 @@ QEMUFile *qemu_fopen_ops_buffered(MigrationState *migration_state)
> 
>      s->migration_state = migration_state;
>      s->xfer_limit = migration_state->bandwidth_limit / 10;
> +    s->complete = false;
> 
>      s->file = qemu_fopen_ops(s, buffered_put_buffer, NULL,
>                               buffered_close, buffered_rate_limit,
>                               buffered_set_rate_limit,
>  			     buffered_get_rate_limit);
> 
> -    s->timer = qemu_new_timer_ms(rt_clock, buffered_rate_tick, s);
> -
> -    qemu_mod_timer(s->timer, qemu_get_clock_ms(rt_clock) + 100);
> +    qemu_thread_create(&s->thread, buffered_file_thread, s,
> +                       QEMU_THREAD_DETACHED);

So this needs to become QEMU_THREAD_JOINABLE.

> 
>      return s->file;
>  }
> 

Paolo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 07/14] migration: make writes blocking
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 07/14] migration: make writes blocking Juan Quintela
@ 2012-09-21 15:31   ` Paolo Bonzini
  2012-12-14 12:40     ` Juan Quintela
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 15:31 UTC (permalink / raw)
  To: Juan Quintela; +Cc: Orit Wasserman, qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> -    s->fd = inet_connect(host_port, false, &in_progress, errp);
> +    s->fd = inet_connect(host_port, true, &in_progress, errp);

This makes the connect operation blocking.

Does this mean that Orit's patches for non-blocking connect are not
useful anymore?

Or should we instead leave this as is, and later call socket_set_block?

Paolo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 09/14] migration: take finer locking
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 09/14] migration: take finer locking Juan Quintela
@ 2012-09-21 15:33   ` Paolo Bonzini
  2012-12-14 12:44     ` Juan Quintela
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 15:33 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> Instead of locking the whole migration_thread inside loop, just lock
> migration_fd_put_notify, that is what interacts with the rest of the
> world.

Wrong commit message: just lock migrate_fd_put_ready.

> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  buffered_file.c | 2 --
>  migration.c     | 5 +++++
>  2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/buffered_file.c b/buffered_file.c
> index 2c9859b..22b973f 100644
> --- a/buffered_file.c
> +++ b/buffered_file.c
> @@ -215,9 +215,7 @@ static void *buffered_file_thread(void *opaque)
>              /* usleep expects microseconds */
>              usleep((expire_time - current_time)*1000);
>          }
> -        qemu_mutex_lock_iothread();
>          buffered_put_buffer(s, NULL, 0, 0);
> -        qemu_mutex_unlock_iothread();
>      }
>      g_free(s->buffer);
>      g_free(s);
> diff --git a/migration.c b/migration.c
> index 29ee710..82c2663 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -305,8 +305,10 @@ void migrate_fd_put_ready(MigrationState *s)
>      int ret;
>      static bool first_time = true;
> 
> +    qemu_mutex_lock_iothread();
>      if (s->state != MIG_STATE_ACTIVE) {
>          DPRINTF("put_ready returning because of non-active state\n");
> +        qemu_mutex_unlock_iothread();

Please use a goto instead.

>          return;
>      }
>      if (first_time) {
> @@ -316,6 +318,7 @@ void migrate_fd_put_ready(MigrationState *s)
>          if (ret < 0) {
>              DPRINTF("failed, %d\n", ret);
>              migrate_fd_error(s);
> +            qemu_mutex_unlock_iothread();

Same here.

>              return;
>          }
>      }
> @@ -351,6 +354,8 @@ void migrate_fd_put_ready(MigrationState *s)
>              }
>          }
>      }
> +    qemu_mutex_unlock_iothread();
> +
>  }
> 
>  static void migrate_fd_cancel(MigrationState *s)
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 11/14] buffered_file: don't flush on put buffer
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 11/14] buffered_file: don't flush on put buffer Juan Quintela
@ 2012-09-21 15:34   ` Paolo Bonzini
  2012-09-21 21:12     ` Eric Blake
  0 siblings, 1 reply; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 15:34 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> We call buffered_put_buffer with iothread held, and buffered_flush() does
> synchronous writes.  We only want to do the synchronous writes outside.
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  buffered_file.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/buffered_file.c b/buffered_file.c
> index f4a788d..929b911 100644
> --- a/buffered_file.c
> +++ b/buffered_file.c
> @@ -107,12 +107,6 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
>          buffered_append(s, buf, size);
>      }
> 
> -    error = buffered_flush(s);
> -    if (error < 0) {
> -        DPRINTF("buffered flush error. bailing: %s\n", strerror(-error));
> -        return error;
> -    }
> -
>      return size;
>  }
> 

This means that the buffer can grow to up to s->xfer_limit bytes.
Perhaps you want to make the granularity (currently fixed to 100ms) a
bit smaller, like 20-30 ms, if the bandwidth limit grows above say 100
MB/sec?  This can be done on top of this series however.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 12/14] buffered_file: unfold buffered_append in buffered_put_buffer
  2012-09-21 14:08 ` [Qemu-devel] [PATCH 12/14] buffered_file: unfold buffered_append in buffered_put_buffer Juan Quintela
@ 2012-09-21 17:07   ` Paolo Bonzini
  0 siblings, 0 replies; 33+ messages in thread
From: Paolo Bonzini @ 2012-09-21 17:07 UTC (permalink / raw)
  To: Juan Quintela; +Cc: qemu-devel

Il 21/09/2012 16:08, Juan Quintela ha scritto:
> It was the only user, and now buffered_put_buffer just do the append
> 
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> ---
>  buffered_file.c | 33 ++++++++++++++-------------------
>  1 file changed, 14 insertions(+), 19 deletions(-)
> 
> diff --git a/buffered_file.c b/buffered_file.c
> index 929b911..1b261ec 100644
> --- a/buffered_file.c
> +++ b/buffered_file.c
> @@ -42,22 +42,6 @@ typedef struct QEMUFileBuffered
>      do { } while (0)
>  #endif
> 
> -static void buffered_append(QEMUFileBuffered *s,
> -                            const uint8_t *buf, size_t size)
> -{
> -    if (size > (s->buffer_capacity - s->buffer_size)) {
> -        DPRINTF("increasing buffer capacity from %zu by %zu\n",
> -                s->buffer_capacity, size + 1024);
> -
> -        s->buffer_capacity += size + 1024;
> -
> -        s->buffer = g_realloc(s->buffer, s->buffer_capacity);
> -    }
> -
> -    memcpy(s->buffer + s->buffer_size, buf, size);
> -    s->buffer_size += size;
> -}
> -
>  static int buffered_flush(QEMUFileBuffered *s)
>  {
>      size_t offset = 0;
> @@ -102,11 +86,22 @@ static int buffered_put_buffer(void *opaque, const uint8_t *buf, int64_t pos, in
>          return error;
>      }
> 
> -    if (size > 0) {
> -        DPRINTF("buffering %d bytes\n", size - offset);
> -        buffered_append(s, buf, size);
> +    if (size <= 0) {
> +        return size;
>      }
> 
> +    if (size > (s->buffer_capacity - s->buffer_size)) {
> +        DPRINTF("increasing buffer capacity from %zu by %zu\n",
> +                s->buffer_capacity, size + 1024);
> +
> +        s->buffer_capacity += size + 1024;
> +
> +        s->buffer = g_realloc(s->buffer, s->buffer_capacity);
> +    }
> +
> +    memcpy(s->buffer + s->buffer_size, buf, size);
> +    s->buffer_size += size;
> +
>      return size;
>  }
> 

Ah, finally this arrived as well. :)

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 11/14] buffered_file: don't flush on put buffer
  2012-09-21 15:34   ` Paolo Bonzini
@ 2012-09-21 21:12     ` Eric Blake
  0 siblings, 0 replies; 33+ messages in thread
From: Eric Blake @ 2012-09-21 21:12 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel, Juan Quintela

[-- Attachment #1: Type: text/plain, Size: 796 bytes --]

On 09/21/2012 09:34 AM, Paolo Bonzini wrote:
> Il 21/09/2012 16:08, Juan Quintela ha scritto:
>> We call buffered_put_buffer with iothread held, and buffered_flush() does
>> synchronous writes.  We only want to do the synchronous writes outside.
>>

> 
> This means that the buffer can grow to up to s->xfer_limit bytes.
> Perhaps you want to make the granularity (currently fixed to 100ms) a
> bit smaller, like 20-30 ms, if the bandwidth limit grows above say 100
> MB/sec?  This can be done on top of this series however.

Note that libvirt intentionally tries to raise the bandwidth as high as
possible in cases where the user did not request a particular bandwidth.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 617 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 07/14] migration: make writes blocking
  2012-09-21 15:31   ` Paolo Bonzini
@ 2012-12-14 12:40     ` Juan Quintela
  0 siblings, 0 replies; 33+ messages in thread
From: Juan Quintela @ 2012-12-14 12:40 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Orit Wasserman, qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 21/09/2012 16:08, Juan Quintela ha scritto:
>> -    s->fd = inet_connect(host_port, false, &in_progress, errp);
>> +    s->fd = inet_connect(host_port, true, &in_progress, errp);
>
> This makes the connect operation blocking.
>
> Does this mean that Orit's patches for non-blocking connect are not
> useful anymore?
>
> Or should we instead leave this as is, and later call socket_set_block?

fixed calling set-block.  Easier that way.

Later, Juan.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [Qemu-devel] [PATCH 09/14] migration: take finer locking
  2012-09-21 15:33   ` Paolo Bonzini
@ 2012-12-14 12:44     ` Juan Quintela
  0 siblings, 0 replies; 33+ messages in thread
From: Juan Quintela @ 2012-12-14 12:44 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: qemu-devel

Paolo Bonzini <pbonzini@redhat.com> wrote:
> Il 21/09/2012 16:08, Juan Quintela ha scritto:
>> Instead of locking the whole migration_thread inside loop, just lock
>> migration_fd_put_notify, that is what interacts with the rest of the
>> world.
>
> Wrong commit message: just lock migrate_fd_put_ready.

Fixed.

>> @@ -305,8 +305,10 @@ void migrate_fd_put_ready(MigrationState *s)
>>      int ret;
>>      static bool first_time = true;
>> 
>> +    qemu_mutex_lock_iothread();
>>      if (s->state != MIG_STATE_ACTIVE) {
>>          DPRINTF("put_ready returning because of non-active state\n");
>> +        qemu_mutex_unlock_iothread();
>
> Please use a goto instead.
>
>>          return;
>>      }
>>      if (first_time) {
>> @@ -316,6 +318,7 @@ void migrate_fd_put_ready(MigrationState *s)
>>          if (ret < 0) {
>>              DPRINTF("failed, %d\n", ret);
>>              migrate_fd_error(s);
>> +            qemu_mutex_unlock_iothread();
>
> Same here.
>

Don't work.  The last branch of the code drops the lock before the end.

Code is:

lock()
while() {
  if (foo) {
     ...
     unlock()
     break;
  }
  if (bar) {
     ...
     unlock()
     break;
  }
  unlock()
  ... /* this is the interesting bit */
}
.... /* more stuff needed both sides */


adding a goto would not help making things clearer.



>>              return;
>>          }
>>      }
>> @@ -351,6 +354,8 @@ void migrate_fd_put_ready(MigrationState *s)
>>              }
>>          }
>>      }
>> +    qemu_mutex_unlock_iothread();
>> +
>>  }
>> 
>>  static void migrate_fd_cancel(MigrationState *s)
>> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2012-12-14 12:45 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-09-21 14:08 [Qemu-devel] [RFC 00/14] Migration thread Juan Quintela
2012-09-21 14:08 ` [Qemu-devel] [PATCH 01/14] split MRU ram list Juan Quintela
2012-09-21 14:38   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 02/14] add a version number to ram_list Juan Quintela
2012-09-21 14:42   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 03/14] protect the ramlist with a separate mutex Juan Quintela
2012-09-21 14:44   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 04/14] buffered_file: Move from using a timer to use a thread Juan Quintela
2012-09-21 15:29   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 05/14] migration: make qemu_fopen_ops_buffered() return void Juan Quintela
2012-09-21 14:57   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 06/14] migration: stop all cpus correctly Juan Quintela
2012-09-21 15:20   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 07/14] migration: make writes blocking Juan Quintela
2012-09-21 15:31   ` Paolo Bonzini
2012-12-14 12:40     ` Juan Quintela
2012-09-21 14:08 ` [Qemu-devel] [PATCH 08/14] migration: remove unfreeze logic Juan Quintela
2012-09-21 14:58   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 09/14] migration: take finer locking Juan Quintela
2012-09-21 15:33   ` Paolo Bonzini
2012-12-14 12:44     ` Juan Quintela
2012-09-21 14:08 ` [Qemu-devel] [PATCH 10/14] buffered_file: Unfold the trick to restart generating migration data Juan Quintela
2012-09-21 15:00   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 11/14] buffered_file: don't flush on put buffer Juan Quintela
2012-09-21 15:34   ` Paolo Bonzini
2012-09-21 21:12     ` Eric Blake
2012-09-21 14:08 ` [Qemu-devel] [PATCH 12/14] buffered_file: unfold buffered_append in buffered_put_buffer Juan Quintela
2012-09-21 17:07   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 13/14] savevm: New save live migration method: pending Juan Quintela
2012-09-21 15:11   ` Paolo Bonzini
2012-09-21 14:08 ` [Qemu-devel] [PATCH 14/14] migration: print times for end phase Juan Quintela
2012-09-21 15:13   ` Paolo Bonzini
2012-09-21 15:27     ` Juan Quintela

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).