* [PATCH v1 0/7] migration: auto-converge refinements for huge VM
@ 2024-09-15 16:08 Hyman Huang
2024-09-15 16:08 ` [PATCH v1 1/7] migration: Introduce structs for background sync Hyman Huang
` (6 more replies)
0 siblings, 7 replies; 28+ messages in thread
From: Hyman Huang @ 2024-09-15 16:08 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini,
yong.huang
This is the first version of the auto-converge refinements; refer to
the following link for details about the RFC version:
https://patchew.org/QEMU/cover.1725891841.git.yong.huang@smartx.com/
This series introduces two refinements: "background sync" and
"responsive throttle".
1. background sync:
The original auto-converge throttle logic does not scale well because
migration_trigger_throttle() is only called once per iteration, so it
may not be invoked for a long time if a single iteration takes a long
time.
Background sync fixes this issue by synchronizing the dirty bitmap in
the background and triggering the throttle automatically once it
detects that an iteration has lasted a long time during the migration.
Background sync is implemented transparently; no new interface is
exposed to upper-layer applications.
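Conceptually, the watcher added in patch 4 boils down to the following
check, run periodically in a helper thread (an illustrative sketch with
simplified names; see patch 4 for the real logic):

    /* roughly once per second, while the CPU throttle is below max */
    if (iteration_count_unchanged &&
        now_ms - time_last_bitmap_sync > 5 * 1000 /* 5s cap */) {
        /* sync the dirty bitmap in background mode, which also
         * feeds the auto-converge throttle logic */
        migration_bitmap_sync(rs, false, true);
    }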
2. responsive throttle:
The original auto-converge throttle logic determines whether the
migration converges using a single criterion; if that criterion is met
twice in a row, it starts the CPU throttle or increases the throttle
percentage. Again, this means migration_trigger_throttle() may not be
invoked for a long time if a single iteration takes a long time.
The responsive throttle introduces one more criterion to assist in
detecting migration convergence; if either of the two criteria is met,
migration_trigger_throttle() is called. This also makes it more likely
that the CPU throttle will be activated, thereby accelerating the
migration process.
The responsive throttle provides the 'cpu-responsive-throttle' option
to enable this feature.
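For example, once this series is applied, the feature could be enabled
via QMP before starting the migration (a sketch based on the QAPI
change in patch 6):

    { "execute": "migrate-set-parameters",
      "arguments": { "cpu-responsive-throttle": true } }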
We tested these two features in the following environment:
a. Test tool:
guestperf
Refer to the following link to see details:
https://github.com/qemu/qemu/tree/master/tests/migration/guestperf
b. Test VM scale:
CPU: 16; Memory: 100GB
c. Average bandwidth between source and destination for migration:
1.59 Gbits/sec
We use the stress tool contained in initrd-stress.img to dirty
ramsize MB of memory on every vCPU in the guest; refer to the
following link for the source code:
https://github.com/qemu/qemu/blob/master/tests/migration/stress.c
The following command is executed to compare our refined QEMU with the
original QEMU:
# python3.6 guestperf.py --binary /path/to/qemu-kvm --cpus 16 \
--mem 100 --max-iters 200 --max-time 1200 --dst-host {dst_ip} \
--kernel /path/to/vmlinuz --initrd /path/to/initrd-stress.img \
--transport tcp --downtime 500 --auto-converge --auto-converge-step 10 \
--verbose --stress-mem {ramsize}
We set ramsize to 150MB to simulate a light load, 3000MB a moderate
load, and 5000MB a heavy load. Test cases were executed three times in
each scenario.
The following data shows the migration test results as the stress
increases.
ramsize: 150MB
|------------+-----------+----------+-----------+--------------|
| | totaltime | downtime | iteration | max throttle |
| | (ms) | (ms) | count | percent |
|------------+-----------+----------+-----------+--------------|
| original | 123685 | 490 | 87 | 99% |
| | 116249 | 542 | 45 | 60% |
| | 107772 | 587 | 8 | 0% |
|------------+-----------+----------+-----------+--------------|
| background | 113744 | 1654 | 16 | 20% |
| sync | 122623 | 758 | 60 | 80% |
| | 112668 | 547 | 23 | 20% |
|------------+-----------+----------+-----------+--------------|
| background | 113660 | 573 | 5 | 0% |
| sync + | 109357 | 576 | 6 | 0% |
| responsive | 126792 | 494 | 37 | 99% |
| throttle | | | | |
|------------+-----------+----------+-----------+--------------|
ramsize: 3000MB
|------------+-----------+----------+-----------+--------------|
| | totaltime | downtime | iteration | max throttle |
| | (ms) | (ms) | count | percent |
|------------+-----------+----------+-----------+--------------|
| original | 404398 | 515 | 26 | 99% |
| | 392552 | 528 | 25 | 99% |
| | 400113 | 447 | 24 | 99% |
|------------+-----------+----------+-----------+--------------|
| background | 239151 | 681 | 25 | 99% |
| sync | 295047 | 587 | 41 | 99% |
| | 289936 | 681 | 34 | 99% |
|------------+-----------+----------+-----------+--------------|
| background | 212786 | 487 | 22 | 99% |
| sync + | 225246 | 666 | 23 | 99% |
| responsive | 244053 | 572 | 27 | 99% |
| throttle | | | | |
|------------+-----------+----------+-----------+--------------|
ramsize: 5000MB
|------------+-----------+----------+-----------+--------------|
| | totaltime | downtime | iteration | max throttle |
| | (ms) | (ms) | count | percent |
|------------+-----------+----------+-----------+--------------|
| original | 566357 | 644 | 22 | 99% |
| | 607471 | 320 | 23 | 99% |
| | 603136 | 417 | 22 | 99% |
|------------+-----------+----------+-----------+--------------|
| background | 284605 | 793 | 27 | 99% |
| sync | 272270 | 668 | 28 | 99% |
| | 267543 | 545 | 28 | 99% |
|------------+-----------+----------+-----------+--------------|
| background | 226446 | 413 | 22 | 99% |
| sync + | 232082 | 494 | 23 | 99% |
| responsive | 269863 | 533 | 23 | 99% |
| throttle | | | | |
|------------+-----------+----------+-----------+--------------|
To summarize the data above: no result indicates a negative
optimization; the refinements reduce the total migration time
significantly and therefore shorten the duration of the guest
performance degradation.
Additionally, we examined the memory performance curves generated from
the test case results above; there is no negative optimization there
either, and the performance degradation sets in more quickly. Since it
is inconvenient to display the graphical data here, readers can verify
it independently.
Please review; any comments and advice are appreciated.
Thanks,
Yong
Hyman Huang (7):
migration: Introduce structs for background sync
migration: Refine util functions to support background sync
qapi/migration: Introduce the iteration-count
migration: Implement background sync watcher
migration: Support background dirty bitmap sync and throttle
qapi/migration: Introduce cpu-responsive-throttle parameter
migration: Support responsive CPU throttle
include/exec/ram_addr.h | 107 ++++++++++++++-
include/exec/ramblock.h | 45 +++++++
migration/migration-hmp-cmds.c | 8 ++
migration/migration-stats.h | 4 +
migration/migration.c | 13 ++
migration/options.c | 20 +++
migration/options.h | 1 +
migration/ram.c | 236 ++++++++++++++++++++++++++++++---
migration/ram.h | 3 +
migration/trace-events | 4 +
qapi/migration.json | 22 ++-
tests/qtest/migration-test.c | 40 ++++++
12 files changed, 479 insertions(+), 24 deletions(-)
--
2.39.1
^ permalink raw reply [flat|nested] 28+ messages in thread
* [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-15 16:08 [PATCH v1 0/7] migration: auto-converge refinements for huge VM Hyman Huang
@ 2024-09-15 16:08 ` Hyman Huang
2024-09-16 21:11 ` Fabiano Rosas
2024-09-15 16:08 ` [PATCH v1 2/7] migration: Refine util functions to support " Hyman Huang
` (5 subsequent siblings)
6 siblings, 1 reply; 28+ messages in thread
From: Hyman Huang @ 2024-09-15 16:08 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini,
yong.huang
shadow_bmap, iter_bmap and iter_dirty_pages are introduced
to support background sync.
Also introduce an enumeration of the sync modes.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
include/exec/ramblock.h | 45 +++++++++++++++++++++++++++++++++++++++++
migration/ram.c | 6 ++++++
2 files changed, 51 insertions(+)
diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
index 0babd105c0..0e327bc0ae 100644
--- a/include/exec/ramblock.h
+++ b/include/exec/ramblock.h
@@ -24,6 +24,30 @@
#include "qemu/rcu.h"
#include "exec/ramlist.h"
+/* Possible bits for cpu_physical_memory_sync_dirty_bitmap */
+
+/*
+ * The old-fashioned sync, used for both CPU throttle and
+ * memory transfer.
+ */
+#define RAMBLOCK_SYN_LEGACY_ITER (1U << 0)
+
+/*
+ * The modern sync, used for both CPU throttle and
+ * memory transfer.
+ */
+#define RAMBLOCK_SYN_MODERN_ITER (1U << 1)
+
+/* The modern sync, which is used for CPU throttle only */
+#define RAMBLOCK_SYN_MODERN_BACKGROUND (1U << 2)
+
+#define RAMBLOCK_SYN_MASK (0x7)
+
+typedef enum RAMBlockSynMode {
+ RAMBLOCK_SYN_LEGACY, /* Old-fashioned mode */
+ RAMBLOCK_SYN_MODERN, /* Background-sync-supported mode */
+} RAMBlockSynMode;
+
struct RAMBlock {
struct rcu_head rcu;
struct MemoryRegion *mr;
@@ -89,6 +113,27 @@ struct RAMBlock {
* could not have been valid on the source.
*/
ram_addr_t postcopy_length;
+
+ /*
+ * Used to back up the bmap during background sync to see whether any
+ * dirty pages were sent during that time.
+ */
+ unsigned long *shadow_bmap;
+
+ /*
+ * The bitmap "bmap," which was initially used for both sync and memory
+ * transfer, will be replaced by two bitmaps: the previously used "bmap"
+ * and the recently added "iter_bmap." Only the memory transfer is
+ * conducted with the previously used "bmap"; the recently added
+ * "iter_bmap" is utilized for dirty bitmap sync.
+ */
+ unsigned long *iter_bmap;
+
+ /* Number of new dirty pages during iteration */
+ uint64_t iter_dirty_pages;
+
+ /* Whether a background sync has occurred during this iteration */
+ bool background_sync_shown_up;
};
#endif
#endif
diff --git a/migration/ram.c b/migration/ram.c
index 67ca3d5d51..f29faa82d6 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2362,6 +2362,10 @@ static void ram_bitmaps_destroy(void)
block->bmap = NULL;
g_free(block->file_bmap);
block->file_bmap = NULL;
+ g_free(block->shadow_bmap);
+ block->shadow_bmap = NULL;
+ g_free(block->iter_bmap);
+ block->iter_bmap = NULL;
}
}
@@ -2753,6 +2757,8 @@ static void ram_list_init_bitmaps(void)
}
block->clear_bmap_shift = shift;
block->clear_bmap = bitmap_new(clear_bmap_size(pages, shift));
+ block->shadow_bmap = bitmap_new(pages);
+ block->iter_bmap = bitmap_new(pages);
}
}
}
--
2.39.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v1 2/7] migration: Refine util functions to support background sync
2024-09-15 16:08 [PATCH v1 0/7] migration: auto-converge refinements for huge VM Hyman Huang
2024-09-15 16:08 ` [PATCH v1 1/7] migration: Introduce structs for background sync Hyman Huang
@ 2024-09-15 16:08 ` Hyman Huang
2024-09-15 16:08 ` [PATCH v1 3/7] qapi/migration: Introduce the iteration-count Hyman Huang
` (4 subsequent siblings)
6 siblings, 0 replies; 28+ messages in thread
From: Hyman Huang @ 2024-09-15 16:08 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini,
yong.huang
Add a background argument to the migration_bitmap_sync function.
Introduce the sync_mode global variable to track the sync mode and
to support background sync while keeping backward compatibility.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
include/exec/ram_addr.h | 107 +++++++++++++++++++++++++++++++++++++---
migration/ram.c | 53 ++++++++++++++++----
2 files changed, 144 insertions(+), 16 deletions(-)
diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 891c44cf2d..d0d123ac60 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -472,17 +472,68 @@ static inline void cpu_physical_memory_clear_dirty_range(ram_addr_t start,
cpu_physical_memory_test_and_clear_dirty(start, length, DIRTY_MEMORY_CODE);
}
+static void ramblock_clear_iter_bmap(RAMBlock *rb,
+ ram_addr_t start,
+ ram_addr_t length)
+{
+ ram_addr_t addr;
+ unsigned long *bmap = rb->bmap;
+ unsigned long *shadow_bmap = rb->shadow_bmap;
+ unsigned long *iter_bmap = rb->iter_bmap;
+
+ for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+ long k = (start + addr) >> TARGET_PAGE_BITS;
+ if (test_bit(k, shadow_bmap) && !test_bit(k, bmap)) {
+ /* Page has been sent, clear the iter bmap */
+ clear_bit(k, iter_bmap);
+ }
+ }
+}
+
+static void ramblock_update_iter_bmap(RAMBlock *rb,
+ ram_addr_t start,
+ ram_addr_t length)
+{
+ ram_addr_t addr;
+ unsigned long *bmap = rb->bmap;
+ unsigned long *iter_bmap = rb->iter_bmap;
+
+ for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
+ long k = (start + addr) >> TARGET_PAGE_BITS;
+ if (test_bit(k, iter_bmap)) {
+ if (!test_bit(k, bmap)) {
+ set_bit(k, bmap);
+ rb->iter_dirty_pages++;
+ }
+ }
+ }
+}
/* Called with RCU critical section */
static inline
uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
ram_addr_t start,
- ram_addr_t length)
+ ram_addr_t length,
+ unsigned int flag)
{
ram_addr_t addr;
unsigned long word = BIT_WORD((start + rb->offset) >> TARGET_PAGE_BITS);
uint64_t num_dirty = 0;
unsigned long *dest = rb->bmap;
+ unsigned long *shadow_bmap = rb->shadow_bmap;
+ unsigned long *iter_bmap = rb->iter_bmap;
+
+ assert(flag && !(flag & (~RAMBLOCK_SYN_MASK)));
+
+ /*
+ * We must remove already-sent dirty pages from the iter_bmap in
+ * order to minimize redundant page transfers if a background sync
+ * has occurred during this iteration.
+ */
+ if (rb->background_sync_shown_up &&
+ (flag & (RAMBLOCK_SYN_MODERN_ITER | RAMBLOCK_SYN_MODERN_BACKGROUND))) {
+ ramblock_clear_iter_bmap(rb, start, length);
+ }
/* start address and length is aligned at the start of a word? */
if (((word * BITS_PER_LONG) << TARGET_PAGE_BITS) ==
@@ -503,8 +554,20 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
if (src[idx][offset]) {
unsigned long bits = qatomic_xchg(&src[idx][offset], 0);
unsigned long new_dirty;
+ if (flag & (RAMBLOCK_SYN_MODERN_ITER |
+ RAMBLOCK_SYN_MODERN_BACKGROUND)) {
+ /* Back up bmap for the next iteration */
+ iter_bmap[k] |= bits;
+ if (flag == RAMBLOCK_SYN_MODERN_BACKGROUND) {
+ /* Back up bmap to detect pages that have been sent */
+ shadow_bmap[k] = dest[k];
+ }
+ }
new_dirty = ~dest[k];
- dest[k] |= bits;
+ if (flag == RAMBLOCK_SYN_LEGACY_ITER) {
+ dest[k] |= bits;
+ }
+
new_dirty &= bits;
num_dirty += ctpopl(new_dirty);
}
@@ -534,18 +597,50 @@ uint64_t cpu_physical_memory_sync_dirty_bitmap(RAMBlock *rb,
ram_addr_t offset = rb->offset;
for (addr = 0; addr < length; addr += TARGET_PAGE_SIZE) {
- if (cpu_physical_memory_test_and_clear_dirty(
+ bool dirty = false;
+ long k = (start + addr) >> TARGET_PAGE_BITS;
+ if (flag == RAMBLOCK_SYN_MODERN_BACKGROUND) {
+ if (test_bit(k, dest)) {
+ /* Back up bmap to detect pages that have been sent */
+ set_bit(k, shadow_bmap);
+ }
+ }
+
+ dirty = cpu_physical_memory_test_and_clear_dirty(
start + addr + offset,
TARGET_PAGE_SIZE,
- DIRTY_MEMORY_MIGRATION)) {
- long k = (start + addr) >> TARGET_PAGE_BITS;
- if (!test_and_set_bit(k, dest)) {
+ DIRTY_MEMORY_MIGRATION);
+
+ if (flag == RAMBLOCK_SYN_LEGACY_ITER) {
+ if (dirty && !test_and_set_bit(k, dest)) {
num_dirty++;
}
+ } else {
+ if (dirty) {
+ if (!test_bit(k, dest)) {
+ num_dirty++;
+ }
+ /* Back up bmap for the next iteration */
+ set_bit(k, iter_bmap);
+ }
}
}
}
+ /*
+ * We have to re-fetch dirty pages from the iter_bmap one by one.
+ * It's possible that not all of the dirty pages meant to be sent
+ * in the current iteration are included in the bitmap that the
+ * current sync retrieved from KVM.
+ */
+ if (flag == RAMBLOCK_SYN_MODERN_ITER) {
+ ramblock_update_iter_bmap(rb, start, length);
+ }
+
+ if (flag == RAMBLOCK_SYN_MODERN_BACKGROUND) {
+ rb->background_sync_shown_up = true;
+ }
+
return num_dirty;
}
#endif
diff --git a/migration/ram.c b/migration/ram.c
index f29faa82d6..e205806a5f 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -112,6 +112,8 @@
XBZRLECacheStats xbzrle_counters;
+static RAMBlockSynMode sync_mode = RAMBLOCK_SYN_LEGACY;
+
/* used by the search for pages to send */
struct PageSearchStatus {
/* The migration channel used for a specific host page */
@@ -912,13 +914,42 @@ bool ramblock_page_is_discarded(RAMBlock *rb, ram_addr_t start)
return false;
}
+static void ramblock_reset_iter_stats(RAMBlock *rb)
+{
+ bitmap_clear(rb->shadow_bmap, 0, rb->used_length >> TARGET_PAGE_BITS);
+ bitmap_clear(rb->iter_bmap, 0, rb->used_length >> TARGET_PAGE_BITS);
+ rb->iter_dirty_pages = 0;
+ rb->background_sync_shown_up = false;
+}
+
/* Called with RCU critical section */
-static void ramblock_sync_dirty_bitmap(RAMState *rs, RAMBlock *rb)
+static void ramblock_sync_dirty_bitmap(RAMState *rs,
+ RAMBlock *rb,
+ bool background)
{
- uint64_t new_dirty_pages =
- cpu_physical_memory_sync_dirty_bitmap(rb, 0, rb->used_length);
+ uint64_t new_dirty_pages;
+ unsigned int flag = RAMBLOCK_SYN_LEGACY_ITER;
+
+ if (sync_mode == RAMBLOCK_SYN_MODERN) {
+ if (background) {
+ flag = RAMBLOCK_SYN_MODERN_BACKGROUND;
+ } else {
+ flag = RAMBLOCK_SYN_MODERN_ITER;
+ }
+ }
+
+ new_dirty_pages =
+ cpu_physical_memory_sync_dirty_bitmap(rb, 0, rb->used_length, flag);
+
+ if (flag & (RAMBLOCK_SYN_LEGACY_ITER | RAMBLOCK_SYN_MODERN_ITER)) {
+ if (flag == RAMBLOCK_SYN_LEGACY_ITER) {
+ rs->migration_dirty_pages += new_dirty_pages;
+ } else {
+ rs->migration_dirty_pages += rb->iter_dirty_pages;
+ ramblock_reset_iter_stats(rb);
+ }
+ }
- rs->migration_dirty_pages += new_dirty_pages;
rs->num_dirty_pages_period += new_dirty_pages;
}
@@ -1041,7 +1072,9 @@ static void migration_trigger_throttle(RAMState *rs)
}
}
-static void migration_bitmap_sync(RAMState *rs, bool last_stage)
+static void migration_bitmap_sync(RAMState *rs,
+ bool last_stage,
+ bool background)
{
RAMBlock *block;
int64_t end_time;
@@ -1058,7 +1091,7 @@ static void migration_bitmap_sync(RAMState *rs, bool last_stage)
WITH_QEMU_LOCK_GUARD(&rs->bitmap_mutex) {
WITH_RCU_READ_LOCK_GUARD() {
RAMBLOCK_FOREACH_NOT_IGNORED(block) {
- ramblock_sync_dirty_bitmap(rs, block);
+ ramblock_sync_dirty_bitmap(rs, block, background);
}
stat64_set(&mig_stats.dirty_bytes_last_sync, ram_bytes_remaining());
}
@@ -1101,7 +1134,7 @@ static void migration_bitmap_sync_precopy(RAMState *rs, bool last_stage)
local_err = NULL;
}
- migration_bitmap_sync(rs, last_stage);
+ migration_bitmap_sync(rs, last_stage, false);
if (precopy_notify(PRECOPY_NOTIFY_AFTER_BITMAP_SYNC, &local_err)) {
error_report_err(local_err);
@@ -2594,7 +2627,7 @@ void ram_postcopy_send_discard_bitmap(MigrationState *ms)
RCU_READ_LOCK_GUARD();
/* This should be our last sync, the src is now paused */
- migration_bitmap_sync(rs, false);
+ migration_bitmap_sync(rs, false, false);
/* Easiest way to make sure we don't resume in the middle of a host-page */
rs->pss[RAM_CHANNEL_PRECOPY].last_sent_block = NULL;
@@ -3581,7 +3614,7 @@ void colo_incoming_start_dirty_log(void)
memory_global_dirty_log_sync(false);
WITH_RCU_READ_LOCK_GUARD() {
RAMBLOCK_FOREACH_NOT_IGNORED(block) {
- ramblock_sync_dirty_bitmap(ram_state, block);
+ ramblock_sync_dirty_bitmap(ram_state, block, false);
/* Discard this dirty bitmap record */
bitmap_zero(block->bmap, block->max_length >> TARGET_PAGE_BITS);
}
@@ -3862,7 +3895,7 @@ void colo_flush_ram_cache(void)
qemu_mutex_lock(&ram_state->bitmap_mutex);
WITH_RCU_READ_LOCK_GUARD() {
RAMBLOCK_FOREACH_NOT_IGNORED(block) {
- ramblock_sync_dirty_bitmap(ram_state, block);
+ ramblock_sync_dirty_bitmap(ram_state, block, false);
}
}
--
2.39.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v1 3/7] qapi/migration: Introduce the iteration-count
2024-09-15 16:08 [PATCH v1 0/7] migration: auto-converge refinements for huge VM Hyman Huang
2024-09-15 16:08 ` [PATCH v1 1/7] migration: Introduce structs for background sync Hyman Huang
2024-09-15 16:08 ` [PATCH v1 2/7] migration: Refine util functions to support " Hyman Huang
@ 2024-09-15 16:08 ` Hyman Huang
2024-09-16 20:35 ` Fabiano Rosas
2024-09-15 16:08 ` [PATCH v1 4/7] migration: Implement background sync watcher Hyman Huang
` (3 subsequent siblings)
6 siblings, 1 reply; 28+ messages in thread
From: Hyman Huang @ 2024-09-15 16:08 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini,
yong.huang
The original migration statistic dirty-sync-count can no longer
reflect the iteration count due to the introduction of background
synchronization in the next commit; add iteration-count to
compensate.
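As an illustration (with hypothetical values), after this change both
counters appear under the "ram" section of query-migrate and may
diverge once background syncs occur:

    { "execute": "query-migrate" }
    { "return": { "ram": { "iteration-count": 12,
                           "dirty-sync-count": 15, ... } } }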
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
migration/migration-stats.h | 4 ++++
migration/migration.c | 1 +
migration/ram.c | 12 ++++++++----
qapi/migration.json | 6 +++++-
tests/qtest/migration-test.c | 2 +-
5 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/migration/migration-stats.h b/migration/migration-stats.h
index 05290ade76..43ee0f4f05 100644
--- a/migration/migration-stats.h
+++ b/migration/migration-stats.h
@@ -50,6 +50,10 @@ typedef struct {
* Number of times we have synchronized guest bitmaps.
*/
Stat64 dirty_sync_count;
+ /*
+ * Number of migration iterations processed.
+ */
+ Stat64 iteration_count;
/*
* Number of times zero copy failed to send any page using zero
* copy.
diff --git a/migration/migration.c b/migration/migration.c
index 3dea06d577..055d527ff6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1197,6 +1197,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
info->ram->mbps = s->mbps;
info->ram->dirty_sync_count =
stat64_get(&mig_stats.dirty_sync_count);
+ info->ram->iteration_count = stat64_get(&mig_stats.iteration_count);
info->ram->dirty_sync_missed_zero_copy =
stat64_get(&mig_stats.dirty_sync_missed_zero_copy);
info->ram->postcopy_requests =
diff --git a/migration/ram.c b/migration/ram.c
index e205806a5f..ca5a1b5f16 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -594,7 +594,7 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
/* We don't care if this fails to allocate a new cache page
* as long as it updated an old one */
cache_insert(XBZRLE.cache, current_addr, XBZRLE.zero_target_page,
- stat64_get(&mig_stats.dirty_sync_count));
+ stat64_get(&mig_stats.iteration_count));
}
#define ENCODING_FLAG_XBZRLE 0x1
@@ -620,7 +620,7 @@ static int save_xbzrle_page(RAMState *rs, PageSearchStatus *pss,
int encoded_len = 0, bytes_xbzrle;
uint8_t *prev_cached_page;
QEMUFile *file = pss->pss_channel;
- uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
+ uint64_t generation = stat64_get(&mig_stats.iteration_count);
if (!cache_is_cached(XBZRLE.cache, current_addr, generation)) {
xbzrle_counters.cache_miss++;
@@ -1079,6 +1079,10 @@ static void migration_bitmap_sync(RAMState *rs,
RAMBlock *block;
int64_t end_time;
+ if (!background) {
+ stat64_add(&mig_stats.iteration_count, 1);
+ }
+
stat64_add(&mig_stats.dirty_sync_count, 1);
if (!rs->time_last_bitmap_sync) {
@@ -1115,8 +1119,8 @@ static void migration_bitmap_sync(RAMState *rs,
rs->num_dirty_pages_period = 0;
rs->bytes_xfer_prev = migration_transferred_bytes();
}
- if (migrate_events()) {
- uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
+ if (!background && migrate_events()) {
+ uint64_t generation = stat64_get(&mig_stats.iteration_count);
qapi_event_send_migration_pass(generation);
}
}
diff --git a/qapi/migration.json b/qapi/migration.json
index b66cccf107..95b490706c 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -60,6 +60,9 @@
# between 0 and @dirty-sync-count * @multifd-channels. (since
# 7.1)
#
+# @iteration-count: The number of iterations since migration started.
+# (since 9.2)
+#
# Since: 0.14
##
{ 'struct': 'MigrationStats',
@@ -72,7 +75,8 @@
'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
'postcopy-bytes': 'uint64',
- 'dirty-sync-missed-zero-copy': 'uint64' } }
+ 'dirty-sync-missed-zero-copy': 'uint64',
+ 'iteration-count': 'int' } }
##
# @XBZRLECacheStats:
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index d6768d5d71..b796a90cad 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -278,7 +278,7 @@ static int64_t read_migrate_property_int(QTestState *who, const char *property)
static uint64_t get_migration_pass(QTestState *who)
{
- return read_ram_property_int(who, "dirty-sync-count");
+ return read_ram_property_int(who, "iteration-count");
}
static void read_blocktime(QTestState *who)
--
2.39.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v1 4/7] migration: Implement background sync watcher
2024-09-15 16:08 [PATCH v1 0/7] migration: auto-converge refinements for huge VM Hyman Huang
` (2 preceding siblings ...)
2024-09-15 16:08 ` [PATCH v1 3/7] qapi/migration: Introduce the iteration-count Hyman Huang
@ 2024-09-15 16:08 ` Hyman Huang
2024-09-15 16:08 ` [PATCH v1 5/7] migration: Support background dirty bitmap sync and throttle Hyman Huang
` (2 subsequent siblings)
6 siblings, 0 replies; 28+ messages in thread
From: Hyman Huang @ 2024-09-15 16:08 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini,
yong.huang
The background sync watcher detects whether the current iteration
has lasted a long time; if so, it triggers a background sync.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
migration/ram.c | 110 +++++++++++++++++++++++++++++++++++++++++
migration/ram.h | 3 ++
migration/trace-events | 3 ++
3 files changed, 116 insertions(+)
diff --git a/migration/ram.c b/migration/ram.c
index ca5a1b5f16..799eaa0382 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -416,6 +416,11 @@ struct RAMState {
* RAM migration.
*/
unsigned int postcopy_bmap_sync_requested;
+
+ /* Background throttle information */
+ bool background_sync_running;
+ QemuThread background_sync_thread;
+ QemuSemaphore quit_sem;
};
typedef struct RAMState RAMState;
@@ -1125,6 +1130,111 @@ static void migration_bitmap_sync(RAMState *rs,
}
}
+/*
+ * An iteration lasting more than five seconds is undesirable;
+ * launch a background dirty bitmap sync in that case.
+ */
+#define MIGRATION_MAX_ITERATION_DURATION 5
+
+static void *migration_background_sync_watcher(void *opaque)
+{
+ RAMState *rs = opaque;
+ uint64_t iter_cnt, prev_iter_cnt = 2;
+ bool iter_cnt_unchanged = false;
+ int max_pct = migrate_max_cpu_throttle();
+
+ trace_migration_background_sync_watcher_start();
+ rcu_register_thread();
+
+ while (qatomic_read(&rs->background_sync_running)) {
+ int cur_pct = cpu_throttle_get_percentage();
+ if ((cur_pct == max_pct) || (!migration_is_active())) {
+ break;
+ }
+
+ if (qemu_sem_timedwait(&rs->quit_sem, 1000) == 0) {
+ /* We were woken by background_sync_cleanup, quit */
+ break;
+ }
+
+ /*
+ * The first iteration copies all memory anyhow and has no
+ * effect on guest performance, so skip it to avoid paying
+ * the extra sync penalty.
+ */
+ iter_cnt = stat64_get(&mig_stats.iteration_count);
+ if (iter_cnt <= 1) {
+ continue;
+ }
+
+ iter_cnt_unchanged = (iter_cnt == prev_iter_cnt);
+ prev_iter_cnt = iter_cnt;
+
+ if (iter_cnt_unchanged) {
+ int64_t curr_time, iter_duration;
+
+ curr_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+ iter_duration = curr_time - rs->time_last_bitmap_sync;
+
+ if (iter_duration >
+ MIGRATION_MAX_ITERATION_DURATION * 1000) {
+ sync_mode = RAMBLOCK_SYN_MODERN;
+ bql_lock();
+ trace_migration_background_sync();
+ WITH_RCU_READ_LOCK_GUARD() {
+ migration_bitmap_sync(rs, false, true);
+ }
+ bql_unlock();
+ }
+ }
+ }
+
+ qatomic_set(&rs->background_sync_running, 0);
+
+ rcu_unregister_thread();
+ trace_migration_background_sync_watcher_end();
+
+ return NULL;
+}
+
+void migration_background_sync_setup(void)
+{
+ RAMState *rs = ram_state;
+
+ if (!rs) {
+ return;
+ }
+
+ if (qatomic_read(&rs->background_sync_running)) {
+ return;
+ }
+
+ qemu_sem_init(&rs->quit_sem, 0);
+ qatomic_set(&rs->background_sync_running, 1);
+
+ qemu_thread_create(&rs->background_sync_thread,
+ NULL, migration_background_sync_watcher,
+ rs, QEMU_THREAD_JOINABLE);
+}
+
+void migration_background_sync_cleanup(void)
+{
+ RAMState *rs = ram_state;
+
+ if (!rs) {
+ return;
+ }
+
+ if (!qatomic_read(&rs->background_sync_running)) {
+ return;
+ }
+
+ qatomic_set(&rs->background_sync_running, 0);
+ qemu_sem_post(&rs->quit_sem);
+ qemu_thread_join(&rs->background_sync_thread);
+ qemu_sem_destroy(&rs->quit_sem);
+}
+
static void migration_bitmap_sync_precopy(RAMState *rs, bool last_stage)
{
Error *local_err = NULL;
diff --git a/migration/ram.h b/migration/ram.h
index bc0318b834..0315d22a66 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -93,4 +93,7 @@ void ram_write_tracking_prepare(void);
int ram_write_tracking_start(void);
void ram_write_tracking_stop(void);
+/* Migration background sync */
+void migration_background_sync_setup(void);
+void migration_background_sync_cleanup(void);
#endif
diff --git a/migration/trace-events b/migration/trace-events
index c65902f042..4f95f9fe14 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -92,6 +92,9 @@ qemu_file_fclose(void) ""
# ram.c
get_queued_page(const char *block_name, uint64_t tmp_offset, unsigned long page_abs) "%s/0x%" PRIx64 " page_abs=0x%lx"
get_queued_page_not_dirty(const char *block_name, uint64_t tmp_offset, unsigned long page_abs) "%s/0x%" PRIx64 " page_abs=0x%lx"
+migration_background_sync(void) ""
+migration_background_sync_watcher_start(void) ""
+migration_background_sync_watcher_end(void) ""
migration_bitmap_sync_start(void) ""
migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx"
--
2.39.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v1 5/7] migration: Support background dirty bitmap sync and throttle
2024-09-15 16:08 [PATCH v1 0/7] migration: auto-converge refinements for huge VM Hyman Huang
` (3 preceding siblings ...)
2024-09-15 16:08 ` [PATCH v1 4/7] migration: Implement background sync watcher Hyman Huang
@ 2024-09-15 16:08 ` Hyman Huang
2024-09-16 20:50 ` Fabiano Rosas
2024-09-15 16:08 ` [PATCH v1 6/7] qapi/migration: Introduce cpu-responsive-throttle parameter Hyman Huang
2024-09-15 16:08 ` [PATCH v1 7/7] migration: Support responsive CPU throttle Hyman Huang
6 siblings, 1 reply; 28+ messages in thread
From: Hyman Huang @ 2024-09-15 16:08 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini,
yong.huang
When a VM is configured with huge memory, the current throttle logic
does not scale, because migration_trigger_throttle() is only called
once per iteration, so it may not be invoked for a long time if a
single iteration takes a long time.
The background sync and throttle aim to fix the above issue by
synchronizing the remote dirty bitmap and triggering the throttle
once they detect that an iteration has lasted a long time.
This is a trade-off between synchronization overhead and CPU throttle
impact.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
migration/migration.c | 12 +++++++++++
tests/qtest/migration-test.c | 39 ++++++++++++++++++++++++++++++++++++
2 files changed, 51 insertions(+)
diff --git a/migration/migration.c b/migration/migration.c
index 055d527ff6..af8b22fa15 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1416,6 +1416,7 @@ static void migrate_fd_cleanup(MigrationState *s)
trace_migrate_fd_cleanup();
bql_unlock();
+ migration_background_sync_cleanup();
if (s->migration_thread_running) {
qemu_thread_join(&s->thread);
s->migration_thread_running = false;
@@ -3263,6 +3264,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
if ((!pending_size || pending_size < s->threshold_size) && can_switchover) {
trace_migration_thread_low_pending(pending_size);
+ migration_background_sync_cleanup();
migration_completion(s);
return MIG_ITERATE_BREAK;
}
@@ -3508,6 +3510,16 @@ static void *migration_thread(void *opaque)
ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
bql_unlock();
+ if (!migrate_dirty_limit()) {
+ /*
+ * Initiate the background sync watcher in order to guarantee
+ * that the CPU throttling acts appropriately. Dirty Limit
+ * doesn't use CPU throttle to make guest down, so ignore that
+ * case.
+ */
+ migration_background_sync_setup();
+ }
+
qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
MIGRATION_STATUS_ACTIVE);
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index b796a90cad..e0e94d26be 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -281,6 +281,11 @@ static uint64_t get_migration_pass(QTestState *who)
return read_ram_property_int(who, "iteration-count");
}
+static uint64_t get_dirty_sync_count(QTestState *who)
+{
+ return read_ram_property_int(who, "dirty-sync-count");
+}
+
static void read_blocktime(QTestState *who)
{
QDict *rsp_return;
@@ -468,6 +473,12 @@ static void migrate_ensure_converge(QTestState *who)
migrate_set_parameter_int(who, "downtime-limit", 30 * 1000);
}
+static void migrate_ensure_iteration_last_long(QTestState *who)
+{
+ /* Set 10Byte/s bandwidth limit to make the iteration last long enough */
+ migrate_set_parameter_int(who, "max-bandwidth", 10);
+}
+
/*
* Our goal is to ensure that we run a single full migration
* iteration, and also dirty memory, ensuring that at least
@@ -2791,6 +2802,10 @@ static void test_migrate_auto_converge(void)
* so we need to decrease a bandwidth.
*/
const int64_t init_pct = 5, inc_pct = 25, max_pct = 95;
+ uint64_t prev_iter_cnt = 0, iter_cnt;
+ uint64_t iter_cnt_changes = 0;
+ uint64_t prev_dirty_sync_cnt = 0, dirty_sync_cnt;
+ uint64_t dirty_sync_cnt_changes = 0;
if (test_migrate_start(&from, &to, uri, &args)) {
return;
@@ -2827,6 +2842,30 @@ static void test_migrate_auto_converge(void)
} while (true);
/* The first percentage of throttling should be at least init_pct */
g_assert_cmpint(percentage, >=, init_pct);
+
+ /* Make sure the iteration takes long enough */
+ migrate_ensure_iteration_last_long(from);
+
+ /*
+ * End the loop once either the dirty sync count or the iteration
+ * count has changed twice.
+ */
+ while (iter_cnt_changes < 2 && dirty_sync_cnt_changes < 2) {
+ usleep(1000 * 1000);
+ iter_cnt = get_migration_pass(from);
+ iter_cnt_changes += (iter_cnt != prev_iter_cnt);
+ prev_iter_cnt = iter_cnt;
+
+ dirty_sync_cnt = get_dirty_sync_count(from);
+ dirty_sync_cnt_changes += (dirty_sync_cnt != prev_dirty_sync_cnt);
+ prev_dirty_sync_cnt = dirty_sync_cnt;
+ }
+
+ /*
+ * The dirty sync count must have changed more often than the
+ * iteration count, because background syncs occur within one
+ * long-lasting iteration.
+ */
+ g_assert_cmpint(iter_cnt_changes, <, dirty_sync_cnt_changes);
+
/* Now, when we tested that throttling works, let it converge */
migrate_ensure_converge(from);
--
2.39.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v1 6/7] qapi/migration: Introduce cpu-responsive-throttle parameter
2024-09-15 16:08 [PATCH v1 0/7] migration: auto-converge refinements for huge VM Hyman Huang
` (4 preceding siblings ...)
2024-09-15 16:08 ` [PATCH v1 5/7] migration: Support background dirty bitmap sync and throttle Hyman Huang
@ 2024-09-15 16:08 ` Hyman Huang
2024-09-16 20:55 ` Fabiano Rosas
2024-09-15 16:08 ` [PATCH v1 7/7] migration: Support responsive CPU throttle Hyman Huang
6 siblings, 1 reply; 28+ messages in thread
From: Hyman Huang @ 2024-09-15 16:08 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini,
yong.huang
To enable the responsive throttle that will be implemented
in the next commit, introduce the cpu-responsive-throttle
parameter.
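For instance, the parameter could then be toggled and inspected from
the HMP monitor (a sketch; output abbreviated):

    (qemu) migrate_set_parameter cpu-responsive-throttle on
    (qemu) info migrate_parameters
    ...
    cpu-responsive-throttle: on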
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
migration/migration-hmp-cmds.c | 8 ++++++++
migration/options.c | 20 ++++++++++++++++++++
migration/options.h | 1 +
qapi/migration.json | 16 +++++++++++++++-
4 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 28165cfc9e..1fe6c74d66 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -264,6 +264,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
monitor_printf(mon, "%s: %s\n",
MigrationParameter_str(MIGRATION_PARAMETER_CPU_THROTTLE_TAILSLOW),
params->cpu_throttle_tailslow ? "on" : "off");
+ assert(params->has_cpu_responsive_throttle);
+ monitor_printf(mon, "%s: %s\n",
+ MigrationParameter_str(MIGRATION_PARAMETER_CPU_RESPONSIVE_THROTTLE),
+ params->cpu_responsive_throttle ? "on" : "off");
assert(params->has_max_cpu_throttle);
monitor_printf(mon, "%s: %u\n",
MigrationParameter_str(MIGRATION_PARAMETER_MAX_CPU_THROTTLE),
@@ -512,6 +516,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
p->has_cpu_throttle_tailslow = true;
visit_type_bool(v, param, &p->cpu_throttle_tailslow, &err);
break;
+ case MIGRATION_PARAMETER_CPU_RESPONSIVE_THROTTLE:
+ p->has_cpu_responsive_throttle = true;
+ visit_type_bool(v, param, &p->cpu_responsive_throttle, &err);
+ break;
case MIGRATION_PARAMETER_MAX_CPU_THROTTLE:
p->has_max_cpu_throttle = true;
visit_type_uint8(v, param, &p->max_cpu_throttle, &err);
diff --git a/migration/options.c b/migration/options.c
index 147cd2b8fd..b4c269bf1d 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -111,6 +111,8 @@ Property migration_properties[] = {
DEFAULT_MIGRATE_CPU_THROTTLE_INCREMENT),
DEFINE_PROP_BOOL("x-cpu-throttle-tailslow", MigrationState,
parameters.cpu_throttle_tailslow, false),
+ DEFINE_PROP_BOOL("x-cpu-responsive-throttle", MigrationState,
+ parameters.cpu_responsive_throttle, false),
DEFINE_PROP_SIZE("x-max-bandwidth", MigrationState,
parameters.max_bandwidth, MAX_THROTTLE),
DEFINE_PROP_SIZE("avail-switchover-bandwidth", MigrationState,
@@ -705,6 +707,13 @@ uint8_t migrate_cpu_throttle_initial(void)
return s->parameters.cpu_throttle_initial;
}
+bool migrate_responsive_throttle(void)
+{
+ MigrationState *s = migrate_get_current();
+
+ return s->parameters.cpu_responsive_throttle;
+}
+
bool migrate_cpu_throttle_tailslow(void)
{
MigrationState *s = migrate_get_current();
@@ -891,6 +900,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
params->cpu_throttle_increment = s->parameters.cpu_throttle_increment;
params->has_cpu_throttle_tailslow = true;
params->cpu_throttle_tailslow = s->parameters.cpu_throttle_tailslow;
+ params->has_cpu_responsive_throttle = true;
+ params->cpu_responsive_throttle = s->parameters.cpu_responsive_throttle;
params->tls_creds = g_strdup(s->parameters.tls_creds);
params->tls_hostname = g_strdup(s->parameters.tls_hostname);
params->tls_authz = g_strdup(s->parameters.tls_authz ?
@@ -959,6 +970,7 @@ void migrate_params_init(MigrationParameters *params)
params->has_cpu_throttle_initial = true;
params->has_cpu_throttle_increment = true;
params->has_cpu_throttle_tailslow = true;
+ params->has_cpu_responsive_throttle = true;
params->has_max_bandwidth = true;
params->has_downtime_limit = true;
params->has_x_checkpoint_delay = true;
@@ -1191,6 +1203,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
dest->cpu_throttle_tailslow = params->cpu_throttle_tailslow;
}
+ if (params->has_cpu_responsive_throttle) {
+ dest->cpu_responsive_throttle = params->cpu_responsive_throttle;
+ }
+
if (params->tls_creds) {
assert(params->tls_creds->type == QTYPE_QSTRING);
dest->tls_creds = params->tls_creds->u.s;
@@ -1302,6 +1318,10 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
s->parameters.cpu_throttle_tailslow = params->cpu_throttle_tailslow;
}
+ if (params->has_cpu_responsive_throttle) {
+ s->parameters.cpu_responsive_throttle = params->cpu_responsive_throttle;
+ }
+
if (params->tls_creds) {
g_free(s->parameters.tls_creds);
assert(params->tls_creds->type == QTYPE_QSTRING);
diff --git a/migration/options.h b/migration/options.h
index a0bd6edc06..80d0fcdaf9 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -68,6 +68,7 @@ bool migrate_has_block_bitmap_mapping(void);
uint32_t migrate_checkpoint_delay(void);
uint8_t migrate_cpu_throttle_increment(void);
uint8_t migrate_cpu_throttle_initial(void);
+bool migrate_responsive_throttle(void);
bool migrate_cpu_throttle_tailslow(void);
bool migrate_direct_io(void);
uint64_t migrate_downtime_limit(void);
diff --git a/qapi/migration.json b/qapi/migration.json
index 95b490706c..c61d3b3a6b 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -732,6 +732,10 @@
# be excessive at tail stage. The default value is false. (Since
# 5.1)
#
+# @cpu-responsive-throttle: Make CPU throttling more responsive by
+# introducing an extra detection metric for
+# migration convergence. (Since 9.1)
+#
# @tls-creds: ID of the 'tls-creds' object that provides credentials
# for establishing a TLS connection over the migration data
# channel. On the outgoing side of the migration, the credentials
@@ -857,7 +861,7 @@
'announce-rounds', 'announce-step',
'throttle-trigger-threshold',
'cpu-throttle-initial', 'cpu-throttle-increment',
- 'cpu-throttle-tailslow',
+ 'cpu-throttle-tailslow', 'cpu-responsive-throttle',
'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
'avail-switchover-bandwidth', 'downtime-limit',
{ 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
@@ -913,6 +917,10 @@
# be excessive at tail stage. The default value is false. (Since
# 5.1)
#
+# @cpu-responsive-throttle: Make CPU throttling more responsive by
+# introducing an extra detection metric for
+# migration convergence. (Since 9.1)
+#
# @tls-creds: ID of the 'tls-creds' object that provides credentials
# for establishing a TLS connection over the migration data
# channel. On the outgoing side of the migration, the credentials
@@ -1045,6 +1053,7 @@
'*cpu-throttle-initial': 'uint8',
'*cpu-throttle-increment': 'uint8',
'*cpu-throttle-tailslow': 'bool',
+ '*cpu-responsive-throttle': 'bool',
'*tls-creds': 'StrOrNull',
'*tls-hostname': 'StrOrNull',
'*tls-authz': 'StrOrNull',
@@ -1127,6 +1136,10 @@
# be excessive at tail stage. The default value is false. (Since
# 5.1)
#
+# @cpu-responsive-throttle: Make CPU throttling more responsive by
+# introducing an extra detection metric for
+# migration convergence. (Since 9.1)
+#
# @tls-creds: ID of the 'tls-creds' object that provides credentials
# for establishing a TLS connection over the migration data
# channel. On the outgoing side of the migration, the credentials
@@ -1252,6 +1265,7 @@
'*cpu-throttle-initial': 'uint8',
'*cpu-throttle-increment': 'uint8',
'*cpu-throttle-tailslow': 'bool',
+ '*cpu-responsive-throttle': 'bool',
'*tls-creds': 'str',
'*tls-hostname': 'str',
'*tls-authz': 'str',
--
2.39.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* [PATCH v1 7/7] migration: Support responsive CPU throttle
2024-09-15 16:08 [PATCH v1 0/7] migration: auto-converge refinements for huge VM Hyman Huang
` (5 preceding siblings ...)
2024-09-15 16:08 ` [PATCH v1 6/7] qapi/migration: Introduce cpu-responsive-throttle parameter Hyman Huang
@ 2024-09-15 16:08 ` Hyman Huang
6 siblings, 0 replies; 28+ messages in thread
From: Hyman Huang @ 2024-09-15 16:08 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini,
yong.huang
Currently, the convergence algorithm determines that the migration
cannot converge according to the following principle:
The dirty pages generated in the current iteration exceed a specific
percentage (throttle-trigger-threshold, 50 by default) of the bytes
transferred. Let's refer to this criterion as the "dirty rate".
If this criterion is met at least twice (dirty_rate_high_cnt >= 2),
the throttle percentage is increased.
In most cases, the above implementation is appropriate. However, for a
VM with a high memory overload, each iteration is time-consuming.
The VM's computing performance may be throttled at a high percentage
for a long time due to the repeated confirmation behavior, which may
be intolerable for some computationally sensitive software in the VM.
As the comment in the migration_trigger_throttle function mentions,
the original algorithm confirms the criterion repeatedly in order to
avoid erroneous detection. Put differently, the criterion does not
need to be validated again once the detection is reliable enough.
In this refinement, in order to make the detection more accurate, we
introduce another criterion, called the "dirty ratio", to determine
migration convergence. The "dirty ratio" is the ratio of
bytes_dirty_period to bytes_xfer_period. When the algorithm
repeatedly detects that the "dirty ratio" of the current sync is no
lower than that of the previous one, it determines that the migration
cannot converge. For the "dirty rate" and the "dirty ratio", if either
of the two criteria is met, the penalty percentage is increased. This
makes the CPU throttle more responsive, which shortens each iteration
and therefore reduces the duration of VM performance degradation.
In conclusion, this refinement significantly reduces the time required
for the throttle percentage to step to its maximum while the VM is
under a high memory load.
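A worked example with illustrative numbers: assume
throttle-trigger-threshold is 50. If three consecutive syncs observe
dirty ratios of 75%, 80%, and 82%, each of the latter two values is
above the 50% threshold and no lower than its predecessor, so the
high-ratio counter reaches 2 and the throttle is increased even if
dirty_rate_high_cnt has not yet reached 2 on its own.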
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
migration/ram.c | 55 ++++++++++++++++++++++++++++++++++--
migration/trace-events | 1 +
tests/qtest/migration-test.c | 1 +
3 files changed, 55 insertions(+), 2 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 799eaa0382..8d856a89db 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -421,6 +421,12 @@ struct RAMState {
bool background_sync_running;
QemuThread background_sync_thread;
QemuSemaphore quit_sem;
+
+ /*
+ * Ratio of bytes_dirty_period to bytes_xfer_period in the
+ * previous sync.
+ */
+ uint64_t dirty_ratio_pct;
};
typedef struct RAMState RAMState;
@@ -1049,6 +1055,43 @@ static void migration_dirty_limit_guest(void)
trace_migration_dirty_limit_guest(quota_dirtyrate);
}
+static bool migration_dirty_ratio_high(RAMState *rs)
+{
+ static int dirty_ratio_high_cnt;
+ uint64_t threshold = migrate_throttle_trigger_threshold();
+ uint64_t bytes_xfer_period =
+ migration_transferred_bytes() - rs->bytes_xfer_prev;
+ uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE;
+ bool dirty_ratio_high = false;
+ uint64_t prev, curr;
+
+ /* Calculate the dirty ratio percentage */
+ curr = 100 * (bytes_dirty_period * 1.0 / bytes_xfer_period);
+
+ prev = rs->dirty_ratio_pct;
+ rs->dirty_ratio_pct = curr;
+
+ if (prev == 0) {
+ return false;
+ }
+
+ /*
+ * If the current dirty ratio is above the threshold and no lower
+ * than the previous one, treat the migration as not converging.
+ */
+ if (curr > threshold && curr >= prev) {
+ trace_migration_dirty_ratio_high(curr, prev);
+ dirty_ratio_high_cnt++;
+ }
+
+ if (dirty_ratio_high_cnt >= 2) {
+ dirty_ratio_high = true;
+ dirty_ratio_high_cnt = 0;
+ }
+
+ return dirty_ratio_high;
+}
+
static void migration_trigger_throttle(RAMState *rs)
{
uint64_t threshold = migrate_throttle_trigger_threshold();
@@ -1056,6 +1099,11 @@ static void migration_trigger_throttle(RAMState *rs)
migration_transferred_bytes() - rs->bytes_xfer_prev;
uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE;
uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100;
+ bool dirty_ratio_high = false;
+
+ if (migrate_responsive_throttle() && (bytes_xfer_period != 0)) {
+ dirty_ratio_high = migration_dirty_ratio_high(rs);
+ }
/*
* The following detection logic can be refined later. For now:
@@ -1065,8 +1113,11 @@ static void migration_trigger_throttle(RAMState *rs)
* twice, start or increase throttling.
*/
if ((bytes_dirty_period > bytes_dirty_threshold) &&
- (++rs->dirty_rate_high_cnt >= 2)) {
- rs->dirty_rate_high_cnt = 0;
+ ((++rs->dirty_rate_high_cnt >= 2) || dirty_ratio_high)) {
+
+ rs->dirty_rate_high_cnt =
+ rs->dirty_rate_high_cnt >= 2 ? 0 : rs->dirty_rate_high_cnt;
+
if (migrate_auto_converge()) {
trace_migration_throttle();
mig_throttle_guest_down(bytes_dirty_period,
diff --git a/migration/trace-events b/migration/trace-events
index 4f95f9fe14..0b219516e9 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -98,6 +98,7 @@ migration_background_sync_watcher_end(void) ""
migration_bitmap_sync_start(void) ""
migration_bitmap_sync_end(uint64_t dirty_pages) "dirty_pages %" PRIu64
migration_bitmap_clear_dirty(char *str, uint64_t start, uint64_t size, unsigned long page) "rb %s start 0x%"PRIx64" size 0x%"PRIx64" page 0x%lx"
+migration_dirty_ratio_high(uint64_t cur, uint64_t prev) "current ratio: %" PRIu64 " previous ratio: %" PRIu64
migration_throttle(void) ""
migration_dirty_limit_guest(int64_t dirtyrate) "guest dirty page rate limit %" PRIi64 " MB/s"
ram_discard_range(const char *rbname, uint64_t start, size_t len) "%s: start: %" PRIx64 " %zx"
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index e0e94d26be..9f7c2f49a0 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2815,6 +2815,7 @@ static void test_migrate_auto_converge(void)
migrate_set_parameter_int(from, "cpu-throttle-initial", init_pct);
migrate_set_parameter_int(from, "cpu-throttle-increment", inc_pct);
migrate_set_parameter_int(from, "max-cpu-throttle", max_pct);
+ migrate_set_parameter_bool(from, "cpu-responsive-throttle", true);
/*
* Set the initial parameters so that the migration could not converge
--
2.39.1
^ permalink raw reply related [flat|nested] 28+ messages in thread
* Re: [PATCH v1 3/7] qapi/migration: Introduce the iteration-count
2024-09-15 16:08 ` [PATCH v1 3/7] qapi/migration: Introduce the iteration-count Hyman Huang
@ 2024-09-16 20:35 ` Fabiano Rosas
2024-09-17 6:52 ` Yong Huang
2024-09-18 8:29 ` Yong Huang
0 siblings, 2 replies; 28+ messages in thread
From: Fabiano Rosas @ 2024-09-16 20:35 UTC (permalink / raw)
To: Hyman Huang, qemu-devel
Cc: Peter Xu, Eric Blake, Markus Armbruster, David Hildenbrand,
Philippe Mathieu-Daudé, Paolo Bonzini, yong.huang
Hyman Huang <yong.huang@smartx.com> writes:
> The original migration statistic dirty-sync-count can no longer
> reflect the iteration count due to the introduction of background
> synchronization in the next commit; add iteration-count to
> compensate.
I agree with the overall idea, but I feel we're lacking some information
on what determines whether some of the lines below want to use the
iteration count vs. the dirty sync count. Since this patch increments
both variables at the same place, they can still be used interchangeably
unless we add some words to explain the distinction.
So to clarify:
What do we call an iteration? A call to save_live_iterate(),
migration_iteration_run() or something else?
Why should dirty-sync-count ever have reflected "iteration count"? It
might have been this way by coincidence, but did we ever use it in that
sense (aside from info migrate maybe)?
With the new counter, what kind of meaning can a user extract from that
number aside from "some undescribed thing happened N times" (this might
be included in the migration.json docs)?
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> ---
> migration/migration-stats.h | 4 ++++
> migration/migration.c | 1 +
> migration/ram.c | 12 ++++++++----
> qapi/migration.json | 6 +++++-
> tests/qtest/migration-test.c | 2 +-
> 5 files changed, 19 insertions(+), 6 deletions(-)
>
> diff --git a/migration/migration-stats.h b/migration/migration-stats.h
> index 05290ade76..43ee0f4f05 100644
> --- a/migration/migration-stats.h
> +++ b/migration/migration-stats.h
> @@ -50,6 +50,10 @@ typedef struct {
> * Number of times we have synchronized guest bitmaps.
> */
> Stat64 dirty_sync_count;
> + /*
> + * Number of migration iterations processed.
> + */
> + Stat64 iteration_count;
> /*
> * Number of times zero copy failed to send any page using zero
> * copy.
> diff --git a/migration/migration.c b/migration/migration.c
> index 3dea06d577..055d527ff6 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1197,6 +1197,7 @@ static void populate_ram_info(MigrationInfo *info, MigrationState *s)
> info->ram->mbps = s->mbps;
> info->ram->dirty_sync_count =
> stat64_get(&mig_stats.dirty_sync_count);
> + info->ram->iteration_count = stat64_get(&mig_stats.iteration_count);
> info->ram->dirty_sync_missed_zero_copy =
> stat64_get(&mig_stats.dirty_sync_missed_zero_copy);
> info->ram->postcopy_requests =
> diff --git a/migration/ram.c b/migration/ram.c
> index e205806a5f..ca5a1b5f16 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -594,7 +594,7 @@ static void xbzrle_cache_zero_page(ram_addr_t current_addr)
> /* We don't care if this fails to allocate a new cache page
> * as long as it updated an old one */
> cache_insert(XBZRLE.cache, current_addr, XBZRLE.zero_target_page,
> - stat64_get(&mig_stats.dirty_sync_count));
> + stat64_get(&mig_stats.iteration_count));
> }
>
> #define ENCODING_FLAG_XBZRLE 0x1
> @@ -620,7 +620,7 @@ static int save_xbzrle_page(RAMState *rs, PageSearchStatus *pss,
> int encoded_len = 0, bytes_xbzrle;
> uint8_t *prev_cached_page;
> QEMUFile *file = pss->pss_channel;
> - uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
> + uint64_t generation = stat64_get(&mig_stats.iteration_count);
>
> if (!cache_is_cached(XBZRLE.cache, current_addr, generation)) {
> xbzrle_counters.cache_miss++;
> @@ -1079,6 +1079,10 @@ static void migration_bitmap_sync(RAMState *rs,
> RAMBlock *block;
> int64_t end_time;
>
> + if (!background) {
> + stat64_add(&mig_stats.iteration_count, 1);
> + }
> +
> stat64_add(&mig_stats.dirty_sync_count, 1);
>
> if (!rs->time_last_bitmap_sync) {
> @@ -1115,8 +1119,8 @@ static void migration_bitmap_sync(RAMState *rs,
> rs->num_dirty_pages_period = 0;
> rs->bytes_xfer_prev = migration_transferred_bytes();
> }
> - if (migrate_events()) {
> - uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
> + if (!background && migrate_events()) {
> + uint64_t generation = stat64_get(&mig_stats.iteration_count);
> qapi_event_send_migration_pass(generation);
> }
> }
> diff --git a/qapi/migration.json b/qapi/migration.json
> index b66cccf107..95b490706c 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -60,6 +60,9 @@
> # between 0 and @dirty-sync-count * @multifd-channels. (since
> # 7.1)
> #
> +# @iteration-count: The number of iterations since migration started.
> +# (since 9.2)
> +#
> # Since: 0.14
> ##
> { 'struct': 'MigrationStats',
> @@ -72,7 +75,8 @@
> 'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
> 'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
> 'postcopy-bytes': 'uint64',
> - 'dirty-sync-missed-zero-copy': 'uint64' } }
> + 'dirty-sync-missed-zero-copy': 'uint64',
> + 'iteration-count': 'int' } }
>
> ##
> # @XBZRLECacheStats:
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index d6768d5d71..b796a90cad 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -278,7 +278,7 @@ static int64_t read_migrate_property_int(QTestState *who, const char *property)
>
> static uint64_t get_migration_pass(QTestState *who)
> {
> - return read_ram_property_int(who, "dirty-sync-count");
> + return read_ram_property_int(who, "iteration-count");
> }
>
> static void read_blocktime(QTestState *who)
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: [PATCH v1 5/7] migration: Support background dirty bitmap sync and throttle
2024-09-15 16:08 ` [PATCH v1 5/7] migration: Support background dirty bitmap sync and throttle Hyman Huang
@ 2024-09-16 20:50 ` Fabiano Rosas
0 siblings, 0 replies; 28+ messages in thread
From: Fabiano Rosas @ 2024-09-16 20:50 UTC (permalink / raw)
To: Hyman Huang, qemu-devel
Cc: Peter Xu, Eric Blake, Markus Armbruster, David Hildenbrand,
Philippe Mathieu-Daudé, Paolo Bonzini, yong.huang
Hyman Huang <yong.huang@smartx.com> writes:
> When a VM is configured with huge memory, the current throttle logic
> does not scale, because migration_trigger_throttle() is only called
> once per iteration, so it may not be invoked for a long time if a
> single iteration takes a long time.
>
> The background sync and throttle aim to fix the above issue by
> synchronizing the remote dirty bitmap and triggering the throttle
> once they detect that an iteration has lasted a long time.
>
> This is a trade-off between synchronization overhead and CPU throttle
> impact.
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> ---
> migration/migration.c | 12 +++++++++++
> tests/qtest/migration-test.c | 39 ++++++++++++++++++++++++++++++++++++
> 2 files changed, 51 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 055d527ff6..af8b22fa15 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1416,6 +1416,7 @@ static void migrate_fd_cleanup(MigrationState *s)
>
> trace_migrate_fd_cleanup();
> bql_unlock();
> + migration_background_sync_cleanup();
> if (s->migration_thread_running) {
> qemu_thread_join(&s->thread);
> s->migration_thread_running = false;
> @@ -3263,6 +3264,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
>
> if ((!pending_size || pending_size < s->threshold_size) && can_switchover) {
> trace_migration_thread_low_pending(pending_size);
> + migration_background_sync_cleanup();
This one is redundant with the migrate_fd_cleanup() call at the end of
migration_iteration_finish().
> migration_completion(s);
> return MIG_ITERATE_BREAK;
> }
> @@ -3508,6 +3510,16 @@ static void *migration_thread(void *opaque)
> ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
> bql_unlock();
>
> + if (!migrate_dirty_limit()) {
> + /*
> + * Initiate the background sync watcher in order to guarantee
> + * that the CPU throttling acts appropriately. Dirty Limit
> + * doesn't use CPU throttle to make guest down, so ignore that
> + * case.
> + */
> + migration_background_sync_setup();
> + }
> +
> qemu_savevm_wait_unplug(s, MIGRATION_STATUS_SETUP,
> MIGRATION_STATUS_ACTIVE);
>
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index b796a90cad..e0e94d26be 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -281,6 +281,11 @@ static uint64_t get_migration_pass(QTestState *who)
> return read_ram_property_int(who, "iteration-count");
> }
>
> +static uint64_t get_dirty_sync_count(QTestState *who)
> +{
> + return read_ram_property_int(who, "dirty-sync-count");
> +}
> +
> static void read_blocktime(QTestState *who)
> {
> QDict *rsp_return;
> @@ -468,6 +473,12 @@ static void migrate_ensure_converge(QTestState *who)
> migrate_set_parameter_int(who, "downtime-limit", 30 * 1000);
> }
>
> +static void migrate_ensure_iteration_last_long(QTestState *who)
> +{
> + /* Set 10Byte/s bandwidth limit to make the iteration last long enough */
> + migrate_set_parameter_int(who, "max-bandwidth", 10);
> +}
> +
> /*
> * Our goal is to ensure that we run a single full migration
> * iteration, and also dirty memory, ensuring that at least
> @@ -2791,6 +2802,10 @@ static void test_migrate_auto_converge(void)
> * so we need to decrease a bandwidth.
> */
> const int64_t init_pct = 5, inc_pct = 25, max_pct = 95;
> + uint64_t prev_iter_cnt = 0, iter_cnt;
> + uint64_t iter_cnt_changes = 0;
> + uint64_t prev_dirty_sync_cnt = 0, dirty_sync_cnt;
> + uint64_t dirty_sync_cnt_changes = 0;
>
> if (test_migrate_start(&from, &to, uri, &args)) {
> return;
> @@ -2827,6 +2842,30 @@ static void test_migrate_auto_converge(void)
> } while (true);
> /* The first percentage of throttling should be at least init_pct */
> g_assert_cmpint(percentage, >=, init_pct);
> +
> + /* Make sure the iteration take a long time enough */
> + migrate_ensure_iteration_last_long(from);
> +
> + /*
> + * End the loop when the dirty sync count or iteration count changes.
> + */
> + while (iter_cnt_changes < 2 && dirty_sync_cnt_changes < 2) {
> + usleep(1000 * 1000);
> + iter_cnt = get_migration_pass(from);
> + iter_cnt_changes += (iter_cnt != prev_iter_cnt);
> + prev_iter_cnt = iter_cnt;
> +
> + dirty_sync_cnt = get_dirty_sync_count(from);
> + dirty_sync_cnt_changes += (dirty_sync_cnt != prev_dirty_sync_cnt);
> + prev_dirty_sync_cnt = dirty_sync_cnt;
> + }
> +
> + /*
> + * The dirty sync count must have changed because we are in the same
> + * iteration.
> + */
> + g_assert_cmpint(iter_cnt_changes , < , dirty_sync_cnt_changes);
> +
> /* Now, when we tested that throttling works, let it converge */
> migrate_ensure_converge(from);
* Re: [PATCH v1 6/7] qapi/migration: Introduce cpu-responsive-throttle parameter
2024-09-15 16:08 ` [PATCH v1 6/7] qapi/migration: Introduce cpu-responsive-throttle parameter Hyman Huang
@ 2024-09-16 20:55 ` Fabiano Rosas
2024-09-17 6:54 ` Yong Huang
0 siblings, 1 reply; 28+ messages in thread
From: Fabiano Rosas @ 2024-09-16 20:55 UTC (permalink / raw)
To: Hyman Huang, qemu-devel
Cc: Peter Xu, Eric Blake, Markus Armbruster, David Hildenbrand,
Philippe Mathieu-Daudé, Paolo Bonzini, yong.huang
Hyman Huang <yong.huang@smartx.com> writes:
> To enable the responsive throttle that will be implemented
> in the next commit, introduce the cpu-responsive-throttle
> parameter.
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> ---
> migration/migration-hmp-cmds.c | 8 ++++++++
> migration/options.c | 20 ++++++++++++++++++++
> migration/options.h | 1 +
> qapi/migration.json | 16 +++++++++++++++-
> 4 files changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> index 28165cfc9e..1fe6c74d66 100644
> --- a/migration/migration-hmp-cmds.c
> +++ b/migration/migration-hmp-cmds.c
> @@ -264,6 +264,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
> monitor_printf(mon, "%s: %s\n",
> MigrationParameter_str(MIGRATION_PARAMETER_CPU_THROTTLE_TAILSLOW),
> params->cpu_throttle_tailslow ? "on" : "off");
> + assert(params->has_cpu_responsive_throttle);
> + monitor_printf(mon, "%s: %s\n",
> + MigrationParameter_str(MIGRATION_PARAMETER_CPU_RESPONSIVE_THROTTLE),
> + params->cpu_responsive_throttle ? "on" : "off");
> assert(params->has_max_cpu_throttle);
> monitor_printf(mon, "%s: %u\n",
> MigrationParameter_str(MIGRATION_PARAMETER_MAX_CPU_THROTTLE),
> @@ -512,6 +516,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
> p->has_cpu_throttle_tailslow = true;
> visit_type_bool(v, param, &p->cpu_throttle_tailslow, &err);
> break;
> + case MIGRATION_PARAMETER_CPU_RESPONSIVE_THROTTLE:
> + p->has_cpu_responsive_throttle = true;
> + visit_type_bool(v, param, &p->cpu_responsive_throttle, &err);
> + break;
> case MIGRATION_PARAMETER_MAX_CPU_THROTTLE:
> p->has_max_cpu_throttle = true;
> visit_type_uint8(v, param, &p->max_cpu_throttle, &err);
> diff --git a/migration/options.c b/migration/options.c
> index 147cd2b8fd..b4c269bf1d 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -111,6 +111,8 @@ Property migration_properties[] = {
> DEFAULT_MIGRATE_CPU_THROTTLE_INCREMENT),
> DEFINE_PROP_BOOL("x-cpu-throttle-tailslow", MigrationState,
> parameters.cpu_throttle_tailslow, false),
> + DEFINE_PROP_BOOL("x-cpu-responsive-throttle", MigrationState,
> + parameters.cpu_responsive_throttle, false),
> DEFINE_PROP_SIZE("x-max-bandwidth", MigrationState,
> parameters.max_bandwidth, MAX_THROTTLE),
> DEFINE_PROP_SIZE("avail-switchover-bandwidth", MigrationState,
> @@ -705,6 +707,13 @@ uint8_t migrate_cpu_throttle_initial(void)
> return s->parameters.cpu_throttle_initial;
> }
>
> +bool migrate_responsive_throttle(void)
> +{
> + MigrationState *s = migrate_get_current();
> +
> + return s->parameters.cpu_responsive_throttle;
> +}
> +
> bool migrate_cpu_throttle_tailslow(void)
> {
> MigrationState *s = migrate_get_current();
> @@ -891,6 +900,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
> params->cpu_throttle_increment = s->parameters.cpu_throttle_increment;
> params->has_cpu_throttle_tailslow = true;
> params->cpu_throttle_tailslow = s->parameters.cpu_throttle_tailslow;
> + params->has_cpu_responsive_throttle = true;
> + params->cpu_responsive_throttle = s->parameters.cpu_responsive_throttle;
> params->tls_creds = g_strdup(s->parameters.tls_creds);
> params->tls_hostname = g_strdup(s->parameters.tls_hostname);
> params->tls_authz = g_strdup(s->parameters.tls_authz ?
> @@ -959,6 +970,7 @@ void migrate_params_init(MigrationParameters *params)
> params->has_cpu_throttle_initial = true;
> params->has_cpu_throttle_increment = true;
> params->has_cpu_throttle_tailslow = true;
> + params->has_cpu_responsive_throttle = true;
> params->has_max_bandwidth = true;
> params->has_downtime_limit = true;
> params->has_x_checkpoint_delay = true;
> @@ -1191,6 +1203,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
> dest->cpu_throttle_tailslow = params->cpu_throttle_tailslow;
> }
>
> + if (params->has_cpu_responsive_throttle) {
> + dest->cpu_responsive_throttle = params->cpu_responsive_throttle;
> + }
> +
> if (params->tls_creds) {
> assert(params->tls_creds->type == QTYPE_QSTRING);
> dest->tls_creds = params->tls_creds->u.s;
> @@ -1302,6 +1318,10 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
> s->parameters.cpu_throttle_tailslow = params->cpu_throttle_tailslow;
> }
>
> + if (params->has_cpu_responsive_throttle) {
> + s->parameters.cpu_responsive_throttle = params->cpu_responsive_throttle;
> + }
> +
> if (params->tls_creds) {
> g_free(s->parameters.tls_creds);
> assert(params->tls_creds->type == QTYPE_QSTRING);
> diff --git a/migration/options.h b/migration/options.h
> index a0bd6edc06..80d0fcdaf9 100644
> --- a/migration/options.h
> +++ b/migration/options.h
> @@ -68,6 +68,7 @@ bool migrate_has_block_bitmap_mapping(void);
> uint32_t migrate_checkpoint_delay(void);
> uint8_t migrate_cpu_throttle_increment(void);
> uint8_t migrate_cpu_throttle_initial(void);
> +bool migrate_responsive_throttle(void);
> bool migrate_cpu_throttle_tailslow(void);
> bool migrate_direct_io(void);
> uint64_t migrate_downtime_limit(void);
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 95b490706c..c61d3b3a6b 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -732,6 +732,10 @@
> # be excessive at tail stage. The default value is false. (Since
> # 5.1)
> #
> +# @cpu-responsive-throttle: Make CPU throttling more responsively by
> +# introduce an extra detection metric of
> +# migration convergence. (Since 9.1)
> +#
> # @tls-creds: ID of the 'tls-creds' object that provides credentials
> # for establishing a TLS connection over the migration data
> # channel. On the outgoing side of the migration, the credentials
> @@ -857,7 +861,7 @@
> 'announce-rounds', 'announce-step',
> 'throttle-trigger-threshold',
> 'cpu-throttle-initial', 'cpu-throttle-increment',
> - 'cpu-throttle-tailslow',
> + 'cpu-throttle-tailslow', 'cpu-responsive-throttle',
> 'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
> 'avail-switchover-bandwidth', 'downtime-limit',
> { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
> @@ -913,6 +917,10 @@
> # be excessive at tail stage. The default value is false. (Since
> # 5.1)
> #
> +# @cpu-responsive-throttle: Make CPU throttling more responsively by
> +# introduce an extra detection metric of
> +# migration convergence. (Since 9.1)
> +#
> # @tls-creds: ID of the 'tls-creds' object that provides credentials
> # for establishing a TLS connection over the migration data
> # channel. On the outgoing side of the migration, the credentials
> @@ -1045,6 +1053,7 @@
> '*cpu-throttle-initial': 'uint8',
> '*cpu-throttle-increment': 'uint8',
> '*cpu-throttle-tailslow': 'bool',
> + '*cpu-responsive-throttle': 'bool',
Should this instead follow the pattern of cpu-throttle-* ?
> '*tls-creds': 'StrOrNull',
> '*tls-hostname': 'StrOrNull',
> '*tls-authz': 'StrOrNull',
> @@ -1127,6 +1136,10 @@
> # be excessive at tail stage. The default value is false. (Since
> # 5.1)
> #
> +# @cpu-responsive-throttle: Make CPU throttling more responsively by
s/responsively/responsive/ ?
> +# introduce an extra detection metric of
> +# migration convergence. (Since 9.1)
Since 9.2. And double space after period.
> +#
> # @tls-creds: ID of the 'tls-creds' object that provides credentials
> # for establishing a TLS connection over the migration data
> # channel. On the outgoing side of the migration, the credentials
> @@ -1252,6 +1265,7 @@
> '*cpu-throttle-initial': 'uint8',
> '*cpu-throttle-increment': 'uint8',
> '*cpu-throttle-tailslow': 'bool',
> + '*cpu-responsive-throttle': 'bool',
> '*tls-creds': 'str',
> '*tls-hostname': 'str',
> '*tls-authz': 'str',
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-15 16:08 ` [PATCH v1 1/7] migration: Introduce structs for background sync Hyman Huang
@ 2024-09-16 21:11 ` Fabiano Rosas
2024-09-17 6:48 ` Yong Huang
0 siblings, 1 reply; 28+ messages in thread
From: Fabiano Rosas @ 2024-09-16 21:11 UTC (permalink / raw)
To: Hyman Huang, qemu-devel
Cc: Peter Xu, Eric Blake, Markus Armbruster, David Hildenbrand,
Philippe Mathieu-Daudé, Paolo Bonzini, yong.huang
Hyman Huang <yong.huang@smartx.com> writes:
> shadow_bmap, iter_bmap and iter_dirty_pages are introduced
> to satisfy the need for background sync.
>
> Meanwhile, introduce enumeration of sync method.
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> ---
> include/exec/ramblock.h | 45 +++++++++++++++++++++++++++++++++++++++++
> migration/ram.c | 6 ++++++
> 2 files changed, 51 insertions(+)
>
> diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> index 0babd105c0..0e327bc0ae 100644
> --- a/include/exec/ramblock.h
> +++ b/include/exec/ramblock.h
> @@ -24,6 +24,30 @@
> #include "qemu/rcu.h"
> #include "exec/ramlist.h"
>
> +/* Possible bits for cpu_physical_memory_sync_dirty_bitmap */
> +
> +/*
> + * The old-fashioned sync, which is, in turn, used for CPU
> + * throttle and memory transfer.
I'm not sure I follow what "in turn" is supposed to mean in this
sentence. Could you clarify?
> + */
> +#define RAMBLOCK_SYN_LEGACY_ITER (1U << 0)
So ITER is as opposed to background? I'm a bit confused with the terms.
> +
> +/*
> + * The modern sync, which is, in turn, used for CPU throttle
> + * and memory transfer.
> + */
> +#define RAMBLOCK_SYN_MODERN_ITER (1U << 1)
> +
> +/* The modern sync, which is used for CPU throttle only */
> +#define RAMBLOCK_SYN_MODERN_BACKGROUND (1U << 2)
What's the plan for the "legacy" part? To be removed soon? Do we want to
remove it now? Maybe better to not use the modern/legacy terms unless we
want to give the impression that the legacy one is discontinued.
> +
> +#define RAMBLOCK_SYN_MASK (0x7)
> +
> +typedef enum RAMBlockSynMode {
> + RAMBLOCK_SYN_LEGACY, /* Old-fashined mode */
> + RAMBLOCK_SYN_MODERN, /* Background-sync-supported mode */
> +} RAMBlockSynMode;
I'm also wondering whether we need this enum + the flags, or whether one
of them would suffice. I'm looking at code like this in the following
patches, for instance:
+ if (sync_mode == RAMBLOCK_SYN_MODERN) {
+ if (background) {
+ flag = RAMBLOCK_SYN_MODERN_BACKGROUND;
+ } else {
+ flag = RAMBLOCK_SYN_MODERN_ITER;
+ }
+ }
Couldn't we use LEGACY/BG/ITER?
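Something like this, for instance (just an illustrative sketch of the
flattened naming; the names are hypothetical and untested against the
series):

/* One flat mode instead of enum + flag bits. */
typedef enum RAMBlockSynMode {
    RAMBLOCK_SYN_LEGACY,      /* old-fashioned sync path */
    RAMBLOCK_SYN_ITER,        /* per-iteration sync: throttle + transfer */
    RAMBLOCK_SYN_BACKGROUND,  /* background sync: CPU throttle only */
} RAMBlockSynMode;

mode = background ? RAMBLOCK_SYN_BACKGROUND : RAMBLOCK_SYN_ITER;

That way the if/else above disappears and callers just pass the mode
through directly.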
> +
> struct RAMBlock {
> struct rcu_head rcu;
> struct MemoryRegion *mr;
> @@ -89,6 +113,27 @@ struct RAMBlock {
> * could not have been valid on the source.
> */
> ram_addr_t postcopy_length;
> +
> + /*
> + * Used to backup the bmap during background sync to see whether any dirty
> + * pages were sent during that time.
> + */
> + unsigned long *shadow_bmap;
> +
> + /*
> + * The bitmap "bmap," which was initially used for both sync and memory
> + * transfer, will be replaced by two bitmaps: the previously used "bmap"
> + * and the recently added "iter_bmap." Only the memory transfer is
> + * conducted with the previously used "bmap"; the recently added
> + * "iter_bmap" is utilized for dirty bitmap sync.
> + */
> + unsigned long *iter_bmap;
> +
> + /* Number of new dirty pages during iteration */
> + uint64_t iter_dirty_pages;
> +
> + /* If background sync has shown up during iteration */
> + bool background_sync_shown_up;
> };
> #endif
> #endif
> diff --git a/migration/ram.c b/migration/ram.c
> index 67ca3d5d51..f29faa82d6 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2362,6 +2362,10 @@ static void ram_bitmaps_destroy(void)
> block->bmap = NULL;
> g_free(block->file_bmap);
> block->file_bmap = NULL;
> + g_free(block->shadow_bmap);
> + block->shadow_bmap = NULL;
> + g_free(block->iter_bmap);
> + block->iter_bmap = NULL;
> }
> }
>
> @@ -2753,6 +2757,8 @@ static void ram_list_init_bitmaps(void)
> }
> block->clear_bmap_shift = shift;
> block->clear_bmap = bitmap_new(clear_bmap_size(pages, shift));
> + block->shadow_bmap = bitmap_new(pages);
> + block->iter_bmap = bitmap_new(pages);
> }
> }
> }
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-16 21:11 ` Fabiano Rosas
@ 2024-09-17 6:48 ` Yong Huang
2024-09-19 18:45 ` Peter Xu
0 siblings, 1 reply; 28+ messages in thread
From: Yong Huang @ 2024-09-17 6:48 UTC (permalink / raw)
To: Fabiano Rosas
Cc: qemu-devel, Peter Xu, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Tue, Sep 17, 2024 at 5:11 AM Fabiano Rosas <farosas@suse.de> wrote:
> Hyman Huang <yong.huang@smartx.com> writes:
>
> > shadow_bmap, iter_bmap and iter_dirty_pages are introduced
> > to satisfy the need for background sync.
> >
> > Meanwhile, introduce enumeration of sync method.
> >
> > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > ---
> > include/exec/ramblock.h | 45 +++++++++++++++++++++++++++++++++++++++++
> > migration/ram.c | 6 ++++++
> > 2 files changed, 51 insertions(+)
> >
> > diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> > index 0babd105c0..0e327bc0ae 100644
> > --- a/include/exec/ramblock.h
> > +++ b/include/exec/ramblock.h
> > @@ -24,6 +24,30 @@
> > #include "qemu/rcu.h"
> > #include "exec/ramlist.h"
> >
> > +/* Possible bits for cpu_physical_memory_sync_dirty_bitmap */
> > +
> > +/*
> > + * The old-fashioned sync, which is, in turn, used for CPU
> > + * throttle and memory transfer.
>
With the traditional sync method, the page sending logic iterates the
"bmap" to transfer dirty pages, while the CPU throttle logic counts the
number of new dirty pages to detect convergence; so "bmap" serves two
purposes.

With the modern sync method, "bmap" is used to transfer dirty pages and
"iter_bmap" is used to track new dirty pages.
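The per-sync bookkeeping then looks roughly like the following (an
illustrative sketch of the iteration-sync case only; the helper name is
made up and locking is omitted):

/*
 * Fold freshly-synced dirty bits into their two consumers:
 * "iter_bmap" feeds the throttle accounting, "bmap" feeds the
 * page sending.
 */
static void ramblock_account_sync(RAMBlock *rb, unsigned long pages)
{
    /* Throttle side: count pages dirtied since the last sync. */
    rb->iter_dirty_pages += bitmap_count_one(rb->iter_bmap, pages);

    /* Transfer side: the sending path keeps walking rb->bmap. */
    bitmap_or(rb->bmap, rb->bmap, rb->iter_bmap, pages);
    bitmap_zero(rb->iter_bmap, pages);
}

A background sync would do only the counting part and leave rb->bmap
untouched.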
> I'm not sure I follow what "in turn" is supposed to mean in this
> sentence. Could you clarify?
>
Here I wanted to express "in sequence", but obviously failed. :(
>
> > + */
> > +#define RAMBLOCK_SYN_LEGACY_ITER (1U << 0)
>
> So ITER is as opposed to background? I'm a bit confused with the terms.
>
Yes.
>
> > +
> > +/*
> > + * The modern sync, which is, in turn, used for CPU throttle
> > + * and memory transfer.
> > + */
> > +#define RAMBLOCK_SYN_MODERN_ITER (1U << 1)
> > +
> > +/* The modern sync, which is used for CPU throttle only */
> > +#define RAMBLOCK_SYN_MODERN_BACKGROUND (1U << 2)
>
> What's the plan for the "legacy" part? To be removed soon? Do we want to
> remove it now? Maybe better to not use the modern/legacy terms unless we
> want to give the impression that the legacy one is discontinued.
>
The distinction between the "legacy iteration" and the "modern
iteration" is the bitmap used to track dirty page information: the
"modern iteration" uses the "iter_bmap" field, while the "legacy
iteration" uses the "bmap" field.

Since the refinement is now transparent and there is no API available to
change the sync method, I actually want to remove the legacy mode right
now in order to simplify the logic. I'll do that in the next version.
>
> > +
> > +#define RAMBLOCK_SYN_MASK (0x7)
> > +
> > +typedef enum RAMBlockSynMode {
> > + RAMBLOCK_SYN_LEGACY, /* Old-fashined mode */
> > + RAMBLOCK_SYN_MODERN, /* Background-sync-supported mode */
> > +} RAMBlockSynMode;
>
> I'm also wondering whether we need this enum + the flags, or whether
> one of them would suffice. I'm looking at code like this in the
> following patches, for instance:
>
If we drop the legacy/modern distinction, we can simplify the following
logic too.
> + if (sync_mode == RAMBLOCK_SYN_MODERN) {
> + if (background) {
> + flag = RAMBLOCK_SYN_MODERN_BACKGROUND;
> + } else {
> + flag = RAMBLOCK_SYN_MODERN_ITER;
> + }
> + }
Couldn't we use LEGACY/BG/ITER?
> > +
> > struct RAMBlock {
> > struct rcu_head rcu;
> > struct MemoryRegion *mr;
> > @@ -89,6 +113,27 @@ struct RAMBlock {
> > * could not have been valid on the source.
> > */
> > ram_addr_t postcopy_length;
> > +
> > + /*
> > + * Used to backup the bmap during background sync to see whether
> any dirty
> > + * pages were sent during that time.
> > + */
> > + unsigned long *shadow_bmap;
> > +
> > + /*
> > + * The bitmap "bmap," which was initially used for both sync and
> memory
> > + * transfer, will be replaced by two bitmaps: the previously used
> "bmap"
> > + * and the recently added "iter_bmap." Only the memory transfer is
> > + * conducted with the previously used "bmap"; the recently added
> > + * "iter_bmap" is utilized for dirty bitmap sync.
> > + */
> > + unsigned long *iter_bmap;
> > +
> > + /* Number of new dirty pages during iteration */
> > + uint64_t iter_dirty_pages;
> > +
> > + /* If background sync has shown up during iteration */
> > + bool background_sync_shown_up;
> > };
> > #endif
> > #endif
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 67ca3d5d51..f29faa82d6 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -2362,6 +2362,10 @@ static void ram_bitmaps_destroy(void)
> > block->bmap = NULL;
> > g_free(block->file_bmap);
> > block->file_bmap = NULL;
> > + g_free(block->shadow_bmap);
> > + block->shadow_bmap = NULL;
> > + g_free(block->iter_bmap);
> > + block->iter_bmap = NULL;
> > }
> > }
> >
> > @@ -2753,6 +2757,8 @@ static void ram_list_init_bitmaps(void)
> > }
> > block->clear_bmap_shift = shift;
> > block->clear_bmap = bitmap_new(clear_bmap_size(pages,
> shift));
> > + block->shadow_bmap = bitmap_new(pages);
> > + block->iter_bmap = bitmap_new(pages);
> > }
> > }
> > }
>
--
Best regards
* Re: [PATCH v1 3/7] qapi/migration: Introduce the iteration-count
2024-09-16 20:35 ` Fabiano Rosas
@ 2024-09-17 6:52 ` Yong Huang
2024-09-18 8:29 ` Yong Huang
1 sibling, 0 replies; 28+ messages in thread
From: Yong Huang @ 2024-09-17 6:52 UTC (permalink / raw)
To: Fabiano Rosas
Cc: qemu-devel, Peter Xu, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Tue, Sep 17, 2024 at 4:35 AM Fabiano Rosas <farosas@suse.de> wrote:
> Hyman Huang <yong.huang@smartx.com> writes:
>
> > The original migration information dirty-sync-count could
> > no longer reflect iteration count due to the introduction
> > of background synchronization in the next commit;
> > add the iteration count to compensate.
>
> I agree with the overall idea, but I feel we're lacking some information
> on what determines whether some of the lines below want to use the
> iteration count vs. the dirty sync count. Since this patch increments
> both variables at the same place, they can still be used interchangeably
> unless we add some words to explain the distinction.
>
> So to clarify:
>
> What do we call an iteration? A call to save_live_iterate(),
> migration_iteration_run() or something else?
>
> Why dirty-sync-count should ever have reflected "iteration count"? It
> might have been this way by coincidence, but did we ever used it in that
> sense (aside from info migrate maybe)?
>
> With the new counter, what kind of meaning can a user extract from that
> number aside from "some undescribed thing happened N times" (this might
> be included in the migration.json docs)?
>
Alright, I'll revise the docs in the upcoming version and see whether
that clarifies the meaning of these two pieces of information.
> >
> > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > ---
> > migration/migration-stats.h | 4 ++++
> > migration/migration.c | 1 +
> > migration/ram.c | 12 ++++++++----
> > qapi/migration.json | 6 +++++-
> > tests/qtest/migration-test.c | 2 +-
> > 5 files changed, 19 insertions(+), 6 deletions(-)
> >
> > diff --git a/migration/migration-stats.h b/migration/migration-stats.h
> > index 05290ade76..43ee0f4f05 100644
> > --- a/migration/migration-stats.h
> > +++ b/migration/migration-stats.h
> > @@ -50,6 +50,10 @@ typedef struct {
> > * Number of times we have synchronized guest bitmaps.
> > */
> > Stat64 dirty_sync_count;
> > + /*
> > + * Number of migration iteration processed.
> > + */
> > + Stat64 iteration_count;
> > /*
> > * Number of times zero copy failed to send any page using zero
> > * copy.
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 3dea06d577..055d527ff6 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -1197,6 +1197,7 @@ static void populate_ram_info(MigrationInfo *info,
> MigrationState *s)
> > info->ram->mbps = s->mbps;
> > info->ram->dirty_sync_count =
> > stat64_get(&mig_stats.dirty_sync_count);
> > + info->ram->iteration_count = stat64_get(&mig_stats.iteration_count);
> > info->ram->dirty_sync_missed_zero_copy =
> > stat64_get(&mig_stats.dirty_sync_missed_zero_copy);
> > info->ram->postcopy_requests =
> > diff --git a/migration/ram.c b/migration/ram.c
> > index e205806a5f..ca5a1b5f16 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -594,7 +594,7 @@ static void xbzrle_cache_zero_page(ram_addr_t
> current_addr)
> > /* We don't care if this fails to allocate a new cache page
> > * as long as it updated an old one */
> > cache_insert(XBZRLE.cache, current_addr, XBZRLE.zero_target_page,
> > - stat64_get(&mig_stats.dirty_sync_count));
> > + stat64_get(&mig_stats.iteration_count));
> > }
> >
> > #define ENCODING_FLAG_XBZRLE 0x1
> > @@ -620,7 +620,7 @@ static int save_xbzrle_page(RAMState *rs,
> PageSearchStatus *pss,
> > int encoded_len = 0, bytes_xbzrle;
> > uint8_t *prev_cached_page;
> > QEMUFile *file = pss->pss_channel;
> > - uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
> > + uint64_t generation = stat64_get(&mig_stats.iteration_count);
> >
> > if (!cache_is_cached(XBZRLE.cache, current_addr, generation)) {
> > xbzrle_counters.cache_miss++;
> > @@ -1079,6 +1079,10 @@ static void migration_bitmap_sync(RAMState *rs,
> > RAMBlock *block;
> > int64_t end_time;
> >
> > + if (!background) {
> > + stat64_add(&mig_stats.iteration_count, 1);
> > + }
> > +
> > stat64_add(&mig_stats.dirty_sync_count, 1);
> >
> > if (!rs->time_last_bitmap_sync) {
> > @@ -1115,8 +1119,8 @@ static void migration_bitmap_sync(RAMState *rs,
> > rs->num_dirty_pages_period = 0;
> > rs->bytes_xfer_prev = migration_transferred_bytes();
> > }
> > - if (migrate_events()) {
> > - uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
> > + if (!background && migrate_events()) {
> > + uint64_t generation = stat64_get(&mig_stats.iteration_count);
> > qapi_event_send_migration_pass(generation);
> > }
> > }
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index b66cccf107..95b490706c 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -60,6 +60,9 @@
> > # between 0 and @dirty-sync-count * @multifd-channels. (since
> > # 7.1)
> > #
> > +# @iteration-count: The number of iterations since migration started.
> > +# (since 9.2)
> > +#
> > # Since: 0.14
> > ##
> > { 'struct': 'MigrationStats',
> > @@ -72,7 +75,8 @@
> > 'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
> > 'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
> > 'postcopy-bytes': 'uint64',
> > - 'dirty-sync-missed-zero-copy': 'uint64' } }
> > + 'dirty-sync-missed-zero-copy': 'uint64',
> > + 'iteration-count' : 'int' } }
> >
> > ##
> > # @XBZRLECacheStats:
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index d6768d5d71..b796a90cad 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -278,7 +278,7 @@ static int64_t read_migrate_property_int(QTestState
> *who, const char *property)
> >
> > static uint64_t get_migration_pass(QTestState *who)
> > {
> > - return read_ram_property_int(who, "dirty-sync-count");
> > + return read_ram_property_int(who, "iteration-count");
> > }
> >
> > static void read_blocktime(QTestState *who)
>
--
Best regards
* Re: [PATCH v1 6/7] qapi/migration: Introduce cpu-responsive-throttle parameter
2024-09-16 20:55 ` Fabiano Rosas
@ 2024-09-17 6:54 ` Yong Huang
0 siblings, 0 replies; 28+ messages in thread
From: Yong Huang @ 2024-09-17 6:54 UTC (permalink / raw)
To: Fabiano Rosas
Cc: qemu-devel, Peter Xu, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Tue, Sep 17, 2024 at 4:55 AM Fabiano Rosas <farosas@suse.de> wrote:
> Hyman Huang <yong.huang@smartx.com> writes:
>
> > To enable the responsive throttle that will be implemented
> > in the next commit, introduce the cpu-responsive-throttle
> > parameter.
> >
> > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > ---
> > migration/migration-hmp-cmds.c | 8 ++++++++
> > migration/options.c | 20 ++++++++++++++++++++
> > migration/options.h | 1 +
> > qapi/migration.json | 16 +++++++++++++++-
> > 4 files changed, 44 insertions(+), 1 deletion(-)
> >
> > diff --git a/migration/migration-hmp-cmds.c
> b/migration/migration-hmp-cmds.c
> > index 28165cfc9e..1fe6c74d66 100644
> > --- a/migration/migration-hmp-cmds.c
> > +++ b/migration/migration-hmp-cmds.c
> > @@ -264,6 +264,10 @@ void hmp_info_migrate_parameters(Monitor *mon,
> const QDict *qdict)
> > monitor_printf(mon, "%s: %s\n",
> >
> MigrationParameter_str(MIGRATION_PARAMETER_CPU_THROTTLE_TAILSLOW),
> > params->cpu_throttle_tailslow ? "on" : "off");
> > + assert(params->has_cpu_responsive_throttle);
> > + monitor_printf(mon, "%s: %s\n",
> > +
> MigrationParameter_str(MIGRATION_PARAMETER_CPU_RESPONSIVE_THROTTLE),
> > + params->cpu_responsive_throttle ? "on" : "off");
> > assert(params->has_max_cpu_throttle);
> > monitor_printf(mon, "%s: %u\n",
> >
> MigrationParameter_str(MIGRATION_PARAMETER_MAX_CPU_THROTTLE),
> > @@ -512,6 +516,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const
> QDict *qdict)
> > p->has_cpu_throttle_tailslow = true;
> > visit_type_bool(v, param, &p->cpu_throttle_tailslow, &err);
> > break;
> > + case MIGRATION_PARAMETER_CPU_RESPONSIVE_THROTTLE:
> > + p->has_cpu_responsive_throttle = true;
> > + visit_type_bool(v, param, &p->cpu_responsive_throttle, &err);
> > + break;
> > case MIGRATION_PARAMETER_MAX_CPU_THROTTLE:
> > p->has_max_cpu_throttle = true;
> > visit_type_uint8(v, param, &p->max_cpu_throttle, &err);
> > diff --git a/migration/options.c b/migration/options.c
> > index 147cd2b8fd..b4c269bf1d 100644
> > --- a/migration/options.c
> > +++ b/migration/options.c
> > @@ -111,6 +111,8 @@ Property migration_properties[] = {
> > DEFAULT_MIGRATE_CPU_THROTTLE_INCREMENT),
> > DEFINE_PROP_BOOL("x-cpu-throttle-tailslow", MigrationState,
> > parameters.cpu_throttle_tailslow, false),
> > + DEFINE_PROP_BOOL("x-cpu-responsive-throttle", MigrationState,
> > + parameters.cpu_responsive_throttle, false),
> > DEFINE_PROP_SIZE("x-max-bandwidth", MigrationState,
> > parameters.max_bandwidth, MAX_THROTTLE),
> > DEFINE_PROP_SIZE("avail-switchover-bandwidth", MigrationState,
> > @@ -705,6 +707,13 @@ uint8_t migrate_cpu_throttle_initial(void)
> > return s->parameters.cpu_throttle_initial;
> > }
> >
> > +bool migrate_responsive_throttle(void)
> > +{
> > + MigrationState *s = migrate_get_current();
> > +
> > + return s->parameters.cpu_responsive_throttle;
> > +}
> > +
> > bool migrate_cpu_throttle_tailslow(void)
> > {
> > MigrationState *s = migrate_get_current();
> > @@ -891,6 +900,8 @@ MigrationParameters
> *qmp_query_migrate_parameters(Error **errp)
> > params->cpu_throttle_increment =
> s->parameters.cpu_throttle_increment;
> > params->has_cpu_throttle_tailslow = true;
> > params->cpu_throttle_tailslow = s->parameters.cpu_throttle_tailslow;
> > + params->has_cpu_responsive_throttle = true;
> > + params->cpu_responsive_throttle =
> s->parameters.cpu_responsive_throttle;
> > params->tls_creds = g_strdup(s->parameters.tls_creds);
> > params->tls_hostname = g_strdup(s->parameters.tls_hostname);
> > params->tls_authz = g_strdup(s->parameters.tls_authz ?
> > @@ -959,6 +970,7 @@ void migrate_params_init(MigrationParameters *params)
> > params->has_cpu_throttle_initial = true;
> > params->has_cpu_throttle_increment = true;
> > params->has_cpu_throttle_tailslow = true;
> > + params->has_cpu_responsive_throttle = true;
> > params->has_max_bandwidth = true;
> > params->has_downtime_limit = true;
> > params->has_x_checkpoint_delay = true;
> > @@ -1191,6 +1203,10 @@ static void
> migrate_params_test_apply(MigrateSetParameters *params,
> > dest->cpu_throttle_tailslow = params->cpu_throttle_tailslow;
> > }
> >
> > + if (params->has_cpu_responsive_throttle) {
> > + dest->cpu_responsive_throttle = params->cpu_responsive_throttle;
> > + }
> > +
> > if (params->tls_creds) {
> > assert(params->tls_creds->type == QTYPE_QSTRING);
> > dest->tls_creds = params->tls_creds->u.s;
> > @@ -1302,6 +1318,10 @@ static void
> migrate_params_apply(MigrateSetParameters *params, Error **errp)
> > s->parameters.cpu_throttle_tailslow =
> params->cpu_throttle_tailslow;
> > }
> >
> > + if (params->has_cpu_responsive_throttle) {
> > + s->parameters.cpu_responsive_throttle =
> params->cpu_responsive_throttle;
> > + }
> > +
> > if (params->tls_creds) {
> > g_free(s->parameters.tls_creds);
> > assert(params->tls_creds->type == QTYPE_QSTRING);
> > diff --git a/migration/options.h b/migration/options.h
> > index a0bd6edc06..80d0fcdaf9 100644
> > --- a/migration/options.h
> > +++ b/migration/options.h
> > @@ -68,6 +68,7 @@ bool migrate_has_block_bitmap_mapping(void);
> > uint32_t migrate_checkpoint_delay(void);
> > uint8_t migrate_cpu_throttle_increment(void);
> > uint8_t migrate_cpu_throttle_initial(void);
> > +bool migrate_responsive_throttle(void);
> > bool migrate_cpu_throttle_tailslow(void);
> > bool migrate_direct_io(void);
> > uint64_t migrate_downtime_limit(void);
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 95b490706c..c61d3b3a6b 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -732,6 +732,10 @@
> > # be excessive at tail stage. The default value is false. (Since
> > # 5.1)
> > #
> > +# @cpu-responsive-throttle: Make CPU throttling more responsively by
> > +# introduce an extra detection metric of
> > +# migration convergence. (Since 9.1)
> > +#
> > # @tls-creds: ID of the 'tls-creds' object that provides credentials
> > # for establishing a TLS connection over the migration data
> > # channel. On the outgoing side of the migration, the credentials
> > @@ -857,7 +861,7 @@
> > 'announce-rounds', 'announce-step',
> > 'throttle-trigger-threshold',
> > 'cpu-throttle-initial', 'cpu-throttle-increment',
> > - 'cpu-throttle-tailslow',
> > + 'cpu-throttle-tailslow', 'cpu-responsive-throttle',
> > 'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
> > 'avail-switchover-bandwidth', 'downtime-limit',
> > { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
> > @@ -913,6 +917,10 @@
> > # be excessive at tail stage. The default value is false. (Since
> > # 5.1)
> > #
> > +# @cpu-responsive-throttle: Make CPU throttling more responsively by
> > +# introduce an extra detection metric of
> > +# migration convergence. (Since 9.1)
> > +#
> > # @tls-creds: ID of the 'tls-creds' object that provides credentials
> > # for establishing a TLS connection over the migration data
> > # channel. On the outgoing side of the migration, the credentials
> > @@ -1045,6 +1053,7 @@
> > '*cpu-throttle-initial': 'uint8',
> > '*cpu-throttle-increment': 'uint8',
> > '*cpu-throttle-tailslow': 'bool',
> > + '*cpu-responsive-throttle': 'bool',
>
> Should this instead follow the pattern of cpu-throttle-* ?
Ok
> > '*tls-creds': 'StrOrNull',
> > '*tls-hostname': 'StrOrNull',
> > '*tls-authz': 'StrOrNull',
> > @@ -1127,6 +1136,10 @@
> > # be excessive at tail stage. The default value is false. (Since
> > # 5.1)
> > #
> > +# @cpu-responsive-throttle: Make CPU throttling more responsively by
>
> s/responsively/responsive/ ?
>
> > +# introduce an extra detection metric of
> > +# migration convergence. (Since 9.1)
>
> Since 9.2. And double space after period.
>
> > +#
> > # @tls-creds: ID of the 'tls-creds' object that provides credentials
> > # for establishing a TLS connection over the migration data
> > # channel. On the outgoing side of the migration, the credentials
> > @@ -1252,6 +1265,7 @@
> > '*cpu-throttle-initial': 'uint8',
> > '*cpu-throttle-increment': 'uint8',
> > '*cpu-throttle-tailslow': 'bool',
> > + '*cpu-responsive-throttle': 'bool',
> > '*tls-creds': 'str',
> > '*tls-hostname': 'str',
> > '*tls-authz': 'str',
>
--
Best regards
* Re: [PATCH v1 3/7] qapi/migration: Introduce the iteration-count
2024-09-16 20:35 ` Fabiano Rosas
2024-09-17 6:52 ` Yong Huang
@ 2024-09-18 8:29 ` Yong Huang
2024-09-18 14:39 ` Fabiano Rosas
1 sibling, 1 reply; 28+ messages in thread
From: Yong Huang @ 2024-09-18 8:29 UTC (permalink / raw)
To: Fabiano Rosas
Cc: qemu-devel, Peter Xu, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Tue, Sep 17, 2024 at 4:35 AM Fabiano Rosas <farosas@suse.de> wrote:
> Hyman Huang <yong.huang@smartx.com> writes:
>
> > The original migration information dirty-sync-count could
> > no longer reflect iteration count due to the introduction
> > of background synchronization in the next commit;
> > add the iteration count to compensate.
>
> I agree with the overall idea, but I feel we're lacking some information
> on what determines whether some of the lines below want to use the
> iteration count vs. the dirty sync count. Since this patch increments
> both variables at the same place, they can still be used interchangeably
> unless we add some words to explain the distinction.
>
> So to clarify:
>
> What do we call an iteration? A call to save_live_iterate(),
> migration_iteration_run() or something else?
>
> Why dirty-sync-count should ever have reflected "iteration count"? It
> might have been this way by coincidence, but did we ever used it in that
> sense (aside from info migrate maybe)?
>
Unfortunately, I found that libvirt already treats "dirty-sync-count"
as the "iteration count", so making "iteration-count" carry the meaning
that "dirty-sync-count" used to have could break backward compatibility
for consumers that still read "dirty-sync-count".

To avoid that side effect, we could keep "dirty-sync-count" reporting
what consumers already expect (the iteration count) and introduce a new
field like "dirty-sync-count-internal" to report the *real* number of
bitmap syncs?
diff --git a/migration/migration.c b/migration/migration.c
index f97f6352d2..663315d7e6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1196,8 +1196,9 @@ static void populate_ram_info(MigrationInfo *info,
MigrationState *s)
info->ram->normal_bytes = info->ram->normal * page_size;
info->ram->mbps = s->mbps;
info->ram->dirty_sync_count =
+ stat64_get(&mig_stats.iteration_count);
+ info->ram->dirty_sync_count_internal =
stat64_get(&mig_stats.dirty_sync_count);
- info->ram->iteration_count = stat64_get(&mig_stats.iteration_count);
info->ram->dirty_sync_missed_zero_copy =
stat64_get(&mig_stats.dirty_sync_missed_zero_copy);
info->ram->postcopy_requests =
>
> With the new counter, what kind of meaning can a user extract from that
> number aside from "some undescribed thing happened N times" (this might
> be included in the migration.json docs)?
>
> >
> > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > ---
> > migration/migration-stats.h | 4 ++++
> > migration/migration.c | 1 +
> > migration/ram.c | 12 ++++++++----
> > qapi/migration.json | 6 +++++-
> > tests/qtest/migration-test.c | 2 +-
> > 5 files changed, 19 insertions(+), 6 deletions(-)
> >
> > diff --git a/migration/migration-stats.h b/migration/migration-stats.h
> > index 05290ade76..43ee0f4f05 100644
> > --- a/migration/migration-stats.h
> > +++ b/migration/migration-stats.h
> > @@ -50,6 +50,10 @@ typedef struct {
> > * Number of times we have synchronized guest bitmaps.
> > */
> > Stat64 dirty_sync_count;
> > + /*
> > + * Number of migration iteration processed.
> > + */
> > + Stat64 iteration_count;
> > /*
> > * Number of times zero copy failed to send any page using zero
> > * copy.
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 3dea06d577..055d527ff6 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -1197,6 +1197,7 @@ static void populate_ram_info(MigrationInfo *info,
> MigrationState *s)
> > info->ram->mbps = s->mbps;
> > info->ram->dirty_sync_count =
> > stat64_get(&mig_stats.dirty_sync_count);
> > + info->ram->iteration_count = stat64_get(&mig_stats.iteration_count);
> > info->ram->dirty_sync_missed_zero_copy =
> > stat64_get(&mig_stats.dirty_sync_missed_zero_copy);
> > info->ram->postcopy_requests =
> > diff --git a/migration/ram.c b/migration/ram.c
> > index e205806a5f..ca5a1b5f16 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -594,7 +594,7 @@ static void xbzrle_cache_zero_page(ram_addr_t
> current_addr)
> > /* We don't care if this fails to allocate a new cache page
> > * as long as it updated an old one */
> > cache_insert(XBZRLE.cache, current_addr, XBZRLE.zero_target_page,
> > - stat64_get(&mig_stats.dirty_sync_count));
> > + stat64_get(&mig_stats.iteration_count));
> > }
> >
> > #define ENCODING_FLAG_XBZRLE 0x1
> > @@ -620,7 +620,7 @@ static int save_xbzrle_page(RAMState *rs,
> PageSearchStatus *pss,
> > int encoded_len = 0, bytes_xbzrle;
> > uint8_t *prev_cached_page;
> > QEMUFile *file = pss->pss_channel;
> > - uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
> > + uint64_t generation = stat64_get(&mig_stats.iteration_count);
> >
> > if (!cache_is_cached(XBZRLE.cache, current_addr, generation)) {
> > xbzrle_counters.cache_miss++;
> > @@ -1079,6 +1079,10 @@ static void migration_bitmap_sync(RAMState *rs,
> > RAMBlock *block;
> > int64_t end_time;
> >
> > + if (!background) {
> > + stat64_add(&mig_stats.iteration_count, 1);
> > + }
> > +
> > stat64_add(&mig_stats.dirty_sync_count, 1);
> >
> > if (!rs->time_last_bitmap_sync) {
> > @@ -1115,8 +1119,8 @@ static void migration_bitmap_sync(RAMState *rs,
> > rs->num_dirty_pages_period = 0;
> > rs->bytes_xfer_prev = migration_transferred_bytes();
> > }
> > - if (migrate_events()) {
> > - uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
> > + if (!background && migrate_events()) {
> > + uint64_t generation = stat64_get(&mig_stats.iteration_count);
> > qapi_event_send_migration_pass(generation);
> > }
> > }
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index b66cccf107..95b490706c 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -60,6 +60,9 @@
> > # between 0 and @dirty-sync-count * @multifd-channels. (since
> > # 7.1)
> > #
> > +# @iteration-count: The number of iterations since migration started.
> > +# (since 9.2)
> > +#
> > # Since: 0.14
> > ##
> > { 'struct': 'MigrationStats',
> > @@ -72,7 +75,8 @@
> > 'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
> > 'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
> > 'postcopy-bytes': 'uint64',
> > - 'dirty-sync-missed-zero-copy': 'uint64' } }
> > + 'dirty-sync-missed-zero-copy': 'uint64',
> > + 'iteration-count' : 'int' } }
> >
> > ##
> > # @XBZRLECacheStats:
> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> > index d6768d5d71..b796a90cad 100644
> > --- a/tests/qtest/migration-test.c
> > +++ b/tests/qtest/migration-test.c
> > @@ -278,7 +278,7 @@ static int64_t read_migrate_property_int(QTestState
> *who, const char *property)
> >
> > static uint64_t get_migration_pass(QTestState *who)
> > {
> > - return read_ram_property_int(who, "dirty-sync-count");
> > + return read_ram_property_int(who, "iteration-count");
> > }
> >
> > static void read_blocktime(QTestState *who)
>
--
Best regards
* Re: [PATCH v1 3/7] qapi/migration: Introduce the iteration-count
2024-09-18 8:29 ` Yong Huang
@ 2024-09-18 14:39 ` Fabiano Rosas
0 siblings, 0 replies; 28+ messages in thread
From: Fabiano Rosas @ 2024-09-18 14:39 UTC (permalink / raw)
To: Yong Huang
Cc: qemu-devel, Peter Xu, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
Yong Huang <yong.huang@smartx.com> writes:
> On Tue, Sep 17, 2024 at 4:35 AM Fabiano Rosas <farosas@suse.de> wrote:
>
>> Hyman Huang <yong.huang@smartx.com> writes:
>>
>> > The original migration information dirty-sync-count could
>> > no longer reflect iteration count due to the introduction
>> > of background synchronization in the next commit;
>> > add the iteration count to compensate.
>>
>> I agree with the overall idea, but I feel we're lacking some information
>> on what determines whether some of the lines below want to use the
>> iteration count vs. the dirty sync count. Since this patch increments
>> both variables at the same place, they can still be used interchangeably
>> unless we add some words to explain the distinction.
>>
>> So to clarify:
>>
>> What do we call an iteration? A call to save_live_iterate(),
>> migration_iteration_run() or something else?
>>
>> Why dirty-sync-count should ever have reflected "iteration count"? It
>> might have been this way by coincidence, but did we ever used it in that
>> sense (aside from info migrate maybe)?
>>
>
> Unfortunately, I found that libvirt already treats "dirty-sync-count"
> as the "iteration count", so making "iteration-count" carry the meaning
> that "dirty-sync-count" used to have could break backward compatibility
> for consumers that still read "dirty-sync-count".
>
> To avoid that side effect, we could keep "dirty-sync-count" reporting
> what consumers already expect (the iteration count) and introduce a new
> field like "dirty-sync-count-internal" to report the *real* number of
> bitmap syncs?
>
> diff --git a/migration/migration.c b/migration/migration.c
> index f97f6352d2..663315d7e6 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1196,8 +1196,9 @@ static void populate_ram_info(MigrationInfo *info,
> MigrationState *s)
> info->ram->normal_bytes = info->ram->normal * page_size;
> info->ram->mbps = s->mbps;
> info->ram->dirty_sync_count =
> + stat64_get(&mig_stats.iteration_count);
ok
> + info->ram->dirty_sync_count_internal =
> stat64_get(&mig_stats.dirty_sync_count);
Does this need to be exposed at all? If it does then it'll need a name
that doesn't have "internal" in it.
> - info->ram->iteration_count = stat64_get(&mig_stats.iteration_count);
> info->ram->dirty_sync_missed_zero_copy =
> stat64_get(&mig_stats.dirty_sync_missed_zero_copy);
> info->ram->postcopy_requests =
>
>
>>
>> With the new counter, what kind of meaning can a user extract from that
>> number aside from "some undescribed thing happened N times" (this might
>> be included in the migration.json docs)?
>>
>> >
>> > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
>> > ---
>> > migration/migration-stats.h | 4 ++++
>> > migration/migration.c | 1 +
>> > migration/ram.c | 12 ++++++++----
>> > qapi/migration.json | 6 +++++-
>> > tests/qtest/migration-test.c | 2 +-
>> > 5 files changed, 19 insertions(+), 6 deletions(-)
>> >
>> > diff --git a/migration/migration-stats.h b/migration/migration-stats.h
>> > index 05290ade76..43ee0f4f05 100644
>> > --- a/migration/migration-stats.h
>> > +++ b/migration/migration-stats.h
>> > @@ -50,6 +50,10 @@ typedef struct {
>> > * Number of times we have synchronized guest bitmaps.
>> > */
>> > Stat64 dirty_sync_count;
>> > + /*
>> > + * Number of migration iteration processed.
>> > + */
>> > + Stat64 iteration_count;
>> > /*
>> > * Number of times zero copy failed to send any page using zero
>> > * copy.
>> > diff --git a/migration/migration.c b/migration/migration.c
>> > index 3dea06d577..055d527ff6 100644
>> > --- a/migration/migration.c
>> > +++ b/migration/migration.c
>> > @@ -1197,6 +1197,7 @@ static void populate_ram_info(MigrationInfo *info,
>> MigrationState *s)
>> > info->ram->mbps = s->mbps;
>> > info->ram->dirty_sync_count =
>> > stat64_get(&mig_stats.dirty_sync_count);
>> > + info->ram->iteration_count = stat64_get(&mig_stats.iteration_count);
>> > info->ram->dirty_sync_missed_zero_copy =
>> > stat64_get(&mig_stats.dirty_sync_missed_zero_copy);
>> > info->ram->postcopy_requests =
>> > diff --git a/migration/ram.c b/migration/ram.c
>> > index e205806a5f..ca5a1b5f16 100644
>> > --- a/migration/ram.c
>> > +++ b/migration/ram.c
>> > @@ -594,7 +594,7 @@ static void xbzrle_cache_zero_page(ram_addr_t
>> current_addr)
>> > /* We don't care if this fails to allocate a new cache page
>> > * as long as it updated an old one */
>> > cache_insert(XBZRLE.cache, current_addr, XBZRLE.zero_target_page,
>> > - stat64_get(&mig_stats.dirty_sync_count));
>> > + stat64_get(&mig_stats.iteration_count));
>> > }
>> >
>> > #define ENCODING_FLAG_XBZRLE 0x1
>> > @@ -620,7 +620,7 @@ static int save_xbzrle_page(RAMState *rs,
>> PageSearchStatus *pss,
>> > int encoded_len = 0, bytes_xbzrle;
>> > uint8_t *prev_cached_page;
>> > QEMUFile *file = pss->pss_channel;
>> > - uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
>> > + uint64_t generation = stat64_get(&mig_stats.iteration_count);
>> >
>> > if (!cache_is_cached(XBZRLE.cache, current_addr, generation)) {
>> > xbzrle_counters.cache_miss++;
>> > @@ -1079,6 +1079,10 @@ static void migration_bitmap_sync(RAMState *rs,
>> > RAMBlock *block;
>> > int64_t end_time;
>> >
>> > + if (!background) {
>> > + stat64_add(&mig_stats.iteration_count, 1);
>> > + }
>> > +
>> > stat64_add(&mig_stats.dirty_sync_count, 1);
>> >
>> > if (!rs->time_last_bitmap_sync) {
>> > @@ -1115,8 +1119,8 @@ static void migration_bitmap_sync(RAMState *rs,
>> > rs->num_dirty_pages_period = 0;
>> > rs->bytes_xfer_prev = migration_transferred_bytes();
>> > }
>> > - if (migrate_events()) {
>> > - uint64_t generation = stat64_get(&mig_stats.dirty_sync_count);
>> > + if (!background && migrate_events()) {
>> > + uint64_t generation = stat64_get(&mig_stats.iteration_count);
>> > qapi_event_send_migration_pass(generation);
>> > }
>> > }
>> > diff --git a/qapi/migration.json b/qapi/migration.json
>> > index b66cccf107..95b490706c 100644
>> > --- a/qapi/migration.json
>> > +++ b/qapi/migration.json
>> > @@ -60,6 +60,9 @@
>> > # between 0 and @dirty-sync-count * @multifd-channels. (since
>> > # 7.1)
>> > #
>> > +# @iteration-count: The number of iterations since migration started.
>> > +# (since 9.2)
>> > +#
>> > # Since: 0.14
>> > ##
>> > { 'struct': 'MigrationStats',
>> > @@ -72,7 +75,8 @@
>> > 'multifd-bytes': 'uint64', 'pages-per-second': 'uint64',
>> > 'precopy-bytes': 'uint64', 'downtime-bytes': 'uint64',
>> > 'postcopy-bytes': 'uint64',
>> > - 'dirty-sync-missed-zero-copy': 'uint64' } }
>> > + 'dirty-sync-missed-zero-copy': 'uint64',
>> > + 'iteration-count' : 'int' } }
>> >
>> > ##
>> > # @XBZRLECacheStats:
>> > diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> > index d6768d5d71..b796a90cad 100644
>> > --- a/tests/qtest/migration-test.c
>> > +++ b/tests/qtest/migration-test.c
>> > @@ -278,7 +278,7 @@ static int64_t read_migrate_property_int(QTestState
>> *who, const char *property)
>> >
>> > static uint64_t get_migration_pass(QTestState *who)
>> > {
>> > - return read_ram_property_int(who, "dirty-sync-count");
>> > + return read_ram_property_int(who, "iteration-count");
>> > }
>> >
>> > static void read_blocktime(QTestState *who)
>>
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-17 6:48 ` Yong Huang
@ 2024-09-19 18:45 ` Peter Xu
2024-09-20 2:43 ` Yong Huang
` (2 more replies)
0 siblings, 3 replies; 28+ messages in thread
From: Peter Xu @ 2024-09-19 18:45 UTC (permalink / raw)
To: Yong Huang
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Tue, Sep 17, 2024 at 02:48:03PM +0800, Yong Huang wrote:
> On Tue, Sep 17, 2024 at 5:11 AM Fabiano Rosas <farosas@suse.de> wrote:
>
> > Hyman Huang <yong.huang@smartx.com> writes:
> >
> > > shadow_bmap, iter_bmap and iter_dirty_pages are introduced
> > > to satisfy the need for background sync.
> > >
> > > Meanwhile, introduce enumeration of sync method.
> > >
> > > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > > ---
> > > include/exec/ramblock.h | 45 +++++++++++++++++++++++++++++++++++++++++
> > > migration/ram.c | 6 ++++++
> > > 2 files changed, 51 insertions(+)
> > >
> > > diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> > > index 0babd105c0..0e327bc0ae 100644
> > > --- a/include/exec/ramblock.h
> > > +++ b/include/exec/ramblock.h
> > > @@ -24,6 +24,30 @@
> > > #include "qemu/rcu.h"
> > > #include "exec/ramlist.h"
> > >
> > > +/* Possible bits for cpu_physical_memory_sync_dirty_bitmap */
> > > +
> > > +/*
> > > + * The old-fashioned sync, which is, in turn, used for CPU
> > > + * throttle and memory transfer.
> >
>
> With the traditional sync method, the page sending logic iterates the
> "bmap" to transfer dirty pages, while the CPU throttle logic counts the
> number of new dirty pages to detect convergence; so "bmap" serves two
> purposes.
>
> With the modern sync method, "bmap" is used to transfer dirty pages and
> "iter_bmap" is used to track new dirty pages.
>
>
> > I'm not sure I follow what "in turn" is supposed to mean in this
> > sentence. Could you clarify?
> >
>
> Here I wanted to express "in sequence", but obviously failed. :(
>
>
> >
> > > + */
> > > +#define RAMBLOCK_SYN_LEGACY_ITER (1U << 0)
> >
> > So ITER is as opposed to background? I'm a bit confused with the terms.
> >
>
> Yes.
>
>
> >
> > > +
> > > +/*
> > > + * The modern sync, which is, in turn, used for CPU throttle
> > > + * and memory transfer.
> > > + */
> > > +#define RAMBLOCK_SYN_MODERN_ITER (1U << 1)
> > > +
> > > +/* The modern sync, which is used for CPU throttle only */
> > > +#define RAMBLOCK_SYN_MODERN_BACKGROUND (1U << 2)
> >
> > What's the plan for the "legacy" part? To be removed soon? Do we want to
> > remove it now? Maybe better to not use the modern/legacy terms unless we
> > want to give the impression that the legacy one is discontinued.
> >
>
> The distinction between the "legacy iteration" and the "modern
> iteration" is the bitmap used to track dirty page information: the
> "modern iteration" uses the "iter_bmap" field, while the "legacy
> iteration" uses the "bmap" field.
>
> Since the refinement is now transparent and there is no API available to
> change the sync method, I actually want to remove the legacy mode right
> now in order to simplify the logic. I'll do that in the next version.
How confident are we that the new way is better than the old?

If it will always be better, I agree we can consider removing the old
one. But is it always better? At the very least it consumes many more
resources.

Otherwise, I think we can still leave that logic as-is and use a
migration property to turn the new behavior on only for new machine
types.

Besides, could you explain why the solution needs to be this complex?
My previous concern was that we sync the dirty bitmap too rarely:
auto converge relies on dirty information, so it can only be adjusted
as often as we sync.
However, I wonder whether that could be achieved in a simpler manner,
e.g. by invoking migration_bitmap_sync_precopy() more frequently during
migration - for example, from ram_save_iterate(). Not on every call,
but iterate() is invoked much more frequently, so we could sync there
from time to time.
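Something along these lines, perhaps (untested sketch: the one-second
period is arbitrary, and it assumes the caller can take the BQL around
the sync):

/* Called from ram_save_iterate(); sync at most once per second. */
static void ram_maybe_sync_dirty(RAMState *rs)
{
    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);

    if (now - rs->time_last_bitmap_sync > 1000) {
        migration_bitmap_sync_precopy(rs, false);
    }
}

That would reuse the existing sync path instead of adding a new one.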
I also don't see why we need a separate thread, plus two new bitmaps,
to achieve this. I haven't read it in depth yet, but I thought dirty
sync requires the BQL anyway, so I don't yet understand why the two
bitmaps are required. Since the bitmaps are introduced in the 1st
patch, IMO it would be great to explain clearly there why they're
needed.
Thanks,
--
Peter Xu
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-19 18:45 ` Peter Xu
@ 2024-09-20 2:43 ` Yong Huang
2024-09-25 19:17 ` Peter Xu
2024-09-20 3:02 ` Yong Huang
2024-09-20 3:13 ` Yong Huang
2 siblings, 1 reply; 28+ messages in thread
From: Yong Huang @ 2024-09-20 2:43 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Fri, Sep 20, 2024 at 2:45 AM Peter Xu <peterx@redhat.com> wrote:
> On Tue, Sep 17, 2024 at 02:48:03PM +0800, Yong Huang wrote:
> > On Tue, Sep 17, 2024 at 5:11 AM Fabiano Rosas <farosas@suse.de> wrote:
> >
> > > Hyman Huang <yong.huang@smartx.com> writes:
> > >
> > > > shadow_bmap, iter_bmap and iter_dirty_pages are introduced
> > > > to satisfy the need for background sync.
> > > >
> > > > Meanwhile, introduce enumeration of sync method.
> > > >
> > > > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > > > ---
> > > > include/exec/ramblock.h | 45 +++++++++++++++++++++++++++++++++++++++++
> > > > migration/ram.c | 6 ++++++
> > > > 2 files changed, 51 insertions(+)
> > > >
> > > > diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> > > > index 0babd105c0..0e327bc0ae 100644
> > > > --- a/include/exec/ramblock.h
> > > > +++ b/include/exec/ramblock.h
> > > > @@ -24,6 +24,30 @@
> > > > #include "qemu/rcu.h"
> > > > #include "exec/ramlist.h"
> > > >
> > > > +/* Possible bits for cpu_physical_memory_sync_dirty_bitmap */
> > > > +
> > > > +/*
> > > > + * The old-fashioned sync, which is, in turn, used for CPU
> > > > + * throttle and memory transfer.
> > >
> >
> > Using the traditional sync method, the page sending logic iterates
> > the "bmap" to transfer dirty pages while the CPU throttle logic
> > counts the number of new dirty pages and detects convergence.
> > There are two uses for "bmap".
> >
> > Using the modern sync method, "bmap" is used to transfer
> > dirty pages and "iter_bmap" is used to track new dirty pages.
> >
> >
> > > I'm not sure I follow what "in turn" is supposed to mean in this
> > > sentence. Could you clarify?
> > >
> >
> > Here I meant to say "in sequence", but obviously failed. :(
> >
> >
> > >
> > > > + */
> > > > +#define RAMBLOCK_SYN_LEGACY_ITER (1U << 0)
> > >
> > > So ITER is as opposed to background? I'm a bit confused with the terms.
> > >
> >
> > Yes.
> >
> >
> > >
> > > > +
> > > > +/*
> > > > + * The modern sync, which is, in turn, used for CPU throttle
> > > > + * and memory transfer.
> > > > + */
> > > > +#define RAMBLOCK_SYN_MODERN_ITER (1U << 1)
> > > > +
> > > > +/* The modern sync, which is used for CPU throttle only */
> > > > +#define RAMBLOCK_SYN_MODERN_BACKGROUND (1U << 2)
> > >
> > > What's the plan for the "legacy" part? To be removed soon? Do we want to
> > > remove it now? Maybe better to not use the modern/legacy terms unless we
> > > want to give the impression that the legacy one is discontinued.
> > >
> >
> > The distinction between the "legacy iteration" and the "modern
> > iteration" is the bitmap each uses to track dirty page information:
> > the "iter_bmap" field is used by the "modern iteration" while the
> > "bmap" field is used by the "legacy iteration."
> >
> > Since the refinement is now transparent and there is no API available to
> > change the sync method, I actually want to remove it right now in order
> > to simplify the logic. I'll include it in the next version.
>
> How confident are we that the new way is better than the old?
>
> If it'll be 100% / always better, I agree we can consider removing the old.
> But is it always better? At least it consumes much more resources...
>
> Otherwise, we can still leave that logic as-is but use a migration property
> to turn it on only on new machine types, I think.
>
> Besides, could you explain why the solution needs to be this complex? My
> previous concern was that we sync dirty pages too rarely, while auto converge
> relies on dirty information, which means auto converge can be adjusted
> too infrequently.
>
The original logic updates the bmap on every sync, and the bmap is what
drives dirty page sending. In the background sync logic, we do not want
each background sync to update the bmap and interfere with page sending,
since the bitmap being synced there is only used to detect convergence
and drive the CPU throttle.

An iteration sync does four things: 1) sync the dirty bitmap, 2) detect
convergence, 3) apply the CPU throttle, and 4) use the fetched bmap to
send pages; a background sync only does 1-3. Since the two serve
different purposes, the logic needs at least two bitmaps: one for page
sending and another for CPU throttling. To achieve this, we introduced
iter_bmap as a temporary bitmap that stores the dirty page information
collected during background sync; the iteration sync logic then folds it
into the bmap.

However, the dirty page information in iter_bmap may be redundant, since
the dirty pages it records could already have been sent after a background
sync; we therefore introduced shadow_bmap to help compute which dirty
pages were sent between two background syncs.
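To make that concrete, here is a rough sketch of the bookkeeping
(illustrative only, not the patch code; bmap, iter_bmap, shadow_bmap and
iter_dirty_pages are the fields from this series, while
background_sync_block(), iteration_sync_block(), fetch_dirty_log() and
count_unsent_dirty() are invented names):

/* Illustrative sketch only -- not the actual patch code. */
static void background_sync_block(RAMBlock *rb, unsigned long pages)
{
    /* Fetch newly dirtied pages into iter_bmap, leaving bmap -- which
     * drives page sending -- untouched. (invented helper) */
    fetch_dirty_log(rb, rb->iter_bmap, pages);

    /* shadow_bmap remembers bmap as of the previous sync, so pages
     * already sent in between are not counted again when checking
     * convergence. (count_unsent_dirty() is invented, too) */
    rb->iter_dirty_pages += count_unsent_dirty(rb->iter_bmap,
                                               rb->shadow_bmap, pages);
    bitmap_copy(rb->shadow_bmap, rb->bmap, pages);
}

static void iteration_sync_block(RAMBlock *rb, unsigned long pages)
{
    /* A regular iteration sync folds everything accumulated in
     * iter_bmap back into bmap, so those pages get (re)sent. */
    bitmap_or(rb->bmap, rb->bmap, rb->iter_bmap, pages);
    bitmap_clear(rb->iter_bmap, 0, pages);
    rb->iter_dirty_pages = 0;
}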
> However, I wonder whether that can be achieved in a simpler manner by
>
I have tried my best to make the solution simpler but failed. :(
> e.g. invoking migration_bitmap_sync_precopy() more frequently during
>
Yes, invoking migration_bitmap_sync_precopy() more frequently was also my
first idea, but it involves updating the bitmap and interferes with the
behavior of page sending; it also affects the migration stats and
interferes with other migration logic such as migration_update_rates().
> migration, for example in ram_save_iterate() - not on every call, but since
> iterate() is invoked much more frequently, we could do a sync from time
> to time.
> I also don't see why we need a separate thread, plus two new bitmaps, to
> achieve this... I didn't read it in depth yet, but I thought dirty sync
> requires the BQL anyway, so I don't yet understand why the two bitmaps are
> required. If the bitmaps are introduced in the 1st patch, IMO it'd be
> great to explain clearly why they're needed here.
>
> Thanks,
>
> --
> Peter Xu
>
>
Thanks,
Yong
--
Best regards
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-19 18:45 ` Peter Xu
2024-09-20 2:43 ` Yong Huang
@ 2024-09-20 3:02 ` Yong Huang
2024-09-20 3:13 ` Yong Huang
2 siblings, 0 replies; 28+ messages in thread
From: Yong Huang @ 2024-09-20 3:02 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Fri, Sep 20, 2024 at 2:45 AM Peter Xu <peterx@redhat.com> wrote:
> On Tue, Sep 17, 2024 at 02:48:03PM +0800, Yong Huang wrote:
> > On Tue, Sep 17, 2024 at 5:11 AM Fabiano Rosas <farosas@suse.de> wrote:
> >
> > > Hyman Huang <yong.huang@smartx.com> writes:
> > >
> > > > shadow_bmap, iter_bmap and iter_dirty_pages are introduced
> > > > to satisfy the need for background sync.
> > > >
> > > > Meanwhile, introduce enumeration of sync method.
> > > >
> > > > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > > > ---
> > > > include/exec/ramblock.h | 45 +++++++++++++++++++++++++++++++++++++++++
> > > > migration/ram.c | 6 ++++++
> > > > 2 files changed, 51 insertions(+)
> > > >
> > > > diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> > > > index 0babd105c0..0e327bc0ae 100644
> > > > --- a/include/exec/ramblock.h
> > > > +++ b/include/exec/ramblock.h
> > > > @@ -24,6 +24,30 @@
> > > > #include "qemu/rcu.h"
> > > > #include "exec/ramlist.h"
> > > >
> > > > +/* Possible bits for cpu_physical_memory_sync_dirty_bitmap */
> > > > +
> > > > +/*
> > > > + * The old-fashioned sync, which is, in turn, used for CPU
> > > > + * throttle and memory transfer.
> > >
> >
> > Using the traditional sync method, the page sending logic iterates
> > the "bmap" to transfer dirty pages while the CPU throttle logic
> > counts the number of new dirty pages and detects convergence.
> > There are two uses for "bmap".
> >
> > Using the modern sync method, "bmap" is used to transfer
> > dirty pages and "iter_bmap" is used to track new dirty pages.
> >
> >
> > > I'm not sure I follow what "in turn" is supposed to mean in this
> > > sentence. Could you clarify?
> > >
> >
> > Here I meant to say "in sequence", but obviously failed. :(
> >
> >
> > >
> > > > + */
> > > > +#define RAMBLOCK_SYN_LEGACY_ITER (1U << 0)
> > >
> > > So ITER is as opposed to background? I'm a bit confused with the terms.
> > >
> >
> > Yes.
> >
> >
> > >
> > > > +
> > > > +/*
> > > > + * The modern sync, which is, in turn, used for CPU throttle
> > > > + * and memory transfer.
> > > > + */
> > > > +#define RAMBLOCK_SYN_MODERN_ITER (1U << 1)
> > > > +
> > > > +/* The modern sync, which is used for CPU throttle only */
> > > > +#define RAMBLOCK_SYN_MODERN_BACKGROUND (1U << 2)
> > >
> > > What's the plan for the "legacy" part? To be removed soon? Do we want to
> > > remove it now? Maybe better to not use the modern/legacy terms unless we
> > > want to give the impression that the legacy one is discontinued.
> > >
> >
> > The distinction between the "legacy iteration" and the "modern
> > iteration" is the bitmap each uses to track dirty page information:
> > the "iter_bmap" field is used by the "modern iteration" while the
> > "bmap" field is used by the "legacy iteration."
> >
> > Since the refinement is now transparent and there is no API available to
> > change the sync method, I actually want to remove it right now in order
> > to simplify the logic. I'll include it in the next version.
>
> How confident are we that the new way is better than the old?
>
> If it'll be 100% / always better, I agree we can consider removing the old.
> But is it always better? At least it consumes much more resources...
>
Yes, it introduces an extra bitmap compared with the old sync logic.
>
> Otherwise, we can still leave that logic as-is but use a migration property
> to turn it on only on new machine types, I think.
>
OK, that's fine.
>
> Besides, could you explain why the solution needs to be this complex? My
> previous concern was that we sync dirty pages too rarely, while auto converge
> relies on dirty information, which means auto converge can be adjusted
> too infrequently.
>
> However, I wonder whether that can be achieved in a simpler manner by
> e.g. invoking migration_bitmap_sync_precopy() more frequently during
> migration, for example in ram_save_iterate() - not on every call, but since
> iterate() is invoked much more frequently, we could do a sync from time
> to time.
>
> I also don't see why we need a separate thread, plus two new bitmaps, to
> achieve this... I didn't read it in depth yet, but I thought dirty sync
> requires the BQL anyway, so I don't yet understand why the two bitmaps are
> required. If the bitmaps are introduced in the 1st patch, IMO it'd be
> great to explain clearly why they're needed here.
>
> Thanks,
>
> --
> Peter Xu
>
>
--
Best regards
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-19 18:45 ` Peter Xu
2024-09-20 2:43 ` Yong Huang
2024-09-20 3:02 ` Yong Huang
@ 2024-09-20 3:13 ` Yong Huang
2 siblings, 0 replies; 28+ messages in thread
From: Yong Huang @ 2024-09-20 3:13 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Fri, Sep 20, 2024 at 2:45 AM Peter Xu <peterx@redhat.com> wrote:
> On Tue, Sep 17, 2024 at 02:48:03PM +0800, Yong Huang wrote:
> > On Tue, Sep 17, 2024 at 5:11 AM Fabiano Rosas <farosas@suse.de> wrote:
> >
> > > Hyman Huang <yong.huang@smartx.com> writes:
> > >
> > > > shadow_bmap, iter_bmap and iter_dirty_pages are introduced
> > > > to satisfy the need for background sync.
> > > >
> > > > Meanwhile, introduce enumeration of sync method.
> > > >
> > > > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > > > ---
> > > > include/exec/ramblock.h | 45 +++++++++++++++++++++++++++++++++++++++++
> > > > migration/ram.c | 6 ++++++
> > > > 2 files changed, 51 insertions(+)
> > > >
> > > > diff --git a/include/exec/ramblock.h b/include/exec/ramblock.h
> > > > index 0babd105c0..0e327bc0ae 100644
> > > > --- a/include/exec/ramblock.h
> > > > +++ b/include/exec/ramblock.h
> > > > @@ -24,6 +24,30 @@
> > > > #include "qemu/rcu.h"
> > > > #include "exec/ramlist.h"
> > > >
> > > > +/* Possible bits for cpu_physical_memory_sync_dirty_bitmap */
> > > > +
> > > > +/*
> > > > + * The old-fashioned sync, which is, in turn, used for CPU
> > > > + * throttle and memory transfer.
> > >
> >
> > Using the traditional sync method, the page sending logic iterates
> > the "bmap" to transfer dirty pages while the CPU throttle logic
> > counts the number of new dirty pages and detects convergence.
> > There are two uses for "bmap".
> >
> > Using the modern sync method, "bmap" is used to transfer
> > dirty pages and "iter_bmap" is used to track new dirty pages.
> >
> >
> > > I'm not sure I follow what "in turn" is supposed to mean in this
> > > sentence. Could you clarify?
> > >
> >
> > Here I meant to say "in sequence", but obviously failed. :(
> >
> >
> > >
> > > > + */
> > > > +#define RAMBLOCK_SYN_LEGACY_ITER (1U << 0)
> > >
> > > So ITER is as opposed to background? I'm a bit confused with the terms.
> > >
> >
> > Yes.
> >
> >
> > >
> > > > +
> > > > +/*
> > > > + * The modern sync, which is, in turn, used for CPU throttle
> > > > + * and memory transfer.
> > > > + */
> > > > +#define RAMBLOCK_SYN_MODERN_ITER (1U << 1)
> > > > +
> > > > +/* The modern sync, which is used for CPU throttle only */
> > > > +#define RAMBLOCK_SYN_MODERN_BACKGROUND (1U << 2)
> > >
> > > What's the plan for the "legacy" part? To be removed soon? Do we want to
> > > remove it now? Maybe better to not use the modern/legacy terms unless we
> > > want to give the impression that the legacy one is discontinued.
> > >
> >
> > The distinction between the "legacy iteration" and the "modern
> > iteration" is the bitmap each uses to track dirty page information:
> > the "iter_bmap" field is used by the "modern iteration" while the
> > "bmap" field is used by the "legacy iteration."
> >
> > Since the refinement is now transparent and there is no API available to
> > change the sync method, I actually want to remove it right now in order
> > to simplify the logic. I'll include it in the next version.
>
> How confident are we that the new way is better than the old?
>
> If it'll be 100% / always better, I agree we can consider removing the old.
> But is it always better? At least it consumes much more resources...
>
> Otherwise, we can still leave that logic as-is but use a migration property
> to turn it on only on new machine types, I think.
>
> Besides, could you explain why the solution needs to be this complex? My
> previous concern was that we sync dirty pages too rarely, while auto converge
> relies on dirty information, which means auto converge can be adjusted
> too infrequently.
>
> However, I wonder whether that can be achieved in a simpler manner by
> e.g. invoking migration_bitmap_sync_precopy() more frequently during
> migration, for example in ram_save_iterate() - not on every call, but since
> iterate() is invoked much more frequently, we could do a sync from time
> to time.
>
> I also don't see why we need a separate thread, plus two new bitmaps, to
>
You mean we could do the background sync in the migration thread or
in the main thread (e.g., using a timer)?
> achieve this... I didn't read it in depth yet, but I thought dirty sync
> requires the BQL anyway, so I don't yet understand why the two bitmaps are
> required. If the bitmaps are introduced in the 1st patch, IMO it'd be
> great to explain clearly why they're needed here.
>
> Thanks,
>
> --
> Peter Xu
>
>
--
Best regards
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-20 2:43 ` Yong Huang
@ 2024-09-25 19:17 ` Peter Xu
2024-09-26 18:13 ` Yong Huang
0 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2024-09-25 19:17 UTC (permalink / raw)
To: Yong Huang
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Fri, Sep 20, 2024 at 10:43:31AM +0800, Yong Huang wrote:
> Yes, invoking migration_bitmap_sync_precopy() more frequently was also my
> first idea, but it involves updating the bitmap and interferes with the
> behavior of page sending; it also affects the migration stats and
> interferes with other migration logic such as migration_update_rates().
Could you elaborate?
For example, what happens if we start to sync in ram_save_iterate() at
some fixed time interval (e.g. every 5 seconds)?
Btw, this extra sync shouldn't exist at all if auto converge is disabled,
no matter which way we choose, because it's pure overhead when auto converge
is not in use.
Thanks,
--
Peter Xu
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-25 19:17 ` Peter Xu
@ 2024-09-26 18:13 ` Yong Huang
2024-09-26 19:55 ` Peter Xu
0 siblings, 1 reply; 28+ messages in thread
From: Yong Huang @ 2024-09-26 18:13 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Thu, Sep 26, 2024 at 3:17 AM Peter Xu <peterx@redhat.com> wrote:
> On Fri, Sep 20, 2024 at 10:43:31AM +0800, Yong Huang wrote:
> > Yes, invoking migration_bitmap_sync_precopy() more frequently was also my
> > first idea, but it involves updating the bitmap and interferes with the
> > behavior of page sending; it also affects the migration stats and
> > interferes with other migration logic such as migration_update_rates().
>
> Could you elaborate?
>
> For example, what happens if we start to sync in ram_save_iterate() at
> some fixed time interval (e.g. every 5 seconds)?
>
I didn't try syncing in ram_save_iterate() but in
migration_bitmap_sync_precopy().

If we call migration_bitmap_sync_precopy() in the ram_save_iterate()
function, the approach seems correct. However, the bitmap will then be
updated while the migration thread is iterating through the dirty pages
in the RAMBlock list. Compared to the existing implementation this is
different but still straightforward; I'll give it a shot soon to see if
it works.
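For reference, a minimal sketch of what that interval-based variant could
look like (illustrative only: BG_SYNC_INTERVAL_MS is an invented constant,
ram_state is the RAMState global in ram.c, and whether the BQL can simply be
taken at this point is glossed over):

/* Illustrative sketch, not actual patch code. */
#define BG_SYNC_INTERVAL_MS 5000

static int64_t bg_last_sync_ms;

static int ram_save_iterate(QEMUFile *f, void *opaque)
{
    int64_t now = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);

    if (migrate_auto_converge() &&
        now - bg_last_sync_ms >= BG_SYNC_INTERVAL_MS) {
        bql_lock();                       /* dirty sync needs the BQL */
        migration_bitmap_sync_precopy(ram_state, false);
        bql_unlock();
        bg_last_sync_ms = now;
    }

    /* ... the existing page-sending loop continues as before ... */
}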
> Btw, this extra sync shouldn't exist at all if auto converge is disabled,
> no matter which way we choose, because it's pure overhead when auto converge
> is not in use.
>
OK, I'll add the check in the next version.
>
> Thanks,
>
> --
> Peter Xu
>
>
Thanks for the comment.
Yong
--
Best regards
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-26 18:13 ` Yong Huang
@ 2024-09-26 19:55 ` Peter Xu
2024-09-27 2:50 ` Yong Huang
0 siblings, 1 reply; 28+ messages in thread
From: Peter Xu @ 2024-09-26 19:55 UTC (permalink / raw)
To: Yong Huang
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Fri, Sep 27, 2024 at 02:13:47AM +0800, Yong Huang wrote:
> On Thu, Sep 26, 2024 at 3:17 AM Peter Xu <peterx@redhat.com> wrote:
>
> > On Fri, Sep 20, 2024 at 10:43:31AM +0800, Yong Huang wrote:
> > > Yes, invoking migration_bitmap_sync_precopy() more frequently was also my
> > > first idea, but it involves updating the bitmap and interferes with the
> > > behavior of page sending; it also affects the migration stats and
> > > interferes with other migration logic such as migration_update_rates().
> >
> > Could you elaborate?
> >
> > For example, what happens if we start to sync in ram_save_iterate() at
> > some fixed time interval (e.g. every 5 seconds)?
> >
>
> I didn't try syncing in ram_save_iterate() but in
> migration_bitmap_sync_precopy().
>
> If we call migration_bitmap_sync_precopy() in the ram_save_iterate()
> function, the approach seems correct. However, the bitmap will then be
> updated while the migration thread is iterating through the dirty pages
> in the RAMBlock list. Compared to the existing implementation this is
> different but still straightforward; I'll give it a shot soon to see if
> it works.
It's still serialized in the migration thread, so I'd expect it to be similar
to e.g. ->state_pending_exact() calls when QEMU has flushed most dirty pages in
the current bitmap.
>
>
> > Btw, this extra sync shouldn't exist at all if auto converge is disabled,
> > no matter which way we choose, because it's pure overhead when auto converge
> > is not in use.
> >
>
> OK, I'll add the check in the next version.
Let's start simple, and if there's anything unsure we can discuss it
upfront, just to avoid coding something and changing direction later. Again,
personally I think we shouldn't add too much new code to auto converge
(unless very well justified, but I think it's just hard... fundamentally with
any pure throttling solution); hopefully something small can make it start
to work for huge VMs.
Thanks,
--
Peter Xu
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-26 19:55 ` Peter Xu
@ 2024-09-27 2:50 ` Yong Huang
2024-09-27 15:35 ` Peter Xu
0 siblings, 1 reply; 28+ messages in thread
From: Yong Huang @ 2024-09-27 2:50 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Fri, Sep 27, 2024 at 3:55 AM Peter Xu <peterx@redhat.com> wrote:
> On Fri, Sep 27, 2024 at 02:13:47AM +0800, Yong Huang wrote:
> > On Thu, Sep 26, 2024 at 3:17 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > > On Fri, Sep 20, 2024 at 10:43:31AM +0800, Yong Huang wrote:
> > > > Yes, invoking migration_bitmap_sync_precopy() more frequently was also my
> > > > first idea, but it involves updating the bitmap and interferes with the
> > > > behavior of page sending; it also affects the migration stats and
> > > > interferes with other migration logic such as migration_update_rates().
> > >
> > > Could you elaborate?
> > >
> > > For example, what happens if we start to sync in ram_save_iterate() at
> > > some fixed time interval (e.g. every 5 seconds)?
> > >
> >
> > I didn't try syncing in ram_save_iterate() but in
> > migration_bitmap_sync_precopy().
> >
> > If we call migration_bitmap_sync_precopy() in the ram_save_iterate()
> > function, the approach seems correct. However, the bitmap will then be
> > updated while the migration thread is iterating through the dirty pages
> > in the RAMBlock list. Compared to the existing implementation this is
> > different but still straightforward; I'll give it a shot soon to see if
> > it works.
>
> It's still serialized in the migration thread, so I'd expect it to be similar
>
What does "serialized" mean?
How about we:
1. invoke migration_bitmap_sync_precopy in a timer (bg_sync_timer) hook
every 5 seconds;
2. register bg_sync_timer in the main loop when the machine starts, like
throttle_timer;
3. activate the timer when ram_save_iterate gets called and deactivate it
in ram_save_cleanup gracefully during migration.
I think it is simple enough, and it also isn't "serialized"?
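For concreteness, a rough sketch of that shape (illustrative only:
bg_sync_timer and the 5-second period come from the list above, while the
function names, the ram_state global, and exporting
migration_bitmap_sync_precopy() from ram.c are assumptions):

/* Illustrative sketch only -- not actual patch code. */
static QEMUTimer *bg_sync_timer;

static void bg_sync_timer_cb(void *opaque)
{
    /* Main-loop timers fire with the BQL held, so syncing here is safe. */
    migration_bitmap_sync_precopy(ram_state, false);
    /* Re-arm for the next 5-second period. */
    timer_mod(bg_sync_timer,
              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 5000);
}

void bg_sync_timer_register(void)
{
    /* Step 2: created once at machine start, like throttle_timer. */
    bg_sync_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
                                 bg_sync_timer_cb, NULL);
}

void bg_sync_timer_start(void)
{
    /* Step 3: armed from ram_save_iterate()... */
    timer_mod(bg_sync_timer,
              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 5000);
}

void bg_sync_timer_stop(void)
{
    /* ...and disarmed from ram_save_cleanup(). */
    timer_del(bg_sync_timer);
}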
> to e.g. ->state_pending_exact() calls when QEMU has flushed most dirty pages in
> the current bitmap.
>
> >
> >
> > > Btw, this extra sync shouldn't exist at all if auto converge is disabled,
> > > no matter which way we choose, because it's pure overhead when auto converge
> > > is not in use.
> > >
> >
> > OK, I'll add the check in the next version.
>
> Let's start simple, and if there's anything unsure we can discuss it
> upfront, just to avoid coding something and changing direction later. Again,
> personally I think we shouldn't add too much new code to auto converge
> (unless very well justified, but I think it's just hard... fundamentally with
> any pure throttling solution); hopefully something small can make it start
> to work for huge VMs.
>
> Thanks,
>
> --
> Peter Xu
>
>
--
Best regards
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-27 2:50 ` Yong Huang
@ 2024-09-27 15:35 ` Peter Xu
2024-09-27 16:44 ` Hyman Huang
2024-09-28 5:07 ` Yong Huang
0 siblings, 2 replies; 28+ messages in thread
From: Peter Xu @ 2024-09-27 15:35 UTC (permalink / raw)
To: Yong Huang
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Fri, Sep 27, 2024 at 10:50:01AM +0800, Yong Huang wrote:
> On Fri, Sep 27, 2024 at 3:55 AM Peter Xu <peterx@redhat.com> wrote:
>
> > On Fri, Sep 27, 2024 at 02:13:47AM +0800, Yong Huang wrote:
> > > On Thu, Sep 26, 2024 at 3:17 AM Peter Xu <peterx@redhat.com> wrote:
> > >
> > > > On Fri, Sep 20, 2024 at 10:43:31AM +0800, Yong Huang wrote:
> > > > > > Yes, invoking migration_bitmap_sync_precopy() more frequently was also my
> > > > > > first idea, but it involves updating the bitmap and interferes with the
> > > > > > behavior of page sending; it also affects the migration stats and
> > > > > > interferes with other migration logic such as migration_update_rates().
> > > >
> > > > Could you elaborate?
> > > >
> > > > > For example, what happens if we start to sync in ram_save_iterate() at
> > > > > some fixed time interval (e.g. every 5 seconds)?
> > > >
> > >
> > > > I didn't try syncing in ram_save_iterate() but in
> > > > migration_bitmap_sync_precopy().
> > > >
> > > > If we call migration_bitmap_sync_precopy() in the ram_save_iterate()
> > > > function, the approach seems correct. However, the bitmap will then be
> > > > updated while the migration thread is iterating through the dirty pages
> > > > in the RAMBlock list. Compared to the existing implementation this is
> > > > different but still straightforward; I'll give it a shot soon to see if
> > > > it works.
> >
> > It's still serialized in the migration thread, so I'd expect it to be similar
> >
>
> What does "serialized" mean?
I meant that sync() never happens concurrently with RAM pages being
iterated, simply because sync() previously only happened in the migration
thread, which is still the same thread that initiates the movement of pages.
>
> How about we:
> 1. invoke migration_bitmap_sync_precopy in a timer (bg_sync_timer) hook
> every 5 seconds;
> 2. register bg_sync_timer in the main loop when the machine starts, like
> throttle_timer;
> 3. activate the timer when ram_save_iterate gets called and deactivate it
> in ram_save_cleanup gracefully during migration.
>
> I think it is simple enough, and it also isn't "serialized"?
If you want to do that with a timer, that's OK, but then IIUC it doesn't need
to involve ram.c code at all.

You can rely on cpu_throttle_get_percentage() too, just like the throttle
timer, and it'll work naturally with migration because outside migration
the throttle will be cleared (cpu_throttle_stop() at finish/fail/cancel...).

Then it also gracefully addresses the async thread sync() point, in that the
sync only happens when auto-converge is enabled. Yeah, that may look better...
and keeping the code together with cpu-throttle.c seems nice.
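Sketched against the timer idea above, the gating could look something like
this (cpu_throttle_get_percentage() and cpu_throttle_stop() are the existing
helpers; everything else is illustrative):

/* Illustrative: gate the periodic sync on the throttle state. */
static void bg_sync_timer_cb(void *opaque)
{
    if (cpu_throttle_get_percentage()) {
        /* Only sync while auto-converge is actually throttling;
         * outside migration cpu_throttle_stop() clears the percentage
         * (at finish/fail/cancel), so this adds no overhead there. */
        migration_bitmap_sync_precopy(ram_state, false); /* assumed exported */
    }
    timer_mod(bg_sync_timer,
              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 5000);
}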
Side note: one thing regarding sync() is that ram_init_bitmaps() syncs once,
while I don't see why it's necessary. I remember I tried to remove it but
maybe I hit some issues and I didn't dig further. If you're working on
sync() anyway, not sure whether you'd like to have a look.
--
Peter Xu
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-27 15:35 ` Peter Xu
@ 2024-09-27 16:44 ` Hyman Huang
2024-09-28 5:07 ` Yong Huang
1 sibling, 0 replies; 28+ messages in thread
From: Hyman Huang @ 2024-09-27 16:44 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On 2024/9/27 23:35, Peter Xu wrote:
> On Fri, Sep 27, 2024 at 10:50:01AM +0800, Yong Huang wrote:
>> On Fri, Sep 27, 2024 at 3:55 AM Peter Xu <peterx@redhat.com> wrote:
>>
>>> On Fri, Sep 27, 2024 at 02:13:47AM +0800, Yong Huang wrote:
>>>> On Thu, Sep 26, 2024 at 3:17 AM Peter Xu <peterx@redhat.com> wrote:
>>>>
>>>>> On Fri, Sep 20, 2024 at 10:43:31AM +0800, Yong Huang wrote:
>>>>>> Yes, invoking migration_bitmap_sync_precopy() more frequently was also my
>>>>>> first idea, but it involves updating the bitmap and interferes with the
>>>>>> behavior of page sending; it also affects the migration stats and
>>>>>> interferes with other migration logic such as migration_update_rates().
>>>>> Could you elaborate?
>>>>>
>>>>> For example, what happens if we start to sync in ram_save_iterate() at
>>>>> some fixed time interval (e.g. every 5 seconds)?
>>>>>
>>>> I didn't try syncing in ram_save_iterate() but in
>>>> migration_bitmap_sync_precopy().
>>>>
>>>> If we call migration_bitmap_sync_precopy() in the ram_save_iterate()
>>>> function, the approach seems correct. However, the bitmap will then be
>>>> updated while the migration thread is iterating through the dirty pages
>>>> in the RAMBlock list. Compared to the existing implementation this is
>>>> different but still straightforward; I'll give it a shot soon to see if
>>>> it works.
>>> It's still serialized in the migration thread, so I'd expect it to be similar
>>>
>> What does "serialized" mean?
> I meant that sync() never happens concurrently with RAM pages being
> iterated, simply because sync() previously only happened in the migration
> thread, which is still the same thread that initiates the movement of pages.
>
>> How about we:
>> 1. invoke migration_bitmap_sync_precopy in a timer (bg_sync_timer) hook
>> every 5 seconds;
>> 2. register bg_sync_timer in the main loop when the machine starts, like
>> throttle_timer;
>> 3. activate the timer when ram_save_iterate gets called and deactivate it
>> in ram_save_cleanup gracefully during migration.
>>
>> I think it is simple enough, and it also isn't "serialized"?
> If you want to do that with a timer, that's OK, but then IIUC it doesn't need
> to involve ram.c code at all.
>
> You can rely on cpu_throttle_get_percentage() too, just like the throttle
> timer, and it'll work naturally with migration because outside migration
> the throttle will be cleared (cpu_throttle_stop() at finish/fail/cancel...).
>
> Then it also gracefully addresses the async thread sync() point, in that the
> sync only happens when auto-converge is enabled. Yeah, that may look better...
> and keeping the code together with cpu-throttle.c seems nice.
OK, thanks for the advice; I'll check it and see how it goes.
>
> Side note: one thing regarding sync() is that ram_init_bitmaps() syncs once,
> while I don't see why it's necessary. I remember I tried to remove it but
> maybe I hit some issues and I didn't dig further. If you're working on
> sync() anyway, not sure whether you'd like to have a look.
Agreed, I'll try it after working out the current series.
Yong
* Re: [PATCH v1 1/7] migration: Introduce structs for background sync
2024-09-27 15:35 ` Peter Xu
2024-09-27 16:44 ` Hyman Huang
@ 2024-09-28 5:07 ` Yong Huang
1 sibling, 0 replies; 28+ messages in thread
From: Yong Huang @ 2024-09-28 5:07 UTC (permalink / raw)
To: Peter Xu
Cc: Fabiano Rosas, qemu-devel, Eric Blake, Markus Armbruster,
David Hildenbrand, Philippe Mathieu-Daudé, Paolo Bonzini
On Fri, Sep 27, 2024 at 11:35 PM Peter Xu <peterx@redhat.com> wrote:
> On Fri, Sep 27, 2024 at 10:50:01AM +0800, Yong Huang wrote:
> > On Fri, Sep 27, 2024 at 3:55 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > > On Fri, Sep 27, 2024 at 02:13:47AM +0800, Yong Huang wrote:
> > > > On Thu, Sep 26, 2024 at 3:17 AM Peter Xu <peterx@redhat.com> wrote:
> > > >
> > > > > On Fri, Sep 20, 2024 at 10:43:31AM +0800, Yong Huang wrote:
> > > > > > Yes, invoking migration_bitmap_sync_precopy() more frequently was also my
> > > > > > first idea, but it involves updating the bitmap and interferes with the
> > > > > > behavior of page sending; it also affects the migration stats and
> > > > > > interferes with other migration logic such as migration_update_rates().
> > > > >
> > > > > Could you elaborate?
> > > > >
> > > > > For example, what happens if we start to sync in ram_save_iterate() at
> > > > > some fixed time interval (e.g. every 5 seconds)?
> > > > >
> > > >
> > > > I didn't try syncing in ram_save_iterate() but in
> > > > migration_bitmap_sync_precopy().
> > > >
> > > > If we call migration_bitmap_sync_precopy() in the ram_save_iterate()
> > > > function, the approach seems correct. However, the bitmap will then be
> > > > updated while the migration thread is iterating through the dirty pages
> > > > in the RAMBlock list. Compared to the existing implementation this is
> > > > different but still straightforward; I'll give it a shot soon to see if
> > > > it works.
> > >
> > > It's still serialized in the migration thread, so I'd expect it to be
> > > similar
> > >
> >
> > What does "serialized" mean?
>
> I meant that sync() never happens concurrently with RAM pages being
> iterated, simply because sync() previously only happened in the migration
> thread, which is still the same thread that initiates the movement of pages.
>
> >
> > How about we:
> > 1. invoke migration_bitmap_sync_precopy in a timer (bg_sync_timer) hook
> > every 5 seconds;
> > 2. register bg_sync_timer in the main loop when the machine starts, like
> > throttle_timer;
> > 3. activate the timer when ram_save_iterate gets called and deactivate it
> > in ram_save_cleanup gracefully during migration.
> >
> > I think it is simple enough, and it also isn't "serialized"?
>
> > If you want to do that with a timer, that's OK, but then IIUC it doesn't need
> > to involve ram.c code at all.
>
The timer hook will call migration_bitmap_sync_precopy(),
which is implemented in ram.c; maybe we can define the hook
function in ram.c and expose it in ram.h?
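That split might look something like this (a sketch only; the hook name is
invented, and ram_state is the RAMState global in ram.c):

/* ram.h: sketch of exposing the hook (name invented) */
void ram_bg_sync_timer_hook(void *opaque);

/* ram.c: wrap the file-local sync so the timer can call it */
void ram_bg_sync_timer_hook(void *opaque)
{
    migration_bitmap_sync_precopy(ram_state, false);
}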
>
> > You can rely on cpu_throttle_get_percentage() too, just like the throttle
> > timer, and it'll work naturally with migration because outside migration
> > the throttle will be cleared (cpu_throttle_stop() at finish/fail/cancel...).
>
Relying on cpu_throttle_get_percentage() may miss the sync window
during the second iteration, when it lasts a long time but the throttle
hasn't started yet. I'll think through your idea and apply it if possible.
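One illustrative way to keep that pre-throttle window covered would be to
gate on "migration running" rather than "throttle active"
(migration_is_active() below stands in for whatever predicate fits and is
an assumption, as is the rest of the callback shape):

/* Illustrative variant of the timer callback. */
static void bg_sync_timer_cb(void *opaque)
{
    /* Sync whenever migration is live with auto-converge enabled, not
     * only once the throttle has kicked in, so the long second
     * iteration before the first throttle adjustment is covered too. */
    if (migrate_auto_converge() && migration_is_active()) {
        migration_bitmap_sync_precopy(ram_state, false);
    }
    timer_mod(bg_sync_timer,
              qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) + 5000);
}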
>
> > Then it also gracefully addresses the async thread sync() point, in that the
> > sync only happens when auto-converge is enabled. Yeah, that may look better...
> > and keeping the code together with cpu-throttle.c seems nice.
>
> > Side note: one thing regarding sync() is that ram_init_bitmaps() syncs once,
> > while I don't see why it's necessary. I remember I tried to remove it but
> > maybe I hit some issues and I didn't dig further. If you're working on
> > sync() anyway, not sure whether you'd like to have a look.
>
> --
> Peter Xu
>
>
Thanks,
Yong
--
Best regards
End of thread [newest: 2024-09-28 5:08 UTC]
Thread overview: 28+ messages
2024-09-15 16:08 [PATCH v1 0/7] migration: auto-converge refinements for huge VM Hyman Huang
2024-09-15 16:08 ` [PATCH v1 1/7] migration: Introduce structs for background sync Hyman Huang
2024-09-16 21:11 ` Fabiano Rosas
2024-09-17 6:48 ` Yong Huang
2024-09-19 18:45 ` Peter Xu
2024-09-20 2:43 ` Yong Huang
2024-09-25 19:17 ` Peter Xu
2024-09-26 18:13 ` Yong Huang
2024-09-26 19:55 ` Peter Xu
2024-09-27 2:50 ` Yong Huang
2024-09-27 15:35 ` Peter Xu
2024-09-27 16:44 ` Hyman Huang
2024-09-28 5:07 ` Yong Huang
2024-09-20 3:02 ` Yong Huang
2024-09-20 3:13 ` Yong Huang
2024-09-15 16:08 ` [PATCH v1 2/7] migration: Refine util functions to support " Hyman Huang
2024-09-15 16:08 ` [PATCH v1 3/7] qapi/migration: Introduce the iteration-count Hyman Huang
2024-09-16 20:35 ` Fabiano Rosas
2024-09-17 6:52 ` Yong Huang
2024-09-18 8:29 ` Yong Huang
2024-09-18 14:39 ` Fabiano Rosas
2024-09-15 16:08 ` [PATCH v1 4/7] migration: Implment background sync watcher Hyman Huang
2024-09-15 16:08 ` [PATCH v1 5/7] migration: Support background dirty bitmap sync and throttle Hyman Huang
2024-09-16 20:50 ` Fabiano Rosas
2024-09-15 16:08 ` [PATCH v1 6/7] qapi/migration: Introduce cpu-responsive-throttle parameter Hyman Huang
2024-09-16 20:55 ` Fabiano Rosas
2024-09-17 6:54 ` Yong Huang
2024-09-15 16:08 ` [PATCH v1 7/7] migration: Support responsive CPU throttle Hyman Huang