* [PATCH v3 0/4] migration: auto-converge refinements for huge VM
@ 2024-10-16 7:56 yong.huang
2024-10-16 7:56 ` [PATCH v3 1/4] migration: Move cpu-throttole.c from system to migration yong.huang
` (3 more replies)
0 siblings, 4 replies; 11+ messages in thread
From: yong.huang @ 2024-10-16 7:56 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Richard Henderson, Paolo Bonzini,
yong.huang
From: Hyman Huang <yong.huang@smartx.com>
v3:
1. drop the responsive throttle patchset
2. rename background sync to periodic ramblock dirty sync
3. move the cpu-throttle.* from system to migration
4. remove "rs" parameter in migration_bitmap_sync_precopy
5. implement periodic ramblock dirty sync in cpu-throttle.c
6. move the test change into a separate patch
To simplify the cover letter, i have dropped the test data,
please refer to
https://lore.kernel.org/qemu-devel/cover.1727630000.git.yong.huang@smartx.com/
for more test details.
Thanks Peter and Fabiano for the suggestions and comments.
Please review.
Yong
Hyman Huang (4):
migration: Move cpu-throttole.c from system to migration
migration: Remove "rs" parameter in migration_bitmap_sync_precopy
migration: Support periodic ramblock dirty sync
tests/migration: Add case for periodic ramblock dirty sync
accel/tcg/icount-common.c | 1 -
{system => migration}/cpu-throttle.c | 72 +++++++++++++++++++-
{include/sysemu => migration}/cpu-throttle.h | 14 ++++
migration/meson.build | 1 +
migration/migration.c | 11 ++-
migration/migration.h | 1 +
migration/ram.c | 20 ++++--
migration/trace-events | 4 ++
system/cpu-timers.c | 3 -
system/meson.build | 1 -
system/trace-events | 3 -
tests/qtest/migration-test.c | 32 +++++++++
12 files changed, 144 insertions(+), 19 deletions(-)
rename {system => migration}/cpu-throttle.c (65%)
rename {include/sysemu => migration}/cpu-throttle.h (87%)
--
2.27.0
^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 1/4] migration: Move cpu-throttole.c from system to migration
2024-10-16 7:56 [PATCH v3 0/4] migration: auto-converge refinements for huge VM yong.huang
@ 2024-10-16 7:56 ` yong.huang
2024-10-16 15:50 ` Peter Xu
2024-10-16 7:56 ` [PATCH v3 2/4] migration: Remove "rs" parameter in migration_bitmap_sync_precopy yong.huang
` (2 subsequent siblings)
3 siblings, 1 reply; 11+ messages in thread
From: yong.huang @ 2024-10-16 7:56 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Richard Henderson, Paolo Bonzini,
yong.huang
From: Hyman Huang <yong.huang@smartx.com>
Move cpu-throttle.c from system to migration since it's
only used for migration; this makes us avoid exporting the
util functions and variables in misc.h but export them in
migration.h when implementing the background ramblock dirty
sync feature in the upcoming commits.
Additionally, make the two modifications below:
1. Delay the timer registering of CPU throttle until
migration starts since it is only used in migration.
2. Stop CPU throttle if auto converge capability is
enabled since it only happens with auto converge.
3. Remove the unused header file reference in
accel/tcg/icount-common.c.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
accel/tcg/icount-common.c | 1 -
{system => migration}/cpu-throttle.c | 2 +-
{include/sysemu => migration}/cpu-throttle.h | 0
migration/meson.build | 1 +
migration/migration.c | 11 +++++++++--
migration/ram.c | 2 +-
migration/trace-events | 3 +++
system/cpu-timers.c | 3 ---
system/meson.build | 1 -
system/trace-events | 3 ---
10 files changed, 15 insertions(+), 12 deletions(-)
rename {system => migration}/cpu-throttle.c (99%)
rename {include/sysemu => migration}/cpu-throttle.h (100%)
diff --git a/accel/tcg/icount-common.c b/accel/tcg/icount-common.c
index 8d3d3a7e9d..30bf8500dc 100644
--- a/accel/tcg/icount-common.c
+++ b/accel/tcg/icount-common.c
@@ -36,7 +36,6 @@
#include "sysemu/runstate.h"
#include "hw/core/cpu.h"
#include "sysemu/cpu-timers.h"
-#include "sysemu/cpu-throttle.h"
#include "sysemu/cpu-timers-internal.h"
/*
diff --git a/system/cpu-throttle.c b/migration/cpu-throttle.c
similarity index 99%
rename from system/cpu-throttle.c
rename to migration/cpu-throttle.c
index 7632dc6143..fa47ee2e21 100644
--- a/system/cpu-throttle.c
+++ b/migration/cpu-throttle.c
@@ -27,7 +27,7 @@
#include "hw/core/cpu.h"
#include "qemu/main-loop.h"
#include "sysemu/cpus.h"
-#include "sysemu/cpu-throttle.h"
+#include "cpu-throttle.h"
#include "trace.h"
/* vcpu throttling controls */
diff --git a/include/sysemu/cpu-throttle.h b/migration/cpu-throttle.h
similarity index 100%
rename from include/sysemu/cpu-throttle.h
rename to migration/cpu-throttle.h
diff --git a/migration/meson.build b/migration/meson.build
index 66d3de86f0..d53cf3417a 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -13,6 +13,7 @@ system_ss.add(files(
'block-dirty-bitmap.c',
'channel.c',
'channel-block.c',
+ 'cpu-throttle.c',
'dirtyrate.c',
'exec.c',
'fd.c',
diff --git a/migration/migration.c b/migration/migration.c
index 021faee2f3..7e71184257 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -24,7 +24,7 @@
#include "socket.h"
#include "sysemu/runstate.h"
#include "sysemu/sysemu.h"
-#include "sysemu/cpu-throttle.h"
+#include "cpu-throttle.h"
#include "rdma.h"
#include "ram.h"
#include "migration/global_state.h"
@@ -3289,7 +3289,9 @@ static MigIterateState migration_iteration_run(MigrationState *s)
static void migration_iteration_finish(MigrationState *s)
{
/* If we enabled cpu throttling for auto-converge, turn it off. */
- cpu_throttle_stop();
+ if (migrate_auto_converge()) {
+ cpu_throttle_stop();
+ }
bql_lock();
switch (s->state) {
@@ -3508,6 +3510,11 @@ static void *migration_thread(void *opaque)
qemu_savevm_send_colo_enable(s->to_dst_file);
}
+ if (migrate_auto_converge()) {
+ /* Start cpu throttle timers */
+ cpu_throttle_init();
+ }
+
bql_lock();
ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
bql_unlock();
diff --git a/migration/ram.c b/migration/ram.c
index 326ce7eb79..54d352b152 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -52,7 +52,7 @@
#include "exec/target_page.h"
#include "qemu/rcu_queue.h"
#include "migration/colo.h"
-#include "sysemu/cpu-throttle.h"
+#include "cpu-throttle.h"
#include "savevm.h"
#include "qemu/iov.h"
#include "multifd.h"
diff --git a/migration/trace-events b/migration/trace-events
index c65902f042..9a19599804 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -378,3 +378,6 @@ migration_block_progression(unsigned percent) "Completed %u%%"
# page_cache.c
migration_pagecache_init(int64_t max_num_items) "Setting cache buckets to %" PRId64
migration_pagecache_insert(void) "Error allocating page"
+
+# cpu-throttle.c
+cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by %d%%"
diff --git a/system/cpu-timers.c b/system/cpu-timers.c
index 0b31c9a1b6..856e502e34 100644
--- a/system/cpu-timers.c
+++ b/system/cpu-timers.c
@@ -35,7 +35,6 @@
#include "sysemu/runstate.h"
#include "hw/core/cpu.h"
#include "sysemu/cpu-timers.h"
-#include "sysemu/cpu-throttle.h"
#include "sysemu/cpu-timers-internal.h"
/* clock and ticks */
@@ -272,6 +271,4 @@ void cpu_timers_init(void)
seqlock_init(&timers_state.vm_clock_seqlock);
qemu_spin_init(&timers_state.vm_clock_lock);
vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
-
- cpu_throttle_init();
}
diff --git a/system/meson.build b/system/meson.build
index a296270cb0..4952f4b2c7 100644
--- a/system/meson.build
+++ b/system/meson.build
@@ -10,7 +10,6 @@ system_ss.add(files(
'balloon.c',
'bootdevice.c',
'cpus.c',
- 'cpu-throttle.c',
'cpu-timers.c',
'datadir.c',
'dirtylimit.c',
diff --git a/system/trace-events b/system/trace-events
index 074d001e90..2ed1d59b1f 100644
--- a/system/trace-events
+++ b/system/trace-events
@@ -44,6 +44,3 @@ dirtylimit_state_finalize(void)
dirtylimit_throttle_pct(int cpu_index, uint64_t pct, int64_t time_us) "CPU[%d] throttle percent: %" PRIu64 ", throttle adjust time %"PRIi64 " us"
dirtylimit_set_vcpu(int cpu_index, uint64_t quota) "CPU[%d] set dirty page rate limit %"PRIu64
dirtylimit_vcpu_execute(int cpu_index, int64_t sleep_time_us) "CPU[%d] sleep %"PRIi64 " us"
-
-# cpu-throttle.c
-cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by %d%%"
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v3 2/4] migration: Remove "rs" parameter in migration_bitmap_sync_precopy
2024-10-16 7:56 [PATCH v3 0/4] migration: auto-converge refinements for huge VM yong.huang
2024-10-16 7:56 ` [PATCH v3 1/4] migration: Move cpu-throttole.c from system to migration yong.huang
@ 2024-10-16 7:56 ` yong.huang
2024-10-16 15:51 ` Peter Xu
2024-10-16 7:56 ` [PATCH v3 3/4] migration: Support periodic ramblock dirty sync yong.huang
2024-10-16 7:56 ` [PATCH v3 4/4] tests/migration: Add case for " yong.huang
3 siblings, 1 reply; 11+ messages in thread
From: yong.huang @ 2024-10-16 7:56 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Richard Henderson, Paolo Bonzini,
yong.huang
From: Hyman Huang <yong.huang@smartx.com>
The global static variable ram_state in fact is referred to by the
"rs" parameter in migration_bitmap_sync_precopy. For ease of calling
by the callees, use the global variable directly in
migration_bitmap_sync_precopy and remove "rs" parameter.
The migration_bitmap_sync_precopy will be exported in the next commit.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
migration/ram.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 54d352b152..9b5b350405 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1088,9 +1088,10 @@ static void migration_bitmap_sync(RAMState *rs, bool last_stage)
}
}
-static void migration_bitmap_sync_precopy(RAMState *rs, bool last_stage)
+static void migration_bitmap_sync_precopy(bool last_stage)
{
Error *local_err = NULL;
+ assert(ram_state);
/*
* The current notifier usage is just an optimization to migration, so we
@@ -1101,7 +1102,7 @@ static void migration_bitmap_sync_precopy(RAMState *rs, bool last_stage)
local_err = NULL;
}
- migration_bitmap_sync(rs, last_stage);
+ migration_bitmap_sync(ram_state, last_stage);
if (precopy_notify(PRECOPY_NOTIFY_AFTER_BITMAP_SYNC, &local_err)) {
error_report_err(local_err);
@@ -2782,7 +2783,7 @@ static bool ram_init_bitmaps(RAMState *rs, Error **errp)
if (!ret) {
goto out_unlock;
}
- migration_bitmap_sync_precopy(rs, false);
+ migration_bitmap_sync_precopy(false);
}
}
out_unlock:
@@ -3248,7 +3249,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
WITH_RCU_READ_LOCK_GUARD() {
if (!migration_in_postcopy()) {
- migration_bitmap_sync_precopy(rs, true);
+ migration_bitmap_sync_precopy(true);
}
ret = rdma_registration_start(f, RAM_CONTROL_FINISH);
@@ -3330,7 +3331,7 @@ static void ram_state_pending_exact(void *opaque, uint64_t *must_precopy,
if (!migration_in_postcopy()) {
bql_lock();
WITH_RCU_READ_LOCK_GUARD() {
- migration_bitmap_sync_precopy(rs, false);
+ migration_bitmap_sync_precopy(false);
}
bql_unlock();
}
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v3 3/4] migration: Support periodic ramblock dirty sync
2024-10-16 7:56 [PATCH v3 0/4] migration: auto-converge refinements for huge VM yong.huang
2024-10-16 7:56 ` [PATCH v3 1/4] migration: Move cpu-throttole.c from system to migration yong.huang
2024-10-16 7:56 ` [PATCH v3 2/4] migration: Remove "rs" parameter in migration_bitmap_sync_precopy yong.huang
@ 2024-10-16 7:56 ` yong.huang
2024-10-16 18:49 ` Peter Xu
2024-10-16 7:56 ` [PATCH v3 4/4] tests/migration: Add case for " yong.huang
3 siblings, 1 reply; 11+ messages in thread
From: yong.huang @ 2024-10-16 7:56 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Richard Henderson, Paolo Bonzini,
yong.huang
From: Hyman Huang <yong.huang@smartx.com>
When VM is configured with huge memory, the current throttle logic
doesn't look like to scale, because migration_trigger_throttle()
is only called for each iteration, so it won't be invoked for a long
time if one iteration can take a long time.
The periodic dirty sync aims to fix the above issue by synchronizing
the ramblock from remote dirty bitmap and, when necessary, triggering
the CPU throttle multiple times during a long iteration.
This is a trade-off between synchronization overhead and CPU throttle
impact.
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
migration/cpu-throttle.c | 70 +++++++++++++++++++++++++++++++++++++++-
migration/cpu-throttle.h | 14 ++++++++
migration/migration.h | 1 +
migration/ram.c | 9 ++++--
migration/trace-events | 1 +
5 files changed, 92 insertions(+), 3 deletions(-)
diff --git a/migration/cpu-throttle.c b/migration/cpu-throttle.c
index fa47ee2e21..784b51ab35 100644
--- a/migration/cpu-throttle.c
+++ b/migration/cpu-throttle.c
@@ -28,16 +28,23 @@
#include "qemu/main-loop.h"
#include "sysemu/cpus.h"
#include "cpu-throttle.h"
+#include "migration.h"
+#include "migration-stats.h"
+#include "options.h"
#include "trace.h"
/* vcpu throttling controls */
-static QEMUTimer *throttle_timer;
+static QEMUTimer *throttle_timer, *throttle_dirty_sync_timer;
static unsigned int throttle_percentage;
+static bool throttle_dirty_sync_timer_active;
#define CPU_THROTTLE_PCT_MIN 1
#define CPU_THROTTLE_PCT_MAX 99
#define CPU_THROTTLE_TIMESLICE_NS 10000000
+/* RAMBlock dirty sync trigger every five seconds */
+#define CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS 5000
+
static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
{
double pct;
@@ -112,6 +119,7 @@ void cpu_throttle_set(int new_throttle_pct)
void cpu_throttle_stop(void)
{
qatomic_set(&throttle_percentage, 0);
+ cpu_throttle_dirty_sync_timer(false);
}
bool cpu_throttle_active(void)
@@ -124,8 +132,68 @@ int cpu_throttle_get_percentage(void)
return qatomic_read(&throttle_percentage);
}
+void cpu_throttle_dirty_sync_timer_tick(void *opaque)
+{
+ static uint64_t prev_sync_cnt = 2;
+ uint64_t sync_cnt = stat64_get(&mig_stats.dirty_sync_count);
+
+ if (!migrate_auto_converge()) {
+ /* Stop the timer when auto converge is disabled */
+ return;
+ }
+
+ /*
+ * The first iteration copies all memory anyhow and has no
+ * effect on guest performance, therefore omit it to avoid
+ * paying extra for the sync penalty.
+ */
+ if (sync_cnt <= 1) {
+ goto end;
+ }
+
+ if (sync_cnt == prev_sync_cnt) {
+ trace_cpu_throttle_dirty_sync();
+ WITH_RCU_READ_LOCK_GUARD() {
+ migration_bitmap_sync_precopy(false);
+ }
+ }
+
+end:
+ prev_sync_cnt = stat64_get(&mig_stats.dirty_sync_count);
+
+ timer_mod(throttle_dirty_sync_timer,
+ qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
+ CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS);
+}
+
+static bool cpu_throttle_dirty_sync_active(void)
+{
+ return qatomic_read(&throttle_dirty_sync_timer_active);
+}
+
+void cpu_throttle_dirty_sync_timer(bool enable)
+{
+ if (enable) {
+ assert(throttle_dirty_sync_timer);
+ if (!cpu_throttle_dirty_sync_active()) {
+ timer_mod(throttle_dirty_sync_timer,
+ qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
+ CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS);
+ qatomic_set(&throttle_dirty_sync_timer_active, 1);
+ }
+ } else {
+ if (throttle_dirty_sync_timer != NULL) {
+ timer_del(throttle_dirty_sync_timer);
+ qatomic_set(&throttle_dirty_sync_timer_active, 0);
+ }
+ }
+}
+
void cpu_throttle_init(void)
{
throttle_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
cpu_throttle_timer_tick, NULL);
+ throttle_dirty_sync_timer =
+ timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
+ cpu_throttle_dirty_sync_timer_tick, NULL);
}
diff --git a/migration/cpu-throttle.h b/migration/cpu-throttle.h
index d65bdef6d0..420702b8d3 100644
--- a/migration/cpu-throttle.h
+++ b/migration/cpu-throttle.h
@@ -65,4 +65,18 @@ bool cpu_throttle_active(void);
*/
int cpu_throttle_get_percentage(void);
+/**
+ * cpu_throttle_dirty_sync_timer_tick:
+ *
+ * Dirty sync timer hook.
+ */
+void cpu_throttle_dirty_sync_timer_tick(void *opaque);
+
+/**
+ * cpu_throttle_dirty_sync_timer:
+ *
+ * Start or stop the dirty sync timer.
+ */
+void cpu_throttle_dirty_sync_timer(bool enable);
+
#endif /* SYSEMU_CPU_THROTTLE_H */
diff --git a/migration/migration.h b/migration/migration.h
index 38aa1402d5..fbd0d19092 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -537,4 +537,5 @@ int migration_rp_wait(MigrationState *s);
*/
void migration_rp_kick(MigrationState *s);
+void migration_bitmap_sync_precopy(bool last_stage);
#endif
diff --git a/migration/ram.c b/migration/ram.c
index 9b5b350405..ac34e731e2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1020,6 +1020,11 @@ static void migration_trigger_throttle(RAMState *rs)
migration_transferred_bytes() - rs->bytes_xfer_prev;
uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE;
uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100;
+ bool auto_converge = migrate_auto_converge();
+
+ if (auto_converge) {
+ cpu_throttle_dirty_sync_timer(true);
+ }
/*
* The following detection logic can be refined later. For now:
@@ -1031,7 +1036,7 @@ static void migration_trigger_throttle(RAMState *rs)
if ((bytes_dirty_period > bytes_dirty_threshold) &&
(++rs->dirty_rate_high_cnt >= 2)) {
rs->dirty_rate_high_cnt = 0;
- if (migrate_auto_converge()) {
+ if (auto_converge) {
trace_migration_throttle();
mig_throttle_guest_down(bytes_dirty_period,
bytes_dirty_threshold);
@@ -1088,7 +1093,7 @@ static void migration_bitmap_sync(RAMState *rs, bool last_stage)
}
}
-static void migration_bitmap_sync_precopy(bool last_stage)
+void migration_bitmap_sync_precopy(bool last_stage)
{
Error *local_err = NULL;
assert(ram_state);
diff --git a/migration/trace-events b/migration/trace-events
index 9a19599804..0638183056 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -381,3 +381,4 @@ migration_pagecache_insert(void) "Error allocating page"
# cpu-throttle.c
cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by %d%%"
+cpu_throttle_dirty_sync(void) ""
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* [PATCH v3 4/4] tests/migration: Add case for periodic ramblock dirty sync
2024-10-16 7:56 [PATCH v3 0/4] migration: auto-converge refinements for huge VM yong.huang
` (2 preceding siblings ...)
2024-10-16 7:56 ` [PATCH v3 3/4] migration: Support periodic ramblock dirty sync yong.huang
@ 2024-10-16 7:56 ` yong.huang
2024-10-16 18:50 ` Peter Xu
3 siblings, 1 reply; 11+ messages in thread
From: yong.huang @ 2024-10-16 7:56 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Xu, Fabiano Rosas, Richard Henderson, Paolo Bonzini,
yong.huang
From: Hyman Huang <yong.huang@smartx.com>
Signed-off-by: Hyman Huang <yong.huang@smartx.com>
---
tests/qtest/migration-test.c | 32 ++++++++++++++++++++++++++++++++
1 file changed, 32 insertions(+)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 95e45b5029..e6a2803e71 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -2791,6 +2791,8 @@ static void test_migrate_auto_converge(void)
* so we need to decrease a bandwidth.
*/
const int64_t init_pct = 5, inc_pct = 25, max_pct = 95;
+ uint64_t prev_dirty_sync_cnt, dirty_sync_cnt;
+ int max_try_count, hit = 0;
if (test_migrate_start(&from, &to, uri, &args)) {
return;
@@ -2827,6 +2829,36 @@ static void test_migrate_auto_converge(void)
} while (true);
/* The first percentage of throttling should be at least init_pct */
g_assert_cmpint(percentage, >=, init_pct);
+
+ /*
+ * End the loop when the dirty sync count greater than 1.
+ */
+ while ((dirty_sync_cnt = get_migration_pass(from)) < 2) {
+ usleep(1000 * 1000);
+ }
+
+ prev_dirty_sync_cnt = dirty_sync_cnt;
+
+ /*
+ * The RAMBlock dirty sync count must changes in 5 seconds, here we set
+ * the timeout to 10 seconds to ensure it changes.
+ *
+ * Note that migrate_ensure_non_converge set the max-bandwidth to 3MB/s,
+ * while the qtest mem is >= 100MB, one iteration takes at least 33s (100/3)
+ * to complete; this ensures that the RAMBlock dirty sync occurs.
+ */
+ max_try_count = 10;
+ while (--max_try_count) {
+ dirty_sync_cnt = get_migration_pass(from);
+ if (dirty_sync_cnt != prev_dirty_sync_cnt) {
+ hit = 1;
+ break;
+ }
+ prev_dirty_sync_cnt = dirty_sync_cnt;
+ sleep(1);
+ }
+ g_assert_cmpint(hit, ==, 1);
+
/* Now, when we tested that throttling works, let it converge */
migrate_ensure_converge(from);
--
2.27.0
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3 1/4] migration: Move cpu-throttole.c from system to migration
2024-10-16 7:56 ` [PATCH v3 1/4] migration: Move cpu-throttole.c from system to migration yong.huang
@ 2024-10-16 15:50 ` Peter Xu
2024-10-17 3:52 ` Yong Huang
0 siblings, 1 reply; 11+ messages in thread
From: Peter Xu @ 2024-10-16 15:50 UTC (permalink / raw)
To: yong.huang; +Cc: qemu-devel, Fabiano Rosas, Richard Henderson, Paolo Bonzini
On Wed, Oct 16, 2024 at 03:56:42PM +0800, yong.huang@smartx.com wrote:
> From: Hyman Huang <yong.huang@smartx.com>
>
> Move cpu-throttle.c from system to migration since it's
> only used for migration; this makes us avoid exporting the
> util functions and variables in misc.h but export them in
> migration.h when implementing the background ramblock dirty
> sync feature in the upcoming commits.
>
> Additionally, make the two modifications below:
>
> 1. Delay the timer registering of CPU throttle until
> migration starts since it is only used in migration.
>
> 2. Stop CPU throttle if auto converge capability is
> enabled since it only happens with auto converge.
>
> 3. Remove the unused header file reference in
> accel/tcg/icount-common.c.
Please consider split the things into smaller patches, especially when it
involves file movements.
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> ---
> accel/tcg/icount-common.c | 1 -
> {system => migration}/cpu-throttle.c | 2 +-
> {include/sysemu => migration}/cpu-throttle.h | 0
> migration/meson.build | 1 +
> migration/migration.c | 11 +++++++++--
> migration/ram.c | 2 +-
> migration/trace-events | 3 +++
> system/cpu-timers.c | 3 ---
> system/meson.build | 1 -
> system/trace-events | 3 ---
> 10 files changed, 15 insertions(+), 12 deletions(-)
> rename {system => migration}/cpu-throttle.c (99%)
> rename {include/sysemu => migration}/cpu-throttle.h (100%)
>
> diff --git a/accel/tcg/icount-common.c b/accel/tcg/icount-common.c
> index 8d3d3a7e9d..30bf8500dc 100644
> --- a/accel/tcg/icount-common.c
> +++ b/accel/tcg/icount-common.c
> @@ -36,7 +36,6 @@
> #include "sysemu/runstate.h"
> #include "hw/core/cpu.h"
> #include "sysemu/cpu-timers.h"
> -#include "sysemu/cpu-throttle.h"
> #include "sysemu/cpu-timers-internal.h"
>
> /*
> diff --git a/system/cpu-throttle.c b/migration/cpu-throttle.c
> similarity index 99%
> rename from system/cpu-throttle.c
> rename to migration/cpu-throttle.c
> index 7632dc6143..fa47ee2e21 100644
> --- a/system/cpu-throttle.c
> +++ b/migration/cpu-throttle.c
> @@ -27,7 +27,7 @@
> #include "hw/core/cpu.h"
> #include "qemu/main-loop.h"
> #include "sysemu/cpus.h"
> -#include "sysemu/cpu-throttle.h"
> +#include "cpu-throttle.h"
> #include "trace.h"
>
> /* vcpu throttling controls */
> diff --git a/include/sysemu/cpu-throttle.h b/migration/cpu-throttle.h
> similarity index 100%
> rename from include/sysemu/cpu-throttle.h
> rename to migration/cpu-throttle.h
> diff --git a/migration/meson.build b/migration/meson.build
> index 66d3de86f0..d53cf3417a 100644
> --- a/migration/meson.build
> +++ b/migration/meson.build
> @@ -13,6 +13,7 @@ system_ss.add(files(
> 'block-dirty-bitmap.c',
> 'channel.c',
> 'channel-block.c',
> + 'cpu-throttle.c',
> 'dirtyrate.c',
> 'exec.c',
> 'fd.c',
> diff --git a/migration/migration.c b/migration/migration.c
> index 021faee2f3..7e71184257 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -24,7 +24,7 @@
> #include "socket.h"
> #include "sysemu/runstate.h"
> #include "sysemu/sysemu.h"
> -#include "sysemu/cpu-throttle.h"
> +#include "cpu-throttle.h"
> #include "rdma.h"
> #include "ram.h"
> #include "migration/global_state.h"
> @@ -3289,7 +3289,9 @@ static MigIterateState migration_iteration_run(MigrationState *s)
> static void migration_iteration_finish(MigrationState *s)
> {
> /* If we enabled cpu throttling for auto-converge, turn it off. */
> - cpu_throttle_stop();
> + if (migrate_auto_converge()) {
> + cpu_throttle_stop();
> + }
>
> bql_lock();
> switch (s->state) {
> @@ -3508,6 +3510,11 @@ static void *migration_thread(void *opaque)
> qemu_savevm_send_colo_enable(s->to_dst_file);
> }
>
> + if (migrate_auto_converge()) {
> + /* Start cpu throttle timers */
> + cpu_throttle_init();
> + }
Might this leak the timer object?
I think it perhaps needs to be moved to migration_object_init().
> +
> bql_lock();
> ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
> bql_unlock();
> diff --git a/migration/ram.c b/migration/ram.c
> index 326ce7eb79..54d352b152 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -52,7 +52,7 @@
> #include "exec/target_page.h"
> #include "qemu/rcu_queue.h"
> #include "migration/colo.h"
> -#include "sysemu/cpu-throttle.h"
> +#include "cpu-throttle.h"
> #include "savevm.h"
> #include "qemu/iov.h"
> #include "multifd.h"
> diff --git a/migration/trace-events b/migration/trace-events
> index c65902f042..9a19599804 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -378,3 +378,6 @@ migration_block_progression(unsigned percent) "Completed %u%%"
> # page_cache.c
> migration_pagecache_init(int64_t max_num_items) "Setting cache buckets to %" PRId64
> migration_pagecache_insert(void) "Error allocating page"
> +
> +# cpu-throttle.c
> +cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by %d%%"
> diff --git a/system/cpu-timers.c b/system/cpu-timers.c
> index 0b31c9a1b6..856e502e34 100644
> --- a/system/cpu-timers.c
> +++ b/system/cpu-timers.c
> @@ -35,7 +35,6 @@
> #include "sysemu/runstate.h"
> #include "hw/core/cpu.h"
> #include "sysemu/cpu-timers.h"
> -#include "sysemu/cpu-throttle.h"
> #include "sysemu/cpu-timers-internal.h"
>
> /* clock and ticks */
> @@ -272,6 +271,4 @@ void cpu_timers_init(void)
> seqlock_init(&timers_state.vm_clock_seqlock);
> qemu_spin_init(&timers_state.vm_clock_lock);
> vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
> -
> - cpu_throttle_init();
> }
> diff --git a/system/meson.build b/system/meson.build
> index a296270cb0..4952f4b2c7 100644
> --- a/system/meson.build
> +++ b/system/meson.build
> @@ -10,7 +10,6 @@ system_ss.add(files(
> 'balloon.c',
> 'bootdevice.c',
> 'cpus.c',
> - 'cpu-throttle.c',
> 'cpu-timers.c',
> 'datadir.c',
> 'dirtylimit.c',
> diff --git a/system/trace-events b/system/trace-events
> index 074d001e90..2ed1d59b1f 100644
> --- a/system/trace-events
> +++ b/system/trace-events
> @@ -44,6 +44,3 @@ dirtylimit_state_finalize(void)
> dirtylimit_throttle_pct(int cpu_index, uint64_t pct, int64_t time_us) "CPU[%d] throttle percent: %" PRIu64 ", throttle adjust time %"PRIi64 " us"
> dirtylimit_set_vcpu(int cpu_index, uint64_t quota) "CPU[%d] set dirty page rate limit %"PRIu64
> dirtylimit_vcpu_execute(int cpu_index, int64_t sleep_time_us) "CPU[%d] sleep %"PRIi64 " us"
> -
> -# cpu-throttle.c
> -cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by %d%%"
> --
> 2.27.0
>
--
Peter Xu
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 2/4] migration: Remove "rs" parameter in migration_bitmap_sync_precopy
2024-10-16 7:56 ` [PATCH v3 2/4] migration: Remove "rs" parameter in migration_bitmap_sync_precopy yong.huang
@ 2024-10-16 15:51 ` Peter Xu
0 siblings, 0 replies; 11+ messages in thread
From: Peter Xu @ 2024-10-16 15:51 UTC (permalink / raw)
To: yong.huang; +Cc: qemu-devel, Fabiano Rosas, Richard Henderson, Paolo Bonzini
On Wed, Oct 16, 2024 at 03:56:43PM +0800, yong.huang@smartx.com wrote:
> From: Hyman Huang <yong.huang@smartx.com>
>
> The global static variable ram_state in fact is referred to by the
> "rs" parameter in migration_bitmap_sync_precopy. For ease of calling
> by the callees, use the global variable directly in
> migration_bitmap_sync_precopy and remove "rs" parameter.
>
> The migration_bitmap_sync_precopy will be exported in the next commit.
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 3/4] migration: Support periodic ramblock dirty sync
2024-10-16 7:56 ` [PATCH v3 3/4] migration: Support periodic ramblock dirty sync yong.huang
@ 2024-10-16 18:49 ` Peter Xu
2024-10-17 3:58 ` Yong Huang
0 siblings, 1 reply; 11+ messages in thread
From: Peter Xu @ 2024-10-16 18:49 UTC (permalink / raw)
To: yong.huang; +Cc: qemu-devel, Fabiano Rosas, Richard Henderson, Paolo Bonzini
On Wed, Oct 16, 2024 at 03:56:44PM +0800, yong.huang@smartx.com wrote:
> From: Hyman Huang <yong.huang@smartx.com>
>
> When VM is configured with huge memory, the current throttle logic
> doesn't look like to scale, because migration_trigger_throttle()
> is only called for each iteration, so it won't be invoked for a long
> time if one iteration can take a long time.
>
> The periodic dirty sync aims to fix the above issue by synchronizing
> the ramblock from remote dirty bitmap and, when necessary, triggering
> the CPU throttle multiple times during a long iteration.
>
> This is a trade-off between synchronization overhead and CPU throttle
> impact.
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> ---
> migration/cpu-throttle.c | 70 +++++++++++++++++++++++++++++++++++++++-
> migration/cpu-throttle.h | 14 ++++++++
> migration/migration.h | 1 +
> migration/ram.c | 9 ++++--
> migration/trace-events | 1 +
> 5 files changed, 92 insertions(+), 3 deletions(-)
>
> diff --git a/migration/cpu-throttle.c b/migration/cpu-throttle.c
> index fa47ee2e21..784b51ab35 100644
> --- a/migration/cpu-throttle.c
> +++ b/migration/cpu-throttle.c
> @@ -28,16 +28,23 @@
> #include "qemu/main-loop.h"
> #include "sysemu/cpus.h"
> #include "cpu-throttle.h"
> +#include "migration.h"
> +#include "migration-stats.h"
> +#include "options.h"
> #include "trace.h"
>
> /* vcpu throttling controls */
> -static QEMUTimer *throttle_timer;
> +static QEMUTimer *throttle_timer, *throttle_dirty_sync_timer;
> static unsigned int throttle_percentage;
> +static bool throttle_dirty_sync_timer_active;
>
> #define CPU_THROTTLE_PCT_MIN 1
> #define CPU_THROTTLE_PCT_MAX 99
> #define CPU_THROTTLE_TIMESLICE_NS 10000000
>
> +/* RAMBlock dirty sync trigger every five seconds */
Maybe enrich it to say "making sure it is synchronized every five seconds"?
Because it can synchronize faster if each iteration runs faster than 5sec,
so just to emphasize it's a fallback sync, and only with auto converge.
> +#define CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS 5000
> +
> static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
> {
> double pct;
> @@ -112,6 +119,7 @@ void cpu_throttle_set(int new_throttle_pct)
> void cpu_throttle_stop(void)
> {
> qatomic_set(&throttle_percentage, 0);
> + cpu_throttle_dirty_sync_timer(false);
> }
>
> bool cpu_throttle_active(void)
> @@ -124,8 +132,68 @@ int cpu_throttle_get_percentage(void)
> return qatomic_read(&throttle_percentage);
> }
>
> +void cpu_throttle_dirty_sync_timer_tick(void *opaque)
> +{
> + static uint64_t prev_sync_cnt = 2;
IIUC the hard coded "2" isn't needed, as long as it's guaranteed to be
updated for each timer call, and you special cased "1" anyway below.
> + uint64_t sync_cnt = stat64_get(&mig_stats.dirty_sync_count);
> +
> + if (!migrate_auto_converge()) {
> + /* Stop the timer when auto converge is disabled */
> + return;
I think we can try to make sure this never starts if !auto-converge, so
assuming this path will never trigger in real life.
> + }
> +
> + /*
> + * The first iteration copies all memory anyhow and has no
> + * effect on guest performance, therefore omit it to avoid
> + * paying extra for the sync penalty.
> + */
> + if (sync_cnt <= 1) {
> + goto end;
> + }
> +
> + if (sync_cnt == prev_sync_cnt) {
> + trace_cpu_throttle_dirty_sync();
> + WITH_RCU_READ_LOCK_GUARD() {
> + migration_bitmap_sync_precopy(false);
> + }
> + }
> +
> +end:
> + prev_sync_cnt = stat64_get(&mig_stats.dirty_sync_count);
> +
> + timer_mod(throttle_dirty_sync_timer,
> + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
> + CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS);
> +}
> +
> +static bool cpu_throttle_dirty_sync_active(void)
> +{
> + return qatomic_read(&throttle_dirty_sync_timer_active);
> +}
> +
> +void cpu_throttle_dirty_sync_timer(bool enable)
> +{
> + if (enable) {
> + assert(throttle_dirty_sync_timer);
> + if (!cpu_throttle_dirty_sync_active()) {
I suppose this can be logically racy? As I think after this patch this path
can be invoked both in main thread and migration thread.
The simplest way to do is to move cpu_throttle_stop() call to be under
bql_lock(), so that this will be serialized by BQL. Then we can add an
assertion at the entry of the function for bql_locked().
> + timer_mod(throttle_dirty_sync_timer,
> + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
> + CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS);
> + qatomic_set(&throttle_dirty_sync_timer_active, 1);
> + }
> + } else {
> + if (throttle_dirty_sync_timer != NULL) {
IIUC throttle_dirty_sync_timer is never destroyed, aka, timer_del() only
disables it. So you should probably use throttle_dirty_sync_timer_active?
> + timer_del(throttle_dirty_sync_timer);
> + qatomic_set(&throttle_dirty_sync_timer_active, 0);
> + }
> + }
> +}
> +
> void cpu_throttle_init(void)
> {
> throttle_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
> cpu_throttle_timer_tick, NULL);
> + throttle_dirty_sync_timer =
> + timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
> + cpu_throttle_dirty_sync_timer_tick, NULL);
> }
> diff --git a/migration/cpu-throttle.h b/migration/cpu-throttle.h
> index d65bdef6d0..420702b8d3 100644
> --- a/migration/cpu-throttle.h
> +++ b/migration/cpu-throttle.h
> @@ -65,4 +65,18 @@ bool cpu_throttle_active(void);
> */
> int cpu_throttle_get_percentage(void);
>
> +/**
> + * cpu_throttle_dirty_sync_timer_tick:
> + *
> + * Dirty sync timer hook.
> + */
> +void cpu_throttle_dirty_sync_timer_tick(void *opaque);
> +
> +/**
> + * cpu_throttle_dirty_sync_timer:
> + *
> + * Start or stop the dirty sync timer.
> + */
> +void cpu_throttle_dirty_sync_timer(bool enable);
> +
> #endif /* SYSEMU_CPU_THROTTLE_H */
> diff --git a/migration/migration.h b/migration/migration.h
> index 38aa1402d5..fbd0d19092 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -537,4 +537,5 @@ int migration_rp_wait(MigrationState *s);
> */
> void migration_rp_kick(MigrationState *s);
>
> +void migration_bitmap_sync_precopy(bool last_stage);
> #endif
> diff --git a/migration/ram.c b/migration/ram.c
> index 9b5b350405..ac34e731e2 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1020,6 +1020,11 @@ static void migration_trigger_throttle(RAMState *rs)
> migration_transferred_bytes() - rs->bytes_xfer_prev;
> uint64_t bytes_dirty_period = rs->num_dirty_pages_period * TARGET_PAGE_SIZE;
> uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold / 100;
> + bool auto_converge = migrate_auto_converge();
> +
> + if (auto_converge) {
> + cpu_throttle_dirty_sync_timer(true);
> + }
If you have the guard to skip the 1st sync in the timer fn(), IIUC you can
move this earlier to e.g. migration_thread() before iteration starts.
Otherwise it'll be not as clear on when this timer will start if it hides
in the sync path itself.
>
> /*
> * The following detection logic can be refined later. For now:
> @@ -1031,7 +1036,7 @@ static void migration_trigger_throttle(RAMState *rs)
> if ((bytes_dirty_period > bytes_dirty_threshold) &&
> (++rs->dirty_rate_high_cnt >= 2)) {
> rs->dirty_rate_high_cnt = 0;
> - if (migrate_auto_converge()) {
> + if (auto_converge) {
> trace_migration_throttle();
> mig_throttle_guest_down(bytes_dirty_period,
> bytes_dirty_threshold);
> @@ -1088,7 +1093,7 @@ static void migration_bitmap_sync(RAMState *rs, bool last_stage)
> }
> }
>
> -static void migration_bitmap_sync_precopy(bool last_stage)
> +void migration_bitmap_sync_precopy(bool last_stage)
> {
> Error *local_err = NULL;
> assert(ram_state);
> diff --git a/migration/trace-events b/migration/trace-events
> index 9a19599804..0638183056 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -381,3 +381,4 @@ migration_pagecache_insert(void) "Error allocating page"
>
> # cpu-throttle.c
> cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by %d%%"
> +cpu_throttle_dirty_sync(void) ""
> --
> 2.27.0
>
--
Peter Xu
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 4/4] tests/migration: Add case for periodic ramblock dirty sync
2024-10-16 7:56 ` [PATCH v3 4/4] tests/migration: Add case for " yong.huang
@ 2024-10-16 18:50 ` Peter Xu
0 siblings, 0 replies; 11+ messages in thread
From: Peter Xu @ 2024-10-16 18:50 UTC (permalink / raw)
To: yong.huang; +Cc: qemu-devel, Fabiano Rosas, Richard Henderson, Paolo Bonzini
On Wed, Oct 16, 2024 at 03:56:45PM +0800, yong.huang@smartx.com wrote:
> From: Hyman Huang <yong.huang@smartx.com>
>
> Signed-off-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 1/4] migration: Move cpu-throttole.c from system to migration
2024-10-16 15:50 ` Peter Xu
@ 2024-10-17 3:52 ` Yong Huang
0 siblings, 0 replies; 11+ messages in thread
From: Yong Huang @ 2024-10-17 3:52 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Richard Henderson, Paolo Bonzini
[-- Attachment #1: Type: text/plain, Size: 7607 bytes --]
On Wed, Oct 16, 2024 at 11:50 PM Peter Xu <peterx@redhat.com> wrote:
> On Wed, Oct 16, 2024 at 03:56:42PM +0800, yong.huang@smartx.com wrote:
> > From: Hyman Huang <yong.huang@smartx.com>
> >
> > Move cpu-throttle.c from system to migration since it's
> > only used for migration; this makes us avoid exporting the
> > util functions and variables in misc.h but export them in
> > migration.h when implementing the background ramblock dirty
> > sync feature in the upcoming commits.
> >
> > Additionally, make the two modifications below:
> >
> > 1. Delay the timer registering of CPU throttle until
> > migration starts since it is only used in migration.
> >
> > 2. Stop CPU throttle if auto converge capability is
> > enabled since it only happens with auto converge.
> >
> > 3. Remove the unused header file reference in
> > accel/tcg/icount-common.c.
>
> Please consider split the things into smaller patches, especially when it
> involves file movements.
>
OK.
> >
> > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > ---
> > accel/tcg/icount-common.c | 1 -
> > {system => migration}/cpu-throttle.c | 2 +-
> > {include/sysemu => migration}/cpu-throttle.h | 0
> > migration/meson.build | 1 +
> > migration/migration.c | 11 +++++++++--
> > migration/ram.c | 2 +-
> > migration/trace-events | 3 +++
> > system/cpu-timers.c | 3 ---
> > system/meson.build | 1 -
> > system/trace-events | 3 ---
> > 10 files changed, 15 insertions(+), 12 deletions(-)
> > rename {system => migration}/cpu-throttle.c (99%)
> > rename {include/sysemu => migration}/cpu-throttle.h (100%)
> >
> > diff --git a/accel/tcg/icount-common.c b/accel/tcg/icount-common.c
> > index 8d3d3a7e9d..30bf8500dc 100644
> > --- a/accel/tcg/icount-common.c
> > +++ b/accel/tcg/icount-common.c
> > @@ -36,7 +36,6 @@
> > #include "sysemu/runstate.h"
> > #include "hw/core/cpu.h"
> > #include "sysemu/cpu-timers.h"
> > -#include "sysemu/cpu-throttle.h"
> > #include "sysemu/cpu-timers-internal.h"
> >
> > /*
> > diff --git a/system/cpu-throttle.c b/migration/cpu-throttle.c
> > similarity index 99%
> > rename from system/cpu-throttle.c
> > rename to migration/cpu-throttle.c
> > index 7632dc6143..fa47ee2e21 100644
> > --- a/system/cpu-throttle.c
> > +++ b/migration/cpu-throttle.c
> > @@ -27,7 +27,7 @@
> > #include "hw/core/cpu.h"
> > #include "qemu/main-loop.h"
> > #include "sysemu/cpus.h"
> > -#include "sysemu/cpu-throttle.h"
> > +#include "cpu-throttle.h"
> > #include "trace.h"
> >
> > /* vcpu throttling controls */
> > diff --git a/include/sysemu/cpu-throttle.h b/migration/cpu-throttle.h
> > similarity index 100%
> > rename from include/sysemu/cpu-throttle.h
> > rename to migration/cpu-throttle.h
> > diff --git a/migration/meson.build b/migration/meson.build
> > index 66d3de86f0..d53cf3417a 100644
> > --- a/migration/meson.build
> > +++ b/migration/meson.build
> > @@ -13,6 +13,7 @@ system_ss.add(files(
> > 'block-dirty-bitmap.c',
> > 'channel.c',
> > 'channel-block.c',
> > + 'cpu-throttle.c',
> > 'dirtyrate.c',
> > 'exec.c',
> > 'fd.c',
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 021faee2f3..7e71184257 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -24,7 +24,7 @@
> > #include "socket.h"
> > #include "sysemu/runstate.h"
> > #include "sysemu/sysemu.h"
> > -#include "sysemu/cpu-throttle.h"
> > +#include "cpu-throttle.h"
> > #include "rdma.h"
> > #include "ram.h"
> > #include "migration/global_state.h"
> > @@ -3289,7 +3289,9 @@ static MigIterateState
> migration_iteration_run(MigrationState *s)
> > static void migration_iteration_finish(MigrationState *s)
> > {
> > /* If we enabled cpu throttling for auto-converge, turn it off. */
> > - cpu_throttle_stop();
> > + if (migrate_auto_converge()) {
> > + cpu_throttle_stop();
> > + }
> >
> > bql_lock();
> > switch (s->state) {
> > @@ -3508,6 +3510,11 @@ static void *migration_thread(void *opaque)
> > qemu_savevm_send_colo_enable(s->to_dst_file);
> > }
> >
> > + if (migrate_auto_converge()) {
> > + /* Start cpu throttle timers */
> > + cpu_throttle_init();
> > + }
>
> Might this leak the timer object?
> I think it perhaps needs to be moved to migration_object_init().
>
Ok, I'll fix it in the next version.
>
> > +
> > bql_lock();
> > ret = qemu_savevm_state_setup(s->to_dst_file, &local_err);
> > bql_unlock();
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 326ce7eb79..54d352b152 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -52,7 +52,7 @@
> > #include "exec/target_page.h"
> > #include "qemu/rcu_queue.h"
> > #include "migration/colo.h"
> > -#include "sysemu/cpu-throttle.h"
> > +#include "cpu-throttle.h"
> > #include "savevm.h"
> > #include "qemu/iov.h"
> > #include "multifd.h"
> > diff --git a/migration/trace-events b/migration/trace-events
> > index c65902f042..9a19599804 100644
> > --- a/migration/trace-events
> > +++ b/migration/trace-events
> > @@ -378,3 +378,6 @@ migration_block_progression(unsigned percent)
> "Completed %u%%"
> > # page_cache.c
> > migration_pagecache_init(int64_t max_num_items) "Setting cache buckets
> to %" PRId64
> > migration_pagecache_insert(void) "Error allocating page"
> > +
> > +# cpu-throttle.c
> > +cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by
> %d%%"
> > diff --git a/system/cpu-timers.c b/system/cpu-timers.c
> > index 0b31c9a1b6..856e502e34 100644
> > --- a/system/cpu-timers.c
> > +++ b/system/cpu-timers.c
> > @@ -35,7 +35,6 @@
> > #include "sysemu/runstate.h"
> > #include "hw/core/cpu.h"
> > #include "sysemu/cpu-timers.h"
> > -#include "sysemu/cpu-throttle.h"
> > #include "sysemu/cpu-timers-internal.h"
> >
> > /* clock and ticks */
> > @@ -272,6 +271,4 @@ void cpu_timers_init(void)
> > seqlock_init(&timers_state.vm_clock_seqlock);
> > qemu_spin_init(&timers_state.vm_clock_lock);
> > vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
> > -
> > - cpu_throttle_init();
> > }
> > diff --git a/system/meson.build b/system/meson.build
> > index a296270cb0..4952f4b2c7 100644
> > --- a/system/meson.build
> > +++ b/system/meson.build
> > @@ -10,7 +10,6 @@ system_ss.add(files(
> > 'balloon.c',
> > 'bootdevice.c',
> > 'cpus.c',
> > - 'cpu-throttle.c',
> > 'cpu-timers.c',
> > 'datadir.c',
> > 'dirtylimit.c',
> > diff --git a/system/trace-events b/system/trace-events
> > index 074d001e90..2ed1d59b1f 100644
> > --- a/system/trace-events
> > +++ b/system/trace-events
> > @@ -44,6 +44,3 @@ dirtylimit_state_finalize(void)
> > dirtylimit_throttle_pct(int cpu_index, uint64_t pct, int64_t time_us)
> "CPU[%d] throttle percent: %" PRIu64 ", throttle adjust time %"PRIi64 " us"
> > dirtylimit_set_vcpu(int cpu_index, uint64_t quota) "CPU[%d] set dirty
> page rate limit %"PRIu64
> > dirtylimit_vcpu_execute(int cpu_index, int64_t sleep_time_us) "CPU[%d]
> sleep %"PRIi64 " us"
> > -
> > -# cpu-throttle.c
> > -cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by
> %d%%"
> > --
> > 2.27.0
> >
>
> --
> Peter Xu
>
>
--
Best regards
[-- Attachment #2: Type: text/html, Size: 11127 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH v3 3/4] migration: Support periodic ramblock dirty sync
2024-10-16 18:49 ` Peter Xu
@ 2024-10-17 3:58 ` Yong Huang
0 siblings, 0 replies; 11+ messages in thread
From: Yong Huang @ 2024-10-17 3:58 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, Fabiano Rosas, Richard Henderson, Paolo Bonzini
[-- Attachment #1: Type: text/plain, Size: 9511 bytes --]
On Thu, Oct 17, 2024 at 2:49 AM Peter Xu <peterx@redhat.com> wrote:
> On Wed, Oct 16, 2024 at 03:56:44PM +0800, yong.huang@smartx.com wrote:
> > From: Hyman Huang <yong.huang@smartx.com>
> >
> > When VM is configured with huge memory, the current throttle logic
> > doesn't look like to scale, because migration_trigger_throttle()
> > is only called for each iteration, so it won't be invoked for a long
> > time if one iteration can take a long time.
> >
> > The periodic dirty sync aims to fix the above issue by synchronizing
> > the ramblock from remote dirty bitmap and, when necessary, triggering
> > the CPU throttle multiple times during a long iteration.
> >
> > This is a trade-off between synchronization overhead and CPU throttle
> > impact.
> >
> > Signed-off-by: Hyman Huang <yong.huang@smartx.com>
> > ---
> > migration/cpu-throttle.c | 70 +++++++++++++++++++++++++++++++++++++++-
> > migration/cpu-throttle.h | 14 ++++++++
> > migration/migration.h | 1 +
> > migration/ram.c | 9 ++++--
> > migration/trace-events | 1 +
> > 5 files changed, 92 insertions(+), 3 deletions(-)
> >
> > diff --git a/migration/cpu-throttle.c b/migration/cpu-throttle.c
> > index fa47ee2e21..784b51ab35 100644
> > --- a/migration/cpu-throttle.c
> > +++ b/migration/cpu-throttle.c
> > @@ -28,16 +28,23 @@
> > #include "qemu/main-loop.h"
> > #include "sysemu/cpus.h"
> > #include "cpu-throttle.h"
> > +#include "migration.h"
> > +#include "migration-stats.h"
> > +#include "options.h"
> > #include "trace.h"
> >
> > /* vcpu throttling controls */
> > -static QEMUTimer *throttle_timer;
> > +static QEMUTimer *throttle_timer, *throttle_dirty_sync_timer;
> > static unsigned int throttle_percentage;
> > +static bool throttle_dirty_sync_timer_active;
> >
> > #define CPU_THROTTLE_PCT_MIN 1
> > #define CPU_THROTTLE_PCT_MAX 99
> > #define CPU_THROTTLE_TIMESLICE_NS 10000000
> >
> > +/* RAMBlock dirty sync trigger every five seconds */
>
> Maybe enrich it to say "making sure it is synchronized every five seconds"?
> Because it can synchronize faster if each iteration runs faster than 5sec,
> so just to emphasize it's a fallback sync, and only with auto converge.
>
Agree, I'll refine the comment in the next version.
>
> > +#define CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS 5000
> > +
> > static void cpu_throttle_thread(CPUState *cpu, run_on_cpu_data opaque)
> > {
> > double pct;
> > @@ -112,6 +119,7 @@ void cpu_throttle_set(int new_throttle_pct)
> > void cpu_throttle_stop(void)
> > {
> > qatomic_set(&throttle_percentage, 0);
> > + cpu_throttle_dirty_sync_timer(false);
> > }
> >
> > bool cpu_throttle_active(void)
> > @@ -124,8 +132,68 @@ int cpu_throttle_get_percentage(void)
> > return qatomic_read(&throttle_percentage);
> > }
> >
> > +void cpu_throttle_dirty_sync_timer_tick(void *opaque)
> > +{
> > + static uint64_t prev_sync_cnt = 2;
>
> IIUC the hard coded "2" isn't needed, as long as it's guaranteed to be
> updated for each timer call, and you special cased "1" anyway below.
>
Ok.
>
> > + uint64_t sync_cnt = stat64_get(&mig_stats.dirty_sync_count);
> > +
> > + if (!migrate_auto_converge()) {
> > + /* Stop the timer when auto converge is disabled */
> > + return;
>
> I think we can try to make sure this never starts if !auto-converge, so
> assuming this path will never trigger in real life.
>
Indeed, this makes the code cleaner.
> > + }
> > +
> > + /*
> > + * The first iteration copies all memory anyhow and has no
> > + * effect on guest performance, therefore omit it to avoid
> > + * paying extra for the sync penalty.
> > + */
> > + if (sync_cnt <= 1) {
> > + goto end;
> > + }
> > +
> > + if (sync_cnt == prev_sync_cnt) {
> > + trace_cpu_throttle_dirty_sync();
> > + WITH_RCU_READ_LOCK_GUARD() {
> > + migration_bitmap_sync_precopy(false);
> > + }
> > + }
> > +
> > +end:
> > + prev_sync_cnt = stat64_get(&mig_stats.dirty_sync_count);
> > +
> > + timer_mod(throttle_dirty_sync_timer,
> > + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
> > + CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS);
> > +}
> > +
> > +static bool cpu_throttle_dirty_sync_active(void)
> > +{
> > + return qatomic_read(&throttle_dirty_sync_timer_active);
> > +}
> > +
> > +void cpu_throttle_dirty_sync_timer(bool enable)
> > +{
> > + if (enable) {
> > + assert(throttle_dirty_sync_timer);
> > + if (!cpu_throttle_dirty_sync_active()) {
>
> I suppose this can be logically racy? As I think after this patch this path
> can be invoked both in main thread and migration thread.
>
Indeed, thanks for pointing this out.
>
> The simplest way to do is to move cpu_throttle_stop() call to be under
> bql_lock(), so that this will be serialized by BQL. Then we can add an
> assertion at the entry of the function for bql_locked().
> > + timer_mod(throttle_dirty_sync_timer,
> > + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
> > + CPU_THROTTLE_DIRTY_SYNC_TIMESLICE_MS);
> > + qatomic_set(&throttle_dirty_sync_timer_active, 1);
> > + }
> > + } else {
> > + if (throttle_dirty_sync_timer != NULL) {
>
> IIUC throttle_dirty_sync_timer is never destroyed, aka, timer_del() only
> disables it. So you should probably use throttle_dirty_sync_timer_active?
>
> > + timer_del(throttle_dirty_sync_timer);
> > + qatomic_set(&throttle_dirty_sync_timer_active, 0);
> > + }
> > + }
> > +}
> > +
> > void cpu_throttle_init(void)
> > {
> > throttle_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
> > cpu_throttle_timer_tick, NULL);
> > + throttle_dirty_sync_timer =
> > + timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
> > + cpu_throttle_dirty_sync_timer_tick, NULL);
> > }
> > diff --git a/migration/cpu-throttle.h b/migration/cpu-throttle.h
> > index d65bdef6d0..420702b8d3 100644
> > --- a/migration/cpu-throttle.h
> > +++ b/migration/cpu-throttle.h
> > @@ -65,4 +65,18 @@ bool cpu_throttle_active(void);
> > */
> > int cpu_throttle_get_percentage(void);
> >
> > +/**
> > + * cpu_throttle_dirty_sync_timer_tick:
> > + *
> > + * Dirty sync timer hook.
> > + */
> > +void cpu_throttle_dirty_sync_timer_tick(void *opaque);
> > +
> > +/**
> > + * cpu_throttle_dirty_sync_timer:
> > + *
> > + * Start or stop the dirty sync timer.
> > + */
> > +void cpu_throttle_dirty_sync_timer(bool enable);
> > +
> > #endif /* SYSEMU_CPU_THROTTLE_H */
> > diff --git a/migration/migration.h b/migration/migration.h
> > index 38aa1402d5..fbd0d19092 100644
> > --- a/migration/migration.h
> > +++ b/migration/migration.h
> > @@ -537,4 +537,5 @@ int migration_rp_wait(MigrationState *s);
> > */
> > void migration_rp_kick(MigrationState *s);
> >
> > +void migration_bitmap_sync_precopy(bool last_stage);
> > #endif
> > diff --git a/migration/ram.c b/migration/ram.c
> > index 9b5b350405..ac34e731e2 100644
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -1020,6 +1020,11 @@ static void migration_trigger_throttle(RAMState
> *rs)
> > migration_transferred_bytes() - rs->bytes_xfer_prev;
> > uint64_t bytes_dirty_period = rs->num_dirty_pages_period *
> TARGET_PAGE_SIZE;
> > uint64_t bytes_dirty_threshold = bytes_xfer_period * threshold /
> 100;
> > + bool auto_converge = migrate_auto_converge();
> > +
> > + if (auto_converge) {
> > + cpu_throttle_dirty_sync_timer(true);
> > + }
>
> If you have the guard to skip the 1st sync in the timer fn(), IIUC you can
> move this earlier to e.g. migration_thread() before iteration starts.
> Otherwise it'll be not as clear on when this timer will start if it hides
> in the sync path itself.
>
> >
> > /*
> > * The following detection logic can be refined later. For now:
> > @@ -1031,7 +1036,7 @@ static void migration_trigger_throttle(RAMState
> *rs)
> > if ((bytes_dirty_period > bytes_dirty_threshold) &&
> > (++rs->dirty_rate_high_cnt >= 2)) {
> > rs->dirty_rate_high_cnt = 0;
> > - if (migrate_auto_converge()) {
> > + if (auto_converge) {
> > trace_migration_throttle();
> > mig_throttle_guest_down(bytes_dirty_period,
> > bytes_dirty_threshold);
> > @@ -1088,7 +1093,7 @@ static void migration_bitmap_sync(RAMState *rs,
> bool last_stage)
> > }
> > }
> >
> > -static void migration_bitmap_sync_precopy(bool last_stage)
> > +void migration_bitmap_sync_precopy(bool last_stage)
> > {
> > Error *local_err = NULL;
> > assert(ram_state);
> > diff --git a/migration/trace-events b/migration/trace-events
> > index 9a19599804..0638183056 100644
> > --- a/migration/trace-events
> > +++ b/migration/trace-events
> > @@ -381,3 +381,4 @@ migration_pagecache_insert(void) "Error allocating
> page"
> >
> > # cpu-throttle.c
> > cpu_throttle_set(int new_throttle_pct) "set guest CPU throttled by
> %d%%"
> > +cpu_throttle_dirty_sync(void) ""
> > --
> > 2.27.0
> >
>
> --
> Peter Xu
>
>
All the comments above are advisable, thanks Peter, I'll try it in the next
version.
Yong
--
Best regards
[-- Attachment #2: Type: text/html, Size: 13557 bytes --]
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2024-10-17 3:59 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-16 7:56 [PATCH v3 0/4] migration: auto-converge refinements for huge VM yong.huang
2024-10-16 7:56 ` [PATCH v3 1/4] migration: Move cpu-throttole.c from system to migration yong.huang
2024-10-16 15:50 ` Peter Xu
2024-10-17 3:52 ` Yong Huang
2024-10-16 7:56 ` [PATCH v3 2/4] migration: Remove "rs" parameter in migration_bitmap_sync_precopy yong.huang
2024-10-16 15:51 ` Peter Xu
2024-10-16 7:56 ` [PATCH v3 3/4] migration: Support periodic ramblock dirty sync yong.huang
2024-10-16 18:49 ` Peter Xu
2024-10-17 3:58 ` Yong Huang
2024-10-16 7:56 ` [PATCH v3 4/4] tests/migration: Add case for " yong.huang
2024-10-16 18:50 ` Peter Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).