* [virtio-dev] [PATCH v9 1/8] bitmap: fix bitmap_count_one
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
@ 2018-11-15 10:07 ` Wei Wang
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 2/8] bitmap: bitmap_count_one_with_offset Wei Wang
` (7 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:07 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang
BITMAP_LAST_WORD_MASK(nbits) returns 0xffffffff when "nbits=0", which
makes bitmap_count_one fail to handle the "nbits=0" case. It appears to be
preferred to remain BITMAP_LAST_WORD_MASK identical to the kernel
implementation that it is ported from.
So this patch fixes bitmap_count_one to handle the nbits=0 case.
Inital Discussion Link:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg554316.html
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
include/qemu/bitmap.h | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 509eedd..679f1bd 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -221,6 +221,10 @@ static inline int bitmap_intersects(const unsigned long *src1,
static inline long bitmap_count_one(const unsigned long *bitmap, long nbits)
{
+ if (unlikely(!nbits)) {
+ return 0;
+ }
+
if (small_nbits(nbits)) {
return ctpopl(*bitmap & BITMAP_LAST_WORD_MASK(nbits));
} else {
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread* [virtio-dev] [PATCH v9 2/8] bitmap: bitmap_count_one_with_offset
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 1/8] bitmap: fix bitmap_count_one Wei Wang
@ 2018-11-15 10:07 ` Wei Wang
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty Wei Wang
` (6 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:07 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang
Count the number of 1s in a bitmap starting from an offset.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
include/qemu/bitmap.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 679f1bd..5c31334 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -232,6 +232,19 @@ static inline long bitmap_count_one(const unsigned long *bitmap, long nbits)
}
}
+static inline long bitmap_count_one_with_offset(const unsigned long *bitmap,
+ long offset, long nbits)
+{
+ long aligned_offset = QEMU_ALIGN_DOWN(offset, BITS_PER_LONG);
+ long redundant_bits = offset - aligned_offset;
+ long bits_to_count = nbits + redundant_bits;
+ const unsigned long *bitmap_start = bitmap +
+ aligned_offset / BITS_PER_LONG;
+
+ return bitmap_count_one(bitmap_start, bits_to_count) -
+ bitmap_count_one(bitmap_start, redundant_bits);
+}
+
void bitmap_set(unsigned long *map, long i, long len);
void bitmap_set_atomic(unsigned long *map, long i, long len);
void bitmap_clear(unsigned long *map, long start, long nr);
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread* [virtio-dev] [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 1/8] bitmap: fix bitmap_count_one Wei Wang
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 2/8] bitmap: bitmap_count_one_with_offset Wei Wang
@ 2018-11-15 10:07 ` Wei Wang
[not found] ` <20181127054056.GA3205@xz-x1>
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 4/8] migration: API to clear bits of guest free pages from the dirty bitmap Wei Wang
` (5 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:07 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang
The bitmap mutex is used to synchronize threads to update the dirty
bitmap and the migration_dirty_pages counter. For example, the free
page optimization clears bits of free pages from the bitmap in an
iothread context. This patch makes migration_bitmap_clear_dirty update
the bitmap and counter under the mutex.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
migration/ram.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/migration/ram.c b/migration/ram.c
index 7e7deec..ef69dbe 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1556,11 +1556,14 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
{
bool ret;
+ qemu_mutex_lock(&rs->bitmap_mutex);
ret = test_and_clear_bit(page, rb->bmap);
if (ret) {
rs->migration_dirty_pages--;
}
+ qemu_mutex_unlock(&rs->bitmap_mutex);
+
return ret;
}
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread* [virtio-dev] [PATCH v9 4/8] migration: API to clear bits of guest free pages from the dirty bitmap
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
` (2 preceding siblings ...)
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
[not found] ` <20181127060638.GB3205@xz-x1>
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy Wei Wang
` (4 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang
This patch adds an API to clear bits corresponding to guest free pages
from the dirty bitmap. Spilt the free page block if it crosses the QEMU
RAMBlock boundary.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 2 ++
migration/ram.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 50 insertions(+)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 4ebf24c..113320e 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -14,11 +14,13 @@
#ifndef MIGRATION_MISC_H
#define MIGRATION_MISC_H
+#include "exec/cpu-common.h"
#include "qemu/notify.h"
/* migration/ram.c */
void ram_mig_init(void);
+void qemu_guest_free_page_hint(void *addr, size_t len);
/* migration/block.c */
diff --git a/migration/ram.c b/migration/ram.c
index ef69dbe..229b791 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3131,6 +3131,54 @@ static void ram_state_resume_prepare(RAMState *rs, QEMUFile *out)
}
/*
+ * This function clears bits of the free pages reported by the caller from the
+ * migration dirty bitmap. @addr is the host address corresponding to the
+ * start of the continuous guest free pages, and @len is the total bytes of
+ * those pages.
+ */
+void qemu_guest_free_page_hint(void *addr, size_t len)
+{
+ RAMBlock *block;
+ ram_addr_t offset;
+ size_t used_len, start, npages;
+ MigrationState *s = migrate_get_current();
+
+ /* This function is currently expected to be used during live migration */
+ if (!migration_is_setup_or_active(s->state)) {
+ return;
+ }
+
+ for (; len > 0; len -= used_len) {
+ block = qemu_ram_block_from_host(addr, false, &offset);
+ assert(block);
+
+ /*
+ * This handles the case that the RAMBlock is resized after the free
+ * page hint is reported.
+ */
+ if (unlikely(offset > block->used_length)) {
+ return;
+ }
+
+ if (len <= block->used_length - offset) {
+ used_len = len;
+ } else {
+ used_len = block->used_length - offset;
+ addr += used_len;
+ }
+
+ start = offset >> TARGET_PAGE_BITS;
+ npages = used_len >> TARGET_PAGE_BITS;
+
+ qemu_mutex_lock(&ram_state->bitmap_mutex);
+ ram_state->migration_dirty_pages -=
+ bitmap_count_one_with_offset(block->bmap, start, npages);
+ bitmap_clear(block->bmap, start, npages);
+ qemu_mutex_unlock(&ram_state->bitmap_mutex);
+ }
+}
+
+/*
* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
* long-running RCU critical section. When rcu-reclaims in the code
* start to become numerous it will be necessary to reduce the
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread* [virtio-dev] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
` (3 preceding siblings ...)
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 4/8] migration: API to clear bits of guest free pages from the dirty bitmap Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
[not found] ` <20181127073802.GC3205@xz-x1>
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 6/8] migration/ram.c: add a function to disable the bulk stage Wei Wang
` (3 subsequent siblings)
8 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang
This patch adds a notifier chain for the memory precopy. This enables various
precopy optimizations to be invoked at specific places.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 12 ++++++++++++
migration/ram.c | 31 +++++++++++++++++++++++++++++++
vl.c | 1 +
3 files changed, 44 insertions(+)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 113320e..0bac623 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -19,6 +19,18 @@
/* migration/ram.c */
+typedef enum PrecopyNotifyReason {
+ PRECOPY_NOTIFY_ERR = 0,
+ PRECOPY_NOTIFY_START_ITERATION = 1,
+ PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP = 2,
+ PRECOPY_NOTIFY_AFTER_SYNC_BITMAP = 3,
+ PRECOPY_NOTIFY_MAX = 4,
+} PrecopyNotifyReason;
+
+void precopy_infrastructure_init(void);
+void precopy_add_notifier(Notifier *n);
+void precopy_remove_notifier(Notifier *n);
+
void ram_mig_init(void);
void qemu_guest_free_page_hint(void *addr, size_t len);
diff --git a/migration/ram.c b/migration/ram.c
index 229b791..65b1223 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -292,6 +292,8 @@ struct RAMState {
bool ram_bulk_stage;
/* How many times we have dirty too many pages */
int dirty_rate_high_cnt;
+ /* ram save states used for notifiers */
+ int ram_save_state;
/* these variables are used for bitmap sync */
/* last time we did a full bitmap_sync */
int64_t time_last_bitmap_sync;
@@ -328,6 +330,28 @@ typedef struct RAMState RAMState;
static RAMState *ram_state;
+static NotifierList precopy_notifier_list;
+
+void precopy_infrastructure_init(void)
+{
+ notifier_list_init(&precopy_notifier_list);
+}
+
+void precopy_add_notifier(Notifier *n)
+{
+ notifier_list_add(&precopy_notifier_list, n);
+}
+
+void precopy_remove_notifier(Notifier *n)
+{
+ notifier_remove(n);
+}
+
+static void precopy_notify(PrecopyNotifyReason reason)
+{
+ notifier_list_notify(&precopy_notifier_list, &reason);
+}
+
uint64_t ram_bytes_remaining(void)
{
return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) :
@@ -1642,6 +1666,8 @@ static void migration_bitmap_sync(RAMState *rs)
int64_t end_time;
uint64_t bytes_xfer_now;
+ precopy_notify(PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP);
+
ram_counters.dirty_sync_count++;
if (!rs->time_last_bitmap_sync) {
@@ -1699,6 +1725,8 @@ static void migration_bitmap_sync(RAMState *rs)
if (migrate_use_events()) {
qapi_event_send_migration_pass(ram_counters.dirty_sync_count);
}
+
+ precopy_notify(PRECOPY_NOTIFY_AFTER_SYNC_BITMAP);
}
/**
@@ -2555,6 +2583,8 @@ static void ram_state_reset(RAMState *rs)
rs->last_page = 0;
rs->last_version = ram_list.version;
rs->ram_bulk_stage = true;
+
+ precopy_notify(PRECOPY_NOTIFY_START_ITERATION);
}
#define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -3324,6 +3354,7 @@ out:
ret = qemu_file_get_error(f);
if (ret < 0) {
+ precopy_notify(PRECOPY_NOTIFY_ERR);
return ret;
}
diff --git a/vl.c b/vl.c
index fa25d1a..48ae0e8 100644
--- a/vl.c
+++ b/vl.c
@@ -3057,6 +3057,7 @@ int main(int argc, char **argv, char **envp)
module_call_init(MODULE_INIT_OPTS);
runstate_init();
+ precopy_infrastructure_init();
postcopy_infrastructure_init();
monitor_init_globals();
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread* [virtio-dev] [PATCH v9 6/8] migration/ram.c: add a function to disable the bulk stage
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
` (4 preceding siblings ...)
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 7/8] migration: move migrate_postcopy() to include/migration/misc.h Wei Wang
` (2 subsequent siblings)
8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang
This patch adds a function to enable a precopy notifier callback outside
the migration subsystem to disable the bulk stage flag. This is needed by
the free page optimization offered by virtio-balloon.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
include/migration/misc.h | 1 +
migration/ram.c | 9 +++++++++
2 files changed, 10 insertions(+)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 0bac623..67cb275 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -30,6 +30,7 @@ typedef enum PrecopyNotifyReason {
void precopy_infrastructure_init(void);
void precopy_add_notifier(Notifier *n);
void precopy_remove_notifier(Notifier *n);
+void precopy_disable_bulk_stage(void);
void ram_mig_init(void);
void qemu_guest_free_page_hint(void *addr, size_t len);
diff --git a/migration/ram.c b/migration/ram.c
index 65b1223..8745ca3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -352,6 +352,15 @@ static void precopy_notify(PrecopyNotifyReason reason)
notifier_list_notify(&precopy_notifier_list, &reason);
}
+void precopy_disable_bulk_stage(void)
+{
+ if (!ram_state) {
+ return;
+ }
+
+ ram_state->ram_bulk_stage = false;
+}
+
uint64_t ram_bytes_remaining(void)
{
return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) :
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread* [virtio-dev] [PATCH v9 7/8] migration: move migrate_postcopy() to include/migration/misc.h
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
` (5 preceding siblings ...)
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 6/8] migration/ram.c: add a function to disable the bulk stage Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 8/8] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT Wei Wang
2018-11-27 3:11 ` [virtio-dev] Re: [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang
The ram save state notifier callback, for example the free page
optimization offerred by virtio-balloon, may need to check if
postcopy is in use, so move migrate_postcopy() to the outside header.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
include/migration/misc.h | 1 +
migration/migration.h | 2 --
2 files changed, 1 insertion(+), 2 deletions(-)
diff --git a/include/migration/misc.h b/include/migration/misc.h
index 67cb275..06b816f 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -70,6 +70,7 @@ bool migration_has_failed(MigrationState *);
/* ...and after the device transmission */
bool migration_in_postcopy_after_devices(MigrationState *);
void migration_global_dump(Monitor *mon);
+bool migrate_postcopy(void);
/* migration/block-dirty-bitmap.c */
void dirty_bitmap_mig_init(void);
diff --git a/migration/migration.h b/migration/migration.h
index e413d4d..e1e92ff 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -249,8 +249,6 @@ bool migration_is_blocked(Error **errp);
bool migration_in_postcopy(void);
MigrationState *migrate_get_current(void);
-bool migrate_postcopy(void);
-
bool migrate_release_ram(void);
bool migrate_postcopy_ram(void);
bool migrate_zero_blocks(void);
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread* [virtio-dev] [PATCH v9 8/8] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
` (6 preceding siblings ...)
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 7/8] migration: move migrate_postcopy() to include/migration/misc.h Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
2018-11-27 3:11 ` [virtio-dev] Re: [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang
The new feature enables the virtio-balloon device to receive hints of
guest free pages from the free page vq.
A notifier is registered to the migration precopy notifier chain. The
notifier calls free_page_start after the migration thread syncs the dirty
bitmap, so that the free page optimization starts to clear bits of free
pages from the bitmap. It calls the free_page_stop before the migration
thread syncs the bitmap, which is the end of the current round of ram
save. The free_page_stop is also called to stop the optimization in the
case when there is an error occurred in the process of ram saving.
Note: balloon will report pages which were free at the time of this call.
As the reporting happens asynchronously, dirty bit logging must be
enabled before this free_page_start call is made. Guest reporting must be
disabled before the migration dirty bitmap is synchronized.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
hw/virtio/virtio-balloon.c | 255 ++++++++++++++++++++++++
include/hw/virtio/virtio-balloon.h | 28 ++-
include/standard-headers/linux/virtio_balloon.h | 5 +
3 files changed, 287 insertions(+), 1 deletion(-)
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 1728e4f..4a3eb30 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -27,6 +27,7 @@
#include "qapi/visitor.h"
#include "trace.h"
#include "qemu/error-report.h"
+#include "migration/misc.h"
#include "hw/virtio/virtio-bus.h"
#include "hw/virtio/virtio-access.h"
@@ -308,6 +309,176 @@ out:
}
}
+static void virtio_balloon_handle_free_page_vq(VirtIODevice *vdev,
+ VirtQueue *vq)
+{
+ VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
+ qemu_bh_schedule(s->free_page_bh);
+}
+
+static bool get_free_page_hints(VirtIOBalloon *dev)
+{
+ VirtQueueElement *elem;
+ VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+ VirtQueue *vq = dev->free_page_vq;
+
+ while (dev->block_iothread) {
+ qemu_cond_wait(&dev->free_page_cond, &dev->free_page_lock);
+ }
+
+ elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+ if (!elem) {
+ return false;
+ }
+
+ if (elem->out_num) {
+ uint32_t id;
+ size_t size = iov_to_buf(elem->out_sg, elem->out_num, 0,
+ &id, sizeof(id));
+ virtqueue_push(vq, elem, size);
+ g_free(elem);
+
+ virtio_tswap32s(vdev, &id);
+ if (unlikely(size != sizeof(id))) {
+ virtio_error(vdev, "received an incorrect cmd id");
+ return false;
+ }
+ if (id == dev->free_page_report_cmd_id) {
+ dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
+ } else {
+ /*
+ * Stop the optimization only when it has started. This
+ * avoids a stale stop sign for the previous command.
+ */
+ if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+ dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+ }
+ }
+ }
+
+ if (elem->in_num) {
+ if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+ qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
+ elem->in_sg[0].iov_len);
+ }
+ virtqueue_push(vq, elem, 1);
+ g_free(elem);
+ }
+
+ return true;
+}
+
+static void virtio_ballloon_get_free_page_hints(void *opaque)
+{
+ VirtIOBalloon *dev = opaque;
+ VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+ VirtQueue *vq = dev->free_page_vq;
+ bool continue_to_get_hints;
+
+ do {
+ qemu_mutex_lock(&dev->free_page_lock);
+ virtio_queue_set_notification(vq, 0);
+ continue_to_get_hints = get_free_page_hints(dev);
+ qemu_mutex_unlock(&dev->free_page_lock);
+ virtio_notify(vdev, vq);
+ /*
+ * Start to poll the vq once the reporting started. Otherwise, continue
+ * only when there are entries on the vq, which need to be given back.
+ */
+ } while (continue_to_get_hints ||
+ dev->free_page_report_status == FREE_PAGE_REPORT_S_START);
+ virtio_queue_set_notification(vq, 1);
+}
+
+static bool virtio_balloon_free_page_support(void *opaque)
+{
+ VirtIOBalloon *s = opaque;
+ VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+ return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
+}
+
+static void virtio_balloon_free_page_start(VirtIOBalloon *s)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+ /* For the stop and copy phase, we don't need to start the optimization */
+ if (!vdev->vm_running) {
+ return;
+ }
+
+ if (s->free_page_report_cmd_id == UINT_MAX) {
+ s->free_page_report_cmd_id =
+ VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+ } else {
+ s->free_page_report_cmd_id++;
+ }
+
+ s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
+ virtio_notify_config(vdev);
+}
+
+static void virtio_balloon_free_page_stop(VirtIOBalloon *s)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+ if (s->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
+ /*
+ * The lock also guarantees us that the
+ * virtio_ballloon_get_free_page_hints exits after the
+ * free_page_report_status is set to S_STOP.
+ */
+ qemu_mutex_lock(&s->free_page_lock);
+ /*
+ * The guest hasn't done the reporting, so host sends a notification
+ * to the guest to actively stop the reporting.
+ */
+ s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+ qemu_mutex_unlock(&s->free_page_lock);
+ virtio_notify_config(vdev);
+ }
+}
+
+static void virtio_balloon_free_page_done(VirtIOBalloon *s)
+{
+ VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+ s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
+ virtio_notify_config(vdev);
+}
+
+static void virtio_balloon_free_page_report_notify(Notifier *notifier,
+ void *data)
+{
+ VirtIOBalloon *dev = container_of(notifier, VirtIOBalloon,
+ free_page_report_notify);
+ VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+ PrecopyNotifyReason reason = *(PrecopyNotifyReason *)data;
+
+ if (!virtio_balloon_free_page_support(dev) || migrate_postcopy()) {
+ return;
+ }
+
+ switch (reason) {
+ case PRECOPY_NOTIFY_START_ITERATION:
+ precopy_disable_bulk_stage();
+ break;
+ case PRECOPY_NOTIFY_ERR:
+ case PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP:
+ virtio_balloon_free_page_stop(dev);
+ break;
+ case PRECOPY_NOTIFY_AFTER_SYNC_BITMAP:
+ if (vdev->vm_running) {
+ virtio_balloon_free_page_start(dev);
+ } else {
+ virtio_balloon_free_page_done(dev);
+ }
+ break;
+ default:
+ virtio_error(vdev, "%s: %d reason unknown", __func__, reason);
+ }
+}
+
static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
{
VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
@@ -316,6 +487,17 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
config.num_pages = cpu_to_le32(dev->num_pages);
config.actual = cpu_to_le32(dev->actual);
+ if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
+ config.free_page_report_cmd_id =
+ cpu_to_le32(dev->free_page_report_cmd_id);
+ } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
+ config.free_page_report_cmd_id =
+ cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID);
+ } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_DONE) {
+ config.free_page_report_cmd_id =
+ cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_DONE_ID);
+ }
+
trace_virtio_balloon_get_config(config.num_pages, config.actual);
memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
}
@@ -376,6 +558,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
f |= dev->host_features;
virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
+
return f;
}
@@ -412,6 +595,18 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
return 0;
}
+static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
+ .name = "virtio-balloon-device/free-page-report",
+ .version_id = 1,
+ .minimum_version_id = 1,
+ .needed = virtio_balloon_free_page_support,
+ .fields = (VMStateField[]) {
+ VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
+ VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
+ VMSTATE_END_OF_LIST()
+ }
+};
+
static const VMStateDescription vmstate_virtio_balloon_device = {
.name = "virtio-balloon-device",
.version_id = 1,
@@ -422,6 +617,10 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
VMSTATE_UINT32(actual, VirtIOBalloon),
VMSTATE_END_OF_LIST()
},
+ .subsections = (const VMStateDescription * []) {
+ &vmstate_virtio_balloon_free_page_report,
+ NULL
+ }
};
static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
@@ -446,6 +645,29 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
+ if (virtio_has_feature(s->host_features,
+ VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
+ s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
+ virtio_balloon_handle_free_page_vq);
+ s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+ s->free_page_report_cmd_id =
+ VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+ s->free_page_report_notify.notify =
+ virtio_balloon_free_page_report_notify;
+ precopy_add_notifier(&s->free_page_report_notify);
+ if (s->iothread) {
+ object_ref(OBJECT(s->iothread));
+ s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread),
+ virtio_ballloon_get_free_page_hints, s);
+ qemu_mutex_init(&s->free_page_lock);
+ qemu_cond_init(&s->free_page_cond);
+ s->block_iothread = false;
+ } else {
+ /* Simply disable this feature if the iothread wasn't created. */
+ s->host_features &= ~(1 << VIRTIO_BALLOON_F_FREE_PAGE_HINT);
+ virtio_error(vdev, "iothread is missing");
+ }
+ }
reset_stats(s);
}
@@ -454,6 +676,11 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp)
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
VirtIOBalloon *s = VIRTIO_BALLOON(dev);
+ if (virtio_balloon_free_page_support(s)) {
+ qemu_bh_delete(s->free_page_bh);
+ virtio_balloon_free_page_stop(s);
+ precopy_remove_notifier(&s->free_page_report_notify);
+ }
balloon_stats_destroy_timer(s);
qemu_remove_balloon_handler(s);
virtio_cleanup(vdev);
@@ -463,6 +690,10 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
{
VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
+ if (virtio_balloon_free_page_support(s)) {
+ virtio_balloon_free_page_stop(s);
+ }
+
if (s->stats_vq_elem != NULL) {
virtqueue_unpop(s->svq, s->stats_vq_elem, 0);
g_free(s->stats_vq_elem);
@@ -480,6 +711,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
* was stopped */
virtio_balloon_receive_stats(vdev, s->svq);
}
+
+ if (virtio_balloon_free_page_support(s)) {
+ /*
+ * The VM is woken up and the iothread was blocked, so signal it to
+ * continue.
+ */
+ if (vdev->vm_running && s->block_iothread) {
+ qemu_mutex_lock(&s->free_page_lock);
+ s->block_iothread = false;
+ qemu_cond_signal(&s->free_page_cond);
+ qemu_mutex_unlock(&s->free_page_lock);
+ }
+
+ /* The VM is stopped, block the iothread. */
+ if (!vdev->vm_running) {
+ qemu_mutex_lock(&s->free_page_lock);
+ s->block_iothread = true;
+ qemu_mutex_unlock(&s->free_page_lock);
+ }
+ }
}
static void virtio_balloon_instance_init(Object *obj)
@@ -508,6 +759,10 @@ static const VMStateDescription vmstate_virtio_balloon = {
static Property virtio_balloon_properties[] = {
DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
+ DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
+ VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
+ DEFINE_PROP_LINK("iothread", VirtIOBalloon, iothread, TYPE_IOTHREAD,
+ IOThread *),
DEFINE_PROP_END_OF_LIST(),
};
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index e0df352..df1d632 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -17,11 +17,14 @@
#include "standard-headers/linux/virtio_balloon.h"
#include "hw/virtio/virtio.h"
+#include "sysemu/iothread.h"
#define TYPE_VIRTIO_BALLOON "virtio-balloon-device"
#define VIRTIO_BALLOON(obj) \
OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON)
+#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000
+
typedef struct virtio_balloon_stat VirtIOBalloonStat;
typedef struct virtio_balloon_stat_modern {
@@ -30,15 +33,38 @@ typedef struct virtio_balloon_stat_modern {
uint64_t val;
} VirtIOBalloonStatModern;
+enum virtio_balloon_free_page_report_status {
+ FREE_PAGE_REPORT_S_STOP = 0,
+ FREE_PAGE_REPORT_S_REQUESTED = 1,
+ FREE_PAGE_REPORT_S_START = 2,
+ FREE_PAGE_REPORT_S_DONE = 3,
+};
+
typedef struct VirtIOBalloon {
VirtIODevice parent_obj;
- VirtQueue *ivq, *dvq, *svq;
+ VirtQueue *ivq, *dvq, *svq, *free_page_vq;
+ uint32_t free_page_report_status;
uint32_t num_pages;
uint32_t actual;
+ uint32_t free_page_report_cmd_id;
uint64_t stats[VIRTIO_BALLOON_S_NR];
VirtQueueElement *stats_vq_elem;
size_t stats_vq_offset;
QEMUTimer *stats_timer;
+ IOThread *iothread;
+ QEMUBH *free_page_bh;
+ /*
+ * Lock to synchronize threads to access the free page reporting related
+ * fields (e.g. free_page_report_status).
+ */
+ QemuMutex free_page_lock;
+ QemuCond free_page_cond;
+ /*
+ * Set to block iothread to continue reading free page hints as the VM is
+ * stopped.
+ */
+ bool block_iothread;
+ Notifier free_page_report_notify;
int64_t stats_last_update;
int64_t stats_poll_interval;
uint32_t host_features;
diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index 4dbb7dc..9eee1c6 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -34,15 +34,20 @@
#define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */
#define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */
#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
/* Size of a PFN in the balloon interface. */
#define VIRTIO_BALLOON_PFN_SHIFT 12
+#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0
+#define VIRTIO_BALLOON_FREE_PAGE_REPORT_DONE_ID 1
struct virtio_balloon_config {
/* Number of pages host wants Guest to give up. */
uint32_t num_pages;
/* Number of pages we've actually got in balloon. */
uint32_t actual;
+ /* Free page report command id, readonly by guest */
+ uint32_t free_page_report_cmd_id;
};
#define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */
--
1.8.3.1
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply related [flat|nested] 20+ messages in thread* [virtio-dev] Re: [PATCH v9 0/8] virtio-balloon: free page hint support
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
` (7 preceding siblings ...)
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 8/8] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT Wei Wang
@ 2018-11-27 3:11 ` Wei Wang
8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-27 3:11 UTC (permalink / raw)
To: qemu-devel, virtio-dev, mst, quintela, dgilbert
Cc: peterx, pbonzini, liliang.opensource, nilal, riel
On 11/15/2018 06:07 PM, Wei Wang wrote:
> This is the deivce part implementation to add a new feature,
> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> receives the guest free page hints from the driver and clears the
> corresponding bits in the dirty bitmap, so that those free pages are
> not sent by the migration thread to the destination.
>
> *Tests
> 1 Test Environment
> Host: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
> Migration setup: migrate_set_speed 100G, migrate_set_downtime 400ms
>
> 2 Test Results (results are averaged over several repeated runs)
> 2.1 Guest setup: 8G RAM, 4 vCPU
> 2.1.1 Idle guest live migration time
> Optimization v.s. Legacy = 620ms vs 2970ms
> --> ~79% reduction
> 2.1.2 Guest live migration with Linux compilation workload
> (i.e. make bzImage -j4) running
> 1) Live Migration Time:
> Optimization v.s. Legacy = 2273ms v.s. 4502ms
> --> ~50% reduction
> 2) Linux Compilation Time:
> Optimization v.s. Legacy = 8min42s v.s. 8min43s
> --> no obvious difference
>
> 2.2 Guest setup: 128G RAM, 4 vCPU
> 2.2.1 Idle guest live migration time
> Optimization v.s. Legacy = 5294ms vs 41651ms
> --> ~87% reduction
> 2.2.2 Guest live migration with Linux compilation workload
> 1) Live Migration Time:
> Optimization v.s. Legacy = 8816ms v.s. 54201ms
> --> 84% reduction
> 2) Linux Compilation Time:
> Optimization v.s. Legacy = 8min30s v.s. 8min36s
> --> no obvious difference
>
> ChangeLog:
> v8->v9:
> bitmap:
> - fix bitmap_count_one to handle the nbits=0 case
> migration:
> - replace the ram save notifier chain with a more general precopy
> notifier chain, which is similar to the postcopy notifier chain.
> - Avoid exposing the RAMState struct, and add a function,
> precopy_disable_bulk_stage, to let the virtio-balloon notifier
> callback to disable the bulk stage flag.
Hi Dave and Peter,
Could you continue to review the patches? Thanks!
Best,
Wei
---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
^ permalink raw reply [flat|nested] 20+ messages in thread