Discussion of the implementations of VIRTIO specification
 help / color / mirror / Atom feed
* [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support
@ 2018-11-15 10:07 Wei Wang
  2018-11-15 10:07 ` [virtio-dev] [PATCH v9 1/8] bitmap: fix bitmap_count_one Wei Wang
                   ` (8 more replies)
  0 siblings, 9 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:07 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

This is the deivce part implementation to add a new feature,
VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
receives the guest free page hints from the driver and clears the
corresponding bits in the dirty bitmap, so that those free pages are
not sent by the migration thread to the destination.

*Tests
1 Test Environment
    Host: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
    Migration setup: migrate_set_speed 100G, migrate_set_downtime 400ms

2 Test Results (results are averaged over several repeated runs)
    2.1 Guest setup: 8G RAM, 4 vCPU
        2.1.1 Idle guest live migration time
            Optimization v.s. Legacy = 620ms vs 2970ms
            --> ~79% reduction
        2.1.2 Guest live migration with Linux compilation workload
          (i.e. make bzImage -j4) running
          1) Live Migration Time:
             Optimization v.s. Legacy = 2273ms v.s. 4502ms
             --> ~50% reduction
          2) Linux Compilation Time:
             Optimization v.s. Legacy = 8min42s v.s. 8min43s
             --> no obvious difference

    2.2 Guest setup: 128G RAM, 4 vCPU
        2.2.1 Idle guest live migration time
            Optimization v.s. Legacy = 5294ms vs 41651ms
            --> ~87% reduction
        2.2.2 Guest live migration with Linux compilation workload
          1) Live Migration Time:
            Optimization v.s. Legacy = 8816ms v.s. 54201ms
            --> 84% reduction
          2) Linux Compilation Time:
             Optimization v.s. Legacy = 8min30s v.s. 8min36s
             --> no obvious difference

ChangeLog:
v8->v9:
bitmap:
    - fix bitmap_count_one to handle the nbits=0 case
migration:
    - replace the ram save notifier chain with a more general precopy
      notifier chain, which is similar to the postcopy notifier chain.
    - Avoid exposing the RAMState struct, and add a function,
      precopy_disable_bulk_stage, to let the virtio-balloon notifier
      callback to disable the bulk stage flag.
virtio-balloon:
    - Remove VIRTIO_BALLOON_F_PAGE_POISON from this series as it is not
      a limit to the free page optimization now. Plan to add the device
      support for VIRTIO_BALLOON_F_PAGE_POISON in another patch later.

Previous changelog:
https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg01908.html

Wei Wang (8):
  bitmap: fix bitmap_count_one
  bitmap: bitmap_count_one_with_offset
  migration: use bitmap_mutex in migration_bitmap_clear_dirty
  migration: API to clear bits of guest free pages from the dirty bitmap
  migration/ram.c: add a notifier chain for precopy
  migration/ram.c: add a function to disable the bulk stage
  migration: move migrate_postcopy() to include/migration/misc.h
  virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

 hw/virtio/virtio-balloon.c                      | 255 ++++++++++++++++++++++++
 include/hw/virtio/virtio-balloon.h              |  28 ++-
 include/migration/misc.h                        |  16 ++
 include/qemu/bitmap.h                           |  17 ++
 include/standard-headers/linux/virtio_balloon.h |   5 +
 migration/migration.h                           |   2 -
 migration/ram.c                                 |  91 +++++++++
 vl.c                                            |   1 +
 8 files changed, 412 insertions(+), 3 deletions(-)

-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] [PATCH v9 1/8] bitmap: fix bitmap_count_one
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
@ 2018-11-15 10:07 ` Wei Wang
  2018-11-15 10:07 ` [virtio-dev] [PATCH v9 2/8] bitmap: bitmap_count_one_with_offset Wei Wang
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:07 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

BITMAP_LAST_WORD_MASK(nbits) returns 0xffffffff when "nbits=0", which
makes bitmap_count_one fail to handle the "nbits=0" case. It appears to be
preferred to remain BITMAP_LAST_WORD_MASK identical to the kernel
implementation that it is ported from.

So this patch fixes bitmap_count_one to handle the nbits=0 case.

Inital Discussion Link:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg554316.html
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
 include/qemu/bitmap.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 509eedd..679f1bd 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -221,6 +221,10 @@ static inline int bitmap_intersects(const unsigned long *src1,
 
 static inline long bitmap_count_one(const unsigned long *bitmap, long nbits)
 {
+    if (unlikely(!nbits)) {
+        return 0;
+    }
+
     if (small_nbits(nbits)) {
         return ctpopl(*bitmap & BITMAP_LAST_WORD_MASK(nbits));
     } else {
-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [virtio-dev] [PATCH v9 2/8] bitmap: bitmap_count_one_with_offset
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
  2018-11-15 10:07 ` [virtio-dev] [PATCH v9 1/8] bitmap: fix bitmap_count_one Wei Wang
@ 2018-11-15 10:07 ` Wei Wang
  2018-11-15 10:07 ` [virtio-dev] [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty Wei Wang
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:07 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

Count the number of 1s in a bitmap starting from an offset.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/qemu/bitmap.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
index 679f1bd..5c31334 100644
--- a/include/qemu/bitmap.h
+++ b/include/qemu/bitmap.h
@@ -232,6 +232,19 @@ static inline long bitmap_count_one(const unsigned long *bitmap, long nbits)
     }
 }
 
+static inline long bitmap_count_one_with_offset(const unsigned long *bitmap,
+                                                long offset, long nbits)
+{
+    long aligned_offset = QEMU_ALIGN_DOWN(offset, BITS_PER_LONG);
+    long redundant_bits = offset - aligned_offset;
+    long bits_to_count = nbits + redundant_bits;
+    const unsigned long *bitmap_start = bitmap +
+                                        aligned_offset / BITS_PER_LONG;
+
+    return bitmap_count_one(bitmap_start, bits_to_count) -
+           bitmap_count_one(bitmap_start, redundant_bits);
+}
+
 void bitmap_set(unsigned long *map, long i, long len);
 void bitmap_set_atomic(unsigned long *map, long i, long len);
 void bitmap_clear(unsigned long *map, long start, long nr);
-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [virtio-dev] [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
  2018-11-15 10:07 ` [virtio-dev] [PATCH v9 1/8] bitmap: fix bitmap_count_one Wei Wang
  2018-11-15 10:07 ` [virtio-dev] [PATCH v9 2/8] bitmap: bitmap_count_one_with_offset Wei Wang
@ 2018-11-15 10:07 ` Wei Wang
       [not found]   ` <20181127054056.GA3205@xz-x1>
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 4/8] migration: API to clear bits of guest free pages from the dirty bitmap Wei Wang
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:07 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

The bitmap mutex is used to synchronize threads to update the dirty
bitmap and the migration_dirty_pages counter. For example, the free
page optimization clears bits of free pages from the bitmap in an
iothread context. This patch makes migration_bitmap_clear_dirty update
the bitmap and counter under the mutex.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
 migration/ram.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 7e7deec..ef69dbe 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1556,11 +1556,14 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
 {
     bool ret;
 
+    qemu_mutex_lock(&rs->bitmap_mutex);
     ret = test_and_clear_bit(page, rb->bmap);
 
     if (ret) {
         rs->migration_dirty_pages--;
     }
+    qemu_mutex_unlock(&rs->bitmap_mutex);
+
     return ret;
 }
 
-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [virtio-dev] [PATCH v9 4/8] migration: API to clear bits of guest free pages from the dirty bitmap
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
                   ` (2 preceding siblings ...)
  2018-11-15 10:07 ` [virtio-dev] [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
       [not found]   ` <20181127060638.GB3205@xz-x1>
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy Wei Wang
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

This patch adds an API to clear bits corresponding to guest free pages
from the dirty bitmap. Spilt the free page block if it crosses the QEMU
RAMBlock boundary.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
 include/migration/misc.h |  2 ++
 migration/ram.c          | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 4ebf24c..113320e 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -14,11 +14,13 @@
 #ifndef MIGRATION_MISC_H
 #define MIGRATION_MISC_H
 
+#include "exec/cpu-common.h"
 #include "qemu/notify.h"
 
 /* migration/ram.c */
 
 void ram_mig_init(void);
+void qemu_guest_free_page_hint(void *addr, size_t len);
 
 /* migration/block.c */
 
diff --git a/migration/ram.c b/migration/ram.c
index ef69dbe..229b791 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3131,6 +3131,54 @@ static void ram_state_resume_prepare(RAMState *rs, QEMUFile *out)
 }
 
 /*
+ * This function clears bits of the free pages reported by the caller from the
+ * migration dirty bitmap. @addr is the host address corresponding to the
+ * start of the continuous guest free pages, and @len is the total bytes of
+ * those pages.
+ */
+void qemu_guest_free_page_hint(void *addr, size_t len)
+{
+    RAMBlock *block;
+    ram_addr_t offset;
+    size_t used_len, start, npages;
+    MigrationState *s = migrate_get_current();
+
+    /* This function is currently expected to be used during live migration */
+    if (!migration_is_setup_or_active(s->state)) {
+        return;
+    }
+
+    for (; len > 0; len -= used_len) {
+        block = qemu_ram_block_from_host(addr, false, &offset);
+        assert(block);
+
+        /*
+         * This handles the case that the RAMBlock is resized after the free
+         * page hint is reported.
+         */
+        if (unlikely(offset > block->used_length)) {
+            return;
+        }
+
+        if (len <= block->used_length - offset) {
+            used_len = len;
+        } else {
+            used_len = block->used_length - offset;
+            addr += used_len;
+        }
+
+        start = offset >> TARGET_PAGE_BITS;
+        npages = used_len >> TARGET_PAGE_BITS;
+
+        qemu_mutex_lock(&ram_state->bitmap_mutex);
+        ram_state->migration_dirty_pages -=
+                      bitmap_count_one_with_offset(block->bmap, start, npages);
+        bitmap_clear(block->bmap, start, npages);
+        qemu_mutex_unlock(&ram_state->bitmap_mutex);
+    }
+}
+
+/*
  * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
  * long-running RCU critical section.  When rcu-reclaims in the code
  * start to become numerous it will be necessary to reduce the
-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [virtio-dev] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
                   ` (3 preceding siblings ...)
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 4/8] migration: API to clear bits of guest free pages from the dirty bitmap Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
       [not found]   ` <20181127073802.GC3205@xz-x1>
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 6/8] migration/ram.c: add a function to disable the bulk stage Wei Wang
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

This patch adds a notifier chain for the memory precopy. This enables various
precopy optimizations to be invoked at specific places.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
 include/migration/misc.h | 12 ++++++++++++
 migration/ram.c          | 31 +++++++++++++++++++++++++++++++
 vl.c                     |  1 +
 3 files changed, 44 insertions(+)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 113320e..0bac623 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -19,6 +19,18 @@
 
 /* migration/ram.c */
 
+typedef enum PrecopyNotifyReason {
+    PRECOPY_NOTIFY_ERR = 0,
+    PRECOPY_NOTIFY_START_ITERATION = 1,
+    PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP = 2,
+    PRECOPY_NOTIFY_AFTER_SYNC_BITMAP = 3,
+    PRECOPY_NOTIFY_MAX = 4,
+} PrecopyNotifyReason;
+
+void precopy_infrastructure_init(void);
+void precopy_add_notifier(Notifier *n);
+void precopy_remove_notifier(Notifier *n);
+
 void ram_mig_init(void);
 void qemu_guest_free_page_hint(void *addr, size_t len);
 
diff --git a/migration/ram.c b/migration/ram.c
index 229b791..65b1223 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -292,6 +292,8 @@ struct RAMState {
     bool ram_bulk_stage;
     /* How many times we have dirty too many pages */
     int dirty_rate_high_cnt;
+    /* ram save states used for notifiers */
+    int ram_save_state;
     /* these variables are used for bitmap sync */
     /* last time we did a full bitmap_sync */
     int64_t time_last_bitmap_sync;
@@ -328,6 +330,28 @@ typedef struct RAMState RAMState;
 
 static RAMState *ram_state;
 
+static NotifierList precopy_notifier_list;
+
+void precopy_infrastructure_init(void)
+{
+    notifier_list_init(&precopy_notifier_list);
+}
+
+void precopy_add_notifier(Notifier *n)
+{
+    notifier_list_add(&precopy_notifier_list, n);
+}
+
+void precopy_remove_notifier(Notifier *n)
+{
+    notifier_remove(n);
+}
+
+static void precopy_notify(PrecopyNotifyReason reason)
+{
+    notifier_list_notify(&precopy_notifier_list, &reason);
+}
+
 uint64_t ram_bytes_remaining(void)
 {
     return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) :
@@ -1642,6 +1666,8 @@ static void migration_bitmap_sync(RAMState *rs)
     int64_t end_time;
     uint64_t bytes_xfer_now;
 
+    precopy_notify(PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP);
+
     ram_counters.dirty_sync_count++;
 
     if (!rs->time_last_bitmap_sync) {
@@ -1699,6 +1725,8 @@ static void migration_bitmap_sync(RAMState *rs)
     if (migrate_use_events()) {
         qapi_event_send_migration_pass(ram_counters.dirty_sync_count);
     }
+
+    precopy_notify(PRECOPY_NOTIFY_AFTER_SYNC_BITMAP);
 }
 
 /**
@@ -2555,6 +2583,8 @@ static void ram_state_reset(RAMState *rs)
     rs->last_page = 0;
     rs->last_version = ram_list.version;
     rs->ram_bulk_stage = true;
+
+    precopy_notify(PRECOPY_NOTIFY_START_ITERATION);
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -3324,6 +3354,7 @@ out:
 
     ret = qemu_file_get_error(f);
     if (ret < 0) {
+        precopy_notify(PRECOPY_NOTIFY_ERR);
         return ret;
     }
 
diff --git a/vl.c b/vl.c
index fa25d1a..48ae0e8 100644
--- a/vl.c
+++ b/vl.c
@@ -3057,6 +3057,7 @@ int main(int argc, char **argv, char **envp)
     module_call_init(MODULE_INIT_OPTS);
 
     runstate_init();
+    precopy_infrastructure_init();
     postcopy_infrastructure_init();
     monitor_init_globals();
 
-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [virtio-dev] [PATCH v9 6/8] migration/ram.c: add a function to disable the bulk stage
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
                   ` (4 preceding siblings ...)
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 7/8] migration: move migrate_postcopy() to include/migration/misc.h Wei Wang
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

This patch adds a function to enable a precopy notifier callback outside
the migration subsystem to disable the bulk stage flag. This is needed by
the free page optimization offered by virtio-balloon.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
 include/migration/misc.h | 1 +
 migration/ram.c          | 9 +++++++++
 2 files changed, 10 insertions(+)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 0bac623..67cb275 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -30,6 +30,7 @@ typedef enum PrecopyNotifyReason {
 void precopy_infrastructure_init(void);
 void precopy_add_notifier(Notifier *n);
 void precopy_remove_notifier(Notifier *n);
+void precopy_disable_bulk_stage(void);
 
 void ram_mig_init(void);
 void qemu_guest_free_page_hint(void *addr, size_t len);
diff --git a/migration/ram.c b/migration/ram.c
index 65b1223..8745ca3 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -352,6 +352,15 @@ static void precopy_notify(PrecopyNotifyReason reason)
     notifier_list_notify(&precopy_notifier_list, &reason);
 }
 
+void precopy_disable_bulk_stage(void)
+{
+    if (!ram_state) {
+        return;
+    }
+
+    ram_state->ram_bulk_stage = false;
+}
+
 uint64_t ram_bytes_remaining(void)
 {
     return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) :
-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [virtio-dev] [PATCH v9 7/8] migration: move migrate_postcopy() to include/migration/misc.h
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
                   ` (5 preceding siblings ...)
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 6/8] migration/ram.c: add a function to disable the bulk stage Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 8/8] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT Wei Wang
  2018-11-27  3:11 ` [virtio-dev] Re: [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
  8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

The ram save state notifier callback, for example the free page
optimization offerred by virtio-balloon, may need to check if
postcopy is in use, so move migrate_postcopy() to the outside header.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/misc.h | 1 +
 migration/migration.h    | 2 --
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/include/migration/misc.h b/include/migration/misc.h
index 67cb275..06b816f 100644
--- a/include/migration/misc.h
+++ b/include/migration/misc.h
@@ -70,6 +70,7 @@ bool migration_has_failed(MigrationState *);
 /* ...and after the device transmission */
 bool migration_in_postcopy_after_devices(MigrationState *);
 void migration_global_dump(Monitor *mon);
+bool migrate_postcopy(void);
 
 /* migration/block-dirty-bitmap.c */
 void dirty_bitmap_mig_init(void);
diff --git a/migration/migration.h b/migration/migration.h
index e413d4d..e1e92ff 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -249,8 +249,6 @@ bool migration_is_blocked(Error **errp);
 bool migration_in_postcopy(void);
 MigrationState *migrate_get_current(void);
 
-bool migrate_postcopy(void);
-
 bool migrate_release_ram(void);
 bool migrate_postcopy_ram(void);
 bool migrate_zero_blocks(void);
-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [virtio-dev] [PATCH v9 8/8] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
                   ` (6 preceding siblings ...)
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 7/8] migration: move migrate_postcopy() to include/migration/misc.h Wei Wang
@ 2018-11-15 10:08 ` Wei Wang
  2018-11-27  3:11 ` [virtio-dev] Re: [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
  8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-15 10:08 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel, wei.w.wang

The new feature enables the virtio-balloon device to receive hints of
guest free pages from the free page vq.

A notifier is registered to the migration precopy notifier chain. The
notifier calls free_page_start after the migration thread syncs the dirty
bitmap, so that the free page optimization starts to clear bits of free
pages from the bitmap. It calls the free_page_stop before the migration
thread syncs the bitmap, which is the end of the current round of ram
save. The free_page_stop is also called to stop the optimization in the
case when there is an error occurred in the process of ram saving.

Note: balloon will report pages which were free at the time of this call.
As the reporting happens asynchronously, dirty bit logging must be
enabled before this free_page_start call is made. Guest reporting must be
disabled before the migration dirty bitmap is synchronized.

Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Peter Xu <peterx@redhat.com>
---
 hw/virtio/virtio-balloon.c                      | 255 ++++++++++++++++++++++++
 include/hw/virtio/virtio-balloon.h              |  28 ++-
 include/standard-headers/linux/virtio_balloon.h |   5 +
 3 files changed, 287 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 1728e4f..4a3eb30 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -27,6 +27,7 @@
 #include "qapi/visitor.h"
 #include "trace.h"
 #include "qemu/error-report.h"
+#include "migration/misc.h"
 
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
@@ -308,6 +309,176 @@ out:
     }
 }
 
+static void virtio_balloon_handle_free_page_vq(VirtIODevice *vdev,
+                                               VirtQueue *vq)
+{
+    VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
+    qemu_bh_schedule(s->free_page_bh);
+}
+
+static bool get_free_page_hints(VirtIOBalloon *dev)
+{
+    VirtQueueElement *elem;
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtQueue *vq = dev->free_page_vq;
+
+    while (dev->block_iothread) {
+        qemu_cond_wait(&dev->free_page_cond, &dev->free_page_lock);
+    }
+
+    elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
+    if (!elem) {
+        return false;
+    }
+
+    if (elem->out_num) {
+        uint32_t id;
+        size_t size = iov_to_buf(elem->out_sg, elem->out_num, 0,
+                                 &id, sizeof(id));
+        virtqueue_push(vq, elem, size);
+        g_free(elem);
+
+        virtio_tswap32s(vdev, &id);
+        if (unlikely(size != sizeof(id))) {
+            virtio_error(vdev, "received an incorrect cmd id");
+            return false;
+        }
+        if (id == dev->free_page_report_cmd_id) {
+            dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
+        } else {
+            /*
+             * Stop the optimization only when it has started. This
+             * avoids a stale stop sign for the previous command.
+             */
+            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+            }
+        }
+    }
+
+    if (elem->in_num) {
+        if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+            qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
+                                      elem->in_sg[0].iov_len);
+        }
+        virtqueue_push(vq, elem, 1);
+        g_free(elem);
+    }
+
+    return true;
+}
+
+static void virtio_ballloon_get_free_page_hints(void *opaque)
+{
+    VirtIOBalloon *dev = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    VirtQueue *vq = dev->free_page_vq;
+    bool continue_to_get_hints;
+
+    do {
+        qemu_mutex_lock(&dev->free_page_lock);
+        virtio_queue_set_notification(vq, 0);
+        continue_to_get_hints = get_free_page_hints(dev);
+        qemu_mutex_unlock(&dev->free_page_lock);
+        virtio_notify(vdev, vq);
+      /*
+       * Start to poll the vq once the reporting started. Otherwise, continue
+       * only when there are entries on the vq, which need to be given back.
+       */
+    } while (continue_to_get_hints ||
+             dev->free_page_report_status == FREE_PAGE_REPORT_S_START);
+    virtio_queue_set_notification(vq, 1);
+}
+
+static bool virtio_balloon_free_page_support(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT);
+}
+
+static void virtio_balloon_free_page_start(VirtIOBalloon *s)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    /* For the stop and copy phase, we don't need to start the optimization */
+    if (!vdev->vm_running) {
+        return;
+    }
+
+    if (s->free_page_report_cmd_id == UINT_MAX) {
+        s->free_page_report_cmd_id =
+                       VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+    } else {
+        s->free_page_report_cmd_id++;
+    }
+
+    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
+    virtio_notify_config(vdev);
+}
+
+static void virtio_balloon_free_page_stop(VirtIOBalloon *s)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    if (s->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
+        /*
+         * The lock also guarantees us that the
+         * virtio_ballloon_get_free_page_hints exits after the
+         * free_page_report_status is set to S_STOP.
+         */
+        qemu_mutex_lock(&s->free_page_lock);
+        /*
+         * The guest hasn't done the reporting, so host sends a notification
+         * to the guest to actively stop the reporting.
+         */
+        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+        qemu_mutex_unlock(&s->free_page_lock);
+        virtio_notify_config(vdev);
+    }
+}
+
+static void virtio_balloon_free_page_done(VirtIOBalloon *s)
+{
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
+    virtio_notify_config(vdev);
+}
+
+static void virtio_balloon_free_page_report_notify(Notifier *notifier,
+                                                   void *data)
+{
+    VirtIOBalloon *dev = container_of(notifier, VirtIOBalloon,
+                                      free_page_report_notify);
+    VirtIODevice *vdev = VIRTIO_DEVICE(dev);
+    PrecopyNotifyReason reason = *(PrecopyNotifyReason *)data;
+
+    if (!virtio_balloon_free_page_support(dev) || migrate_postcopy()) {
+        return;
+    }
+
+    switch (reason) {
+    case PRECOPY_NOTIFY_START_ITERATION:
+        precopy_disable_bulk_stage();
+        break;
+    case PRECOPY_NOTIFY_ERR:
+    case PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP:
+        virtio_balloon_free_page_stop(dev);
+        break;
+    case PRECOPY_NOTIFY_AFTER_SYNC_BITMAP:
+        if (vdev->vm_running) {
+            virtio_balloon_free_page_start(dev);
+        } else {
+            virtio_balloon_free_page_done(dev);
+        }
+        break;
+    default:
+        virtio_error(vdev, "%s: %d reason unknown", __func__, reason);
+    }
+}
+
 static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 {
     VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
@@ -316,6 +487,17 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
     config.num_pages = cpu_to_le32(dev->num_pages);
     config.actual = cpu_to_le32(dev->actual);
 
+    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
+        config.free_page_report_cmd_id =
+                       cpu_to_le32(dev->free_page_report_cmd_id);
+    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
+        config.free_page_report_cmd_id =
+                       cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID);
+    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_DONE) {
+        config.free_page_report_cmd_id =
+                       cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_DONE_ID);
+    }
+
     trace_virtio_balloon_get_config(config.num_pages, config.actual);
     memcpy(config_data, &config, sizeof(struct virtio_balloon_config));
 }
@@ -376,6 +558,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f,
     VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
     f |= dev->host_features;
     virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ);
+
     return f;
 }
 
@@ -412,6 +595,18 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
     return 0;
 }
 
+static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
+    .name = "virtio-balloon-device/free-page-report",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = virtio_balloon_free_page_support,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
+        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_virtio_balloon_device = {
     .name = "virtio-balloon-device",
     .version_id = 1,
@@ -422,6 +617,10 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
         VMSTATE_UINT32(actual, VirtIOBalloon),
         VMSTATE_END_OF_LIST()
     },
+    .subsections = (const VMStateDescription * []) {
+        &vmstate_virtio_balloon_free_page_report,
+        NULL
+    }
 };
 
 static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
@@ -446,6 +645,29 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
     s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output);
     s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats);
 
+    if (virtio_has_feature(s->host_features,
+                           VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
+        s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
+                                           virtio_balloon_handle_free_page_vq);
+        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+        s->free_page_report_cmd_id =
+                           VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+        s->free_page_report_notify.notify =
+                                       virtio_balloon_free_page_report_notify;
+        precopy_add_notifier(&s->free_page_report_notify);
+        if (s->iothread) {
+            object_ref(OBJECT(s->iothread));
+            s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread),
+                                       virtio_ballloon_get_free_page_hints, s);
+            qemu_mutex_init(&s->free_page_lock);
+            qemu_cond_init(&s->free_page_cond);
+            s->block_iothread = false;
+        } else {
+            /* Simply disable this feature if the iothread wasn't created. */
+            s->host_features &= ~(1 << VIRTIO_BALLOON_F_FREE_PAGE_HINT);
+            virtio_error(vdev, "iothread is missing");
+        }
+    }
     reset_stats(s);
 }
 
@@ -454,6 +676,11 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp)
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     VirtIOBalloon *s = VIRTIO_BALLOON(dev);
 
+    if (virtio_balloon_free_page_support(s)) {
+        qemu_bh_delete(s->free_page_bh);
+        virtio_balloon_free_page_stop(s);
+        precopy_remove_notifier(&s->free_page_report_notify);
+    }
     balloon_stats_destroy_timer(s);
     qemu_remove_balloon_handler(s);
     virtio_cleanup(vdev);
@@ -463,6 +690,10 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
 
+    if (virtio_balloon_free_page_support(s)) {
+        virtio_balloon_free_page_stop(s);
+    }
+
     if (s->stats_vq_elem != NULL) {
         virtqueue_unpop(s->svq, s->stats_vq_elem, 0);
         g_free(s->stats_vq_elem);
@@ -480,6 +711,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
          * was stopped */
         virtio_balloon_receive_stats(vdev, s->svq);
     }
+
+    if (virtio_balloon_free_page_support(s)) {
+        /*
+         * The VM is woken up and the iothread was blocked, so signal it to
+         * continue.
+         */
+        if (vdev->vm_running && s->block_iothread) {
+            qemu_mutex_lock(&s->free_page_lock);
+            s->block_iothread = false;
+            qemu_cond_signal(&s->free_page_cond);
+            qemu_mutex_unlock(&s->free_page_lock);
+        }
+
+        /* The VM is stopped, block the iothread. */
+        if (!vdev->vm_running) {
+            qemu_mutex_lock(&s->free_page_lock);
+            s->block_iothread = true;
+            qemu_mutex_unlock(&s->free_page_lock);
+        }
+    }
 }
 
 static void virtio_balloon_instance_init(Object *obj)
@@ -508,6 +759,10 @@ static const VMStateDescription vmstate_virtio_balloon = {
 static Property virtio_balloon_properties[] = {
     DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
                     VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
+    DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
+                    VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
+    DEFINE_PROP_LINK("iothread", VirtIOBalloon, iothread, TYPE_IOTHREAD,
+                     IOThread *),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index e0df352..df1d632 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -17,11 +17,14 @@
 
 #include "standard-headers/linux/virtio_balloon.h"
 #include "hw/virtio/virtio.h"
+#include "sysemu/iothread.h"
 
 #define TYPE_VIRTIO_BALLOON "virtio-balloon-device"
 #define VIRTIO_BALLOON(obj) \
         OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON)
 
+#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000
+
 typedef struct virtio_balloon_stat VirtIOBalloonStat;
 
 typedef struct virtio_balloon_stat_modern {
@@ -30,15 +33,38 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
+enum virtio_balloon_free_page_report_status {
+    FREE_PAGE_REPORT_S_STOP = 0,
+    FREE_PAGE_REPORT_S_REQUESTED = 1,
+    FREE_PAGE_REPORT_S_START = 2,
+    FREE_PAGE_REPORT_S_DONE = 3,
+};
+
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
-    VirtQueue *ivq, *dvq, *svq;
+    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
+    uint32_t free_page_report_status;
     uint32_t num_pages;
     uint32_t actual;
+    uint32_t free_page_report_cmd_id;
     uint64_t stats[VIRTIO_BALLOON_S_NR];
     VirtQueueElement *stats_vq_elem;
     size_t stats_vq_offset;
     QEMUTimer *stats_timer;
+    IOThread *iothread;
+    QEMUBH *free_page_bh;
+    /*
+     * Lock to synchronize threads to access the free page reporting related
+     * fields (e.g. free_page_report_status).
+     */
+    QemuMutex free_page_lock;
+    QemuCond  free_page_cond;
+    /*
+     * Set to block iothread to continue reading free page hints as the VM is
+     * stopped.
+     */
+    bool block_iothread;
+    Notifier free_page_report_notify;
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;
diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h
index 4dbb7dc..9eee1c6 100644
--- a/include/standard-headers/linux/virtio_balloon.h
+++ b/include/standard-headers/linux/virtio_balloon.h
@@ -34,15 +34,20 @@
 #define VIRTIO_BALLOON_F_MUST_TELL_HOST	0 /* Tell before reclaiming pages */
 #define VIRTIO_BALLOON_F_STATS_VQ	1 /* Memory Stats virtqueue */
 #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM	2 /* Deflate balloon on OOM */
+#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */
 
 /* Size of a PFN in the balloon interface. */
 #define VIRTIO_BALLOON_PFN_SHIFT 12
 
+#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0
+#define VIRTIO_BALLOON_FREE_PAGE_REPORT_DONE_ID 1
 struct virtio_balloon_config {
 	/* Number of pages host wants Guest to give up. */
 	uint32_t num_pages;
 	/* Number of pages we've actually got in balloon. */
 	uint32_t actual;
+	/* Free page report command id, readonly by guest */
+	uint32_t free_page_report_cmd_id;
 };
 
 #define VIRTIO_BALLOON_S_SWAP_IN  0   /* Amount of memory swapped in */
-- 
1.8.3.1


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 0/8] virtio-balloon: free page hint support
  2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
                   ` (7 preceding siblings ...)
  2018-11-15 10:08 ` [virtio-dev] [PATCH v9 8/8] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT Wei Wang
@ 2018-11-27  3:11 ` Wei Wang
  8 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-27  3:11 UTC (permalink / raw)
  To: qemu-devel, virtio-dev, mst, quintela, dgilbert
  Cc: peterx, pbonzini, liliang.opensource, nilal, riel

On 11/15/2018 06:07 PM, Wei Wang wrote:
> This is the deivce part implementation to add a new feature,
> VIRTIO_BALLOON_F_FREE_PAGE_HINT to the virtio-balloon device. The device
> receives the guest free page hints from the driver and clears the
> corresponding bits in the dirty bitmap, so that those free pages are
> not sent by the migration thread to the destination.
>
> *Tests
> 1 Test Environment
>      Host: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
>      Migration setup: migrate_set_speed 100G, migrate_set_downtime 400ms
>
> 2 Test Results (results are averaged over several repeated runs)
>      2.1 Guest setup: 8G RAM, 4 vCPU
>          2.1.1 Idle guest live migration time
>              Optimization v.s. Legacy = 620ms vs 2970ms
>              --> ~79% reduction
>          2.1.2 Guest live migration with Linux compilation workload
>            (i.e. make bzImage -j4) running
>            1) Live Migration Time:
>               Optimization v.s. Legacy = 2273ms v.s. 4502ms
>               --> ~50% reduction
>            2) Linux Compilation Time:
>               Optimization v.s. Legacy = 8min42s v.s. 8min43s
>               --> no obvious difference
>
>      2.2 Guest setup: 128G RAM, 4 vCPU
>          2.2.1 Idle guest live migration time
>              Optimization v.s. Legacy = 5294ms vs 41651ms
>              --> ~87% reduction
>          2.2.2 Guest live migration with Linux compilation workload
>            1) Live Migration Time:
>              Optimization v.s. Legacy = 8816ms v.s. 54201ms
>              --> 84% reduction
>            2) Linux Compilation Time:
>               Optimization v.s. Legacy = 8min30s v.s. 8min36s
>               --> no obvious difference
>
> ChangeLog:
> v8->v9:
> bitmap:
>      - fix bitmap_count_one to handle the nbits=0 case
> migration:
>      - replace the ram save notifier chain with a more general precopy
>        notifier chain, which is similar to the postcopy notifier chain.
>      - Avoid exposing the RAMState struct, and add a function,
>        precopy_disable_bulk_stage, to let the virtio-balloon notifier
>        callback to disable the bulk stage flag.

Hi Dave and Peter,

Could you continue to review the patches? Thanks!

Best,
Wei

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty
       [not found]   ` <20181127054056.GA3205@xz-x1>
@ 2018-11-27  6:02     ` Wei Wang
  2018-11-27  6:12       ` Wei Wang
  0 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-27  6:02 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/27/2018 01:40 PM, Peter Xu wrote:
> On Thu, Nov 15, 2018 at 06:07:59PM +0800, Wei Wang wrote:
>> The bitmap mutex is used to synchronize threads to update the dirty
>> bitmap and the migration_dirty_pages counter. For example, the free
>> page optimization clears bits of free pages from the bitmap in an
>> iothread context. This patch makes migration_bitmap_clear_dirty update
>> the bitmap and counter under the mutex.
>>
>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
>> CC: Michael S. Tsirkin <mst@redhat.com>
>> CC: Peter Xu <peterx@redhat.com>
>> ---
>>   migration/ram.c | 3 +++
>>   1 file changed, 3 insertions(+)
>>
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 7e7deec..ef69dbe 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -1556,11 +1556,14 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
>>   {
>>       bool ret;
>>   
>> +    qemu_mutex_lock(&rs->bitmap_mutex);
>>       ret = test_and_clear_bit(page, rb->bmap);
>>   
>>       if (ret) {
>>           rs->migration_dirty_pages--;
>>       }
>> +    qemu_mutex_unlock(&rs->bitmap_mutex);
>> +
>>       return ret;
>>   }
> It seems fine to me, but have you thought about
> test_and_clear_bit_atomic()?  Note that we just had
> test_and_set_bit_atomic() a few months ago.

Thanks for sharing. I think we might also need to
mutex migration_dirty_pages.

>
> And not related to this patch: I'm unclear on why we have had
> bitmap_mutex before, since it seems unnecessary.

OK. This is because with the optimization we have a thread
which clears bits (of free pages) from the bitmap and updates
migration_dirty_pages. So we need to synchronization between
the migration thread and the optimization thread.

Best,
Wei



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [virtio-dev] Re: [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty
  2018-11-27  6:02     ` [virtio-dev] " Wei Wang
@ 2018-11-27  6:12       ` Wei Wang
       [not found]         ` <20181127074127.GD3205@xz-x1>
  0 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-27  6:12 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/27/2018 02:02 PM, Wei Wang wrote:
> On 11/27/2018 01:40 PM, Peter Xu wrote:
>> On Thu, Nov 15, 2018 at 06:07:59PM +0800, Wei Wang wrote:
>>> The bitmap mutex is used to synchronize threads to update the dirty
>>> bitmap and the migration_dirty_pages counter. For example, the free
>>> page optimization clears bits of free pages from the bitmap in an
>>> iothread context. This patch makes migration_bitmap_clear_dirty update
>>> the bitmap and counter under the mutex.
>>>
>>> Signed-off-by: Wei Wang <wei.w.wang@intel.com>
>>> CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> CC: Juan Quintela <quintela@redhat.com>
>>> CC: Michael S. Tsirkin <mst@redhat.com>
>>> CC: Peter Xu <peterx@redhat.com>
>>> ---
>>>   migration/ram.c | 3 +++
>>>   1 file changed, 3 insertions(+)
>>>
>>> diff --git a/migration/ram.c b/migration/ram.c
>>> index 7e7deec..ef69dbe 100644
>>> --- a/migration/ram.c
>>> +++ b/migration/ram.c
>>> @@ -1556,11 +1556,14 @@ static inline bool 
>>> migration_bitmap_clear_dirty(RAMState *rs,
>>>   {
>>>       bool ret;
>>>   +    qemu_mutex_lock(&rs->bitmap_mutex);
>>>       ret = test_and_clear_bit(page, rb->bmap);
>>>         if (ret) {
>>>           rs->migration_dirty_pages--;
>>>       }
>>> +    qemu_mutex_unlock(&rs->bitmap_mutex);
>>> +
>>>       return ret;
>>>   }
>> It seems fine to me, but have you thought about
>> test_and_clear_bit_atomic()?  Note that we just had
>> test_and_set_bit_atomic() a few months ago.
>
> Thanks for sharing. I think we might also need to
> mutex migration_dirty_pages.
>
>>
>> And not related to this patch: I'm unclear on why we have had
>> bitmap_mutex before, since it seems unnecessary.
>
> OK. This is because with the optimization we have a thread
> which clears bits (of free pages) from the bitmap and updates
> migration_dirty_pages. So we need to synchronization between
> the migration thread and the optimization thread.
>

And before this feature, I think yes, that bitmap_mutex is not needed.
It was left there due to some historical reasons.
I remember Dave previous said he was about to remove it. But the new
feature will need it again.

Best,
Wei

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 4/8] migration: API to clear bits of guest free pages from the dirty bitmap
       [not found]   ` <20181127060638.GB3205@xz-x1>
@ 2018-11-27  6:52     ` Wei Wang
  0 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-27  6:52 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/27/2018 02:06 PM, Peter Xu wrote:
> On Thu, Nov 15, 2018 at 06:08:00PM +0800, Wei Wang wrote:
> Again, is it possible to resize during migration?
>
> So I think the check is fine, but uncertain about the comment.

Yes, resize would not happen with the current implementation.
But heard it could just be a temporal implementation. Probably
we could improve the comment like this:

"
Though the implementation might not support ram resize currently,
this could happen in theory with future updates. So the check here
handles the case that RAMBLOCK is resized after the free page hint is
reported.
"

>
> And shall we print something if that happened?  We can use
> error_report_once(), and squashing the above assert:
>
>    if (!block || offset > block->used_length) {
>      /* should never happen, but if it happens we ignore the hints and warn */
>      error_report_once("...");
>      return;
>    }
>
> What do you think?

Sounds good.

>
>> +
>> +        if (len <= block->used_length - offset) {
>> +            used_len = len;
>> +        } else {
>> +            used_len = block->used_length - offset;
>> +            addr += used_len;
> Maybe moving this line into the for() could be a bit better?
>
>    for (; len > 0; len -= used_len, addr += used_len) {
>

Yes, I think it looks better, thanks.

Best,
Wei


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [virtio-dev] Re: [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty
       [not found]         ` <20181127074127.GD3205@xz-x1>
@ 2018-11-27 10:17           ` Wei Wang
  0 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-27 10:17 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/27/2018 03:41 PM, Peter Xu wrote:
>
> Ok then I'm fine with it.  Though you could update the comments too if
> you like:
>
>      /* protects modification of the bitmap and migration_dirty_pages */
>      QemuMutex bitmap_mutex;
>
> And it's tricky that sometimes we don't take the lock when reading
> this variable "migration_dirty_pages".  I don't see obvious issue so
> far, hope it's true (at least I skipped the colo ones...).

The caller reads the value just to estimate the remaining_size, and
it seems fine without a lock, because whether it reads a
value before the update or after the update seem not causing
an issue.

Best,
Wei

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
       [not found]   ` <20181127073802.GC3205@xz-x1>
@ 2018-11-27 10:25     ` Wei Wang
       [not found]       ` <20181128052655.GC12839@xz-x1>
  0 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-27 10:25 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/27/2018 03:38 PM, Peter Xu wrote:
> On Thu, Nov 15, 2018 at 06:08:01PM +0800, Wei Wang wrote:
>>   
>> +typedef enum PrecopyNotifyReason {
>> +    PRECOPY_NOTIFY_ERR = 0,
>> +    PRECOPY_NOTIFY_START_ITERATION = 1,
>> +    PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP = 2,
>> +    PRECOPY_NOTIFY_AFTER_SYNC_BITMAP = 3,
>> +    PRECOPY_NOTIFY_MAX = 4,
> It would be nice to add some comments for each of the notify reason.
> E.g., from the name PRECOPY_NOTIFY_START_ITERATION seems more like a
> hook at the start of each iteration but according to [1] it should be
> at the start of migration rather than each iteration (or when
> migration restarts, though I'm not sure whether we really have this
> yet).

OK. I think It would be better if the name itself could be straightforward.
Probably we could change PRECOPY_NOTIFY_START_ITERATION to
PRECOPY_NOTIFY_START_MIGRATION.


>> +} PrecopyNotifyReason;
>> +
>> +void precopy_infrastructure_init(void);
>> +void precopy_add_notifier(Notifier *n);
>> +void precopy_remove_notifier(Notifier *n);
>> +
>>   void ram_mig_init(void);
>>   void qemu_guest_free_page_hint(void *addr, size_t len);
>>   
>> diff --git a/migration/ram.c b/migration/ram.c
>> index 229b791..65b1223 100644
>> --- a/migration/ram.c
>> +++ b/migration/ram.c
>> @@ -292,6 +292,8 @@ struct RAMState {
>>       bool ram_bulk_stage;
>>       /* How many times we have dirty too many pages */
>>       int dirty_rate_high_cnt;
>> +    /* ram save states used for notifiers */
>> +    int ram_save_state;
> This can be removed?

Yes, thanks.

>
>>       /* these variables are used for bitmap sync */
>>       /* last time we did a full bitmap_sync */
>>       int64_t time_last_bitmap_sync;
>> @@ -328,6 +330,28 @@ typedef struct RAMState RAMState;
>>   
>>   static RAMState *ram_state;
>>   
>> +static NotifierList precopy_notifier_list;
>> +
>> +void precopy_infrastructure_init(void)
>> +{
>> +    notifier_list_init(&precopy_notifier_list);
>> +}
>> +
>> +void precopy_add_notifier(Notifier *n)
>> +{
>> +    notifier_list_add(&precopy_notifier_list, n);
>> +}
>> +
>> +void precopy_remove_notifier(Notifier *n)
>> +{
>> +    notifier_remove(n);
>> +}
>> +
>> +static void precopy_notify(PrecopyNotifyReason reason)
>> +{
>> +    notifier_list_notify(&precopy_notifier_list, &reason);
>> +}
>> +
>>   uint64_t ram_bytes_remaining(void)
>>   {
>>       return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) :
>> @@ -1642,6 +1666,8 @@ static void migration_bitmap_sync(RAMState *rs)
>>       int64_t end_time;
>>       uint64_t bytes_xfer_now;
>>   
>> +    precopy_notify(PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP);
>> +
>>       ram_counters.dirty_sync_count++;
>>   
>>       if (!rs->time_last_bitmap_sync) {
>> @@ -1699,6 +1725,8 @@ static void migration_bitmap_sync(RAMState *rs)
>>       if (migrate_use_events()) {
>>           qapi_event_send_migration_pass(ram_counters.dirty_sync_count);
>>       }
>> +
>> +    precopy_notify(PRECOPY_NOTIFY_AFTER_SYNC_BITMAP);
>>   }
>>   
>>   /**
>> @@ -2555,6 +2583,8 @@ static void ram_state_reset(RAMState *rs)
>>       rs->last_page = 0;
>>       rs->last_version = ram_list.version;
>>       rs->ram_bulk_stage = true;
>> +
>> +    precopy_notify(PRECOPY_NOTIFY_START_ITERATION);
> [1]
>
>>   }
>>   
>>   #define MAX_WAIT 50 /* ms, half buffered_file limit */
>> @@ -3324,6 +3354,7 @@ out:
>>   
>>       ret = qemu_file_get_error(f);
>>       if (ret < 0) {
>> +        precopy_notify(PRECOPY_NOTIFY_ERR);
> Could you show me which function is this line in?
>
> Line 3324 here is ram_save_complete(), but I cannot find this exact
> place.

Sure, it's in ram_save_iterate():
...
out:
     qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
     qemu_fflush(f);
     ram_counters.transferred += 8;

     ret = qemu_file_get_error(f);
     if (ret < 0) {
+        precopy_notify(PRECOPY_NOTIFY_ERR);
         return ret;
     }

     return done;
}


>
> Another thing to mention about the "reasons" (though I see it more
> like "events"): have you thought about adding a PRECOPY_NOTIFY_END?
> It might help in some cases:
>
>    - then you don't need to trickily export the migrate_postcopy()
>      since you'll notify that before postcopy starts

I'm thinking probably we don't need to export migrate_postcopy even now.
It's more like a sanity check, and not needed because now we have the
notifier registered to the precopy specific callchain, which has ensured 
that
it is invoked via precopy.

>    - you'll have a solid point that you'll 100% guarantee that we'll
>      stop the free page hinting and don't need to worry about whether
>      there is chance the hinting will be running without an end [2].

Thanks, I think it makes sense. Plan to add PRECOPY_NOTIFY_END in 
ram_save_complete.


>
> Regarding [2] above: now the series only stops the hinting when
> PRECOPY_NOTIFY_BEFORE_SYNC_BITMAP is notified.  Could there be a case
> that it's missing?  E.g., what if we cancel/fail a migration during
> precopy?  Have you tried it?
>

I think it has been handled by the above PRECOPY_NOTIFY_ERR

Best,
Wei


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
       [not found]       ` <20181128052655.GC12839@xz-x1>
@ 2018-11-28  9:01         ` Wei Wang
       [not found]           ` <20181128093220.GF12839@xz-x1>
  0 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-28  9:01 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/28/2018 01:26 PM, Peter Xu wrote:
>
> Ok thanks.  Please just make sure you will capture all the error
> cases, e.g., I also see path like this (a few lines below):
>
>          if (pages < 0) {
>              qemu_file_set_error(f, pages);
>              break;
>          }
>
> It seems that you missed that one.

I think that one should be fine. This notification is actually put at the
bottom of ram_save_iterate. All the above error will bail out to the "out:"
path and then go to call precopy_notify(PRECOPY_NOTIFY_ERR).

>
> I would even suggest that you capture the error with higher level.
> E.g., in migration_iteration_run() after qemu_savevm_state_iterate().
> Or we can just check the return value of qemu_savevm_state_iterate(),
> which we have had ignored so far.

Not very sure about the higher level, because other SaveStateEntry may cause
errors that this feature don't need to care, I think we may only need it 
in ram_save.


> [1]
>
>>
>>> Another thing to mention about the "reasons" (though I see it more
>>> like "events"): have you thought about adding a PRECOPY_NOTIFY_END?
>>> It might help in some cases:
>>>
>>>     - then you don't need to trickily export the migrate_postcopy()
>>>       since you'll notify that before postcopy starts
>> I'm thinking probably we don't need to export migrate_postcopy even now.
>> It's more like a sanity check, and not needed because now we have the
>> notifier registered to the precopy specific callchain, which has ensured
>> that
>> it is invoked via precopy.
> But postcopy will always start with precopy, no?

Yes, but I think we could add the check in precopy_notify()


>
>>>     - you'll have a solid point that you'll 100% guarantee that we'll
>>>       stop the free page hinting and don't need to worry about whether
>>>       there is chance the hinting will be running without an end [2].
>> Thanks, I think it makes sense. Plan to add PRECOPY_NOTIFY_END in
>> ram_save_complete.
> Yeah you can.
>
> Btw, if you're mostly adding the notifies only in RAM-only codes, then
> you can consider add the "RAM" into the names of events too to be
> clear.

Sounds good, will try and see if the name would become too long :)

>
> My suggestion at [1] is precopy general, but you can still capture it
> at the end of ram_save_iterate, then they are RAM-only again.  Please
> feel free to choose what fits more...

OK, thanks.

Best,
Wei

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
       [not found]           ` <20181128093220.GF12839@xz-x1>
@ 2018-11-29  3:40             ` Wei Wang
       [not found]               ` <20181129051014.GC29246@xz-x1>
  0 siblings, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-29  3:40 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/28/2018 05:32 PM, Peter Xu wrote:
>
> So what I am worrying here are corner cases where we might forget to
> stop the hinting.  I'm fabricating one example sequence of events:
>
>    (start migration)
>    START_MIGRATION
>    BEFORE_SYNC
>    AFTER_SYNC
>    ...
>    BEFORE_SYNC
>    AFTER_SYNC
>    (some SaveStateEntry failed rather than RAM, then
>     migration_detect_error returned MIG_THR_ERR_FATAL so we need to
>     fail the migration, however when running the previous
>     ram_save_iterate for RAM's specific SaveStateEntry we didn't see
>     any error so no ERROR event detected)
>
> Then it seems the hinting will last forever.  Considering that now I'm
> not sure whether this can be done ram-only, since even if you capture
> ram_save_complete() and at the same time you introduce PRECOPY_END you
> may still miss the PRECOPY_END event since AFAIU ram_save_complete()
> won't be called at all in this case.
>
> Could this happen?

Thanks, indeed this case could happen if we add PRECOPY_END in
ram_save_complete.

How about putting PRECOPY_END in ram_save_cleanup?
I think it would be called in any case.

I'm also thinking probably we don't need PRECOPY_ERR when we have 
PRECOPY_END,
and what do you think of the notifier names below:

+typedef enum PrecopyNotifyReason {
+    PRECOPY_NOTIFY_RAM_SAVE_END = 0,
+    PRECOPY_NOTIFY_RAM_SAVE_START = 1,
+    PRECOPY_NOTIFY_RAM_SAVE_BEFORE_SYNC_BITMAP = 2,
+    PRECOPY_NOTIFY_RAM_SAVE_AFTER_SYNC_BITMAP = 3,
+    PRECOPY_NOTIFY_RAM_SAVE_MAX = 4,
+} PrecopyNotifyReason;


>
>>
>>> [1]
>>>
>>>>> Another thing to mention about the "reasons" (though I see it more
>>>>> like "events"): have you thought about adding a PRECOPY_NOTIFY_END?
>>>>> It might help in some cases:
>>>>>
>>>>>      - then you don't need to trickily export the migrate_postcopy()
>>>>>        since you'll notify that before postcopy starts
>>>> I'm thinking probably we don't need to export migrate_postcopy even now.
>>>> It's more like a sanity check, and not needed because now we have the
>>>> notifier registered to the precopy specific callchain, which has ensured
>>>> that
>>>> it is invoked via precopy.
>>> But postcopy will always start with precopy, no?
>> Yes, but I think we could add the check in precopy_notify()
> I'm not sure that's good.  If the notifier could potentially have
> other user, they might still work with postcopy, and they might expect
> e.g. BEFORE_SYNC to be called for every sync, even if it's at the
> precopy stage of a postcopy.

I think this precopy notifier callchain is expected to be used only for
the precopy mode. Postcopy has its dedicated notifier callchain that
users could use.

How about changing the migrate_postcopy() check to "ms->start_postcopy":

bool migration_postcopy_start(void)
{
     MigrationState *s;

     s = migrate_get_current();

     return atomic_read(&s->start_postcopy);
}


static void precopy_notify(PrecopyNotifyReason reason)
{
     if (migration_postcopy_start())
         return;

     notifier_list_notify(&precopy_notifier_list, &reason);
}

If postcopy started with precopy, the precopy optimization feature
could still be used until it switches to the postcopy mode.



> In that sense I still feel the
> PRECOPY_END is better (so contantly call it at the end of precopy, no
> matter whether there's another postcopy afterwards).  It sounds like a
> cleaner interface.

Probably I still haven't got the point how PRECOPY_END could help above yet.

Best,
Wei


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
       [not found]               ` <20181129051014.GC29246@xz-x1>
@ 2018-11-29  6:30                 ` Wei Wang
  2018-11-30  5:05                 ` Wei Wang
  1 sibling, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-29  6:30 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/29/2018 01:10 PM, Peter Xu wrote:
> On Thu, Nov 29, 2018 at 11:40:57AM +0800, Wei Wang wrote:
>> On 11/28/2018 05:32 PM, Peter Xu wrote:
> I'm not sure we can use start_postcopy.  It's a variable being set in
> the QMP handler but it does not mean postcopy has started.  I'm afraid
> there can be race where it's still precopy but the variable is set so
> event could be missed...
>
> IMHO the problem is not that complicated.  How about this proposal:
>
> [1]
>
>    typedef enum PrecopyNotifyReason {
>      PRECOPY_NOTIFY_RAM_START,
>      PRECOPY_NOTIFY_RAM_BEFORE_SYNC,
>      PRECOPY_NOTIFY_RAM_AFTER_SYNC,
>      PRECOPY_NOTIFY_COMPLETE,
>      PRECOPY_NOTIFY_RAM_CLEANUP,
>    };
>
> The first three keep the same as your old ones.  Notify RAM_CLEANUP in
> ram_save_cleanup() to make sure it'll always be cleaned up (the same
> as PRECOPY_END, just another name).  Notify COMPLETE in
> qemu_savevm_state_complete_precopy() to show that precopy is
> completed.  Meanwhile on balloon side you should stop the hinting for
> either RAM_CLEANUP or COMPLETE event.  Then either:
>
>    - precopy is switching to postcopy, or
>    - precopy completed, or
>    - precopy failed/cancelled
>
> You should always get at least a notification to stop the balloon.
> Though you could also get one RAM_CLEANUP after one COMPLETE, but
> the balloon should easily handle it (stop the hinting twice).

Sounds better, thanks.


>
> Here maybe you can even remove the "RAM_" in both RAM_START and
> RAM_CLEANUP if we're going to have COMPLETE since after all it'll be
> not only limited to RAM.
>
> Another suggestion is that you can add an Error into the notify hooks,
> please refer to the postcopy one:
>
>    int postcopy_notify(enum PostcopyNotifyReason reason, Error **errp);
>
> So the hook functions have a way to even stop the migration (though
> for balloon hinting it'll be always optional so no error should be
> reported...), then the two interfaces are matched.

Yes, I removed that "errp" as it's not needed by the balloon usage.
But no problem, I can add it back if no objections from others.

Best,
Wei



---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
       [not found]               ` <20181129051014.GC29246@xz-x1>
  2018-11-29  6:30                 ` Wei Wang
@ 2018-11-30  5:05                 ` Wei Wang
       [not found]                   ` <20181130055736.GA15865@xz-x1>
  1 sibling, 1 reply; 20+ messages in thread
From: Wei Wang @ 2018-11-30  5:05 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/29/2018 01:10 PM, Peter Xu wrote:
> On Thu, Nov 29, 2018 at 11:40:57AM +0800, Wei Wang wrote:
> I think this precopy notifier callchain is expected to be used only for
> the precopy mode. Postcopy has its dedicated notifier callchain that
> users could use.
>
> How about changing the migrate_postcopy() check to "ms->start_postcopy":
>
> bool migration_postcopy_start(void)
> {
>      MigrationState *s;
>
>      s = migrate_get_current();
>
>      return atomic_read(&s->start_postcopy);
> }
>
>
> static void precopy_notify(PrecopyNotifyReason reason)
> {
>      if (migration_postcopy_start())
>          return;
>
>      notifier_list_notify(&precopy_notifier_list, &reason);
> }
>
> If postcopy started with precopy, the precopy optimization feature
> could still be used until it switches to the postcopy mode.
> I'm not sure we can use start_postcopy.  It's a variable being set in
> the QMP handler but it does not mean postcopy has started.  I'm afraid
> there can be race where it's still precopy but the variable is set so
> event could be missed...

Peter,
I just found that migration_bitmap_sync is also called by 
ram_postcopy_send_discard_bitmap().
But we don't expect notifier_list_notify to be called in the postcopy mode.

So, probably we still need the start_postcopy check in 
notifier_list_notify though we have
the COMPLETE notifier.

Best,
Wei


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [virtio-dev] Re: [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy
       [not found]                   ` <20181130055736.GA15865@xz-x1>
@ 2018-11-30  7:09                     ` Wei Wang
  0 siblings, 0 replies; 20+ messages in thread
From: Wei Wang @ 2018-11-30  7:09 UTC (permalink / raw)
  To: Peter Xu
  Cc: qemu-devel, virtio-dev, mst, quintela, dgilbert, pbonzini,
	liliang.opensource, nilal, riel

On 11/30/2018 01:57 PM, Peter Xu wrote:
>>
>>
>> Or you can introduce migration_bitmap_sync_precopy() and let precopy
>> code call that one instead.

Agree, thanks.

> PS. I'm a bit unsure on why we need to sync bitmap in ram_init_bitmaps.
> I feel like it can be removed...

Seems it was added in early days to fix some bug:
https://git.virtualopensystems.com/dev/qemu/commit/79536f4f16934d6759a1d67f0342b4e7ceb66671?w=1

Hopefully, Juan could share some details about that.

Best,
Wei





---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2018-11-30  7:04 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2018-11-15 10:07 [virtio-dev] [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 1/8] bitmap: fix bitmap_count_one Wei Wang
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 2/8] bitmap: bitmap_count_one_with_offset Wei Wang
2018-11-15 10:07 ` [virtio-dev] [PATCH v9 3/8] migration: use bitmap_mutex in migration_bitmap_clear_dirty Wei Wang
     [not found]   ` <20181127054056.GA3205@xz-x1>
2018-11-27  6:02     ` [virtio-dev] " Wei Wang
2018-11-27  6:12       ` Wei Wang
     [not found]         ` <20181127074127.GD3205@xz-x1>
2018-11-27 10:17           ` Wei Wang
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 4/8] migration: API to clear bits of guest free pages from the dirty bitmap Wei Wang
     [not found]   ` <20181127060638.GB3205@xz-x1>
2018-11-27  6:52     ` [virtio-dev] " Wei Wang
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 5/8] migration/ram.c: add a notifier chain for precopy Wei Wang
     [not found]   ` <20181127073802.GC3205@xz-x1>
2018-11-27 10:25     ` [virtio-dev] " Wei Wang
     [not found]       ` <20181128052655.GC12839@xz-x1>
2018-11-28  9:01         ` Wei Wang
     [not found]           ` <20181128093220.GF12839@xz-x1>
2018-11-29  3:40             ` Wei Wang
     [not found]               ` <20181129051014.GC29246@xz-x1>
2018-11-29  6:30                 ` Wei Wang
2018-11-30  5:05                 ` Wei Wang
     [not found]                   ` <20181130055736.GA15865@xz-x1>
2018-11-30  7:09                     ` Wei Wang
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 6/8] migration/ram.c: add a function to disable the bulk stage Wei Wang
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 7/8] migration: move migrate_postcopy() to include/migration/misc.h Wei Wang
2018-11-15 10:08 ` [virtio-dev] [PATCH v9 8/8] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT Wei Wang
2018-11-27  3:11 ` [virtio-dev] Re: [PATCH v9 0/8] virtio-balloon: free page hint support Wei Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox