* [PATCH v6 0/3] Eliminate multifd flush
@ 2023-02-15 18:02 Juan Quintela
2023-02-15 18:02 ` [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section Juan Quintela
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Juan Quintela @ 2023-02-15 18:02 UTC (permalink / raw)
To: qemu-devel
Cc: Eric Blake, Dr. David Alan Gilbert, Yanan Wang, Marcel Apfelbaum,
Philippe Mathieu-Daudé, Eduardo Habkost, Markus Armbruster,
Juan Quintela
Hi
In this v6:
- Rename multifd-sync-after-each-section to
multifd-flush-after-each-section
- Redo comments (thanks Markus)
- Redo how to comment capabilities that are enabled/disabled during
development. (thanks Markus)
Please, review.
In this v5:
- Remove RAM Flags documentation (already on PULL request)
- rebase on top of PULL request.
Please review.
Based-on: <20230213025150.71537-1-quintela@redhat.com>
Migration 20230213 patches
In this v4:
- Rebased on top of migration-20230209 PULL request
- Integrate two patches in that pull request
- Rebase
- Address Eric's reviews.
Please review.
In this v3:
- update to latest upstream.
- fix checkpatch errors.
Please, review.
In this v2:
- update to latest upstream
- change 0, 1, 2 values to defines
- Add documentation for SAVE_VM_FLAGS
- Add missing qemu_fflush(); without it there were random hangs in the
migration test (only for TLS, no clue why).
Please, review.
[v1]
Upstream multifd code synchronizes all threads after each RAM section. This is suboptimal.
Change it to flush only after we go through all of RAM.
Preserve all semantics for old machine types.
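For orientation before the diffstat, here is a condensed sketch of the two
flush points the series ends up with. The helper names are invented for
illustration only; the actual patches open-code this logic in
ram_save_iterate() and find_dirty_block().

/* Old machine types: sync every multifd channel after each RAM section
 * (signalled to the destination via RAM_SAVE_FLAG_EOS). */
static int flush_after_each_section(RAMState *rs)
{
    if (migrate_multifd_flush_after_each_section()) {
        return multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
    }
    return 0;
}

/* New machine types: sync only when the dirty-block scan wraps around,
 * signalled to the destination via RAM_SAVE_FLAG_MULTIFD_FLUSH. */
static int flush_after_full_round(RAMState *rs)
{
    if (!migrate_multifd_flush_after_each_section()) {
        QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
        int ret = multifd_send_sync_main(f);

        if (ret < 0) {
            return ret;
        }
        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
        qemu_fflush(f);
    }
    return 0;
}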
Juan Quintela (3):
multifd: Create property multifd-flush-after-each-section
multifd: Protect multifd_send_sync_main() calls
multifd: Only flush once each full round of memory
qapi/migration.json | 18 +++++++++++++++++-
migration/migration.h | 1 +
hw/core/machine.c | 1 +
migration/migration.c | 13 +++++++++++--
migration/ram.c | 44 +++++++++++++++++++++++++++++++++++++------
5 files changed, 68 insertions(+), 9 deletions(-)
--
2.39.1
* [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section
2023-02-15 18:02 [PATCH v6 0/3] Eliminate multifd flush Juan Quintela
@ 2023-02-15 18:02 ` Juan Quintela
2023-02-15 19:59 ` Peter Xu
2023-02-15 18:02 ` [PATCH v6 2/3] multifd: Protect multifd_send_sync_main() calls Juan Quintela
2023-02-15 18:02 ` [PATCH v6 3/3] multifd: Only flush once each full round of memory Juan Quintela
2 siblings, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2023-02-15 18:02 UTC (permalink / raw)
To: qemu-devel
Cc: Eric Blake, Dr. David Alan Gilbert, Yanan Wang, Marcel Apfelbaum,
Philippe Mathieu-Daudé, Eduardo Habkost, Markus Armbruster,
Juan Quintela
We used to flush all channels at the end of each RAM section
sent. That is not needed, so prepare to flush only after a full
iteration through all of RAM.
The default value of the property is false, but we return "true" in
migrate_multifd_flush_after_each_section() until we implement the code
in the following patches.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
Rename each-iteration to after-each-section
Rename multifd-sync-after-each-section to
multifd-flush-after-each-section
---
qapi/migration.json | 21 ++++++++++++++++++++-
migration/migration.h | 1 +
hw/core/machine.c | 1 +
migration/migration.c | 17 +++++++++++++++--
4 files changed, 37 insertions(+), 3 deletions(-)
diff --git a/qapi/migration.json b/qapi/migration.json
index c84fa10e86..3afd81174d 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -478,6 +478,24 @@
# should not affect the correctness of postcopy migration.
# (since 7.1)
#
+# @multifd-flush-after-each-section: flush every channel after each
+# section sent. This assures that
+# we can't mix pages from one
+# iteration through ram pages with
+# pages for the following
+# iteration. We really only need
+# to do this flush after we have go
+# through all the dirty pages.
+# For historical reasons, we do
+# that after each section. This is
+# suboptimal (we flush too many
+# times).
+# Default value is false.
+# Setting this capability has no
+# effect until the patch that
+# removes this comment.
+# (since 8.0)
+#
# Features:
# @unstable: Members @x-colo and @x-ignore-shared are experimental.
#
@@ -492,7 +510,8 @@
'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
{ 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
'validate-uuid', 'background-snapshot',
- 'zero-copy-send', 'postcopy-preempt'] }
+ 'zero-copy-send', 'postcopy-preempt',
+ 'multifd-flush-after-each-section'] }
##
# @MigrationCapabilityStatus:
diff --git a/migration/migration.h b/migration/migration.h
index 2da2f8a164..7f0f4260ba 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -424,6 +424,7 @@ int migrate_multifd_channels(void);
MultiFDCompression migrate_multifd_compression(void);
int migrate_multifd_zlib_level(void);
int migrate_multifd_zstd_level(void);
+bool migrate_multifd_flush_after_each_section(void);
#ifdef CONFIG_LINUX
bool migrate_use_zero_copy_send(void);
diff --git a/hw/core/machine.c b/hw/core/machine.c
index f73fc4c45c..602e775f34 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -54,6 +54,7 @@ const size_t hw_compat_7_1_len = G_N_ELEMENTS(hw_compat_7_1);
GlobalProperty hw_compat_7_0[] = {
{ "arm-gicv3-common", "force-8-bit-prio", "on" },
{ "nvme-ns", "eui64-default", "on"},
+ { "migration", "multifd-flush-after-each-section", "on"},
};
const size_t hw_compat_7_0_len = G_N_ELEMENTS(hw_compat_7_0);
diff --git a/migration/migration.c b/migration/migration.c
index 90fca70cb7..cfba0da005 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -167,7 +167,8 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
MIGRATION_CAPABILITY_XBZRLE,
MIGRATION_CAPABILITY_X_COLO,
MIGRATION_CAPABILITY_VALIDATE_UUID,
- MIGRATION_CAPABILITY_ZERO_COPY_SEND);
+ MIGRATION_CAPABILITY_ZERO_COPY_SEND,
+ MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION);
/* When we add fault tolerance, we could have several
migrations at once. For now we don't need to add
@@ -2701,6 +2702,17 @@ bool migrate_use_multifd(void)
return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD];
}
+bool migrate_multifd_flush_after_each_section(void)
+{
+ MigrationState *s = migrate_get_current();
+
+ /*
+ * Until the patch that remove this comment, we always return that
+ * the capability is enabled.
+ */
+ return true || s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
+}
+
bool migrate_pause_before_switchover(void)
{
MigrationState *s;
@@ -4535,7 +4547,8 @@ static Property migration_properties[] = {
DEFINE_PROP_MIG_CAP("x-zero-copy-send",
MIGRATION_CAPABILITY_ZERO_COPY_SEND),
#endif
-
+ DEFINE_PROP_MIG_CAP("multifd-flush-after-each-section",
+ MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION),
DEFINE_PROP_END_OF_LIST(),
};
--
2.39.1
* [PATCH v6 2/3] multifd: Protect multifd_send_sync_main() calls
2023-02-15 18:02 [PATCH v6 0/3] Eliminate multifd flush Juan Quintela
2023-02-15 18:02 ` [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section Juan Quintela
@ 2023-02-15 18:02 ` Juan Quintela
2023-02-15 20:06 ` Peter Xu
2023-02-15 18:02 ` [PATCH v6 3/3] multifd: Only flush once each full round of memory Juan Quintela
2 siblings, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2023-02-15 18:02 UTC (permalink / raw)
To: qemu-devel
Cc: Eric Blake, Dr. David Alan Gilbert, Yanan Wang, Marcel Apfelbaum,
Philippe Mathieu-Daudé, Eduardo Habkost, Markus Armbruster,
Juan Quintela
We only need to do that on the ram_save_iterate() call on the sending
side, and on the destination when we get a RAM_SAVE_FLAG_EOS.
In setup() and complete() we need to sync in both the new and old cases,
so don't add a check there.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
Remove the wrappers that we take out on patch 5.
---
migration/ram.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/migration/ram.c b/migration/ram.c
index 521912385d..6191dac9af 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3408,9 +3408,11 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
out:
if (ret >= 0
&& migration_is_setup_or_active(migrate_get_current()->state)) {
- ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
- if (ret < 0) {
- return ret;
+ if (migrate_multifd_flush_after_each_section()) {
+ ret = multifd_send_sync_main(rs->pss[RAM_CHANNEL_PRECOPY].pss_channel);
+ if (ret < 0) {
+ return ret;
+ }
}
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
@@ -4170,7 +4172,9 @@ int ram_load_postcopy(QEMUFile *f, int channel)
case RAM_SAVE_FLAG_EOS:
/* normal exit */
- multifd_recv_sync_main();
+ if (migrate_multifd_flush_after_each_section()) {
+ multifd_recv_sync_main();
+ }
break;
default:
error_report("Unknown combination of migration flags: 0x%x"
@@ -4441,7 +4445,9 @@ static int ram_load_precopy(QEMUFile *f)
break;
case RAM_SAVE_FLAG_EOS:
/* normal exit */
- multifd_recv_sync_main();
+ if (migrate_multifd_flush_after_each_section()) {
+ multifd_recv_sync_main();
+ }
break;
default:
if (flags & RAM_SAVE_FLAG_HOOK) {
--
2.39.1
* [PATCH v6 3/3] multifd: Only flush once each full round of memory
2023-02-15 18:02 [PATCH v6 0/3] Eliminate multifd flush Juan Quintela
2023-02-15 18:02 ` [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section Juan Quintela
2023-02-15 18:02 ` [PATCH v6 2/3] multifd: Protect multifd_send_sync_main() calls Juan Quintela
@ 2023-02-15 18:02 ` Juan Quintela
2023-02-15 20:05 ` Peter Xu
2 siblings, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2023-02-15 18:02 UTC (permalink / raw)
To: qemu-devel
Cc: Eric Blake, Dr. David Alan Gilbert, Yanan Wang, Marcel Apfelbaum,
Philippe Mathieu-Daudé, Eduardo Habkost, Markus Armbruster,
Juan Quintela
We need to add a new flag that means to flush at that point.
Notice that we still flush at the end of setup and at the end of
complete stages.
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
Add missing qemu_fflush(), now it passes all tests always.
---
qapi/migration.json | 3 ---
migration/migration.c | 6 +-----
migration/ram.c | 28 +++++++++++++++++++++++++++-
3 files changed, 28 insertions(+), 9 deletions(-)
diff --git a/qapi/migration.json b/qapi/migration.json
index 3afd81174d..34e1657c4e 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -491,9 +491,6 @@
# suboptimal (we flush too many
# times).
# Default value is false.
-# Setting this capability has no
-# effect until the patch that
-# removes this comment.
# (since 8.0)
#
# Features:
diff --git a/migration/migration.c b/migration/migration.c
index cfba0da005..74bcc16848 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2706,11 +2706,7 @@ bool migrate_multifd_flush_after_each_section(void)
{
MigrationState *s = migrate_get_current();
- /*
- * Until the patch that remove this comment, we always return that
- * the capability is enabled.
- */
- return true || s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
+ return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
}
bool migrate_pause_before_switchover(void)
diff --git a/migration/ram.c b/migration/ram.c
index 6191dac9af..bc5eb1640b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -85,6 +85,7 @@
#define RAM_SAVE_FLAG_XBZRLE 0x40
/* 0x80 is reserved in qemu-file.h for RAM_SAVE_FLAG_HOOK */
#define RAM_SAVE_FLAG_COMPRESS_PAGE 0x100
+#define RAM_SAVE_FLAG_MULTIFD_FLUSH 0x200
/* We can't use any flag that is bigger than 0x200 */
int (*xbzrle_encode_buffer_func)(uint8_t *, uint8_t *, int,
@@ -1595,6 +1596,7 @@ retry:
* associated with the search process.
*
* Returns:
+ * <0: An error happened
* PAGE_ALL_CLEAN: no dirty page found, give up
* PAGE_TRY_AGAIN: no dirty page found, retry for next block
* PAGE_DIRTY_FOUND: dirty page found
@@ -1622,6 +1624,15 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
pss->page = 0;
pss->block = QLIST_NEXT_RCU(pss->block, next);
if (!pss->block) {
+ if (!migrate_multifd_flush_after_each_section()) {
+ QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
+ int ret = multifd_send_sync_main(f);
+ if (ret < 0) {
+ return ret;
+ }
+ qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
+ qemu_fflush(f);
+ }
/*
* If memory migration starts over, we will meet a dirtied page
* which may still exists in compression threads's ring, so we
@@ -2614,6 +2625,9 @@ static int ram_find_and_save_block(RAMState *rs)
break;
} else if (res == PAGE_TRY_AGAIN) {
continue;
+ } else if (res < 0) {
+ pages = res;
+ break;
}
}
}
@@ -3300,6 +3314,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
return ret;
}
+ if (!migrate_multifd_flush_after_each_section()) {
+ qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
+ }
+
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
qemu_fflush(f);
@@ -3485,6 +3503,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
return ret;
}
+ if (!migrate_multifd_flush_after_each_section()) {
+ qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
+ }
qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
qemu_fflush(f);
@@ -4169,7 +4190,9 @@ int ram_load_postcopy(QEMUFile *f, int channel)
}
decompress_data_with_multi_threads(f, page_buffer, len);
break;
-
+ case RAM_SAVE_FLAG_MULTIFD_FLUSH:
+ multifd_recv_sync_main();
+ break;
case RAM_SAVE_FLAG_EOS:
/* normal exit */
if (migrate_multifd_flush_after_each_section()) {
@@ -4443,6 +4466,9 @@ static int ram_load_precopy(QEMUFile *f)
break;
}
break;
+ case RAM_SAVE_FLAG_MULTIFD_FLUSH:
+ multifd_recv_sync_main();
+ break;
case RAM_SAVE_FLAG_EOS:
/* normal exit */
if (migrate_multifd_flush_after_each_section()) {
--
2.39.1
* Re: [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section
2023-02-15 18:02 ` [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section Juan Quintela
@ 2023-02-15 19:59 ` Peter Xu
2023-02-15 20:13 ` Juan Quintela
0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2023-02-15 19:59 UTC (permalink / raw)
To: Juan Quintela
Cc: qemu-devel, Eric Blake, Dr. David Alan Gilbert, Yanan Wang,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Eduardo Habkost,
Markus Armbruster
On Wed, Feb 15, 2023 at 07:02:29PM +0100, Juan Quintela wrote:
> We used to flush all channels at the end of each RAM section
> sent. That is not needed, so preparing to only flush after a full
> iteration through all the RAM.
>
> Default value of the property is false. But we return "true" in
> migrate_multifd_flush_after_each_section() until we implement the code
> in following patches.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
This line can be dropped, after (I assume) git commit helped to add the
other one below. :)
Normally that's also why I put R-bs before my SoB: I should have two
SoBs, but then I merge them into the last one; git is happy with that too.
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
But some nitpicks below (I'll leave those to you to decide whether to
rework or keep them as is..).
>
> ---
>
> Rename each-iteration to after-each-section
> Rename multifd-sync-after-each-section to
> multifd-flush-after-each-section
> ---
> qapi/migration.json | 21 ++++++++++++++++++++-
> migration/migration.h | 1 +
> hw/core/machine.c | 1 +
> migration/migration.c | 17 +++++++++++++++--
> 4 files changed, 37 insertions(+), 3 deletions(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index c84fa10e86..3afd81174d 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -478,6 +478,24 @@
> # should not affect the correctness of postcopy migration.
> # (since 7.1)
> #
> +# @multifd-flush-after-each-section: flush every channel after each
> +# section sent. This assures that
> +# we can't mix pages from one
> +# iteration through ram pages with
> +# pages for the following
> +# iteration. We really only need
> +# to do this flush after we have go
> +# through all the dirty pages.
> +# For historical reasons, we do
> +# that after each section. This is
> +# suboptimal (we flush too many
> +# times).
> +# Default value is false.
> +# Setting this capability has no
> +# effect until the patch that
> +# removes this comment.
> +# (since 8.0)
IMHO the core of this new "cap" is the new RAM_SAVE_FLAG_MULTIFD_FLUSH bit
in the stream protocol, but it's not referenced here. I would suggest
simplifying the content but highlighting the core change:
@multifd-lazy-flush: When enabled, multifd will only do sync flush after
each whole round of bitmap scan. Otherwise it'll be
done per RAM save iteration (which happens with a much
higher frequency).
Please consider enable this as long as possible, and
keep this off only if any of the src/dst QEMU binary
doesn't support it.
This capability is bound to the new RAM save flag
RAM_SAVE_FLAG_MULTIFD_FLUSH, the new flag will only
be used and recognized when this feature bit set.
I know you dislike multifd-lazy-flush, but that's still the best I can come
up with when writing this (yeah I still like it :-p), please bear with me
and take whatever you think is best.
> +#
> # Features:
> # @unstable: Members @x-colo and @x-ignore-shared are experimental.
> #
> @@ -492,7 +510,8 @@
> 'dirty-bitmaps', 'postcopy-blocktime', 'late-block-activate',
> { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] },
> 'validate-uuid', 'background-snapshot',
> - 'zero-copy-send', 'postcopy-preempt'] }
> + 'zero-copy-send', 'postcopy-preempt',
> + 'multifd-flush-after-each-section'] }
>
> ##
> # @MigrationCapabilityStatus:
> diff --git a/migration/migration.h b/migration/migration.h
> index 2da2f8a164..7f0f4260ba 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -424,6 +424,7 @@ int migrate_multifd_channels(void);
> MultiFDCompression migrate_multifd_compression(void);
> int migrate_multifd_zlib_level(void);
> int migrate_multifd_zstd_level(void);
> +bool migrate_multifd_flush_after_each_section(void);
>
> #ifdef CONFIG_LINUX
> bool migrate_use_zero_copy_send(void);
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index f73fc4c45c..602e775f34 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -54,6 +54,7 @@ const size_t hw_compat_7_1_len = G_N_ELEMENTS(hw_compat_7_1);
> GlobalProperty hw_compat_7_0[] = {
> { "arm-gicv3-common", "force-8-bit-prio", "on" },
> { "nvme-ns", "eui64-default", "on"},
> + { "migration", "multifd-flush-after-each-section", "on"},
[same note to IRC: need to revert if rename, but otherwise don't bother]
Thanks,
> };
> const size_t hw_compat_7_0_len = G_N_ELEMENTS(hw_compat_7_0);
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 90fca70cb7..cfba0da005 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -167,7 +167,8 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot,
> MIGRATION_CAPABILITY_XBZRLE,
> MIGRATION_CAPABILITY_X_COLO,
> MIGRATION_CAPABILITY_VALIDATE_UUID,
> - MIGRATION_CAPABILITY_ZERO_COPY_SEND);
> + MIGRATION_CAPABILITY_ZERO_COPY_SEND,
> + MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION);
>
> /* When we add fault tolerance, we could have several
> migrations at once. For now we don't need to add
> @@ -2701,6 +2702,17 @@ bool migrate_use_multifd(void)
> return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD];
> }
>
> +bool migrate_multifd_flush_after_each_section(void)
> +{
> + MigrationState *s = migrate_get_current();
> +
> + /*
> + * Until the patch that remove this comment, we always return that
> + * the capability is enabled.
> + */
> + return true || s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
(I'd rather not worry about what happens if someone applies only this patch
and not the latter two by dropping the "true ||" directly here, but again,
not a huge deal)
> +}
> +
> bool migrate_pause_before_switchover(void)
> {
> MigrationState *s;
> @@ -4535,7 +4547,8 @@ static Property migration_properties[] = {
> DEFINE_PROP_MIG_CAP("x-zero-copy-send",
> MIGRATION_CAPABILITY_ZERO_COPY_SEND),
> #endif
> -
> + DEFINE_PROP_MIG_CAP("multifd-flush-after-each-section",
> + MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION),
> DEFINE_PROP_END_OF_LIST(),
> };
>
> --
> 2.39.1
>
>
--
Peter Xu
* Re: [PATCH v6 3/3] multifd: Only flush once each full round of memory
2023-02-15 18:02 ` [PATCH v6 3/3] multifd: Only flush once each full round of memory Juan Quintela
@ 2023-02-15 20:05 ` Peter Xu
2023-02-16 11:00 ` Juan Quintela
0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2023-02-15 20:05 UTC (permalink / raw)
To: Juan Quintela
Cc: qemu-devel, Eric Blake, Dr. David Alan Gilbert, Yanan Wang,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Eduardo Habkost,
Markus Armbruster
On Wed, Feb 15, 2023 at 07:02:31PM +0100, Juan Quintela wrote:
> We need to add a new flag to mean to flush at that point.
> Notice that we still flush at the end of setup and at the end of
> complete stages.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
One nitpick below.
>
> ---
>
> Add missing qemu_fflush(), now it passes all tests always.
> ---
> qapi/migration.json | 3 ---
> migration/migration.c | 6 +-----
> migration/ram.c | 28 +++++++++++++++++++++++++++-
> 3 files changed, 28 insertions(+), 9 deletions(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 3afd81174d..34e1657c4e 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -491,9 +491,6 @@
> # suboptimal (we flush too many
> # times).
> # Default value is false.
> -# Setting this capability has no
> -# effect until the patch that
> -# removes this comment.
> # (since 8.0)
> #
> # Features:
> diff --git a/migration/migration.c b/migration/migration.c
> index cfba0da005..74bcc16848 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2706,11 +2706,7 @@ bool migrate_multifd_flush_after_each_section(void)
> {
> MigrationState *s = migrate_get_current();
>
> - /*
> - * Until the patch that remove this comment, we always return that
> - * the capability is enabled.
> - */
> - return true || s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
> + return s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
> }
>
> bool migrate_pause_before_switchover(void)
> diff --git a/migration/ram.c b/migration/ram.c
> index 6191dac9af..bc5eb1640b 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -85,6 +85,7 @@
> #define RAM_SAVE_FLAG_XBZRLE 0x40
> /* 0x80 is reserved in qemu-file.h for RAM_SAVE_FLAG_HOOK */
> #define RAM_SAVE_FLAG_COMPRESS_PAGE 0x100
> +#define RAM_SAVE_FLAG_MULTIFD_FLUSH 0x200
> /* We can't use any flag that is bigger than 0x200 */
>
> int (*xbzrle_encode_buffer_func)(uint8_t *, uint8_t *, int,
> @@ -1595,6 +1596,7 @@ retry:
> * associated with the search process.
> *
> * Returns:
> + * <0: An error happened
> * PAGE_ALL_CLEAN: no dirty page found, give up
> * PAGE_TRY_AGAIN: no dirty page found, retry for next block
> * PAGE_DIRTY_FOUND: dirty page found
> @@ -1622,6 +1624,15 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
> pss->page = 0;
> pss->block = QLIST_NEXT_RCU(pss->block, next);
> if (!pss->block) {
> + if (!migrate_multifd_flush_after_each_section()) {
> + QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
> + int ret = multifd_send_sync_main(f);
> + if (ret < 0) {
> + return ret;
> + }
> + qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> + qemu_fflush(f);
> + }
> /*
> * If memory migration starts over, we will meet a dirtied page
> * which may still exists in compression threads's ring, so we
> @@ -2614,6 +2625,9 @@ static int ram_find_and_save_block(RAMState *rs)
> break;
> } else if (res == PAGE_TRY_AGAIN) {
> continue;
> + } else if (res < 0) {
> + pages = res;
> + break;
> }
> }
> }
> @@ -3300,6 +3314,10 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
> return ret;
> }
>
> + if (!migrate_multifd_flush_after_each_section()) {
> + qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> + }
> +
> qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> qemu_fflush(f);
>
> @@ -3485,6 +3503,9 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
> return ret;
> }
>
> + if (!migrate_multifd_flush_after_each_section()) {
> + qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_FLUSH);
> + }
> qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> qemu_fflush(f);
>
> @@ -4169,7 +4190,9 @@ int ram_load_postcopy(QEMUFile *f, int channel)
> }
> decompress_data_with_multi_threads(f, page_buffer, len);
> break;
> -
> + case RAM_SAVE_FLAG_MULTIFD_FLUSH:
> + multifd_recv_sync_main();
> + break;
> case RAM_SAVE_FLAG_EOS:
> /* normal exit */
> if (migrate_multifd_flush_after_each_section()) {
We could have dropped RAM_SAVE_FLAG_MULTIFD_FLUSH and RAM_SAVE_FLAG_EOS for
now until we support postcopy+multifd.
Here it's not only about enabling them together, but about running
them in parallel, which I doubt can really be implemented at all due to
the fundamental conceptual differences between multifd & postcopy... :(
> @@ -4443,6 +4466,9 @@ static int ram_load_precopy(QEMUFile *f)
> break;
> }
> break;
> + case RAM_SAVE_FLAG_MULTIFD_FLUSH:
> + multifd_recv_sync_main();
> + break;
> case RAM_SAVE_FLAG_EOS:
> /* normal exit */
> if (migrate_multifd_flush_after_each_section()) {
> --
> 2.39.1
>
>
--
Peter Xu
* Re: [PATCH v6 2/3] multifd: Protect multifd_send_sync_main() calls
2023-02-15 18:02 ` [PATCH v6 2/3] multifd: Protect multifd_send_sync_main() calls Juan Quintela
@ 2023-02-15 20:06 ` Peter Xu
0 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2023-02-15 20:06 UTC (permalink / raw)
To: Juan Quintela
Cc: qemu-devel, Eric Blake, Dr. David Alan Gilbert, Yanan Wang,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Eduardo Habkost,
Markus Armbruster
On Wed, Feb 15, 2023 at 07:02:30PM +0100, Juan Quintela wrote:
> We only need to do that on the ram_save_iterate() call on sending and
> on destination when we get a RAM_SAVE_FLAG_EOS.
>
> In setup() and complete() we need to synch in both new and old cases,
> so don't add a check there.
>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: Juan Quintela <quintela@redhat.com>
[same issue on SoB]
Acked-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
* Re: [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section
2023-02-15 19:59 ` Peter Xu
@ 2023-02-15 20:13 ` Juan Quintela
2023-02-16 15:15 ` Markus Armbruster
0 siblings, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2023-02-15 20:13 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Eric Blake, Dr. David Alan Gilbert, Yanan Wang,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Eduardo Habkost,
Markus Armbruster
Peter Xu <peterx@redhat.com> wrote:
> On Wed, Feb 15, 2023 at 07:02:29PM +0100, Juan Quintela wrote:
>> We used to flush all channels at the end of each RAM section
>> sent. That is not needed, so preparing to only flush after a full
>> iteration through all the RAM.
>>
>> Default value of the property is false. But we return "true" in
>> migrate_multifd_flush_after_each_section() until we implement the code
>> in following patches.
>>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>
> This line can be dropped, after (I assume) git commit helped to add the
> other one below. :)
Grr, git and trailers are always so much fun. Will try to fix them (again).
>
> Normally that's also why I put R-bs before my SoB because I should have two
> SoB but then I merge them into the last; git is happy with that too.
>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>
> Acked-by: Peter Xu <peterx@redhat.com>
Thanks.
> But some nitpicks below (I'll leave those to you to decide whether to
> rework or keep them as is..).
>
>>
>> ---
>>
>> Rename each-iteration to after-each-section
>> Rename multifd-sync-after-each-section to
>> multifd-flush-after-each-section
>> ---
>> qapi/migration.json | 21 ++++++++++++++++++++-
>> migration/migration.h | 1 +
>> hw/core/machine.c | 1 +
>> migration/migration.c | 17 +++++++++++++++--
>> 4 files changed, 37 insertions(+), 3 deletions(-)
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index c84fa10e86..3afd81174d 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -478,6 +478,24 @@
>> # should not affect the correctness of postcopy migration.
>> # (since 7.1)
>> #
>> +# @multifd-flush-after-each-section: flush every channel after each
>> +# section sent. This assures that
>> +# we can't mix pages from one
>> +# iteration through ram pages with
>> +# pages for the following
>> +# iteration. We really only need
>> +# to do this flush after we have go
>> +# through all the dirty pages.
>> +# For historical reasons, we do
>> +# that after each section. This is
>> +# suboptimal (we flush too many
>> +# times).
>> +# Default value is false.
>> +# Setting this capability has no
>> +# effect until the patch that
>> +# removes this comment.
>> +# (since 8.0)
>
> IMHO the core of this new "cap" is the new RAM_SAVE_FLAG_MULTIFD_FLUSH bit
> in the stream protocol, but it's not referenced here. I would suggest
> simplify the content but highlight the core change:
Actually it is the other way around. What this capability will do is
_NOT_ use RAM_SAVE_FLAG_MULTIFD_FLUSH protocol.
> @multifd-lazy-flush: When enabled, multifd will only do sync flush after
> each whole round of bitmap scan. Otherwise it'll be
> done per RAM save iteration (which happens with a much
> higher frequency).
>
> Please consider enable this as long as possible, and
> keep this off only if any of the src/dst QEMU binary
> doesn't support it.
>
> This capability is bound to the new RAM save flag
> RAM_SAVE_FLAG_MULTIFD_FLUSH, the new flag will only
> be used and recognized when this feature bit set.
Name is wrong. It would be multifd-non-lazy-flush. And I don't like
negatives. Real name is:
multifd-I-messed-and-flush-too-many-times.
> I know you dislike multifd-lazy-flush, but that's still the best I can come
> up with when writting this (yeah I still like it :-p), please bare with me
> and take whatever you think the best.
Libvirt assumes that all capabilities are false unless enabled.
We want RAM_SAVE_FLAG_MULTIFD_FLUSH by default (in new machine types).
So, instead of
capability_use_new_way = true
we change that to
capability_use_old_way = true
and then the default value of false is what we want.
>> +bool migrate_multifd_flush_after_each_section(void)
>> +{
>> + MigrationState *s = migrate_get_current();
>> +
>> + /*
>> + * Until the patch that remove this comment, we always return that
>> + * the capability is enabled.
>> + */
>> + return true || s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
>
> (I'd rather not care about what happens if someone applies only this patch
> not the latter two by dropping "true ||" directly here, but again, no a
> huge deal)
>
>> +}
It doesn't matter at all.
Patch 1: Introduces the code for the capability, but nothing else.
It makes the capability always be true (remember, old code).
Patch 2: Makes the old code conditional on this capability (note that we
have forced it to be true).
Patch 3: Introduces the code for the new behaviour, "protected" by the
capability being false. We can then remove the "trick" we just had
(a condensed sketch of that helper follows below).
See the discussion on v5 with Markus for more details.
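To make that ordering concrete, this is roughly what the transitional
helper from patch 1 looks like (copied from the diff above); dropping the
"true ||" is exactly what patch 3 does:

bool migrate_multifd_flush_after_each_section(void)
{
    MigrationState *s = migrate_get_current();

    /* Forced on until patch 3; after that the capability decides. */
    return true ||
        s->enabled_capabilities[MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION];
}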
Later, Juan.
>> +
>> bool migrate_pause_before_switchover(void)
>> {
>> MigrationState *s;
>> @@ -4535,7 +4547,8 @@ static Property migration_properties[] = {
>> DEFINE_PROP_MIG_CAP("x-zero-copy-send",
>> MIGRATION_CAPABILITY_ZERO_COPY_SEND),
>> #endif
>> -
>> + DEFINE_PROP_MIG_CAP("multifd-flush-after-each-section",
>> + MIGRATION_CAPABILITY_MULTIFD_FLUSH_AFTER_EACH_SECTION),
>> DEFINE_PROP_END_OF_LIST(),
>> };
>>
>> --
>> 2.39.1
>>
>>
* Re: [PATCH v6 3/3] multifd: Only flush once each full round of memory
2023-02-15 20:05 ` Peter Xu
@ 2023-02-16 11:00 ` Juan Quintela
2023-02-16 16:44 ` Peter Xu
0 siblings, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2023-02-16 11:00 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Eric Blake, Dr. David Alan Gilbert, Yanan Wang,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Eduardo Habkost,
Markus Armbruster
Peter Xu <peterx@redhat.com> wrote:
> On Wed, Feb 15, 2023 at 07:02:31PM +0100, Juan Quintela wrote:
>> We need to add a new flag to mean to flush at that point.
>> Notice that we still flush at the end of setup and at the end of
>> complete stages.
>>
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>
> Acked-by: Peter Xu <peterx@redhat.com>
>
> One nitpick below.
Thanks.
>> @@ -4169,7 +4190,9 @@ int ram_load_postcopy(QEMUFile *f, int channel)
>> }
>> decompress_data_with_multi_threads(f, page_buffer, len);
>> break;
>> -
>> + case RAM_SAVE_FLAG_MULTIFD_FLUSH:
>> + multifd_recv_sync_main();
>> + break;
>> case RAM_SAVE_FLAG_EOS:
>> /* normal exit */
>> if (migrate_multifd_flush_after_each_section()) {
>
> We could have dropped RAM_SAVE_FLAG_MULTIFD_FLUSH and RAM_SAVE_FLAG_EOS for
> now until we support postcopy+multifd.
I don't think so.
We have this curse of biblical proportions called backwards compatibility.
We need to mark the beginning and end of sections. That is independent
of multifd.
And for multifd we have to flush all channels at the end of each
iteration through RAM. We could do that without involving the main
thread, but I don't see the point of doing that.
> Here it's not only about enabling them together, but it's about running
> them in parallel, which I doubt whether it can really be implemented at all
> due to the fundamentally concepts difference between multifd & postcopy.. :(
Later, Juan.
* Re: [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section
2023-02-15 20:13 ` Juan Quintela
@ 2023-02-16 15:15 ` Markus Armbruster
2023-02-16 17:13 ` Juan Quintela
0 siblings, 1 reply; 13+ messages in thread
From: Markus Armbruster @ 2023-02-16 15:15 UTC (permalink / raw)
To: Juan Quintela
Cc: Peter Xu, qemu-devel, Eric Blake, Dr. David Alan Gilbert,
Yanan Wang, Marcel Apfelbaum, Philippe Mathieu-Daudé,
Eduardo Habkost, Markus Armbruster
Juan Quintela <quintela@redhat.com> writes:
> Peter Xu <peterx@redhat.com> wrote:
>> On Wed, Feb 15, 2023 at 07:02:29PM +0100, Juan Quintela wrote:
>>> We used to flush all channels at the end of each RAM section
>>> sent. That is not needed, so preparing to only flush after a full
>>> iteration through all the RAM.
>>>
>>> Default value of the property is false. But we return "true" in
>>> migrate_multifd_flush_after_each_section() until we implement the code
>>> in following patches.
>>>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>
>> This line can be dropped, after (I assume) git commit helped to add the
>> other one below. :)
>
> Gree, git and trailers is always so much fun. Will try to fix them (again)
>
>>
>> Normally that's also why I put R-bs before my SoB because I should have two
>> SoB but then I merge them into the last; git is happy with that too.
>>
>>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>>
>> Acked-by: Peter Xu <peterx@redhat.com>
>
> Thanks.
>
>> But some nitpicks below (I'll leave those to you to decide whether to
>> rework or keep them as is..).
>>
>>>
>>> ---
>>>
>>> Rename each-iteration to after-each-section
>>> Rename multifd-sync-after-each-section to
>>> multifd-flush-after-each-section
>>> ---
>>> qapi/migration.json | 21 ++++++++++++++++++++-
>>> migration/migration.h | 1 +
>>> hw/core/machine.c | 1 +
>>> migration/migration.c | 17 +++++++++++++++--
>>> 4 files changed, 37 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>> index c84fa10e86..3afd81174d 100644
>>> --- a/qapi/migration.json
>>> +++ b/qapi/migration.json
>>> @@ -478,6 +478,24 @@
>>> # should not affect the correctness of postcopy migration.
>>> # (since 7.1)
>>> #
>>> +# @multifd-flush-after-each-section: flush every channel after each
>>> +# section sent. This assures that
>>> +# we can't mix pages from one
>>> +# iteration through ram pages with
RAM
>>> +# pages for the following
>>> +# iteration. We really only need
>>> +# to do this flush after we have go
to flush after we have gone
>>> +# through all the dirty pages.
>>> +# For historical reasons, we do
>>> +# that after each section. This is
we flush after each section
>>> +# suboptimal (we flush too many
>>> +# times).
inefficient: we flush too often.
>>> +# Default value is false.
>>> +# Setting this capability has no
>>> +# effect until the patch that
>>> +# removes this comment.
>>> +# (since 8.0)
>>
>> IMHO the core of this new "cap" is the new RAM_SAVE_FLAG_MULTIFD_FLUSH bit
>> in the stream protocol, but it's not referenced here. I would suggest
>> simplify the content but highlight the core change:
>
> Actually it is the other way around. What this capability will do is
> _NOT_ use RAM_SAVE_FLAG_MULTIFD_FLUSH protocol.
>
>> @multifd-lazy-flush: When enabled, multifd will only do sync flush after
Spell out "synchronous".
>> each whole round of bitmap scan. Otherwise it'll be
Suggest to scratch "whole".
>> done per RAM save iteration (which happens with a much
>> higher frequency).
Less detail than Juan's version. I'm not sure how much detail is
appropriate for QMP reference documentation.
>> Please consider enable this as long as possible, and
>> keep this off only if any of the src/dst QEMU binary
>> doesn't support it.
Clear guidance on how to use it, good!
Perhaps state it more forcefully: "Enable this when both source and
destination support it."
>>
>> This capability is bound to the new RAM save flag
>> RAM_SAVE_FLAG_MULTIFD_FLUSH, the new flag will only
>> be used and recognized when this feature bit set.
Is RAM_SAVE_FLAG_MULTIFD_FLUSH visible in the QMP interface? Or in the
migration stream?
I'm asking because doc comments are QMP reference documentation, but
when writing them, it's easy to mistake them for internal documentation,
because, well, they're comments.
> Name is wrong. It would be multifd-non-lazy-flush. And I don't like
> negatives. Real name is:
>
> multifd-I-messed-and-flush-too-many-times.
If you don't like "non-lazy", say "eager".
>> I know you dislike multifd-lazy-flush, but that's still the best I can come
>> up with when writting this (yeah I still like it :-p), please bare with me
>> and take whatever you think the best.
>
> Libvirt assumes that all capabilities are false except if enabled.
> We want RAM_SAVE_FLAG_MULTFD_FLUSH by default (in new machine types).
>
> So, if we can do
>
> capability_use_new_way = true
>
> We change that to
>
> capability_use_old_way = true
>
> And then by default with false value is what we want.
Eventually, all supported migration peers will support lazy flush. What
then? Will we flip the default? Or will we ignore the capability and
always flush lazily?
[...]
* Re: [PATCH v6 3/3] multifd: Only flush once each full round of memory
2023-02-16 11:00 ` Juan Quintela
@ 2023-02-16 16:44 ` Peter Xu
0 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2023-02-16 16:44 UTC (permalink / raw)
To: Juan Quintela
Cc: qemu-devel, Eric Blake, Dr. David Alan Gilbert, Yanan Wang,
Marcel Apfelbaum, Philippe Mathieu-Daudé, Eduardo Habkost,
Markus Armbruster
On Thu, Feb 16, 2023 at 12:00:55PM +0100, Juan Quintela wrote:
> >> @@ -4169,7 +4190,9 @@ int ram_load_postcopy(QEMUFile *f, int channel)
> >> }
> >> decompress_data_with_multi_threads(f, page_buffer, len);
> >> break;
> >> -
> >> + case RAM_SAVE_FLAG_MULTIFD_FLUSH:
> >> + multifd_recv_sync_main();
> >> + break;
> >> case RAM_SAVE_FLAG_EOS:
> >> /* normal exit */
> >> if (migrate_multifd_flush_after_each_section()) {
> >
> > We could have dropped RAM_SAVE_FLAG_MULTIFD_FLUSH and RAM_SAVE_FLAG_EOS for
> > now until we support postcopy+multifd.
>
> I don't think so.
>
> We have this curse of biblical proportions called backwards compatibility.
>
> We need to mark the beginning and end of sections. That is independent
> of multifd.
> And for multifd we have to flush all channels at the end of each
> iteration through RAM. We could do that without involving the main
> thread, but I don't see the point of doing that.
Oops, sorry, I didn't mean to drop the RAM_SAVE_FLAG_EOS flag itself, but
the calls to multifd_recv_sync_main().
RAM_SAVE_FLAG_MULTIFD_FLUSH as a whole can be dropped.
Thanks,
--
Peter Xu
* Re: [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section
2023-02-16 15:15 ` Markus Armbruster
@ 2023-02-16 17:13 ` Juan Quintela
2023-02-17 5:53 ` Markus Armbruster
0 siblings, 1 reply; 13+ messages in thread
From: Juan Quintela @ 2023-02-16 17:13 UTC (permalink / raw)
To: Markus Armbruster
Cc: Peter Xu, qemu-devel, Eric Blake, Dr. David Alan Gilbert,
Yanan Wang, Marcel Apfelbaum, Philippe Mathieu-Daudé,
Eduardo Habkost
Markus Armbruster <armbru@redhat.com> wrote:
> Juan Quintela <quintela@redhat.com> writes:
>
>>>> @@ -478,6 +478,24 @@
>>>> # should not affect the correctness of postcopy migration.
>>>> # (since 7.1)
>>>> #
>>>> +# @multifd-flush-after-each-section: flush every channel after each
>>>> +# section sent. This assures that
>>>> +# we can't mix pages from one
>>>> +# iteration through ram pages with
>
> RAM
OK.
>>>> +# pages for the following
>>>> +# iteration. We really only need
>>>> +# to do this flush after we have go
>
> to flush after we have gone
OK
>>>> +# through all the dirty pages.
>>>> +# For historical reasons, we do
>>>> +# that after each section. This is
> we flush after each section
OK
>>>> +# suboptimal (we flush too many
>>>> +# times).
> inefficient: we flush too often.
OK
>>>> +# Default value is false.
>>>> +# Setting this capability has no
>>>> +# effect until the patch that
>>>> +# removes this comment.
>>>> +# (since 8.0)
>>>
>>> IMHO the core of this new "cap" is the new RAM_SAVE_FLAG_MULTIFD_FLUSH bit
>>> in the stream protocol, but it's not referenced here. I would suggest
>>> simplify the content but highlight the core change:
>>
>> Actually it is the other way around. What this capability will do is
>> _NOT_ use RAM_SAVE_FLAG_MULTIFD_FLUSH protocol.
>>
>>> @multifd-lazy-flush: When enabled, multifd will only do sync flush after
>
> Spell out "synchronous".
ok.
>>> each whole round of bitmap scan. Otherwise it'll be
>
> Suggest to scratch "whole".
ok.
>>> done per RAM save iteration (which happens with a much
>>> higher frequency).
>
> Less detail than Juan's version. I'm not sure how much detail is
> appropriate for QMP reference documentation.
>
>>> Please consider enable this as long as possible, and
>>> keep this off only if any of the src/dst QEMU binary
>>> doesn't support it.
>
> Clear guidance on how to use it, good!
>
> Perhaps state it more forcefully: "Enable this when both source and
> destination support it."
>
>>>
>>> This capability is bound to the new RAM save flag
>>> RAM_SAVE_FLAG_MULTIFD_FLUSH, the new flag will only
>>> be used and recognized when this feature bit set.
>
> Is RAM_SAVE_FLAG_MULTIFD_FLUSH visible in the QMP interface? Or in the
> migration stream?
No. Only in the migration stream.
> I'm asking because doc comments are QMP reference documentation, but
> when writing them, it's easy to mistake them for internal documentation,
> because, well, they're comments.
>> Name is wrong. It would be multifd-non-lazy-flush. And I don't like
>> negatives. Real name is:
>>
>> multifd-I-messed-and-flush-too-many-times.
>
> If you don't like "non-lazy", say "eager".
More than eager, it is unnecessary.
>>> I know you dislike multifd-lazy-flush, but that's still the best I can come
>>> up with when writting this (yeah I still like it :-p), please bare with me
>>> and take whatever you think the best.
>>
>> Libvirt assumes that all capabilities are false except if enabled.
>> We want RAM_SAVE_FLAG_MULTFD_FLUSH by default (in new machine types).
>>
>> So, if we can do
>>
>> capability_use_new_way = true
>>
>> We change that to
>>
>> capability_use_old_way = true
>>
>> And then by default with false value is what we want.
>
> Eventually, all supported migration peers will support lazy flush. What
> then? Will we flip the default? Or will we ignore the capability and
> always flush lazily?
I have to take a step back. Bear with me.
This is how we fix problems in migration that make the stream
incompatible: we create a property.
static Property migration_properties[] = {
...
DEFINE_PROP_BOOL("decompress-error-check", MigrationState,
decompress_error_check, true),
....
}
In this case it is true by default.
GlobalProperty hw_compat_2_12[] = {
{ "migration", "decompress-error-check", "off" },
...
};
We introduce it on whatever machine type is newer than 2_12.
Then we make it "off" for older machine types; that way we make sure
that migration from an old QEMU to a new QEMU works.
And we can even let libvirt, if it knows that both QEMUs are new,
set the property to true even for old machine types.
So, what we have:
Machine 2_13 and newer use the new code.
Machine 2_12 and older use the old code (by default).
We _can_ migrate machine 2_12 with the new code, but we need to set it
up correctly on both sides.
We can run the old code with machine type 2_13, but I admit that that
is only useful for testing, debugging, measuring performance, etc.
So, the idea here is that we flush a lot of times for old machine types,
and we only flush when needed for new machine types. Libvirt (or
whoever) can use the new method if it sees fit just using the
capability.
Now that I am spelling this out, I could switch back to a property
instead of a capability (see the sketch below):
- I can have any default value that I want
- So I can name it multifd_lazy_flush or whatever.
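A hypothetical sketch of that property-based variant, following the
decompress-error-check pattern above; the property name, the
MigrationState field, and the choice of hw_compat_7_0 are illustrative
only, not part of this series:

static Property migration_properties[] = {
    /* With a property (unlike a capability) the default can be true,
     * so the positive name works: new machine types flush lazily. */
    DEFINE_PROP_BOOL("multifd-lazy-flush", MigrationState,
                     multifd_lazy_flush, true),
    DEFINE_PROP_END_OF_LIST(),
};

GlobalProperty hw_compat_7_0[] = {
    /* Old machine types keep the historical per-section flush. */
    { "migration", "multifd-lazy-flush", "off" },
};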
Later, Juan.
* Re: [PATCH v6 1/3] multifd: Create property multifd-flush-after-each-section
2023-02-16 17:13 ` Juan Quintela
@ 2023-02-17 5:53 ` Markus Armbruster
0 siblings, 0 replies; 13+ messages in thread
From: Markus Armbruster @ 2023-02-17 5:53 UTC (permalink / raw)
To: Juan Quintela
Cc: Peter Xu, qemu-devel, Eric Blake, Dr. David Alan Gilbert,
Yanan Wang, Marcel Apfelbaum, Philippe Mathieu-Daudé,
Eduardo Habkost
Juan Quintela <quintela@redhat.com> writes:
> Markus Armbruster <armbru@redhat.com> wrote:
>> Juan Quintela <quintela@redhat.com> writes:
>>
>>>>> @@ -478,6 +478,24 @@
>>>>> # should not affect the correctness of postcopy migration.
>>>>> # (since 7.1)
>>>>> #
>>>>> +# @multifd-flush-after-each-section: flush every channel after each
>>>>> +# section sent. This assures that
>>>>> +# we can't mix pages from one
>>>>> +# iteration through ram pages with
>>
>> RAM
>
> OK.
>
>>>>> +# pages for the following
>>>>> +# iteration. We really only need
>>>>> +# to do this flush after we have go
>>
>> to flush after we have gone
>
> OK
>
>>>>> +# through all the dirty pages.
>>>>> +# For historical reasons, we do
>>>>> +# that after each section. This is
>
>> we flush after each section
>
> OK
>
>>>>> +# suboptimal (we flush too many
>>>>> +# times).
>
>> inefficient: we flush too often.
>
> OK
>
>>>>> +# Default value is false.
>>>>> +# Setting this capability has no
>>>>> +# effect until the patch that
>>>>> +# removes this comment.
>>>>> +# (since 8.0)
>>>>
>>>> IMHO the core of this new "cap" is the new RAM_SAVE_FLAG_MULTIFD_FLUSH bit
>>>> in the stream protocol, but it's not referenced here. I would suggest
>>>> simplify the content but highlight the core change:
>>>
>>> Actually it is the other way around. What this capability will do is
>>> _NOT_ use RAM_SAVE_FLAG_MULTIFD_FLUSH protocol.
>>>
>>>> @multifd-lazy-flush: When enabled, multifd will only do sync flush after
>>
>> Spell out "synchronous".
>
> ok.
>
>>>> each whole round of bitmap scan. Otherwise it'll be
>>
>> Suggest to scratch "whole".
>
> ok.
>
>>>> done per RAM save iteration (which happens with a much
>>>> higher frequency).
>>
>> Less detail than Juan's version. I'm not sure how much detail is
>> appropriate for QMP reference documentation.
>>
>>>> Please consider enable this as long as possible, and
>>>> keep this off only if any of the src/dst QEMU binary
>>>> doesn't support it.
>>
>> Clear guidance on how to use it, good!
>>
>> Perhaps state it more forcefully: "Enable this when both source and
>> destination support it."
>>
>>>>
>>>> This capability is bound to the new RAM save flag
>>>> RAM_SAVE_FLAG_MULTIFD_FLUSH, the new flag will only
>>>> be used and recognized when this feature bit set.
>>
>> Is RAM_SAVE_FLAG_MULTIFD_FLUSH visible in the QMP interface? Or in the
>> migration stream?
>
> No. Only migration stream.
Doc comments should be written for readers of the QEMU QMP Reference
Manual. Is RAM_SAVE_FLAG_MULTIFD_FLUSH relevant for them?
Perhaps the relevant part is "the peer needs to enable this capability
too, or else", for a value of "or else".
What happens when the source enables, and the destination doesn't?
What happens when the destination enables, and the source doesn't?
Any particular reason for having the destination recognize the flag only
when the capability is enabled?
>> I'm asking because doc comments are QMP reference documentation, but
>> when writing them, it's easy to mistake them for internal documentation,
>> because, well, they're comments.
>
>>> Name is wrong. It would be multifd-non-lazy-flush. And I don't like
>>> negatives. Real name is:
>>>
>>> multifd-I-messed-and-flush-too-many-times.
>>
>> If you don't like "non-lazy", say "eager".
>
> more than eager it is unnecesary.
"overeager"?
>>>> I know you dislike multifd-lazy-flush, but that's still the best I can come
>>>> up with when writting this (yeah I still like it :-p), please bare with me
>>>> and take whatever you think the best.
>>>
>>> Libvirt assumes that all capabilities are false except if enabled.
>>> We want RAM_SAVE_FLAG_MULTFD_FLUSH by default (in new machine types).
>>>
>>> So, if we can do
>>>
>>> capability_use_new_way = true
>>>
>>> We change that to
>>>
>>> capability_use_old_way = true
>>>
>>> And then by default with false value is what we want.
>>
>> Eventually, all supported migration peers will support lazy flush. What
>> then? Will we flip the default? Or will we ignore the capability and
>> always flush lazily?
>
> I have to take a step back. Cope with me.
>
> How we fix problems in migration that make the stream incompatible.
> We create a property.
>
> static Property migration_properties[] = {
> ...
> DEFINE_PROP_BOOL("decompress-error-check", MigrationState,
> decompress_error_check, true),
> ....
> }
>
> In this case it is true by default.
>
> GlobalProperty hw_compat_2_12[] = {
> { "migration", "decompress-error-check", "off" },
> ...
> };
>
> We introduced it on whatever machine that is newer than 2_12.
> Then we make it "off" for older machine types, that way we make sure
> that migration from old qemu to new qemu works.
>
> And we can even let libvirt, if they know that both qemus are new, they
> can setup the property to true even for old machine types.
>
> So, what we have:
>
> Machine 2_13 and newer use the new code.
> Machine 2_12 and older use the old code (by default).
> We _can_ migrate machine 2_12 with new code, but we need to setup it
> correctly on both sides.
> We can run the old code with machine type 2_13. But I admit than that
> is only useful for testing, debugging, measuring performance, etc.
>
> So, the idea here is that we flush a lot of times for old machine types,
> and we only flush when needed for new machine types. Libvirt (or
> whoever) can use the new method if it sees fit just using the
> capability.
Got it.
What should be done, if anything, when all machines defaulting to the
old code are gone? Getting rid of the old code along with the
capability is desirable, isn't it? But if the capability is a stable
interface, the deprecation process applies. Should it be stable? Or
should it be just something we use to do the right thing depending on
the machine types (primary purpose), and also enable experimentation
(secondary purpose)?
> Now that I am telling this, I can switch back to a property instead of a
> capability:
> - I can have the any default value that I want
> - So I can name it multifd_lazy_flush or whatever.
>
> Later, Juan.