qemu-devel.nongnu.org archive mirror
* [PULL 1/7] multifd: Implement yank for multifd send side
  2021-09-09 10:33 Juan Quintela
@ 2021-09-09 10:33 ` Juan Quintela
  0 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2021-09-09 10:33 UTC
  To: qemu-devel
  Cc: Lukas Straub, Dr. David Alan Gilbert, Leonardo Bras,
	Juan Quintela

From: Lukas Straub <lukasstraub2@web.de>

To: qemu-devel <qemu-devel@nongnu.org>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Juan Quintela
 <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Leonardo Bras Soares
 Passos <lsoaresp@redhat.com>
Date: Wed, 1 Sep 2021 17:58:57 +0200

When introducing yank functionality in the migration code I forgot
to cover the multifd send side.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Tested-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Leonardo Bras <leobras@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h | 2 ++
 migration/multifd.c | 6 +++++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 8d6751f5ed..16c4d112d1 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -85,6 +85,8 @@ typedef struct {
     bool running;
     /* should this thread finish */
     bool quit;
+    /* is the yank function registered */
+    bool registered_yank;
     /* thread has work to do */
     int pending_job;
     /* array of pages to sent */
diff --git a/migration/multifd.c b/migration/multifd.c
index 377da78f5b..5a4f158f3c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -546,6 +546,9 @@ void multifd_save_cleanup(void)
         MultiFDSendParams *p = &multifd_send_state->params[i];
         Error *local_err = NULL;
 
+        if (p->registered_yank) {
+            migration_ioc_unregister_yank(p->c);
+        }
         socket_send_channel_destroy(p->c);
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);
@@ -813,7 +816,8 @@ static bool multifd_channel_connect(MultiFDSendParams *p,
                 return false;
             }
         } else {
-            /* update for tls qio channel */
+            migration_ioc_register_yank(ioc);
+            p->registered_yank = true;
             p->c = ioc;
             qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
                                    QEMU_THREAD_JOINABLE);
-- 
2.31.1
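
The bookkeeping this patch adds can be shown in isolation. The sketch below is a minimal illustration, not QEMU code: Channel, SendParams, channel_register_yank() and channel_unregister_yank() are hypothetical stand-ins for the QIOChannel, MultiFDSendParams and migration_ioc_*_yank() helpers touched by the diff. Clearing the flag during cleanup is an extra assumption of the sketch; the patch itself does not do that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a QIOChannel; not QEMU's real type. */
typedef struct Channel {
    int yank_registrations;   /* number of yank handlers pointing here */
} Channel;

/* Hypothetical stand-in for MultiFDSendParams. */
typedef struct SendParams {
    Channel *c;
    bool registered_yank;     /* mirrors the flag added by the patch */
} SendParams;

static void channel_register_yank(Channel *c)
{
    c->yank_registrations++;
}

static void channel_unregister_yank(Channel *c)
{
    /* An unbalanced unregister would be a bug in the caller. */
    assert(c->yank_registrations > 0);
    c->yank_registrations--;
}

/* Connect path: register the handler, then remember that we did. */
static void send_channel_connect(SendParams *p, Channel *c)
{
    channel_register_yank(c);
    p->registered_yank = true;
    p->c = c;
}

/* Cleanup path: only unregister what was actually registered. */
static void send_channel_cleanup(SendParams *p)
{
    if (p->registered_yank) {
        channel_unregister_yank(p->c);
        p->registered_yank = false;   /* assumption: not in the patch */
    }
    p->c = NULL;
}
```

The flag matters because cleanup can run for a channel that never reached the connect path (for example after a failed TLS handshake); in that case there is nothing to unregister.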




* [PULL 0/7] Migration.next patches
@ 2021-10-19  9:29 Juan Quintela
  2021-10-19  9:29 ` [PULL 1/7] multifd: Implement yank for multifd send side Juan Quintela
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Juan Quintela @ 2021-10-19  9:29 UTC
  To: qemu-devel; +Cc: Dr. David Alan Gilbert, Juan Quintela

The following changes since commit 362534a643b4a34bcb223996538ce9de5cdab946:

  Merge remote-tracking branch 'remotes/bsdimp/tags/pull-bsd-user-20211018-pull-request' into staging (2021-10-18 12:17:24 -0700)

are available in the Git repository at:

  https://github.com/juanquintela/qemu.git tags/migration.next-pull-request

for you to fetch changes up to 911965ace9386e35ca022a65bb45a32fd421af3e:

  migration/rdma: advise prefetch write for ODP region (2021-10-19 08:39:04 +0200)

----------------------------------------------------------------
Migration Pull request (3rd try)

Hi

This should fix all the FreeBSD problems.

Please apply,

----------------------------------------------------------------

David Hildenbrand (1):
  migration/ram: Don't passs RAMState to
    migration_clear_memory_region_dirty_bitmap_*()

Li Zhijian (4):
  migration: allow multifd for socket protocol only
  migration: allow enabling mutilfd for specific protocol only
  migration/rdma: Try to register On-Demand Paging memory region
  migration/rdma: advise prefetch write for ODP region

Lukas Straub (2):
  multifd: Implement yank for multifd send side
  multifd: Unconditionally unregister yank function

 meson.build            |   6 +++
 migration/multifd.h    |   4 ++
 migration/migration.c  |  12 +++++
 migration/multifd.c    |  35 ++++++++++---
 migration/ram.c        |  13 ++---
 migration/rdma.c       | 113 ++++++++++++++++++++++++++++++++++-------
 migration/trace-events |   2 +
 7 files changed, 151 insertions(+), 34 deletions(-)

-- 
2.31.1





* [PULL 2/7] multifd: Unconditionally unregister yank function
  2021-10-19  9:29 [PULL 0/7] Migration.next patches Juan Quintela
  2021-10-19  9:29 ` [PULL 1/7] multifd: Implement yank for multifd send side Juan Quintela
@ 2021-10-19  9:29 ` Juan Quintela
  2021-10-19  9:29 ` [PULL 3/7] migration/ram: Don't passs RAMState to migration_clear_memory_region_dirty_bitmap_*() Juan Quintela
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2021-10-19  9:29 UTC
  To: qemu-devel; +Cc: Lukas Straub, Dr. David Alan Gilbert, Juan Quintela

From: Lukas Straub <lukasstraub2@web.de>

To: qemu-devel <qemu-devel@nongnu.org>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>, Juan Quintela
 <quintela@redhat.com>, Peter Xu <peterx@redhat.com>, Leonardo Bras Soares
 Passos <lsoaresp@redhat.com>
Date: Wed, 4 Aug 2021 21:26:32 +0200

Unconditionally unregister the yank function in multifd_load_cleanup().
If it is not unregistered here, it will leak and cause a crash
in yank_unregister_instance(). If the ioc is still in use
afterwards, the only consequence is that QEMU cannot recover
from a hang related to that ioc.

After checking the code, I am pretty sure that ref is always 1
when arriving here, so all this currently does is remove the
unneeded check.

Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 5a4f158f3c..efd424bc97 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -991,10 +991,7 @@ int multifd_load_cleanup(Error **errp)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDRecvParams *p = &multifd_recv_state->params[i];
 
-        if (OBJECT(p->c)->ref == 1) {
-            migration_ioc_unregister_yank(p->c);
-        }
-
+        migration_ioc_unregister_yank(p->c);
         object_unref(OBJECT(p->c));
         p->c = NULL;
         qemu_mutex_destroy(&p->mutex);
-- 
2.31.1
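
The ordering argued for above (always unregister, then drop the reference) can be sketched with a toy refcounted object. Everything here is hypothetical and only illustrates the idea: Obj stands in for the channel object, and the one-slot yank_target variable stands in for the yank registry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted channel object. */
typedef struct Obj {
    int ref;
} Obj;

/* Hypothetical one-slot yank registry. */
static Obj *yank_target;

static void yank_register(Obj *o)
{
    yank_target = o;
}

static void yank_unregister(Obj *o)
{
    if (yank_target == o) {
        yank_target = NULL;
    }
}

static bool yank_registry_empty(void)
{
    return yank_target == NULL;
}

static Obj *obj_new(void)
{
    Obj *o = calloc(1, sizeof(*o));
    assert(o != NULL);
    o->ref = 1;
    return o;
}

static void obj_ref(Obj *o)
{
    o->ref++;
}

static void obj_unref(Obj *o)
{
    if (--o->ref == 0) {
        free(o);
    }
}

/* Cleanup as in the patch: unregister regardless of the reference count. */
static void recv_channel_cleanup(Obj **slot)
{
    yank_unregister(*slot);
    obj_unref(*slot);
    *slot = NULL;
}
```

Even when another holder keeps the object alive (ref > 1), the registry ends up empty, so tearing down the registry later does not find a stale entry.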




* [PULL 3/7] migration/ram: Don't passs RAMState to migration_clear_memory_region_dirty_bitmap_*()
  2021-10-19  9:29 [PULL 0/7] Migration.next patches Juan Quintela
  2021-10-19  9:29 ` [PULL 1/7] multifd: Implement yank for multifd send side Juan Quintela
  2021-10-19  9:29 ` [PULL 2/7] multifd: Unconditionally unregister yank function Juan Quintela
@ 2021-10-19  9:29 ` Juan Quintela
  2021-10-19  9:29 ` [PULL 4/7] migration: allow multifd for socket protocol only Juan Quintela
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2021-10-19  9:29 UTC
  To: qemu-devel
  Cc: David Hildenbrand, Philippe Mathieu-Daudé,
	Dr. David Alan Gilbert, Peter Xu, Juan Quintela

From: David Hildenbrand <david@redhat.com>

The parameter is unused, let's drop it.

Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/ram.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 7a43bfd7af..bb908822d5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -789,8 +789,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
     return find_next_bit(bitmap, size, start);
 }
 
-static void migration_clear_memory_region_dirty_bitmap(RAMState *rs,
-                                                       RAMBlock *rb,
+static void migration_clear_memory_region_dirty_bitmap(RAMBlock *rb,
                                                        unsigned long page)
 {
     uint8_t shift;
@@ -818,8 +817,7 @@ static void migration_clear_memory_region_dirty_bitmap(RAMState *rs,
 }
 
 static void
-migration_clear_memory_region_dirty_bitmap_range(RAMState *rs,
-                                                 RAMBlock *rb,
+migration_clear_memory_region_dirty_bitmap_range(RAMBlock *rb,
                                                  unsigned long start,
                                                  unsigned long npages)
 {
@@ -832,7 +830,7 @@ migration_clear_memory_region_dirty_bitmap_range(RAMState *rs,
      * exclusive.
      */
     for (i = chunk_start; i < chunk_end; i += chunk_pages) {
-        migration_clear_memory_region_dirty_bitmap(rs, rb, i);
+        migration_clear_memory_region_dirty_bitmap(rb, i);
     }
 }
 
@@ -850,7 +848,7 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs,
      * the page in the chunk we clear the remote dirty bitmap for all.
      * Clearing it earlier won't be a problem, but too late will.
      */
-    migration_clear_memory_region_dirty_bitmap(rs, rb, page);
+    migration_clear_memory_region_dirty_bitmap(rb, page);
 
     ret = test_and_clear_bit(page, rb->bmap);
     if (ret) {
@@ -2777,8 +2775,7 @@ void qemu_guest_free_page_hint(void *addr, size_t len)
          * are initially set. Otherwise those skipped pages will be sent in
          * the next round after syncing from the memory region bitmap.
          */
-        migration_clear_memory_region_dirty_bitmap_range(ram_state, block,
-                                                         start, npages);
+        migration_clear_memory_region_dirty_bitmap_range(block, start, npages);
         ram_state->migration_dirty_pages -=
                       bitmap_count_one_with_offset(block->bmap, start, npages);
         bitmap_clear(block->bmap, start, npages);
-- 
2.31.1




* [PULL 4/7] migration: allow multifd for socket protocol only
  2021-10-19  9:29 [PULL 0/7] Migration.next patches Juan Quintela
                   ` (2 preceding siblings ...)
  2021-10-19  9:29 ` [PULL 3/7] migration/ram: Don't passs RAMState to migration_clear_memory_region_dirty_bitmap_*() Juan Quintela
@ 2021-10-19  9:29 ` Juan Quintela
  2021-10-19  9:29 ` [PULL 5/7] migration: allow enabling mutilfd for specific " Juan Quintela
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2021-10-19  9:29 UTC
  To: qemu-devel; +Cc: Dr. David Alan Gilbert, Li Zhijian, Juan Quintela

From: Li Zhijian <lizhijian@cn.fujitsu.com>

To: <quintela@redhat.com>, <dgilbert@redhat.com>, <qemu-devel@nongnu.org>
CC: Li Zhijian <lizhijian@cn.fujitsu.com>
Date: Sat, 31 Jul 2021 22:05:51 +0800

Using multifd with an unsupported protocol causes a segmentation fault:
(gdb) bt
 #0  0x0000563b4a93faf8 in socket_connect (addr=0x0, errp=0x7f7f02675410) at ../util/qemu-sockets.c:1190
 #1  0x0000563b4a797a03 in qio_channel_socket_connect_sync (ioc=0x563b4d16e8c0, addr=0x0, errp=0x7f7f02675410) at ../io/channel-socket.c:145
 #2  0x0000563b4a797abf in qio_channel_socket_connect_worker (task=0x563b4cd86c30, opaque=0x0) at ../io/channel-socket.c:168
 #3  0x0000563b4a792631 in qio_task_thread_worker (opaque=0x563b4cd86c30) at ../io/task.c:124
 #4  0x0000563b4a91da69 in qemu_thread_start (args=0x563b4c44bb80) at ../util/qemu-thread-posix.c:541
 #5  0x00007f7fe9b5b3f9 in ?? ()
 #6  0x0000000000000000 in ?? ()

It is enough to check migrate_multifd_is_allowed() in the multifd cleanup
and setup functions, even though many other places call migrate_use_multifd().

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/multifd.h   |  2 ++
 migration/migration.c |  4 ++++
 migration/multifd.c   | 24 ++++++++++++++++++++++--
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/migration/multifd.h b/migration/multifd.h
index 16c4d112d1..15c50ca0b2 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -13,6 +13,8 @@
 #ifndef QEMU_MIGRATION_MULTIFD_H
 #define QEMU_MIGRATION_MULTIFD_H
 
+bool migrate_multifd_is_allowed(void);
+void migrate_protocol_allow_multifd(bool allow);
 int multifd_save_setup(Error **errp);
 void multifd_save_cleanup(void);
 int multifd_load_setup(Error **errp);
diff --git a/migration/migration.c b/migration/migration.c
index 6ac807ef3d..f13b07c638 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -453,10 +453,12 @@ static void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
     const char *p = NULL;
 
+    migrate_protocol_allow_multifd(false); /* reset it anyway */
     qapi_event_send_migration(MIGRATION_STATUS_SETUP);
     if (strstart(uri, "tcp:", &p) ||
         strstart(uri, "unix:", NULL) ||
         strstart(uri, "vsock:", NULL)) {
+        migrate_protocol_allow_multifd(true);
         socket_start_incoming_migration(p ? p : uri, errp);
 #ifdef CONFIG_RDMA
     } else if (strstart(uri, "rdma:", &p)) {
@@ -2280,9 +2282,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
         }
     }
 
+    migrate_protocol_allow_multifd(false);
     if (strstart(uri, "tcp:", &p) ||
         strstart(uri, "unix:", NULL) ||
         strstart(uri, "vsock:", NULL)) {
+        migrate_protocol_allow_multifd(true);
         socket_start_outgoing_migration(s, p ? p : uri, &local_err);
 #ifdef CONFIG_RDMA
     } else if (strstart(uri, "rdma:", &p)) {
diff --git a/migration/multifd.c b/migration/multifd.c
index efd424bc97..283f672bf0 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -531,7 +531,7 @@ void multifd_save_cleanup(void)
 {
     int i;
 
-    if (!migrate_use_multifd()) {
+    if (!migrate_use_multifd() || !migrate_multifd_is_allowed()) {
         return;
     }
     multifd_send_terminate_threads(NULL);
@@ -868,6 +868,17 @@ cleanup:
     multifd_new_send_channel_cleanup(p, sioc, local_err);
 }
 
+static bool migrate_allow_multifd;
+void migrate_protocol_allow_multifd(bool allow)
+{
+    migrate_allow_multifd = allow;
+}
+
+bool migrate_multifd_is_allowed(void)
+{
+    return migrate_allow_multifd;
+}
+
 int multifd_save_setup(Error **errp)
 {
     int thread_count;
@@ -878,6 +889,11 @@ int multifd_save_setup(Error **errp)
     if (!migrate_use_multifd()) {
         return 0;
     }
+    if (!migrate_multifd_is_allowed()) {
+        error_setg(errp, "multifd is not supported by current protocol");
+        return -1;
+    }
+
     s = migrate_get_current();
     thread_count = migrate_multifd_channels();
     multifd_send_state = g_malloc0(sizeof(*multifd_send_state));
@@ -971,7 +987,7 @@ int multifd_load_cleanup(Error **errp)
 {
     int i;
 
-    if (!migrate_use_multifd()) {
+    if (!migrate_use_multifd() || !migrate_multifd_is_allowed()) {
         return 0;
     }
     multifd_recv_terminate_threads(NULL);
@@ -1120,6 +1136,10 @@ int multifd_load_setup(Error **errp)
     if (!migrate_use_multifd()) {
         return 0;
     }
+    if (!migrate_multifd_is_allowed()) {
+        error_setg(errp, "multifd is not supported by current protocol");
+        return -1;
+    }
     thread_count = migrate_multifd_channels();
     multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state));
     multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count);
-- 
2.31.1
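
The gating logic in this patch reduces to "reset the flag, enable it only for socket-style URIs, and refuse setup otherwise". A minimal sketch under that reading follows. has_prefix(), select_protocol() and multifd_setup() are hypothetical names for this illustration (QEMU itself uses strstart() and the functions shown in the diff), and the URIs in the usage note are examples only.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the static flag the patch adds to multifd.c. */
static bool allow_multifd;

/* Hypothetical helper: true when uri begins with prefix. */
static bool has_prefix(const char *uri, const char *prefix)
{
    return strncmp(uri, prefix, strlen(prefix)) == 0;
}

/* Reset the flag, then enable it only for socket-style transports. */
static void select_protocol(const char *uri)
{
    assert(uri != NULL);
    allow_multifd = false;
    if (has_prefix(uri, "tcp:") ||
        has_prefix(uri, "unix:") ||
        has_prefix(uri, "vsock:")) {
        allow_multifd = true;
    }
}

/*
 * Setup: returns 0 when multifd is off or permitted, and -1 when the
 * selected protocol does not support it.
 */
static int multifd_setup(bool use_multifd)
{
    if (!use_multifd) {
        return 0;
    }
    if (!allow_multifd) {
        return -1;
    }
    return 0;
}
```

With this shape, select_protocol("rdma:...") followed by multifd_setup(true) fails cleanly instead of reaching a socket connect with a NULL address, which is the crash shown in the backtrace.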




* [PULL 5/7] migration: allow enabling mutilfd for specific protocol only
  2021-10-19  9:29 [PULL 0/7] Migration.next patches Juan Quintela
                   ` (3 preceding siblings ...)
  2021-10-19  9:29 ` [PULL 4/7] migration: allow multifd for socket protocol only Juan Quintela
@ 2021-10-19  9:29 ` Juan Quintela
  2021-10-19  9:29 ` [PULL 6/7] migration/rdma: Try to register On-Demand Paging memory region Juan Quintela
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2021-10-19  9:29 UTC
  To: qemu-devel; +Cc: Dr. David Alan Gilbert, Li Zhijian, Juan Quintela

From: Li Zhijian <lizhijian@cn.fujitsu.com>

To: <quintela@redhat.com>, <dgilbert@redhat.com>, <qemu-devel@nongnu.org>
CC: Li Zhijian <lizhijian@cn.fujitsu.com>
Date: Sat, 31 Jul 2021 22:05:52 +0800

Change the default to true so that in the '-incoming defer' case, where no
protocol has been selected yet, the user can still enable the multifd
capability.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/migration.c | 8 ++++++++
 migration/multifd.c   | 2 +-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index f13b07c638..9172686b89 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1237,6 +1237,14 @@ static bool migrate_caps_check(bool *cap_list,
         }
     }
 
+    /* incoming side only */
+    if (runstate_check(RUN_STATE_INMIGRATE) &&
+        !migrate_multifd_is_allowed() &&
+        cap_list[MIGRATION_CAPABILITY_MULTIFD]) {
+        error_setg(errp, "multifd is not supported by current protocol");
+        return false;
+    }
+
     return true;
 }
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 283f672bf0..7c9deb1921 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -868,7 +868,7 @@ cleanup:
     multifd_new_send_channel_cleanup(p, sioc, local_err);
 }
 
-static bool migrate_allow_multifd;
+static bool migrate_allow_multifd = true;
 void migrate_protocol_allow_multifd(bool allow)
 {
     migrate_allow_multifd = allow;
-- 
2.31.1
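
The interaction between the new default and the incoming-side capability check can be sketched as below. This is an illustration under simplifying assumptions: caps_check() is a hypothetical reduction of migrate_caps_check() to two booleans, and protocol_allow_multifd() stands in for migrate_protocol_allow_multifd().

```c
#include <assert.h>
#include <stdbool.h>

/* Default is true after this patch, so "no protocol yet" permits multifd. */
static bool allow_multifd = true;

static void protocol_allow_multifd(bool allow)
{
    allow_multifd = allow;
}

/*
 * Hypothetical reduction of the capability check: reject the multifd
 * capability only on the incoming side, and only once a protocol that
 * does not support it has been selected.
 */
static bool caps_check(bool incoming, bool want_multifd)
{
    if (incoming && !allow_multifd && want_multifd) {
        return false;
    }
    return true;
}
```

Before any URI has been parsed the check passes, which is what lets the capability be set while the destination is waiting under '-incoming defer'.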




* [PULL 6/7] migration/rdma: Try to register On-Demand Paging memory region
  2021-10-19  9:29 [PULL 0/7] Migration.next patches Juan Quintela
                   ` (4 preceding siblings ...)
  2021-10-19  9:29 ` [PULL 5/7] migration: allow enabling mutilfd for specific " Juan Quintela
@ 2021-10-19  9:29 ` Juan Quintela
  2021-10-19  9:29 ` [PULL 7/7] migration/rdma: advise prefetch write for ODP region Juan Quintela
  2021-10-19 16:55 ` [PULL 0/7] Migration.next patches Richard Henderson
  7 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2021-10-19  9:29 UTC
  To: qemu-devel; +Cc: Dr. David Alan Gilbert, Li Zhijian, Juan Quintela

From: Li Zhijian <lizhijian@cn.fujitsu.com>

Previously, registering an fsdax mem-backend-file failed with
"Operation not supported". In this case, we can try to register it with
On-Demand Paging[1], as rpma_mr_reg() does in rpma[2].

[1]: https://community.mellanox.com/s/article/understanding-on-demand-paging--odp-x
[2]: http://pmem.io/rpma/manpages/v0.9.0/rpma_mr_reg.3

CC: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 migration/rdma.c       | 73 ++++++++++++++++++++++++++++++------------
 migration/trace-events |  1 +
 2 files changed, 54 insertions(+), 20 deletions(-)

diff --git a/migration/rdma.c b/migration/rdma.c
index 5c2d113aa9..eb80431aae 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1117,19 +1117,47 @@ static int qemu_rdma_alloc_qp(RDMAContext *rdma)
     return 0;
 }
 
+/* Check whether On-Demand Paging is supported by RDMA device */
+static bool rdma_support_odp(struct ibv_context *dev)
+{
+    struct ibv_device_attr_ex attr = {0};
+    int ret = ibv_query_device_ex(dev, NULL, &attr);
+    if (ret) {
+        return false;
+    }
+
+    if (attr.odp_caps.general_caps & IBV_ODP_SUPPORT) {
+        return true;
+    }
+
+    return false;
+}
+
 static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
 {
     int i;
     RDMALocalBlocks *local = &rdma->local_ram_blocks;
 
     for (i = 0; i < local->nb_blocks; i++) {
+        int access = IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE;
+
         local->block[i].mr =
             ibv_reg_mr(rdma->pd,
                     local->block[i].local_host_addr,
-                    local->block[i].length,
-                    IBV_ACCESS_LOCAL_WRITE |
-                    IBV_ACCESS_REMOTE_WRITE
+                    local->block[i].length, access
                     );
+
+        if (!local->block[i].mr &&
+            errno == ENOTSUP && rdma_support_odp(rdma->verbs)) {
+                access |= IBV_ACCESS_ON_DEMAND;
+                /* register ODP mr */
+                local->block[i].mr =
+                    ibv_reg_mr(rdma->pd,
+                               local->block[i].local_host_addr,
+                               local->block[i].length, access);
+                trace_qemu_rdma_register_odp_mr(local->block[i].block_name);
+        }
+
         if (!local->block[i].mr) {
             perror("Failed to register local dest ram block!");
             break;
@@ -1215,28 +1243,33 @@ static int qemu_rdma_register_and_get_keys(RDMAContext *rdma,
      */
     if (!block->pmr[chunk]) {
         uint64_t len = chunk_end - chunk_start;
+        int access = rkey ? IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE :
+                     0;
 
         trace_qemu_rdma_register_and_get_keys(len, chunk_start);
 
-        block->pmr[chunk] = ibv_reg_mr(rdma->pd,
-                chunk_start, len,
-                (rkey ? (IBV_ACCESS_LOCAL_WRITE |
-                        IBV_ACCESS_REMOTE_WRITE) : 0));
-
-        if (!block->pmr[chunk]) {
-            perror("Failed to register chunk!");
-            fprintf(stderr, "Chunk details: block: %d chunk index %d"
-                            " start %" PRIuPTR " end %" PRIuPTR
-                            " host %" PRIuPTR
-                            " local %" PRIuPTR " registrations: %d\n",
-                            block->index, chunk, (uintptr_t)chunk_start,
-                            (uintptr_t)chunk_end, host_addr,
-                            (uintptr_t)block->local_host_addr,
-                            rdma->total_registrations);
-            return -1;
+        block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
+        if (!block->pmr[chunk] &&
+            errno == ENOTSUP && rdma_support_odp(rdma->verbs)) {
+            access |= IBV_ACCESS_ON_DEMAND;
+            /* register ODP mr */
+            block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
+            trace_qemu_rdma_register_odp_mr(block->block_name);
         }
-        rdma->total_registrations++;
     }
+    if (!block->pmr[chunk]) {
+        perror("Failed to register chunk!");
+        fprintf(stderr, "Chunk details: block: %d chunk index %d"
+                        " start %" PRIuPTR " end %" PRIuPTR
+                        " host %" PRIuPTR
+                        " local %" PRIuPTR " registrations: %d\n",
+                        block->index, chunk, (uintptr_t)chunk_start,
+                        (uintptr_t)chunk_end, host_addr,
+                        (uintptr_t)block->local_host_addr,
+                        rdma->total_registrations);
+        return -1;
+    }
+    rdma->total_registrations++;
 
     if (lkey) {
         *lkey = block->pmr[chunk]->lkey;
diff --git a/migration/trace-events b/migration/trace-events
index a1c0f034ab..5f6aa580de 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -212,6 +212,7 @@ qemu_rdma_poll_write(const char *compstr, int64_t comp, int left, uint64_t block
 qemu_rdma_poll_other(const char *compstr, int64_t comp, int left) "other completion %s (%" PRId64 ") received left %d"
 qemu_rdma_post_send_control(const char *desc) "CONTROL: sending %s.."
 qemu_rdma_register_and_get_keys(uint64_t len, void *start) "Registering %" PRIu64 " bytes @ %p"
+qemu_rdma_register_odp_mr(const char *name) "Try to register On-Demand Paging memory region: %s"
 qemu_rdma_registration_handle_compress(int64_t length, int index, int64_t offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
 qemu_rdma_registration_handle_finished(void) ""
 qemu_rdma_registration_handle_ram_blocks(void) ""
-- 
2.31.1
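
The fallback in this patch is "try a normal registration; on ENOTSUP, and only if the device reports ODP support, retry with the on-demand flag". The sketch below shows that control flow without depending on libibverbs. The ACCESS_* bits, the reg_fn callback type and mock_fsdax_reg() are all hypothetical; in the patch the real calls are ibv_reg_mr() with IBV_ACCESS_* flags and ibv_query_device_ex() for the capability check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical access bits; the patch uses libibverbs' IBV_ACCESS_* flags. */
enum {
    ACCESS_LOCAL_WRITE  = 1 << 0,
    ACCESS_REMOTE_WRITE = 1 << 1,
    ACCESS_ON_DEMAND    = 1 << 2,
};

/* Hypothetical registration callback: a handle, or NULL with errno set. */
typedef void *(*reg_fn)(void *addr, size_t len, int access);

/*
 * Try a regular registration first; retry with on-demand paging only when
 * the failure is ENOTSUP and the device is known to support ODP.
 * *access is updated so the caller can see which mode succeeded.
 */
static void *reg_with_odp_fallback(reg_fn reg, int odp_supported,
                                   void *addr, size_t len, int *access)
{
    void *mr = reg(addr, len, *access);

    if (!mr && errno == ENOTSUP && odp_supported) {
        *access |= ACCESS_ON_DEMAND;
        mr = reg(addr, len, *access);
    }
    return mr;
}

/* Mock device that rejects regular registrations, like fsdax-backed memory. */
static void *mock_fsdax_reg(void *addr, size_t len, int access)
{
    (void)len;
    if (!(access & ACCESS_ON_DEMAND)) {
        errno = ENOTSUP;
        return NULL;
    }
    return addr;
}
```

Any other errno, or a device without ODP support, leaves the result NULL so the caller's existing error path still runs.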




* [PULL 7/7] migration/rdma: advise prefetch write for ODP region
  2021-10-19  9:29 [PULL 0/7] Migration.next patches Juan Quintela
                   ` (5 preceding siblings ...)
  2021-10-19  9:29 ` [PULL 6/7] migration/rdma: Try to register On-Demand Paging memory region Juan Quintela
@ 2021-10-19  9:29 ` Juan Quintela
  2021-10-19 16:55 ` [PULL 0/7] Migration.next patches Richard Henderson
  7 siblings, 0 replies; 10+ messages in thread
From: Juan Quintela @ 2021-10-19  9:29 UTC
  To: qemu-devel; +Cc: Dr. David Alan Gilbert, Li Zhijian, Juan Quintela

From: Li Zhijian <lizhijian@cn.fujitsu.com>

A responder MR registered with ODP will send RNR NAK back to
the requester when it hits a page fault:
---------
ibv_poll_cq wc.status=13 RNR retry counter exceeded!
ibv_poll_cq wrid=WRITE RDMA!
---------
ibv_advise_mr(3) helps make pages present before the actual I/O is
conducted, so that the responder page-faults as little as possible.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
---
 meson.build            |  6 ++++++
 migration/rdma.c       | 42 ++++++++++++++++++++++++++++++++++++++++++
 migration/trace-events |  1 +
 3 files changed, 49 insertions(+)

diff --git a/meson.build b/meson.build
index 5e7946776d..9ed9a993e2 100644
--- a/meson.build
+++ b/meson.build
@@ -1530,6 +1530,12 @@ config_host_data.set('HAVE_COPY_FILE_RANGE', cc.has_function('copy_file_range'))
 config_host_data.set('HAVE_OPENPTY', cc.has_function('openpty', dependencies: util))
 config_host_data.set('HAVE_STRCHRNUL', cc.has_function('strchrnul'))
 config_host_data.set('HAVE_SYSTEM_FUNCTION', cc.has_function('system', prefix: '#include <stdlib.h>'))
+if rdma.found()
+  config_host_data.set('HAVE_IBV_ADVISE_MR',
+                       cc.has_function('ibv_advise_mr',
+                                       args: config_host['RDMA_LIBS'].split(),
+                                       prefix: '#include <infiniband/verbs.h>'))
+endif
 
 # has_header_symbol
 config_host_data.set('CONFIG_BYTESWAP_H',
diff --git a/migration/rdma.c b/migration/rdma.c
index eb80431aae..2a3c7889b9 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1133,6 +1133,32 @@ static bool rdma_support_odp(struct ibv_context *dev)
     return false;
 }
 
+/*
+ * ibv_advise_mr to avoid RNR NAK error as far as possible.
+ * The responder mr registering with ODP will send RNR NAK back to
+ * the requester in the face of the page fault.
+ */
+static void qemu_rdma_advise_prefetch_mr(struct ibv_pd *pd, uint64_t addr,
+                                         uint32_t len,  uint32_t lkey,
+                                         const char *name, bool wr)
+{
+#ifdef HAVE_IBV_ADVISE_MR
+    int ret;
+    int advice = wr ? IBV_ADVISE_MR_ADVICE_PREFETCH_WRITE :
+                 IBV_ADVISE_MR_ADVICE_PREFETCH;
+    struct ibv_sge sg_list = {.lkey = lkey, .addr = addr, .length = len};
+
+    ret = ibv_advise_mr(pd, advice,
+                        IBV_ADVISE_MR_FLAG_FLUSH, &sg_list, 1);
+    /* ignore the error */
+    if (ret) {
+        trace_qemu_rdma_advise_mr(name, len, addr, strerror(errno));
+    } else {
+        trace_qemu_rdma_advise_mr(name, len, addr, "successed");
+    }
+#endif
+}
+
 static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
 {
     int i;
@@ -1156,6 +1182,15 @@ static int qemu_rdma_reg_whole_ram_blocks(RDMAContext *rdma)
                                local->block[i].local_host_addr,
                                local->block[i].length, access);
                 trace_qemu_rdma_register_odp_mr(local->block[i].block_name);
+
+                if (local->block[i].mr) {
+                    qemu_rdma_advise_prefetch_mr(rdma->pd,
+                                    (uintptr_t)local->block[i].local_host_addr,
+                                    local->block[i].length,
+                                    local->block[i].mr->lkey,
+                                    local->block[i].block_name,
+                                    true);
+                }
         }
 
         if (!local->block[i].mr) {
@@ -1255,6 +1290,13 @@ static int qemu_rdma_register_and_get_keys(RDMAContext *rdma,
             /* register ODP mr */
             block->pmr[chunk] = ibv_reg_mr(rdma->pd, chunk_start, len, access);
             trace_qemu_rdma_register_odp_mr(block->block_name);
+
+            if (block->pmr[chunk]) {
+                qemu_rdma_advise_prefetch_mr(rdma->pd, (uintptr_t)chunk_start,
+                                            len, block->pmr[chunk]->lkey,
+                                            block->block_name, rkey);
+
+            }
         }
     }
     if (!block->pmr[chunk]) {
diff --git a/migration/trace-events b/migration/trace-events
index 5f6aa580de..a8ae163707 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -213,6 +213,7 @@ qemu_rdma_poll_other(const char *compstr, int64_t comp, int left) "other complet
 qemu_rdma_post_send_control(const char *desc) "CONTROL: sending %s.."
 qemu_rdma_register_and_get_keys(uint64_t len, void *start) "Registering %" PRIu64 " bytes @ %p"
 qemu_rdma_register_odp_mr(const char *name) "Try to register On-Demand Paging memory region: %s"
+qemu_rdma_advise_mr(const char *name, uint32_t len, uint64_t addr, const char *res) "Try to advise block %s prefetch at %" PRIu32 "@0x%" PRIx64 ": %s"
 qemu_rdma_registration_handle_compress(int64_t length, int index, int64_t offset) "Zapping zero chunk: %" PRId64 " bytes, index %d, offset %" PRId64
 qemu_rdma_registration_handle_finished(void) ""
 qemu_rdma_registration_handle_ram_blocks(void) ""
-- 
2.31.1
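
The helper added above treats the prefetch advice as best effort: it picks
the advice from the "wr" flag, calls the verbs library, and only traces the
outcome without ever failing the caller. Below is a minimal standalone
sketch of that pattern, not the QEMU code. Every demo_* name is
hypothetical, and the backends stand in for ibv_advise_mr(), which is
assumed here to return 0 on success or an errno value on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the libibverbs advice values (not the real enum). */
enum demo_advice { DEMO_PREFETCH = 0, DEMO_PREFETCH_WRITE = 1 };

/* Hypothetical backend type: returns 0 on success or an errno value on failure. */
typedef int (*demo_advise_fn)(enum demo_advice advice, uint64_t addr,
                              uint32_t len);

/* Backend that always accepts the advice. */
static int demo_backend_ok(enum demo_advice advice, uint64_t addr, uint32_t len)
{
    (void)advice; (void)addr; (void)len;
    return 0;
}

/* Backend that behaves like a device without prefetch support. */
static int demo_backend_fail(enum demo_advice advice, uint64_t addr,
                             uint32_t len)
{
    (void)advice; (void)addr; (void)len;
    return EOPNOTSUPP;
}

/* Writable regions ask for write prefetch, read-only ones for plain prefetch. */
static enum demo_advice demo_pick_advice(bool wr)
{
    return wr ? DEMO_PREFETCH_WRITE : DEMO_PREFETCH;
}

/*
 * Best-effort advice: the result is turned into a string suitable for a
 * trace point and the error is otherwise ignored, so registration of the
 * memory region can continue either way.
 */
static const char *demo_advise_prefetch(demo_advise_fn fn, uint64_t addr,
                                        uint32_t len, bool wr)
{
    int ret;

    assert(fn != NULL);
    ret = fn(demo_pick_advice(wr), addr, len);
    return ret ? strerror(ret) : "succeeded";
}
```

Calling demo_advise_prefetch(demo_backend_fail, addr, len, true) yields the
strerror() text for EOPNOTSUPP instead of aborting, which mirrors the
"ignore the error" comment in the patch.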




* Re: [PULL 0/7] Migration.next patches
  2021-10-19  9:29 [PULL 0/7] Migration.next patches Juan Quintela
                   ` (6 preceding siblings ...)
  2021-10-19  9:29 ` [PULL 7/7] migration/rdma: advise prefetch write for ODP region Juan Quintela
@ 2021-10-19 16:55 ` Richard Henderson
  7 siblings, 0 replies; 10+ messages in thread
From: Richard Henderson @ 2021-10-19 16:55 UTC (permalink / raw)
  To: Juan Quintela, qemu-devel; +Cc: Dr. David Alan Gilbert

On 10/19/21 2:29 AM, Juan Quintela wrote:
> The following changes since commit 362534a643b4a34bcb223996538ce9de5cdab946:
> 
>    Merge remote-tracking branch 'remotes/bsdimp/tags/pull-bsd-user-20211018-pull-request' into staging (2021-10-18 12:17:24 -0700)
> 
> are available in the Git repository at:
> 
>    https://github.com/juanquintela/qemu.git tags/migration.next-pull-request
> 
> for you to fetch changes up to 911965ace9386e35ca022a65bb45a32fd421af3e:
> 
>    migration/rdma: advise prefetch write for ODP region (2021-10-19 08:39:04 +0200)
> 
> ----------------------------------------------------------------
> Migration Pull request (3rd try)
> 
> Hi
> 
> This should fix all the freebsd problems.
> 
> Please apply,
> 
> ----------------------------------------------------------------
> 
> David Hildenbrand (1):
>    migration/ram: Don't passs RAMState to
>      migration_clear_memory_region_dirty_bitmap_*()
> 
> Li Zhijian (4):
>    migration: allow multifd for socket protocol only
>    migration: allow enabling mutilfd for specific protocol only
>    migration/rdma: Try to register On-Demand Paging memory region
>    migration/rdma: advise prefetch write for ODP region
> 
> Lukas Straub (2):
>    multifd: Implement yank for multifd send side
>    multifd: Unconditionally unregister yank function
> 
>   meson.build            |   6 +++
>   migration/multifd.h    |   4 ++
>   migration/migration.c  |  12 +++++
>   migration/multifd.c    |  35 ++++++++++---
>   migration/ram.c        |  13 ++---
>   migration/rdma.c       | 113 ++++++++++++++++++++++++++++++++++-------
>   migration/trace-events |   2 +
>   7 files changed, 151 insertions(+), 34 deletions(-)

Applied, thanks.

r~


