* [PATCH] migration: RDMA registrations interval optimization
@ 2021-08-20 15:57 Zhiwei Jiang
2021-11-02 16:47 ` Juan Quintela
2021-11-03 12:11 ` Dr. David Alan Gilbert
0 siblings, 2 replies; 3+ messages in thread
From: Zhiwei Jiang @ 2021-08-20 15:57 UTC (permalink / raw)
To: quintela, dgilbert; +Cc: qemu-devel
RDMA migration is very hard to complete when the VM runs a MySQL
benchmark on 1G host hugepages. I think the interval between
ram_control_before_iterate(f, RAM_CONTROL_ROUND) and
ram_control_after_iterate() is too large with a 1G host page size, so a
1M buffer size, which matches the mlx driver, works better. After this
patch, migration works as normal in my setup.
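As a rough illustration of the page-count arithmetic behind this (the 4K
target page size, the sample page index and the align_up() helper below
are assumptions for the example only, not part of the patch):

    #include <stdio.h>

    /*
     * Minimal stand-in for QEMU_ALIGN_UP(), valid for power-of-two
     * alignments; only here to make the example self-contained.
     */
    static unsigned long align_up(unsigned long x, unsigned long align)
    {
        return (x + align - 1) & ~(align - 1);
    }

    int main(void)
    {
        unsigned long target_page_size = 4096;                        /* assumed 4K target pages */
        unsigned long pagesize_bits = (1UL << 30) / target_page_size; /* 1G host page = 262144 target pages */
        unsigned long rdma_chunk = 256;                               /* 256 * 4K = 1M */
        unsigned long page = 12345;                                   /* arbitrary dirty-page index for the example */

        /*
         * The host-page boundary keeps one ram_save_host_page() call going
         * for up to 262144 target pages; the 1M RDMA boundary caps it at
         * 256 pages, so control returns to the before/after_iterate hooks
         * much sooner.
         */
        printf("hostpage_boundary: %lu\n", align_up(page + 1, pagesize_bits));
        printf("rdma_boundary:     %lu\n", align_up(page + 1, rdma_chunk));
        return 0;
    }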
Signed-off-by: Zhiwei Jiang <elish.jiang@ucloud.cn>
---
migration/migration.c | 13 +++++++++++++
migration/migration.h | 6 ++++++
migration/ram.c | 6 +++++-
3 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/migration/migration.c b/migration/migration.c
index 041b8451a6..934916b161 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -457,6 +457,8 @@ void migrate_add_address(SocketAddress *address)
static void qemu_start_incoming_migration(const char *uri, Error **errp)
{
const char *p = NULL;
+ MigrationState *s = migrate_get_current();
+ s->enabled_rdma_migration = false;
qapi_event_send_migration(MIGRATION_STATUS_SETUP);
if (strstart(uri, "tcp:", &p) ||
@@ -465,6 +467,7 @@ static void qemu_start_incoming_migration(const char *uri, Error **errp)
socket_start_incoming_migration(p ? p : uri, errp);
#ifdef CONFIG_RDMA
} else if (strstart(uri, "rdma:", &p)) {
+ s->enabled_rdma_migration = true;
rdma_start_incoming_migration(p, errp);
#endif
} else if (strstart(uri, "exec:", &p)) {
@@ -2040,6 +2043,7 @@ void migrate_init(MigrationState *s)
s->start_postcopy = false;
s->postcopy_after_devices = false;
s->migration_thread_running = false;
+ s->enabled_rdma_migration = false;
error_free(s->error);
s->error = NULL;
s->hostname = NULL;
@@ -2300,6 +2304,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
socket_start_outgoing_migration(s, p ? p : uri, &local_err);
#ifdef CONFIG_RDMA
} else if (strstart(uri, "rdma:", &p)) {
+ s->enabled_rdma_migration = true;
rdma_start_outgoing_migration(s, p, &local_err);
#endif
} else if (strstart(uri, "exec:", &p)) {
@@ -2475,6 +2480,14 @@ bool migrate_use_events(void)
return s->enabled_capabilities[MIGRATION_CAPABILITY_EVENTS];
}
+bool migrate_use_rdma(void)
+{
+ MigrationState *s;
+ s = migrate_get_current();
+
+ return s->enabled_rdma_migration;
+}
+
bool migrate_use_multifd(void)
{
MigrationState *s;
diff --git a/migration/migration.h b/migration/migration.h
index 7a5aa8c2fd..860dc93df1 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -296,6 +296,11 @@ struct MigrationState {
* This save hostname when out-going migration starts
*/
char *hostname;
+
+ /*
+ * Enable RDMA migration
+ */
+ bool enabled_rdma_migration;
};
void migrate_set_state(int *state, int old_state, int new_state);
@@ -332,6 +337,7 @@ bool migrate_ignore_shared(void);
bool migrate_validate_uuid(void);
bool migrate_auto_converge(void);
+bool migrate_use_rdma(void);
bool migrate_use_multifd(void);
bool migrate_pause_before_switchover(void);
int migrate_multifd_channels(void);
diff --git a/migration/ram.c b/migration/ram.c
index 7a43bfd7af..dc0c0e2565 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -2043,7 +2043,11 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
unsigned long hostpage_boundary =
QEMU_ALIGN_UP(pss->page + 1, pagesize_bits);
+ /* Bound RDMA chunks at 256 * 4K = 1M, a size the driver delivers more efficiently */
+ unsigned long rdma_boundary =
+ QEMU_ALIGN_UP(pss->page + 1, 256);
unsigned long start_page = pss->page;
+ bool use_rdma = migrate_use_rdma();
int res;
if (ramblock_is_ignored(pss->block)) {
@@ -2069,7 +2073,7 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
}
}
pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
- } while ((pss->page < hostpage_boundary) &&
+ } while ((pss->page < (use_rdma ? rdma_boundary : hostpage_boundary)) &&
offset_in_ramblock(pss->block,
((ram_addr_t)pss->page) << TARGET_PAGE_BITS));
/* The offset we leave with is the min boundary of host page and block */
--
2.25.1
* Re: [PATCH] migration: RDMA registrations interval optimization
2021-08-20 15:57 [PATCH] migration: RDMA registrations interval optimization Zhiwei Jiang
@ 2021-11-02 16:47 ` Juan Quintela
2021-11-03 12:11 ` Dr. David Alan Gilbert
1 sibling, 0 replies; 3+ messages in thread
From: Juan Quintela @ 2021-11-02 16:47 UTC (permalink / raw)
To: Zhiwei Jiang; +Cc: dgilbert, qemu-devel
Zhiwei Jiang <elish.jiang@ucloud.cn> wrote:
> RDMA migration is very hard to complete when the VM runs a MySQL
> benchmark on 1G host hugepages. I think the interval between
> ram_control_before_iterate(f, RAM_CONTROL_ROUND) and
> ram_control_after_iterate() is too large with a 1G host page size, so a
> 1M buffer size, which matches the mlx driver, works better. After this
> patch, migration works as normal in my setup.
>
> Signed-off-by: Zhiwei Jiang <elish.jiang@ucloud.cn>
Hi
Nice catch.
Could you split out the migrate_use_rdma() part (which is obvious)
from the rest? I can't believe we don't have that function already.
See my comments below.
> diff --git a/migration/migration.c b/migration/migration.c
> index 041b8451a6..934916b161 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -457,6 +457,8 @@ void migrate_add_address(SocketAddress *address)
> static void qemu_start_incoming_migration(const char *uri, Error **errp)
> {
> const char *p = NULL;
> + MigrationState *s = migrate_get_current();
> + s->enabled_rdma_migration = false;
This shouldn't be needed (famous last words). If it is needed, we are
masking another problem that we need to fix, like not initializing the
migration state properly.
> + /*
> + * Enable RDMA migration
> + */
> + bool enabled_rdma_migration;
I think that ""rdma_enabled" is enough, what do you think?
I think that the comment is not needed.
> diff --git a/migration/ram.c b/migration/ram.c
> index 7a43bfd7af..dc0c0e2565 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2043,7 +2043,11 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
> qemu_ram_pagesize(pss->block) >> TARGET_PAGE_BITS;
> unsigned long hostpage_boundary =
> QEMU_ALIGN_UP(pss->page + 1, pagesize_bits);
> + /* Bound RDMA chunks at 256 * 4K = 1M, a size the driver delivers more efficiently */
> + unsigned long rdma_boundary =
> + QEMU_ALIGN_UP(pss->page + 1, 256);
> unsigned long start_page = pss->page;
> + bool use_rdma = migrate_use_rdma();
> int res;
To avoid doing the comparison every time, what about:
unsigned long boundary = migrate_use_rdma() ? rdma_boundary : hostpage_boundary;
> if (ramblock_is_ignored(pss->block)) {
> @@ -2069,7 +2073,7 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss,
> }
> }
> pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
> - } while ((pss->page < hostpage_boundary) &&
> + } while ((pss->page < (use_rdma ? rdma_boundary : hostpage_boundary)) &&
> offset_in_ramblock(pss->block,
> ((ram_addr_t)pss->page) <<
> TARGET_PAGE_BITS));
} while ((pss->page < boundary) &&
offset_in_ramblock(pss->block,
((ram_addr_t)pss->page) << TARGET_PAGE_BITS));
We remove one comparison in every round.
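Putting those pieces together, a sketch of how the relevant declarations
and loop condition in ram_save_host_page() might look with the
suggestion applied (only the affected lines are shown; the rest of the
function is unchanged):

    unsigned long hostpage_boundary =
        QEMU_ALIGN_UP(pss->page + 1, pagesize_bits);
    /* For RDMA, stop after 256 target pages (256 * 4K = 1M) so each
     * chunk matches what the driver handles efficiently. */
    unsigned long rdma_boundary =
        QEMU_ALIGN_UP(pss->page + 1, 256);
    /* Pick the boundary once instead of re-checking migrate_use_rdma()
     * on every loop iteration. */
    unsigned long boundary =
        migrate_use_rdma() ? rdma_boundary : hostpage_boundary;
    unsigned long start_page = pss->page;
    int res;
    ...
        pss->page = migration_bitmap_find_dirty(rs, pss->block, pss->page);
    } while ((pss->page < boundary) &&
             offset_in_ramblock(pss->block,
                                ((ram_addr_t)pss->page) << TARGET_PAGE_BITS));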
What do you think?
Later, Juan.
PS: Nice achievement, being able to migrate with a relatively high load
using 1GB pages. Could you share your setup and how fast it goes?
Thanks.
* Re: [PATCH] migration: RDMA registrations interval optimization
2021-08-20 15:57 [PATCH] migration: RDMA registrations interval optimization Zhiwei Jiang
2021-11-02 16:47 ` Juan Quintela
@ 2021-11-03 12:11 ` Dr. David Alan Gilbert
1 sibling, 0 replies; 3+ messages in thread
From: Dr. David Alan Gilbert @ 2021-11-03 12:11 UTC (permalink / raw)
To: Zhiwei Jiang; +Cc: qemu-devel, quintela
* Zhiwei Jiang (elish.jiang@ucloud.cn) wrote:
> RDMA migration is very hard to complete when the VM runs a MySQL
> benchmark on 1G host hugepages. I think the interval between
> ram_control_before_iterate(f, RAM_CONTROL_ROUND) and
> ram_control_after_iterate() is too large with a 1G host page size, so a
> 1M buffer size, which matches the mlx driver, works better. After this
> patch, migration works as normal in my setup.
Hi,
Apologies for the delay in responding.
There are a few things I don't understand about this patch:
a) Splitting ram_save_host_page is probably wrong; in other
places we rely on it always saving the whole host page.
b) I don't understand why splitting ram_save_host_page into smaller
chunks causes ram_control..(RAM_CONTROL_ROUND) to be called more often
- it might be if you have bandwidth limits set?
c) I thought the RDMA cards preferred large transfers? Can you provide
some pointer to the information on this 1M buffer size and why it slows
things down?
Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK