From: Juan Quintela <quintela@redhat.com>
To: Leonardo Bras <leobras@redhat.com>
Cc: "Elena Ufimtseva" <elena.ufimtseva@oracle.com>,
"John G Johnson" <john.g.johnson@oracle.com>,
"Jagannathan Raman" <jag.raman@oracle.com>,
qemu-block@nongnu.org, qemu-devel@nongnu.org,
"Markus Armbruster" <armbru@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>,
"Peter Xu" <peterx@redhat.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Marc-André Lureau" <marcandre.lureau@redhat.com>,
"Fam Zheng" <fam@euphon.net>, "Eric Blake" <eblake@redhat.com>
Subject: Re: [PATCH v8 5/5] multifd: Implement zero copy write in multifd migration (multifd-zero-copy)
Date: Fri, 18 Feb 2022 17:57:13 +0100 [thread overview]
Message-ID: <87fsogcdue.fsf@secure.mitica> (raw)
In-Reply-To: <20220201062901.428838-6-leobras@redhat.com> (Leonardo Bras's message of "Tue, 1 Feb 2022 03:29:03 -0300")
Leonardo Bras <leobras@redhat.com> wrote:
> Implement zero copy send on nocomp_send_write(), by making use of QIOChannel
> writev + flags & flush interface.
>
> Change multifd_send_sync_main() so flush_zero_copy() can be called
> after each iteration in order to make sure all dirty pages are sent before
> a new iteration is started. It will also flush at the beginning and at the
> end of migration.
>
> Also make it return -1 if flush_zero_copy() fails, in order to cancel
> the migration process, and avoid resuming the guest in the target host
> without receiving all current RAM.
>
> This will work fine on RAM migration because the RAM pages are not usually freed,
> and there is no problem on changing the pages content between writev_zero_copy() and
> the actual sending of the buffer, because this change will dirty the page and
> cause it to be re-sent on a next iteration anyway.
>
> A lot of locked memory may be needed in order to use multid migration
^^^^^^
multifd.
I can fix it on the commit.
> @@ -1479,7 +1479,16 @@ static bool migrate_params_check(MigrationParameters *params, Error **errp)
> error_prepend(errp, "Invalid mapping given for block-bitmap-mapping: ");
> return false;
> }
> -
> +#ifdef CONFIG_LINUX
> + if (params->zero_copy_send &&
> + (!migrate_use_multifd() ||
> + params->multifd_compression != MULTIFD_COMPRESSION_NONE ||
> + (params->tls_creds && *params->tls_creds))) {
> + error_setg(errp,
> + "Zero copy only available for non-compressed non-TLS multifd migration");
> + return false;
> + }
> +#endif
> return true;
> }
Test is long, but it is exactly what we need. Good.
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 43998ad117..2d68b9cf4f 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -568,19 +568,28 @@ void multifd_save_cleanup(void)
> multifd_send_state = NULL;
> }
>
> -void multifd_send_sync_main(QEMUFile *f)
> +int multifd_send_sync_main(QEMUFile *f)
> {
> int i;
> + bool flush_zero_copy;
>
> if (!migrate_use_multifd()) {
> - return;
> + return 0;
> }
> if (multifd_send_state->pages->num) {
> if (multifd_send_pages(f) < 0) {
> error_report("%s: multifd_send_pages fail", __func__);
> - return;
> + return 0;
> }
> }
> +
> + /*
> + * When using zero-copy, it's necessary to flush after each iteration to
> + * make sure pages from earlier iterations don't end up replacing newer
> + * pages.
> + */
> + flush_zero_copy = migrate_use_zero_copy_send();
> +
> for (i = 0; i < migrate_multifd_channels(); i++) {
> MultiFDSendParams *p = &multifd_send_state->params[i];
>
> @@ -591,7 +600,7 @@ void multifd_send_sync_main(QEMUFile *f)
> if (p->quit) {
> error_report("%s: channel %d has already quit", __func__, i);
> qemu_mutex_unlock(&p->mutex);
> - return;
> + return 0;
> }
>
> p->packet_num = multifd_send_state->packet_num++;
> @@ -602,6 +611,17 @@ void multifd_send_sync_main(QEMUFile *f)
> ram_counters.transferred += p->packet_len;
> qemu_mutex_unlock(&p->mutex);
> qemu_sem_post(&p->sem);
> +
> + if (flush_zero_copy) {
> + int ret;
> + Error *err = NULL;
> +
> + ret = qio_channel_flush(p->c, &err);
> + if (ret < 0) {
> + error_report_err(err);
> + return -1;
> + }
> + }
> }
> for (i = 0; i < migrate_multifd_channels(); i++) {
> MultiFDSendParams *p = &multifd_send_state->params[i];
> @@ -610,6 +630,8 @@ void multifd_send_sync_main(QEMUFile *f)
> qemu_sem_wait(&p->sem_sync);
> }
> trace_multifd_send_sync_main(multifd_send_state->packet_num);
> +
> + return 0;
> }
We are leaving pages is flight for potentially a lot of time. I *think*
that we can sync shorter than that.
> static void *multifd_send_thread(void *opaque)
> @@ -668,8 +690,8 @@ static void *multifd_send_thread(void *opaque)
> p->iov[0].iov_len = p->packet_len;
> p->iov[0].iov_base = p->packet;
>
> - ret = qio_channel_writev_all(p->c, p->iov, p->iovs_num,
> - &local_err);
> + ret = qio_channel_writev_full_all(p->c, p->iov, p->iovs_num, NULL,
> + 0, p->write_flags, &local_err);
> if (ret != 0) {
> break;
> }
I still think that it would be better to have a sync here in each
thread. I don't know the size, but once every couple megabytes of RAM
or so.
I did a change on:
commit d48c3a044537689866fe44e65d24c7d39a68868a
Author: Juan Quintela <quintela@redhat.com>
Date: Fri Nov 19 15:35:58 2021 +0100
multifd: Use a single writev on the send side
Until now, we wrote the packet header with write(), and the rest of the
pages with writev(). Just increase the size of the iovec and do a
single writev().
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
And now we need to "perserve" this header until we do the sync,
otherwise we are overwritting it with other things.
What testing have you done after this commit?
Notice that it is not _complicated_ to fix it, I will try to come with
some idea on monday, but basically is having an array of buffers for
each thread, and store them until we call a sync().
Later, Juan.
next prev parent reply other threads:[~2022-02-18 17:07 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-01 6:28 [PATCH v8 0/5] MSG_ZEROCOPY + multifd Leonardo Bras
2022-02-01 6:28 ` [PATCH v8 1/5] QIOChannel: Add flags on io_writev and introduce io_flush callback Leonardo Bras
2022-02-01 9:35 ` Daniel P. Berrangé
2022-02-01 17:25 ` Leonardo Bras Soares Passos
2022-02-07 12:49 ` Peter Xu
2022-02-07 20:50 ` Leonardo Bras Soares Passos
2022-02-18 16:36 ` Juan Quintela
2022-02-21 16:41 ` Leonardo Bras Soares Passos
2022-02-01 6:29 ` [PATCH v8 2/5] QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX Leonardo Bras
2022-02-18 16:38 ` Juan Quintela
2022-02-01 6:29 ` [PATCH v8 3/5] migration: Add zero-copy-send parameter for QMP/HMP for Linux Leonardo Bras
2022-02-18 16:39 ` Juan Quintela
2022-02-01 6:29 ` [PATCH v8 4/5] migration: Add migrate_use_tls() helper Leonardo Bras
2022-02-01 6:29 ` [PATCH v8 5/5] multifd: Implement zero copy write in multifd migration (multifd-zero-copy) Leonardo Bras
2022-02-08 2:22 ` Peter Xu
2022-02-08 2:49 ` Leonardo Bras Soares Passos
2022-02-08 3:05 ` Peter Xu
2022-02-18 17:36 ` Juan Quintela
2022-02-21 19:47 ` Leonardo Bras Soares Passos
2022-02-18 16:57 ` Juan Quintela [this message]
2022-02-21 19:41 ` Leonardo Bras Soares Passos
2022-02-22 4:09 ` Leonardo Bras Soares Passos
2022-03-01 3:57 ` Peter Xu
2022-03-07 14:20 ` Leonardo Bras Soares Passos
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87fsogcdue.fsf@secure.mitica \
--to=quintela@redhat.com \
--cc=armbru@redhat.com \
--cc=berrange@redhat.com \
--cc=dgilbert@redhat.com \
--cc=eblake@redhat.com \
--cc=elena.ufimtseva@oracle.com \
--cc=fam@euphon.net \
--cc=jag.raman@oracle.com \
--cc=john.g.johnson@oracle.com \
--cc=leobras@redhat.com \
--cc=marcandre.lureau@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).