From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Leonardo Bras Soares Passos <lsoaresp@redhat.com>,
qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>
Subject: Re: [PATCH 17/20] migration: Postcopy preemption preparation on channel creation
Date: Tue, 22 Feb 2022 10:19:37 +0000 [thread overview]
Message-ID: <YhS4uaMTHaFjM168@work-vm> (raw)
In-Reply-To: <YhSf/a9Fjc5rhKZ5@xz-m1.local>
* Peter Xu (peterx@redhat.com) wrote:
> On Mon, Feb 21, 2022 at 06:39:36PM +0000, Dr. David Alan Gilbert wrote:
> > * Peter Xu (peterx@redhat.com) wrote:
> > > Create a new socket for postcopy to be prepared to send postcopy requested
> > > pages via this specific channel, so as to not get blocked by precopy pages.
> > >
> > > A new thread is also created on dest qemu to receive data from this new channel
> > > based on the ram_load_postcopy() routine.
> > >
> > > The ram_load_postcopy(POSTCOPY) branch and the thread has not started to
> > > function, and that'll be done in follow up patches.
> > >
> > > Cleanup the new sockets on both src/dst QEMUs, meanwhile look after the new
> > > thread too to make sure it'll be recycled properly.
> > >
> > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > ---
> > > migration/migration.c | 62 ++++++++++++++++++++++++----
> > > migration/migration.h | 8 ++++
> > > migration/postcopy-ram.c | 88 ++++++++++++++++++++++++++++++++++++++--
> > > migration/postcopy-ram.h | 10 +++++
> > > migration/ram.c | 25 ++++++++----
> > > migration/ram.h | 4 +-
> > > migration/socket.c | 22 +++++++++-
> > > migration/socket.h | 1 +
> > > migration/trace-events | 3 ++
> > > 9 files changed, 203 insertions(+), 20 deletions(-)
> > >
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 4c22bad304..3d7f897b72 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -321,6 +321,12 @@ void migration_incoming_state_destroy(void)
> > > mis->page_requested = NULL;
> > > }
> > >
> > > + if (mis->postcopy_qemufile_dst) {
> > > + migration_ioc_unregister_yank_from_file(mis->postcopy_qemufile_dst);
> > > + qemu_fclose(mis->postcopy_qemufile_dst);
> > > + mis->postcopy_qemufile_dst = NULL;
> > > + }
> > > +
> > > yank_unregister_instance(MIGRATION_YANK_INSTANCE);
> > > }
> > >
> > > @@ -714,15 +720,21 @@ void migration_fd_process_incoming(QEMUFile *f, Error **errp)
> > > migration_incoming_process();
> > > }
> > >
> > > +static bool migration_needs_multiple_sockets(void)
> > > +{
> > > + return migrate_use_multifd() || migrate_postcopy_preempt();
> > > +}
> > > +
> > > void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
> > > {
> > > MigrationIncomingState *mis = migration_incoming_get_current();
> > > Error *local_err = NULL;
> > > bool start_migration;
> > > + QEMUFile *f;
> > >
> > > if (!mis->from_src_file) {
> > > /* The first connection (multifd may have multiple) */
> > > - QEMUFile *f = qemu_fopen_channel_input(ioc);
> > > + f = qemu_fopen_channel_input(ioc);
> > >
> > > if (!migration_incoming_setup(f, errp)) {
> > > return;
> > > @@ -730,13 +742,18 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
> > >
> > > /*
> > > * Common migration only needs one channel, so we can start
> > > - * right now. Multifd needs more than one channel, we wait.
> > > + * right now. Some features need more than one channel, we wait.
> > > */
> > > - start_migration = !migrate_use_multifd();
> > > + start_migration = !migration_needs_multiple_sockets();
> > > } else {
> > > /* Multiple connections */
> > > - assert(migrate_use_multifd());
> > > - start_migration = multifd_recv_new_channel(ioc, &local_err);
> > > + assert(migration_needs_multiple_sockets());
> > > + if (migrate_use_multifd()) {
> > > + start_migration = multifd_recv_new_channel(ioc, &local_err);
> > > + } else if (migrate_postcopy_preempt()) {
> > > + f = qemu_fopen_channel_input(ioc);
> > > + start_migration = postcopy_preempt_new_channel(mis, f);
> > > + }
> > > if (local_err) {
> > > error_propagate(errp, local_err);
> > > return;
> > > @@ -761,11 +778,20 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
> > > bool migration_has_all_channels(void)
> > > {
> > > MigrationIncomingState *mis = migration_incoming_get_current();
> > > - bool all_channels;
> > >
> > > - all_channels = multifd_recv_all_channels_created();
> > > + if (!mis->from_src_file) {
> > > + return false;
> > > + }
> > > +
> > > + if (migrate_use_multifd()) {
> > > + return multifd_recv_all_channels_created();
> > > + }
> > > +
> > > + if (migrate_postcopy_preempt()) {
> > > + return mis->postcopy_qemufile_dst != NULL;
> > > + }
> > >
> > > - return all_channels && mis->from_src_file != NULL;
> > > + return true;
> > > }
> > >
> > > /*
> > > @@ -1858,6 +1884,12 @@ static void migrate_fd_cleanup(MigrationState *s)
> > > qemu_fclose(tmp);
> > > }
> > >
> > > + if (s->postcopy_qemufile_src) {
> > > + migration_ioc_unregister_yank_from_file(s->postcopy_qemufile_src);
> > > + qemu_fclose(s->postcopy_qemufile_src);
> > > + s->postcopy_qemufile_src = NULL;
> > > + }
> > > +
> > > assert(!migration_is_active(s));
> > >
> > > if (s->state == MIGRATION_STATUS_CANCELLING) {
> > > @@ -3233,6 +3265,11 @@ static void migration_completion(MigrationState *s)
> > > qemu_savevm_state_complete_postcopy(s->to_dst_file);
> > > qemu_mutex_unlock_iothread();
> > >
> > > + /* Shutdown the postcopy fast path thread */
> > > + if (migrate_postcopy_preempt()) {
> > > + postcopy_preempt_shutdown_file(s);
> > > + }
> > > +
> > > trace_migration_completion_postcopy_end_after_complete();
> > > } else {
> > > goto fail;
> > > @@ -4120,6 +4157,15 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> > > }
> > > }
> > >
> > > + /* This needs to be done before resuming a postcopy */
> > > + if (postcopy_preempt_setup(s, &local_err)) {
> > > + error_report_err(local_err);
> > > + migrate_set_state(&s->state, MIGRATION_STATUS_SETUP,
> > > + MIGRATION_STATUS_FAILED);
> > > + migrate_fd_cleanup(s);
> > > + return;
> > > + }
> > > +
> > > if (resume) {
> > > /* Wakeup the main migration thread to do the recovery */
> > > migrate_set_state(&s->state, MIGRATION_STATUS_POSTCOPY_PAUSED,
> > > diff --git a/migration/migration.h b/migration/migration.h
> > > index af4bcb19c2..caa910d956 100644
> > > --- a/migration/migration.h
> > > +++ b/migration/migration.h
> > > @@ -23,6 +23,7 @@
> > > #include "io/channel-buffer.h"
> > > #include "net/announce.h"
> > > #include "qom/object.h"
> > > +#include "postcopy-ram.h"
> > >
> > > struct PostcopyBlocktimeContext;
> > >
> > > @@ -112,6 +113,11 @@ struct MigrationIncomingState {
> > > * enabled.
> > > */
> > > unsigned int postcopy_channels;
> > > + /* QEMUFile for postcopy only; it'll be handled by a separate thread */
> > > + QEMUFile *postcopy_qemufile_dst;
> > > + /* Postcopy priority thread is used to receive postcopy requested pages */
> > > + QemuThread postcopy_prio_thread;
> > > + bool postcopy_prio_thread_created;
> > > /*
> > > * An array of temp host huge pages to be used, one for each postcopy
> > > * channel.
> > > @@ -192,6 +198,8 @@ struct MigrationState {
> > > QEMUBH *cleanup_bh;
> > > /* Protected by qemu_file_lock */
> > > QEMUFile *to_dst_file;
> > > + /* Postcopy specific transfer channel */
> > > + QEMUFile *postcopy_qemufile_src;
> > > QIOChannelBuffer *bioc;
> > > /*
> > > * Protects to_dst_file/from_dst_file pointers. We need to make sure we
> > > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > > index 738cc55fa6..30eddaacd1 100644
> > > --- a/migration/postcopy-ram.c
> > > +++ b/migration/postcopy-ram.c
> > > @@ -32,6 +32,9 @@
> > > #include "trace.h"
> > > #include "hw/boards.h"
> > > #include "exec/ramblock.h"
> > > +#include "socket.h"
> > > +#include "qemu-file-channel.h"
> > > +#include "yank_functions.h"
> > >
> > > /* Arbitrary limit on size of each discard command,
> > > * keeps them around ~200 bytes
> > > @@ -566,6 +569,11 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState *mis)
> > > {
> > > trace_postcopy_ram_incoming_cleanup_entry();
> > >
> > > + if (mis->postcopy_prio_thread_created) {
> > > + qemu_thread_join(&mis->postcopy_prio_thread);
> > > + mis->postcopy_prio_thread_created = false;
> > > + }
> > > +
> > > if (mis->have_fault_thread) {
> > > Error *local_err = NULL;
> > >
> > > @@ -1101,8 +1109,13 @@ static int postcopy_temp_pages_setup(MigrationIncomingState *mis)
> > > int err, i, channels;
> > > void *temp_page;
> > >
> > > - /* TODO: will be boosted when enable postcopy preemption */
> > > - mis->postcopy_channels = 1;
> > > + if (migrate_postcopy_preempt()) {
> > > + /* If preemption enabled, need extra channel for urgent requests */
> > > + mis->postcopy_channels = RAM_CHANNEL_MAX;
> > > + } else {
> > > + /* Both precopy/postcopy on the same channel */
> > > + mis->postcopy_channels = 1;
> > > + }
> > >
> > > channels = mis->postcopy_channels;
> > > mis->postcopy_tmp_pages = g_malloc0_n(sizeof(PostcopyTmpPage), channels);
> > > @@ -1169,7 +1182,7 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
> > > return -1;
> > > }
> > >
> > > - postcopy_thread_create(mis, &mis->fault_thread, "postcopy/fault",
> > > + postcopy_thread_create(mis, &mis->fault_thread, "qemu/fault-default",
> >
> > That name is still too long; I'd lose the 'qemu/'
>
> Sure.
>
> >
> > > postcopy_ram_fault_thread, QEMU_THREAD_JOINABLE);
> > > mis->have_fault_thread = true;
> > >
> > > @@ -1184,6 +1197,16 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
> > > return -1;
> > > }
> > >
> > > + if (migrate_postcopy_preempt()) {
> > > + /*
> > > + * This thread needs to be created after the temp pages because it'll fetch
> > > + * RAM_CHANNEL_POSTCOPY PostcopyTmpPage immediately.
> > > + */
> > > + postcopy_thread_create(mis, &mis->postcopy_prio_thread, "qemu/fault-fast",
> >
> > and that one.
>
> OK.
>
> >
> > > + postcopy_preempt_thread, QEMU_THREAD_JOINABLE);
> > > + mis->postcopy_prio_thread_created = true;
> > > + }
> > > +
> > > trace_postcopy_ram_enable_notify();
> > >
> > > return 0;
> > > @@ -1513,3 +1536,62 @@ void postcopy_unregister_shared_ufd(struct PostCopyFD *pcfd)
> > > }
> > > }
> > > }
> > > +
> > > +bool postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file)
> > > +{
> > > + mis->postcopy_qemufile_dst = file;
> > > +
> > > + trace_postcopy_preempt_new_channel();
> > > +
> > > + /* Start the migration immediately */
> > > + return true;
> > > +}
> > > +
> > > +int postcopy_preempt_setup(MigrationState *s, Error **errp)
> > > +{
> > > + QIOChannel *ioc;
> > > +
> > > + if (!migrate_postcopy_preempt()) {
> > > + return 0;
> > > + }
> > > +
> > > + if (!migrate_multi_channels_is_allowed()) {
> > > + error_setg(errp, "Postcopy preempt is not supported as current "
> > > + "migration stream does not support multi-channels.");
> > > + return -1;
> > > + }
> > > +
> > > + ioc = socket_send_channel_create_sync(errp);
> >
> > How do we get away with doing this sync here, but have to use async for
> > multifd?
>
> Ah right.. I think both will work? But async (as what multifd is doing)
> is definitely more elegant because synced version can block the main thread
> for unexpected times.
Right, that was my worry.
Dave
> I'll add a new patch to support async channel creations (assuming that's
> better than squashing the change).
>
> --
> Peter Xu
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
next prev parent reply other threads:[~2022-02-22 10:22 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-02-16 6:27 [PATCH 00/20] migration: Postcopy Preemption Peter Xu
2022-02-16 6:27 ` [PATCH 01/20] migration: Dump sub-cmd name in loadvm_process_command tp Peter Xu
2022-02-16 15:42 ` Dr. David Alan Gilbert
2022-02-16 6:27 ` [PATCH 02/20] migration: Finer grained tracepoints for POSTCOPY_LISTEN Peter Xu
2022-02-16 15:43 ` Dr. David Alan Gilbert
2022-02-16 6:27 ` [PATCH 03/20] migration: Tracepoint change in postcopy-run bottom half Peter Xu
2022-02-16 19:00 ` Dr. David Alan Gilbert
2022-02-16 6:27 ` [PATCH 04/20] migration: Introduce postcopy channels on dest node Peter Xu
2022-02-21 15:49 ` Dr. David Alan Gilbert
2022-02-16 6:27 ` [PATCH 05/20] migration: Dump ramblock and offset too when non-same-page detected Peter Xu
2022-02-16 6:27 ` [PATCH 06/20] migration: Add postcopy_thread_create() Peter Xu
2022-02-21 16:00 ` Dr. David Alan Gilbert
2022-02-16 6:27 ` [PATCH 07/20] migration: Move static var in ram_block_from_stream() into global Peter Xu
2022-02-16 6:27 ` [PATCH 08/20] migration: Add pss.postcopy_requested status Peter Xu
2022-02-16 6:27 ` [PATCH 09/20] migration: Move migrate_allow_multifd and helpers into migration.c Peter Xu
2022-02-16 6:27 ` [PATCH 10/20] migration: Enlarge postcopy recovery to capture !-EIO too Peter Xu
2022-02-21 16:15 ` Dr. David Alan Gilbert
2022-02-16 6:28 ` [PATCH 11/20] migration: postcopy_pause_fault_thread() never fails Peter Xu
2022-02-21 16:16 ` Dr. David Alan Gilbert
2022-02-16 6:28 ` [PATCH 12/20] migration: Export ram_load_postcopy() Peter Xu
2022-02-21 16:17 ` Dr. David Alan Gilbert
2022-02-16 6:28 ` [PATCH 13/20] migration: Move channel setup out of postcopy_try_recover() Peter Xu
2022-02-22 10:57 ` Dr. David Alan Gilbert
2022-02-23 6:40 ` Peter Xu
2022-02-23 9:47 ` Dr. David Alan Gilbert
2022-02-23 12:55 ` Peter Xu
2022-02-16 6:28 ` [PATCH 14/20] migration: Add migration_incoming_transport_cleanup() Peter Xu
2022-02-21 16:56 ` Dr. David Alan Gilbert
2022-02-16 6:28 ` [PATCH 15/20] migration: Allow migrate-recover to run multiple times Peter Xu
2022-02-21 17:03 ` Dr. David Alan Gilbert
2022-02-22 2:51 ` Peter Xu
2022-02-16 6:28 ` [PATCH 16/20] migration: Add postcopy-preempt capability Peter Xu
2022-02-16 6:28 ` [PATCH 17/20] migration: Postcopy preemption preparation on channel creation Peter Xu
2022-02-21 18:39 ` Dr. David Alan Gilbert
2022-02-22 8:34 ` Peter Xu
2022-02-22 10:19 ` Dr. David Alan Gilbert [this message]
2022-02-16 6:28 ` [PATCH 18/20] migration: Postcopy preemption enablement Peter Xu
2022-02-22 10:52 ` Dr. David Alan Gilbert
2022-02-23 7:01 ` Peter Xu
2022-02-23 9:56 ` Dr. David Alan Gilbert
2022-02-23 13:05 ` Peter Xu
2022-02-16 6:28 ` [PATCH 19/20] migration: Postcopy recover with preempt enabled Peter Xu
2022-02-22 11:32 ` Dr. David Alan Gilbert
2022-02-23 7:45 ` Peter Xu
2022-02-23 9:52 ` Dr. David Alan Gilbert
2022-02-23 13:14 ` Peter Xu
2022-02-23 18:53 ` Dr. David Alan Gilbert
2022-02-16 6:28 ` [PATCH 20/20] tests: Add postcopy preempt test Peter Xu
2022-02-22 12:51 ` Dr. David Alan Gilbert
2022-02-23 7:50 ` Peter Xu
2022-03-01 5:34 ` Peter Xu
2022-03-01 17:00 ` Dr. David Alan Gilbert
2022-03-02 6:41 ` Peter Xu
2022-02-16 9:28 ` [PATCH 00/20] migration: Postcopy Preemption Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YhS4uaMTHaFjM168@work-vm \
--to=dgilbert@redhat.com \
--cc=lsoaresp@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.