From: "Daniel P. Berrange" <berrange@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Laurent Vivier <lvivier@redhat.com>,
Juan Quintela <quintela@redhat.com>,
"Dr . David Alan Gilbert" <dgilbert@redhat.com>
Subject: Re: [Qemu-devel] [RFC 5/6] migration: store listen task tag
Date: Tue, 29 Aug 2017 11:38:31 +0100 [thread overview]
Message-ID: <20170829103831.GE3783@redhat.com> (raw)
In-Reply-To: <20170816094713.GA2294@pxdev.xzpeter.org>
On Wed, Aug 16, 2017 at 05:47:13PM +0800, Peter Xu wrote:
> On Tue, Aug 15, 2017 at 05:47:08PM +0800, Peter Xu wrote:
> > On Tue, Aug 15, 2017 at 10:27:07AM +0100, Daniel P. Berrange wrote:
> > > On Tue, Aug 15, 2017 at 04:50:06PM +0800, Peter Xu wrote:
> > > > On Tue, Aug 15, 2017 at 09:37:14AM +0100, Daniel P. Berrange wrote:
> > > > > On Tue, Aug 15, 2017 at 02:17:06PM +0800, Peter Xu wrote:
> > > > > > Store the task tag for migration types: tcp/unix/fd/exec in current
> > > > > > MigrationIncomingState struct.
> > > > > >
> > > > > > For defered migration, no need to store task tag since there is no task
> > > > > > running in the main loop at all. For RDMA, let's mark it as todo.
> > > > > >
> > > > > > Signed-off-by: Peter Xu <peterx@redhat.com>
> > > > > > ---
> > > > > > migration/migration.c | 22 ++++++++++++++++++----
> > > > > > migration/migration.h | 2 ++
> > > > > > 2 files changed, 20 insertions(+), 4 deletions(-)
> > > > > >
> > > > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > > > index c9b7085..daf356b 100644
> > > > > > --- a/migration/migration.c
> > > > > > +++ b/migration/migration.c
> > > > > > @@ -171,6 +171,7 @@ void migration_incoming_state_destroy(void)
> > > > > > mis->from_src_file = NULL;
> > > > > > }
> > > > > >
> > > > > > + mis->listen_task_tag = 0;
> > > > > > qemu_event_destroy(&mis->main_thread_load_event);
> > > > > > }
> > > > > >
> > > > > > @@ -265,25 +266,31 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
> > > > > > void qemu_start_incoming_migration(const char *uri, Error **errp)
> > > > > > {
> > > > > > const char *p;
> > > > > > + guint task_tag = 0;
> > > > > > + MigrationIncomingState *mis = migration_incoming_get_current();
> > > > > >
> > > > > > qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
> > > > > > if (!strcmp(uri, "defer")) {
> > > > > > deferred_incoming_migration(errp);
> > > > > > } else if (strstart(uri, "tcp:", &p)) {
> > > > > > - tcp_start_incoming_migration(p, errp);
> > > > > > + task_tag = tcp_start_incoming_migration(p, errp);
> > > > > > #ifdef CONFIG_RDMA
> > > > > > } else if (strstart(uri, "rdma:", &p)) {
> > > > > > + /* TODO: store task tag for RDMA migrations */
> > > > > > rdma_start_incoming_migration(p, errp);
> > > > > > #endif
> > > > > > } else if (strstart(uri, "exec:", &p)) {
> > > > > > - exec_start_incoming_migration(p, errp);
> > > > > > + task_tag = exec_start_incoming_migration(p, errp);
> > > > > > } else if (strstart(uri, "unix:", &p)) {
> > > > > > - unix_start_incoming_migration(p, errp);
> > > > > > + task_tag = unix_start_incoming_migration(p, errp);
> > > > > > } else if (strstart(uri, "fd:", &p)) {
> > > > > > - fd_start_incoming_migration(p, errp);
> > > > > > + task_tag = fd_start_incoming_migration(p, errp);
> > > > > > } else {
> > > > > > error_setg(errp, "unknown migration protocol: %s", uri);
> > > > > > + return;
> > > > > > }
> > > > > > +
> > > > > > + mis->listen_task_tag = task_tag;
> > > > >
> > > > > This is unsafe as currently written. The main loop watches that are
> > > > > associated with these tags are all unregistered when incoming migration
> > > > > starts. So if you save them, and then unregister later when postcopy
> > > > > is "stuck", then you're likely unregistering a tag associated with
> > > > > some other random part of QEMU, because the saved value is no longer
> > > > > valid.
> > > >
> > > > IIUC for incoming migration, the watch is not detached until
> > > > migration_fd_process_incoming() completes. So:
> > > >
> > > > - for precopy, the watch should be detached when migration completes
> > > >
> > > > - for postcopy, the watch should be detached when postcopy starts,
> > > > then main loop thread returns, while the ram_load_thread starts to
> > > > continue the migration.
> > > >
> > > > So basically the watch is detaching itself after
> > > > migration_fd_process_incoming() completes. To handle that, this patch
> > > > has this change:
> > > >
> > > > @@ -422,6 +429,13 @@ void migration_fd_process_incoming(QEMUFile *f)
> > > > co = qemu_coroutine_create(process_incoming_migration_co, f);
> > > > qemu_coroutine_enter(co);
> > > > }
> > > > +
> > > > + /*
> > > > + * When reach here, we should not need the listening port any
> > > > + * more. We'll detach the listening task soon, let's reset the
> > > > + * listen task tag.
> > > > + */
> > > > + mis->listen_task_tag = 0;
> > > >
> > > > Would this make sure the listen_task_tag is always valid if nonzero?
> > >
> > > Mixing 'return FALSE' from callbacks, with g_source_remove is a recipe
> > > for hard to diagnose bugs, so should be avoided in general.
> >
> > Makes sense.
> >
> > >
> > > So, IMHO, you should change all the callbacks to 'return TRUE' so that
> > > they keep the watch registered, and then explicitly call g_source_remove()
> > > passing the listen tag, when incoming migration starts.
> >
> > Yes this sounds better. Thanks!
>
> Hmm... when incoming migration starts, we are still in the handler of
> the listening gsource. I am not sure whether it's safe to remove the
> gsource in its own handlers.
Yes, it should be safe to call g_source_remove from within the handler
itself.
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
next prev parent reply other threads:[~2017-08-29 10:38 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-15 6:17 [Qemu-devel] [RFC 0/6] migration: re-use migrate_incoming for postcopy recovery Peter Xu
2017-08-15 6:17 ` [Qemu-devel] [RFC 1/6] migration: free SocketAddress where allocated Peter Xu
2017-08-15 6:17 ` [Qemu-devel] [RFC 2/6] migration: return incoming task tag for sockets Peter Xu
2017-08-15 6:17 ` [Qemu-devel] [RFC 3/6] migration: return incoming task tag for exec Peter Xu
2017-08-15 6:17 ` [Qemu-devel] [RFC 4/6] migration: return incoming task tag for fd Peter Xu
2017-08-15 6:17 ` [Qemu-devel] [RFC 5/6] migration: store listen task tag Peter Xu
2017-08-15 8:37 ` Daniel P. Berrange
2017-08-15 8:50 ` Peter Xu
2017-08-15 9:27 ` Daniel P. Berrange
2017-08-15 9:47 ` Peter Xu
2017-08-16 9:47 ` Peter Xu
2017-08-29 10:38 ` Daniel P. Berrange [this message]
2017-08-30 7:38 ` Peter Xu
2017-08-15 6:17 ` [Qemu-devel] [RFC 6/6] migration: allow migrate_incoming for paused VM Peter Xu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170829103831.GE3783@redhat.com \
--to=berrange@redhat.com \
--cc=dgilbert@redhat.com \
--cc=lvivier@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).