qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: Fei Li <fli@suse.com>,
	qemu-devel@nongnu.org, armbru@redhat.com, famz@redhat.com,
	quintela@redhat.com
Subject: Re: [Qemu-devel] [PATCH RFC v7 5/9] migration: fix the multifd code when sending less channels
Date: Fri, 2 Nov 2018 16:33:35 +0000	[thread overview]
Message-ID: <20181102163334.GC2387@work-vm> (raw)
In-Reply-To: <20181102033241.GF7804@xz-x1>

* Peter Xu (peterx@redhat.com) wrote:
> On Fri, Nov 02, 2018 at 11:00:24AM +0800, Fei Li wrote:
> > 
> > 
> > On 11/02/2018 10:37 AM, Peter Xu wrote:
> > > On Thu, Nov 01, 2018 at 06:17:11PM +0800, Fei Li wrote:
> > > > Set the migration state to "failed" instead of "setup" when failing
> > > > to send packet via some channel.
> > > Could you please provide more information in the commit message?
> > > E.g., what will happen if without this patch?  Will it crash the
> > > source or stall the source migration or others?  Otherwise it's a bit
> > > hard for me to understand what's this patch for.
> > Sorry for the inadequate description , I was intended to say that when
> > failing
> > to do the live migration using multifd, e.g. sending less channels, the src
> > status displays "setup" when running `info migrate`. I assume we should tell
> > users that the "Migration status" is "failed" now (and along with the
> > failure reason).
> > 
> > The current src status when failed inmultifd_new_send_channel_async():
> > 
> > 
> > (qemu) migrate_set_capability x-multifd on
> > (qemu) migrate_set_parameter x-multifd-channels 4
> > (qemu) migrate -d tcp:192.168.190.98:4444
> > (qemu) qemu-system-x86_64: failed in multifd_new_send_channel_async due to
> > ...
> > (qemu) info migrate
> > globals:
> > store-global-state: on
> > only-migratable: off
> > send-configuration: on
> > send-section-footer: on
> > decompress-error-check: on
> > capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks:
> > off compress: off events: off postcopy-ram: off x-colo: off release-ram: off
> > block: off return-path: off pause-before-switchover: off x-multifd: on
> > dirty-bitmaps: off postcopy-blocktime: off late-block-activate: off
> > Migration status: setup
> > total time: 0 milliseconds
> 
> Thanks for the information.
> 
> I had a quick look.  For now we do this:
> 
>         multifd_save_setup (without waiting for channels to be ready)
>         create thread migration_thread
>                 (in thread)
>                 ram_save_setup
>                         multifd_send_sync_main (wait for the channels)
> 
> The thing is that we didn't get the notification when one of the
> multifd channel is failed.  IMHO instead of setting the global
> migration state in a per-channel function, we should just report the
> error upwards, then the main thread should decide how to change the
> state machine of the migration.

Best to wait for Juan on that; I've got vague memories that reporting
errors among the threads was a bit tricky.

Dave

> And we have set it in migrate_set_error() after all so the main thread
> should be able to know somehow (though IMHO I'll even prefer to have a
> per-channel variable to keep the state of the channel, then the
> per-channel functions won't touch any globals which offers better
> isolation).
> 
> I'm not sure how Juan thinks about it, but I'd prefer some work to
> provide such isolation and also some mechanism to allow the main
> thread to detect the per-channel errors not only during setup phase
> but also during the migration (e.g., when network is suddenly down).
> Then we don't touch any globals (e.g., we shouldn't call
> migrate_get_current in any per-channel function like
> multifd_new_send_channel_async).
> 
> > 
> > > 
> > > Normally I would prefer to not touch global states in feature specific
> > > code path, but I'd like to know the problem more first...
> > > 
> > > Thanks,
> > > 
> > > > Cc: Peter Xu <peterx@redhat.com>
> > > > Signed-off-by: Fei Li <fli@suse.com>
> > > > ---
> > > >   migration/ram.c | 2 ++
> > > >   1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/migration/ram.c b/migration/ram.c
> > > > index 4db3b3e8f4..c84d164fc8 100644
> > > > --- a/migration/ram.c
> > > > +++ b/migration/ram.c
> > > > @@ -1072,6 +1072,7 @@ out:
> > > >   static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
> > > >   {
> > > >       MultiFDSendParams *p = opaque;
> > > > +    MigrationState *s = migrate_get_current();
> > > >       QIOChannel *sioc = QIO_CHANNEL(qio_task_get_source(task));
> > > >       Error *local_err = NULL;
> > > > @@ -1083,6 +1084,7 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
> > > >           if (multifd_save_cleanup(&local_err) != 0) {
> > > >               migrate_set_error(migrate_get_current(), local_err);
> > > >           }
> > > > +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> > > >       } else {
> > > >           p->c = QIO_CHANNEL(sioc);
> > > >           qio_channel_set_delay(p->c, false);
> > > > -- 
> > > > 2.13.7
> > > > 
> > > Regards,
> > > 
> > 
> 
> Regards,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

  parent reply	other threads:[~2018-11-02 16:33 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-01 10:17 [Qemu-devel] [PATCH RFC v7 0/9] qemu_thread_create: propagate errors to callers to check Fei Li
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 1/9] Fix segmentation fault when qemu_signal_init fails Fei Li
2018-11-05 13:32   ` Juan Quintela
2018-11-06  5:08     ` Fei Li
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 2/9] qemu_init_vcpu: add a new Error parameter to propagate Fei Li
2018-11-05 13:34   ` Juan Quintela
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 3/9] qemu_thread_join: fix segmentation fault Fei Li
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 4/9] migration: fix some segmentation faults when using multifd Fei Li
2018-11-02  2:31   ` Peter Xu
2018-11-02  6:03     ` Fei Li
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 5/9] migration: fix the multifd code when sending less channels Fei Li
2018-11-02  2:37   ` Peter Xu
2018-11-02  3:00     ` Fei Li
2018-11-02  3:32       ` Peter Xu
2018-11-02  7:13         ` Fei Li
2018-11-02  7:32           ` Peter Xu
2018-11-02 16:33         ` Dr. David Alan Gilbert [this message]
2018-11-12  4:43           ` Fei Li
2018-12-04  7:32             ` Fei Li
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 6/9] migration: fix the multifd code when receiving " Fei Li
2018-11-02  2:46   ` Peter Xu
2018-11-06  5:29     ` Fei Li
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 7/9] migration: remove unused &local_err parameter in migrate_set_error Fei Li
2018-11-05 13:59   ` Juan Quintela
2018-11-06  4:51     ` Fei Li
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 8/9] migration: add more error handling for postcopy_ram_enable_notify Fei Li
2018-11-01 10:17 ` [Qemu-devel] [PATCH RFC v7 9/9] qemu_thread_create: propagate the error to callers to handle Fei Li
2018-11-05 13:53   ` Juan Quintela
2018-11-06  7:15     ` Fei Li
2018-11-03 18:09 ` [Qemu-devel] [PATCH RFC v7 0/9] qemu_thread_create: propagate errors to callers to check no-reply
2018-11-05  4:57   ` Fei Li
2018-11-05 18:19 ` no-reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181102163334.GC2387@work-vm \
    --to=dgilbert@redhat.com \
    --cc=armbru@redhat.com \
    --cc=famz@redhat.com \
    --cc=fli@suse.com \
    --cc=peterx@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=quintela@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).