From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Yori Fang <fangying1@huawei.com>
Cc: "Marc-André Lureau" <marcandre.lureau@gmail.com>,
"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] QEMU abort when network service is restarted during live migration with vhost-user as the network backend
Date: Wed, 15 Nov 2017 19:39:44 +0000
Message-ID: <20171115193943.GA14688@work-vm>
In-Reply-To: <bbabe94a-bcb3-5ad5-7dfb-72327f8178b4@huawei.com>
* Yori Fang (fangying1@huawei.com) wrote:
>
>
> On 2017/11/14 19:40, Marc-André Lureau wrote:
> > Hi
> >
> > On Tue, Nov 14, 2017 at 8:09 AM, fangying <fangying1@huawei.com> wrote:
> >> Hi all,
> >>
> >> We have a VM being live-migrated with vhost-user as the network backend. We noticed that QEMU aborts if openvswitch is restarted
> >> while MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward) is being called. The reason is clear: vhost_dev_set_log returns -1 because
> >> the network connection is temporarily lost due to the restart of the openvswitch service.
> >>
> >> Below is the trace of the call stack.
> >>
> >> #0 0x00007f868ed971d7 in raise() from /usr/lib64/libc.so.6
> >> #1 0x00007f868ed988c8 in abort() from /usr/lib64/libc.so.6
> >> #2 0x00000000004d0d35 in vhost_log_global_start (listener=<optimized out>) at /usr/src/debug/qemu-kvm-2.8.1/hw/virtio/vhost.c:794
> >> #2 0x0000000000486bd2 in memory_global_dirty_log_start at /usr/src/debug/qemu-kvm-2.8.1/memory.c:2304
> >> #3 0x0000000000486dcd in ram_save_init_globals at /usr/src/debug/qemu-kvm-2.8.1/migration/ram.c:2072
> >> #4 0x000000000048c185 in ram_save_setup (f=0x25e6ac0, opaque=<optimized out>) at /usr/src/debug/qemu-kvm-2.8.1/migration/ram.c:2093
> >> #5 0x00000000004fbee2 in qemu_savevm_state_begin at /usr/src/debug/qemu-kvm-2.8.1/migration/savevm.c:956
> >> #6 0x000000000083d8f8 in migration_thread at migration/migration.c:2198
> >>
> >> static void vhost_log_global_start(MemoryListener *listener)
> >> {
> >>     int r;
> >>
> >>     r = vhost_migration_log(listener, true);
> >>     if (r < 0) {
> >>         abort(); /* branch taken */
> >>     }
> >> }
> >>
> >> What confuses me is:
> >> 1. Do we really need to abort here?
> >
> > Not if we have a sane way to handle the situation. It makes sense,
> > though, to not want to support that use case (restarting the vhost-user
> > process during migration).
> >
> >> 2. All the callbacks in MemoryListener return void, so no function higher up the call stack can detect the failure.
> >> Can we just cancel the migration here instead of calling abort? Like:
> >
> > That would be acceptable to me, but there should be a better way than
> > calling qmp_migrate_cancel() (we need to give a reason for cancelling,
> > and report it to user). Juan should be able to help.
>
> I agree with you; we'd better give more details here instead of passing NULL to qmp_migrate_cancel.

This is an unfortunate place to want to kill the migration; cancel is
the wrong call to use, though (it's for users to cancel it).
I think the only way to do it is to call
  migrate_fd_error(migrate_get_current(),...)
It's not nice, but it's the only one I can think of from here.

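A rough, untested sketch of that idea follows. It is not a patch against
the QEMU tree: every type and helper below is a simplified stand-in that
only borrows the QEMU names (the real Error, error_setg, MigrationState
and migrate_fd_error live in qapi/error.h and the migration code), and
the backend_connected flag is purely hypothetical. It only shows the
shape of the change: the void callback records a failure with a reason
on the current migration instead of aborting. Whether the real
migrate_fd_error() can safely be called at this point in the migration
thread is not established here.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-ins for QEMU types; NOT the real definitions. */
typedef struct Error { char msg[128]; } Error;
typedef struct MigrationState { int failed; char reason[128]; } MigrationState;
/* backend_connected is a hypothetical flag used only by this sketch. */
typedef struct MemoryListener { int backend_connected; } MemoryListener;

static MigrationState current_migration;

/* Stub: returns the single global migration state. */
static MigrationState *migrate_get_current(void)
{
    return &current_migration;
}

/* Stub: marks the migration as failed and keeps the reason text. */
static void migrate_fd_error(MigrationState *s, const Error *error)
{
    assert(s != NULL && error != NULL);
    s->failed = 1;
    snprintf(s->reason, sizeof(s->reason), "%s", error->msg);
}

/* Stub: in QEMU error_setg is a macro; here it is a plain function. */
static void error_setg(Error **errp, const char *fmt, ...)
{
    va_list ap;
    Error *err = malloc(sizeof(*err));

    if (!err) {
        abort();
    }
    va_start(ap, fmt);
    vsnprintf(err->msg, sizeof(err->msg), fmt, ap);
    va_end(ap);
    *errp = err;
}

static void error_free(Error *err)
{
    free(err);
}

/* Stub: fails with -1 when the (hypothetical) backend link is down. */
static int vhost_migration_log(MemoryListener *listener, int enable)
{
    (void)enable;
    return listener->backend_connected ? 0 : -1;
}

/* The callback returns void, so the failure is reported via the
 * migration state rather than a return value or abort(). */
static void vhost_log_global_start(MemoryListener *listener)
{
    int r = vhost_migration_log(listener, 1);

    if (r < 0) {
        Error *err = NULL;

        error_setg(&err, "vhost: failed to start dirty logging (r=%d)", r);
        migrate_fd_error(migrate_get_current(), err);
        error_free(err);
    }
}
```

Calling vhost_log_global_start() on a listener whose backend_connected is
0 leaves the stub migration state marked failed with the reason string
set; with backend_connected set to 1 the state is untouched.
vhost_log_global_stop() would follow the same pattern.
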
Dave
> >
> >>
> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
> >> index ddc42f0..27ae4a2 100644
> >> --- a/hw/virtio/vhost.c
> >> +++ b/hw/virtio/vhost.c
> >> @@ -27,6 +27,7 @@
> >> #include "hw/virtio/virtio-access.h"
> >> #include "migration/blocker.h"
> >> #include "sysemu/dma.h"
> >> +#include "qmp-commands.h"
> >>
> >> /* enabled until disconnected backend stabilizes */
> >> #define _VHOST_DEBUG 1
> >> @@ -885,7 +886,7 @@ static void vhost_log_global_start(MemoryListener *listener)
> >>
> >>      r = vhost_migration_log(listener, true);
> >>      if (r < 0) {
> >> -        abort();
> >> +        qmp_migrate_cancel(NULL);
> >>      }
> >>  }
> >>
> >> @@ -895,7 +896,7 @@ static void vhost_log_global_stop(MemoryListener *listener)
> >>
> >>      r = vhost_migration_log(listener, false);
> >>      if (r < 0) {
> >> -        abort();
> >> +        qmp_migrate_cancel(NULL);
> >>      }
> >>  }
> >>
> >>
> >>
> >>
> >
> >
> >
>
>
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK