From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:44663)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X7AEG-00019U-EA for qemu-devel@nongnu.org;
	Tue, 15 Jul 2014 17:25:41 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1X7AEE-0003g5-Lu for qemu-devel@nongnu.org;
	Tue, 15 Jul 2014 17:25:40 -0400
Received: from mail-oa0-x231.google.com ([2607:f8b0:4003:c02::231]:57595)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1X7AEE-0003fq-99 for qemu-devel@nongnu.org;
	Tue, 15 Jul 2014 17:25:38 -0400
Received: by mail-oa0-f49.google.com with SMTP id eb12so6570937oac.36
	for ; Tue, 15 Jul 2014 14:25:37 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20140715210948.GA20036@amt.cnet>
References: <20140715050318.GD26186@grmbl.mre> <20140715210948.GA20036@amt.cnet>
From: Andrey Korolyov
Date: Wed, 16 Jul 2014 01:25:17 +0400
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] latest rc: virtio-blk hangs forever after migration
To: Marcelo Tosatti
Cc: Amit Shah, Paolo Bonzini, Fam Zheng, "qemu-devel@nongnu.org"

On Wed, Jul 16, 2014 at 1:09 AM, Marcelo Tosatti wrote:
> On Tue, Jul 15, 2014 at 06:01:08PM +0400, Andrey Korolyov wrote:
>> On Tue, Jul 15, 2014 at 10:52 AM, Andrey Korolyov wrote:
>> > On Tue, Jul 15, 2014 at 9:03 AM, Amit Shah wrote:
>> >> On (Sun) 13 Jul 2014 [16:28:56], Andrey Korolyov wrote:
>> >>> Hello,
>> >>>
>> >>> the issue is not specific to the iothread code because generic
>> >>> virtio-blk also hangs up:
>> >>
>> >> Do you know which version works well?  If you could bisect, that'll
>> >> help a lot.
>> >>
>> >> Thanks,
>> >> Amit
>> >
>> > Hi,
>> >
>> > 2.0 works definitely well. I'll try to finish bisection today, though
>> > every step takes about 10 minutes to complete.
>>
>> Yay! It is even outside of virtio-blk.
>>
>> commit 9b1786829aefb83f37a8f3135e3ea91c56001b56
>> Author: Marcelo Tosatti
>> Date:   Tue Jun 3 13:34:48 2014 -0300
>>
>>     kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
>>
>>     Ensure proper env->tsc value for kvmclock_current_nsec calculation.
>>
>>     Reported-by: Marcin Gibuła
>>     Cc: qemu-stable@nongnu.org
>>     Signed-off-by: Marcelo Tosatti
>>     Signed-off-by: Paolo Bonzini
>
> Andrey,
>
> Can you please provide instructions on how to create reproducible
> environment?
>
> The following patch is equivalent to the original patch, for
> the purposes of fixing the kvmclock problem.
>
> Perhaps it becomes easier to spot the reason for the hang you are
> experiencing.
>
>
> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
> index 272a88a..feb5fc5 100644
> --- a/hw/i386/kvm/clock.c
> +++ b/hw/i386/kvm/clock.c
> @@ -17,7 +17,6 @@
>  #include "qemu/host-utils.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/kvm.h"
> -#include "sysemu/cpus.h"
>  #include "hw/sysbus.h"
>  #include "hw/kvm/clock.h"
>
> @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s)
>
>      cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));
>
> -    assert(time.tsc_timestamp <= migration_tsc);
>      delta = migration_tsc - time.tsc_timestamp;
>      if (time.tsc_shift < 0) {
>          delta >>= -time.tsc_shift;
> @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, int running,
>          if (s->clock_valid) {
>              return;
>          }
> -
> -        cpu_synchronize_all_states();
>          ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
>          if (ret < 0) {
>              fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
> diff --git a/migration.c b/migration.c
> index 8d675b3..34f2325 100644
> --- a/migration.c
> +++ b/migration.c
> @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque)
>                  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
>                  old_vm_running = runstate_is_running();
>
> +                cpu_synchronize_all_states();
>                  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>                  if (ret >= 0) {
>                      qemu_file_set_rate_limit(s->file, INT64_MAX);

Marcelo,

I do not see an easier way than creating a PoC deployment (involving at
least two separate physical nodes) in which each node acts both as a
migration sender and receiver and as a Ceph storage node
(http://ceph.com/docs/master/start/). For simplicity you probably want
to disable cephx, so that the secret does not have to go on the command
line. Alternatively, you can get a minimal qemu-ready installation using
Mirantis' Fuel with the Ceph deployment settings (it'll deploy some
OpenStack too as a side effect, but the main reason to do things this
way is the very high level of provisioning automation: you'll get the
necessary multi-node environment with an RBD backend in a few clicks and
a few hours).

In the meantime, I'll try to reproduce the issue with iSCSI, because I
do not want to mess with shared storage and the sanlock plugin.
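
For readers following the quoted patch: the hunk in kvmclock_current_nsec() drops an assert that sits directly in front of an unsigned subtraction. Below is a minimal standalone sketch of that arithmetic, not the QEMU code itself. The struct and function names (pvclock_sample, scale_tsc_delta, sample_to_nsec, sample_to_nsec_checked) are hypothetical, the scaling step is assumed to be a 32.32 fixed-point multiply as the quoted shift lines suggest, and unsigned __int128 is a GCC/Clang extension used here only to keep the multiply short. It shows what the subtraction yields when the TSC value used at migration time is older than tsc_timestamp; it makes no claim about the cause of the hang discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the guest time fields that the quoted code
 * reads into `time` before computing the delta. */
struct pvclock_sample {
    uint64_t tsc_timestamp;      /* TSC value when system_time was taken */
    uint64_t system_time;        /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul;  /* 32.32 fixed-point multiplier (assumed) */
    int8_t   tsc_shift;          /* pre-shift applied to the TSC delta */
};

/* Shift the delta as in the quoted hunk, then scale it to nanoseconds. */
static uint64_t scale_tsc_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    if (shift < 0) {
        delta >>= -shift;
    } else {
        delta <<= shift;
    }
    /* 64x32-bit multiply, keeping bits 32..95 of the product. */
    return (uint64_t)(((unsigned __int128)delta * mul) >> 32);
}

/* Unchecked variant: mirrors the code after the assert is removed.
 * If tsc_now < tsc_timestamp, the unsigned subtraction wraps around. */
static uint64_t sample_to_nsec(const struct pvclock_sample *t, uint64_t tsc_now)
{
    uint64_t delta = tsc_now - t->tsc_timestamp;
    return t->system_time +
           scale_tsc_delta(delta, t->tsc_to_system_mul, t->tsc_shift);
}

/* Checked variant: mirrors the code with the assert still in place. */
static uint64_t sample_to_nsec_checked(const struct pvclock_sample *t,
                                       uint64_t tsc_now)
{
    assert(t->tsc_timestamp <= tsc_now);
    return sample_to_nsec(t, tsc_now);
}
```

With tsc_timestamp = 1000, system_time = 10, a multiplier of 0x80000000 (0.5 in 32.32 fixed point) and tsc_shift = 0, a TSC reading of 3000 gives 1010 ns. A reading of 999, one tick before the timestamp, wraps the delta to 2^64 - 1 and gives a value above 2^62 ns in the unchecked variant, while the checked variant aborts on the assert instead.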