From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:36901) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X7Kk7-0005VO-Br for qemu-devel@nongnu.org; Wed, 16 Jul 2014 04:39:19 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1X7Kk5-00019v-Ib for qemu-devel@nongnu.org; Wed, 16 Jul 2014 04:39:15 -0400
Received: from mail-ob0-x232.google.com ([2607:f8b0:4003:c01::232]:49871) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X7Kk5-00019Z-7O for qemu-devel@nongnu.org; Wed, 16 Jul 2014 04:39:13 -0400
Received: by mail-ob0-f178.google.com with SMTP id nu7so595623obb.23 for ; Wed, 16 Jul 2014 01:39:12 -0700 (PDT)
MIME-Version: 1.0
In-Reply-To: <20140716011634.GA30717@amt.cnet>
References: <20140715050318.GD26186@grmbl.mre> <20140715210948.GA20036@amt.cnet> <53C5A4C9.80609@redhat.com> <20140716011634.GA30717@amt.cnet>
From: Andrey Korolyov
Date: Wed, 16 Jul 2014 12:38:51 +0400
Message-ID:
Content-Type: multipart/mixed; boundary=089e0129458c5dbfd304fe4b75ab
Subject: Re: [Qemu-devel] latest rc: virtio-blk hangs forever after migration
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Marcelo Tosatti
Cc: Amit Shah, Paolo Bonzini, Fam Zheng, "qemu-devel@nongnu.org"

--089e0129458c5dbfd304fe4b75ab
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Wed, Jul 16, 2014 at 5:16 AM, Marcelo Tosatti wrote:
> On Wed, Jul 16, 2014 at 03:40:47AM +0400, Andrey Korolyov wrote:
>> On Wed, Jul 16, 2014 at 2:01 AM, Paolo Bonzini wrote:
>> > On 15/07/2014 23:25, Andrey Korolyov wrote:
>> >
>> >> On Wed, Jul 16, 2014 at 1:09 AM, Marcelo Tosatti wrote:
>> >>>
>> >>> On Tue, Jul 15, 2014 at 06:01:08PM +0400, Andrey Korolyov wrote:
>> >>>>
>> >>>> On Tue, Jul 15, 2014 at 10:52 AM, Andrey Korolyov wrote:
>> >>>>>
>> >>>>> On Tue, Jul 15, 2014 at 9:03 AM, Amit Shah wrote:
>> >>>>>>
>> >>>>>> On (Sun) 13 Jul 2014 [16:28:56], Andrey Korolyov wrote:
>> >>>>>>>
>> >>>>>>> Hello,
>> >>>>>>>
>> >>>>>>> the issue is not specific to the iothread code because generic
>> >>>>>>> virtio-blk also hangs up:
>> >>>>>>
>> >>>>>>
>> >>>>>> Do you know which version works well? If you could bisect, that'll
>> >>>>>> help a lot.
>> >>>>>>
>> >>>>>> Thanks,
>> >>>>>> Amit
>> >>>>>
>> >>>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> 2.0 definitely works well. I'll try to finish the bisection today, though
>> >>>>> every step takes about 10 minutes to complete.
>> >>>>
>> >>>>
>> >>>> Yay! It is even outside of virtio-blk.
>> >>>>
>> >>>> commit 9b1786829aefb83f37a8f3135e3ea91c56001b56
>> >>>> Author: Marcelo Tosatti
>> >>>> Date: Tue Jun 3 13:34:48 2014 -0300
>> >>>>
>> >>>>     kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec
>> >>>>     calculation
>> >>>>
>> >>>>     Ensure proper env->tsc value for kvmclock_current_nsec calculation.
>> >>>>
>> >>>>     Reported-by: Marcin Gibuła
>> >>>>     Cc: qemu-stable@nongnu.org
>> >>>>     Signed-off-by: Marcelo Tosatti
>> >>>>     Signed-off-by: Paolo Bonzini
>> >>>
>> >>>
>> >>> Andrey,
>> >>>
>> >>> Can you please provide instructions on how to create a reproducible
>> >>> environment?
>> >>>
>> >>> The following patch is equivalent to the original patch, for
>> >>> the purposes of fixing the kvmclock problem.
>> >>>
>> >>> Perhaps it becomes easier to spot the reason for the hang you are
>> >>> experiencing.
>> >>>
>> >>>
>> >>> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
>> >>> index 272a88a..feb5fc5 100644
>> >>> --- a/hw/i386/kvm/clock.c
>> >>> +++ b/hw/i386/kvm/clock.c
>> >>> @@ -17,7 +17,6 @@
>> >>>  #include "qemu/host-utils.h"
>> >>>  #include "sysemu/sysemu.h"
>> >>>  #include "sysemu/kvm.h"
>> >>> -#include "sysemu/cpus.h"
>> >>>  #include "hw/sysbus.h"
>> >>>  #include "hw/kvm/clock.h"
>> >>>
>> >>> @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s)
>> >>>
>> >>>      cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));
>> >>>
>> >>> -    assert(time.tsc_timestamp <= migration_tsc);
>> >>>      delta = migration_tsc - time.tsc_timestamp;
>> >>>      if (time.tsc_shift < 0) {
>> >>>          delta >>= -time.tsc_shift;
>> >>> @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, int running,
>> >>>          if (s->clock_valid) {
>> >>>              return;
>> >>>          }
>> >>> -
>> >>> -        cpu_synchronize_all_states();
>> >>>          ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
>> >>>          if (ret < 0) {
>> >>>              fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
>> >>> diff --git a/migration.c b/migration.c
>> >>> index 8d675b3..34f2325 100644
>> >>> --- a/migration.c
>> >>> +++ b/migration.c
>> >>> @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque)
>> >>>                  qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER);
>> >>>                  old_vm_running = runstate_is_running();
>> >>>
>> >>> +                cpu_synchronize_all_states();
>> >>>                  ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>> >>>                  if (ret >= 0) {
>> >>>                      qemu_file_set_rate_limit(s->file, INT64_MAX);
>> >
>> >
>> > It could also be useful to apply the above patch _and_ revert
>> > a096b3a6732f846ec57dc28b47ee9435aa0609bf, then try to reproduce.
>> >
>> > Paolo
>>
>> Yes, it solved the issue for me! (It took a long time to check because
>> most of the country's Debian mirrors went inconsistent for some reason.)
>>
>> Also a trivial addition:
>>
>> diff --git a/migration.c b/migration.c
>> index 34f2325..65d1c88 100644
>> --- a/migration.c
>> +++ b/migration.c
>> @@ -25,6 +25,7 @@
>>  #include "qemu/thread.h"
>>  #include "qmp-commands.h"
>>  #include "trace.h"
>> +#include "sysemu/cpus.h"
>
> And what about not reverting a096b3a6732f846ec57dc28b47ee9435aa0609bf ?
>
> That is, test with a stock qemu.git tree and the patch sent today,
> on this thread, to move cpu_synchronize_all_states ?
>
>

What makes things work for me is a revert of
9b1786829aefb83f37a8f3135e3ea91c56001b56 on top, without adding any other
patches. I tested two cases: one with Alexander's patch completely
reverted plus the suggestion from Marcelo, and one with only the 9b178682
revert plus the same suggestion. The difference is that as long as
Alexander's patch is not reverted, live migration always fails by hitting
the timeout, and once it is reverted, migration always succeeds in 8-10
seconds. The corresponding diffs are attached for reference.
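
For reference, the arithmetic that kvmclock_current_nsec() performs in the hunk quoted above can be sketched in standalone C. This is a minimal sketch, not the QEMU code: struct pvclock_sample and pvclock_nsec() are hypothetical names, the struct is a simplified stand-in for the guest's time-info page, and the 128-bit multiply is replaced by a split 64-bit multiply. It only illustrates that delta is unsigned, so a TSC value older than tsc_timestamp (the case the removed assert guarded against) wraps into a very large clock value instead of going negative. Whether that is what triggers the hang here is not established in this thread.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the guest's pvclock time info.
 * Field names follow the quoted diff; the layout is NOT the real ABI. */
struct pvclock_sample {
    uint64_t tsc_timestamp;     /* TSC value when system_time was sampled */
    uint64_t system_time;       /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul; /* ns-per-tick multiplier, scaled by 2^-32 */
    int8_t   tsc_shift;         /* pre-scale shift applied to the delta */
};

/* Convert a TSC reading to nanoseconds the way the quoted hunk does:
 * nsec = ((delta scaled by tsc_shift) * mul) >> 32, plus system_time. */
static uint64_t pvclock_nsec(const struct pvclock_sample *t, uint64_t tsc)
{
    /* Unsigned subtraction: wraps to a huge value if tsc < tsc_timestamp. */
    uint64_t delta = tsc - t->tsc_timestamp;

    if (t->tsc_shift < 0) {
        delta >>= -t->tsc_shift;
    } else {
        delta <<= t->tsc_shift;
    }

    /* Low 64 bits of (delta * mul) >> 32, computed without a 128-bit type:
     * delta = hi * 2^32 + lo, so the result is hi * mul + ((lo * mul) >> 32). */
    uint64_t hi = delta >> 32;
    uint64_t lo = delta & 0xffffffffu;
    uint64_t scaled = hi * t->tsc_to_system_mul
                    + ((lo * t->tsc_to_system_mul) >> 32);

    return scaled + t->system_time;
}
```

With tsc_timestamp = 1000, system_time = 5000 and a multiplier of 0x80000000 (0.5 ns per tick), pvclock_nsec(&t, 3000) gives 6000, while a stale TSC of 999 yields a value near 2^63.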
--089e0129458c5dbfd304fe4b75ab Content-Type: text/plain; charset=US-ASCII; name="diff-with-reverted-agraf-patch.txt" Content-Disposition: attachment; filename="diff-with-reverted-agraf-patch.txt" Content-Transfer-Encoding: base64 X-Attachment-Id: f_hxoe6t5i0 ZGlmZiAtLWdpdCBhL2h3L2kzODYva3ZtL2Nsb2NrLmMgYi9ody9pMzg2L2t2bS9jbG9jay5jCmlu ZGV4IDI3MmE4OGEuLjkzZTE4MjkgMTAwNjQ0Ci0tLSBhL2h3L2kzODYva3ZtL2Nsb2NrLmMKKysr IGIvaHcvaTM4Ni9rdm0vY2xvY2suYwpAQCAtMTcsNyArMTcsNiBAQAogI2luY2x1ZGUgInFlbXUv aG9zdC11dGlscy5oIgogI2luY2x1ZGUgInN5c2VtdS9zeXNlbXUuaCIKICNpbmNsdWRlICJzeXNl bXUva3ZtLmgiCi0jaW5jbHVkZSAic3lzZW11L2NwdXMuaCIKICNpbmNsdWRlICJody9zeXNidXMu aCIKICNpbmNsdWRlICJody9rdm0vY2xvY2suaCIKIApAQCAtMzYsNDkgKzM1LDYgQEAgdHlwZWRl ZiBzdHJ1Y3QgS1ZNQ2xvY2tTdGF0ZSB7CiAgICAgYm9vbCBjbG9ja192YWxpZDsKIH0gS1ZNQ2xv Y2tTdGF0ZTsKIAotc3RydWN0IHB2Y2xvY2tfdmNwdV90aW1lX2luZm8gewotICAgIHVpbnQzMl90 ICAgdmVyc2lvbjsKLSAgICB1aW50MzJfdCAgIHBhZDA7Ci0gICAgdWludDY0X3QgICB0c2NfdGlt ZXN0YW1wOwotICAgIHVpbnQ2NF90ICAgc3lzdGVtX3RpbWU7Ci0gICAgdWludDMyX3QgICB0c2Nf dG9fc3lzdGVtX211bDsKLSAgICBpbnQ4X3QgICAgIHRzY19zaGlmdDsKLSAgICB1aW50OF90ICAg IGZsYWdzOwotICAgIHVpbnQ4X3QgICAgcGFkWzJdOwotfSBfX2F0dHJpYnV0ZV9fKChfX3BhY2tl ZF9fKSk7IC8qIDMyIGJ5dGVzICovCi0KLXN0YXRpYyB1aW50NjRfdCBrdm1jbG9ja19jdXJyZW50 X25zZWMoS1ZNQ2xvY2tTdGF0ZSAqcykKLXsKLSAgICBDUFVTdGF0ZSAqY3B1ID0gZmlyc3RfY3B1 OwotICAgIENQVVg4NlN0YXRlICplbnYgPSBjcHUtPmVudl9wdHI7Ci0gICAgaHdhZGRyIGt2bWNs b2NrX3N0cnVjdF9wYSA9IGVudi0+c3lzdGVtX3RpbWVfbXNyICYgfjFVTEw7Ci0gICAgdWludDY0 X3QgbWlncmF0aW9uX3RzYyA9IGVudi0+dHNjOwotICAgIHN0cnVjdCBwdmNsb2NrX3ZjcHVfdGlt ZV9pbmZvIHRpbWU7Ci0gICAgdWludDY0X3QgZGVsdGE7Ci0gICAgdWludDY0X3QgbnNlY19sbzsK LSAgICB1aW50NjRfdCBuc2VjX2hpOwotICAgIHVpbnQ2NF90IG5zZWM7Ci0KLSAgICBpZiAoIShl bnYtPnN5c3RlbV90aW1lX21zciAmIDFVTEwpKSB7Ci0gICAgICAgIC8qIEtWTSBjbG9jayBub3Qg YWN0aXZlICovCi0gICAgICAgIHJldHVybiAwOwotICAgIH0KLQotICAgIGNwdV9waHlzaWNhbF9t ZW1vcnlfcmVhZChrdm1jbG9ja19zdHJ1Y3RfcGEsICZ0aW1lLCBzaXplb2YodGltZSkpOwotCi0g 
ICAgYXNzZXJ0KHRpbWUudHNjX3RpbWVzdGFtcCA8PSBtaWdyYXRpb25fdHNjKTsKLSAgICBkZWx0 YSA9IG1pZ3JhdGlvbl90c2MgLSB0aW1lLnRzY190aW1lc3RhbXA7Ci0gICAgaWYgKHRpbWUudHNj X3NoaWZ0IDwgMCkgewotICAgICAgICBkZWx0YSA+Pj0gLXRpbWUudHNjX3NoaWZ0OwotICAgIH0g ZWxzZSB7Ci0gICAgICAgIGRlbHRhIDw8PSB0aW1lLnRzY19zaGlmdDsKLSAgICB9Ci0KLSAgICBt dWx1NjQoJm5zZWNfbG8sICZuc2VjX2hpLCBkZWx0YSwgdGltZS50c2NfdG9fc3lzdGVtX211bCk7 Ci0gICAgbnNlYyA9IChuc2VjX2xvID4+IDMyKSB8IChuc2VjX2hpIDw8IDMyKTsKLSAgICByZXR1 cm4gbnNlYyArIHRpbWUuc3lzdGVtX3RpbWU7Ci19Ci0KIHN0YXRpYyB2b2lkIGt2bWNsb2NrX3Zt X3N0YXRlX2NoYW5nZSh2b2lkICpvcGFxdWUsIGludCBydW5uaW5nLAogICAgICAgICAgICAgICAg ICAgICAgICAgICAgICAgICAgICAgIFJ1blN0YXRlIHN0YXRlKQogewpAQCAtODksMTUgKzQ1LDkg QEAgc3RhdGljIHZvaWQga3ZtY2xvY2tfdm1fc3RhdGVfY2hhbmdlKHZvaWQgKm9wYXF1ZSwgaW50 IHJ1bm5pbmcsCiAKICAgICBpZiAocnVubmluZykgewogICAgICAgICBzdHJ1Y3Qga3ZtX2Nsb2Nr X2RhdGEgZGF0YTsKLSAgICAgICAgdWludDY0X3QgdGltZV9hdF9taWdyYXRpb24gPSBrdm1jbG9j a19jdXJyZW50X25zZWMocyk7CiAKICAgICAgICAgcy0+Y2xvY2tfdmFsaWQgPSBmYWxzZTsKIAot CS8qIFdlIGNhbid0IHJlbHkgb24gdGhlIG1pZ3JhdGVkIGNsb2NrIHZhbHVlLCBqdXN0IGRpc2Nh cmQgaXQgKi8KLQlpZiAodGltZV9hdF9taWdyYXRpb24pIHsKLQkgICAgICAgIHMtPmNsb2NrID0g dGltZV9hdF9taWdyYXRpb247Ci0JfQotCiAgICAgICAgIGRhdGEuY2xvY2sgPSBzLT5jbG9jazsK ICAgICAgICAgZGF0YS5mbGFncyA9IDA7CiAgICAgICAgIHJldCA9IGt2bV92bV9pb2N0bChrdm1f c3RhdGUsIEtWTV9TRVRfQ0xPQ0ssICZkYXRhKTsKQEAgLTEyNSw4ICs3NSw2IEBAIHN0YXRpYyB2 b2lkIGt2bWNsb2NrX3ZtX3N0YXRlX2NoYW5nZSh2b2lkICpvcGFxdWUsIGludCBydW5uaW5nLAog ICAgICAgICBpZiAocy0+Y2xvY2tfdmFsaWQpIHsKICAgICAgICAgICAgIHJldHVybjsKICAgICAg ICAgfQotCi0gICAgICAgIGNwdV9zeW5jaHJvbml6ZV9hbGxfc3RhdGVzKCk7CiAgICAgICAgIHJl dCA9IGt2bV92bV9pb2N0bChrdm1fc3RhdGUsIEtWTV9HRVRfQ0xPQ0ssICZkYXRhKTsKICAgICAg ICAgaWYgKHJldCA8IDApIHsKICAgICAgICAgICAgIGZwcmludGYoc3RkZXJyLCAiS1ZNX0dFVF9D TE9DSyBmYWlsZWQ6ICVzXG4iLCBzdHJlcnJvcihyZXQpKTsKZGlmZiAtLWdpdCBhL21pZ3JhdGlv bi5jIGIvbWlncmF0aW9uLmMKaW5kZXggOGQ2NzViMy4uNjVkMWM4OCAxMDA2NDQKLS0tIGEvbWln 
cmF0aW9uLmMKKysrIGIvbWlncmF0aW9uLmMKQEAgLTI1LDYgKzI1LDcgQEAKICNpbmNsdWRlICJx ZW11L3RocmVhZC5oIgogI2luY2x1ZGUgInFtcC1jb21tYW5kcy5oIgogI2luY2x1ZGUgInRyYWNl LmgiCisjaW5jbHVkZSAic3lzZW11L2NwdXMuaCIKIAogZW51bSB7CiAgICAgTUlHX1NUQVRFX0VS Uk9SID0gLTEsCkBAIC02MDgsNiArNjA5LDcgQEAgc3RhdGljIHZvaWQgKm1pZ3JhdGlvbl90aHJl YWQodm9pZCAqb3BhcXVlKQogICAgICAgICAgICAgICAgIHFlbXVfc3lzdGVtX3dha2V1cF9yZXF1 ZXN0KFFFTVVfV0FLRVVQX1JFQVNPTl9PVEhFUik7CiAgICAgICAgICAgICAgICAgb2xkX3ZtX3J1 bm5pbmcgPSBydW5zdGF0ZV9pc19ydW5uaW5nKCk7CiAKKyAgICAgICAgICAgICAgICBjcHVfc3lu Y2hyb25pemVfYWxsX3N0YXRlcygpOwogICAgICAgICAgICAgICAgIHJldCA9IHZtX3N0b3BfZm9y Y2Vfc3RhdGUoUlVOX1NUQVRFX0ZJTklTSF9NSUdSQVRFKTsKICAgICAgICAgICAgICAgICBpZiAo cmV0ID49IDApIHsKICAgICAgICAgICAgICAgICAgICAgcWVtdV9maWxlX3NldF9yYXRlX2xpbWl0 KHMtPmZpbGUsIElOVDY0X01BWCk7Cg== --089e0129458c5dbfd304fe4b75ab Content-Type: text/plain; charset=US-ASCII; name="diff-with-only-late-fix-moved.txt" Content-Disposition: attachment; filename="diff-with-only-late-fix-moved.txt" Content-Transfer-Encoding: base64 X-Attachment-Id: f_hxoe6xx01 ZGlmZiAtLWdpdCBhL2h3L2kzODYva3ZtL2Nsb2NrLmMgYi9ody9pMzg2L2t2bS9jbG9jay5jCmlu ZGV4IDI3MmE4OGEuLmZlYjVmYzUgMTAwNjQ0Ci0tLSBhL2h3L2kzODYva3ZtL2Nsb2NrLmMKKysr IGIvaHcvaTM4Ni9rdm0vY2xvY2suYwpAQCAtMTcsNyArMTcsNiBAQAogI2luY2x1ZGUgInFlbXUv aG9zdC11dGlscy5oIgogI2luY2x1ZGUgInN5c2VtdS9zeXNlbXUuaCIKICNpbmNsdWRlICJzeXNl bXUva3ZtLmgiCi0jaW5jbHVkZSAic3lzZW11L2NwdXMuaCIKICNpbmNsdWRlICJody9zeXNidXMu aCIKICNpbmNsdWRlICJody9rdm0vY2xvY2suaCIKIApAQCAtNjYsNyArNjUsNiBAQCBzdGF0aWMg dWludDY0X3Qga3ZtY2xvY2tfY3VycmVudF9uc2VjKEtWTUNsb2NrU3RhdGUgKnMpCiAKICAgICBj cHVfcGh5c2ljYWxfbWVtb3J5X3JlYWQoa3ZtY2xvY2tfc3RydWN0X3BhLCAmdGltZSwgc2l6ZW9m KHRpbWUpKTsKIAotICAgIGFzc2VydCh0aW1lLnRzY190aW1lc3RhbXAgPD0gbWlncmF0aW9uX3Rz Yyk7CiAgICAgZGVsdGEgPSBtaWdyYXRpb25fdHNjIC0gdGltZS50c2NfdGltZXN0YW1wOwogICAg IGlmICh0aW1lLnRzY19zaGlmdCA8IDApIHsKICAgICAgICAgZGVsdGEgPj49IC10aW1lLnRzY19z aGlmdDsKQEAgLTEyNSw4ICsxMjMsNiBAQCBzdGF0aWMgdm9pZCBrdm1jbG9ja192bV9zdGF0ZV9j 
aGFuZ2Uodm9pZCAqb3BhcXVlLCBpbnQgcnVubmluZywKICAgICAgICAgaWYgKHMtPmNsb2NrX3Zh bGlkKSB7CiAgICAgICAgICAgICByZXR1cm47CiAgICAgICAgIH0KLQotICAgICAgICBjcHVfc3lu Y2hyb25pemVfYWxsX3N0YXRlcygpOwogICAgICAgICByZXQgPSBrdm1fdm1faW9jdGwoa3ZtX3N0 YXRlLCBLVk1fR0VUX0NMT0NLLCAmZGF0YSk7CiAgICAgICAgIGlmIChyZXQgPCAwKSB7CiAgICAg ICAgICAgICBmcHJpbnRmKHN0ZGVyciwgIktWTV9HRVRfQ0xPQ0sgZmFpbGVkOiAlc1xuIiwgc3Ry ZXJyb3IocmV0KSk7CmRpZmYgLS1naXQgYS9taWdyYXRpb24uYyBiL21pZ3JhdGlvbi5jCmluZGV4 IDhkNjc1YjMuLjY1ZDFjODggMTAwNjQ0Ci0tLSBhL21pZ3JhdGlvbi5jCisrKyBiL21pZ3JhdGlv bi5jCkBAIC0yNSw2ICsyNSw3IEBACiAjaW5jbHVkZSAicWVtdS90aHJlYWQuaCIKICNpbmNsdWRl ICJxbXAtY29tbWFuZHMuaCIKICNpbmNsdWRlICJ0cmFjZS5oIgorI2luY2x1ZGUgInN5c2VtdS9j cHVzLmgiCiAKIGVudW0gewogICAgIE1JR19TVEFURV9FUlJPUiA9IC0xLApAQCAtNjA4LDYgKzYw OSw3IEBAIHN0YXRpYyB2b2lkICptaWdyYXRpb25fdGhyZWFkKHZvaWQgKm9wYXF1ZSkKICAgICAg ICAgICAgICAgICBxZW11X3N5c3RlbV93YWtldXBfcmVxdWVzdChRRU1VX1dBS0VVUF9SRUFTT05f T1RIRVIpOwogICAgICAgICAgICAgICAgIG9sZF92bV9ydW5uaW5nID0gcnVuc3RhdGVfaXNfcnVu bmluZygpOwogCisgICAgICAgICAgICAgICAgY3B1X3N5bmNocm9uaXplX2FsbF9zdGF0ZXMoKTsK ICAgICAgICAgICAgICAgICByZXQgPSB2bV9zdG9wX2ZvcmNlX3N0YXRlKFJVTl9TVEFURV9GSU5J U0hfTUlHUkFURSk7CiAgICAgICAgICAgICAgICAgaWYgKHJldCA+PSAwKSB7CiAgICAgICAgICAg ICAgICAgICAgIHFlbXVfZmlsZV9zZXRfcmF0ZV9saW1pdChzLT5maWxlLCBJTlQ2NF9NQVgpOwo= --089e0129458c5dbfd304fe4b75ab--