From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59995)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from <mui@mui.fi>)
	id 1W7mk7-0001KT-Oh
	for qemu-devel@nongnu.org; Mon, 27 Jan 2014 09:01:01 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <mui@mui.fi>) id 1W7mk0-0008Mq-Ep
	for qemu-devel@nongnu.org; Mon, 27 Jan 2014 09:00:51 -0500
Received: from smtp-69.nebula.fi ([83.145.220.69]:35607 helo=smtp-68.nebula.fi)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from <mui@mui.fi>)
	id 1W7mk0-0008M7-0M
	for qemu-devel@nongnu.org; Mon, 27 Jan 2014 09:00:44 -0500
Received: from 83.150.66.103 (unknown [83.150.66.103])
	by smtp-68.nebula.fi (Postfix) with ESMTP id 36DFA32B0225
	for <qemu-devel@nongnu.org>; Mon, 27 Jan 2014 16:00:39 +0200 (EET)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8;
 format=flowed
Content-Transfer-Encoding: 7bit
Date: Mon, 27 Jan 2014 16:20:19 +0200
From: Markus Kovero <mui@mui.fi>
Message-ID: <e971ac508c89e89fe99f547d100dcb01@fiveam.org>
Subject: Re: [Qemu-devel] live migration between amd fam15h-fam10h
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: qemu-devel@nongnu.org

> Hi,
>
> I am getting a frozen guest when migrating from an Opteron 6274 host
> (amd fam15h) to an Opteron 6174 host (amd fam10h). The live migration
> completes successfully, but the guest is frozen: the VNC screen is still
> there, but no input is possible and no kernel output is seen. Trying "c"
> on the qemu-monitor does not help.
> I am using "-cpu Opteron_G3", which I assumed would be ok for both host
> cpus.
>
> In the opposite direction (migrating from an amd fam10h host to an amd
> fam15h host) the guest continues to run on the destination. However, on
> most of these successful live migrations, I notice a "clocksource
> unstable" message on the guest kernel (using the default kvm-clock
> clocksource), e.g.
> Clocksource tsc unstable (delta = -1500533439 ns)
> The same situation (guest runs on destination with clocksource unstable
> message) happens when migrating between fam15h hosts (I have not tried
> between fam10h hosts).
>
> Changing the clocksource (tsc, acpi_pm, hpet) does not solve the issue.
> Also tried with "-cpu kvm64" with the same result.
>
> qemu-kvm version: 0.15.1, 1.0 or qemu-kvm/master
> Host kernel: 3.0.15 (on both hosts)
> Guest kernel: 3.0.6 or 3.2
>
> this is the qemu-kvm command line used on the source host:
>
> "
> kvm -enable-kvm -m 1024 -smp 1 -cpu Opteron_G3,check \
>   -drive file=/opt/test.img,if=none,id=drive-virtio-disk1,format=raw,cache=writethrough,boot=on \
>   -device virtio-blk-pci,bus=pci.0,addr=0x5,drive=drive-virtio-disk1,id=virtio-disk1 \
>   -monitor stdio -vnc 0.0.0.0:6 -vga std -chardev pty,id=charserial0 \
>   -device isa-serial,chardev=charserial0,id=serial0 -usb -device usb-tablet,id=input0
> "
>
> The destination host has the same command line with an added
> "-incoming tcp:4444". I have mainly tested this with non-shared storage
> (but also shared storage has the same result). Migration is triggered
> with "migrate -b tcp:destip:4444"
>
> Do the TSC microarchitecture changes in amd fam15h (see AMD SW
> optimization guide for fam15h, 47414 Rev 3.02 Appendix E) affect pvclock
> stability on migration in the same family or across families?
>
> cpuid information follows in case it's helpful.
..snip..


Hi, I can confirm this problem still exists in live migrations between
Opteron 6128HE and Opteron 6274.
Live migration from the 6100 series to the 6200 series works, but never
from 6200 to 6100.
The issue is reproducible and the symptoms are identical to the previous
poster's.
I have tested with a 3.10.5 host kernel and qemu 1.7, and also with 3.1.4
and qemu >1.0. The guest kernel seems to be irrelevant at this point (as
it crashes any OS).

I would say this needs attention, and I'm willing to help get this
sorted out.
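
As a possible starting point for comparing the two hosts, here is a rough
Python sketch that lists the CPU feature flags the source host reports in
/proc/cpuinfo but the destination does not. This is only an assumption
about where to look, not a diagnosis: the guest is started with
"-cpu Opteron_G3", so host-only flags may well be unrelated to the freeze.
The function names are made up for illustration.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No 'flags' line found (e.g. empty or truncated input).
    return set()


def missing_on_destination(src_cpuinfo, dst_cpuinfo):
    """Return a sorted list of flags present on the source host only."""
    return sorted(cpu_flags(src_cpuinfo) - cpu_flags(dst_cpuinfo))
```

To use it, copy /proc/cpuinfo from each host into a file, read both files,
and pass their contents as missing_on_destination(source_text, dest_text).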

Thanks for your thoughts.

Yours
Markus Kovero
+358 40 577 1129