From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:37240)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WDEf0-0000ZC-In
	for qemu-devel@nongnu.org; Tue, 11 Feb 2014 09:50:11 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1WDEeu-0006Nb-IB
	for qemu-devel@nongnu.org; Tue, 11 Feb 2014 09:50:06 -0500
Received: from mail-ph.de-nserver.de ([85.158.179.214]:58784)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1WDEeu-0006NO-94
	for qemu-devel@nongnu.org; Tue, 11 Feb 2014 09:50:00 -0500
Message-ID: <52FA3893.3070402@profihost.ag>
Date: Tue, 11 Feb 2014 15:49:55 +0100
From: Stefan Priebe - Profihost AG
MIME-Version: 1.0
References: <52F4A952.6080007@profihost.ag> <52F4CB3E.3030203@profihost.ag>
	<20140207122157.GF2374@work-vm> <52F4D1CA.6030501@profihost.ag>
	<52F4D523.9040809@redhat.com> <52F4D9E2.4010600@profihost.ag>
	<52F4DD77.7000209@redhat.com> <52F4E211.9080808@profihost.ag>
	<52F4E37D.4050204@profihost.ag> <52F5321C.4090605@profihost.ag>
	<20140207200204.GA5013@work-vm> <52F53DBC.8000209@profihost.ag>
	<52F68438.9050300@profihost.ag> <52FA2660.9000204@redhat.com>
	<52FA268E.4030700@profihost.ag> <52FA2971.5080108@redhat.com>
In-Reply-To: <52FA2971.5080108@redhat.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [pve-devel] QEMU LIve Migration - swap_free: Bad swap file entry
To: Orit Wasserman , "Dr. David Alan Gilbert"
Cc: Paolo Bonzini , qemu-devel , Alexandre DERUMIER

On 11.02.2014 at 14:45, Orit Wasserman wrote:
> On 02/11/2014 03:33 PM, Stefan Priebe - Profihost AG wrote:
>>
>> On 11.02.2014 at 14:32, Orit Wasserman wrote:
>>> On 02/08/2014 09:23 PM, Stefan Priebe wrote:
>>>> I could fix it by explicitly disabling xbzrle - it seems it's
>>>> automatically on if I do not set the migration caps to false.
>>>>
>>>> So it seems to be an xbzrle bug.
>>>>
>>>
>>> XBZRLE is disabled by default (actually all capabilities are off by
>>> default).
>>> What version of QEMU are you using that you need to disable it
>>> explicitly?
>>> Maybe you ran a migration with XBZRLE and canceled it, so it stays on?
>>
>> No real idea why this happens - but yes, this seems to be a problem
>> for me.
>>
>
> I checked upstream QEMU and it is still off by default (always has been)

Maybe I had it on in the past and the VM was still running from an
older migration.

>> But the bug in XBZRLE is still there ;-)
>>
>
> We need to understand the exact scenario in order to understand the
> problem.
>
> What exact version of Qemu are you using?

Qemu 1.7.0

> Can you try with the latest upstream version? There were some fixes to
> the XBZRLE code.

Sadly not - I have some custom patches (not related to xbzrle) which
won't apply to current upstream.

But I could cherry-pick the ones you have in mind - if you give me the
commit ids.

Stefan

>> Stefan
>>
>>> Orit
>>>
>>>> Stefan
>>>>
>>>> On 07.02.2014 at 21:10, Stefan Priebe wrote:
>>>>> On 07.02.2014 at 21:02, Dr. David Alan Gilbert wrote:
>>>>>> * Stefan Priebe (s.priebe@profihost.ag) wrote:
>>>>>>> Anything I could try or debug to help find the problem?
>>>>>>
>>>>>> I think the most useful would be to see if the problem is
>>>>>> a new problem in the 1.7 you're using or has existed
>>>>>> for a while; depending on the machine type you used, it might
>>>>>> be possible to load that image on an earlier (or newer) qemu
>>>>>> and try the same test, however if the problem doesn't
>>>>>> repeat reliably it can be hard.
>>>>>
>>>>> I first saw this with Qemu 1.5 but was not able to reproduce it
>>>>> for months. 1.4 was working fine.
>>>>>
>>>>>> If you have any way of simplifying the configuration of the
>>>>>> VM it would be good; e.g. if you could get a failure on
>>>>>> something without graphics (-nographic) and USB.
>>>>>
>>>>> Sadly not ;-(
>>>>>
>>>>>> Dave
>>>>>>
>>>>>>>
>>>>>>> Stefan
>>>>>>>
>>>>>>> On 07.02.2014 at 14:45, Stefan Priebe - Profihost AG wrote:
>>>>>>>> It's always the same "pattern": there are too many 0 instead of X.
>>>>>>>>
>>>>>>>> Only seen:
>>>>>>>>
>>>>>>>> read:0x0000000000000000 ... expected:0xffffffffffffffff
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>> read:0xffffffff00000000 ... expected:0xffffffffffffffff
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>> read:0x0000bf000000bf00 ... expected:0xffffbfffffffbfff
>>>>>>>>
>>>>>>>> or
>>>>>>>>
>>>>>>>> read:0x0000000000000000 ... expected:0xb5b5b5b5b5b5b5b5
>>>>>>>>
>>>>>>>> No idea if this helps.
>>>>>>>>
>>>>>>>> Stefan
>>>>>>>>
>>>>>>>> On 07.02.2014 at 14:39, Stefan Priebe - Profihost AG wrote:
>>>>>>>>> Hi,
>>>>>>>>> On 07.02.2014 at 14:19, Paolo Bonzini wrote:
>>>>>>>>>> On 07/02/2014 at 14:04, Stefan Priebe - Profihost AG wrote:
>>>>>>>>>>> First of all, I now have a memory image of a VM where I can
>>>>>>>>>>> reproduce it.
>>>>>>>>>>
>>>>>>>>>> You mean you start that VM with
>>>>>>>>>> -incoming 'exec:cat /path/to/vm.img'?
>>>>>>>>>> But google stress test doesn't report any error until you start
>>>>>>>>>> migration _and_ it finishes?
>>>>>>>>>
>>>>>>>>> Sorry, no. I meant I have a VM where I saved the memory to disk,
>>>>>>>>> so I don't need to wait hours until I can reproduce it, as it
>>>>>>>>> does not happen with a freshly started VM. So it's a state file,
>>>>>>>>> I think.
>>>>>>>>>
>>>>>>>>>> Another test:
>>>>>>>>>>
>>>>>>>>>> - start the VM with -S, migrate, do errors appear on the
>>>>>>>>>> destination?
>>>>>>>>>
>>>>>>>>> I started with -S and the errors appear AFTER resuming/unpausing
>>>>>>>>> the VM.
>>>>>>>>> So it is fine until I resume it on the "new" host.
>>>>>>>>>
>>>>>>>>> Stefan
>>>>>>>>>
>>>>>>>
>>>>>> --
>>>>>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>>>>>>
>>>>
>>>
>
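
The read/expected pairs quoted earlier in the thread ("too many 0 instead of X") can be compared byte by byte. The following is only a rough Python sketch for inspecting those four values; the helper names classify_bytes and only_lost_bits are made up for illustration and are not part of QEMU or of the stress test, and the result says nothing by itself about where the corruption comes from.

```python
# Pairs copied from the report quoted in the thread: (value read, value expected).
MISMATCHES = [
    (0x0000000000000000, 0xFFFFFFFFFFFFFFFF),
    (0xFFFFFFFF00000000, 0xFFFFFFFFFFFFFFFF),
    (0x0000BF000000BF00, 0xFFFFBFFFFFFFBFFF),
    (0x0000000000000000, 0xB5B5B5B5B5B5B5B5),
]


def classify_bytes(read, expected, width=8):
    """Hypothetical helper: describe each byte, most significant first.

    '=' means the byte matches, '0' means it was read as zero although a
    non-zero value was expected, '?' means any other difference.
    """
    marks = []
    for shift in range((width - 1) * 8, -1, -8):
        r = (read >> shift) & 0xFF
        e = (expected >> shift) & 0xFF
        if r == e:
            marks.append("=")
        elif r == 0:
            marks.append("0")
        else:
            marks.append("?")
    return "".join(marks)


def only_lost_bits(read, expected):
    """Hypothetical helper: True if no bit is set in `read` that is clear in `expected`."""
    return read & ~expected == 0


if __name__ == "__main__":
    for read, expected in MISMATCHES:
        print(f"read:{read:#018x} expected:{expected:#018x} "
              f"bytes:{classify_bytes(read, expected)} "
              f"only_lost_bits:{only_lost_bits(read, expected)}")
```

For these four samples every differing byte is a zero byte and no unexpected bit is set, which matches the description in the report.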