From mboxrd@z Thu Jan 1 00:00:00 1970
From: Konrad Rzeszutek Wilk
Subject: Re: xen/stable-2.6.32.x xen-4.1.1 live migration fails with kernels 2.6.39, 3.0.3 and 3.1-rc2
Date: Thu, 8 Sep 2011 13:32:12 -0400
Message-ID: <20110908173212.GA17026@dumpdata.com>
References: <4E4EA3E2.2040809@leuphana.de> <4E52224902000078000525CC@nat28.tlf.novell.com> <4E52601B.5060609@leuphana.de> <20110824203435.GA27865@dumpdata.com> <4E55F682.8060405@leuphana.de> <20110826150054.GA1793@dumpdata.com> <4E57D745.2080701@leuphana.de> <20110829194911.GC16530@dumpdata.com> <4E5E320A.60401@leuphana.de> <20110907135046.GD32190@dumpdata.com>
In-Reply-To: <20110907135046.GD32190@dumpdata.com>
To: Andreas Olsowski
Cc: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

On Wed, Sep 07, 2011 at 09:50:47AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Aug 31, 2011 at 03:07:22PM +0200, Andreas Olsowski wrote:
> > A little update: I now have all machines running on xen-4.1-testing
> > with xen/stable-2.6.32.x. That gave me the possibility for additional
> > tests.
> >
> > (I also tested xm/xend in addition to xl/libxl, to make sure it's not
> > an xl/libxl problem.)
> >
> > I took the liberty of creating a new test result matrix that should
> > provide a better overview (in case someone else wants to get the
> > whole picture):
>
> So.. I don't think the issue I am seeing is exactly the same. This is
> what 'xl' gives me:

Scratch that. I am seeing the error below if I:

1) Create the guest on the 4GB machine.
2) Migrate it to the 32GB box (guest still works).
3) Migrate it back to the 4GB box (guest dies; the error below shows up).

This is with a virgin 3.1-rc5 for both dom0 and domU, and Xen 4.1-testing on top of that.
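For reference, the failing sequence above can be sketched as a small repro script. The host and guest names here are made up, and DRY_RUN=1 just echoes the xl commands instead of executing them (so step 1 would actually be run on the 4GB host, and each migrate from whichever host currently holds the guest):

```shell
#!/bin/sh
# Hypothetical sketch of the failing migration sequence; names are
# illustrative, not from the actual test setup.
DRY_RUN=${DRY_RUN:-1}
SMALL_HOST=host-4gb      # assumed name for the 4GB dom0
LARGE_HOST=host-32gb     # assumed name for the 32GB dom0
GUEST=tstguest           # assumed guest name

# Echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# 1) Create the guest on the 4GB machine.
run xl create "/etc/xen/$GUEST.cfg"
# 2) Migrate it to the 32GB box (guest still works).
run xl migrate "$GUEST" "$LARGE_HOST"
# 3) Migrate it back to the 4GB box (guest dies with the PFN
#    mapping failure quoted in this thread).
run xl migrate "$GUEST" "$SMALL_HOST"
```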
I also tried just creating a guest on the 32GB box and migrating it; while it did migrate, it got stuck in a hypercall_page call or crashed later on.

Andreas, thanks for reporting this.

>
> :~/
>
> xl migrate 3 tst010
> root@tst010's password:
> migration target: Ready to receive domain.
> Saving to migration stream new xl format (info 0x0/0x0/326)
> Loading new save file incoming migration stream (new xl fmt info 0x0/0x0/326)
> Savefile contains xl domain config
> xc: Saving memory: iter 0 (last sent 0 skipped 0): 262400/262400 100%
> xc: Saving memory: iter 2 (last sent 1105 skipped 23): 262400/262400 100%
> xc: Saving memory: iter 3 (last sent 74 skipped 0): 262400/262400 100%
> xc: Saving memory: iter 4 (last sent 0 skipped 0): 262400/262400 100%
> xc: error: unexpected PFN mapping failure pfn 19d0 map_mfn 4e7e04 p2m_mfn 4e7e04: Internal error
> libxl: error: libxl_dom.c:363:libxl__domain_restore_common: restoring domain: Resource temporarily unavailable
> libxl: error: libxl_create.c:483:do_domain_create: cannot (re-)build domain: -3
> libxl: error: libxl.c:733:libxl_domain_destroy: non-existant domain 4
> migration target: Domain creation failed (code -3).
> libxl: error: libxl_utils.c:410:libxl_read_exactly: file/stream truncated reading ready message from migration receiver stream
> libxl: info: libxl_exec.c:125:libxl_report_child_exitstatus: migration target process [5810] exited with error status 3
> Migration failed, resuming at sender.
>
>
> And on the receiving side (tst010) I get a monster of:
>
> (XEN) mm.c:945:d0 Error getting mfn 4e7e04 (pfn ffffffffffffffff) from L1 entry 80000004e7e04627 for l1e_owner=0, pg_owner=4
> (XEN) mm.c:945:d0 Error getting mfn 36fd19 (pfn ffffffffffffffff) from L1 entry 800000036fd19627 for l1e_owner=0, pg_owner=4
> (XEN) mm.c:945:d0 Error getting mfn 36f583 (pfn ffffffffffffffff) from L1 entry 800000036f583627 for l1e_owner=0, pg_owner=4
> ..
> (XEN) mm.c:945:d0 Error getting mfn 4e7d09 (pfn ffffffffffffffff) from L1 entry 80000004e7d09627 for l1e_owner=0, pg_owner=4
> (XEN) event_channel.c:250:d3 EVTCHNOP failure: error -17
>
>
> The migration is from a 4GB box to a 32GB box (worked), then back to the 4GB (worked),
> and then back to the 32GB (boom!).
>
> anyhow, let me try this with the 4.1-testing branch. Running on the bleeding
> edge might not be the best idea sometimes.