From: Mukesh Rathor
Subject: Re: Error restoring DomU when using GPLPV
Date: Tue, 15 Sep 2009 12:14:20 -0700
Message-ID: <4AAFE78C.4000608@oracle.com>
Reply-To: mukesh.rathor@oracle.com
To: Keir Fraser
Cc: Joshua West, Dan Magenheimer, James Harper, "Kurt C. Hackel", "annie.li@oracle.com", xen-devel, "wayne.gong@oracle.com"
List-Id: xen-devel@lists.xenproject.org

Keir Fraser wrote:
> On 15/09/2009 03:25, "Mukesh Rathor" wrote:
>
>> Ok, I've been looking at this and figured what's going on. Annie's problem
>> lies in not remapping the grant frames post migration. Hence the leak,
>> tot_pages goes up every time until migration fails. On linux, remapping
>> is where the frames created by restore (for heap pfn's), get freed back to
>> the dom heap, is what I found. So that's a fix to be made on win
>> pv driver side.
>
> Although obviously that is a bug, I'm not sure why it would cause this
> particular issue? The domheap pages do not get freed and replaced with
> xenheap pages, but why does that affect the next save/restore cycle? After
> all, xc_domain_save does not distinguish between Xenheap and domheap pages?

The fact that xc_domain_save doesn't distinguish is actually the problem,
as xc_domain_restore then backs the xenheap pfns for the shinfo/gnt frames
with dom heap pages. These dom heap pages do get freed and replaced by
xenheap pages on the target host (upon guest remap in gnttab_map()) in the
following code:

arch_memory_op():
    /* Remove previously mapped page if it was present. */
    prev_mfn = gmfn_to_mfn(d, xatp.gpfn);
    if ( mfn_valid(prev_mfn) )
    {
        .....
        guest_remove_page(d, xatp.gpfn);    <=======
    }

E.g.,
my guest with 128M gets created with tot_pages=0x83eb and max_pages=0x8400.
Now xc_domain_save saves everything, 0x83eb + shinfo + gnt frames (2), so I
see tot_pages on the target go up to 0x83ee. Now the guest remaps the shinfo
and gnt frames. The dom heap pages are returned in guest_remove_page(), and
tot_pages goes back to 0x83eb.

In Annie's case, the driver forgets to remap the 2 gnt frames, so the dom
heap pages stay wrongly mapped and tot_pages remains at 0x83ed. After a few
more migrations, when it reaches 0x83ff, migration fails because save is not
able to create 0x83ff + shinfo + gnt frames temporarily, max_pages being
0x8400.

Hope that makes sense.

>> 1. Always balloon down, shinfo+gnttab frames: This needs to be done just
>> once during load, right? I'm not sure how it would work tho if mem gets
>> ballooned up subsequently. I suppose the driver will have to intercept
>> every increase in reservation and balloon down everytime?
>
> Well, it is the same driver that is doing the ballooning, so it's kind of
> easy to intercept, right? Just need to track how many Xenheap pages are
> mapped and maintain that amount of 'balloon down'.

Yup, that's what I thought, but just wanted to make sure.

>> Also, balloon down during suspend call would prob be too late, right?
>
> Indeed it would. Need to do it during boot. It's only a few pages though, so
> no one will miss them.
>
>> 2. libxc fix: I wonder how much work this will be. Good thing here is,
>> it'll take care of both linux and PV HVM guests avoiding driver
>> updates in many versions, and hence appealing to us. Can we somehow
>> mark the frames special to be skipped? Looking at biiig xc_domain_save
>> function, not sure in case of HVM, how pfn_type gets set. May be before the
>> outer loop, it could ask hyp for all xen heap page list, but then what if a
>> new page gets added to the list in between.....
>
> It's a pain. pfn_type[] I think doesn't really get used. xc_domain_save()
> just tries to map PFNs and saves all the ones it successfully maps.
> So the problem is it is allowed to map Xenheap pages. But we can't always
> disallow that because sometimes the tools have good reason to map Xenheap
> pages. So we'd need a new hypercall, or a flag, or something, and that
> would need dom0 kernel changes as well as Xen and toolstack changes. So
> it's rather a pain.

Ok, got it. I think the driver change is the way to go.

>> Also, unfortunately, the failure case is not handled properly sometimes.
>> If migration fails after suspend, then no way to get the guest
>> back. I even noticed, the guest disappeared totally from both source and
>> target when failed, couple times of several dozen migrations I did.
>
> That shouldn't happen since there is a mechanism to cancel the suspension of
> a suspended guest. Possibly xend doesn't get it right every time, as its
> error handling is pretty poor in general. I trust the underlying mechanisms
> below xend pretty well however.
>
>  -- Keir

thanks a lot,
Mukesh
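
For anyone following the numbers, here is a minimal sketch of the tot_pages accounting described above. It is only a toy model, not Xen or libxc code: the function names, the `remapped_frames` parameter, and the exact form of the limit check (fail when the temporary total would exceed max_pages) are assumptions made for illustration. Only the page counts come from this mail.

```python
# Toy model of the tot_pages accounting discussed in this thread.
# All names here are hypothetical; none of them are Xen or libxc APIs.

SHINFO_FRAMES = 1  # shared info frame
GNT_FRAMES = 2     # grant table frames in the example guest


def migrate(tot_pages, max_pages, remapped_frames):
    """Return tot_pages after one save/restore cycle, or None if it cannot fit.

    Assumption: the shinfo and grant frames are temporarily backed by dom
    heap pages, and the cycle fails if that would exceed max_pages.
    """
    peak = tot_pages + SHINFO_FRAMES + GNT_FRAMES
    if peak > max_pages:
        return None
    # Each frame the guest remaps gives its dom heap page back.
    return peak - remapped_frames


def cycles_until_failure(tot_pages, max_pages, remapped_frames, limit=1000):
    """Return (successful cycles, tot_pages at that point), stopping at limit."""
    for done in range(limit):
        after = migrate(tot_pages, max_pages, remapped_frames)
        if after is None:
            return done, tot_pages
        tot_pages = after
    return limit, tot_pages


if __name__ == "__main__":
    # Driver remaps only shinfo (1 frame), so 2 pages leak per cycle.
    done, tot = cycles_until_failure(0x83EB, 0x8400, remapped_frames=1)
    print(done, hex(tot))
```

Under these assumptions, a guest that remaps all three frames stays at 0x83eb indefinitely, while one that remaps only shinfo gains two pages per cycle and gets stuck at 0x83ff after ten successful cycles, which matches the failure point mentioned above.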