From mboxrd@z Thu Jan 1 00:00:00 1970
From: ANNIE LI
Subject: Re: Error restoring DomU when using GPLPV
Date: Sat, 05 Sep 2009 15:33:15 +0800
Message-ID: <4AA2143B.5040104@oracle.com>
References:
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============0538699148=="
Return-path:
In-Reply-To:
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xensource.com
Errors-To: xen-devel-bounces@lists.xensource.com
To: Keir Fraser
Cc: Joshua West, Dan Magenheimer, James Harper, "xen-devel@lists.xensource.com", "wayne.gong@oracle.com"
List-Id: xen-devel@lists.xenproject.org

This is a multi-part message in MIME format.
--===============0538699148==
Content-Type: multipart/alternative; boundary="------------010400020101010203020604"

This is a multi-part message in MIME format.
--------------010400020101010203020604
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Yes. About (3), my test result is that save/restore works only once if
pages are ballooned down only when the driver first loads. But it works
if pages are ballooned down every time the driver loads.

Thanks
Annie.

Keir Fraser wrote:
> Not all those pages are special. Frames fc0xx will be ACPI tables, resident
> in ordinary guest memory pages, for example. Only the Xen-heap pages are
> special and need to be (1) skipped; or (2) unmapped by the HVMPV drivers on
> suspend; or (3) accounted for by HVMPV drivers by unmapping and freeing an
> equal number of domain-heap pages. (1) is 'nicest' but actually a bit of a
> pain to implement; (2) won't work well for live migration, where the pages
> wouldn't get unmapped by the drivers until the last round of page copying;
> and (3) was apparently tried by Annie but didn't work? I'm curious why (3)
> didn't work - I can't explain that.
>
>  -- Keir
>
> On 05/09/2009 00:02, "Dan Magenheimer" wrote:
>
>> On further debugging, it appears that the
>> p2m_size may be OK, but there's something about
>> those 24 "magic" gpfns that isn't quite right.
>>
>>> -----Original Message-----
>>> From: Dan Magenheimer
>>> Sent: Friday, September 04, 2009 3:29 PM
>>> To: Wayne Gong; Annie Li; Keir Fraser
>>> Cc: Joshua West; James Harper; xen-devel@lists.xensource.com
>>> Subject: RE: [Xen-devel] Error restoring DomU when using GPLPV
>>>
>>> I think I've tracked down the cause of this problem
>>> in the hypervisor, but am unsure how to best fix it.
>>>
>>> In tools/libxc/xc_domain_save.c, the static variable p2m_size
>>> is said to be "number of pfns this guest has (i.e. number of
>>> entries in the P2M)". But apparently p2m_size is getting
>>> set to a very large number (0x100000) regardless of the
>>> maximum pseudophysical memory for the hvm guest. As a result,
>>> some "magic" pages in the 0xf0000-0xfefff range are getting
>>> placed in the save file. But since they are not "real"
>>> pages, the restore process runs beyond the maximum number
>>> of physical pages allowed for the domain and fails.
>>> (The gpfns of the last 24 pages saved are f2020, fc000-fc012,
>>> feffb, feffc, feffd, feffe.)
>>>
>>> p2m_size is set in "save" with a call to a memory_op hypercall
>>> (XENMEM_maximum_gpfn) which for an hvm domain returns
>>> d->arch.p2m->max_mapped_pfn. I suspect that the meaning
>>> of max_mapped_pfn changed at some point to better match
>>> its name, but this changed the semantics of the hypercall
>>> as used by xc_domain_restore, resulting in this curious
>>> problem.
>>>
>>> Any thoughts on how to fix this?
>>>
>>>> -----Original Message-----
>>>> From: Annie Li
>>>> Sent: Tuesday, September 01, 2009 10:27 PM
>>>> To: Keir Fraser
>>>> Cc: Joshua West; James Harper; xen-devel@lists.xensource.com
>>>> Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV
>>>>
>>>>> It seems this problem is connected with gnttab, not shareinfo.
>>>>> I changed the grant table code in the winpv driver (without using
>>>>> the balloon-down shinfo+gnttab method), and save/restore/migration
>>>>> now works properly on Xen 3.4.
>>>>>
>>>>> What I changed: the winpv driver now uses the hypercall
>>>>> XENMEM_add_to_physmap to map only the grant table pages that
>>>>> devices require, instead of mapping all 32 grant table pages
>>>>> during initialization. It seems those extra grant table mappings
>>>>> cause this problem.
>>>>>
>>>> Wondering whether those extra grant table mappings are the root
>>>> cause of the migration problem? Or by luck, as in Linux PVHVM too?
>>>>
>>>> Thanks
>>>> Annie.
>>>>
>>>> _______________________________________________
>>>> Xen-devel mailing list
>>>> Xen-devel@lists.xensource.com
>>>> http://lists.xensource.com/xen-devel
>>>>
--------------010400020101010203020604
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Yes. About (3), my test result is that save/restore works only once if
pages are ballooned down only when the driver first loads. But it works
if pages are ballooned down every time the driver loads.

Thanks
Annie.

Keir Fraser wrote:
Not all those pages are special. Frames fc0xx will be ACPI tables, resident
in ordinary guest memory pages, for example. Only the Xen-heap pages are
special and need to be (1) skipped; or (2) unmapped by the HVMPV drivers on
suspend; or (3) accounted for by HVMPV drivers by unmapping and freeing an
equal number of domain-heap pages. (1) is 'nicest' but actually a bit of a
pain to implement; (2) won't work well for live migration, where the pages
wouldn't get unmapped by the drivers until the last round of page copying;
and (3) was apparently tried by Annie but didn't work? I'm curious why (3)
didn't work - I can't explain that.

 -- Keir
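
The page accounting behind options (1) and (3) can be sketched with a toy Python model. This is only an illustration under stated assumptions: `ToyDomain`, its fields, and the page counts are hypothetical and do not correspond to any Xen or libxc data structure.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDomain:
    """Hypothetical model of a guest's page accounting (not Xen code)."""
    max_pages: int                                   # page limit enforced on restore
    domain_heap: set = field(default_factory=set)    # gpfns backed by ordinary guest RAM
    xen_heap: set = field(default_factory=set)       # gpfns mapping special Xen-heap pages

    def saved_page_count(self, skip_special=False):
        # Option (1): the save code skips the special pages entirely.
        count = len(self.domain_heap)
        if not skip_special:
            count += len(self.xen_heap)
        return count

    def balloon_down(self, count):
        # Option (3): the driver frees `count` ordinary pages to compensate.
        for _ in range(count):
            self.domain_heap.remove(max(self.domain_heap))

    def restorable(self, skip_special=False):
        # Restore succeeds only if the saved pages fit under the limit.
        return self.saved_page_count(skip_special) <= self.max_pages


dom = ToyDomain(max_pages=1024,
                domain_heap=set(range(1024)),
                xen_heap={0xF2020 + i for i in range(4)})
print(dom.restorable())                     # special pages push the count over the limit
print(dom.restorable(skip_special=True))    # option (1)
dom.balloon_down(len(dom.xen_heap))         # option (3)
print(dom.restorable())
```

The model says nothing about why ballooning down only at first driver load stops working after one save/restore cycle; it only shows why the number of freed domain-heap pages has to match the number of mapped Xen-heap pages.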

On 05/09/2009 00:02, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

  
On further debugging, it appears that the
p2m_size may be OK, but there's something about
those 24 "magic" gpfns that isn't quite right.

    
-----Original Message-----
From: Dan Magenheimer
Sent: Friday, September 04, 2009 3:29 PM
To: Wayne Gong; Annie Li; Keir Fraser
Cc: Joshua West; James Harper; xen-devel@lists.xensource.com
Subject: RE: [Xen-devel] Error restoring DomU when using GPLPV


I think I've tracked down the cause of this problem
in the hypervisor, but am unsure how to best fix it.

In tools/libxc/xc_domain_save.c, the static variable p2m_size
is said to be "number of pfns this guest has (i.e. number of
entries in the P2M)".  But apparently p2m_size is getting
set to a very large number (0x100000) regardless of the
maximum pseudophysical memory for the hvm guest.  As a result,
some "magic" pages in the 0xf0000-0xfefff range are getting
placed in the save file.  But since they are not "real"
pages, the restore process runs beyond the maximum number
of physical pages allowed for the domain and fails.
(The gpfns of the last 24 pages saved are f2020, fc000-fc012,
feffb, feffc, feffd, feffe.)

p2m_size is set in "save" with a call to a memory_op hypercall
(XENMEM_maximum_gpfn) which for an hvm domain returns
d->arch.p2m->max_mapped_pfn.  I suspect that the meaning
of max_mapped_pfn changed at some point to better match
its name, but this changed the semantics of the hypercall
as used by xc_domain_restore, resulting in this curious
problem.

Any thoughts on how to fix this?
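
The failure mode described in this message can be sketched as a toy Python model. It is a hedged illustration of the hypothesis as stated here (which the later replies in this thread qualify), not the libxc logic: the `save` and `restore` functions, the guest size of 0x8000 pages, and the assumption that the page limit equals the RAM size are all invented for the example. Only the 24 gpfns and the 0x100000 value come from the message.

```python
# The 24 gpfns listed in the message: f2020, fc000-fc012, feffb-feffe.
MAGIC_GPFNS = ([0xF2020]
               + list(range(0xFC000, 0xFC013))
               + [0xFEFFB, 0xFEFFC, 0xFEFFD, 0xFEFFE])


def save(mapped_gpfns, p2m_size):
    """Hypothetical saver: emit every mapped gpfn below p2m_size."""
    return sorted(g for g in mapped_gpfns if g < p2m_size)


def restore(saved_gpfns, max_pages):
    """Hypothetical restorer: populate pages until the domain limit is hit."""
    populated = 0
    for gpfn in saved_gpfns:
        if populated == max_pages:
            raise MemoryError(
                f"page limit {max_pages:#x} reached at gpfn {gpfn:#x}")
        populated += 1
    return populated


ram = set(range(0x8000))                            # assumed guest RAM gpfns
image = save(ram | set(MAGIC_GPFNS), p2m_size=0x100000)
try:
    restore(image, max_pages=0x8000)                # assumed limit == RAM size
except MemoryError as err:
    print("restore failed:", err)
```

With a large p2m_size the saver walks far past the end of guest RAM, so the extra mapped gpfns end up in the image and the restorer runs out of allowed pages before it has populated them.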

      
-----Original Message-----
From: Annie Li 
Sent: Tuesday, September 01, 2009 10:27 PM
To: Keir Fraser
Cc: Joshua West; James Harper; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] Error restoring DomU when using GPLPV



        
It seems this problem is connected with gnttab, not shareinfo.
I changed the grant table code in the winpv driver (without using
the balloon-down shinfo+gnttab method), and save/restore/migration
now works properly on Xen 3.4.

What I changed: the winpv driver now uses the hypercall
XENMEM_add_to_physmap to map only the grant table pages that
devices require, instead of mapping all 32 grant table pages
during initialization.  It seems those extra grant table mappings
cause this problem.
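
The on-demand mapping described above can be sketched as follows. This is a minimal Python model, not the winpv driver code: `LazyGrantTable` and `ensure_entry` are hypothetical names, the list append merely stands in for an XENMEM_add_to_physmap call, and the sizes assume 4 KiB pages with 8-byte version-1 grant entries. Only the 32-page maximum comes from the message.

```python
PAGE_SIZE = 4096                                # assumed page size
ENTRY_SIZE = 8                                  # assumed v1 grant entry size
ENTRIES_PER_FRAME = PAGE_SIZE // ENTRY_SIZE     # 512 entries per frame
MAX_FRAMES = 32                                 # "all 32 pages" from the message


class LazyGrantTable:
    """Hypothetical model: map grant table frames only when first needed."""

    def __init__(self):
        self.mapped_frames = []                 # frame indexes mapped so far

    def ensure_entry(self, ref):
        """Make sure the frame holding grant reference `ref` is mapped."""
        frame = ref // ENTRIES_PER_FRAME
        if frame >= MAX_FRAMES:
            raise ValueError(f"grant reference {ref} is out of range")
        while len(self.mapped_frames) <= frame:
            # Stand-in for one XENMEM_add_to_physmap call per new frame.
            self.mapped_frames.append(len(self.mapped_frames))
        return frame


table = LazyGrantTable()
table.ensure_entry(0)       # first device: one frame mapped
table.ensure_entry(600)     # reference in the second frame: two frames mapped
print(len(table.mapped_frames), "of", MAX_FRAMES, "frames mapped")
```

In this model a guest whose devices use only a few hundred grant references ends up with one or two special pages in its physmap instead of 32.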
          
Wondering whether those extra grant table mappings are the root
cause of the migration problem? Or by luck, as in Linux PVHVM too?

Thanks
Annie.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

        
  
--------------010400020101010203020604--
--===============0538699148==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
--===============0538699148==--