xen-devel.lists.xenproject.org archive mirror
From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: David Vrabel <david.vrabel@citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: balloon driver broken in 3.12+ after save+restore
Date: Fri, 27 Jun 2014 15:57:07 +0200
Message-ID: <53AD7833.80805@invisiblethingslab.com>
In-Reply-To: <53AD3EAE.4030008@citrix.com>


On 27.06.2014 11:51, David Vrabel wrote:
> On 22/05/14 02:31, Marek Marczykowski-Górecki wrote:
>> Hi,
>>
>> I have a problem with the balloon driver during/after restoring a saved domain.
>> There are two symptoms:
>> 1. When the domain was 'xl mem-set <some size smaller than initial>' just
>> before save, it still needs the initial memory size to restore. Details below.
>>
>> 2. The restored domain sometimes (most of the time) does not want to balloon
>> down. For example, when the domain has 3300MB and I mem-set it to 2800MB,
>> nothing changes immediately (only "target" in sysfs) - both 'xl list' and
>> 'free' inside report the same size (and plenty of free memory in the VM).
>> After some time it gets ballooned down to ~3000, still not 2800. I haven't
>> found any pattern here.
>>
>> Both of the above worked perfectly in 3.11.
>>
>> I'm running Xen 4.1.6.1.
>>
>> Details for the first problem:
>> Preparation:
>> I start the VM as in the config at the end of this email (memory=400,
>> maxmem=4000), wait some time, then 'xl mem-set' to roughly the memory
>> actually in use (about 200MB in most cases). Then 'sleep 1' and 'xl save'.
>> When I want to restore that domain, I take the initial config file, replace
>> the memory setting with the size used in 'xl mem-set' above and call
>> 'xl restore' providing that config. It fails with this error:
>> ---
>> Loading new save file /var/run/qubes/current-savefile (new xl fmt info
>> 0x0/0x0/849)
>>  Savefile contains xl domain config
>> xc: detail: xc_domain_restore start: p2m_size = fa800
>> xc: detail: Failed allocation for dom 51: 1024 extents of order 0
>> xc: error: Failed to allocate memory for batch.!: Internal error
>> xc: detail: Restore exit with rc=1
>> libxl: error: libxl_dom.c:313:libxl__domain_restore_common restoring domain:
>> Resource temporarily unavailable
>> cannot (re-)build domain: -3
>> libxl: error: libxl.c:713:libxl_domain_destroy non-existant domain 51
>> ---
>> When memory is set back to 400 (or slightly lower, like 380), restore
>> succeeds, but the second problem still occurs.
>>
>> I've bisected the first problem down to this commit:
>> commit cd9151e26d31048b2b5e00fd02e110e07d2200c9
>>     xen/balloon: set a mapping for ballooned out pages
> 
> Sorry for the delay. I somehow missed this.
> 
> This is likely caused by the balloon driver creating multiple entries
> in the p2m all pointing to the MFNs of the scratch pages. These
> duplicates are de-duped on save/restore.
> 
> I suspect your 2nd issue may also be caused by this.
> 
> Can you try this patch, please?

Looks to be the right fix, thanks!

> 
> 8<----------------------------------------------
> xen/balloon: set ballooned out pages as invalid in p2m
> 
> Since cd9151e26d31048b2b5e00fd02e110e07d2200c9 (xen/balloon: set a
> mapping for ballooned out pages), a ballooned out page has its entry
> in the p2m set to the MFN of one of the scratch pages.  This means that
> the p2m will contain many entries pointing to the same MFN.
> 
> During a domain save, these many-to-one entries are not considered and
> the scratch page is saved multiple times. On restore the ballooned
> pages are populated with new frames and the domain may use up its
> allocation before all pages can be restored.
> 
> Set ballooned out pages as INVALID_P2M_ENTRY in the p2m (as they
> were before), preventing them from being saved and re-populated on
> restore.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> ---
>  drivers/xen/balloon.c |   12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index b7a506f..5c660c7 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -426,20 +426,18 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>  		 * p2m are consistent.
>  		 */
>  		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> -			unsigned long p;
> -			struct page   *scratch_page = get_balloon_scratch_page();
> -
>  			if (!PageHighMem(page)) {
> +				struct page *scratch_page = get_balloon_scratch_page();
> +
>  				ret = HYPERVISOR_update_va_mapping(
>  						(unsigned long)__va(pfn << PAGE_SHIFT),
>  						pfn_pte(page_to_pfn(scratch_page),
>  							PAGE_KERNEL_RO), 0);
>  				BUG_ON(ret);
> -			}
> -			p = page_to_pfn(scratch_page);
> -			__set_phys_to_machine(pfn, pfn_to_mfn(p));
>  
> -			put_balloon_scratch_page();
> +				put_balloon_scratch_page();
> +			}
> +			__set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
>  		}
>  #endif
>  
> 
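
For anyone following the thread, the accounting described in the commit
message can be illustrated with a toy model. This is only a sketch under
stated assumptions, not the libxc save/restore code: the names
(frames_saved, balloon_out, toy_frames_needed, restore_fits, the TOY_*
constants) are hypothetical, the "p2m" is a plain array, and one toy page
stands in for 1MB of the memory=400 / mem-set 200 case from the report. It
assumes, as the commit message states, that the save side writes one frame
per valid p2m entry without noticing that many entries share one MFN.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_INVALID     UINT64_MAX /* stand-in for INVALID_P2M_ENTRY */
#define TOY_PAGES       400        /* pages the domain started with */
#define TOY_KEEP        200        /* pages left after ballooning down */
#define TOY_SCRATCH_MFN 999        /* arbitrary MFN for the scratch page */

/* Frames a naive save writes: one per valid entry, duplicates included. */
static size_t frames_saved(const uint64_t *p2m, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (p2m[i] != TOY_INVALID)
            count++;
    return count;
}

/* Balloon out PFNs [first, n) by writing 'value' into their p2m entries. */
static void balloon_out(uint64_t *p2m, size_t first, size_t n, uint64_t value)
{
    assert(first <= n);
    for (size_t i = first; i < n; i++)
        p2m[i] = value;
}

/* Frames that must be populated on restore for the toy domain.
 * use_scratch != 0 models the behaviour since cd9151e2 (entries point at
 * the scratch MFN); use_scratch == 0 models the patch (entries invalid). */
static size_t toy_frames_needed(int use_scratch)
{
    uint64_t p2m[TOY_PAGES];

    for (size_t i = 0; i < TOY_PAGES; i++)
        p2m[i] = 1000 + i; /* arbitrary distinct MFNs */

    balloon_out(p2m, TOY_KEEP, TOY_PAGES,
                use_scratch ? TOY_SCRATCH_MFN : TOY_INVALID);
    return frames_saved(p2m, TOY_PAGES);
}

/* Restore can only succeed if the frames fit the domain's allocation. */
static int restore_fits(size_t frames, size_t limit)
{
    return frames <= limit;
}
```

In this model toy_frames_needed(1) is 400 and toy_frames_needed(0) is 200,
so with an allocation of TOY_KEEP pages only the patched variant passes
restore_fits(). That matches the first symptom: restore needed the initial
memory size even though the domain had been ballooned down before the save.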


-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


