From mboxrd@z Thu Jan 1 00:00:00 1970
From: Marek Marczykowski-Górecki
Subject: Re: balloon driver broken in 3.12+ after save+restore
Date: Fri, 27 Jun 2014 15:57:07 +0200
Message-ID: <53AD7833.80805@invisiblethingslab.com>
References: <537D538F.6000905@invisiblethingslab.com> <53AD3EAE.4030008@citrix.com>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============3593345247407006476=="
In-Reply-To: <53AD3EAE.4030008@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: David Vrabel, "xen-devel@lists.xen.org"
List-Id: xen-devel@lists.xenproject.org

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
--===============3593345247407006476==
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="o0xfapOsOj42fFAPRG9vEeQPoTdormlfG"

--o0xfapOsOj42fFAPRG9vEeQPoTdormlfG
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On 27.06.2014 11:51, David Vrabel wrote:
> On 22/05/14 02:31, Marek Marczykowski-Górecki wrote:
>> Hi,
>>
>> I have a problem with the balloon driver after/during restoring a saved domain.
>> There are two symptoms:
>> 1. When the domain was 'xl mem-set ' just before
>> save, it still needs the initial memory size to restore. Details below.
>>
>> 2. The restored domain sometimes (most of the time) does not want to balloon down.
>> For example when the domain has 3300MB and I mem-set it to 2800MB, nothing
>> changes immediately (only "target" in sysfs) - both 'xl list' and 'free'
>> inside report the same size (and plenty of free memory in the VM). After some
>> time it gets ballooned down to ~3000, still not 2800. I haven't found any
>> pattern here.
>>
>> Both of the above worked perfectly in 3.11.
>>
>> I'm running Xen 4.1.6.1.
>>
>> Details for the first problem:
>> Preparation:
>> I start the VM as in config at the end of email (memory=400, maxmem=4000),
>> wait some time, then 'xl mem-set' to size just about really used memory (about
>> 200MB in most cases). Then 'sleep 1' and 'xl save'.
>> When I want to restore that domain, I get initial config file, replace memory
>> setting with size used in 'xl mem-set' above and call 'xl restore' providing
>> that config. It fails with this error:
>> ---
>> Loading new save file /var/run/qubes/current-savefile (new xl fmt info
>> 0x0/0x0/849)
>> Savefile contains xl domain config
>> xc: detail: xc_domain_restore start: p2m_size = fa800
>> xc: detail: Failed allocation for dom 51: 1024 extents of order 0
>> xc: error: Failed to allocate memory for batch.!: Internal error
>> xc: detail: Restore exit with rc=1
>> libxl: error: libxl_dom.c:313:libxl__domain_restore_common restoring domain:
>> Resource temporarily unavailable
>> cannot (re-)build domain: -3
>> libxl: error: libxl.c:713:libxl_domain_destroy non-existant domain 51
>> ---
>> When memory set back to 400 (or slightly lower, like 380) - restore succeeded,
>> but still the second problem is happening.
>>
>> I've bisected the first problem down to this commit:
>> commit cd9151e26d31048b2b5e00fd02e110e07d2200c9
>> xen/balloon: set a mapping for ballooned out pages
>
> Sorry for the delay. I somehow missed this.
>
> This is likely caused by the balloon driver creating multiple entries
> in the p2m all pointing to the MFNs of the scratch pages. These
> duplicates are de-duped on save/restore.
>
> I suspect your 2nd issue may also be caused by this.
>
> Can you try this patch, please?

Looks to be the right fix, thanks!

>
> 8<----------------------------------------------
> xen/balloon: set ballooned out pages as invalid in p2m
>
> Since cd9151e26d31048b2b5e00fd02e110e07d2200c9 (xen/balloon: set a
> mapping for ballooned out pages), a ballooned out page had its entry
> in the p2m set to the MFN of one of the scratch pages. This means that
> the p2m will contain many entries pointing to the same MFN.
>
> During a domain save, these many-to-one entries are not considered and
> the scratch page is saved multiple times. On restore the ballooned
> pages are populated with new frames and the domain may use up its
> allocation before all pages can be restored.
>
> Set ballooned out pages as INVALID_P2M_ENTRY in the p2m (as they
> were before), preventing them from being saved and re-populated on
> restore.
>
> Signed-off-by: David Vrabel
> ---
>  drivers/xen/balloon.c | 12 +++++-------
>  1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
> index b7a506f..5c660c7 100644
> --- a/drivers/xen/balloon.c
> +++ b/drivers/xen/balloon.c
> @@ -426,20 +426,18 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
>  		 * p2m are consistent.
>  		 */
>  		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
> -			unsigned long p;
> -			struct page *scratch_page = get_balloon_scratch_page();
> -
>  			if (!PageHighMem(page)) {
> +				struct page *scratch_page = get_balloon_scratch_page();
> +
>  				ret = HYPERVISOR_update_va_mapping(
>  						(unsigned long)__va(pfn << PAGE_SHIFT),
>  						pfn_pte(page_to_pfn(scratch_page),
>  							PAGE_KERNEL_RO), 0);
>  				BUG_ON(ret);
> -			}
> -			p = page_to_pfn(scratch_page);
> -			__set_phys_to_machine(pfn, pfn_to_mfn(p));
>
> -			put_balloon_scratch_page();
> +				put_balloon_scratch_page();
> +			}
> +			__set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
>  		}
>  #endif
>
>

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

--o0xfapOsOj42fFAPRG9vEeQPoTdormlfG--

--===============3593345247407006476==--