From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Campbell
Subject: Re: [BUG] Xen live migration dom0 memory
Date: Fri, 4 Sep 2015 18:05:02 +0100
Message-ID: <1441386302.25589.19.camel@citrix.com>
References: <559BEFA2.3000509@epiontis.com> <1441382667.25589.7.camel@citrix.com> <1441384737.25589.15.camel@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <1441384737.25589.15.camel@citrix.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Epiontis IT, xen-devel@lists.xen.org, Ian.Jackson@eu.citrix.com
List-Id: xen-devel@lists.xenproject.org

On Fri, 2015-09-04 at 17:38 +0100, Ian Campbell wrote:
> On Fri, 2015-09-04 at 17:04 +0100, Ian Campbell wrote:
> > On Tue, 2015-07-07 at 17:26 +0200, Epiontis IT wrote:
> > > Hello,
> > >
> > > I already posted on the xen-users list and Ian told me to post a bug
> > > report here. We have the following problem:
> > >
> > > When I start a migration with "xl migrate " the
> > > destination machine sets up a vm "--incoming" and seems to sync
> > > memory because "xl list" shows "Mem" for the incoming vm going up
> > > stepwise until 2048. After that though the vm doesn't get launched.
> > > The vm is frozen for about a minute, the dom0 begins swapping out
> > > the 2GB to disk (because it only has 512M available for itself)
> > > and a process
> >
> > I spoke with Ian Jackson about this this afternoon and we spent some
> > time staring at the pmap, and we are now both very perplexed...
> >
> > So I've just tried to repro this.
> >
> > I installed latest 4.5-testing and a 512M dom0 (which as it happens is
> > also what our automated test system uses). I then migrated a 3GB domU
> > with "xl migrate localhost" and it completed successfully.
> > I had to do localhost migrate due to only having one box, but I don't
> > think this should matter.
> >
> > I observed the save helper with pmap and never saw any large
> > allocations.
> >
> > This matches the behaviour that both Ian and I expected.
> >
> > Your logs show you are running 4.5.1-rc1. I've looked over
> > git log 4.5.1-rc1..origin/staging-4.5 -- tools/libx[cl]/
> > and I don't see anything like a fix for a leak etc.
> >
> > Next step is to revert my test box from staging-4.5 to 4.5.1-rc1 and
> > see what changes.
>
> Nothing changed.
>
> I then tried building this tree with
>
> EXTRA_CFLAGS_XEN_TOOLS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security"
> APPEND_CPPFLAGS="-D_FORTIFY_SOURCE=2"
> APPEND_LDFLAGS="-Wl,-z,relro"
>
> which is how the Debian package is built, still no change.
>
> My next step was to build patched with the Debian patches, still no
> change.

I see in https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=797205 that 4.4
also demonstrates this for you.

So I have now tried the packaged versions of Xen 4.4.1 + Linux 3.16 (from
Jessie) and Xen 4.4.1 + Linux 4.1 (from jessie-backports) and still no
luck reproducing the issue.

I'm afraid at this point I'm giving up. I think the most practical next
step would be for you to arrange to run the restore helper under valgrind
as I describe below.

Ian.

>
> So I'm at even more of a loss than I was before :-/
>
> Your report said you were running Debian 8.0 (Jessie) but with Xen
> 4.5.1-rc1, while Jessie only has 4.4.1.
>
> Was 4.5.1-rc1 from experimental or built yourself? Did you install
> anything else from Sid/Experimental (or Stretch)?
>
> If not from the packages in experimental then how did you install Xen?
>
> If you can still reproduce then you could try installing the latest
> valgrind, move libxl-save-helper aside and replace it with a script which
> runs the tool under valgrind.
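
A minimal sketch of such a wrapper script, assuming the original binary
has been renamed to libxl-save-helper.real and that valgrind is on the
PATH. The ".real" suffix and the log file location are hypothetical
choices made for illustration; nothing in Xen defines them.

```python
#!/usr/bin/env python3
"""Hypothetical stand-in for libxl-save-helper: re-runs the real binary under valgrind."""
import os
import sys

# Helper path as given in the report; the ".real" suffix is an assumed name
# for the original binary after it has been moved aside.
REAL_HELPER = "/usr/lib/xen-4.5/bin/libxl-save-helper.real"
# Assumed log location; valgrind expands %p to the process ID.
LOG_FILE = "/var/log/xen/libxl-save-helper.valgrind.%p.log"


def build_command(helper_args, real_helper=REAL_HELPER, log_file=LOG_FILE):
    """Return the argv that runs the real helper under valgrind."""
    return [
        "valgrind",
        "--leak-check=full",
        "--log-file=" + log_file,  # keep valgrind output off stderr
        real_helper,
    ] + list(helper_args)


def main():
    argv = build_command(sys.argv[1:])
    # Replace this process so the file descriptors inherited from libxl
    # reach valgrind and the helper unchanged.
    os.execvp(argv[0], argv)


if __name__ == "__main__":
    if os.path.exists(REAL_HELPER):
        main()
    else:
        print("real helper not found: " + REAL_HELPER, file=sys.stderr)
```

Saved as /usr/lib/xen-4.5/bin/libxl-save-helper and made executable, this
should leave one valgrind log per helper invocation. Arguments such as
--restore-domain are passed through untouched.
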
>
> Modern valgrind understands many Xen hypercalls etc and so is quite
> useful even on the Xen toolstack binaries.
>
> Ian.
>
> >
> > Ian.
> >
> > >
> > > /usr/lib/xen-4.5/bin/libxl-save-helper --restore-domain
> > >
> > > takes about 90% of the dom0 memory. After that the vm is launched,
> > > memory consumption goes back to normal, swap space is freed.
> > >
> > > I attached several log and output files in xen_bug_report.txt. In
> > > particular Ian wanted to see the pmap of the save helper process
> > > which you can find at the end of the file taken during several
> > > phases of the migration process.
> > >
> > >
> > > Thank you,
> > > Alex
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xen.org
> > > http://lists.xen.org/xen-devel