From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Cooper
Subject: Re: Linux 4.1 reports wrong number of pages to toolstack
Date: Fri, 4 Sep 2015 19:39:27 +0100
Message-ID: <55E9E55F.6000108@citrix.com>
References: <20150904004039.GA23402@zion.uk.xensource.com>
 <55E91225.4090500@suse.com>
 <55E97259020000780009F87F@prv-mh.provo.novell.com>
 <55E965F8.7060200@citrix.com>
 <20150904113503.GP18474@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta14.messagelabs.com ([193.109.254.103])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from )
 id 1ZXvtb-0002PF-T7 for xen-devel@lists.xenproject.org;
 Fri, 04 Sep 2015 18:39:32 +0000
In-Reply-To: <20150904113503.GP18474@zion.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu
Cc: Juergen Gross, Ian Campbell, Ian Jackson, David Vrabel, Jan Beulich,
 xen-devel@lists.xenproject.org
List-Id: xen-devel@lists.xenproject.org

On 04/09/15 12:35, Wei Liu wrote:
> On Fri, Sep 04, 2015 at 10:35:52AM +0100, Andrew Cooper wrote:
>> On 04/09/15 09:28, Jan Beulich wrote:
>>>>>> On 04.09.15 at 05:38, wrote:
>>>> On 09/04/2015 02:40 AM, Wei Liu wrote:
>>>>> This issue is exposed by the introduction of migration v2. The symptom is that
>>>>> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too
>>>>> many pages.
>>>>>
>>>>> Note that all guests have 512MB memory, which means they have 131072 pages.
>>>>>
>>>>> Both 3.14 tests [2] [3] get the correct number of pages. Like:
>>>>>
>>>>> xc: detail: max_pfn 0x1ffff, p2m_frames 256
>>>>> ...
>>>>> xc: detail: Memory: 2048/131072 1%
>>>>> ...
>>>>>
>>>>> However in both 4.1 [0] [1] the number of pages are quite wrong.
>>>>>
>>>>> 4.1 32 bit:
>>>>>
>>>>> xc: detail: max_pfn 0xfffff, p2m_frames 1024
>>>>> ...
>>>>> xc: detail: Memory: 11264/1048576 1%
>>>>> ...
>>>>>
>>>>> It thinks it has 4096MB memory.
>>>>>
>>>>> 4.1 64 bit:
>>>>>
>>>>> xc: detail: max_pfn 0x3ffff, p2m_frames 512
>>>>> ...
>>>>> xc: detail: Memory: 3072/262144 1%
>>>>> ...
>>>>>
>>>>> It thinks it has 1024MB memory.
>>>>>
>>>>> The total number of pages is determined in libxc by calling
>>>>> xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from
>>>>> hypervisor. And that value is clearly touched by Linux in some way.
>>>> Sure. shared_info->arch.max_pfn holds the number of pfns the p2m list
>>>> can handle. This is not the memory size of the domain.
>>>>
>>>>> I now think this is a bug in Linux kernel. The biggest suspect is the
>>>>> introduction of linear P2M. If you think this is a bug in toolstack,
>>>>> please let me know.
>>>> I absolutely think it is a toolstack bug. Even without the linear p2m
>>>> things would go wrong in case a ballooned down guest would be migrated,
>>>> as shared_info->arch.max_pfn would hold the upper limit of the guest
>>>> in this case and not the current size.
>>> I don't think this necessarily is a tool stack bug, at least not in
>>> the sense implied above - since (afaik) migrating ballooned guests
>>> (at least PV ones) has been working before, there ought to be
>>> logic to skip ballooned pages (and I certainly recall having seen
>>> migration slowly move up to e.g. 50% and then skip the other
>>> half due to being ballooned, albeit that recollection certainly is
>>> from before v2). And pages above the highest populated one
>>> ought to be considered ballooned just as much. With the
>>> information provided by Wei I don't think we can judge about
>>> this, since it only shows the values the migration process starts
>>> from, not when, why, or how it fails.
>> Max pfn reported by migration v2 is max pfn, not the number of pages of RAM
>> in the guest.
>>
> I understand that by looking at the code. Just the log itself
> is very confusing.
>
> I propose we rename the log a bit. Maybe change "Memory" to "P2M" or
> something else?

P2M would be wrong for HVM guests. Memory was the same term used by the
legacy code iirc. "Frames" is probably the best term.

>
>> It is used for the size of the bitmaps used by migration v2, including the
>> logdirty op calls.
>>
>> All frames between 0 and max pfn will have their type queried, and acted
>> upon appropriately, including doing nothing if the frame was ballooned out.
> In short, do you think this is a bug in migration v2?

There is insufficient information in this thread to say either way.
Maybe. Maybe a Linux kernel bug.

>
> When I looked at write_batch() I found some snippets that I thought to
> be wrong. But I didn't want to make the judgement when I didn't have a
> clear head.

write_batch() is a complicated function but it can't usefully be split
any further. I would be happy to explain bits or expand the existing
comments, but it is also possible that it is buggy.

~Andrew
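
For reference, the distinction drawn in this thread between the frame
range bounded by max pfn and the pages a guest actually has populated
can be sketched as below. This is only an illustrative Python model
and not libxc code: the 4 KiB page size is an assumption, and
frames_in_range(), mib_spanned(), and count_populated() are
hypothetical helpers invented for this sketch.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def frames_in_range(max_pfn: int) -> int:
    """Number of frames in the range [0, max_pfn], inclusive."""
    return max_pfn + 1


def mib_spanned(max_pfn: int) -> int:
    """Size in MiB the frame range would cover if every frame were populated."""
    return frames_in_range(max_pfn) * PAGE_SIZE // (1024 * 1024)


def count_populated(max_pfn: int, is_populated) -> int:
    """Visit every frame from 0 to max_pfn and count only the populated ones.

    is_populated is a hypothetical callback standing in for the per-frame
    type query; ballooned-out frames are simply skipped.
    """
    return sum(1 for pfn in range(max_pfn + 1) if is_populated(pfn))


# max_pfn values quoted in the logs earlier in the thread.
for label, max_pfn in [("3.14", 0x1ffff),
                       ("4.1 32 bit", 0xfffff),
                       ("4.1 64 bit", 0x3ffff)]:
    print(f"{label}: {frames_in_range(max_pfn)} frames, "
          f"{mib_spanned(max_pfn)} MiB spanned")
```

With the values above, the frame counts come out as the denominators
in the quoted "Memory:" log lines (131072, 1048576, 262144). A guest
whose populated frames are only the first 131072 would still yield
count_populated(0xfffff, lambda pfn: pfn < 131072) == 131072, which is
the sense in which the log figure is a frame range and not a RAM size.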