From mboxrd@z Thu Jan 1 00:00:00 1970
From: Juergen Gross
Subject: Re: Linux 4.1 reports wrong number of pages to toolstack
Date: Fri, 4 Sep 2015 05:38:13 +0200
Message-ID: <55E91225.4090500@suse.com>
References: <20150904004039.GA23402@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; Format="flowed"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39])
 by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZXhpR-0002TI-5s
 for xen-devel@lists.xenproject.org; Fri, 04 Sep 2015 03:38:17 +0000
In-Reply-To: <20150904004039.GA23402@zion.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu, xen-devel@lists.xenproject.org, David Vrabel
Cc: Andrew Cooper, Ian Jackson, Ian Campbell
List-Id: xen-devel@lists.xenproject.org

On 09/04/2015 02:40 AM, Wei Liu wrote:
> Hi David
>
> This issue is exposed by the introduction of migration v2. The symptom is that
> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too
> many pages.
>
> Note that all guests have 512MB memory, which means they have 131072 pages.
>
> Both 3.14 tests [2] [3] get the correct number of pages. Like:
>
> xc: detail: max_pfn 0x1ffff, p2m_frames 256
> ...
> xc: detail: Memory: 2048/131072 1%
> ...
>
> However in both 4.1 [0] [1] the number of pages are quite wrong.
>
> 4.1 32 bit:
>
> xc: detail: max_pfn 0xfffff, p2m_frames 1024
> ...
> xc: detail: Memory: 11264/1048576 1%
> ...
>
> It thinks it has 4096MB memory.
>
> 4.1 64 bit:
>
> xc: detail: max_pfn 0x3ffff, p2m_frames 512
> ...
> xc: detail: Memory: 3072/262144 1%
> ...
>
> It thinks it has 1024MB memory.
>
> The total number of pages is determined in libxc by calling
> xc_domain_nr_gpfns, which yanks shared_info->arch.max_pfn from
> hypervisor. And that value is clearly touched by Linux in some way.

Sure.
shared_info->arch.max_pfn holds the number of pfns the p2m list can handle.
This is not the memory size of the domain.

> I now think this is a bug in Linux kernel. The biggest suspect is the
> introduction of linear P2M. If you think this is a bug in toolstack,
> please let me know.

I absolutely think it is a toolstack bug. Even without the linear p2m
things would go wrong in case a ballooned down guest would be migrated,
as shared_info->arch.max_pfn would hold the upper limit of the guest in
this case and not the current size.

Juergen
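
For reference, the page counts and memory sizes in the quoted libxc output follow from the max_pfn values by simple arithmetic. The sketch below assumes 4 KiB pages and that the reported total is max_pfn + 1. The helper names are hypothetical and not part of libxc. As noted above, the result is the range the p2m list can cover, not the amount of memory the domain actually has.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86


def pfns_from_max_pfn(max_pfn: int) -> int:
    """Number of pfns covered when the highest pfn is max_pfn (hypothetical helper)."""
    return max_pfn + 1


def pfns_to_mib(pfns: int) -> int:
    """Size in MiB that this many pages would span (hypothetical helper)."""
    return pfns * PAGE_SIZE // (1024 * 1024)


# max_pfn values copied from the "xc: detail:" lines quoted in this thread
samples = [
    ("3.14", 0x1FFFF),
    ("4.1 32 bit", 0xFFFFF),
    ("4.1 64 bit", 0x3FFFF),
]

for label, max_pfn in samples:
    pfns = pfns_from_max_pfn(max_pfn)
    print(f"{label}: max_pfn {max_pfn:#x} -> {pfns} pfns -> {pfns_to_mib(pfns)} MiB")
```

Only the 3.14 value matches the 512MB the guests were configured with. The two 4.1 values give the 4096MB and 1024MB figures from the report.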