From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Vrabel
Subject: Re: Linux 4.1 reports wrong number of pages to toolstack
Date: Fri, 4 Sep 2015 15:58:11 +0100
Message-ID: <55E9B183.7040709@citrix.com>
References: <20150904004039.GA23402@zion.uk.xensource.com> <1441356837.26292.431.camel@citrix.com> <55E9ADBE.5030102@citrix.com> <20150904145329.GH27133@zion.uk.xensource.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta5.messagelabs.com ([195.245.231.135]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1ZXsS7-00089D-Kf for xen-devel@lists.xenproject.org; Fri, 04 Sep 2015 14:58:55 +0000
In-Reply-To: <20150904145329.GH27133@zion.uk.xensource.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu
Cc: Juergen Gross , xen-devel@lists.xenproject.org, Ian Jackson , Ian Campbell , Andrew Cooper
List-Id: xen-devel@lists.xenproject.org

On 04/09/15 15:53, Wei Liu wrote:
> On Fri, Sep 04, 2015 at 03:42:06PM +0100, David Vrabel wrote:
>> On 04/09/15 09:53, Ian Campbell wrote:
>>> On Fri, 2015-09-04 at 01:40 +0100, Wei Liu wrote:
>>>> Hi David
>>>>
>>>> This issue is exposed by the introduction of migration v2. The symptom is that
>>>> a guest with 32 bit 4.1 kernel can't be restored because it's asking for too
>>>> many pages.
>>>
>>> FWIW my adhoc tests overnight gave me:
>>>
>>> 37858: b953c0d234bc72e8489d3bf51a276c5c4ec85345 v4.1 Fail
>>> 37862: 39a8804455fb23f09157341d3ba7db6d7ae6ee76 v4.0 Fail
>>> 37860: bfa76d49576599a4b9f9b7a71f23d73d6dcff735 v3.19 Fail
>>>
>>> 37872: e36f014edff70fc02b3d3d79cead1d58f289332e v3.19-rc7 Fail
>>> 37866: 26bc420b59a38e4e6685a73345a0def461136dce v3.19-rc6 Fail
>>> 37868: ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc v3.19-rc5 Fail
>>> 37864: eaa27f34e91a14cdceed26ed6c6793ec1d186115 v3.19-rc4 Fail *
>>> 37867: b1940cd21c0f4abdce101253e860feff547291b0 v3.19-rc3 Pass *
>>> 37865: b7392d2247cfe6771f95d256374f1a8e6a6f48d6 v3.19-rc2 Pass
>>>
>>> 37863: 97bf6af1f928216fd6c5a66e8a57bfa95a659672 v3.19-rc1 Pass
>>>
>>> 37861: b2776bf7149bddd1f4161f14f79520f17fc1d71d v3.18 Pass
>>>
>>> I have set the adhoc bisector working on the ~200 commits between rc3 and
>>> rc4. It's running in the Citrix instance (which is quieter) so the interim
>>> results are only visible within our network at http://osstest.xs.citrite.ne
>>> t/~osstest/testlogs/results-adhoc/bisect/xen-unstable/test-amd64-i386
>>> -xl..html.
>>>
>>> So far it has confirmed the basis fail and it is now rechecking the basis
>>> pass.
>>>
>>> Slightly strange though is:
>>> $ git log --oneline v3.19-rc3..v3.19-rc4 -- drivers/xen/ arch/x86/xen/ include/xen/
>>> $
>>>
>>> i.e. there are no relevant seeming xen commits in that range. Maybe the
>>> last one of this is more relevant?
>>
>> Since this bisect attempt appears to have disappeared into the weeds I
>> did my own and it fingered:
>>
>> 633d6f17cd91ad5bf2370265946f716e42d388c6 (x86/xen: prepare p2m list for
>> memory hotplug) which was introduced in 4.0-rc7.
>>
>> This looks a lot more plausible as the Linux change triggering the
>> migration failures.
>>
>
> FWIW. Same 32bit kernel, 128MB memory, migration is OK.

This commit is only bad with 64-bit guests -- with a 32-bit guest the maximum p2m size covers only 64 GiB.
It also requires XEN_BALLOON_MEMORY_HOTPLUG to be enabled.

This commit is exposing a toolstack bug.

David