Date: Thu, 14 Apr 2016 13:34:41 +0100
From: "Dr. David Alan Gilbert"
Message-ID: <20160414123441.GF2252@work-vm>
References: <20160412175501.GB6415@work-vm> <20160413080545.GA2270@work-vm> <20160413114103.GB2270@work-vm> <20160413125053.GC2270@work-vm> <20160413205132.GG26364@redhat.com>
In-Reply-To: <20160413205132.GG26364@redhat.com>
Subject: Re: [Qemu-devel] post-copy is broken?
To: Andrea Arcangeli, kirill.shutemov@linux.intel.com
Cc: "Li, Liang Z", Amit Shah, "qemu-devel@nongnu.org", "quintela@redhat.com"

* Andrea Arcangeli (aarcange@redhat.com) wrote:
> The next suspect is the massive THP refcounting change that went
> upstream recently:

> As further debug hint, can you try to disable THP and see if that
> makes the problem go away?

Yep, this seems to be the problem (cc'ing in Kirill).

122afea9626ab3f717b250a8dd3d5ebf57cdb56c - works (just before Kirill disables THP)
61f5d698cc97600e813ca5cf8e449b1ea1c11492 - breaks (when THP is reenabled)

It's pretty reliable; as you say disabling THP makes it work again
and putting it back to THP/madvise mode makes it break. And you need
to test on a machine with some free ram to make sure THP has a chance
to have happened.

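For anyone trying to reproduce this, a minimal sketch of how the two conditions above could be checked from C: which THP mode is active, and whether any anonymous huge pages have actually been allocated. It assumes the usual Linux locations (/sys/kernel/mm/transparent_hugepage/enabled, where the active mode is the bracketed word, and the AnonHugePages line in /proc/meminfo). The function names are made up for illustration and are not from QEMU or the kernel.

```c
#include <stdio.h>
#include <string.h>

/* Copy the bracketed (active) mode out of a line such as
 * "always [madvise] never". Returns 0 on success, -1 if there is
 * no complete [...] pair or the output buffer is too small. */
int thp_active_mode(const char *line, char *out, size_t outlen)
{
    const char *lb = strchr(line, '[');
    const char *rb = lb ? strchr(lb, ']') : NULL;
    size_t len;

    if (!lb || !rb)
        return -1;
    len = (size_t)(rb - lb - 1);
    if (len + 1 > outlen)
        return -1;
    memcpy(out, lb + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the first line of 'path' (assumed to be the sysfs THP
 * "enabled" file) and extract the active mode from it. */
int thp_read_mode(const char *path, char *out, size_t outlen)
{
    char line[128];
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(line, sizeof(line), f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return thp_active_mode(line, out, outlen);
}

/* Return the AnonHugePages value in kB from a buffer holding the
 * text of /proc/meminfo, or -1 if the field is not present. */
long anon_huge_kb(const char *meminfo)
{
    const char *p = strstr(meminfo, "AnonHugePages:");
    long kb;

    if (!p || sscanf(p, "AnonHugePages: %ld kB", &kb) != 1)
        return -1;
    return kb;
}
```

Calling thp_read_mode() on the sysfs file before a migration run should report "never" for the working case and "always" or "madvise" for the failing one. A non-zero anon_huge_kb() while the guest is running is only a system-wide hint that THP has kicked in, not proof that the guest's own RAM is backed by huge pages.
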
I'm not sure of all of the rework that happened in that series, but my
reading of it is that splitting of THP pages gets deferred. So I wonder
whether, when I do the madvise to turn THP off, the range has actually
still got THP pages in it, and thus we end up with a whole THP mapped
when I'm expecting to be userfaulting those pages.

Dave

>
> Thanks,
> Andrea
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK