From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:60734)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1d3Zz7-000400-79 for qemu-devel@nongnu.org;
	Wed, 26 Apr 2017 23:20:53 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1d3Zz4-0006Nx-1J for qemu-devel@nongnu.org;
	Wed, 26 Apr 2017 23:20:49 -0400
Received: from mx1.redhat.com ([209.132.183.28]:56018)
	by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32)
	(Exim 4.71) (envelope-from ) id 1d3Zz3-0006NK-P1
	for qemu-devel@nongnu.org; Wed, 26 Apr 2017 23:20:45 -0400
Date: Thu, 27 Apr 2017 11:20:37 +0800
From: Peter Xu
Message-ID: <20170427032037.GE26792@pxdev.xzpeter.org>
References: <20170426183721.7482-1-dgilbert@redhat.com>
	<20170426183721.7482-2-dgilbert@redhat.com>
	<8a107a40-073c-8181-75aa-e5700f2900a7@de.ibm.com>
	<20170426190442.GG2394@work-vm>
	<20170426193743.GF3508@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20170426193743.GF3508@redhat.com>
Subject: Re: [Qemu-devel] [PATCH 1/2] Postcopy: Force allocation of all-zero precopy pages
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: "Dr. David Alan Gilbert"
Cc: Christian Borntraeger , qemu-devel@nongnu.org, quintela@redhat.com, lvivier@redhat.com, Andrea Arcangeli

On Wed, Apr 26, 2017 at 09:37:43PM +0200, Andrea Arcangeli wrote:
> Hello,
>
> On Wed, Apr 26, 2017 at 08:04:43PM +0100, Dr. David Alan Gilbert wrote:
> > * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> > > On 04/26/2017 08:37 PM, Dr. David Alan Gilbert (git) wrote:
> > > > From: "Dr. David Alan Gilbert"
> > > >
> > > > When an all-zero page is received during the precopy
> > > > phase of a postcopy-enabled migration we must force
> > > > allocation otherwise accesses to the page will still
> > > > get blocked by userfault.
> > > >
> > > > Symptom:
> > > >   a) If the page is accessed by a device during device-load
> > > >      then we get a deadlock as the source finishes sending
> > > >      all its pages but the destination device-load is still
> > > >      paused and so doesn't clean up.
> > > >
> > > >   b) If the page is accessed later, then the thread will stay
> > > >      paused until the end of migration rather than carrying on
> > > >      running, until we release userfault at the end.
> > > >
> > > > Signed-off-by: Dr. David Alan Gilbert
> > > > Reported-by: Christian Borntraeger
> > >
> > > CC stable? after all the guest hangs on both sides
> > >
> > > Has survived 40 migrations (usually failed at the 2nd)
> > > Tested-by: Christian Borntraeger
> >
> > Great...but.....
> > Andrea (added to the mail) says this shouldn't be necessary.
> > The read we were doing in the is_zero_range() should have been sufficient
> > to get the page mapped and that zero page should have survived.
> >
> > So - I guess that's back a step, we need to figure out why the
> > page disappears for you.
>
> Yes, reading during precopy is enough to fill the hole and prevent
> userfault missing faults from triggering.
>
> Somehow the pagetable must be mapped by a zeropage or a hugezeropage
> or a regular page allocated during a previous precopy pass or a
> pre-zeroed subpage part of a THP.
>
> Even if the hugezeropage is split later by a MADV_DONTNEED when
> postcopy starts, they will become 4k zeropages.
>
> After a read succeeds, nothing (except MADV_DONTNEED or other explicit
> syscalls which qemu would need to invoke explicitly between
> is_zero_range and UFFDIO_REGISTER) should be able to bring the
> pagetable back to its "pte_none/pmd_none" state that will then trigger
> missing userfaults during postcopy later.

Whatever the final solution turns out to be (after seeing Juan's comment,
I am curious whether is_zero_page() behaves differently on Power
now)...
Dave, would it be worth mentioning this read side effect in
ram_handle_compressed()? Otherwise, IMHO, it might be hard for many
people to notice it quickly.

Thanks,

-- 
Peter Xu
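
To make the read side effect discussed above concrete, here is a minimal standalone sketch of the kind of comment that could go next to the zero check. It is not the actual QEMU code: range_is_zero() and handle_fill_page() are hypothetical stand-ins for is_zero_range() and ram_handle_compressed(), and the bool return value exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a zero check: reads bytes until it finds a
 * non-zero one, so an all-zero range is read in full. */
static bool range_is_zero(const uint8_t *p, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Hypothetical, simplified fill handler (not the QEMU function itself).
 * Returns true if it wrote to the page, false if it only read it. */
static bool handle_fill_page(uint8_t *host, uint8_t ch, size_t size)
{
    assert(host != NULL);

    /*
     * NOTE: for ch == 0 the zero check below is not only an optimisation
     * that skips a redundant memset. As described in the thread, the read
     * it performs is expected to get the page mapped (for example by a
     * zeropage), so that a later missing-page userfault does not trigger
     * for it. Do not drop the read without forcing allocation another way.
     */
    if (ch != 0 || !range_is_zero(host, size)) {
        memset(host, ch, size);
        return true;
    }
    return false;
}
```

With ch == 0 and an already-zero page the function only reads, which is exactly the case where the side effect matters; for any other fill byte, or a page that is not already zero, it writes.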