From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:48280)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZzQEd-0007av-2g for qemu-devel@nongnu.org;
	Thu, 19 Nov 2015 09:30:52 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from ) id 1ZzQEZ-0003oe-SC for qemu-devel@nongnu.org;
	Thu, 19 Nov 2015 09:30:50 -0500
Received: from e38.co.us.ibm.com ([32.97.110.159]:46080)
	by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
	id 1ZzQEZ-0003ni-KF for qemu-devel@nongnu.org;
	Thu, 19 Nov 2015 09:30:47 -0500
Received: from localhost by e38.co.us.ibm.com with IBM ESMTP SMTP Gateway:
	Authorized Use Only! Violators will be prosecuted for from ;
	Thu, 19 Nov 2015 07:30:46 -0700
Received: from b03cxnp07028.gho.boulder.ibm.com
	(b03cxnp07028.gho.boulder.ibm.com [9.17.130.15])
	by d03dlp01.boulder.ibm.com (Postfix) with ESMTP id C5EE81FF002D for ;
	Thu, 19 Nov 2015 07:18:55 -0700 (MST)
Received: from d03av01.boulder.ibm.com (d03av01.boulder.ibm.com [9.17.195.167])
	by b03cxnp07028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0)
	with ESMTP id tAJERCpR50069708 for ;
	Thu, 19 Nov 2015 07:27:12 -0700
Received: from d03av01.boulder.ibm.com (localhost [127.0.0.1])
	by d03av01.boulder.ibm.com (8.14.4/8.14.4/NCO v10.0 AVout)
	with ESMTP id tAJEUhiI005314 for ;
	Thu, 19 Nov 2015 07:30:43 -0700
References: <1447341516-18076-1-git-send-email-jjherne@linux.vnet.ibm.com>
	<564C7DCA.8010400@suse.cz> <564D86AE.1010305@de.ibm.com>
	<20151119131011.GD2653@work-vm>
From: "Jason J. Herne"
Message-ID: <564DDD12.3070302@linux.vnet.ibm.com>
Date: Thu, 19 Nov 2015 09:30:42 -0500
MIME-Version: 1.0
In-Reply-To: <20151119131011.GD2653@work-vm>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Subject: Re: [Qemu-devel] [PATCH] mm: Loosen MADV_NOHUGEPAGE to enable Qemu
	postcopy on s390
Reply-To: jjherne@linux.vnet.ibm.com
To: "Dr. David Alan Gilbert", Christian Borntraeger
Cc: aarcange@redhat.com, akpm@linux-foundation.org, qemu-devel,
	Vlastimil Babka, Juan Quintela

On 11/19/2015 08:10 AM, Dr. David Alan Gilbert wrote:
> * Christian Borntraeger (borntraeger@de.ibm.com) wrote:
>> On 11/18/2015 02:31 PM, Vlastimil Babka wrote:
>>> [CC += linux-api@vger.kernel.org]
...
> Can you tell me if the following works for you:
>
>
> From 545809a18fa768eccdaafe9bd842490c3390b00c Mon Sep 17 00:00:00 2001
> From: "Dr. David Alan Gilbert"
> Date: Thu, 19 Nov 2015 12:05:36 +0000
> Subject: [PATCH] Assume madvise for (no)hugepage works
>
> madvise() returns EINVAL in the case of many failures, but also
> returns it in cases where the host kernel doesn't have THP enabled.
> Postcopy only really cares that THP is off before it detects faults,
> and turns it back on afterwards; so we're going to have
> to assume that if the madvise fails then the host just doesn't do
> THP and we can carry on with the postcopy.
>
> Signed-off-by: Dr. David Alan Gilbert
> ---
>  migration/postcopy-ram.c | 10 ++--------
>  1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 22d6b18..3946aa9 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -241,10 +241,7 @@ static int cleanup_range(const char *block_name, void *host_addr,
>       * We turned off hugepage for the precopy stage with postcopy enabled
>       * we can turn it back on now.
>       */
> -    if (qemu_madvise(host_addr, length, QEMU_MADV_HUGEPAGE)) {
> -        error_report("%s HUGEPAGE: %s", __func__, strerror(errno));
> -        return -1;
> -    }
> +    qemu_madvise(host_addr, length, QEMU_MADV_HUGEPAGE);
>
>      /*
>       * We can also turn off userfault now since we should have all the
> @@ -345,10 +342,7 @@ static int nhp_range(const char *block_name, void *host_addr,
>       * do delete areas of the page, even if THP thinks a hugepage would
>       * be a good idea, so force hugepages off.
>       */
> -    if (qemu_madvise(host_addr, length, QEMU_MADV_NOHUGEPAGE)) {
> -        error_report("%s: NOHUGEPAGE: %s", __func__, strerror(errno));
> -        return -1;
> -    }
> +    qemu_madvise(host_addr, length, QEMU_MADV_NOHUGEPAGE);
>
>      return 0;
>  }
>

I tried this without my madvise kernel patch, and was able to get by the
problem as expected.

We still need the kernel patch set "Allow gmap fault to retry" as posted
to linux-mm to get userfaultfd support playing nicely with s390 async
page faults. But that is a separate problem entirely.

Tested-by: Jason J. Herne

-- 
-- Jason J. Herne (jjherne@linux.vnet.ibm.com)
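
For readers who want to try the idea outside of QEMU, here is a minimal standalone sketch of the "treat the THP hint as best effort" approach that the quoted patch takes. It is not QEMU code: thp_hint_best_effort() and demo_toggle_thp() are hypothetical names made up for this illustration, and it calls plain Linux madvise() rather than qemu_madvise(). Unlike the patch, it logs the ignored error so the failure stays visible.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not part of QEMU): apply a THP hint and treat any
 * failure as non-fatal. madvise() reports EINVAL both for bad arguments and
 * for kernels built without THP, so the caller cannot tell them apart.
 * Returns 0 if the hint was applied, otherwise the errno value that was
 * ignored.
 */
static int thp_hint_best_effort(void *addr, size_t len, int advice)
{
    if (madvise(addr, len, advice) == 0) {
        return 0;
    }

    int err = errno;
    fprintf(stderr, "madvise(advice=%d) ignored: %s\n", advice, strerror(err));
    return err;
}

/*
 * Hypothetical demo: map anonymous memory, ask for hugepages off, touch the
 * memory, then ask for hugepages back on. Only an mmap() failure is fatal.
 * Returns 0 on success, -1 if the mapping could not be created.
 */
static int demo_toggle_thp(size_t len)
{
    assert(len > 0);

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }

    /* Hint only: carry on even if the host has no THP support. */
    (void)thp_hint_best_effort(p, len, MADV_NOHUGEPAGE);

    /* Stand-in for the phase that needs small pages. */
    memset(p, 0xab, len);

    /* Re-enable hugepages afterwards, again without failing on error. */
    (void)thp_hint_best_effort(p, len, MADV_HUGEPAGE);

    munmap(p, len);
    return 0;
}
```

Calling demo_toggle_thp(4 * 1024 * 1024) should return 0 on a Linux host whether or not THP is compiled in; on a kernel without THP the only visible difference is the message on stderr.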