From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:38032)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1XHX4D-0007xI-8C for qemu-devel@nongnu.org;
    Wed, 13 Aug 2014 07:50:14 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1XHX47-0001UT-SO for qemu-devel@nongnu.org;
    Wed, 13 Aug 2014 07:50:09 -0400
Received: from mx1.redhat.com ([209.132.183.28]:2077)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1XHX47-0001TQ-Kr for qemu-devel@nongnu.org;
    Wed, 13 Aug 2014 07:50:03 -0400
Date: Wed, 13 Aug 2014 13:50:20 +0200
From: "Michael S. Tsirkin"
Message-ID: <20140813115020.GC20244@redhat.com>
References: <1407928917-16220-1-git-send-email-zhang.zhanghailiang@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1407928917-16220-1-git-send-email-zhang.zhanghailiang@huawei.com>
Subject: Re: [Qemu-devel] [PATCH] mlock: fix bug when mlockall called before mbind
To: zhanghailiang
Cc: hutao@cn.fujitsu.com, luonengjun@huawei.com, peter.huangpeng@huawei.com,
    xiexiangyou, qemu-devel@nongnu.org, aliguori@amazon.com,
    imammedo@redhat.com, pbonzini@redhat.com, gaowanlong@cn.fujitsu.com

On Wed, Aug 13, 2014 at 07:21:57PM +0800, zhanghailiang wrote:
> If we configure qemu with realtime-mlock-on and memory-node-bind at the same time,
> Qemu will fail to start, and mbind() fails with message "Input/output error".
>
> >From man page:
> int mbind(void *addr, unsigned long len, int mode,
>           unsigned long *nodemask, unsigned long maxnode,
>           unsigned flags);
> The *MPOL_BIND* mode specifies a strict policy that restricts memory allocation
> to the nodes specified in nodemask.
> If *MPOL_MF_STRICT* is passed in flags and policy is not MPOL_DEFAULT(In qemu
> here is MPOL_BIND), then the call will fail with the error EIO if the existing
> pages in the memory range don't follow the policy.
>
> The memory locked ahead by mlockall can not guarantee to follow the policy above,
> And if that happens, it will result in an EIO error.
>
> So we should call mlock after mbind, here we adjust the place where called mlock,
> Move it to function pc_memory_init.
>
> Signed-off-by: xiexiangyou
> Signed-off-by: zhanghailiang

OK but won't this still fail in case of memory hotplug?
We set MCL_FUTURE so the same will apply?
Maybe it's enough to set MPOL_MF_MOVE?
Does the following work for you?

-->

hostmem: set MPOL_MF_MOVE

When memory is allocated on a wrong node, MPOL_MF_STRICT
doesn't move it - it just fails the allocation.
A simple way to reproduce the failure is with mlock=on
realtime feature.

The code comment actually says:
    "ensure policy won't be ignored"
so setting MPOL_MF_MOVE seems like a better way to do this.

Signed-off-by: Michael S. Tsirkin

---

diff --git a/backends/hostmem.c b/backends/hostmem.c
index ca10c51..a9905c0 100644
--- a/backends/hostmem.c
+++ b/backends/hostmem.c
@@ -304,7 +304,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
         /* ensure policy won't be ignored in case memory is preallocated
          * before mbind(). note: MPOL_MF_STRICT is ignored on hugepages so
          * this doesn't catch hugepage case. */
-        unsigned flags = MPOL_MF_STRICT;
+        unsigned flags = MPOL_MF_STRICT | MPOL_MF_MOVE;
 
         /* check for invalid host-nodes and policies and give more verbose
          * error messages than mbind(). */
-- 
MST