From: Paolo Bonzini
Date: Fri, 09 May 2014 23:13:50 +0200
Message-ID: <536D450E.80103@redhat.com>
References: <536B9A0C.6040103@redhat.com> <20140509082949.GC3086@G08FNSTD100614.fnst.cn.fujitsu.com> <20140509175415.GA9730@otherpad.lan.raisama.net>
In-Reply-To: <20140509175415.GA9730@otherpad.lan.raisama.net>
Subject: Re: [Qemu-devel] [PATCH v3.1 00/31] NUMA series, and hostmem improvements
To: Eduardo Habkost, Hu Tao
Cc: Igor Mammedov, qemu-devel@nongnu.org

On 09/05/2014 19:54, Eduardo Habkost wrote:
> On Fri, May 09, 2014 at 04:29:49PM +0800, Hu Tao wrote:
>> On Thu, May 08, 2014 at 04:51:56PM +0200, Paolo Bonzini wrote:
>>> On 06/05/2014 11:27, Hu Tao wrote:
>>>> This series includes work on QOMifying the memory backends.
>>>> The idea is to delegate all properties of the memory backend to
>>>> a new QOM class hierarchy, in which the concrete classes
>>>> are hostmem-ram and hostmem-file. The backend is passed to the
>>>> machine via "-numa node,memdev=foo" where "foo" is the id of the
>>>> backend object.
>>>
>>> Hello,
>>>
>>> I noticed now that if you have the host-nodes property set, Linux
>>> requires you to set a policy other than "default" too. If you don't,
>>> the mbind system call fails.
>>>
>>> What about squashing something like this?
>>>
>>> Paolo
>>>
>>> diff --git a/backends/hostmem.c b/backends/hostmem.c
>>> index d3f8476..a0a3111 100644
>>> --- a/backends/hostmem.c
>>> +++ b/backends/hostmem.c
>>> @@ -299,12 +299,23 @@ host_memory_backend_memory_init(UserCreatable *uc, Error **errp)
>>>
>>>  #ifdef CONFIG_NUMA
>>>      unsigned long maxnode = find_last_bit(backend->host_nodes, MAX_NODES);
>>> +    unsigned policy = backend->policy;
>>> +
>>> +    /* Linux does not accept MPOL_DEFAULT with nonzero bitmap, but
>>> +     * "-object memory-ram,size=128M,hostnodes=0,policy=bind" is a
>>> +     * bit of a mouthful. So if the host_nodes bitmap is nonzero,
>>> +     * pick the BIND policy.
>
> Are we sure MPOL_BIND is a better default than MPOL_INTERLEAVE or
> MPOL_PREFERRED?

Better than MPOL_PREFERRED, yes. Better than MPOL_INTERLEAVE, I am not
sure.

> * If policy=default is set, it is always going to be MPOL_DEFAULT.
> * If policy=bind is set, it is always going to be MPOL_BIND.
> * If policy=preferred is set, it is always going to be MPOL_PREFERRED.
> * If policy is omitted, it will be MPOL_DEFAULT if host-nodes is
>   unset, and MPOL_BIND if host-nodes is set.

That's possible. Or we can just detect the case in
host_memory_backend_memory_init and provide a better error message
than just "Invalid argument" (aka EINVAL).

We're always free to take an error situation and make it non-error in
the future, but the reverse would be hard to do.

Paolo
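
The two options discussed above (defaulting to BIND when the policy is
omitted, or rejecting "default" plus host-nodes with a clearer error)
can be sketched in isolation as follows. This is only an illustrative,
standalone sketch and not code from the series: the Policy enum,
POLICY_UNSET, resolve_policy(), check_policy() and the error string are
all hypothetical names, and a plain bool stands in for "the host_nodes
bitmap is nonzero".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the MPOL_* values; POLICY_UNSET models
 * "the user did not pass policy= at all", which is an assumption of
 * this sketch. */
typedef enum {
    POLICY_UNSET,
    POLICY_DEFAULT,
    POLICY_PREFERRED,
    POLICY_BIND,
    POLICY_INTERLEAVE,
} Policy;

/* Option 1: an explicit policy is always kept as given; an omitted
 * policy becomes BIND if host nodes were specified and DEFAULT
 * otherwise. */
Policy resolve_policy(Policy requested, bool have_host_nodes)
{
    if (requested != POLICY_UNSET) {
        return requested;
    }
    return have_host_nodes ? POLICY_BIND : POLICY_DEFAULT;
}

/* Option 2: do not change the policy, but detect the combination that
 * mbind would reject with EINVAL and describe it.  Returns NULL when
 * the combination is acceptable, otherwise a message. */
const char *check_policy(Policy policy, bool have_host_nodes)
{
    if (policy == POLICY_DEFAULT && have_host_nodes) {
        return "host-nodes requires a policy other than 'default'";
    }
    return NULL;
}
```

Note that the two functions compose: an explicit policy=default with
host nodes set passes through resolve_policy() unchanged and is then
reported by check_policy(), so an explicit request is never silently
rewritten.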