From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:59073)
    by lists.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Vlglh-0001Cp-Hg for qemu-devel@nongnu.org;
    Wed, 27 Nov 2013 10:11:15 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
    (envelope-from ) id 1Vlglb-0006cX-HU for qemu-devel@nongnu.org;
    Wed, 27 Nov 2013 10:11:09 -0500
Received: from [222.73.24.84] (port=4451 helo=song.cn.fujitsu.com)
    by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from )
    id 1Vlglb-0006bV-3g for qemu-devel@nongnu.org;
    Wed, 27 Nov 2013 10:11:03 -0500
Message-ID: <52960B33.6070703@cn.fujitsu.com>
Date: Wed, 27 Nov 2013 23:09:39 +0800
From: Wanlong Gao
MIME-Version: 1.0
References: <1385086118-11699-1-git-send-email-gaowanlong@cn.fujitsu.com>
    <1385086118-11699-11-git-send-email-gaowanlong@cn.fujitsu.com>
    <52960349.5020603@redhat.com>
In-Reply-To: <52960349.5020603@redhat.com>
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset=ISO-8859-15
Subject: Re: [Qemu-devel] [PATCH V16 09/11] NUMA: set guest numa nodes memory policy
Reply-To: gaowanlong@cn.fujitsu.com
To: Paolo Bonzini
Cc: drjones@redhat.com, ehabkost@redhat.com, lersek@redhat.com,
    mtosatti@redhat.com, qemu-devel@nongnu.org, lcapitulino@redhat.com,
    bsd@redhat.com, anthony@codemonkey.ws, hutao@cn.fujitsu.com,
    y-goto@jp.fujitsu.com, peter.huangpeng@huawei.com, afaerber@suse.de,
    Wanlong Gao

On 11/27/2013 10:35 PM, Paolo Bonzini wrote:
> On 22/11/2013 03:08, Wanlong Gao wrote:
>> +static int set_node_mem_policy(int nodeid)
>> +{
>> +#ifdef __linux__
>> +    void *ram_ptr;
>> +    RAMBlock *block;
>> +    ram_addr_t len, ram_offset = 0;
>> +    int bind_mode;
>> +    int i;
>> +
>> +    QTAILQ_FOREACH(block, &ram_list.blocks, next) {
>> +        if (!strcmp(block->mr->name, "pc.ram")) {
>
> This is not acceptable, "pc.ram" is a board-specific name.
>
> I think instead set_node_mem_policy could be something like
>
>     int memory_region_set_mem_policy(MemoryRegion *mr,
>                                      uint64_t start, uint64_t length,
>                                      uint64_t offset);
>
> that applies the NUMA policies specified for [offset, offset+length) to
> the host physical address [ptr+start, ptr+start+length), where ptr is
> memory_region_get_ram_ptr(mr).
>
> Each board then can call the function after it adds RAM with
> memory_region_add_subregion.

Got it, thank you.

Thanks,
Wanlong Gao

>
> Paolo
>
>> +            break;
>> +        }
>> +    }
>> +
>> +    if (block->host == NULL) {
>> +        return -1;
>> +    }
>> +
>> +    ram_ptr = block->host;
>> +    for (i = 0; i < nodeid; i++) {
>> +        len = numa_info[i].node_mem;
>> +        ram_offset += len;
>> +    }
>> +
>> +    len = numa_info[nodeid].node_mem;
>> +    bind_mode = node_parse_bind_mode(nodeid);
>> +    unsigned long *nodes = numa_info[nodeid].host_mem;
>> +
>> +    /* This is a workaround for a long standing bug in Linux'
>> +     * mbind implementation, which cuts off the last specified
>> +     * node. To stay compatible should this bug be fixed, we
>> +     * specify one more node and zero this one out.
>> +     */
>> +    unsigned long maxnode = find_last_bit(nodes, MAX_NODES);
>> +    if (syscall(SYS_mbind, ram_ptr + ram_offset, len, bind_mode,
>> +                nodes, maxnode + 2, 0)) {
>> +        perror("mbind");
>> +        return -1;
>> +    }
>
> Also, it's still not clear to me why we're not using libnuma.
>
>> +#endif
>> +
>> +    return 0;
>> +}
>> +
>>  void set_numa_modes(void)
>>  {
>>      CPUState *cpu;
>> @@ -240,4 +319,11 @@ void set_numa_modes(void)
>>              }
>>          }
>>      }
>> +
>> +    for (i = 0; i < nb_numa_nodes; i++) {
>> +        if (set_node_mem_policy(i) == -1) {
>> +            fprintf(stderr,
>> +                    "qemu: can not set host memory policy for node%d\n", i);
>> +        }
>> +    }
>>  }
>>
>
>
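
For reference, the range translation Paolo describes could be sketched
roughly as below. This is only an illustration under assumptions, not
QEMU code: NodeInfo, PolicyRange and split_by_node are hypothetical names,
node memory is assumed to be laid out contiguously in node order (as the
ram_offset loop in the patch implies), and the actual policy call (mbind
or libnuma) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one guest NUMA node; not QEMU's numa_info. */
typedef struct {
    uint64_t node_mem;      /* bytes of guest RAM assigned to this node */
} NodeInfo;

/* Hypothetical result: a host range that one node's policy applies to. */
typedef struct {
    int nodeid;
    uint8_t *host_start;
    uint64_t len;
} PolicyRange;

/*
 * Split the guest range [offset, offset + length) by node and translate
 * each piece into the host range that begins at ptr + start.
 * Returns the number of entries written to out, or -1 if out is too small.
 */
static int split_by_node(const NodeInfo *nodes, int nb_nodes,
                         uint8_t *ptr, uint64_t start, uint64_t length,
                         uint64_t offset, PolicyRange *out, int max_out)
{
    uint64_t node_base = 0;             /* guest offset where node i begins */
    uint64_t end = offset + length;
    int n = 0;

    assert(nodes != NULL && out != NULL);

    for (int i = 0; i < nb_nodes; i++) {
        uint64_t node_end = node_base + nodes[i].node_mem;
        /* Intersect the requested range with this node's range. */
        uint64_t lo = offset > node_base ? offset : node_base;
        uint64_t hi = end < node_end ? end : node_end;

        if (lo < hi) {
            if (n == max_out) {
                return -1;
            }
            out[n].nodeid = i;
            out[n].host_start = ptr + start + (lo - offset);
            out[n].len = hi - lo;
            n++;
        }
        node_base = node_end;
    }
    return n;
}
```

A board would then apply each node's policy to the returned host ranges
right after adding its RAM region, which avoids looking the block up by a
board-specific name such as "pc.ram".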