From: Tang Chen
To: Tejun Heo
Date: Sat, 26 Sep 2015 17:31:07 +0800
Subject: Re: [PATCH v2 3/7] x86, gfp: Cache best near node for memory allocation.
Message-ID: <560665DB.7020301@cn.fujitsu.com>
In-Reply-To: <20150910192935.GI8114@mtj.duckdns.org>

Hi, tj

On 09/11/2015 03:29 AM, Tejun Heo wrote:
> Hello,
>
> On Thu, Sep 10, 2015 at 12:27:45PM +0800, Tang Chen wrote:
>> diff --git a/include/linux/gfp.h b/include/linux/gfp.h
>> index ad35f30..1a1324f 100644
>> --- a/include/linux/gfp.h
>> +++ b/include/linux/gfp.h
>> @@ -307,13 +307,19 @@ static inline struct page *alloc_pages_node(int nid, gfp_t gfp_mask,
>>  	if (nid < 0)
>>  		nid = numa_node_id();
>>
>> +	if (!node_online(nid))
>> +		nid = get_near_online_node(nid);
>> +
>>  	return __alloc_pages(gfp_mask, order, node_zonelist(nid, gfp_mask));
>>  }
> Why not just update node_data[]->node_zonelist in the first place?

The zonelist will be rebuilt in __offline_pages() once the zone is no
longer populated.
Here, getting the best near online node is for CPUs on memory-less
nodes.

In the original code, if nid is NUMA_NO_NODE, the node the current CPU
resides on is chosen, and if that node is memory-less, the CPU is
mapped to its best near online node. This patch set maps the CPU to its
original node instead, so numa_node_id() may return a memory-less node
to the allocator, and the memory allocation may then fail.

> Also, what's the synchronization rule here?  How are allocators
> synchronized against node hot [un]plugs?

The rule is: the node_to_near_node_map[] array is updated each time a
node is hot-[un]plugged. It is currently not protected by a lock, and I
think acquiring one may cause a performance regression in the memory
allocator.

stop_machine is used when the zonelist is rebuilt, so I think updating
the node_to_near_node_map[] array at the same time could be a good
idea.

Thanks.
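
For illustration only, the cached near-node lookup described above could
be modelled in user space roughly as follows. This is a sketch, not code
from the patch set: MAX_NODES, node_distance_tbl, node_has_memory,
rebuild_near_node_map() and set_node_memory() are hypothetical names,
and only node_to_near_node_map[] and get_near_online_node() come from
the thread. It is single-threaded and therefore says nothing about the
locking question; the rebuild stands in for what would happen alongside
the zonelist rebuild.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES 4	/* hypothetical; stands in for the real node count */

/* Hypothetical SLIT-like distance table: a smaller value means closer. */
static const int node_distance_tbl[MAX_NODES][MAX_NODES] = {
	{ 10, 20, 30, 40 },
	{ 20, 10, 20, 30 },
	{ 30, 20, 10, 20 },
	{ 40, 30, 20, 10 },
};

/* Hypothetical state: true if the node is online and has memory. */
static bool node_has_memory[MAX_NODES] = { true, true, false, true };

/* Cache: for each node, the nearest node that currently has memory. */
static int node_to_near_node_map[MAX_NODES];

/* Recompute the whole cache; meant to run after every node [un]hotplug. */
static void rebuild_near_node_map(void)
{
	for (int nid = 0; nid < MAX_NODES; nid++) {
		int best = -1;
		int best_dist = INT_MAX;

		for (int n = 0; n < MAX_NODES; n++) {
			if (!node_has_memory[n])
				continue;
			/* Strict '<' keeps the lowest node id on a tie. */
			if (node_distance_tbl[nid][n] < best_dist) {
				best_dist = node_distance_tbl[nid][n];
				best = n;
			}
		}
		/* -1 means no node has memory at all. */
		node_to_near_node_map[nid] = best;
	}
}

/* Fast path: a plain array read, no search and no lock. */
static int get_near_online_node(int nid)
{
	assert(nid >= 0 && nid < MAX_NODES);
	return node_to_near_node_map[nid];
}

/* Simulated hotplug event: change the node state, then refresh the cache. */
static void set_node_memory(int nid, bool present)
{
	assert(nid >= 0 && nid < MAX_NODES);
	node_has_memory[nid] = present;
	rebuild_near_node_map();
}
```

The point of the cache is that the allocation path only pays for an
array read, while the cost of the distance search is moved to the rare
hotplug path.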