From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Aneesh Kumar K.V"
Subject: Re: [RFC][PATCH v2 10/21] mm: build separate zonelist for PMEM and DRAM node
Date: Mon, 07 Jan 2019 19:39:19 +0530
Message-ID: <87h8ekk19s.fsf@linux.ibm.com>
References: <20181226131446.330864849@intel.com>
 <20181226133351.644607371@intel.com>
 <87sgyc7n9a.fsf@linux.ibm.com>
 <20190107095753.7feee5fxjja5lt75@wfg-t540p.sh.intel.com>
Mime-Version: 1.0
Content-Type: text/plain
Cc: Andrew Morton, Linux Memory Management List, Fan Du,
 kvm@vger.kernel.org, LKML, Yao Yuan, Peng Dong, Huang Ying, Liu Jingqi,
 Dong Eddie, Dave Hansen, Zhang Yi, Dan Williams
To: Fengguang Wu
Return-path:
In-Reply-To: <20190107095753.7feee5fxjja5lt75@wfg-t540p.sh.intel.com>
Sender: linux-kernel-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

Fengguang Wu writes:

> On Tue, Jan 01, 2019 at 02:44:41PM +0530, Aneesh Kumar K.V wrote:
>>Fengguang Wu writes:
>>
>>> From: Fan Du
>>>
>>> When allocating pages, DRAM and PMEM nodes should not fall back to
>>> each other. This allows migration code to explicitly control which type
>>> of node to allocate pages from.
>>>
>>> With this patch, a PMEM NUMA node can only be used in 2 ways:
>>> - migrate in and out
>>> - numactl
>>
>>Can we achieve this using nodemask? That way we don't tag nodes with
>>different properties such as DRAM/PMEM. We can then give the
>>flexibility to the device init code to add the new memory nodes to
>>the right nodemask
>
> Aneesh, in patch 2 we did create nodemask numa_nodes_pmem and
> numa_nodes_dram. What's your supposed way of "using nodemask"?
>

IIUC the patch is meant to avoid allocation from PMEM nodes, and the way
you achieve that is by checking if (is_node_pmem(n)). We already have an
abstraction for avoiding allocation from a node: the nodemask. I was
wondering whether we can do the equivalent of the above using that.

i.e., __next_zones_zonelist() can do zref_in_nodemask(z,
default_exclude_nodemask) and decide whether to use the specific zone
or not.

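The nodemask check suggested above could look roughly like the following
userspace sketch. It is not kernel code: nodemask_t and struct zoneref
are cut-down stand-ins for the kernel types, and mask_set_node(),
next_allowed_zone() and the exclude mask are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 64

/* Cut-down stand-in for the kernel's nodemask_t: one bit per NUMA node. */
typedef struct { uint64_t bits; } nodemask_t;

/* Cut-down stand-in for struct zoneref: only the owning node id is kept. */
struct zoneref {
	int nid;	/* node this zone belongs to; -1 terminates the list */
};

/* Hypothetical helper: mark a node as a member of the mask. */
static inline void mask_set_node(int nid, nodemask_t *mask)
{
	assert(nid >= 0 && nid < MAX_NODES);
	mask->bits |= UINT64_C(1) << nid;
}

/* Stand-in for zref_in_nodemask(): is the zone's node in the mask? */
static inline bool zref_in_nodemask(const struct zoneref *z,
				    const nodemask_t *mask)
{
	return (mask->bits >> z->nid) & 1;
}

/*
 * Hypothetical iterator step: return the first zoneref at or after z
 * whose node is NOT in the exclude mask, or the terminating entry if
 * no such zone is left.
 */
static inline const struct zoneref *
next_allowed_zone(const struct zoneref *z, const nodemask_t *exclude)
{
	while (z->nid >= 0 && zref_in_nodemask(z, exclude))
		z++;
	return z;
}
```

With node 2 in the exclude mask, walking a zonelist whose entries sit on
nodes 2, 0, 2, 1 would hand back only the zones on nodes 0 and 1. Device
init code would only need to add its node to the mask.
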
That way we don't add special code like

+	PGDAT_DRAM,			/* Volatile DRAM memory node */
+	PGDAT_PMEM,			/* Persistent memory node */

The reason is that there could be other device memory that would also
want to be excluded from the default allocation, just as you are doing
for PMEM.

-aneesh