From: Anshuman Khandual <khandual@linux.vnet.ibm.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: mhocko@suse.com, vbabka@suse.cz, mgorman@suse.de,
minchan@kernel.org, aneesh.kumar@linux.vnet.ibm.com,
bsingharora@gmail.com, srikar@linux.vnet.ibm.com,
haren@linux.vnet.ibm.com, jglisse@redhat.com,
dave.hansen@intel.com
Subject: [RFC 2/4] mm/cpuset: Exclude coherent device memory nodes from mems_allowed
Date: Tue, 22 Nov 2016 19:49:38 +0530
Message-ID: <1479824388-30446-3-git-send-email-khandual@linux.vnet.ibm.com>
In-Reply-To: <1479824388-30446-1-git-send-email-khandual@linux.vnet.ibm.com>

A task's mems_allowed determines the final mask of nodes from which memory
can be allocated, irrespective of any process or VMA based memory policy.
Coherent device memory nodes should not be used for any user space memory
allocation, hence they should not be part of any user space mems_allowed
mask to begin with. This adds a new function ram_nodemask() which computes
the system RAM only node mask by excluding all the coherent device memory
nodes on the platform. The resulting system RAM node mask is used instead
of the N_MEMORY node mask during cpuset updates and mems_allowed
initialization, which isolates coherent device memory nodes from user
space allocations.

Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
---
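For illustration only, here is a minimal userspace sketch of the mask
arithmetic described above. It is not kernel code: nodemask_model_t,
struct cpuset_model, ram_nodemask_model() and
guarantee_online_mems_model() are hypothetical stand-ins that model
nodemask_t as a plain 64-bit word, and the sketch assumes the root
cpuset always contains at least one system RAM node.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: one bit per NUMA node. */
typedef uint64_t nodemask_model_t;

/*
 * Models nodes_andnot(dst, N_MEMORY, N_COHERENT_DEVICE): keep every
 * node that has memory, minus the coherent device memory nodes.
 */
static nodemask_model_t ram_nodemask_model(nodemask_model_t n_memory,
					   nodemask_model_t n_cdm)
{
	return n_memory & ~n_cdm;
}

/* Hypothetical, trimmed-down cpuset: only what the walk below needs. */
struct cpuset_model {
	nodemask_model_t effective_mems;
	const struct cpuset_model *parent;
};

/*
 * Models the patched guarantee_online_mems(): climb towards the root
 * until a cpuset intersects the system RAM mask, then return that
 * intersection. Assumes the root cpuset holds at least one RAM node.
 */
static nodemask_model_t
guarantee_online_mems_model(const struct cpuset_model *cs,
			    nodemask_model_t ram_nodes)
{
	while (!(cs->effective_mems & ram_nodes)) {
		cs = cs->parent;
		assert(cs != NULL);	/* walked past the root: bad model input */
	}
	return cs->effective_mems & ram_nodes;
}
```

With nodes 0-3 having memory (0xf) and node 3 being coherent device
memory (0x8), ram_nodemask_model() yields 0x7. A child cpuset whose
effective_mems contains only node 3 then falls back to its parent's
RAM nodes, so the device node never reaches a task's mems_allowed.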
 include/linux/mm.h   |  1 +
 include/linux/node.h | 12 ++++++++++++
 kernel/cpuset.c      | 12 +++++++-----
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a92c8d7..c40b454 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -446,6 +446,7 @@ static inline int put_page_testzero(struct page *page)
 	return page_ref_dec_and_test(page);
 }
+
 /*
  * Try to grab a ref unless the page has a refcount of zero, return false if
  * that is the case.
diff --git a/include/linux/node.h b/include/linux/node.h
index fc319de..99978f9 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -87,4 +87,16 @@ static inline void register_hugetlbfs_with_node(node_registration_func_t reg,
 static inline int arch_check_node_cdm(int nid) {return 0;}
 #endif
+static inline nodemask_t ram_nodemask(void)
+{
+#ifdef CONFIG_COHERENT_DEVICE
+	nodemask_t ram_nodes;
+
+	nodes_clear(ram_nodes);
+	nodes_andnot(ram_nodes, node_states[N_MEMORY], node_states[N_COHERENT_DEVICE]);
+	return ram_nodes;
+#else
+	return node_states[N_MEMORY];
+#endif
+}
 #endif /* _LINUX_NODE_H_ */
diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 29f815d..bdbe847 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -364,9 +364,11 @@ static void guarantee_online_cpus(struct cpuset *cs, struct cpumask *pmask)
  */
 static void guarantee_online_mems(struct cpuset *cs, nodemask_t *pmask)
 {
-	while (!nodes_intersects(cs->effective_mems, node_states[N_MEMORY]))
+	nodemask_t ram_nodes = ram_nodemask();
+
+	while (!nodes_intersects(cs->effective_mems, ram_nodes))
 		cs = parent_cs(cs);
-	nodes_and(*pmask, cs->effective_mems, node_states[N_MEMORY]);
+	nodes_and(*pmask, cs->effective_mems, ram_nodes);
 }
 /*
@@ -2301,7 +2303,7 @@ static void cpuset_hotplug_workfn(struct work_struct *work)
 	/* fetch the available cpus/mems and find out which changed how */
 	cpumask_copy(&new_cpus, cpu_active_mask);
-	new_mems = node_states[N_MEMORY];
+	new_mems = ram_nodemask();
 	cpus_updated = !cpumask_equal(top_cpuset.effective_cpus, &new_cpus);
 	mems_updated = !nodes_equal(top_cpuset.effective_mems, new_mems);
@@ -2393,11 +2395,11 @@ static int cpuset_track_online_nodes(struct notifier_block *self,
 void __init cpuset_init_smp(void)
 {
 	cpumask_copy(top_cpuset.cpus_allowed, cpu_active_mask);
-	top_cpuset.mems_allowed = node_states[N_MEMORY];
+	top_cpuset.mems_allowed = ram_nodemask();
 	top_cpuset.old_mems_allowed = top_cpuset.mems_allowed;
 	cpumask_copy(top_cpuset.effective_cpus, cpu_active_mask);
-	top_cpuset.effective_mems = node_states[N_MEMORY];
+	top_cpuset.effective_mems = ram_nodemask();
 	register_hotmemory_notifier(&cpuset_track_online_nodes_nb);
--
1.8.3.1