linux-mm.kvack.org archive mirror
* [V5 PATCH 02/26] memory_hotplug: handle empty zone when online_movable/online_kernel
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
@ 2012-10-29 15:20 ` Lai Jiangshan
  2012-10-29 15:20 ` [V5 PATCH 03/26] memory_hotplug: ensure every online node has NORMAL memory Lai Jiangshan
                   ` (15 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:20 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Wen Congyang,
	linux-mm

Allow online_movable/online_kernel to empty a zone, or to move memory
into an empty zone.
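
For illustration only (this is not part of the patch): a minimal
userspace sketch of the intended behaviour. It assumes a hypothetical
struct zone_span that keeps just the two span fields, and it ignores
locking and the wait_table initialization. The upper-bound check in
move_range_left() is an assumption added for the sketch. It is meant to
show that moving a whole range to the left can leave z2 empty and can
populate a previously empty z1.

```c
#include <assert.h>

/* Hypothetical stand-in for the two struct zone fields the patch touches. */
struct zone_span {
	unsigned long zone_start_pfn;
	unsigned long spanned_pages;
};

/* Same branch structure as the patched resize_zone(), without locking. */
static void resize_span(struct zone_span *z, unsigned long start_pfn,
			unsigned long end_pfn)
{
	assert(end_pfn >= start_pfn);
	if (end_pfn - start_pfn) {
		z->zone_start_pfn = start_pfn;
		z->spanned_pages = end_pfn - start_pfn;
	} else {
		/* empty zone: keep start_pfn = 0 when spanned_pages = 0 */
		z->zone_start_pfn = 0;
		z->spanned_pages = 0;
	}
}

/* Move [start_pfn, end_pfn) from the low end of z2 to the high end of z1. */
static int move_range_left(struct zone_span *z1, struct zone_span *z2,
			   unsigned long start_pfn, unsigned long end_pfn)
{
	unsigned long z2_end_pfn = z2->zone_start_pfn + z2->spanned_pages;
	unsigned long z1_start_pfn;

	/* assumed sanity check for this sketch: stay inside z2 */
	if (end_pfn > z2_end_pfn)
		return -1;
	/* the range must overlap z2 */
	if (end_pfn <= z2->zone_start_pfn)
		return -1;

	/* use start_pfn for z1's start_pfn if z1 is empty */
	if (z1->spanned_pages)
		z1_start_pfn = z1->zone_start_pfn;
	else
		z1_start_pfn = start_pfn;

	resize_span(z1, z1_start_pfn, end_pfn);
	resize_span(z2, end_pfn, z2_end_pfn);
	return 0;
}
```

With z1 empty and z2 spanning pfns [100, 150), calling
move_range_left(&z1, &z2, 100, 150) leaves z1 spanning [100, 150) and
z2 with zone_start_pfn = 0 and spanned_pages = 0.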

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memory_hotplug.c |   51 +++++++++++++++++++++++++++++++++++++++++++++------
 1 files changed, 45 insertions(+), 6 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6d3bec4..bdcdaf6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -227,8 +227,17 @@ static void resize_zone(struct zone *zone, unsigned long start_pfn,
 
 	zone_span_writelock(zone);
 
-	zone->zone_start_pfn = start_pfn;
-	zone->spanned_pages = end_pfn - start_pfn;
+	if (end_pfn - start_pfn) {
+		zone->zone_start_pfn = start_pfn;
+		zone->spanned_pages = end_pfn - start_pfn;
+	} else {
+		/*
+		 * keep this consistent with free_area_init_core():
+		 * if spanned_pages = 0, then keep start_pfn = 0
+		 */
+		zone->zone_start_pfn = 0;
+		zone->spanned_pages = 0;
+	}
 
 	zone_span_writeunlock(zone);
 }
@@ -244,10 +253,19 @@ static void fix_zone_id(struct zone *zone, unsigned long start_pfn,
 		set_page_links(pfn_to_page(pfn), zid, nid, pfn);
 }
 
-static int move_pfn_range_left(struct zone *z1, struct zone *z2,
+static int __meminit move_pfn_range_left(struct zone *z1, struct zone *z2,
 		unsigned long start_pfn, unsigned long end_pfn)
 {
+	int ret;
 	unsigned long flags;
+	unsigned long z1_start_pfn;
+
+	if (!z1->wait_table) {
+		ret = init_currently_empty_zone(z1, start_pfn,
+			end_pfn - start_pfn, MEMMAP_HOTPLUG);
+		if (ret)
+			return ret;
+	}
 
 	pgdat_resize_lock(z1->zone_pgdat, &flags);
 
@@ -261,7 +279,13 @@ static int move_pfn_range_left(struct zone *z1, struct zone *z2,
 	if (end_pfn <= z2->zone_start_pfn)
 		goto out_fail;
 
-	resize_zone(z1, z1->zone_start_pfn, end_pfn);
+	/* use start_pfn for z1's start_pfn if z1 is empty */
+	if (z1->spanned_pages)
+		z1_start_pfn = z1->zone_start_pfn;
+	else
+		z1_start_pfn = start_pfn;
+
+	resize_zone(z1, z1_start_pfn, end_pfn);
 	resize_zone(z2, end_pfn, z2->zone_start_pfn + z2->spanned_pages);
 
 	pgdat_resize_unlock(z1->zone_pgdat, &flags);
@@ -274,10 +298,19 @@ out_fail:
 	return -1;
 }
 
-static int move_pfn_range_right(struct zone *z1, struct zone *z2,
+static int __meminit move_pfn_range_right(struct zone *z1, struct zone *z2,
 		unsigned long start_pfn, unsigned long end_pfn)
 {
+	int ret;
 	unsigned long flags;
+	unsigned long z2_end_pfn;
+
+	if (!z2->wait_table) {
+		ret = init_currently_empty_zone(z2, start_pfn,
+			end_pfn - start_pfn, MEMMAP_HOTPLUG);
+		if (ret)
+			return ret;
+	}
 
 	pgdat_resize_lock(z1->zone_pgdat, &flags);
 
@@ -291,8 +324,14 @@ static int move_pfn_range_right(struct zone *z1, struct zone *z2,
 	if (start_pfn >= z1->zone_start_pfn + z1->spanned_pages)
 		goto out_fail;
 
+	/* use end_pfn for z2's end_pfn if z2 is empty */
+	if (z2->spanned_pages)
+		z2_end_pfn = z2->zone_start_pfn + z2->spanned_pages;
+	else
+		z2_end_pfn = end_pfn;
+
 	resize_zone(z1, z1->zone_start_pfn, start_pfn);
-	resize_zone(z2, start_pfn, z2->zone_start_pfn + z2->spanned_pages);
+	resize_zone(z2, start_pfn, z2_end_pfn);
 
 	pgdat_resize_unlock(z1->zone_pgdat, &flags);
 
-- 
1.7.4.4

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* [V5 PATCH 03/26] memory_hotplug: ensure every online node has NORMAL memory
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
  2012-10-29 15:20 ` [V5 PATCH 02/26] memory_hotplug: handle empty zone when online_movable/online_kernel Lai Jiangshan
@ 2012-10-29 15:20 ` Lai Jiangshan
  2012-10-29 15:20 ` [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:20 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Wen Congyang,
	linux-mm

The old memory hotplug code and the new online_movable support may leave
an online node without any normal memory, but memory management
misbehaves when an online node has no normal memory.
For example, a task bound to such a node may fail every kernel
allocation, so it cannot create new tasks or other kernel objects.

So we disallow nodes without normal memory here; we will allow them
once the rest of the kernel is prepared for them.
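
For illustration only (not part of the patch): a userspace sketch of the
offline-side check, assuming a hypothetical four-zone layout and a plain
array of per-zone present_pages in place of struct pglist_data. The real
zone list depends on the kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical zone layout for the sketch. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE, NR_ZONES };

/*
 * Return true if offlining nr_pages of normal-or-lower memory is allowed:
 * either some normal memory remains afterwards, or the node has no
 * higher (highmem/movable) memory left at all.
 */
static bool can_offline_normal(const unsigned long present_pages[NR_ZONES],
			       unsigned long nr_pages)
{
	unsigned long low = 0, high = 0;
	int zt;

	assert(present_pages);

	for (zt = 0; zt <= ZONE_NORMAL; zt++)
		low += present_pages[zt];

	if (low > nr_pages)
		return true;

	for (; zt <= ZONE_MOVABLE; zt++)
		high += present_pages[zt];

	/* the last normal memory may go only after all higher memory is gone */
	return high == 0;
}
```

On a node with 1000 normal pages and 500 movable pages, offlining 400
normal pages is allowed, but offlining all 1000 is refused until the
movable pages have been offlined first.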


Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memory_hotplug.c |   40 ++++++++++++++++++++++++++++++++++++++++
 1 files changed, 40 insertions(+), 0 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index bdcdaf6..9af9641 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -589,6 +589,12 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 	return 0;
 }
 
+/* ensure every online node has NORMAL memory */
+static bool can_online_high_movable(struct zone *zone)
+{
+	return node_state(zone_to_nid(zone), N_NORMAL_MEMORY);
+}
+
 /* check which state of node_states will be changed when online memory */
 static void node_states_check_changes_online(unsigned long nr_pages,
 	struct zone *zone, struct memory_notify *arg)
@@ -654,6 +660,12 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 	 */
 	zone = page_zone(pfn_to_page(pfn));
 
+	if ((zone_idx(zone) > ZONE_NORMAL || online_type == ONLINE_MOVABLE) &&
+	    !can_online_high_movable(zone)) {
+		unlock_memory_hotplug();
+		return -1;
+	}
+
 	if (online_type == ONLINE_KERNEL && zone_idx(zone) == ZONE_MOVABLE) {
 		if (move_pfn_range_left(zone - 1, zone, pfn, pfn + nr_pages)) {
 			unlock_memory_hotplug();
@@ -1058,6 +1070,30 @@ check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
 	return offlined;
 }
 
+/* ensure the node has NORMAL memory if it is still online */
+static bool can_offline_normal(struct zone *zone, unsigned long nr_pages)
+{
+	struct pglist_data *pgdat = zone->zone_pgdat;
+	unsigned long present_pages = 0;
+	enum zone_type zt;
+
+	for (zt = 0; zt <= ZONE_NORMAL; zt++)
+		present_pages += pgdat->node_zones[zt].present_pages;
+
+	if (present_pages > nr_pages)
+		return true;
+
+	present_pages = 0;
+	for (; zt <= ZONE_MOVABLE; zt++)
+		present_pages += pgdat->node_zones[zt].present_pages;
+
+	/*
+	 * we can't offline the last normal memory until all
+	 * higher memory is offlined.
+	 */
+	return present_pages == 0;
+}
+
 /* check which state of node_states will be changed when offline memory */
 static void node_states_check_changes_offline(unsigned long nr_pages,
 		struct zone *zone, struct memory_notify *arg)
@@ -1145,6 +1181,10 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	node = zone_to_nid(zone);
 	nr_pages = end_pfn - start_pfn;
 
+	ret = -EINVAL;
+	if (zone_idx(zone) <= ZONE_NORMAL && !can_offline_normal(zone, nr_pages))
+		goto out;
+
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE, true);
 	if (ret)
-- 
1.7.4.4


* [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
  2012-10-29 15:20 ` [V5 PATCH 02/26] memory_hotplug: handle empty zone when online_movable/online_kernel Lai Jiangshan
  2012-10-29 15:20 ` [V5 PATCH 03/26] memory_hotplug: ensure every online node has NORMAL memory Lai Jiangshan
@ 2012-10-29 15:20 ` Lai Jiangshan
  2012-10-29 16:22   ` Michal Hocko
  2012-10-31 13:18   ` Michal Hocko
  2012-10-29 15:20 ` [V5 PATCH 09/26] oom: " Lai Jiangshan
                   ` (13 subsequent siblings)
  16 siblings, 2 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:20 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Johannes Weiner,
	Michal Hocko, Balbir Singh, Tejun Heo, Li Zefan, cgroups,
	linux-mm, containers

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle every node that has memory, so it should
use N_MEMORY instead.
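
For illustration only (not part of the patch): a userspace sketch of why
the choice of mask matters once a node can hold only movable memory. The
struct node_mem type and the plain unsigned int bitmasks are assumptions
made for the sketch; they stand in for the kernel's per-node zones and
nodemask_t.

```c
#include <assert.h>

/* Hypothetical per-node page counts, split by kind of memory. */
struct node_mem {
	unsigned long normal;
	unsigned long high;
	unsigned long movable;
};

/* Nodes with normal or high memory (the N_HIGH_MEMORY meaning above). */
static unsigned int mask_high_memory(const struct node_mem *nodes, int n)
{
	unsigned int mask = 0;
	int nid;

	assert(n >= 0 && n <= 32);
	for (nid = 0; nid < n; nid++)
		if (nodes[nid].normal || nodes[nid].high)
			mask |= 1u << nid;
	return mask;
}

/* Nodes with any memory at all (the N_MEMORY meaning above). */
static unsigned int mask_memory(const struct node_mem *nodes, int n)
{
	unsigned int mask = 0;
	int nid;

	assert(n >= 0 && n <= 32);
	for (nid = 0; nid < n; nid++)
		if (nodes[nid].normal || nodes[nid].high || nodes[nid].movable)
			mask |= 1u << nid;
	return mask;
}

/* Sum all pages on the nodes selected by mask. */
static unsigned long total_pages(const struct node_mem *nodes, int n,
				 unsigned int mask)
{
	unsigned long total = 0;
	int nid;

	assert(n >= 0 && n <= 32);
	for (nid = 0; nid < n; nid++)
		if (mask & (1u << nid))
			total += nodes[nid].normal + nodes[nid].high +
				 nodes[nid].movable;
	return total;
}
```

If node 0 has 1000 normal pages and node 1 has only 2000 movable pages,
a walk over the first mask counts 1000 pages and skips node 1 entirely,
while a walk over the second mask counts all 3000.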

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memcontrol.c  |   18 +++++++++---------
 mm/page_cgroup.c |    2 +-
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7acf43b..1b69665 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -800,7 +800,7 @@ static unsigned long mem_cgroup_nr_lru_pages(struct mem_cgroup *memcg,
 	int nid;
 	u64 total = 0;
 
-	for_each_node_state(nid, N_HIGH_MEMORY)
+	for_each_node_state(nid, N_MEMORY)
 		total += mem_cgroup_node_nr_lru_pages(memcg, nid, lru_mask);
 	return total;
 }
@@ -1611,9 +1611,9 @@ static void mem_cgroup_may_update_nodemask(struct mem_cgroup *memcg)
 		return;
 
 	/* make a nodemask where this memcg uses memory from */
-	memcg->scan_nodes = node_states[N_HIGH_MEMORY];
+	memcg->scan_nodes = node_states[N_MEMORY];
 
-	for_each_node_mask(nid, node_states[N_HIGH_MEMORY]) {
+	for_each_node_mask(nid, node_states[N_MEMORY]) {
 
 		if (!test_mem_cgroup_node_reclaimable(memcg, nid, false))
 			node_clear(nid, memcg->scan_nodes);
@@ -1684,7 +1684,7 @@ static bool mem_cgroup_reclaimable(struct mem_cgroup *memcg, bool noswap)
 	/*
 	 * Check rest of nodes.
 	 */
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		if (node_isset(nid, memcg->scan_nodes))
 			continue;
 		if (test_mem_cgroup_node_reclaimable(memcg, nid, noswap))
@@ -3759,7 +3759,7 @@ move_account:
 		drain_all_stock_sync(memcg);
 		ret = 0;
 		mem_cgroup_start_move(memcg);
-		for_each_node_state(node, N_HIGH_MEMORY) {
+		for_each_node_state(node, N_MEMORY) {
 			for (zid = 0; !ret && zid < MAX_NR_ZONES; zid++) {
 				enum lru_list lru;
 				for_each_lru(lru) {
@@ -4087,7 +4087,7 @@ static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	total_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL);
 	seq_printf(m, "total=%lu", total_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid, LRU_ALL);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
 	}
@@ -4095,7 +4095,7 @@ static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	file_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL_FILE);
 	seq_printf(m, "file=%lu", file_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				LRU_ALL_FILE);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
@@ -4104,7 +4104,7 @@ static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	anon_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL_ANON);
 	seq_printf(m, "anon=%lu", anon_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				LRU_ALL_ANON);
 		seq_printf(m, " N%d=%lu", nid, node_nr);
@@ -4113,7 +4113,7 @@ static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
 
 	unevictable_nr = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_UNEVICTABLE));
 	seq_printf(m, "unevictable=%lu", unevictable_nr);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
 				BIT(LRU_UNEVICTABLE));
 		seq_printf(m, " N%d=%lu", nid, node_nr);
diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index 5ddad0c..c1054ad 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -271,7 +271,7 @@ void __init page_cgroup_init(void)
 	if (mem_cgroup_disabled())
 		return;
 
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long start_pfn, end_pfn;
 
 		start_pfn = node_start_pfn(nid);
-- 
1.7.4.4


* [V5 PATCH 09/26] oom: use N_MEMORY instead N_HIGH_MEMORY
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (2 preceding siblings ...)
  2012-10-29 15:20 ` [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
@ 2012-10-29 15:20 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 10/26] mm,migrate: " Lai Jiangshan
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:20 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Michal Hocko,
	KOSAKI Motohiro, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle every node that has memory, so it should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 mm/oom_kill.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 79e0f3e..aa2d89c 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -257,7 +257,7 @@ static enum oom_constraint constrained_alloc(struct zonelist *zonelist,
 	 * the page allocator means a mempolicy is in effect.  Cpuset policy
 	 * is enforced in get_page_from_freelist().
 	 */
-	if (nodemask && !nodes_subset(node_states[N_HIGH_MEMORY], *nodemask)) {
+	if (nodemask && !nodes_subset(node_states[N_MEMORY], *nodemask)) {
 		*totalpages = total_swap_pages;
 		for_each_node_mask(nid, *nodemask)
 			*totalpages += node_spanned_pages(nid);
-- 
1.7.4.4


* [V5 PATCH 10/26] mm,migrate: use N_MEMORY instead N_HIGH_MEMORY
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (3 preceding siblings ...)
  2012-10-29 15:20 ` [V5 PATCH 09/26] oom: " Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 11/26] mempolicy: " Lai Jiangshan
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Michal Hocko,
	Hugh Dickins, Christoph Lameter, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle every node that has memory, so it should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux.com>
---
 mm/migrate.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 77ed2d7..d595e58 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1201,7 +1201,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes,
 			if (node < 0 || node >= MAX_NUMNODES)
 				goto out_pm;
 
-			if (!node_state(node, N_HIGH_MEMORY))
+			if (!node_state(node, N_MEMORY))
 				goto out_pm;
 
 			err = -EACCES;
-- 
1.7.4.4


* [V5 PATCH 11/26] mempolicy: use N_MEMORY instead N_HIGH_MEMORY
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (4 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 10/26] mm,migrate: " Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 12/26] hugetlb: " Lai Jiangshan
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, KOSAKI Motohiro,
	Christoph Lameter, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle every node that has memory, so it should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/mempolicy.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d04a8a5..d4a084c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -212,9 +212,9 @@ static int mpol_set_nodemask(struct mempolicy *pol,
 	/* if mode is MPOL_DEFAULT, pol is NULL. This is right. */
 	if (pol == NULL)
 		return 0;
-	/* Check N_HIGH_MEMORY */
+	/* Check N_MEMORY */
 	nodes_and(nsc->mask1,
-		  cpuset_current_mems_allowed, node_states[N_HIGH_MEMORY]);
+		  cpuset_current_mems_allowed, node_states[N_MEMORY]);
 
 	VM_BUG_ON(!nodes);
 	if (pol->mode == MPOL_PREFERRED && nodes_empty(*nodes))
@@ -1388,7 +1388,7 @@ SYSCALL_DEFINE4(migrate_pages, pid_t, pid, unsigned long, maxnode,
 		goto out_put;
 	}
 
-	if (!nodes_subset(*new, node_states[N_HIGH_MEMORY])) {
+	if (!nodes_subset(*new, node_states[N_MEMORY])) {
 		err = -EINVAL;
 		goto out_put;
 	}
@@ -2361,7 +2361,7 @@ void __init numa_policy_init(void)
 	 * fall back to the largest node if they're all smaller.
 	 */
 	nodes_clear(interleave_nodes);
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long total_pages = node_present_pages(nid);
 
 		/* Preserve the largest node */
@@ -2442,7 +2442,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
 		*nodelist++ = '\0';
 		if (nodelist_parse(nodelist, nodes))
 			goto out;
-		if (!nodes_subset(nodes, node_states[N_HIGH_MEMORY]))
+		if (!nodes_subset(nodes, node_states[N_MEMORY]))
 			goto out;
 	} else
 		nodes_clear(nodes);
@@ -2476,7 +2476,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
 		 * Default to online nodes with memory if no nodelist
 		 */
 		if (!nodelist)
-			nodes = node_states[N_HIGH_MEMORY];
+			nodes = node_states[N_MEMORY];
 		break;
 	case MPOL_LOCAL:
 		/*
-- 
1.7.4.4


* [V5 PATCH 12/26] hugetlb: use N_MEMORY instead N_HIGH_MEMORY
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (5 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 11/26] mempolicy: " Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 13/26] vmstat: " Lai Jiangshan
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Greg Kroah-Hartman,
	Michal Hocko, Hillf Danton, Aneesh Kumar K.V, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle every node that has memory, so it should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 drivers/base/node.c |    2 +-
 mm/hugetlb.c        |   24 ++++++++++++------------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 5d7731e..4c3aa7c 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -227,7 +227,7 @@ static node_registration_func_t __hugetlb_unregister_node;
 static inline bool hugetlb_register_node(struct node *node)
 {
 	if (__hugetlb_register_node &&
-			node_state(node->dev.id, N_HIGH_MEMORY)) {
+			node_state(node->dev.id, N_MEMORY)) {
 		__hugetlb_register_node(node);
 		return true;
 	}
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 59a0059..7720ade 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1057,7 +1057,7 @@ static void return_unused_surplus_pages(struct hstate *h,
 	 * on-line nodes with memory and will handle the hstate accounting.
 	 */
 	while (nr_pages--) {
-		if (!free_pool_huge_page(h, &node_states[N_HIGH_MEMORY], 1))
+		if (!free_pool_huge_page(h, &node_states[N_MEMORY], 1))
 			break;
 	}
 }
@@ -1180,14 +1180,14 @@ static struct page *alloc_huge_page(struct vm_area_struct *vma,
 int __weak alloc_bootmem_huge_page(struct hstate *h)
 {
 	struct huge_bootmem_page *m;
-	int nr_nodes = nodes_weight(node_states[N_HIGH_MEMORY]);
+	int nr_nodes = nodes_weight(node_states[N_MEMORY]);
 
 	while (nr_nodes) {
 		void *addr;
 
 		addr = __alloc_bootmem_node_nopanic(
 				NODE_DATA(hstate_next_node_to_alloc(h,
-						&node_states[N_HIGH_MEMORY])),
+						&node_states[N_MEMORY])),
 				huge_page_size(h), huge_page_size(h), 0);
 
 		if (addr) {
@@ -1259,7 +1259,7 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 			if (!alloc_bootmem_huge_page(h))
 				break;
 		} else if (!alloc_fresh_huge_page(h,
-					 &node_states[N_HIGH_MEMORY]))
+					 &node_states[N_MEMORY]))
 			break;
 	}
 	h->max_huge_pages = i;
@@ -1527,7 +1527,7 @@ static ssize_t nr_hugepages_store_common(bool obey_mempolicy,
 		if (!(obey_mempolicy &&
 				init_nodemask_of_mempolicy(nodes_allowed))) {
 			NODEMASK_FREE(nodes_allowed);
-			nodes_allowed = &node_states[N_HIGH_MEMORY];
+			nodes_allowed = &node_states[N_MEMORY];
 		}
 	} else if (nodes_allowed) {
 		/*
@@ -1537,11 +1537,11 @@ static ssize_t nr_hugepages_store_common(bool obey_mempolicy,
 		count += h->nr_huge_pages - h->nr_huge_pages_node[nid];
 		init_nodemask_of_node(nodes_allowed, nid);
 	} else
-		nodes_allowed = &node_states[N_HIGH_MEMORY];
+		nodes_allowed = &node_states[N_MEMORY];
 
 	h->max_huge_pages = set_max_huge_pages(h, count, nodes_allowed);
 
-	if (nodes_allowed != &node_states[N_HIGH_MEMORY])
+	if (nodes_allowed != &node_states[N_MEMORY])
 		NODEMASK_FREE(nodes_allowed);
 
 	return len;
@@ -1844,7 +1844,7 @@ static void hugetlb_register_all_nodes(void)
 {
 	int nid;
 
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		struct node *node = &node_devices[nid];
 		if (node->dev.id == nid)
 			hugetlb_register_node(node);
@@ -1939,8 +1939,8 @@ void __init hugetlb_add_hstate(unsigned order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	h->next_nid_to_alloc = first_node(node_states[N_HIGH_MEMORY]);
-	h->next_nid_to_free = first_node(node_states[N_HIGH_MEMORY]);
+	h->next_nid_to_alloc = first_node(node_states[N_MEMORY]);
+	h->next_nid_to_free = first_node(node_states[N_MEMORY]);
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/1024);
 	/*
@@ -2035,11 +2035,11 @@ static int hugetlb_sysctl_handler_common(bool obey_mempolicy,
 		if (!(obey_mempolicy &&
 			       init_nodemask_of_mempolicy(nodes_allowed))) {
 			NODEMASK_FREE(nodes_allowed);
-			nodes_allowed = &node_states[N_HIGH_MEMORY];
+			nodes_allowed = &node_states[N_MEMORY];
 		}
 		h->max_huge_pages = set_max_huge_pages(h, tmp, nodes_allowed);
 
-		if (nodes_allowed != &node_states[N_HIGH_MEMORY])
+		if (nodes_allowed != &node_states[N_MEMORY])
 			NODEMASK_FREE(nodes_allowed);
 	}
 out:
-- 
1.7.4.4


* [V5 PATCH 13/26] vmstat: use N_MEMORY instead N_HIGH_MEMORY
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (6 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 12/26] hugetlb: " Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 16/26] vmscan: " Lai Jiangshan
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Mel Gorman,
	Christoph Lameter, Minchan Kim, Johannes Weiner, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle every node that has memory, so it should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux.com>
---
 mm/vmstat.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmstat.c b/mm/vmstat.c
index c737057..1b5cacd 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -930,7 +930,7 @@ static int pagetypeinfo_show(struct seq_file *m, void *arg)
 	pg_data_t *pgdat = (pg_data_t *)arg;
 
 	/* check memoryless node */
-	if (!node_state(pgdat->node_id, N_HIGH_MEMORY))
+	if (!node_state(pgdat->node_id, N_MEMORY))
 		return 0;
 
 	seq_printf(m, "Page block order: %d\n", pageblock_order);
@@ -1292,7 +1292,7 @@ static int unusable_show(struct seq_file *m, void *arg)
 	pg_data_t *pgdat = (pg_data_t *)arg;
 
 	/* check memoryless node */
-	if (!node_state(pgdat->node_id, N_HIGH_MEMORY))
+	if (!node_state(pgdat->node_id, N_MEMORY))
 		return 0;
 
 	walk_zones_in_node(m, pgdat, unusable_show_print);
-- 
1.7.4.4


* [V5 PATCH 16/26] vmscan: use N_MEMORY instead N_HIGH_MEMORY
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (7 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 13/26] vmstat: " Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 17/26] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
                   ` (7 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Minchan Kim,
	Hugh Dickins, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle every node that has memory, so it should
use N_MEMORY instead.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 mm/vmscan.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2624edc..98a2e11 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3135,7 +3135,7 @@ static int __devinit cpu_callback(struct notifier_block *nfb,
 	int nid;
 
 	if (action == CPU_ONLINE || action == CPU_ONLINE_FROZEN) {
-		for_each_node_state(nid, N_HIGH_MEMORY) {
+		for_each_node_state(nid, N_MEMORY) {
 			pg_data_t *pgdat = NODE_DATA(nid);
 			const struct cpumask *mask;
 
@@ -3191,7 +3191,7 @@ static int __init kswapd_init(void)
 	int nid;
 
 	swap_setup();
-	for_each_node_state(nid, N_HIGH_MEMORY)
+	for_each_node_state(nid, N_MEMORY)
  		kswapd_run(nid);
 	hotcpu_notifier(cpu_callback, 0);
 	return 0;
-- 
1.7.4.4


* [V5 PATCH 17/26] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (8 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 16/26] vmscan: " Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 18/26] hotplug: update nodemasks management Lai Jiangshan
                   ` (6 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Thomas Gleixner,
	Ingo Molnar, H. Peter Anvin, Tejun Heo, Pekka Enberg, Minchan Kim,
	Michal Hocko, linux-mm

N_HIGH_MEMORY stands for the nodes that have normal or high memory.
N_MEMORY stands for the nodes that have any memory.

The code here needs to handle every node that has memory, so it should
use N_MEMORY instead.

Since N_MEMORY has been introduced, also update the initialization of
node_states.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 arch/x86/mm/init_64.c |    4 +++-
 mm/page_alloc.c       |   40 ++++++++++++++++++++++------------------
 2 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 3baff25..2ead3c8 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -630,7 +630,9 @@ void __init paging_init(void)
 	 *	 numa support is not compiled in, and later node_set_state
 	 *	 will not set it back.
 	 */
-	node_clear_state(0, N_NORMAL_MEMORY);
+	node_clear_state(0, N_MEMORY);
+	if (N_MEMORY != N_NORMAL_MEMORY)
+		node_clear_state(0, N_NORMAL_MEMORY);
 
 	zone_sizes_init();
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b1ef9b0..b70c929 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1692,7 +1692,7 @@ bool zone_watermark_ok_safe(struct zone *z, int order, unsigned long mark,
  *
  * If the zonelist cache is present in the passed in zonelist, then
  * returns a pointer to the allowed node mask (either the current
- * tasks mems_allowed, or node_states[N_HIGH_MEMORY].)
+ * tasks mems_allowed, or node_states[N_MEMORY].)
  *
  * If the zonelist cache is not available for this zonelist, does
  * nothing and returns NULL.
@@ -1721,7 +1721,7 @@ static nodemask_t *zlc_setup(struct zonelist *zonelist, int alloc_flags)
 
 	allowednodes = !in_interrupt() && (alloc_flags & ALLOC_CPUSET) ?
 					&cpuset_current_mems_allowed :
-					&node_states[N_HIGH_MEMORY];
+					&node_states[N_MEMORY];
 	return allowednodes;
 }
 
@@ -3194,7 +3194,7 @@ static int find_next_best_node(int node, nodemask_t *used_node_mask)
 		return node;
 	}
 
-	for_each_node_state(n, N_HIGH_MEMORY) {
+	for_each_node_state(n, N_MEMORY) {
 
 		/* Don't want a node to appear more than once */
 		if (node_isset(n, *used_node_mask))
@@ -3336,7 +3336,7 @@ static int default_zonelist_order(void)
  	 * local memory, NODE_ORDER may be suitable.
          */
 	average_size = total_size /
-				(nodes_weight(node_states[N_HIGH_MEMORY]) + 1);
+				(nodes_weight(node_states[N_MEMORY]) + 1);
 	for_each_online_node(nid) {
 		low_kmem_size = 0;
 		total_size = 0;
@@ -4669,7 +4669,7 @@ unsigned long __init find_min_pfn_with_active_regions(void)
 /*
  * early_calculate_totalpages()
  * Sum pages in active regions for movable zone.
- * Populate N_HIGH_MEMORY for calculating usable_nodes.
+ * Populate N_MEMORY for calculating usable_nodes.
  */
 static unsigned long __init early_calculate_totalpages(void)
 {
@@ -4682,7 +4682,7 @@ static unsigned long __init early_calculate_totalpages(void)
 
 		totalpages += pages;
 		if (pages)
-			node_set_state(nid, N_HIGH_MEMORY);
+			node_set_state(nid, N_MEMORY);
 	}
   	return totalpages;
 }
@@ -4699,9 +4699,9 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	unsigned long usable_startpfn;
 	unsigned long kernelcore_node, kernelcore_remaining;
 	/* save the state before borrow the nodemask */
-	nodemask_t saved_node_state = node_states[N_HIGH_MEMORY];
+	nodemask_t saved_node_state = node_states[N_MEMORY];
 	unsigned long totalpages = early_calculate_totalpages();
-	int usable_nodes = nodes_weight(node_states[N_HIGH_MEMORY]);
+	int usable_nodes = nodes_weight(node_states[N_MEMORY]);
 
 	/*
 	 * If movablecore was specified, calculate what size of
@@ -4736,7 +4736,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 restart:
 	/* Spread kernelcore memory as evenly as possible throughout nodes */
 	kernelcore_node = required_kernelcore / usable_nodes;
-	for_each_node_state(nid, N_HIGH_MEMORY) {
+	for_each_node_state(nid, N_MEMORY) {
 		unsigned long start_pfn, end_pfn;
 
 		/*
@@ -4828,23 +4828,27 @@ restart:
 
 out:
 	/* restore the node_state */
-	node_states[N_HIGH_MEMORY] = saved_node_state;
+	node_states[N_MEMORY] = saved_node_state;
 }
 
-/* Any regular memory on that node ? */
-static void __init check_for_regular_memory(pg_data_t *pgdat)
+/* Any regular or high memory on that node ? */
+static void check_for_memory(pg_data_t *pgdat, int nid)
 {
-#ifdef CONFIG_HIGHMEM
 	enum zone_type zone_type;
 
-	for (zone_type = 0; zone_type <= ZONE_NORMAL; zone_type++) {
+	if (N_MEMORY == N_NORMAL_MEMORY)
+		return;
+
+	for (zone_type = 0; zone_type <= ZONE_MOVABLE - 1; zone_type++) {
 		struct zone *zone = &pgdat->node_zones[zone_type];
 		if (zone->present_pages) {
-			node_set_state(zone_to_nid(zone), N_NORMAL_MEMORY);
+			node_set_state(nid, N_HIGH_MEMORY);
+			if (N_NORMAL_MEMORY != N_HIGH_MEMORY &&
+			    zone_type <= ZONE_NORMAL)
+				node_set_state(nid, N_NORMAL_MEMORY);
 			break;
 		}
 	}
-#endif
 }
 
 /**
@@ -4927,8 +4931,8 @@ void __init free_area_init_nodes(unsigned long *max_zone_pfn)
 
 		/* Any memory on that node */
 		if (pgdat->node_present_pages)
-			node_set_state(nid, N_HIGH_MEMORY);
-		check_for_regular_memory(pgdat);
+			node_set_state(nid, N_MEMORY);
+		check_for_memory(pgdat, nid);
 	}
 }
 
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [V5 PATCH 18/26] hotplug: update nodemasks management
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (9 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 17/26] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 19/26] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Rob Landley,
	Jianguo Wu, Kay Sievers, Wen Congyang, linux-doc, linux-mm

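Update the nodemask management for N_MEMORY: add status_change_nid_high
to struct memory_notify so that changes to N_HIGH_MEMORY and N_MEMORY are
tracked separately when memory goes online or offline.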
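
The online-side decision can be sketched in userspace as follows. This
assumes a configuration where N_NORMAL_MEMORY, N_HIGH_MEMORY and N_MEMORY
are all distinct; struct notify_model, struct node_flags and
check_changes_online() are hypothetical stand-ins, not the kernel's
definitions.

```c
#include <assert.h>
#include <stdbool.h>

enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, ZONE_MOVABLE };

/* Models the three nid fields of struct memory_notify after this patch. */
struct notify_model {
	int status_change_nid_normal;
	int status_change_nid_high;
	int status_change_nid;
};

/* Hypothetical per-node flags standing in for node_state(nid, N_*). */
struct node_flags {
	bool has_normal;
	bool has_high;
	bool has_memory;
};

/* Decide which node states would change if memory in 'zone' goes online. */
static void check_changes_online(int nid, enum zone_type zone,
				 const struct node_flags *f,
				 struct notify_model *arg)
{
	arg->status_change_nid_normal =
		(zone <= ZONE_NORMAL && !f->has_normal) ? nid : -1;
	arg->status_change_nid_high =
		(zone <= ZONE_HIGHMEM && !f->has_high) ? nid : -1;
	arg->status_change_nid = !f->has_memory ? nid : -1;
}
```

A notifier callback would then treat a value of -1 in a field as "this
nodemask does not change" and any value >= 0 as the affected node id.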

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/memory-hotplug.txt |    5 ++-
 include/linux/memory.h           |    1 +
 mm/memory_hotplug.c              |   87 +++++++++++++++++++++++++++++++-------
 3 files changed, 77 insertions(+), 16 deletions(-)

diff --git a/Documentation/memory-hotplug.txt b/Documentation/memory-hotplug.txt
index c6f993d..8e5eacb 100644
--- a/Documentation/memory-hotplug.txt
+++ b/Documentation/memory-hotplug.txt
@@ -390,6 +390,7 @@ struct memory_notify {
        unsigned long start_pfn;
        unsigned long nr_pages;
        int status_change_nid_normal;
+       int status_change_nid_high;
        int status_change_nid;
 }
 
@@ -397,7 +398,9 @@ start_pfn is start_pfn of online/offline memory.
 nr_pages is # of pages of online/offline memory.
 status_change_nid_normal is set node id when N_NORMAL_MEMORY of nodemask
 is (will be) set/clear, if this is -1, then nodemask status is not changed.
-status_change_nid is set node id when N_HIGH_MEMORY of nodemask is (will be)
+status_change_nid_high is set node id when N_HIGH_MEMORY of nodemask
+is (will be) set/clear, if this is -1, then nodemask status is not changed.
+status_change_nid is set node id when N_MEMORY of nodemask is (will be)
 set/clear. It means a new(memoryless) node gets new memory by online and a
 node loses all memory. If this is -1, then nodemask status is not changed.
 If status_changed_nid* >= 0, callback should create/discard structures for the
diff --git a/include/linux/memory.h b/include/linux/memory.h
index a09216d..45e93b4 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -54,6 +54,7 @@ struct memory_notify {
 	unsigned long start_pfn;
 	unsigned long nr_pages;
 	int status_change_nid_normal;
+	int status_change_nid_high;
 	int status_change_nid;
 };
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9af9641..a55b547 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -603,13 +603,15 @@ static void node_states_check_changes_online(unsigned long nr_pages,
 	enum zone_type zone_last = ZONE_NORMAL;
 
 	/*
-	 * If we have HIGHMEM, node_states[N_NORMAL_MEMORY] contains nodes
-	 * which have 0...ZONE_NORMAL, set zone_last to ZONE_NORMAL.
+	 * If we have HIGHMEM or movable node, node_states[N_NORMAL_MEMORY]
+	 * contains nodes which have zones of 0...ZONE_NORMAL,
+	 * set zone_last to ZONE_NORMAL.
 	 *
-	 * If we don't have HIGHMEM, node_states[N_NORMAL_MEMORY] contains nodes
-	 * which have 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
+	 * If we don't have HIGHMEM nor movable node,
+	 * node_states[N_NORMAL_MEMORY] contains nodes which have zones of
+	 * 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
 	 */
-	if (N_HIGH_MEMORY == N_NORMAL_MEMORY)
+	if (N_MEMORY == N_NORMAL_MEMORY)
 		zone_last = ZONE_MOVABLE;
 
 	/*
@@ -623,12 +625,34 @@ static void node_states_check_changes_online(unsigned long nr_pages,
 	else
 		arg->status_change_nid_normal = -1;
 
+#ifdef CONFIG_HIGHMEM
+	/*
+	 * If we have movable node, node_states[N_HIGH_MEMORY]
+	 * contains nodes which have zones of 0...ZONE_HIGHMEM,
+	 * set zone_last to ZONE_HIGHMEM.
+	 *
+	 * If we don't have movable node, node_states[N_HIGH_MEMORY]
+	 * contains nodes which have zones of 0...ZONE_MOVABLE,
+	 * set zone_last to ZONE_MOVABLE.
+	 */
+	zone_last = ZONE_HIGHMEM;
+	if (N_MEMORY == N_HIGH_MEMORY)
+		zone_last = ZONE_MOVABLE;
+
+	if (zone_idx(zone) <= zone_last && !node_state(nid, N_HIGH_MEMORY))
+		arg->status_change_nid_high = nid;
+	else
+		arg->status_change_nid_high = -1;
+#else
+	arg->status_change_nid_high = arg->status_change_nid_normal;
+#endif
+
 	/*
 	 * if the node don't have memory befor online, we will need to
-	 * set the node to node_states[N_HIGH_MEMORY] after the memory
+	 * set the node to node_states[N_MEMORY] after the memory
 	 * is online.
 	 */
-	if (!node_state(nid, N_HIGH_MEMORY))
+	if (!node_state(nid, N_MEMORY))
 		arg->status_change_nid = nid;
 	else
 		arg->status_change_nid = -1;
@@ -639,7 +663,10 @@ static void node_states_set_node(int node, struct memory_notify *arg)
 	if (arg->status_change_nid_normal >= 0)
 		node_set_state(node, N_NORMAL_MEMORY);
 
-	node_set_state(node, N_HIGH_MEMORY);
+	if (arg->status_change_nid_high >= 0)
+		node_set_state(node, N_HIGH_MEMORY);
+
+	node_set_state(node, N_MEMORY);
 }
 
 
@@ -1103,13 +1130,15 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
 	enum zone_type zt, zone_last = ZONE_NORMAL;
 
 	/*
-	 * If we have HIGHMEM, node_states[N_NORMAL_MEMORY] contains nodes
-	 * which have 0...ZONE_NORMAL, set zone_last to ZONE_NORMAL.
+	 * If we have HIGHMEM or movable node, node_states[N_NORMAL_MEMORY]
+	 * contains nodes which have zones of 0...ZONE_NORMAL,
+	 * set zone_last to ZONE_NORMAL.
 	 *
-	 * If we don't have HIGHMEM, node_states[N_NORMAL_MEMORY] contains nodes
-	 * which have 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
+	 * If we don't have HIGHMEM nor movable node,
+	 * node_states[N_NORMAL_MEMORY] contains nodes which have zones of
+	 * 0...ZONE_MOVABLE, set zone_last to ZONE_MOVABLE.
 	 */
-	if (N_HIGH_MEMORY == N_NORMAL_MEMORY)
+	if (N_MEMORY == N_NORMAL_MEMORY)
 		zone_last = ZONE_MOVABLE;
 
 	/*
@@ -1126,6 +1155,30 @@ static void node_states_check_changes_offline(unsigned long nr_pages,
 	else
 		arg->status_change_nid_normal = -1;
 
+#ifdef CONFIG_HIGHMEM
+	/*
+	 * If we have movable node, node_states[N_HIGH_MEMORY]
+	 * contains nodes which have zones of 0...ZONE_HIGHMEM,
+	 * set zone_last to ZONE_HIGHMEM.
+	 *
+	 * If we don't have movable node, node_states[N_HIGH_MEMORY]
+	 * contains nodes which have zones of 0...ZONE_MOVABLE,
+	 * set zone_last to ZONE_MOVABLE.
+	 */
+	zone_last = ZONE_HIGHMEM;
+	if (N_MEMORY == N_HIGH_MEMORY)
+		zone_last = ZONE_MOVABLE;
+
+	for (; zt <= zone_last; zt++)
+		present_pages += pgdat->node_zones[zt].present_pages;
+	if (zone_idx(zone) <= zone_last && nr_pages >= present_pages)
+		arg->status_change_nid_high = zone_to_nid(zone);
+	else
+		arg->status_change_nid_high = -1;
+#else
+	arg->status_change_nid_high = arg->status_change_nid_normal;
+#endif
+
 	/*
 	 * node_states[N_HIGH_MEMORY] contains nodes which have 0...ZONE_MOVABLE
 	 */
@@ -1150,9 +1203,13 @@ static void node_states_clear_node(int node, struct memory_notify *arg)
 	if (arg->status_change_nid_normal >= 0)
 		node_clear_state(node, N_NORMAL_MEMORY);
 
-	if ((N_HIGH_MEMORY != N_NORMAL_MEMORY) &&
-	    (arg->status_change_nid >= 0))
+	if ((N_MEMORY != N_NORMAL_MEMORY) &&
+	    (arg->status_change_nid_high >= 0))
 		node_clear_state(node, N_HIGH_MEMORY);
+
+	if ((N_MEMORY != N_HIGH_MEMORY) &&
+	    (arg->status_change_nid >= 0))
+		node_clear_state(node, N_MEMORY);
 }
 
 static int __ref __offline_pages(unsigned long start_pfn,
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [V5 PATCH 19/26] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (10 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 18/26] hotplug: update nodemasks management Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 20/26] memory_hotplug: allow online/offline memory to result movable node Lai Jiangshan
                   ` (4 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Greg Kroah-Hartman,
	Christoph Lameter, Hillf Danton, Minchan Kim, Johannes Weiner,
	Dan Magenheimer, Mel Gorman, Michal Hocko, linux-mm

Everything is prepared, so we can now actually introduce N_MEMORY.
Add CONFIG_MOVABLE_NODE so that it can be used for movable-dedicated nodes.
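
The effect of the option on the node_states enum can be sketched as below.
MODEL_HIGHMEM and MODEL_MOVABLE_NODE are hypothetical macros standing in
for CONFIG_HIGHMEM and CONFIG_MOVABLE_NODE, and the enum is a reduced
userspace copy for illustration, not the kernel header.

```c
#include <assert.h>

#define MODEL_HIGHMEM      0	/* stand-in for CONFIG_HIGHMEM (off, as on x86_64) */
#define MODEL_MOVABLE_NODE 1	/* stand-in for CONFIG_MOVABLE_NODE */

enum node_states_model {
	N_POSSIBLE,
	N_ONLINE,
	N_NORMAL_MEMORY,
#if MODEL_HIGHMEM
	N_HIGH_MEMORY,
#else
	N_HIGH_MEMORY = N_NORMAL_MEMORY,	/* alias: no separate mask */
#endif
#if MODEL_MOVABLE_NODE
	N_MEMORY,				/* gets its own mask */
#else
	N_MEMORY = N_HIGH_MEMORY,		/* alias: no separate mask */
#endif
	N_CPU,
	NR_NODE_STATES
};

/* Number of distinct memory-related masks in this configuration. */
static int distinct_memory_states(void)
{
	return 1 + (N_HIGH_MEMORY != N_NORMAL_MEMORY)
		 + (N_MEMORY != N_HIGH_MEMORY);
}
```

Because the states are plain enum values, tests such as
"N_MEMORY != N_NORMAL_MEMORY" used elsewhere in the series are compile-time
constants, and the compiler can drop the dead branch.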

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 drivers/base/node.c      |    6 ++++++
 include/linux/nodemask.h |    4 ++++
 mm/Kconfig               |    8 ++++++++
 mm/page_alloc.c          |    3 +++
 4 files changed, 21 insertions(+), 0 deletions(-)

diff --git a/drivers/base/node.c b/drivers/base/node.c
index 4c3aa7c..9cdd66f 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -620,6 +620,9 @@ static struct node_attr node_state_attr[] = {
 #ifdef CONFIG_HIGHMEM
 	[N_HIGH_MEMORY] = _NODE_ATTR(has_high_memory, N_HIGH_MEMORY),
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	[N_MEMORY] = _NODE_ATTR(has_memory, N_MEMORY),
+#endif
 	[N_CPU] = _NODE_ATTR(has_cpu, N_CPU),
 };
 
@@ -630,6 +633,9 @@ static struct attribute *node_state_attrs[] = {
 #ifdef CONFIG_HIGHMEM
 	&node_state_attr[N_HIGH_MEMORY].attr.attr,
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	&node_state_attr[N_MEMORY].attr.attr,
+#endif
 	&node_state_attr[N_CPU].attr.attr,
 	NULL
 };
diff --git a/include/linux/nodemask.h b/include/linux/nodemask.h
index c6ebdc9..4e2cbfa 100644
--- a/include/linux/nodemask.h
+++ b/include/linux/nodemask.h
@@ -380,7 +380,11 @@ enum node_states {
 #else
 	N_HIGH_MEMORY = N_NORMAL_MEMORY,
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	N_MEMORY,		/* The node has memory(regular, high, movable) */
+#else
 	N_MEMORY = N_HIGH_MEMORY,
+#endif
 	N_CPU,		/* The node has one or more cpus */
 	NR_NODE_STATES
 };
diff --git a/mm/Kconfig b/mm/Kconfig
index a3f8ddd..957ebd5 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -143,6 +143,14 @@ config NO_BOOTMEM
 config MEMORY_ISOLATION
 	boolean
 
+config MOVABLE_NODE
+	boolean "Enable to assign a node has only movable memory"
+	depends on HAVE_MEMBLOCK
+	depends on NO_BOOTMEM
+	depends on X86_64
+	depends on NUMA
+	default y
+
 # eventually, we can have this option just 'select SPARSEMEM'
 config MEMORY_HOTPLUG
 	bool "Allow for memory hot-add"
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b70c929..a42337f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -90,6 +90,9 @@ nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
 #ifdef CONFIG_HIGHMEM
 	[N_HIGH_MEMORY] = { { [0] = 1UL } },
 #endif
+#ifdef CONFIG_MOVABLE_NODE
+	[N_MEMORY] = { { [0] = 1UL } },
+#endif
 	[N_CPU] = { { [0] = 1UL } },
 #endif	/* NUMA */
 };
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [V5 PATCH 20/26] memory_hotplug: allow online/offline memory to result movable node
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (11 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 19/26] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 21/26] page_alloc: add kernelcore_max_addr Lai Jiangshan
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Wen Congyang,
	linux-mm

Now memory management can handle movable nodes, i.e. nodes which don't have
any normal memory, so we can dynamically configure and add a movable node by:
	onlining ZONE_MOVABLE memory from a previously offline node
	offlining the last normal memory of a node, which leaves a node
	without normal memory

Movable nodes are very important for power saving, hardware partitioning
and high-availability systems (hardware fault management).
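
The online-side policy switch can be sketched as a single predicate. The
function name and both parameters are hypothetical; movable_node stands in
for CONFIG_MOVABLE_NODE being enabled, and node_has_normal for
node_state(nid, N_NORMAL_MEMORY).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * May high or movable memory be onlined on a node?
 * Without movable-node support the node must already have normal memory;
 * with it, the request is always allowed.
 */
static bool may_online_high_movable(bool movable_node, bool node_has_normal)
{
	return movable_node || node_has_normal;
}
```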


Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memory_hotplug.c |   16 ++++++++++++++++
 1 files changed, 16 insertions(+), 0 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a55b547..756744c 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -589,11 +589,19 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
 	return 0;
 }
 
+#ifdef CONFIG_MOVABLE_NODE
+/* with CONFIG_MOVABLE_NODE, an online node is allowed to have no normal memory */
+static bool can_online_high_movable(struct zone *zone)
+{
+	return true;
+}
+#else /* #ifdef CONFIG_MOVABLE_NODE */
 /* ensure every online node has NORMAL memory */
 static bool can_online_high_movable(struct zone *zone)
 {
 	return node_state(zone_to_nid(zone), N_NORMAL_MEMORY);
 }
+#endif /* #ifdef CONFIG_MOVABLE_NODE */
 
 /* check which state of node_states will be changed when online memory */
 static void node_states_check_changes_online(unsigned long nr_pages,
@@ -1097,6 +1105,13 @@ check_pages_isolated(unsigned long start_pfn, unsigned long end_pfn)
 	return offlined;
 }
 
+#ifdef CONFIG_MOVABLE_NODE
+/* with CONFIG_MOVABLE_NODE, an online node is allowed to have no normal memory */
+static bool can_offline_normal(struct zone *zone, unsigned long nr_pages)
+{
+	return true;
+}
+#else /* #ifdef CONFIG_MOVABLE_NODE */
 /* ensure the node has NORMAL memory if it is still online */
 static bool can_offline_normal(struct zone *zone, unsigned long nr_pages)
 {
@@ -1120,6 +1135,7 @@ static bool can_offline_normal(struct zone *zone, unsigned long nr_pages)
 	 */
 	return present_pages == 0;
 }
+#endif /* #ifdef CONFIG_MOVABLE_NODE */
 
 /* check which state of node_states will be changed when offline memory */
 static void node_states_check_changes_offline(unsigned long nr_pages,
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [V5 PATCH 21/26] page_alloc: add kernelcore_max_addr
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (12 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 20/26] memory_hotplug: allow online/offline memory to result movable node Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 24/26] memblock: limit memory address from memblock Lai Jiangshan
                   ` (2 subsequent siblings)
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Rob Landley,
	Minchan Kim, Michal Hocko, linux-doc, linux-mm

The current ZONE_MOVABLE setting policy via the kernelcore= boot option
doesn't meet our requirements. We need something like a
kernelcore_max_addr=XX boot option to limit the upper address of kernelcore.

Memory above that address will be migratable (movable) and is easier to
offline (it is always ready to be offlined when the system doesn't require
so much memory).

This makes things easier when we dynamically hot-add/remove memory, makes
better use of memory, and helps THP.

kernelcore_max_addr=, kernelcore= and movablecore= can all be safely
specified at the same time (or any two of them).
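
The effect of the cap on a single memory range can be sketched as follows.
kernelcore_usable_pages() is a hypothetical helper written for this
illustration; it only models clamping the end of a range to the limit and
is not the kernel's zone sizing code.

```c
#include <assert.h>

/*
 * Pages of [start_pfn, end_pfn) that remain usable for kernelcore when
 * kernelcore is capped at kernelcore_max_pfn. A cap of 0 means "no cap".
 */
static unsigned long kernelcore_usable_pages(unsigned long start_pfn,
					     unsigned long end_pfn,
					     unsigned long kernelcore_max_pfn)
{
	if (kernelcore_max_pfn && end_pfn > kernelcore_max_pfn)
		end_pfn = kernelcore_max_pfn;

	/* A range lying entirely above the cap contributes nothing. */
	return start_pfn < end_pfn ? end_pfn - start_pfn : 0;
}
```

Whatever part of a range lies above the cap is left for ZONE_MOVABLE.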

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 Documentation/kernel-parameters.txt |    9 +++++++++
 mm/page_alloc.c                     |   29 ++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 9776f06..2b72ffb 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1223,6 +1223,15 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 			use the HighMem zone if it exists, and the Normal
 			zone if it does not.
 
+	kernelcore_max_addr=nn[KMG]	[KNL,X86,IA-64,PPC] This parameter
+			has the same effect as the kernelcore parameter,
+			except that it specifies the upper physical address
+			of the memory range usable by the kernel for
+			non-movable allocations. If both kernelcore and
+			kernelcore_max_addr are specified, this request
+			takes priority over kernelcore's.
+			See the kernelcore parameter.
+
 	kgdbdbgp=	[KGDB,HW] kgdb over EHCI usb debug port.
 			Format: <Controller#>[,poll interval]
 			The controller # is the number of the ehci usb debug
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a42337f..11df8b5 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -203,6 +203,7 @@ static unsigned long __meminitdata dma_reserve;
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 static unsigned long __meminitdata arch_zone_lowest_possible_pfn[MAX_NR_ZONES];
 static unsigned long __meminitdata arch_zone_highest_possible_pfn[MAX_NR_ZONES];
+static unsigned long __initdata required_kernelcore_max_pfn;
 static unsigned long __initdata required_kernelcore;
 static unsigned long __initdata required_movablecore;
 static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
@@ -4700,6 +4701,7 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 {
 	int i, nid;
 	unsigned long usable_startpfn;
+	unsigned long kernelcore_max_pfn;
 	unsigned long kernelcore_node, kernelcore_remaining;
 	/* save the state before borrow the nodemask */
 	nodemask_t saved_node_state = node_states[N_MEMORY];
@@ -4728,6 +4730,9 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 		required_kernelcore = max(required_kernelcore, corepages);
 	}
 
+	if (required_kernelcore_max_pfn && !required_kernelcore)
+		required_kernelcore = totalpages;
+
 	/* If kernelcore was not specified, there is no ZONE_MOVABLE */
 	if (!required_kernelcore)
 		goto out;
@@ -4736,6 +4741,12 @@ static void __init find_zone_movable_pfns_for_nodes(void)
 	find_usable_zone_for_movable();
 	usable_startpfn = arch_zone_lowest_possible_pfn[movable_zone];
 
+	if (required_kernelcore_max_pfn)
+		kernelcore_max_pfn = required_kernelcore_max_pfn;
+	else
+		kernelcore_max_pfn = ULONG_MAX >> PAGE_SHIFT;
+	kernelcore_max_pfn = max(kernelcore_max_pfn, usable_startpfn);
+
 restart:
 	/* Spread kernelcore memory as evenly as possible throughout nodes */
 	kernelcore_node = required_kernelcore / usable_nodes;
@@ -4762,8 +4773,12 @@ restart:
 			unsigned long size_pages;
 
 			start_pfn = max(start_pfn, zone_movable_pfn[nid]);
-			if (start_pfn >= end_pfn)
+			end_pfn = min(kernelcore_max_pfn, end_pfn);
+			if (start_pfn >= end_pfn) {
+				if (!zone_movable_pfn[nid])
+					zone_movable_pfn[nid] = start_pfn;
 				continue;
+			}
 
 			/* Account for what is only usable for kernelcore */
 			if (start_pfn < usable_startpfn) {
@@ -4954,6 +4969,18 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
 	return 0;
 }
 
+#ifdef CONFIG_MOVABLE_NODE
+/*
+ * kernelcore_max_addr=addr sets the upper physical address of the memory
+ * range used for allocations that cannot be reclaimed or migrated.
+ */
+static int __init cmdline_parse_kernelcore_max_addr(char *p)
+{
+	return cmdline_parse_core(p, &required_kernelcore_max_pfn);
+}
+early_param("kernelcore_max_addr", cmdline_parse_kernelcore_max_addr);
+#endif
+
 /*
  * kernelcore=size sets the amount of memory for use for allocations that
  * cannot be reclaimed or migrated.
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [V5 PATCH 24/26] memblock: limit memory address from memblock
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (13 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 21/26] page_alloc: add kernelcore_max_addr Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 25/26] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 26/26] mempolicy: fix is_valid_nodemask() Lai Jiangshan
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Tejun Heo,
	Wanpeng Li, Jacob Shin, Ingo Molnar, Minchan Kim, Michal Hocko,
	linux-mm

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

Setting kernelcore_max_addr means that all memory above the address given
by the boot parameter is assigned to ZONE_MOVABLE. So memory allocated by
memblock should also be limited by the parameter.

This patch limits the memory handed out by memblock.
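
The clamping rule can be sketched in userspace like this.
phys_addr_model_t and effective_limit() are hypothetical names; the logic
mirrors the condition used in memblock_set_current_limit() by this patch,
where a global limit of 0 means "not set".

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_model_t;	/* stand-in for phys_addr_t */

/*
 * Return the limit that takes effect when a caller requests 'requested'
 * while a global 'memblock_limit' may be in force.
 */
static phys_addr_model_t effective_limit(phys_addr_model_t requested,
					 phys_addr_model_t memblock_limit)
{
	if (!memblock_limit || memblock_limit > requested)
		return requested;
	return memblock_limit;
}
```

The result is never above the global limit once that limit has been set.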

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 include/linux/memblock.h |    1 +
 mm/memblock.c            |    5 ++++-
 mm/page_alloc.c          |    6 +++++-
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index d452ee1..3e52911 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -42,6 +42,7 @@ struct memblock {
 
 extern struct memblock memblock;
 extern int memblock_debug;
+extern phys_addr_t memblock_limit;
 
 #define memblock_dbg(fmt, ...) \
 	if (memblock_debug) printk(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
diff --git a/mm/memblock.c b/mm/memblock.c
index 6259055..ee2e307 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -957,7 +957,10 @@ void __init_memblock memblock_trim_memory(phys_addr_t align)
 
 void __init_memblock memblock_set_current_limit(phys_addr_t limit)
 {
-	memblock.current_limit = limit;
+	if (!memblock_limit || (memblock_limit > limit))
+		memblock.current_limit = limit;
+	else
+		memblock.current_limit = memblock_limit;
 }
 
 static void __init_memblock memblock_dump(struct memblock_type *type, char *name)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 11df8b5..f76b696 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -208,6 +208,8 @@ static unsigned long __initdata required_kernelcore;
 static unsigned long __initdata required_movablecore;
 static unsigned long __meminitdata zone_movable_pfn[MAX_NUMNODES];
 
+phys_addr_t memblock_limit;
+
 /* movable_zone is the "real" zone pages in ZONE_MOVABLE are taken from */
 int movable_zone;
 EXPORT_SYMBOL(movable_zone);
@@ -4976,7 +4978,9 @@ static int __init cmdline_parse_core(char *p, unsigned long *core)
  */
 static int __init cmdline_parse_kernelcore_max_addr(char *p)
 {
-	return cmdline_parse_core(p, &required_kernelcore_max_pfn);
+	cmdline_parse_core(p, &required_kernelcore_max_pfn);
+	memblock_limit = required_kernelcore_max_pfn << PAGE_SHIFT;
+	return 0;
 }
 early_param("kernelcore_max_addr", cmdline_parse_kernelcore_max_addr);
 #endif
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [V5 PATCH 25/26] memblock: compare current_limit with end variable at memblock_find_in_range_node()
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (14 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 24/26] memblock: limit memory address from memblock Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  2012-10-29 15:21 ` [V5 PATCH 26/26] mempolicy: fix is_valid_nodemask() Lai Jiangshan
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, Tejun Heo,
	Ingo Molnar, Wanpeng Li, linux-mm

From: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>

memblock_find_in_range_node() does not compare memblock.current_limit
with the end variable. Thus, even if memblock.current_limit is smaller than
end, the function can return a memory address that is above
memblock.current_limit.

This patch adds the check to memblock_find_in_range_node().
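
The added check amounts to the following clamp, shown here as a userspace
sketch. MODEL_ALLOC_ACCESSIBLE and clamp_search_end() are hypothetical
names; the value 0 is an assumption chosen to model the
MEMBLOCK_ALLOC_ACCESSIBLE sentinel.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_ALLOC_ACCESSIBLE 0	/* models MEMBLOCK_ALLOC_ACCESSIBLE */

/* Upper bound of the search window after applying the current limit. */
static uint64_t clamp_search_end(uint64_t end, uint64_t current_limit)
{
	if (end == MODEL_ALLOC_ACCESSIBLE || end > current_limit)
		end = current_limit;
	return end;
}
```

An explicit end below the limit is kept as is; only the sentinel and values
above the limit are pulled down to current_limit.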

Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
---
 mm/memblock.c |    5 +++--
 1 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index ee2e307..50ab53c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -100,11 +100,12 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
 					phys_addr_t align, int nid)
 {
 	phys_addr_t this_start, this_end, cand;
+	phys_addr_t current_limit = memblock.current_limit;
 	u64 i;
 
 	/* pump up @end */
-	if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
-		end = memblock.current_limit;
+	if ((end == MEMBLOCK_ALLOC_ACCESSIBLE) || (end > current_limit))
+		end = current_limit;
 
 	/* avoid allocating the first page */
 	start = max_t(phys_addr_t, start, PAGE_SIZE);
-- 
1.7.4.4

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [V5 PATCH 26/26] mempolicy: fix is_valid_nodemask()
       [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
                   ` (15 preceding siblings ...)
  2012-10-29 15:21 ` [V5 PATCH 25/26] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
@ 2012-10-29 15:21 ` Lai Jiangshan
  16 siblings, 0 replies; 23+ messages in thread
From: Lai Jiangshan @ 2012-10-29 15:21 UTC (permalink / raw)
  To: Mel Gorman, David Rientjes, LKML, x86 maintainers
  Cc: Jiang Liu, Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki,
	Yasuaki ISIMATU, Andrew Morton, Lai Jiangshan, KOSAKI Motohiro,
	Christoph Lameter, linux-mm

is_valid_nodemask() was introduced by 19770b32, but it does not match its
comment, because it does not check zones above policy_zone.

Also, commit b377fd says that if the highest zone is ZONE_MOVABLE, memory
policies should apply to it as well. So ZONE_MOVABLE should be a valid zone
for policies, and is_valid_nodemask() needs to change to match.

Fix: check all zones, even those whose zone id is > policy_zone.
Use nodes_intersects() instead of open-coding the check.
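
As a rough illustration of the check described above, here is a minimal standalone sketch. It assumes a toy nodemask held in a single 64-bit word; `sketch_nodemask_t`, `sketch_nodes_intersects()` and `sketch_is_valid_nodemask()` are hypothetical names, not the kernel's nodemask API.

```c
#include <stdbool.h>

/* Toy nodemask: bit n set means node n is in the mask (assumes <= 64 nodes). */
typedef unsigned long long sketch_nodemask_t;

/* Toy counterpart of nodes_intersects(): true if the masks share a node. */
bool sketch_nodes_intersects(sketch_nodemask_t a, sketch_nodemask_t b)
{
	return (a & b) != 0;
}

/*
 * A nodemask is valid if at least one of its nodes has memory of any kind,
 * which is what intersecting with the "nodes with memory" state expresses.
 */
bool sketch_is_valid_nodemask(sketch_nodemask_t nodemask,
			      sketch_nodemask_t nodes_with_memory)
{
	return sketch_nodes_intersects(nodemask, nodes_with_memory);
}
```

The intersection replaces the per-zone loop: a node counts as soon as it has memory in any zone, including zones above policy_zone.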

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Reported-by: Wen Congyang <wency@cn.fujitsu.com>
---
 mm/mempolicy.c |   36 ++++++++++++++++++++++--------------
 1 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d4a084c..ed7c249 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -140,19 +140,7 @@ static const struct mempolicy_operations {
 /* Check that the nodemask contains at least one populated zone */
 static int is_valid_nodemask(const nodemask_t *nodemask)
 {
-	int nd, k;
-
-	for_each_node_mask(nd, *nodemask) {
-		struct zone *z;
-
-		for (k = 0; k <= policy_zone; k++) {
-			z = &NODE_DATA(nd)->node_zones[k];
-			if (z->present_pages > 0)
-				return 1;
-		}
-	}
-
-	return 0;
+	return nodes_intersects(*nodemask, node_states[N_MEMORY]);
 }
 
 static inline int mpol_store_user_nodemask(const struct mempolicy *pol)
@@ -1572,6 +1560,26 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
 	return pol;
 }
 
+static int apply_policy_zone(struct mempolicy *policy, enum zone_type zone)
+{
+	enum zone_type dynamic_policy_zone = policy_zone;
+
+	BUG_ON(dynamic_policy_zone == ZONE_MOVABLE);
+
+	/*
+	 * if policy->v.nodes has movable memory only,
+	 * we apply policy when gfp_zone(gfp) = ZONE_MOVABLE only.
+	 *
+	 * policy->v.nodes intersects node_states[N_MEMORY],
+	 * so if the following test fails, it implies
+	 * policy->v.nodes has movable memory only.
+	 */
+	if (!nodes_intersects(policy->v.nodes, node_states[N_HIGH_MEMORY]))
+		dynamic_policy_zone = ZONE_MOVABLE;
+
+	return zone >= dynamic_policy_zone;
+}
+
 /*
  * Return a nodemask representing a mempolicy for filtering nodes for
  * page allocation
@@ -1580,7 +1588,7 @@ static nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *policy)
 {
 	/* Lower zones don't get a nodemask applied for MPOL_BIND */
 	if (unlikely(policy->mode == MPOL_BIND) &&
-			gfp_zone(gfp) >= policy_zone &&
+			apply_policy_zone(policy, gfp_zone(gfp)) &&
 			cpuset_nodemask_valid_mems_allowed(&policy->v.nodes))
 		return &policy->v.nodes;
 
-- 
1.7.4.4
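
The decision made by the apply_policy_zone() hunk above can be sketched outside the kernel as follows. This is a simplified model under stated assumptions: nodemasks are plain 64-bit words, the zone enum and all `sketch_` names are hypothetical, and the BUG_ON() sanity check is left out.

```c
#include <stdbool.h>

/* Hypothetical zone ordering, lowest to highest. */
enum sketch_zone {
	SKETCH_ZONE_DMA,
	SKETCH_ZONE_NORMAL,
	SKETCH_ZONE_HIGHMEM,
	SKETCH_ZONE_MOVABLE
};

/* Toy nodemask: bit n set means node n is in the mask (assumes <= 64 nodes). */
typedef unsigned long long sketch_mask_t;

/*
 * Decide whether the policy nodemask applies to an allocation from @zone.
 * If none of the policy's nodes has normal or high memory, the policy can
 * only be honoured for ZONE_MOVABLE allocations, so the threshold is raised.
 */
bool sketch_apply_policy_zone(sketch_mask_t policy_nodes,
			      sketch_mask_t nodes_with_kernel_memory,
			      enum sketch_zone policy_zone,
			      enum sketch_zone zone)
{
	enum sketch_zone dynamic_policy_zone = policy_zone;

	if ((policy_nodes & nodes_with_kernel_memory) == 0)
		dynamic_policy_zone = SKETCH_ZONE_MOVABLE;

	return zone >= dynamic_policy_zone;
}
```

With a policy bound to a movable-only node, a ZONE_NORMAL allocation does not get the nodemask applied, while a ZONE_MOVABLE allocation does.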


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  2012-10-29 15:20 ` [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
@ 2012-10-29 16:22   ` Michal Hocko
  2012-10-29 20:40     ` David Rientjes
  2012-10-31 13:18   ` Michal Hocko
  1 sibling, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2012-10-29 16:22 UTC (permalink / raw)
  To: Lai Jiangshan
  Cc: Mel Gorman, David Rientjes, LKML, x86 maintainers, Jiang Liu,
	Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki, Yasuaki ISIMATU,
	Andrew Morton, Johannes Weiner, Balbir Singh, Tejun Heo, Li Zefan,
	cgroups, linux-mm, containers

On Mon 29-10-12 23:20:58, Lai Jiangshan wrote:
> N_HIGH_MEMORY stands for the nodes that have normal or high memory.
> N_MEMORY stands for the nodes that have any memory.

What is the difference between those two?

> The code here needs to handle the nodes which have memory, so we should
> use N_MEMORY instead.
> 
> Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
> ---
>  mm/memcontrol.c  |   18 +++++++++---------
>  mm/page_cgroup.c |    2 +-
>  2 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 7acf43b..1b69665 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -800,7 +800,7 @@ static unsigned long mem_cgroup_nr_lru_pages(struct mem_cgroup *memcg,
>  	int nid;
>  	u64 total = 0;
>  
> -	for_each_node_state(nid, N_HIGH_MEMORY)
> +	for_each_node_state(nid, N_MEMORY)
>  		total += mem_cgroup_node_nr_lru_pages(memcg, nid, lru_mask);
>  	return total;
>  }
> @@ -1611,9 +1611,9 @@ static void mem_cgroup_may_update_nodemask(struct mem_cgroup *memcg)
>  		return;
>  
>  	/* make a nodemask where this memcg uses memory from */
> -	memcg->scan_nodes = node_states[N_HIGH_MEMORY];
> +	memcg->scan_nodes = node_states[N_MEMORY];
>  
> -	for_each_node_mask(nid, node_states[N_HIGH_MEMORY]) {
> +	for_each_node_mask(nid, node_states[N_MEMORY]) {
>  
>  		if (!test_mem_cgroup_node_reclaimable(memcg, nid, false))
>  			node_clear(nid, memcg->scan_nodes);
> @@ -1684,7 +1684,7 @@ static bool mem_cgroup_reclaimable(struct mem_cgroup *memcg, bool noswap)
>  	/*
>  	 * Check rest of nodes.
>  	 */
> -	for_each_node_state(nid, N_HIGH_MEMORY) {
> +	for_each_node_state(nid, N_MEMORY) {
>  		if (node_isset(nid, memcg->scan_nodes))
>  			continue;
>  		if (test_mem_cgroup_node_reclaimable(memcg, nid, noswap))
> @@ -3759,7 +3759,7 @@ move_account:
>  		drain_all_stock_sync(memcg);
>  		ret = 0;
>  		mem_cgroup_start_move(memcg);
> -		for_each_node_state(node, N_HIGH_MEMORY) {
> +		for_each_node_state(node, N_MEMORY) {
>  			for (zid = 0; !ret && zid < MAX_NR_ZONES; zid++) {
>  				enum lru_list lru;
>  				for_each_lru(lru) {
> @@ -4087,7 +4087,7 @@ static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
>  
>  	total_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL);
>  	seq_printf(m, "total=%lu", total_nr);
> -	for_each_node_state(nid, N_HIGH_MEMORY) {
> +	for_each_node_state(nid, N_MEMORY) {
>  		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid, LRU_ALL);
>  		seq_printf(m, " N%d=%lu", nid, node_nr);
>  	}
> @@ -4095,7 +4095,7 @@ static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
>  
>  	file_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL_FILE);
>  	seq_printf(m, "file=%lu", file_nr);
> -	for_each_node_state(nid, N_HIGH_MEMORY) {
> +	for_each_node_state(nid, N_MEMORY) {
>  		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
>  				LRU_ALL_FILE);
>  		seq_printf(m, " N%d=%lu", nid, node_nr);
> @@ -4104,7 +4104,7 @@ static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
>  
>  	anon_nr = mem_cgroup_nr_lru_pages(memcg, LRU_ALL_ANON);
>  	seq_printf(m, "anon=%lu", anon_nr);
> -	for_each_node_state(nid, N_HIGH_MEMORY) {
> +	for_each_node_state(nid, N_MEMORY) {
>  		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
>  				LRU_ALL_ANON);
>  		seq_printf(m, " N%d=%lu", nid, node_nr);
> @@ -4113,7 +4113,7 @@ static int memcg_numa_stat_show(struct cgroup *cont, struct cftype *cft,
>  
>  	unevictable_nr = mem_cgroup_nr_lru_pages(memcg, BIT(LRU_UNEVICTABLE));
>  	seq_printf(m, "unevictable=%lu", unevictable_nr);
> -	for_each_node_state(nid, N_HIGH_MEMORY) {
> +	for_each_node_state(nid, N_MEMORY) {
>  		node_nr = mem_cgroup_node_nr_lru_pages(memcg, nid,
>  				BIT(LRU_UNEVICTABLE));
>  		seq_printf(m, " N%d=%lu", nid, node_nr);
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 5ddad0c..c1054ad 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -271,7 +271,7 @@ void __init page_cgroup_init(void)
>  	if (mem_cgroup_disabled())
>  		return;
>  
> -	for_each_node_state(nid, N_HIGH_MEMORY) {
> +	for_each_node_state(nid, N_MEMORY) {
>  		unsigned long start_pfn, end_pfn;
>  
>  		start_pfn = node_start_pfn(nid);
> -- 
> 1.7.4.4
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  2012-10-29 16:22   ` Michal Hocko
@ 2012-10-29 20:40     ` David Rientjes
  2012-10-29 20:58       ` Michal Hocko
  0 siblings, 1 reply; 23+ messages in thread
From: David Rientjes @ 2012-10-29 20:40 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Lai Jiangshan, Mel Gorman, LKML, x86 maintainers, Jiang Liu,
	Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki, Yasuaki ISIMATU,
	Andrew Morton, Johannes Weiner, Balbir Singh, Tejun Heo, Li Zefan,
	cgroups, linux-mm, containers

On Mon, 29 Oct 2012, Michal Hocko wrote:

> > N_HIGH_MEMORY stands for the nodes that have normal or high memory.
> > N_MEMORY stands for the nodes that have any memory.
> 
> What is the difference between those two?
> 

Patch 5 in the series introduces it to be equal to N_HIGH_MEMORY, so 
accepting this patch would be an implicit ack of the direction taken 
there.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  2012-10-29 20:40     ` David Rientjes
@ 2012-10-29 20:58       ` Michal Hocko
  2012-10-29 21:08         ` David Rientjes
  0 siblings, 1 reply; 23+ messages in thread
From: Michal Hocko @ 2012-10-29 20:58 UTC (permalink / raw)
  To: David Rientjes
  Cc: Lai Jiangshan, Mel Gorman, LKML, x86 maintainers, Jiang Liu,
	Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki, Yasuaki ISIMATU,
	Andrew Morton, Johannes Weiner, Balbir Singh, Tejun Heo, Li Zefan,
	cgroups, linux-mm, containers

On Mon 29-10-12 13:40:39, David Rientjes wrote:
> On Mon, 29 Oct 2012, Michal Hocko wrote:
> 
> > > N_HIGH_MEMORY stands for the nodes that have normal or high memory.
> > > N_MEMORY stands for the nodes that have any memory.
> > 
> > What is the difference between those two?
> > 
> 
> Patch 5 in the series 

Strange, I do not see that one on the mailing list.

> introduces it to be equal to N_HIGH_MEMORY, so 

So this is just a rename? If so, it would be much easier if that were
mentioned in the patch description.

> accepting this patch would be an implicit ack of the direction taken 
> there.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  2012-10-29 20:58       ` Michal Hocko
@ 2012-10-29 21:08         ` David Rientjes
  2012-10-29 21:34           ` Michal Hocko
  0 siblings, 1 reply; 23+ messages in thread
From: David Rientjes @ 2012-10-29 21:08 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Lai Jiangshan, Mel Gorman, LKML, x86 maintainers, Jiang Liu,
	Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki, Yasuaki ISIMATU,
	Andrew Morton, Johannes Weiner, Balbir Singh, Tejun Heo, Li Zefan,
	cgroups, linux-mm, containers

On Mon, 29 Oct 2012, Michal Hocko wrote:

> > > > N_HIGH_MEMORY stands for the nodes that have normal or high memory.
> > > > N_MEMORY stands for the nodes that have any memory.
> > > 
> > > What is the difference between those two?
> > > 
> > 
> > Patch 5 in the series 
> 
> Strange, I do not see that one on the mailing list.
> 

http://marc.info/?l=linux-kernel&m=135152595827692

> > introduces it to be equal to N_HIGH_MEMORY, so 
> 
> So this is just a rename? If so, it would be much easier if that were
> mentioned in the patch description.
> 

It's not even a rename, even though it should be; it's adding yet another 
node_states entry that is equal to N_HIGH_MEMORY, since that state already 
includes all memory.  It's just a matter of taste but I think we should be 
renaming it instead of aliasing it (unless you actually want to make 
N_HIGH_MEMORY only include nodes with highmem, but nothing depends on 
that).
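
To make the aliasing concrete, here is a minimal standalone sketch of an enum in which one state is defined as equal to another. It is not the kernel's definition: the enumerator list, the `sketch_` names, and the 64-bit toy nodemask are all assumptions made for illustration.

```c
/* Hypothetical node-state enum; N_MEMORY shares a slot with N_HIGH_MEMORY. */
enum sketch_node_states {
	SKETCH_N_POSSIBLE,
	SKETCH_N_ONLINE,
	SKETCH_N_NORMAL_MEMORY,
	SKETCH_N_HIGH_MEMORY,
	SKETCH_N_MEMORY = SKETCH_N_HIGH_MEMORY,	/* alias, not a new state */
	SKETCH_N_CPU,
	SKETCH_NR_NODE_STATES
};

/* One toy nodemask per state: bit n set means node n is in that state. */
unsigned long long sketch_node_states[SKETCH_NR_NODE_STATES];

/* Mark node @nid (assumed to be in 0..63) as being in @state. */
void sketch_node_set_state(int nid, enum sketch_node_states state)
{
	sketch_node_states[state] |= 1ULL << nid;
}

/* Return 1 if node @nid (assumed to be in 0..63) is in @state, else 0. */
int sketch_node_state(int nid, enum sketch_node_states state)
{
	return (int)((sketch_node_states[state] >> nid) & 1ULL);
}
```

Because both names index the same array slot, setting a node in one state makes it visible through the other, which is why the alias behaves like a rename until the two values are made distinct.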


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  2012-10-29 21:08         ` David Rientjes
@ 2012-10-29 21:34           ` Michal Hocko
  0 siblings, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2012-10-29 21:34 UTC (permalink / raw)
  To: David Rientjes
  Cc: Lai Jiangshan, Mel Gorman, LKML, x86 maintainers, Jiang Liu,
	Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki, Yasuaki ISIMATU,
	Andrew Morton, Johannes Weiner, Balbir Singh, Tejun Heo, Li Zefan,
	cgroups, linux-mm, containers

On Mon 29-10-12 14:08:05, David Rientjes wrote:
> On Mon, 29 Oct 2012, Michal Hocko wrote:
> 
> > > > > N_HIGH_MEMORY stands for the nodes that have normal or high memory.
> > > > > N_MEMORY stands for the nodes that have any memory.
> > > > 
> > > > What is the difference between those two?
> > > > 
> > > 
> > > Patch 5 in the series 
> > 
> > Strange, I do not see that one on the mailing list.
> > 
> 
> http://marc.info/?l=linux-kernel&m=135152595827692

Thanks!

> > > introduces it to be equal to N_HIGH_MEMORY, so 
> > 
> > So this is just a rename? If so, it would be much easier if that were
> > mentioned in the patch description.
> > 
> 
> It's not even a rename, even though it should be; it's adding yet another 
> node_states entry that is equal to N_HIGH_MEMORY, since that state already 
> includes all memory.  

Which is really strange, because I do not see any reason for yet another
alias if the follow-up patches rename all of them (I didn't try to apply
the whole series to check that, so I might be wrong here).

> It's just a matter of taste but I think we should be renaming it
> instead of aliasing it (unless you actually want to make N_HIGH_MEMORY
> only include nodes with highmem, but nothing depends on that).

Agreed, I've always considered N_HIGH_MEMORY misleading and confusing so
renaming it would really make a lot of sense to me.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY
  2012-10-29 15:20 ` [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
  2012-10-29 16:22   ` Michal Hocko
@ 2012-10-31 13:18   ` Michal Hocko
  1 sibling, 0 replies; 23+ messages in thread
From: Michal Hocko @ 2012-10-31 13:18 UTC (permalink / raw)
  To: Lai Jiangshan, Wen Congyang
  Cc: Mel Gorman, David Rientjes, LKML, x86 maintainers, Jiang Liu,
	Rusty Russell, Yinghai Lu, KAMEZAWA Hiroyuki, Yasuaki ISIMATU,
	Andrew Morton, Johannes Weiner, Balbir Singh, Tejun Heo, Li Zefan,
	cgroups, linux-mm, containers, Christoph Lameter, Hillf Danton

On Wed 31-10-12 15:03:36, Wen Congyang wrote:
> At 10/30/2012 04:46 AM, David Rientjes Wrote:
> > On Mon, 29 Oct 2012, Lai Jiangshan wrote:
[...]
> >> In a word, we need an N_MEMORY. We just introduce it as an alias to
> >> N_HIGH_MEMORY and fix all improper usages of N_HIGH_MEMORY in later patches.
> >>
> > 
> > If this is really that problematic (and it appears it's not given that 
> > there are many use cases of it and people tend to get it right), then why 
> > not simply rename N_HIGH_MEMORY instead of introducing yet another 
> > nodemask to the equation?
> 
> The reason is that we need a node which only contains movable memory. This
> feature is very important for node hotplug. So we will add a new nodemask
> for movable memory. N_MEMORY contains movable memory but N_HIGH_MEMORY
> doesn't contain it.

OK, so N_MOVABLE_MEMORY (or whatever you will call it) requires that all
the allocations be migratable?
How do you want to achieve that with the page_cgroup descriptors? (see
below)

On Mon 29-10-12 23:20:58, Lai Jiangshan wrote:
[...]
> diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
> index 5ddad0c..c1054ad 100644
> --- a/mm/page_cgroup.c
> +++ b/mm/page_cgroup.c
> @@ -271,7 +271,7 @@ void __init page_cgroup_init(void)
>  	if (mem_cgroup_disabled())
>  		return;
>  
> -	for_each_node_state(nid, N_HIGH_MEMORY) {
> +	for_each_node_state(nid, N_MEMORY) {
>  		unsigned long start_pfn, end_pfn;
>  
>  		start_pfn = node_start_pfn(nid);

This will call init_section_page_cgroup(pfn, nid) later which allocates
page_cgroup descriptors which are not movable. Or is there any code in
your patchset that handles this?
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2012-10-31 13:18 UTC | newest]

Thread overview: 23+ messages
     [not found] <1351523301-20048-1-git-send-email-laijs@cn.fujitsu.com>
2012-10-29 15:20 ` [V5 PATCH 02/26] memory_hotplug: handle empty zone when online_movable/online_kernel Lai Jiangshan
2012-10-29 15:20 ` [V5 PATCH 03/26] memory_hotplug: ensure every online node has NORMAL memory Lai Jiangshan
2012-10-29 15:20 ` [V5 PATCH 08/26] memcontrol: use N_MEMORY instead N_HIGH_MEMORY Lai Jiangshan
2012-10-29 16:22   ` Michal Hocko
2012-10-29 20:40     ` David Rientjes
2012-10-29 20:58       ` Michal Hocko
2012-10-29 21:08         ` David Rientjes
2012-10-29 21:34           ` Michal Hocko
2012-10-31 13:18   ` Michal Hocko
2012-10-29 15:20 ` [V5 PATCH 09/26] oom: " Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 10/26] mm,migrate: " Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 11/26] mempolicy: " Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 12/26] hugetlb: " Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 13/26] vmstat: " Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 16/26] vmscan: " Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 17/26] page_alloc: use N_MEMORY instead N_HIGH_MEMORY change the node_states initialization Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 18/26] hotplug: update nodemasks management Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 19/26] numa: add CONFIG_MOVABLE_NODE for movable-dedicated node Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 20/26] memory_hotplug: allow online/offline memory to result movable node Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 21/26] page_alloc: add kernelcore_max_addr Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 24/26] memblock: limit memory address from memblock Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 25/26] memblock: compare current_limit with end variable at memblock_find_in_range_node() Lai Jiangshan
2012-10-29 15:21 ` [V5 PATCH 26/26] mempolicy: fix is_valid_nodemask() Lai Jiangshan
