* [PATCH 0/3] introduce a new state 'isolate' for memblock to split the isolation and migration steps
@ 2018-09-19 3:17 Pingfan Liu
2018-09-19 3:17 ` [PATCH 1/3] mm/isolation: separate the isolation and migration ops in offline memblock Pingfan Liu
` (2 more replies)
0 siblings, 3 replies; 6+ messages in thread
From: Pingfan Liu @ 2018-09-19 3:17 UTC (permalink / raw)
To: linux-mm
Cc: Pingfan Liu, Andrew Morton, KAMEZAWA Hiroyuki, Mel Gorman,
Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko, Bharata B Rao,
Dan Williams, H. Peter Anvin, Kirill A. Shutemov
Currently, pages are offlined in units of memblock, and normally it is
done one memblock at a time. If there is only one NUMA node, the
destination pages for the migration may come from the very next memblock
to be offlined, which wastes time during memory offline. On a system with
multiple NUMA nodes, if only part of the memory on a node is replaced and
the migration destination pages are allocated from the local node (which
is what [3/3] enables), the same issue arises.
This series suggests introducing a new state, named 'isolate'; the state
transition can be online -> isolate and back again, and the offline can
then proceed directly from the isolated state. Another slight benefit of
the 'isolated' state is that no further allocations are served from the
memblock, which keeps potentially unmovable pages from being allocated
from it again during what may be a long wait before it is offlined.
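With the new state, the suggested way to offline a batch of memblocks
(spelled out in patch 2/3; 's' and 'e' stand for the first and last
memory block number) is:
  for i in {s..e}; do echo isolate > memory$i/state; done
  for i in {s..e}; do echo offline > memory$i/state; done
Isolating the whole batch first means the migration destination pages can
no longer be allocated from any memblock that is about to be offlined.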
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Pingfan Liu (3):
mm/isolation: separate the isolation and migration ops in offline
memblock
drivers/base/memory: introduce a new state 'isolate' for memblock
drivers/base/node: create a partial offline hint under each node
drivers/base/memory.c | 31 ++++++++++++++++++++++++++++++-
drivers/base/node.c | 33 +++++++++++++++++++++++++++++++++
include/linux/memory.h | 1 +
include/linux/mmzone.h | 1 +
include/linux/page-isolation.h | 4 ++--
include/linux/pageblock-flags.h | 2 ++
mm/memory_hotplug.c | 37 ++++++++++++++++++++++---------------
mm/page_alloc.c | 4 ++--
mm/page_isolation.c | 28 +++++++++++++++++++++++-----
9 files changed, 116 insertions(+), 25 deletions(-)
--
2.7.4
* [PATCH 1/3] mm/isolation: separate the isolation and migration ops in offline memblock
2018-09-19 3:17 [PATCH 0/3] introduce a new state 'isolate' for memblock to split the isolation and migration steps Pingfan Liu
@ 2018-09-19 3:17 ` Pingfan Liu
2018-09-19 3:17 ` [PATCH 2/3] drivers/base/memory: introduce a new state 'isolate' for memblock Pingfan Liu
2018-09-19 3:17 ` [PATCH 3/3] drivers/base/node: create a partial offline hint under each node Pingfan Liu
2 siblings, 0 replies; 6+ messages in thread
From: Pingfan Liu @ 2018-09-19 3:17 UTC (permalink / raw)
To: linux-mm
Cc: Pingfan Liu, Andrew Morton, KAMEZAWA Hiroyuki, Mel Gorman,
Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko, Bharata B Rao,
Dan Williams, H. Peter Anvin, Kirill A. Shutemov
The current design of start_isolate_page_range() relies on
MIGRATE_ISOLATE to guard against other threads, hence the callers of
start_isolate_page_range() can only do the isolation by themselves.
But this series suggests a memory offline sequence that splits a
memblock's pageblock isolation from its migration, i.e.:
  1. call start_isolate_page_range() on a batch of memblocks
  2. call __offline_pages() on each memblock
This requires allowing __offline_pages() to reuse the isolation done in
step 1 (see the sketch below).
As for marking the isolation, it is not preferable to do it at the
memblock level, because pageblocks are what is operated on here and the
memblock concept should stay hidden. On the other hand, since isolation
and compaction cannot run in parallel, the PB_migrate_skip bit can be
reused to record the isolation result of the previous step, which is what
this patch does. The prototype of start_isolate_page_range() is also
changed to tell the __offline_pages() case apart from temporary isolation
users such as alloc_contig_range().
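The calling sequence this enables, as a rough C sketch (not code from
this series; the loop and block_nr_pages are illustrative, and error
handling is elided):

  /* Phase 1: isolate the whole range once; with reuse == true the
   * pageblocks are additionally marked with PB_isolate_skip. */
  ret = start_isolate_page_range(start_pfn, end_pfn,
                                 MIGRATE_MOVABLE, true, true);
  if (ret)
          return ret;
  /* Phase 2: offline one memblock at a time; set_migratetype_isolate()
   * now returns 0 instead of -EBUSY for a pageblock that carries
   * PB_isolate_skip, so the isolation from phase 1 is reused. */
  for (pfn = start_pfn; pfn < end_pfn; pfn += block_nr_pages)
          __offline_pages(pfn, pfn + block_nr_pages);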
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
include/linux/page-isolation.h | 4 ++--
include/linux/pageblock-flags.h | 2 ++
mm/memory_hotplug.c | 6 +++---
mm/page_alloc.c | 4 ++--
mm/page_isolation.c | 28 +++++++++++++++++++++++-----
5 files changed, 32 insertions(+), 12 deletions(-)
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 4ae347c..dcc2bd1 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -47,7 +47,7 @@ int move_freepages_block(struct zone *zone, struct page *page,
*/
int
start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
- unsigned migratetype, bool skip_hwpoisoned_pages);
+ unsigned int migratetype, bool skip_hwpoisoned_pages, bool reuse);
/*
* Changes MIGRATE_ISOLATE to MIGRATE_MOVABLE.
@@ -55,7 +55,7 @@ start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
*/
int
undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
- unsigned migratetype);
+ unsigned int migratetype, bool reuse);
/*
* Test all pages in [start_pfn, end_pfn) are isolated or not.
diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
index 9132c5c..80c5341 100644
--- a/include/linux/pageblock-flags.h
+++ b/include/linux/pageblock-flags.h
@@ -31,6 +31,8 @@ enum pageblock_bits {
PB_migrate_end = PB_migrate + 3 - 1,
/* 3 bits required for migrate types */
PB_migrate_skip,/* If set the block is skipped by compaction */
+ PB_isolate_skip = PB_migrate_skip,
+ /* isolation and compaction never run concurrently */
/*
* Assume the bits will always align on a word. If this assumption
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 9eea6e8..228de4d 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1616,7 +1616,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
/* set above range as isolated */
ret = start_isolate_page_range(start_pfn, end_pfn,
- MIGRATE_MOVABLE, true);
+ MIGRATE_MOVABLE, true, true);
if (ret)
return ret;
@@ -1662,7 +1662,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
We cannot do rollback at this point. */
offline_isolated_pages(start_pfn, end_pfn);
/* reset pagetype flags and makes migrate type to be MOVABLE */
- undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+ undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE, true);
/* removal success */
adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
zone->present_pages -= offlined_pages;
@@ -1697,7 +1697,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
((unsigned long long) end_pfn << PAGE_SHIFT) - 1);
memory_notify(MEM_CANCEL_OFFLINE, &arg);
/* pushback to free area */
- undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+ undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE, true);
return ret;
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 05e983f..a0ae259 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7882,7 +7882,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
ret = start_isolate_page_range(pfn_max_align_down(start),
pfn_max_align_up(end), migratetype,
- false);
+ false, false);
if (ret)
return ret;
@@ -7967,7 +7967,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
done:
undo_isolate_page_range(pfn_max_align_down(start),
- pfn_max_align_up(end), migratetype);
+ pfn_max_align_up(end), migratetype, false);
return ret;
}
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 43e0856..36858ab 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -15,8 +15,18 @@
#define CREATE_TRACE_POINTS
#include <trace/events/page_isolation.h>
+#define get_pageblock_isolate_skip(page) \
+ get_pageblock_flags_group(page, PB_isolate_skip, \
+ PB_isolate_skip)
+#define clear_pageblock_isolate_skip(page) \
+ set_pageblock_flags_group(page, 0, PB_isolate_skip, \
+ PB_isolate_skip)
+#define set_pageblock_isolate_skip(page) \
+ set_pageblock_flags_group(page, 1, PB_isolate_skip, \
+ PB_isolate_skip)
+
static int set_migratetype_isolate(struct page *page, int migratetype,
- bool skip_hwpoisoned_pages)
+ bool skip_hwpoisoned_pages, bool reuse)
{
struct zone *zone;
unsigned long flags, pfn;
@@ -33,8 +43,11 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
* If it is already set, then someone else must have raced and
* set it before us. Return -EBUSY
*/
- if (is_migrate_isolate_page(page))
+ if (is_migrate_isolate_page(page)) {
+ if (reuse && get_pageblock_isolate_skip(page))
+ ret = 0;
goto out;
+ }
pfn = page_to_pfn(page);
arg.start_pfn = pfn;
@@ -75,6 +88,8 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
int mt = get_pageblock_migratetype(page);
set_pageblock_migratetype(page, MIGRATE_ISOLATE);
+ if (reuse)
+ set_pageblock_isolate_skip(page);
zone->nr_isolate_pageblock++;
nr_pages = move_freepages_block(zone, page, MIGRATE_ISOLATE,
NULL);
@@ -185,7 +200,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
* prevents two threads from simultaneously working on overlapping ranges.
*/
int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
- unsigned migratetype, bool skip_hwpoisoned_pages)
+ unsigned int migratetype, bool skip_hwpoisoned_pages, bool reuse)
{
unsigned long pfn;
unsigned long undo_pfn;
@@ -199,7 +214,8 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
pfn += pageblock_nr_pages) {
page = __first_valid_page(pfn, pageblock_nr_pages);
if (page &&
- set_migratetype_isolate(page, migratetype, skip_hwpoisoned_pages)) {
+ set_migratetype_isolate(page, migratetype,
+ skip_hwpoisoned_pages, reuse)) {
undo_pfn = pfn;
goto undo;
}
@@ -222,7 +238,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
* Make isolated pages available again.
*/
int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
- unsigned migratetype)
+ unsigned int migratetype, bool reuse)
{
unsigned long pfn;
struct page *page;
@@ -236,6 +252,8 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
page = __first_valid_page(pfn, pageblock_nr_pages);
if (!page || !is_migrate_isolate_page(page))
continue;
+ if (reuse)
+ clear_pageblock_isolate_skip(page);
unset_migratetype_isolate(page, migratetype);
}
return 0;
--
2.7.4
* [PATCH 2/3] drivers/base/memory: introduce a new state 'isolate' for memblock
2018-09-19 3:17 [PATCH 0/3] introduce a new state 'isolate' for memblock to split the isolation and migration steps Pingfan Liu
2018-09-19 3:17 ` [PATCH 1/3] mm/isolation: separate the isolation and migration ops in offline memblock Pingfan Liu
@ 2018-09-19 3:17 ` Pingfan Liu
2018-09-19 6:49 ` kbuild test robot
2018-09-19 3:17 ` [PATCH 3/3] drivers/base/node: create a partial offline hint under each node Pingfan Liu
2 siblings, 1 reply; 6+ messages in thread
From: Pingfan Liu @ 2018-09-19 3:17 UTC (permalink / raw)
To: linux-mm
Cc: Pingfan Liu, Andrew Morton, KAMEZAWA Hiroyuki, Mel Gorman,
Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko, Bharata B Rao,
Dan Williams, H. Peter Anvin, Kirill A. Shutemov
Currently, pages are offlined in units of memblock, and normally it is
done one memblock at a time. If there is only one NUMA node, the
destination pages for the migration may come from the very next memblock
to be offlined, which wastes time during memory offline. On a system with
multiple NUMA nodes, if only part of the memory on a node is replaced and
the migration destination pages are allocated from the local node (which
is what [3/3] enables), the same issue arises.
This patch suggests introducing a new state, named 'isolate'; the state
transition can be online -> isolate and back again, and the offline can
then proceed directly from the isolated state. Another slight benefit of
the 'isolated' state is that no further allocations are served from the
memblock, which keeps potentially unmovable pages from being allocated
from it again during what may be a long wait before it is offlined.
After this patch, the suggested sequence of operations to offline pages
looks like:
  for i in {s..e}; do echo isolate > memory$i/state; done
  for i in {s..e}; do echo offline > memory$i/state; done
Since this patch does not change the original offline path,
  for i in {s..e}; do echo offline > memory$i/state; done
on its own still works.
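The reverse transition handled by this patch can be used to hand the
memblocks back to the allocator without offlining them (a usage sketch;
the block range is illustrative):
  for i in {s..e}; do echo unisolate > memory$i/state; done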
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
drivers/base/memory.c | 31 ++++++++++++++++++++++++++++++-
include/linux/memory.h | 1 +
2 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index c8a1cb0..3b714be 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -19,6 +19,7 @@
#include <linux/memory.h>
#include <linux/memory_hotplug.h>
#include <linux/mm.h>
+#include <linux/page-isolation.h>
#include <linux/mutex.h>
#include <linux/stat.h>
#include <linux/slab.h>
@@ -166,6 +167,9 @@ static ssize_t show_mem_state(struct device *dev,
case MEM_GOING_OFFLINE:
len = sprintf(buf, "going-offline\n");
break;
+ case MEM_ISOLATED:
+ len = sprintf(buf, "isolated\n");
+ break;
default:
len = sprintf(buf, "ERROR-UNKNOWN-%ld\n",
mem->state);
@@ -323,6 +327,9 @@ store_mem_state(struct device *dev,
{
struct memory_block *mem = to_memory_block(dev);
int ret, online_type;
+ int isolated = 0;
+ unsigned long start_pfn;
+ unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
ret = lock_device_hotplug_sysfs();
if (ret)
@@ -336,7 +343,13 @@ store_mem_state(struct device *dev,
online_type = MMOP_ONLINE_KEEP;
else if (sysfs_streq(buf, "offline"))
online_type = MMOP_OFFLINE;
- else {
+ else if (sysfs_streq(buf, "isolate")) {
+ isolated = 1;
+ goto memblock_isolated;
+ } else if (sysfs_streq(buf, "unisolate")) {
+ isolated = -1;
+ goto memblock_isolated;
+ } else {
ret = -EINVAL;
goto err;
}
@@ -366,6 +379,20 @@ store_mem_state(struct device *dev,
mem_hotplug_done();
err:
+memblock_isolated:
+ if (isolated == 1 && mem->state == MEM_ONLINE) {
+ start_pfn = section_nr_to_pfn(mem->start_section_nr);
+ ret = start_isolate_page_range(start_pfn, start_pfn + nr_pages,
+ MIGRATE_MOVABLE, true, true);
+ if (!ret)
+ mem->state = MEM_ISOLATED;
+ } else if (isolated == -1 && mem->state == MEM_ISOLATED) {
+ start_pfn = section_nr_to_pfn(mem->start_section_nr);
+ ret = undo_isolate_page_range(start_pfn, start_pfn + nr_pages,
+ MIGRATE_MOVABLE, true);
+ if (!ret)
+ mem->state = MEM_ONLINE;
+ }
unlock_device_hotplug();
if (ret < 0)
@@ -455,6 +482,7 @@ static DEVICE_ATTR(phys_index, 0444, show_mem_start_phys_index, NULL);
static DEVICE_ATTR(state, 0644, show_mem_state, store_mem_state);
static DEVICE_ATTR(phys_device, 0444, show_phys_device, NULL);
static DEVICE_ATTR(removable, 0444, show_mem_removable, NULL);
+//static DEVICE_ATTR(isolate, 0600, show_mem_isolate, store_mem_isolate);
/*
* Block size attribute stuff
@@ -631,6 +659,7 @@ static struct attribute *memory_memblk_attrs[] = {
#ifdef CONFIG_MEMORY_HOTREMOVE
&dev_attr_valid_zones.attr,
#endif
+ //&dev_attr_isolate.attr,
NULL
};
diff --git a/include/linux/memory.h b/include/linux/memory.h
index a6ddefc..e00f22c 100644
--- a/include/linux/memory.h
+++ b/include/linux/memory.h
@@ -47,6 +47,7 @@ int set_memory_block_size_order(unsigned int order);
#define MEM_GOING_ONLINE (1<<3)
#define MEM_CANCEL_ONLINE (1<<4)
#define MEM_CANCEL_OFFLINE (1<<5)
+#define MEM_ISOLATED (1<<6)
struct memory_notify {
unsigned long start_pfn;
--
2.7.4
* [PATCH 3/3] drivers/base/node: create a partial offline hint under each node
2018-09-19 3:17 [PATCH 0/3] introduce a new state 'isolate' for memblock to split the isolation and migration steps Pingfan Liu
2018-09-19 3:17 ` [PATCH 1/3] mm/isolation: separate the isolation and migration ops in offline memblock Pingfan Liu
2018-09-19 3:17 ` [PATCH 2/3] drivers/base/memory: introduce a new state 'isolate' for memblock Pingfan Liu
@ 2018-09-19 3:17 ` Pingfan Liu
2018-09-19 4:36 ` kbuild test robot
2 siblings, 1 reply; 6+ messages in thread
From: Pingfan Liu @ 2018-09-19 3:17 UTC (permalink / raw)
To: linux-mm
Cc: Pingfan Liu, Andrew Morton, KAMEZAWA Hiroyuki, Mel Gorman,
Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko, Bharata B Rao,
Dan Williams, H. Peter Anvin, Kirill A. Shutemov
When offlining memory there are two cases: first, all of the memblocks
under a node are offlined; second, only part of the memory under a node
is offlined and replaced. In the second case there is no need to allocate
the new pages from other nodes, which may incur extra NUMA faults later
to resolve the misplacement and puts unnecessary memory pressure on the
other nodes. This patch suggests introducing an interface,
/sys/../node/nodeX/partial_offline, to let the user decide where a new
page is allocated from, i.e. from the local node or from other nodes.
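A usage sketch, assuming the standard sysfs layout (the node and memory
block numbers are illustrative):
  # keep migration destinations on node 1 while replacing part of it
  echo 1 > /sys/devices/system/node/node1/partial_offline
  for i in {32..39}; do echo offline > /sys/devices/system/memory/memory$i/state; done
  # restore the default of preferring other nodes
  echo 0 > /sys/devices/system/node/node1/partial_offline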
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Pavel Tatashin <pasha.tatashin@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
drivers/base/node.c | 33 +++++++++++++++++++++++++++++++++
include/linux/mmzone.h | 1 +
mm/memory_hotplug.c | 31 +++++++++++++++++++------------
3 files changed, 53 insertions(+), 12 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 1ac4c36..64b0cb8 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -25,6 +25,36 @@ static struct bus_type node_subsys = {
.dev_name = "node",
};
+static ssize_t read_partial_offline(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ int nid = dev->id;
+ struct pglist_data *pgdat = NODE_DATA(nid);
+ ssize_t len = 0;
+
+ if (pgdat->partial_offline)
+ len = sprintf(buf, "1\n");
+ else
+ len = sprintf(buf, "0\n");
+
+ return len;
+}
+
+static ssize_t write_partial_offline(struct device *dev,
+ struct device_attribute *attr, const char *buf, size_t count)
+{
+ int nid = dev->id;
+ struct pglist_data *pgdat = NODE_DATA(nid);
+
+ if (sysfs_streq(buf, "1"))
+ pgdat->partial_offline = true;
+ else if (sysfs_streq(buf, "0"))
+ pgdat->partial_offline = false;
+ else
+ return -EINVAL;
+
+ return strlen(buf);
+}
static ssize_t node_read_cpumap(struct device *dev, bool list, char *buf)
{
@@ -56,6 +86,8 @@ static inline ssize_t node_read_cpulist(struct device *dev,
return node_read_cpumap(dev, true, buf);
}
+static DEVICE_ATTR(partial_offline, 0600, read_partial_offline,
+ write_partial_offline);
static DEVICE_ATTR(cpumap, S_IRUGO, node_read_cpumask, NULL);
static DEVICE_ATTR(cpulist, S_IRUGO, node_read_cpulist, NULL);
@@ -235,6 +267,7 @@ static struct attribute *node_dev_attrs[] = {
&dev_attr_numastat.attr,
&dev_attr_distance.attr,
&dev_attr_vmstat.attr,
+ &dev_attr_partial_offline.attr,
NULL
};
ATTRIBUTE_GROUPS(node_dev);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 1e22d96..80c44c8 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -722,6 +722,7 @@ typedef struct pglist_data {
/* Per-node vmstats */
struct per_cpu_nodestat __percpu *per_cpu_nodestats;
atomic_long_t vm_stat[NR_VM_NODE_STAT_ITEMS];
+ bool partial_offline;
} pg_data_t;
#define node_present_pages(nid) (NODE_DATA(nid)->node_present_pages)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 228de4d..3c66075 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1346,18 +1346,10 @@ static unsigned long scan_movable_pages(unsigned long start, unsigned long end)
static struct page *new_node_page(struct page *page, unsigned long private)
{
- int nid = page_to_nid(page);
- nodemask_t nmask = node_states[N_MEMORY];
-
- /*
- * try to allocate from a different node but reuse this node if there
- * are no other online nodes to be used (e.g. we are offlining a part
- * of the only existing node)
- */
- node_clear(nid, nmask);
- if (nodes_empty(nmask))
- node_set(nid, nmask);
+ nodemask_t nmask = *(nodemask_t *)private;
+ int nid;
+ nid = page_to_nid(page);
return new_page_nodemask(page, nid, &nmask);
}
@@ -1371,6 +1363,8 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
int not_managed = 0;
int ret = 0;
LIST_HEAD(source);
+ int nid;
+ nodemask_t nmask = node_states[N_MEMORY];
for (pfn = start_pfn; pfn < end_pfn && move_pages > 0; pfn++) {
if (!pfn_valid(pfn))
@@ -1430,8 +1424,21 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
goto out;
}
+ page = list_entry(source.next, struct page, lru);
+ nid = page_to_nid(page);
+ if (!NODE_DATA(nid)->partial_offline) {
+ /*
+ * try to allocate from a different node but reuse this
+ * node if there are no other online nodes to be used
+ * (e.g. we are offlining a part of the only existing
+ * node)
+ */
+ node_clear(nid, nmask);
+ if (nodes_empty(nmask))
+ node_set(nid, nmask);
+ }
/* Allocate a new page from the nearest neighbor node */
- ret = migrate_pages(&source, new_node_page, NULL, 0,
+ ret = migrate_pages(&source, new_node_page, NULL, &nmask,
MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
if (ret)
putback_movable_pages(&source);
--
2.7.4
* Re: [PATCH 3/3] drivers/base/node: create a partial offline hint under each node
2018-09-19 3:17 ` [PATCH 3/3] drivers/base/node: create a partial offline hint under each node Pingfan Liu
@ 2018-09-19 4:36 ` kbuild test robot
0 siblings, 0 replies; 6+ messages in thread
From: kbuild test robot @ 2018-09-19 4:36 UTC (permalink / raw)
To: Pingfan Liu
Cc: kbuild-all, linux-mm, Andrew Morton, KAMEZAWA Hiroyuki,
Mel Gorman, Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko,
Bharata B Rao, Dan Williams, H. Peter Anvin, Kirill A. Shutemov
Hi Pingfan,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on linus/master]
[also build test WARNING on v4.19-rc4 next-20180918]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Pingfan-Liu/introduce-a-new-state-isolate-for-memblock-to-split-the-isolation-and-migration-steps/20180919-112650
config: x86_64-randconfig-x018-201837 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-1) 7.3.0
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All warnings (new ones prefixed by >>):
mm/memory_hotplug.c: In function 'do_migrate_range':
>> mm/memory_hotplug.c:1442:53: warning: passing argument 4 of 'migrate_pages' makes integer from pointer without a cast [-Wint-conversion]
ret = migrate_pages(&source, new_node_page, NULL, &nmask,
^
In file included from mm/memory_hotplug.c:27:0:
include/linux/migrate.h:68:12: note: expected 'long unsigned int' but argument is of type 'nodemask_t * {aka struct <anonymous> *}'
extern int migrate_pages(struct list_head *l, new_page_t new, free_page_t free,
^~~~~~~~~~~~~
vim +/migrate_pages +1442 mm/memory_hotplug.c
1356
1357 #define NR_OFFLINE_AT_ONCE_PAGES (256)
1358 static int
1359 do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
1360 {
1361 unsigned long pfn;
1362 struct page *page;
1363 int move_pages = NR_OFFLINE_AT_ONCE_PAGES;
1364 int not_managed = 0;
1365 int ret = 0;
1366 LIST_HEAD(source);
1367 int nid;
1368 nodemask_t nmask = node_states[N_MEMORY];
1369
1370 for (pfn = start_pfn; pfn < end_pfn && move_pages > 0; pfn++) {
1371 if (!pfn_valid(pfn))
1372 continue;
1373 page = pfn_to_page(pfn);
1374
1375 if (PageHuge(page)) {
1376 struct page *head = compound_head(page);
1377 pfn = page_to_pfn(head) + (1<<compound_order(head)) - 1;
1378 if (compound_order(head) > PFN_SECTION_SHIFT) {
1379 ret = -EBUSY;
1380 break;
1381 }
1382 if (isolate_huge_page(page, &source))
1383 move_pages -= 1 << compound_order(head);
1384 continue;
1385 } else if (PageTransHuge(page))
1386 pfn = page_to_pfn(compound_head(page))
1387 + hpage_nr_pages(page) - 1;
1388
1389 if (!get_page_unless_zero(page))
1390 continue;
1391 /*
1392 * We can skip free pages. And we can deal with pages on
1393 * LRU and non-lru movable pages.
1394 */
1395 if (PageLRU(page))
1396 ret = isolate_lru_page(page);
1397 else
1398 ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
1399 if (!ret) { /* Success */
1400 put_page(page);
1401 list_add_tail(&page->lru, &source);
1402 move_pages--;
1403 if (!__PageMovable(page))
1404 inc_node_page_state(page, NR_ISOLATED_ANON +
1405 page_is_file_cache(page));
1406
1407 } else {
1408 #ifdef CONFIG_DEBUG_VM
1409 pr_alert("failed to isolate pfn %lx\n", pfn);
1410 dump_page(page, "isolation failed");
1411 #endif
1412 put_page(page);
1413 /* Because we don't have big zone->lock. we should
1414 check this again here. */
1415 if (page_count(page)) {
1416 not_managed++;
1417 ret = -EBUSY;
1418 break;
1419 }
1420 }
1421 }
1422 if (!list_empty(&source)) {
1423 if (not_managed) {
1424 putback_movable_pages(&source);
1425 goto out;
1426 }
1427
1428 page = list_entry(source.next, struct page, lru);
1429 nid = page_to_nid(page);
1430 if (!NODE_DATA(nid)->partial_offline) {
1431 /*
1432 * try to allocate from a different node but reuse this
1433 * node if there are no other online nodes to be used
1434 * (e.g. we are offlining a part of the only existing
1435 * node)
1436 */
1437 node_clear(nid, nmask);
1438 if (nodes_empty(nmask))
1439 node_set(nid, nmask);
1440 }
1441 /* Allocate a new page from the nearest neighbor node */
> 1442 ret = migrate_pages(&source, new_node_page, NULL, &nmask,
1443 MIGRATE_SYNC, MR_MEMORY_HOTPLUG);
1444 if (ret)
1445 putback_movable_pages(&source);
1446 }
1447 out:
1448 return ret;
1449 }
1450
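A minimal fix sketch for this warning (not a posted fix), keeping the
patch's approach of passing the nodemask through 'private', would be a
cast at the call site:

	/* pass the nodemask as the opaque 'unsigned long private' value */
	ret = migrate_pages(&source, new_node_page, NULL,
			    (unsigned long)&nmask,
			    MIGRATE_SYNC, MR_MEMORY_HOTPLUG);

new_node_page() already converts it back with '*(nodemask_t *)private'.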
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation
* Re: [PATCH 2/3] drivers/base/memory: introduce a new state 'isolate' for memblock
2018-09-19 3:17 ` [PATCH 2/3] drivers/base/memory: introduce a new state 'isolate' for memblock Pingfan Liu
@ 2018-09-19 6:49 ` kbuild test robot
0 siblings, 0 replies; 6+ messages in thread
From: kbuild test robot @ 2018-09-19 6:49 UTC (permalink / raw)
To: Pingfan Liu
Cc: kbuild-all, linux-mm, Andrew Morton, KAMEZAWA Hiroyuki,
Mel Gorman, Greg Kroah-Hartman, Pavel Tatashin, Michal Hocko,
Bharata B Rao, Dan Williams, H. Peter Anvin, Kirill A. Shutemov
Hi Pingfan,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on linus/master]
[also build test ERROR on v4.19-rc4 next-20180918]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
url: https://github.com/0day-ci/linux/commits/Pingfan-Liu/introduce-a-new-state-isolate-for-memblock-to-split-the-isolation-and-migration-steps/20180919-112650
config: x86_64-randconfig-s0-09191204 (attached as .config)
compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64
All errors (new ones prefixed by >>):
drivers/base/memory.o: In function `store_mem_state':
>> drivers/base/memory.c:385: undefined reference to `start_isolate_page_range'
>> drivers/base/memory.c:391: undefined reference to `undo_isolate_page_range'
vim +385 drivers/base/memory.c
323
324 static ssize_t
325 store_mem_state(struct device *dev,
326 struct device_attribute *attr, const char *buf, size_t count)
327 {
328 struct memory_block *mem = to_memory_block(dev);
329 int ret, online_type;
330 int isolated = 0;
331 unsigned long start_pfn;
332 unsigned long nr_pages = PAGES_PER_SECTION * sections_per_block;
333
334 ret = lock_device_hotplug_sysfs();
335 if (ret)
336 return ret;
337
338 if (sysfs_streq(buf, "online_kernel"))
339 online_type = MMOP_ONLINE_KERNEL;
340 else if (sysfs_streq(buf, "online_movable"))
341 online_type = MMOP_ONLINE_MOVABLE;
342 else if (sysfs_streq(buf, "online"))
343 online_type = MMOP_ONLINE_KEEP;
344 else if (sysfs_streq(buf, "offline"))
345 online_type = MMOP_OFFLINE;
346 else if (sysfs_streq(buf, "isolate")) {
347 isolated = 1;
348 goto memblock_isolated;
349 } else if (sysfs_streq(buf, "unisolate")) {
350 isolated = -1;
351 goto memblock_isolated;
352 } else {
353 ret = -EINVAL;
354 goto err;
355 }
356
357 /*
358 * Memory hotplug needs to hold mem_hotplug_begin() for probe to find
359 * the correct memory block to online before doing device_online(dev),
360 * which will take dev->mutex. Take the lock early to prevent an
361 * inversion, memory_subsys_online() callbacks will be implemented by
362 * assuming it's already protected.
363 */
364 mem_hotplug_begin();
365
366 switch (online_type) {
367 case MMOP_ONLINE_KERNEL:
368 case MMOP_ONLINE_MOVABLE:
369 case MMOP_ONLINE_KEEP:
370 mem->online_type = online_type;
371 ret = device_online(&mem->dev);
372 break;
373 case MMOP_OFFLINE:
374 ret = device_offline(&mem->dev);
375 break;
376 default:
377 ret = -EINVAL; /* should never happen */
378 }
379
380 mem_hotplug_done();
381 err:
382 memblock_isolated:
383 if (isolated == 1 && mem->state == MEM_ONLINE) {
384 start_pfn = section_nr_to_pfn(mem->start_section_nr);
> 385 ret = start_isolate_page_range(start_pfn, start_pfn + nr_pages,
386 MIGRATE_MOVABLE, true, true);
387 if (!ret)
388 mem->state = MEM_ISOLATED;
389 } else if (isolated == -1 && mem->state == MEM_ISOLATED) {
390 start_pfn = section_nr_to_pfn(mem->start_section_nr);
> 391 ret = undo_isolate_page_range(start_pfn, start_pfn + nr_pages,
392 MIGRATE_MOVABLE, true);
393 if (!ret)
394 mem->state = MEM_ONLINE;
395 }
396 unlock_device_hotplug();
397
398 if (ret < 0)
399 return ret;
400 if (ret)
401 return -EINVAL;
402
403 return count;
404 }
405
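The undefined references arise because start_isolate_page_range() and
undo_isolate_page_range() are defined in mm/page_isolation.c, which is
built only when CONFIG_MEMORY_ISOLATION is set, an option this randconfig
evidently lacks. A minimal fix sketch (not a posted fix) is to compile
the new state handling conditionally:

	#ifdef CONFIG_MEMORY_ISOLATION
		else if (sysfs_streq(buf, "isolate")) {
			isolated = 1;
			goto memblock_isolated;
		} else if (sysfs_streq(buf, "unisolate")) {
			isolated = -1;
			goto memblock_isolated;
		}
	#endif
		...
	#ifdef CONFIG_MEMORY_ISOLATION
	memblock_isolated:
		/* isolate/unisolate handling as added by this patch */
	#endif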
---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation