* [PATCH v4 0/6] Per process reclaim
@ 2013-05-08 23:38 Minchan Kim
2013-05-08 23:38 ` [PATCH v4 1/6] mm: " Minchan Kim
` (5 more replies)
0 siblings, 6 replies; 7+ messages in thread
From: Minchan Kim @ 2013-05-08 23:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Michael Kerrisk, Rik van Riel, Dave Hansen, Namhyung Kim,
linux-mm, linux-kernel, Minchan Kim
These days, there are many platforms available in the embedded market,
and they are smarter than the kernel, which has very limited information
about the working set, so they want to be involved in memory management
more heavily, as with Android's low memory killer and ashmem or the many
recent low memory notifiers (there were several attempts from various
companies: Nokia, Samsung, Linaro, Google ChromeOS, Red Hat).
One simple scenario for userspace intelligence is that the platform
manages tasks as foreground and background, so it would be better to
reclaim the background tasks' pages for the sake of end-user
*responsiveness*, even though they include frequently referenced pages.
Patch[1] adds a new knob, "reclaim", under /proc/<pid>/ so the task manager
can reclaim pages from any target process at any time. It gives the
platform another method for using memory efficiently.
It can avoid killing processes to get free memory, which was a really
terrible experience: I lost my best-ever game score
after I switched to a phone call while I was enjoying the game.
Reclaim file-backed pages only.
echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
echo anon > /proc/PID/reclaim
Reclaim all pages
echo all > /proc/PID/reclaim
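For a task manager written in C, the three commands above could be issued roughly as follows. This is only a userspace sketch, not part of the series: the helper names (reclaim_knob_path, reclaim_type_valid, reclaim_write_type) are hypothetical, and it assumes a kernel built with CONFIG_PROCESS_RECLAIM and a caller with write permission on the knob.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: build "/proc/<pid>/reclaim" into path. */
static int reclaim_knob_path(char *path, size_t len, pid_t pid)
{
	int n;

	assert(path && len > 0);
	n = snprintf(path, len, "/proc/%ld/reclaim", (long)pid);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: accept only the keywords that patch[1] parses. */
static int reclaim_type_valid(const char *type)
{
	return !strcmp(type, "file") || !strcmp(type, "anon") ||
	       !strcmp(type, "all");
}

/* Write one keyword to the knob at path. Returns 0 on success, -1 on error. */
static int reclaim_write_type(const char *path, const char *type)
{
	FILE *f;
	int ok;

	if (!reclaim_type_valid(type))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;	/* e.g. no such process, or knob not compiled in */
	ok = fputs(type, f) >= 0;
	if (fclose(f) != 0)	/* the write reaches the kernel on flush */
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would build the path for a background task's pid with reclaim_knob_path() and then call reclaim_write_type(path, "anon"), for instance.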
Some pages could be shared by several processes (e.g. libc).
In that case, it is too bad to reclaim them from the beginning.
Patch[4] causes the VM to keep them in memory until the last task
tries to reclaim them, so shared pages will be reclaimed only if
all tasks that map them have swapped them out.
Another requirement is per address space reclaim (by Michael Kerrisk).
WebKit1 uses one address space for handling multiple tabs.
IOW, it uses a *one* process model, so all tabs share the address space
of the process. In such a scenario, per-process reclaim is rather
coarse-grained, so patch[5] supports more fine-grained reclaim
that can target an address range of the process.
To reclaim a target range, use the following format.
echo [addr] [size-byte] > /proc/pid/reclaim
* Changelog from v3
* Rebased on next-20130508
* Minor change
* Changelog from v2
* Use memparse - Namhyung Kim
* Add Acked-by
* Changelog from v1
* Change reclaim knob interface - Dave Hansen
* proc.txt document change - Rob Landley
Minchan Kim (6):
[1] mm: Per process reclaim
[2] mm: make shrink_page_list with pages work from multiple zones
[3] mm: Remove shrink_page
[4] mm: Enhance per process reclaim to consider shared pages
[5] mm: Support address range reclaim
[6] add documentation about reclaim knob on proc.txt
Documentation/filesystems/proc.txt | 20 +++++
fs/proc/base.c | 3 +
fs/proc/internal.h | 1 +
fs/proc/task_mmu.c | 176 +++++++++++++++++++++++++++++++++++++
include/linux/ksm.h | 6 +-
include/linux/rmap.h | 10 ++-
mm/Kconfig | 16 ++++
mm/ksm.c | 9 +-
mm/memory-failure.c | 2 +-
mm/migrate.c | 6 +-
mm/rmap.c | 57 ++++++++----
mm/vmscan.c | 57 +++++++++++-
12 files changed, 336 insertions(+), 27 deletions(-)
--
1.8.2.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* [PATCH v4 1/6] mm: Per process reclaim
2013-05-08 23:38 [PATCH v4 0/6] Per process reclaim Minchan Kim
@ 2013-05-08 23:38 ` Minchan Kim
2013-05-08 23:38 ` [PATCH v4 2/6] mm: make shrink_page_list with pages work from multiple zones Minchan Kim
` (4 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Minchan Kim @ 2013-05-08 23:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Michael Kerrisk, Rik van Riel, Dave Hansen, Namhyung Kim,
linux-mm, linux-kernel, Minchan Kim
These days, there are many platforms available in the embedded market,
and they are smarter than the kernel, which has very limited information
about the working set, so they want to be involved in memory management
more heavily, as with Android's low memory killer and ashmem or the many
recent low memory notifiers (there were several attempts from various
companies: Nokia, Samsung, Linaro, Google ChromeOS, Red Hat).
One simple scenario for userspace intelligence is that the platform
manages tasks as foreground and background, so it would be better to
reclaim the background tasks' pages for the sake of end-user
*responsiveness*, even though they include frequently referenced pages.
This patch adds a new knob, "reclaim", under /proc/<pid>/ so the task manager
can reclaim pages from any target process at any time. It gives the
platform another method for using memory efficiently.
It can avoid killing processes to get free memory, which was a really
terrible experience: I lost my best-ever game score
after I switched to a phone call while I was enjoying the game.
Reclaim file-backed pages only.
echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
echo anon > /proc/PID/reclaim
Reclaim all pages
echo all > /proc/PID/reclaim
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
fs/proc/base.c | 3 ++
fs/proc/internal.h | 1 +
fs/proc/task_mmu.c | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/rmap.h | 4 ++
mm/Kconfig | 13 ++++++
mm/vmscan.c | 59 +++++++++++++++++++++++++
6 files changed, 201 insertions(+)
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 4d3ebd6..9286b43 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2669,6 +2669,9 @@ static const struct pid_entry tgid_base_stuff[] = {
REG("mounts", S_IRUGO, proc_mounts_operations),
REG("mountinfo", S_IRUGO, proc_mountinfo_operations),
REG("mountstats", S_IRUSR, proc_mountstats_operations),
+#ifdef CONFIG_PROCESS_RECLAIM
+ REG("reclaim", S_IWUSR, proc_reclaim_operations),
+#endif
#ifdef CONFIG_PROC_PAGE_MONITOR
REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
REG("smaps", S_IRUGO, proc_pid_smaps_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index f417d43..0c29375 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -204,6 +204,7 @@ struct pde_opener {
};
extern const struct inode_operations proc_pid_link_inode_operations;
+extern const struct file_operations proc_reclaim_operations;
extern void proc_init_inodecache(void);
extern struct inode *proc_get_inode(struct super_block *, struct proc_dir_entry *);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3240a49..cd6bb70 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -11,6 +11,7 @@
#include <linux/rmap.h>
#include <linux/swap.h>
#include <linux/swapops.h>
+#include <linux/mm_inline.h>
#include <asm/elf.h>
#include <asm/uaccess.h>
@@ -1182,6 +1183,126 @@ const struct file_operations proc_pagemap2_operations = {
};
#endif /* CONFIG_PROC_PAGE_MONITOR */
+#ifdef CONFIG_PROCESS_RECLAIM
+static int reclaim_pte_range(pmd_t *pmd, unsigned long addr,
+ unsigned long end, struct mm_walk *walk)
+{
+ struct vm_area_struct *vma = walk->private;
+ pte_t *pte, ptent;
+ spinlock_t *ptl;
+ struct page *page;
+ LIST_HEAD(page_list);
+ int isolated;
+
+ split_huge_page_pmd(vma, addr, pmd);
+ if (pmd_trans_unstable(pmd))
+ return 0;
+cont:
+ isolated = 0;
+ pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+ for (; addr != end; pte++, addr += PAGE_SIZE) {
+ ptent = *pte;
+ if (!pte_present(ptent))
+ continue;
+
+ page = vm_normal_page(vma, addr, ptent);
+ if (!page)
+ continue;
+
+ if (isolate_lru_page(page))
+ continue;
+
+ list_add(&page->lru, &page_list);
+ inc_zone_page_state(page, NR_ISOLATED_ANON +
+ page_is_file_cache(page));
+ isolated++;
+ if (isolated >= SWAP_CLUSTER_MAX)
+ break;
+ }
+ pte_unmap_unlock(pte - 1, ptl);
+ reclaim_pages_from_list(&page_list);
+ if (addr != end)
+ goto cont;
+
+ cond_resched();
+ return 0;
+}
+
+enum reclaim_type {
+ RECLAIM_FILE,
+ RECLAIM_ANON,
+ RECLAIM_ALL,
+ RECLAIM_RANGE,
+};
+
+static ssize_t reclaim_write(struct file *file, const char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ struct task_struct *task;
+ char buffer[PROC_NUMBUF];
+ struct mm_struct *mm;
+ struct vm_area_struct *vma;
+ enum reclaim_type type;
+ char *type_buf;
+
+ memset(buffer, 0, sizeof(buffer));
+ if (count > sizeof(buffer) - 1)
+ count = sizeof(buffer) - 1;
+
+ if (copy_from_user(buffer, buf, count))
+ return -EFAULT;
+
+ type_buf = strstrip(buffer);
+ if (!strcmp(type_buf, "file"))
+ type = RECLAIM_FILE;
+ else if (!strcmp(type_buf, "anon"))
+ type = RECLAIM_ANON;
+ else if (!strcmp(type_buf, "all"))
+ type = RECLAIM_ALL;
+ else
+ return -EINVAL;
+
+ task = get_proc_task(file->f_path.dentry->d_inode);
+ if (!task)
+ return -ESRCH;
+
+ mm = get_task_mm(task);
+ if (mm) {
+ struct mm_walk reclaim_walk = {
+ .pmd_entry = reclaim_pte_range,
+ .mm = mm,
+ };
+
+ down_read(&mm->mmap_sem);
+ for (vma = mm->mmap; vma; vma = vma->vm_next) {
+ reclaim_walk.private = vma;
+
+ if (is_vm_hugetlb_page(vma))
+ continue;
+
+ if (type == RECLAIM_ANON && vma->vm_file)
+ continue;
+ if (type == RECLAIM_FILE && !vma->vm_file)
+ continue;
+
+ walk_page_range(vma->vm_start, vma->vm_end,
+ &reclaim_walk);
+ }
+ flush_tlb_mm(mm);
+ up_read(&mm->mmap_sem);
+ mmput(mm);
+ }
+ put_task_struct(task);
+
+ return count;
+}
+
+const struct file_operations proc_reclaim_operations = {
+ .write = reclaim_write,
+ .llseek = noop_llseek,
+};
+#endif
+
#ifdef CONFIG_NUMA
struct numa_maps {
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 6dacb93..a24e34e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -10,6 +10,10 @@
#include <linux/rwsem.h>
#include <linux/memcontrol.h>
+extern int isolate_lru_page(struct page *page);
+extern void putback_lru_page(struct page *page);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+
/*
* The anon_vma heads a list of private "related" vmas, to scan if
* an anonymous page pointing to this anon_vma needs to be unmapped:
diff --git a/mm/Kconfig b/mm/Kconfig
index b0a7dad..d119b5b 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -489,3 +489,16 @@ config MEM_SOFT_DIRTY
it can be cleared by hands.
See Documentation/vm/soft-dirty.txt for more details.
+
+config PROCESS_RECLAIM
+ bool "Enable process reclaim"
+ depends on PROC_FS
+ default n
+ help
+ It allows to reclaim pages of the process by /proc/pid/reclaim.
+
+ (echo file > /proc/PID/reclaim) reclaims file-backed pages only.
+ (echo anon > /proc/PID/reclaim) reclaims anonymous pages only.
+ (echo all > /proc/PID/reclaim) reclaims all pages.
+
+ Any other value is ignored.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index fa6a853..6b7cba3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -992,6 +992,65 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
return ret;
}
+#ifdef CONFIG_PROCESS_RECLAIM
+static unsigned long shrink_page(struct page *page,
+ struct zone *zone,
+ struct scan_control *sc,
+ enum ttu_flags ttu_flags,
+ unsigned long *ret_nr_dirty,
+ unsigned long *ret_nr_writeback,
+ bool force_reclaim,
+ struct list_head *ret_pages)
+{
+ int reclaimed;
+ LIST_HEAD(page_list);
+ list_add(&page->lru, &page_list);
+
+ reclaimed = shrink_page_list(&page_list, zone, sc, ttu_flags,
+ ret_nr_dirty, ret_nr_writeback,
+ force_reclaim);
+ if (!reclaimed)
+ list_splice(&page_list, ret_pages);
+
+ return reclaimed;
+}
+
+unsigned long reclaim_pages_from_list(struct list_head *page_list)
+{
+ struct scan_control sc = {
+ .gfp_mask = GFP_KERNEL,
+ .priority = DEF_PRIORITY,
+ .may_unmap = 1,
+ .may_swap = 1,
+ };
+
+ LIST_HEAD(ret_pages);
+ struct page *page;
+ unsigned long dummy1, dummy2;
+ unsigned long nr_reclaimed = 0;
+
+ while (!list_empty(page_list)) {
+ page = lru_to_page(page_list);
+ list_del(&page->lru);
+
+ ClearPageActive(page);
+ nr_reclaimed += shrink_page(page, page_zone(page), &sc,
+ TTU_UNMAP|TTU_IGNORE_ACCESS,
+ &dummy1, &dummy2, true, &ret_pages);
+ }
+
+ while (!list_empty(&ret_pages)) {
+ page = lru_to_page(&ret_pages);
+ list_del(&page->lru);
+ dec_zone_page_state(page, NR_ISOLATED_ANON +
+ page_is_file_cache(page));
+ putback_lru_page(page);
+ }
+
+ return nr_reclaimed;
+}
+#endif
+
/*
* Attempt to remove the specified page from its LRU. Only take this page
* if it is of the appropriate PageActive status. Pages which are being
--
1.8.2.1
* [PATCH v4 2/6] mm: make shrink_page_list with pages work from multiple zones
2013-05-08 23:38 [PATCH v4 0/6] Per process reclaim Minchan Kim
2013-05-08 23:38 ` [PATCH v4 1/6] mm: " Minchan Kim
@ 2013-05-08 23:38 ` Minchan Kim
2013-05-08 23:38 ` [PATCH v4 3/6] mm: Remove shrink_page Minchan Kim
` (3 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Minchan Kim @ 2013-05-08 23:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Michael Kerrisk, Rik van Riel, Dave Hansen, Namhyung Kim,
linux-mm, linux-kernel, Minchan Kim
shrink_page_list expects all pages to come from the same zone,
which is too limiting.
This patch removes that dependency so the next patch can use
shrink_page_list with pages from multiple zones.
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/vmscan.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6b7cba3..a1fb526 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -706,7 +706,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
goto keep;
VM_BUG_ON(PageActive(page));
- VM_BUG_ON(page_zone(page) != zone);
+ if (zone)
+ VM_BUG_ON(page_zone(page) != zone);
sc->nr_scanned++;
@@ -952,7 +953,7 @@ keep:
* back off and wait for congestion to clear because further reclaim
* will encounter the same problem
*/
- if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc))
+ if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc) && zone)
zone_set_flag(zone, ZONE_CONGESTED);
free_hot_cold_page_list(&free_pages, 1);
--
1.8.2.1
* [PATCH v4 3/6] mm: Remove shrink_page
2013-05-08 23:38 [PATCH v4 0/6] Per process reclaim Minchan Kim
2013-05-08 23:38 ` [PATCH v4 1/6] mm: " Minchan Kim
2013-05-08 23:38 ` [PATCH v4 2/6] mm: make shrink_page_list with pages work from multiple zones Minchan Kim
@ 2013-05-08 23:38 ` Minchan Kim
2013-05-08 23:39 ` [PATCH v4 4/6] mm: Enhance per process reclaim to consider shared pages Minchan Kim
` (2 subsequent siblings)
5 siblings, 0 replies; 7+ messages in thread
From: Minchan Kim @ 2013-05-08 23:38 UTC (permalink / raw)
To: Andrew Morton
Cc: Michael Kerrisk, Rik van Riel, Dave Hansen, Namhyung Kim,
linux-mm, linux-kernel, Minchan Kim
With the previous patch, shrink_page_list can handle pages from
multiple zones, so let's remove shrink_page.
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
mm/vmscan.c | 47 ++++++++++++++---------------------------------
1 file changed, 14 insertions(+), 33 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a1fb526..0d4df03 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -924,6 +924,13 @@ free_it:
* appear not as the counts should be low
*/
list_add(&page->lru, &free_pages);
+ /*
+ * If pagelist are from multiple zones, we should decrease
+ * NR_ISOLATED_ANON + x on freed pages in here.
+ */
+ if (!zone)
+ dec_zone_page_state(page, NR_ISOLATED_ANON +
+ page_is_file_cache(page));
continue;
cull_mlocked:
@@ -994,28 +1001,6 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
}
#ifdef CONFIG_PROCESS_RECLAIM
-static unsigned long shrink_page(struct page *page,
- struct zone *zone,
- struct scan_control *sc,
- enum ttu_flags ttu_flags,
- unsigned long *ret_nr_dirty,
- unsigned long *ret_nr_writeback,
- bool force_reclaim,
- struct list_head *ret_pages)
-{
- int reclaimed;
- LIST_HEAD(page_list);
- list_add(&page->lru, &page_list);
-
- reclaimed = shrink_page_list(&page_list, zone, sc, ttu_flags,
- ret_nr_dirty, ret_nr_writeback,
- force_reclaim);
- if (!reclaimed)
- list_splice(&page_list, ret_pages);
-
- return reclaimed;
-}
-
unsigned long reclaim_pages_from_list(struct list_head *page_list)
{
struct scan_control sc = {
@@ -1025,23 +1010,19 @@ unsigned long reclaim_pages_from_list(struct list_head *page_list)
.may_swap = 1,
};
- LIST_HEAD(ret_pages);
+ unsigned long nr_reclaimed;
struct page *page;
unsigned long dummy1, dummy2;
- unsigned long nr_reclaimed = 0;
-
- while (!list_empty(page_list)) {
- page = lru_to_page(page_list);
- list_del(&page->lru);
+ list_for_each_entry(page, page_list, lru)
ClearPageActive(page);
- nr_reclaimed += shrink_page(page, page_zone(page), &sc,
+
+ nr_reclaimed = shrink_page_list(page_list, NULL, &sc,
TTU_UNMAP|TTU_IGNORE_ACCESS,
- &dummy1, &dummy2, true, &ret_pages);
- }
+ &dummy1, &dummy2, true);
- while (!list_empty(&ret_pages)) {
- page = lru_to_page(&ret_pages);
+ while (!list_empty(page_list)) {
+ page = lru_to_page(page_list);
list_del(&page->lru);
dec_zone_page_state(page, NR_ISOLATED_ANON +
page_is_file_cache(page));
--
1.8.2.1
* [PATCH v4 4/6] mm: Enhance per process reclaim to consider shared pages
2013-05-08 23:38 [PATCH v4 0/6] Per process reclaim Minchan Kim
` (2 preceding siblings ...)
2013-05-08 23:38 ` [PATCH v4 3/6] mm: Remove shrink_page Minchan Kim
@ 2013-05-08 23:39 ` Minchan Kim
2013-05-08 23:39 ` [PATCH v4 5/6] mm: Support address range reclaim Minchan Kim
2013-05-08 23:39 ` [PATCH v4 6/6] add documentation about reclaim knob on proc.txt Minchan Kim
5 siblings, 0 replies; 7+ messages in thread
From: Minchan Kim @ 2013-05-08 23:39 UTC (permalink / raw)
To: Andrew Morton
Cc: Michael Kerrisk, Rik van Riel, Dave Hansen, Namhyung Kim,
linux-mm, linux-kernel, Minchan Kim, Sangseok Lee
Some pages could be shared by several processes (e.g. libc).
In that case, it is too bad to reclaim them from the beginning.
This patch causes the VM to keep them in memory until the last task
tries to reclaim them, so shared pages will be reclaimed only if
all tasks that map them have swapped them out.
This feature doesn't handle non-linear mappings on ramfs because
doing so is very time-consuming, does not guarantee reclaim, and is
not common.
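The intended behaviour for shared pages can be pictured with a toy userspace model. This is only an illustration under simplifying assumptions, not the kernel code: struct toy_page and toy_reclaim_from_task are made-up names, and the single counter stands in for the page's real mapping state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: only the number of mappers matters here. */
struct toy_page {
	int mapcount;	/* number of tasks that still map the page */
	bool freed;	/* set once the page has really been reclaimed */
};

/*
 * Model one task asking to reclaim the page: drop only that task's
 * mapping, and free the page only when no mapping is left.
 * Returns true if the page was actually reclaimed by this call.
 */
static bool toy_reclaim_from_task(struct toy_page *page)
{
	assert(page->mapcount > 0 && !page->freed);
	page->mapcount--;
	if (page->mapcount == 0)
		page->freed = true;
	return page->freed;
}
```

With three tasks mapping the same page, the first two calls only drop a mapping and the page stays in memory; the third call is the one that frees it.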
Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
fs/proc/task_mmu.c | 2 +-
include/linux/ksm.h | 6 ++++--
include/linux/rmap.h | 8 +++++---
mm/ksm.c | 9 ++++++++-
mm/memory-failure.c | 2 +-
mm/migrate.c | 6 ++++--
mm/rmap.c | 57 +++++++++++++++++++++++++++++++++++++---------------
mm/vmscan.c | 14 +++++++++++--
8 files changed, 76 insertions(+), 28 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cd6bb70..ccc97b1 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1220,7 +1220,7 @@ cont:
break;
}
pte_unmap_unlock(pte - 1, ptl);
- reclaim_pages_from_list(&page_list);
+ reclaim_pages_from_list(&page_list, vma);
if (addr != end)
goto cont;
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 45c9b6a..d8e556b 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
int page_referenced_ksm(struct page *page,
struct mem_cgroup *memcg, unsigned long *vm_flags);
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
+int try_to_unmap_ksm(struct page *page,
+ enum ttu_flags flags, struct vm_area_struct *vma);
int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
struct vm_area_struct *, unsigned long, void *), void *arg);
void ksm_migrate_page(struct page *newpage, struct page *oldpage);
@@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
return 0;
}
-static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+static inline int try_to_unmap_ksm(struct page *page,
+ enum ttu_flags flags, struct vm_area_struct *target_vma)
{
return 0;
}
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a24e34e..6c7d030 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -12,7 +12,8 @@
extern int isolate_lru_page(struct page *page);
extern void putback_lru_page(struct page *page);
-extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
+ struct vm_area_struct *vma);
/*
* The anon_vma heads a list of private "related" vmas, to scan if
@@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
#define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
-int try_to_unmap(struct page *, enum ttu_flags flags);
+int try_to_unmap(struct page *, enum ttu_flags flags,
+ struct vm_area_struct *vma);
int try_to_unmap_one(struct page *, struct vm_area_struct *,
unsigned long address, enum ttu_flags flags);
@@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
return 0;
}
-#define try_to_unmap(page, refs) SWAP_FAIL
+#define try_to_unmap(page, refs, vma) SWAP_FAIL
static inline int page_mkclean(struct page *page)
{
diff --git a/mm/ksm.c b/mm/ksm.c
index b6afe0c..2efdfd2 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1946,7 +1946,8 @@ out:
return referenced;
}
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
+ struct vm_area_struct *target_vma)
{
struct stable_node *stable_node;
struct rmap_item *rmap_item;
@@ -1959,6 +1960,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
stable_node = page_stable_node(page);
if (!stable_node)
return SWAP_FAIL;
+
+ if (target_vma) {
+ unsigned long address = vma_address(page, target_vma);
+ ret = try_to_unmap_one(page, target_vma, address, flags);
+ goto out;
+ }
again:
hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
struct anon_vma *anon_vma = rmap_item->anon_vma;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ceb0c7f..f3928e4 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
if (hpage != ppage)
lock_page(ppage);
- ret = try_to_unmap(ppage, ttu);
+ ret = try_to_unmap(ppage, ttu, NULL);
if (ret != SWAP_SUCCESS)
printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
pfn, page_mapcount(ppage));
diff --git a/mm/migrate.c b/mm/migrate.c
index c9c5eee..aef29a0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
}
/* Establish migration ptes or remove ptes */
- try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+ try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+ NULL);
skip_unmap:
if (!page_mapped(page))
@@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
if (PageAnon(hpage))
anon_vma = page_get_anon_vma(hpage);
- try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+ try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+ NULL);
if (!page_mapped(hpage))
rc = move_to_new_page(new_hpage, hpage, 1, mode);
diff --git a/mm/rmap.c b/mm/rmap.c
index 6280da8..43718fc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
/**
* try_to_unmap_anon - unmap or unlock anonymous page using the object-based
- * rmap method
+ * rmap method if @vma is NULL
* @page: the page to unmap/unlock
* @flags: action and flags
+ * @target_vma: vma for unmapping a @page
*
* Find all the mappings of a page using the mapping pointer and the vma chains
* contained in the anon_vma struct it points to.
*
+ * If @target_vma isn't NULL, this function unmap a page from the vma
+ *
* This function is only called from try_to_unmap/try_to_munlock for
* anonymous pages.
* When called from try_to_munlock(), the mmap_sem of the mm containing the vma
@@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
* vm_flags for that VMA. That should be OK, because that vma shouldn't be
* 'LOCKED.
*/
-static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
+ struct vm_area_struct *target_vma)
{
+ int ret = SWAP_AGAIN;
+ unsigned long address;
struct anon_vma *anon_vma;
pgoff_t pgoff;
struct anon_vma_chain *avc;
- int ret = SWAP_AGAIN;
+
+ if (target_vma) {
+ address = vma_address(page, target_vma);
+ return try_to_unmap_one(page, target_vma, address, flags);
+ }
anon_vma = page_lock_anon_vma_read(page);
if (!anon_vma)
@@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
struct vm_area_struct *vma = avc->vma;
- unsigned long address;
/*
* During exec, a temporary VMA is setup and later moved.
@@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
* try_to_unmap_file - unmap/unlock file page using the object-based rmap method
* @page: the page to unmap/unlock
* @flags: action and flags
+ * @target_vma: vma for unmapping @page
*
* Find all the mappings of a page using the mapping pointer and the vma chains
* contained in the address_space struct it points to.
@@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
* vm_flags for that VMA. That should be OK, because that vma shouldn't be
* 'LOCKED.
*/
-static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
+ struct vm_area_struct *target_vma)
{
struct address_space *mapping = page->mapping;
pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -1512,16 +1523,26 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
unsigned long max_nl_cursor = 0;
unsigned long max_nl_size = 0;
unsigned int mapcount;
+ unsigned long address;
if (PageHuge(page))
pgoff = page->index << compound_order(page);
mutex_lock(&mapping->i_mmap_mutex);
- vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
- unsigned long address = vma_address(page, vma);
- ret = try_to_unmap_one(page, vma, address, flags);
- if (ret != SWAP_AGAIN || !page_mapped(page))
+ if (target_vma) {
+ /* We don't handle non-linear vma on ramfs */
+ if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
goto out;
+ address = vma_address(page, target_vma);
+ ret = try_to_unmap_one(page, target_vma, address, flags);
+ goto out;
+ } else {
+ vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
+ address = vma_address(page, vma);
+ ret = try_to_unmap_one(page, vma, address, flags);
+ if (ret != SWAP_AGAIN || !page_mapped(page))
+ goto out;
+ }
}
if (list_empty(&mapping->i_mmap_nonlinear))
@@ -1602,9 +1623,12 @@ out:
* try_to_unmap - try to remove all page table mappings to a page
* @page: the page to get unmapped
* @flags: action and flags
+ * @vma : target vma for reclaim
*
* Tries to remove all the page table entries which are mapping this
* page, used in the pageout path. Caller must hold the page lock.
+ * If @vma is not NULL, this function try to remove @page from only @vma
+ * without peeking all mapped vma for @page.
* Return values are:
*
* SWAP_SUCCESS - we succeeded in removing all mappings
@@ -1612,7 +1636,8 @@ out:
* SWAP_FAIL - the page is unswappable
* SWAP_MLOCK - page is mlocked.
*/
-int try_to_unmap(struct page *page, enum ttu_flags flags)
+int try_to_unmap(struct page *page, enum ttu_flags flags,
+ struct vm_area_struct *vma)
{
int ret;
@@ -1620,11 +1645,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
if (unlikely(PageKsm(page)))
- ret = try_to_unmap_ksm(page, flags);
+ ret = try_to_unmap_ksm(page, flags, vma);
else if (PageAnon(page))
- ret = try_to_unmap_anon(page, flags);
+ ret = try_to_unmap_anon(page, flags, vma);
else
- ret = try_to_unmap_file(page, flags);
+ ret = try_to_unmap_file(page, flags, vma);
if (ret != SWAP_MLOCK && !page_mapped(page))
ret = SWAP_SUCCESS;
return ret;
@@ -1650,11 +1675,11 @@ int try_to_munlock(struct page *page)
VM_BUG_ON(!PageLocked(page) || PageLRU(page));
if (unlikely(PageKsm(page)))
- return try_to_unmap_ksm(page, TTU_MUNLOCK);
+ return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
else if (PageAnon(page))
- return try_to_unmap_anon(page, TTU_MUNLOCK);
+ return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
else
- return try_to_unmap_file(page, TTU_MUNLOCK);
+ return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
}
void __put_anon_vma(struct anon_vma *anon_vma)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0d4df03..86aa489 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -93,6 +93,13 @@ struct scan_control {
* are scanned.
*/
nodemask_t *nodemask;
+
+ /*
+ * Reclaim pages from a vma. If the page is shared by other tasks
+ * it is zapped from a vma without reclaim so it ends up remaining
+ * on memory until last task zap it.
+ */
+ struct vm_area_struct *target_vma;
};
#define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
@@ -794,7 +801,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
* processes. Try to unmap it here.
*/
if (page_mapped(page) && mapping) {
- switch (try_to_unmap(page, ttu_flags)) {
+ switch (try_to_unmap(page,
+ ttu_flags, sc->target_vma)) {
case SWAP_FAIL:
goto activate_locked;
case SWAP_AGAIN:
@@ -1001,13 +1009,15 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
}
#ifdef CONFIG_PROCESS_RECLAIM
-unsigned long reclaim_pages_from_list(struct list_head *page_list)
+unsigned long reclaim_pages_from_list(struct list_head *page_list,
+ struct vm_area_struct *vma)
{
struct scan_control sc = {
.gfp_mask = GFP_KERNEL,
.priority = DEF_PRIORITY,
.may_unmap = 1,
.may_swap = 1,
+ .target_vma = vma,
};
unsigned long nr_reclaimed;
--
1.8.2.1
* [PATCH v4 5/6] mm: Support address range reclaim
2013-05-08 23:38 [PATCH v4 0/6] Per process reclaim Minchan Kim
` (3 preceding siblings ...)
2013-05-08 23:39 ` [PATCH v4 4/6] mm: Enhance per process reclaim to consider shared pages Minchan Kim
@ 2013-05-08 23:39 ` Minchan Kim
2013-05-08 23:39 ` [PATCH v4 6/6] add documentation about reclaim knob on proc.txt Minchan Kim
5 siblings, 0 replies; 7+ messages in thread
From: Minchan Kim @ 2013-05-08 23:39 UTC (permalink / raw)
To: Andrew Morton
Cc: Michael Kerrisk, Rik van Riel, Dave Hansen, Namhyung Kim,
linux-mm, linux-kernel, Minchan Kim
This patch adds address range reclaim of a process.
The requirement is as follows.
WebKit1, for example, uses one address space for handling multiple tabs.
IOW, it uses a *one* process model, so all tabs share the address space
of the process. In such a scenario, per-process reclaim is rather
coarse-grained, so this patch supports more fine-grained reclaim
that can target an address range of the process.
To reclaim a target range, use the following format.
echo [addr] [size-byte] > /proc/pid/reclaim
The addr should be page-aligned.
So the reclaim knob's interface is now as follows.
echo file > /proc/pid/reclaim
reclaim file-backed pages only
echo anon > /proc/pid/reclaim
reclaim anonymous pages only
echo all > /proc/pid/reclaim
reclaim all pages
echo 0x100000 8K > /proc/pid/reclaim
reclaim pages in (0x100000 - 0x102000)
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
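For readers who want to check a request before writing it to the knob, the range validation this patch performs can be approximated in userspace. This is only a sketch under stated assumptions: parse_size() is a hypothetical stand-in for the kernel's memparse() that handles just the K/M/G suffixes, reclaim_range() is an illustrative name, and the page size is passed in by the caller rather than taken from PAGE_MASK.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for the kernel's memparse(): parses a
 * number (base auto-detected) followed by an optional K/M/G suffix.
 */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10;
		/* fall through */
	case 'M': case 'm':
		v <<= 10;
		/* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Mirror the checks the patch applies to "addr size": addr must be
 * page-aligned, the length is rounded up to a whole page, and
 * start + len must not wrap. Returns 0 and fills *start and *end on
 * success, -EINVAL otherwise. page_size must be a power of two.
 */
static int reclaim_range(const char *addr, const char *size,
			 unsigned long page_size,
			 unsigned long *start, unsigned long *end)
{
	unsigned long long a, len_in, len;
	unsigned long long off_mask = page_size - 1;
	char *rest;

	assert(page_size && !(page_size & (page_size - 1)));

	a = parse_size(addr, &rest);
	if ((a & off_mask) || a > ULONG_MAX)
		return -EINVAL;

	len_in = parse_size(size, &rest);
	len = (len_in + off_mask) & ~off_mask;
	/* Reject lengths that overflow or that rounded up to zero. */
	if (len > ULONG_MAX || (len_in && !len))
		return -EINVAL;

	if (a + len < a || a + len > ULONG_MAX)
		return -EINVAL;
	*start = (unsigned long)a;
	*end = (unsigned long)(a + len);
	return 0;
}
```

With a 4096-byte page, reclaim_range("0x100000", "8K", 4096, &start, &end) yields the range 0x100000 - 0x102000 used in the example above, and an unaligned addr such as 0x100001 is rejected with -EINVAL.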
fs/proc/task_mmu.c | 85 ++++++++++++++++++++++++++++++++++++++++++++----------
mm/Kconfig | 3 ++
2 files changed, 73 insertions(+), 15 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ccc97b1..61b6bde 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -12,6 +12,7 @@
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/mm_inline.h>
+#include <linux/ctype.h>
#include <asm/elf.h>
#include <asm/uaccess.h>
@@ -1239,11 +1240,14 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
struct task_struct *task;
- char buffer[PROC_NUMBUF];
+ char buffer[200];
struct mm_struct *mm;
struct vm_area_struct *vma;
enum reclaim_type type;
char *type_buf;
+ struct mm_walk reclaim_walk = {};
+ unsigned long start = 0;
+ unsigned long end = 0;
memset(buffer, 0, sizeof(buffer));
if (count > sizeof(buffer) - 1)
@@ -1259,42 +1263,93 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
type = RECLAIM_ANON;
else if (!strcmp(type_buf, "all"))
type = RECLAIM_ALL;
+ else if (isdigit(*type_buf))
+ type = RECLAIM_RANGE;
else
- return -EINVAL;
+ goto out_err;
+
+ if (type == RECLAIM_RANGE) {
+ char *token;
+ unsigned long long len, len_in, tmp;
+ token = strsep(&type_buf, " ");
+ if (!token)
+ goto out_err;
+ tmp = memparse(token, &token);
+ if (tmp & ~PAGE_MASK || tmp > ULONG_MAX)
+ goto out_err;
+ start = tmp;
+
+ token = strsep(&type_buf, " ");
+ if (!token)
+ goto out_err;
+ len_in = memparse(token, &token);
+ len = (len_in + ~PAGE_MASK) & PAGE_MASK;
+ if (len > ULONG_MAX)
+ goto out_err;
+ /*
+ * Check to see whether len was rounded up from small -ve
+ * to zero.
+ */
+ if (len_in && !len)
+ goto out_err;
+
+ end = start + len;
+ if (end < start)
+ goto out_err;
+ }
task = get_proc_task(file->f_path.dentry->d_inode);
if (!task)
return -ESRCH;
mm = get_task_mm(task);
- if (mm) {
- struct mm_walk reclaim_walk = {
- .pmd_entry = reclaim_pte_range,
- .mm = mm,
- };
+ if (!mm)
+ goto out;
- down_read(&mm->mmap_sem);
- for (vma = mm->mmap; vma; vma = vma->vm_next) {
- reclaim_walk.private = vma;
+ reclaim_walk.mm = mm;
+ reclaim_walk.pmd_entry = reclaim_pte_range;
+ down_read(&mm->mmap_sem);
+ if (type == RECLAIM_RANGE) {
+ vma = find_vma(mm, start);
+ while (vma) {
+ if (vma->vm_start > end)
+ break;
+ if (is_vm_hugetlb_page(vma))
+ continue;
+
+ reclaim_walk.private = vma;
+ walk_page_range(max(vma->vm_start, start),
+ min(vma->vm_end, end),
+ &reclaim_walk);
+ vma = vma->vm_next;
+ }
+ } else {
+ for (vma = mm->mmap; vma; vma = vma->vm_next) {
if (is_vm_hugetlb_page(vma))
continue;
if (type == RECLAIM_ANON && vma->vm_file)
continue;
+
if (type == RECLAIM_FILE && !vma->vm_file)
continue;
+ reclaim_walk.private = vma;
walk_page_range(vma->vm_start, vma->vm_end,
- &reclaim_walk);
+ &reclaim_walk);
}
- flush_tlb_mm(mm);
- up_read(&mm->mmap_sem);
- mmput(mm);
}
- put_task_struct(task);
+ flush_tlb_mm(mm);
+ up_read(&mm->mmap_sem);
+ mmput(mm);
+out:
+ put_task_struct(task);
return count;
+
+out_err:
+ return -EINVAL;
}
const struct file_operations proc_reclaim_operations = {
diff --git a/mm/Kconfig b/mm/Kconfig
index d119b5b..3460da4 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -501,4 +501,7 @@ config PROCESS_RECLAIM
(echo anon > /proc/PID/reclaim) reclaims anonymous pages only.
(echo all > /proc/PID/reclaim) reclaims all pages.
+ (echo addr size-byte > /proc/PID/reclaim) reclaims pages in
+ (addr, addr + size-bytes) of the process.
+
Any other vaule is ignored.
--
1.8.2.1
* [PATCH v4 6/6] add documentation about reclaim knob on proc.txt
From: Minchan Kim @ 2013-05-08 23:39 UTC (permalink / raw)
To: Andrew Morton
Cc: Michael Kerrisk, Rik van Riel, Dave Hansen, Namhyung Kim,
linux-mm, linux-kernel, Minchan Kim
This patch documents the new reclaim file in proc.txt.
Acked-by: Rob Landley <rob@landley.net>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
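A task manager would more likely drive the knob from code than from a shell. The following is a minimal sketch of that, assuming a kernel built with this series and CONFIG_PROCESS_RECLAIM. The helper names (reclaim_path, reclaim_range_cmd, reclaim_write_cmd) are made up for illustration and are not part of the patch.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Build "/proc/<pid>/reclaim"; returns 0 on success, -1 if buf is too small. */
static int reclaim_path(char *buf, size_t n, pid_t pid)
{
	int len = snprintf(buf, n, "/proc/%ld/reclaim", (long)pid);

	return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* Build the "addr size-byte" request, e.g. "0x100000 2097152". */
static int reclaim_range_cmd(char *buf, size_t n,
			     unsigned long addr, unsigned long size)
{
	int len = snprintf(buf, n, "0x%lx %lu", addr, size);

	return (len < 0 || (size_t)len >= n) ? -1 : 0;
}

/* Write one request ("file", "anon", "all" or "addr size") to the knob. */
static int reclaim_write_cmd(pid_t pid, const char *cmd)
{
	char path[64];
	ssize_t ret;
	int fd;

	if (reclaim_path(path, sizeof(path), pid))
		return -1;
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;	/* no such process, or the knob is not available */
	ret = write(fd, cmd, strlen(cmd));
	close(fd);
	return ret == (ssize_t)strlen(cmd) ? 0 : -1;
}
```

For instance, building a request with reclaim_range_cmd(buf, sizeof(buf), 0x100000, 2UL << 20) and passing buf to reclaim_write_cmd() would correspond to the "echo 0x100000 2M" example in the documentation below. The addr still has to be page-aligned, or the write fails with EINVAL.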
Documentation/filesystems/proc.txt | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 488c094..ee4cef1 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -136,6 +136,7 @@ Table 1-1: Process specific entries in /proc
maps Memory maps to executables and library files (2.4)
mem Memory held by this process
root Link to the root directory of this process
+ reclaim Reclaim pages in this process
stat Process status
statm Process memory status information
status Process status in human readable form
@@ -489,6 +490,25 @@ To clear the soft-dirty bit
Any other value written to /proc/PID/clear_refs will have no effect.
+The file /proc/PID/reclaim is used to reclaim pages in this process.
+To reclaim file-backed pages,
+ > echo file > /proc/PID/reclaim
+
+To reclaim anonymous pages,
+ > echo anon > /proc/PID/reclaim
+
+To reclaim all pages,
+ > echo all > /proc/PID/reclaim
+
+You can also specify an address range of the process so that only that
+part of the address space is reclaimed. The format is as follows:
+ > echo addr size-byte > /proc/PID/reclaim
+
+NOTE: addr should be page-aligned.
+
+The example below tries to reclaim 2M starting at 0x100000.
+ > echo 0x100000 2M > /proc/PID/reclaim
+
The /proc/pid/pagemap gives the PFN, which can be used to find the pageflags
using /proc/kpageflags and number of times a page is mapped using
/proc/kpagecount. For detailed explanation, see Documentation/vm/pagemap.txt.
--
1.8.2.1