* [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node
@ 2011-09-25 10:59 Kautuk Consul
2011-09-26 9:20 ` KAMEZAWA Hiroyuki
2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner
0 siblings, 2 replies; 14+ messages in thread
From: Kautuk Consul @ 2011-09-25 10:59 UTC (permalink / raw)
To: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel
Cc: linux-mm, linux-kernel, Kautuk Consul
write_scan_unavictable_node checks the value req returned by
strict_strtoul and returns 1 if req is 0.
However, when strict_strtoul returns 0, it means successful conversion
of buf to unsigned long.
Due to this, the function was not proceeding to scan the zones for
unevictable pages even though we write a valid value to the
scan_unevictable_pages sys file.
Changing this if check slightly to check for invalid value
in buf as well as 0 value stored in res after successful conversion
via strict_strtoul.
In both cases, we do not perform the scanning of this node's zones.
Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
mm/vmscan.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b55699c..73996e6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3450,8 +3450,8 @@ static ssize_t write_scan_unevictable_node(struct sys_device *dev,
unsigned long res;
unsigned long req = strict_strtoul(buf, 10, &res);
- if (!req)
- return 1; /* zero is no-op */
+ if (req || !res)
+ return 1; /* Invalid input or zero is no-op */
for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
if (!populated_zone(zone))
--
1.7.4.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node
2011-09-25 10:59 [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node Kautuk Consul
@ 2011-09-26 9:20 ` KAMEZAWA Hiroyuki
2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner
1 sibling, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-09-26 9:20 UTC (permalink / raw)
To: Kautuk Consul
Cc: Andrew Morton, Mel Gorman, Minchan Kim, Rik van Riel, linux-mm,
linux-kernel
On Sun, 25 Sep 2011 16:29:40 +0530
Kautuk Consul <consul.kautuk@gmail.com> wrote:
> write_scan_unavictable_node checks the value req returned by
> strict_strtoul and returns 1 if req is 0.
>
> However, when strict_strtoul returns 0, it means successful conversion
> of buf to unsigned long.
>
> Due to this, the function was not proceeding to scan the zones for
> unevictable pages even though we write a valid value to the
> scan_unevictable_pages sys file.
>
> Changing this if check slightly to check for invalid value
> in buf as well as 0 value stored in res after successful conversion
> via strict_strtoul.
> In both cases, we do not perform the scanning of this node's zones.
>
> Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch] mm: remove sysctl to manually rescue unevictable pages
2011-09-25 10:59 [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node Kautuk Consul
2011-09-26 9:20 ` KAMEZAWA Hiroyuki
@ 2011-09-26 11:29 ` Johannes Weiner
2011-09-26 12:10 ` kautuk.c @samsung.com
` (2 more replies)
1 sibling, 3 replies; 14+ messages in thread
From: Johannes Weiner @ 2011-09-26 11:29 UTC (permalink / raw)
To: Kautuk Consul
Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro
On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote:
> write_scan_unavictable_node checks the value req returned by
> strict_strtoul and returns 1 if req is 0.
>
> However, when strict_strtoul returns 0, it means successful conversion
> of buf to unsigned long.
>
> Due to this, the function was not proceeding to scan the zones for
> unevictable pages even though we write a valid value to the
> scan_unevictable_pages sys file.
Given that there is not a real reason for this knob (anymore) and that
it apparently never really worked since the day it was introduced, how
about we just drop all that code instead?
Hannes
---
From: Johannes Weiner <jweiner@redhat.com>
Subject: mm: remove sysctl to manually rescue unevictable pages
At one point, anonymous pages were supposed to go on the unevictable
list when no swap space was configured, and the idea was to manually
rescue those pages after adding swap and making them evictable again.
But nowadays, swap-backed pages on the anon LRU list are not scanned
without available swap space anyway, so there is no point in moving
them to a separate list anymore.
The manual rescue could also be used in case pages were stranded on
the unevictable list due to race conditions. But the code has been
around for a while now and newly discovered bugs should be properly
reported and dealt with instead of relying on such a manual fixup.
Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
drivers/base/node.c | 3 -
include/linux/swap.h | 16 ------
kernel/sysctl.c | 7 ---
mm/vmscan.c | 130 --------------------------------------------------
4 files changed, 0 insertions(+), 156 deletions(-)
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 9e58e71..b9d6e93 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -278,8 +278,6 @@ int register_node(struct node *node, int num, struct node *parent)
sysdev_create_file(&node->sysdev, &attr_distance);
sysdev_create_file_optional(&node->sysdev, &attr_vmstat);
- scan_unevictable_register_node(node);
-
hugetlb_register_node(node);
compaction_register_node(node);
@@ -303,7 +301,6 @@ void unregister_node(struct node *node)
sysdev_remove_file(&node->sysdev, &attr_distance);
sysdev_remove_file_optional(&node->sysdev, &attr_vmstat);
- scan_unevictable_unregister_node(node);
hugetlb_unregister_node(node); /* no-op, if memoryless node */
sysdev_unregister(&node->sysdev);
diff --git a/include/linux/swap.h b/include/linux/swap.h
index b156e80..a6a9ee5 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -276,22 +276,6 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order)
extern int page_evictable(struct page *page, struct vm_area_struct *vma);
extern void scan_mapping_unevictable_pages(struct address_space *);
-extern unsigned long scan_unevictable_pages;
-extern int scan_unevictable_handler(struct ctl_table *, int,
- void __user *, size_t *, loff_t *);
-#ifdef CONFIG_NUMA
-extern int scan_unevictable_register_node(struct node *node);
-extern void scan_unevictable_unregister_node(struct node *node);
-#else
-static inline int scan_unevictable_register_node(struct node *node)
-{
- return 0;
-}
-static inline void scan_unevictable_unregister_node(struct node *node)
-{
-}
-#endif
-
extern int kswapd_run(int nid);
extern void kswapd_stop(int nid);
#ifdef CONFIG_CGROUP_MEM_RES_CTLR
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 4f057f9..0d66092 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1325,13 +1325,6 @@ static struct ctl_table vm_table[] = {
.extra2 = &one,
},
#endif
- {
- .procname = "scan_unevictable_pages",
- .data = &scan_unevictable_pages,
- .maxlen = sizeof(scan_unevictable_pages),
- .mode = 0644,
- .proc_handler = scan_unevictable_handler,
- },
#ifdef CONFIG_MEMORY_FAILURE
{
.procname = "memory_failure_early_kill",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7502726..c99a097 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3396,133 +3396,3 @@ void scan_mapping_unevictable_pages(struct address_space *mapping)
}
}
-
-/**
- * scan_zone_unevictable_pages - check unevictable list for evictable pages
- * @zone - zone of which to scan the unevictable list
- *
- * Scan @zone's unevictable LRU lists to check for pages that have become
- * evictable. Move those that have to @zone's inactive list where they
- * become candidates for reclaim, unless shrink_inactive_zone() decides
- * to reactivate them. Pages that are still unevictable are rotated
- * back onto @zone's unevictable list.
- */
-#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */
-static void scan_zone_unevictable_pages(struct zone *zone)
-{
- struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list;
- unsigned long scan;
- unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE);
-
- while (nr_to_scan > 0) {
- unsigned long batch_size = min(nr_to_scan,
- SCAN_UNEVICTABLE_BATCH_SIZE);
-
- spin_lock_irq(&zone->lru_lock);
- for (scan = 0; scan < batch_size; scan++) {
- struct page *page = lru_to_page(l_unevictable);
-
- if (!trylock_page(page))
- continue;
-
- prefetchw_prev_lru_page(page, l_unevictable, flags);
-
- if (likely(PageLRU(page) && PageUnevictable(page)))
- check_move_unevictable_page(page, zone);
-
- unlock_page(page);
- }
- spin_unlock_irq(&zone->lru_lock);
-
- nr_to_scan -= batch_size;
- }
-}
-
-
-/**
- * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages
- *
- * A really big hammer: scan all zones' unevictable LRU lists to check for
- * pages that have become evictable. Move those back to the zones'
- * inactive list where they become candidates for reclaim.
- * This occurs when, e.g., we have unswappable pages on the unevictable lists,
- * and we add swap to the system. As such, it runs in the context of a task
- * that has possibly/probably made some previously unevictable pages
- * evictable.
- */
-static void scan_all_zones_unevictable_pages(void)
-{
- struct zone *zone;
-
- for_each_zone(zone) {
- scan_zone_unevictable_pages(zone);
- }
-}
-
-/*
- * scan_unevictable_pages [vm] sysctl handler. On demand re-scan of
- * all nodes' unevictable lists for evictable pages
- */
-unsigned long scan_unevictable_pages;
-
-int scan_unevictable_handler(struct ctl_table *table, int write,
- void __user *buffer,
- size_t *length, loff_t *ppos)
-{
- proc_doulongvec_minmax(table, write, buffer, length, ppos);
-
- if (write && *(unsigned long *)table->data)
- scan_all_zones_unevictable_pages();
-
- scan_unevictable_pages = 0;
- return 0;
-}
-
-#ifdef CONFIG_NUMA
-/*
- * per node 'scan_unevictable_pages' attribute. On demand re-scan of
- * a specified node's per zone unevictable lists for evictable pages.
- */
-
-static ssize_t read_scan_unevictable_node(struct sys_device *dev,
- struct sysdev_attribute *attr,
- char *buf)
-{
- return sprintf(buf, "0\n"); /* always zero; should fit... */
-}
-
-static ssize_t write_scan_unevictable_node(struct sys_device *dev,
- struct sysdev_attribute *attr,
- const char *buf, size_t count)
-{
- struct zone *node_zones = NODE_DATA(dev->id)->node_zones;
- struct zone *zone;
- unsigned long res;
- unsigned long req = strict_strtoul(buf, 10, &res);
-
- if (!req)
- return 1; /* zero is no-op */
-
- for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
- if (!populated_zone(zone))
- continue;
- scan_zone_unevictable_pages(zone);
- }
- return 1;
-}
-
-
-static SYSDEV_ATTR(scan_unevictable_pages, S_IRUGO | S_IWUSR,
- read_scan_unevictable_node,
- write_scan_unevictable_node);
-
-int scan_unevictable_register_node(struct node *node)
-{
- return sysdev_create_file(&node->sysdev, &attr_scan_unevictable_pages);
-}
-
-void scan_unevictable_unregister_node(struct node *node)
-{
- sysdev_remove_file(&node->sysdev, &attr_scan_unevictable_pages);
-}
-#endif
--
1.7.6.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages
2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner
@ 2011-09-26 12:10 ` kautuk.c @samsung.com
2011-09-26 12:29 ` kautuk.c @samsung.com
2011-09-26 14:18 ` Johannes Weiner
2011-09-26 23:11 ` Andrew Morton
2011-09-28 1:00 ` [patch] mm: remove sysctl " KOSAKI Motohiro
2 siblings, 2 replies; 14+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-26 12:10 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro
On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote:
> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote:
>> write_scan_unavictable_node checks the value req returned by
>> strict_strtoul and returns 1 if req is 0.
>>
>> However, when strict_strtoul returns 0, it means successful conversion
>> of buf to unsigned long.
>>
>> Due to this, the function was not proceeding to scan the zones for
>> unevictable pages even though we write a valid value to the
>> scan_unevictable_pages sys file.
>
> Given that there is not a real reason for this knob (anymore) and that
> it apparently never really worked since the day it was introduced, how
> about we just drop all that code instead?
>
> Hannes
>
> ---
> From: Johannes Weiner <jweiner@redhat.com>
> Subject: mm: remove sysctl to manually rescue unevictable pages
>
> At one point, anonymous pages were supposed to go on the unevictable
> list when no swap space was configured, and the idea was to manually
> rescue those pages after adding swap and making them evictable again.
> But nowadays, swap-backed pages on the anon LRU list are not scanned
> without available swap space anyway, so there is no point in moving
> them to a separate list anymore.
Is this code only for anonymous pages ?
It seems to look at all pages in the zone both file as well as anon.
>
> The manual rescue could also be used in case pages were stranded on
> the unevictable list due to race conditions. But the code has been
> around for a while now and newly discovered bugs should be properly
> reported and dealt with instead of relying on such a manual fixup.
What you say seems to be all right for anon pages, but what about file
pages ?
I'm not sure about how this could happen, but what if some file-system caused
a file cache page to be set to evictable or reclaimable without
actually removing
that page from the unevictable list ?
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> ---
> drivers/base/node.c | 3 -
> include/linux/swap.h | 16 ------
> kernel/sysctl.c | 7 ---
> mm/vmscan.c | 130 --------------------------------------------------
> 4 files changed, 0 insertions(+), 156 deletions(-)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index 9e58e71..b9d6e93 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -278,8 +278,6 @@ int register_node(struct node *node, int num, struct node *parent)
> sysdev_create_file(&node->sysdev, &attr_distance);
> sysdev_create_file_optional(&node->sysdev, &attr_vmstat);
>
> - scan_unevictable_register_node(node);
> -
> hugetlb_register_node(node);
>
> compaction_register_node(node);
> @@ -303,7 +301,6 @@ void unregister_node(struct node *node)
> sysdev_remove_file(&node->sysdev, &attr_distance);
> sysdev_remove_file_optional(&node->sysdev, &attr_vmstat);
>
> - scan_unevictable_unregister_node(node);
> hugetlb_unregister_node(node); /* no-op, if memoryless node */
>
> sysdev_unregister(&node->sysdev);
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index b156e80..a6a9ee5 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -276,22 +276,6 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order)
> extern int page_evictable(struct page *page, struct vm_area_struct *vma);
> extern void scan_mapping_unevictable_pages(struct address_space *);
>
> -extern unsigned long scan_unevictable_pages;
> -extern int scan_unevictable_handler(struct ctl_table *, int,
> - void __user *, size_t *, loff_t *);
> -#ifdef CONFIG_NUMA
> -extern int scan_unevictable_register_node(struct node *node);
> -extern void scan_unevictable_unregister_node(struct node *node);
> -#else
> -static inline int scan_unevictable_register_node(struct node *node)
> -{
> - return 0;
> -}
> -static inline void scan_unevictable_unregister_node(struct node *node)
> -{
> -}
> -#endif
> -
> extern int kswapd_run(int nid);
> extern void kswapd_stop(int nid);
> #ifdef CONFIG_CGROUP_MEM_RES_CTLR
> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
> index 4f057f9..0d66092 100644
> --- a/kernel/sysctl.c
> +++ b/kernel/sysctl.c
> @@ -1325,13 +1325,6 @@ static struct ctl_table vm_table[] = {
> .extra2 = &one,
> },
> #endif
> - {
> - .procname = "scan_unevictable_pages",
> - .data = &scan_unevictable_pages,
> - .maxlen = sizeof(scan_unevictable_pages),
> - .mode = 0644,
> - .proc_handler = scan_unevictable_handler,
> - },
> #ifdef CONFIG_MEMORY_FAILURE
> {
> .procname = "memory_failure_early_kill",
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7502726..c99a097 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3396,133 +3396,3 @@ void scan_mapping_unevictable_pages(struct address_space *mapping)
> }
>
> }
> -
> -/**
> - * scan_zone_unevictable_pages - check unevictable list for evictable pages
> - * @zone - zone of which to scan the unevictable list
> - *
> - * Scan @zone's unevictable LRU lists to check for pages that have become
> - * evictable. Move those that have to @zone's inactive list where they
> - * become candidates for reclaim, unless shrink_inactive_zone() decides
> - * to reactivate them. Pages that are still unevictable are rotated
> - * back onto @zone's unevictable list.
> - */
> -#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */
> -static void scan_zone_unevictable_pages(struct zone *zone)
> -{
> - struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list;
> - unsigned long scan;
> - unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE);
> -
> - while (nr_to_scan > 0) {
> - unsigned long batch_size = min(nr_to_scan,
> - SCAN_UNEVICTABLE_BATCH_SIZE);
> -
> - spin_lock_irq(&zone->lru_lock);
> - for (scan = 0; scan < batch_size; scan++) {
> - struct page *page = lru_to_page(l_unevictable);
> -
> - if (!trylock_page(page))
> - continue;
> -
> - prefetchw_prev_lru_page(page, l_unevictable, flags);
> -
> - if (likely(PageLRU(page) && PageUnevictable(page)))
> - check_move_unevictable_page(page, zone);
> -
> - unlock_page(page);
> - }
> - spin_unlock_irq(&zone->lru_lock);
> -
> - nr_to_scan -= batch_size;
> - }
> -}
> -
> -
> -/**
> - * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages
> - *
> - * A really big hammer: scan all zones' unevictable LRU lists to check for
> - * pages that have become evictable. Move those back to the zones'
> - * inactive list where they become candidates for reclaim.
> - * This occurs when, e.g., we have unswappable pages on the unevictable lists,
> - * and we add swap to the system. As such, it runs in the context of a task
> - * that has possibly/probably made some previously unevictable pages
> - * evictable.
> - */
> -static void scan_all_zones_unevictable_pages(void)
> -{
> - struct zone *zone;
> -
> - for_each_zone(zone) {
> - scan_zone_unevictable_pages(zone);
> - }
> -}
> -
> -/*
> - * scan_unevictable_pages [vm] sysctl handler. On demand re-scan of
> - * all nodes' unevictable lists for evictable pages
> - */
> -unsigned long scan_unevictable_pages;
> -
> -int scan_unevictable_handler(struct ctl_table *table, int write,
> - void __user *buffer,
> - size_t *length, loff_t *ppos)
> -{
> - proc_doulongvec_minmax(table, write, buffer, length, ppos);
> -
> - if (write && *(unsigned long *)table->data)
> - scan_all_zones_unevictable_pages();
> -
> - scan_unevictable_pages = 0;
> - return 0;
> -}
> -
> -#ifdef CONFIG_NUMA
> -/*
> - * per node 'scan_unevictable_pages' attribute. On demand re-scan of
> - * a specified node's per zone unevictable lists for evictable pages.
> - */
> -
> -static ssize_t read_scan_unevictable_node(struct sys_device *dev,
> - struct sysdev_attribute *attr,
> - char *buf)
> -{
> - return sprintf(buf, "0\n"); /* always zero; should fit... */
> -}
> -
> -static ssize_t write_scan_unevictable_node(struct sys_device *dev,
> - struct sysdev_attribute *attr,
> - const char *buf, size_t count)
> -{
> - struct zone *node_zones = NODE_DATA(dev->id)->node_zones;
> - struct zone *zone;
> - unsigned long res;
> - unsigned long req = strict_strtoul(buf, 10, &res);
> -
> - if (!req)
> - return 1; /* zero is no-op */
> -
> - for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
> - if (!populated_zone(zone))
> - continue;
> - scan_zone_unevictable_pages(zone);
> - }
> - return 1;
> -}
> -
> -
> -static SYSDEV_ATTR(scan_unevictable_pages, S_IRUGO | S_IWUSR,
> - read_scan_unevictable_node,
> - write_scan_unevictable_node);
> -
> -int scan_unevictable_register_node(struct node *node)
> -{
> - return sysdev_create_file(&node->sysdev, &attr_scan_unevictable_pages);
> -}
> -
> -void scan_unevictable_unregister_node(struct node *node)
> -{
> - sysdev_remove_file(&node->sysdev, &attr_scan_unevictable_pages);
> -}
> -#endif
> --
> 1.7.6.2
>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages
2011-09-26 12:10 ` kautuk.c @samsung.com
@ 2011-09-26 12:29 ` kautuk.c @samsung.com
2011-09-26 14:25 ` Johannes Weiner
2011-09-26 14:18 ` Johannes Weiner
1 sibling, 1 reply; 14+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-26 12:29 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro
On Mon, Sep 26, 2011 at 5:40 PM, kautuk.c @samsung.com
<consul.kautuk@gmail.com> wrote:
> On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote:
>> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote:
>>> write_scan_unavictable_node checks the value req returned by
>>> strict_strtoul and returns 1 if req is 0.
>>>
>>> However, when strict_strtoul returns 0, it means successful conversion
>>> of buf to unsigned long.
>>>
>>> Due to this, the function was not proceeding to scan the zones for
>>> unevictable pages even though we write a valid value to the
>>> scan_unevictable_pages sys file.
>>
>> Given that there is not a real reason for this knob (anymore) and that
>> it apparently never really worked since the day it was introduced, how
>> about we just drop all that code instead?
>>
>> Hannes
>>
>> ---
>> From: Johannes Weiner <jweiner@redhat.com>
>> Subject: mm: remove sysctl to manually rescue unevictable pages
>>
>> At one point, anonymous pages were supposed to go on the unevictable
>> list when no swap space was configured, and the idea was to manually
>> rescue those pages after adding swap and making them evictable again.
>> But nowadays, swap-backed pages on the anon LRU list are not scanned
>> without available swap space anyway, so there is no point in moving
>> them to a separate list anymore.
>
> Is this code only for anonymous pages ?
> It seems to look at all pages in the zone both file as well as anon.
>
>>
>> The manual rescue could also be used in case pages were stranded on
>> the unevictable list due to race conditions. But the code has been
>> around for a while now and newly discovered bugs should be properly
>> reported and dealt with instead of relying on such a manual fixup.
>
> What you say seems to be all right for anon pages, but what about file
> pages ?
> I'm not sure about how this could happen, but what if some file-system caused
> a file cache page to be set to evictable or reclaimable without
> actually removing
> that page from the unevictable list ?
What I would like to also add is that while the transition of an anon
page from and
to the unevictable lists is straight-forward, should we make the same assumption
about file cache pages ?
I am not sure about this, but could a file-system cause this kind of a problem
independent of the mlocking behaviour of a user-mode app ?
>
>>
>> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
>> ---
>> drivers/base/node.c | 3 -
>> include/linux/swap.h | 16 ------
>> kernel/sysctl.c | 7 ---
>> mm/vmscan.c | 130 --------------------------------------------------
>> 4 files changed, 0 insertions(+), 156 deletions(-)
>>
>> diff --git a/drivers/base/node.c b/drivers/base/node.c
>> index 9e58e71..b9d6e93 100644
>> --- a/drivers/base/node.c
>> +++ b/drivers/base/node.c
>> @@ -278,8 +278,6 @@ int register_node(struct node *node, int num, struct node *parent)
>> sysdev_create_file(&node->sysdev, &attr_distance);
>> sysdev_create_file_optional(&node->sysdev, &attr_vmstat);
>>
>> - scan_unevictable_register_node(node);
>> -
>> hugetlb_register_node(node);
>>
>> compaction_register_node(node);
>> @@ -303,7 +301,6 @@ void unregister_node(struct node *node)
>> sysdev_remove_file(&node->sysdev, &attr_distance);
>> sysdev_remove_file_optional(&node->sysdev, &attr_vmstat);
>>
>> - scan_unevictable_unregister_node(node);
>> hugetlb_unregister_node(node); /* no-op, if memoryless node */
>>
>> sysdev_unregister(&node->sysdev);
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index b156e80..a6a9ee5 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -276,22 +276,6 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order)
>> extern int page_evictable(struct page *page, struct vm_area_struct *vma);
>> extern void scan_mapping_unevictable_pages(struct address_space *);
>>
>> -extern unsigned long scan_unevictable_pages;
>> -extern int scan_unevictable_handler(struct ctl_table *, int,
>> - void __user *, size_t *, loff_t *);
>> -#ifdef CONFIG_NUMA
>> -extern int scan_unevictable_register_node(struct node *node);
>> -extern void scan_unevictable_unregister_node(struct node *node);
>> -#else
>> -static inline int scan_unevictable_register_node(struct node *node)
>> -{
>> - return 0;
>> -}
>> -static inline void scan_unevictable_unregister_node(struct node *node)
>> -{
>> -}
>> -#endif
>> -
>> extern int kswapd_run(int nid);
>> extern void kswapd_stop(int nid);
>> #ifdef CONFIG_CGROUP_MEM_RES_CTLR
>> diff --git a/kernel/sysctl.c b/kernel/sysctl.c
>> index 4f057f9..0d66092 100644
>> --- a/kernel/sysctl.c
>> +++ b/kernel/sysctl.c
>> @@ -1325,13 +1325,6 @@ static struct ctl_table vm_table[] = {
>> .extra2 = &one,
>> },
>> #endif
>> - {
>> - .procname = "scan_unevictable_pages",
>> - .data = &scan_unevictable_pages,
>> - .maxlen = sizeof(scan_unevictable_pages),
>> - .mode = 0644,
>> - .proc_handler = scan_unevictable_handler,
>> - },
>> #ifdef CONFIG_MEMORY_FAILURE
>> {
>> .procname = "memory_failure_early_kill",
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 7502726..c99a097 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -3396,133 +3396,3 @@ void scan_mapping_unevictable_pages(struct address_space *mapping)
>> }
>>
>> }
>> -
>> -/**
>> - * scan_zone_unevictable_pages - check unevictable list for evictable pages
>> - * @zone - zone of which to scan the unevictable list
>> - *
>> - * Scan @zone's unevictable LRU lists to check for pages that have become
>> - * evictable. Move those that have to @zone's inactive list where they
>> - * become candidates for reclaim, unless shrink_inactive_zone() decides
>> - * to reactivate them. Pages that are still unevictable are rotated
>> - * back onto @zone's unevictable list.
>> - */
>> -#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */
>> -static void scan_zone_unevictable_pages(struct zone *zone)
>> -{
>> - struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list;
>> - unsigned long scan;
>> - unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE);
>> -
>> - while (nr_to_scan > 0) {
>> - unsigned long batch_size = min(nr_to_scan,
>> - SCAN_UNEVICTABLE_BATCH_SIZE);
>> -
>> - spin_lock_irq(&zone->lru_lock);
>> - for (scan = 0; scan < batch_size; scan++) {
>> - struct page *page = lru_to_page(l_unevictable);
>> -
>> - if (!trylock_page(page))
>> - continue;
>> -
>> - prefetchw_prev_lru_page(page, l_unevictable, flags);
>> -
>> - if (likely(PageLRU(page) && PageUnevictable(page)))
>> - check_move_unevictable_page(page, zone);
>> -
>> - unlock_page(page);
>> - }
>> - spin_unlock_irq(&zone->lru_lock);
>> -
>> - nr_to_scan -= batch_size;
>> - }
>> -}
>> -
>> -
>> -/**
>> - * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages
>> - *
>> - * A really big hammer: scan all zones' unevictable LRU lists to check for
>> - * pages that have become evictable. Move those back to the zones'
>> - * inactive list where they become candidates for reclaim.
>> - * This occurs when, e.g., we have unswappable pages on the unevictable lists,
>> - * and we add swap to the system. As such, it runs in the context of a task
>> - * that has possibly/probably made some previously unevictable pages
>> - * evictable.
>> - */
>> -static void scan_all_zones_unevictable_pages(void)
>> -{
>> - struct zone *zone;
>> -
>> - for_each_zone(zone) {
>> - scan_zone_unevictable_pages(zone);
>> - }
>> -}
>> -
>> -/*
>> - * scan_unevictable_pages [vm] sysctl handler. On demand re-scan of
>> - * all nodes' unevictable lists for evictable pages
>> - */
>> -unsigned long scan_unevictable_pages;
>> -
>> -int scan_unevictable_handler(struct ctl_table *table, int write,
>> - void __user *buffer,
>> - size_t *length, loff_t *ppos)
>> -{
>> - proc_doulongvec_minmax(table, write, buffer, length, ppos);
>> -
>> - if (write && *(unsigned long *)table->data)
>> - scan_all_zones_unevictable_pages();
>> -
>> - scan_unevictable_pages = 0;
>> - return 0;
>> -}
>> -
>> -#ifdef CONFIG_NUMA
>> -/*
>> - * per node 'scan_unevictable_pages' attribute. On demand re-scan of
>> - * a specified node's per zone unevictable lists for evictable pages.
>> - */
>> -
>> -static ssize_t read_scan_unevictable_node(struct sys_device *dev,
>> - struct sysdev_attribute *attr,
>> - char *buf)
>> -{
>> - return sprintf(buf, "0\n"); /* always zero; should fit... */
>> -}
>> -
>> -static ssize_t write_scan_unevictable_node(struct sys_device *dev,
>> - struct sysdev_attribute *attr,
>> - const char *buf, size_t count)
>> -{
>> - struct zone *node_zones = NODE_DATA(dev->id)->node_zones;
>> - struct zone *zone;
>> - unsigned long res;
>> - unsigned long req = strict_strtoul(buf, 10, &res);
>> -
>> - if (!req)
>> - return 1; /* zero is no-op */
>> -
>> - for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
>> - if (!populated_zone(zone))
>> - continue;
>> - scan_zone_unevictable_pages(zone);
>> - }
>> - return 1;
>> -}
>> -
>> -
>> -static SYSDEV_ATTR(scan_unevictable_pages, S_IRUGO | S_IWUSR,
>> - read_scan_unevictable_node,
>> - write_scan_unevictable_node);
>> -
>> -int scan_unevictable_register_node(struct node *node)
>> -{
>> - return sysdev_create_file(&node->sysdev, &attr_scan_unevictable_pages);
>> -}
>> -
>> -void scan_unevictable_unregister_node(struct node *node)
>> -{
>> - sysdev_remove_file(&node->sysdev, &attr_scan_unevictable_pages);
>> -}
>> -#endif
>> --
>> 1.7.6.2
>>
>>
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages
2011-09-26 12:10 ` kautuk.c @samsung.com
2011-09-26 12:29 ` kautuk.c @samsung.com
@ 2011-09-26 14:18 ` Johannes Weiner
1 sibling, 0 replies; 14+ messages in thread
From: Johannes Weiner @ 2011-09-26 14:18 UTC (permalink / raw)
To: kautuk.c @samsung.com
Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro
On Mon, Sep 26, 2011 at 05:40:52PM +0530, kautuk.c @samsung.com wrote:
> On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote:
> > On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote:
> >> write_scan_unavictable_node checks the value req returned by
> >> strict_strtoul and returns 1 if req is 0.
> >>
> >> However, when strict_strtoul returns 0, it means successful conversion
> >> of buf to unsigned long.
> >>
> >> Due to this, the function was not proceeding to scan the zones for
> >> unevictable pages even though we write a valid value to the
> >> scan_unevictable_pages sys file.
> >
> > Given that there is not a real reason for this knob (anymore) and that
> > it apparently never really worked since the day it was introduced, how
> > about we just drop all that code instead?
> >
> > Hannes
> >
> > ---
> > From: Johannes Weiner <jweiner@redhat.com>
> > Subject: mm: remove sysctl to manually rescue unevictable pages
> >
> > At one point, anonymous pages were supposed to go on the unevictable
> > list when no swap space was configured, and the idea was to manually
> > rescue those pages after adding swap and making them evictable again.
> > But nowadays, swap-backed pages on the anon LRU list are not scanned
> > without available swap space anyway, so there is no point in moving
> > them to a separate list anymore.
>
> Is this code only for anonymous pages ?
> It seems to look at all pages in the zone both file as well as anon.
The code scans both, but the usecase I described was moving swapbacked
pages from the unevictable list after configuring swap space.
> > The manual rescue could also be used in case pages were stranded on
> > the unevictable list due to race conditions. But the code has been
> > around for a while now and newly discovered bugs should be properly
> > reported and dealt with instead of relying on such a manual fixup.
>
> What you say seems to be all right for anon pages, but what about file
> pages ?
> I'm not sure about how this could happen, but what if some file-system caused
> a file cache page to be set to evictable or reclaimable without
> actually removing
> that page from the unevictable list ?
Currently, unevictable file pages come from ramfs, shmem, and mlock.
ramfs never makes them evictable. shmem, after making an inode
evictable again, scans the that inode's address_space and rescues the
pages it finds. munlock synchroneously rescues the pages vma's range.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages
2011-09-26 12:29 ` kautuk.c @samsung.com
@ 2011-09-26 14:25 ` Johannes Weiner
2011-09-26 14:55 ` kautuk.c @samsung.com
0 siblings, 1 reply; 14+ messages in thread
From: Johannes Weiner @ 2011-09-26 14:25 UTC (permalink / raw)
To: kautuk.c @samsung.com
Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro
On Mon, Sep 26, 2011 at 05:59:39PM +0530, kautuk.c @samsung.com wrote:
> On Mon, Sep 26, 2011 at 5:40 PM, kautuk.c @samsung.com
> <consul.kautuk@gmail.com> wrote:
> > On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote:
> >> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote:
> >>> write_scan_unavictable_node checks the value req returned by
> >>> strict_strtoul and returns 1 if req is 0.
> >>>
> >>> However, when strict_strtoul returns 0, it means successful conversion
> >>> of buf to unsigned long.
> >>>
> >>> Due to this, the function was not proceeding to scan the zones for
> >>> unevictable pages even though we write a valid value to the
> >>> scan_unevictable_pages sys file.
> >>
> >> Given that there is not a real reason for this knob (anymore) and that
> >> it apparently never really worked since the day it was introduced, how
> >> about we just drop all that code instead?
> >>
> >> Hannes
> >>
> >> ---
> >> From: Johannes Weiner <jweiner@redhat.com>
> >> Subject: mm: remove sysctl to manually rescue unevictable pages
> >>
> >> At one point, anonymous pages were supposed to go on the unevictable
> >> list when no swap space was configured, and the idea was to manually
> >> rescue those pages after adding swap and making them evictable again.
> >> But nowadays, swap-backed pages on the anon LRU list are not scanned
> >> without available swap space anyway, so there is no point in moving
> >> them to a separate list anymore.
> >
> > Is this code only for anonymous pages ?
> > It seems to look at all pages in the zone both file as well as anon.
> >
> >>
> >> The manual rescue could also be used in case pages were stranded on
> >> the unevictable list due to race conditions. But the code has been
> >> around for a while now and newly discovered bugs should be properly
> >> reported and dealt with instead of relying on such a manual fixup.
> >
> > What you say seems to be all right for anon pages, but what about file
> > pages ?
> > I'm not sure about how this could happen, but what if some file-system caused
> > a file cache page to be set to evictable or reclaimable without
> > actually removing
> > that page from the unevictable list ?
>
> What I would like to also add is that while the transition of an anon
> page from and
> to the unevictable lists is straight-forward, should we make the same assumption
> about file cache pages ?
We should make no assumptions if our code base is open source :-)
> I am not sure about this, but could a file-system cause this kind of a problem
> independent of the mlocking behaviour of a user-mode app ?
Currently, I only see shmem and ramfs meddling with unevictability
outside of mlock and they both look correct to me.
I'd say that if a filesystem required this knob and user-intervention
for the VM to behave correctly, it needs fixing.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages
2011-09-26 14:25 ` Johannes Weiner
@ 2011-09-26 14:55 ` kautuk.c @samsung.com
0 siblings, 0 replies; 14+ messages in thread
From: kautuk.c @samsung.com @ 2011-09-26 14:55 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro
On Mon, Sep 26, 2011 at 7:55 PM, Johannes Weiner <jweiner@redhat.com> wrote:
> On Mon, Sep 26, 2011 at 05:59:39PM +0530, kautuk.c @samsung.com wrote:
>> On Mon, Sep 26, 2011 at 5:40 PM, kautuk.c @samsung.com
>> <consul.kautuk@gmail.com> wrote:
>> > On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote:
>> >> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote:
>> >>> write_scan_unavictable_node checks the value req returned by
>> >>> strict_strtoul and returns 1 if req is 0.
>> >>>
>> >>> However, when strict_strtoul returns 0, it means successful conversion
>> >>> of buf to unsigned long.
>> >>>
>> >>> Due to this, the function was not proceeding to scan the zones for
>> >>> unevictable pages even though we write a valid value to the
>> >>> scan_unevictable_pages sys file.
>> >>
>> >> Given that there is not a real reason for this knob (anymore) and that
>> >> it apparently never really worked since the day it was introduced, how
>> >> about we just drop all that code instead?
>> >>
>> >> Hannes
>> >>
>> >> ---
>> >> From: Johannes Weiner <jweiner@redhat.com>
>> >> Subject: mm: remove sysctl to manually rescue unevictable pages
>> >>
>> >> At one point, anonymous pages were supposed to go on the unevictable
>> >> list when no swap space was configured, and the idea was to manually
>> >> rescue those pages after adding swap and making them evictable again.
>> >> But nowadays, swap-backed pages on the anon LRU list are not scanned
>> >> without available swap space anyway, so there is no point in moving
>> >> them to a separate list anymore.
>> >
>> > Is this code only for anonymous pages ?
>> > It seems to look at all pages in the zone both file as well as anon.
>> >
>> >>
>> >> The manual rescue could also be used in case pages were stranded on
>> >> the unevictable list due to race conditions. But the code has been
>> >> around for a while now and newly discovered bugs should be properly
>> >> reported and dealt with instead of relying on such a manual fixup.
>> >
>> > What you say seems to be all right for anon pages, but what about file
>> > pages ?
>> > I'm not sure about how this could happen, but what if some file-system caused
>> > a file cache page to be set to evictable or reclaimable without
>> > actually removing
>> > that page from the unevictable list ?
>>
>> What I would like to also add is that while the transition of an anon
>> page from and
>> to the unevictable lists is straight-forward, should we make the same assumption
>> about file cache pages ?
>
> We should make no assumptions if our code base is open source :-)
>
>> I am not sure about this, but could a file-system cause this kind of a problem
>> independent of the mlocking behaviour of a user-mode app ?
>
> Currently, I only see shmem and ramfs meddling with unevictability
> outside of mlock and they both look correct to me.
>
> I'd say that if a filesystem required this knob and user-intervention
> for the VM to behave correctly, it needs fixing.
Yes I agree. Otherwise that kind of defeats the overall purpose of
open-source. :)
Thanks for review and the info.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages
2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner
2011-09-26 12:10 ` kautuk.c @samsung.com
@ 2011-09-26 23:11 ` Andrew Morton
2011-09-27 7:27 ` [patch] mm: disable user interface " Johannes Weiner
2011-09-28 1:00 ` [patch] mm: remove sysctl " KOSAKI Motohiro
2 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2011-09-26 23:11 UTC (permalink / raw)
To: Johannes Weiner
Cc: Kautuk Consul, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro, Andrew Morton
On Mon, 26 Sep 2011 13:29:45 +0200
Johannes Weiner <jweiner@redhat.com> wrote:
> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote:
> > write_scan_unavictable_node checks the value req returned by
> > strict_strtoul and returns 1 if req is 0.
> >
> > However, when strict_strtoul returns 0, it means successful conversion
> > of buf to unsigned long.
> >
> > Due to this, the function was not proceeding to scan the zones for
> > unevictable pages even though we write a valid value to the
> > scan_unevictable_pages sys file.
>
> Given that there is not a real reason for this knob (anymore) and that
> it apparently never really worked since the day it was introduced, how
> about we just drop all that code instead?
>
Yes, let's remove it if at all possible.
However, to be nice to people I do think we should emit a once-per-boot
printk when someone tries to use it, then remove it for real at least a
couple of kernel cycles later. Just in case someone's script or tuning
app is trying to open that procfs file.
>
> ---
> From: Johannes Weiner <jweiner@redhat.com>
> Subject: mm: remove sysctl to manually rescue unevictable pages
>
> At one point, anonymous pages were supposed to go on the unevictable
> list when no swap space was configured, and the idea was to manually
> rescue those pages after adding swap and making them evictable again.
> But nowadays, swap-backed pages on the anon LRU list are not scanned
> without available swap space anyway, so there is no point in moving
> them to a separate list anymore.
>
> The manual rescue could also be used in case pages were stranded on
> the unevictable list due to race conditions. But the code has been
> around for a while now and newly discovered bugs should be properly
> reported and dealt with instead of relying on such a manual fixup.
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
The changelog failed to note that the sysctl doesn't actually *work*.
This is a pretty strong argument for removing it ;)
Also, a reported-by:Kautuk would have been nice.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch] mm: disable user interface to manually rescue unevictable pages
2011-09-26 23:11 ` Andrew Morton
@ 2011-09-27 7:27 ` Johannes Weiner
2011-09-28 2:14 ` Minchan Kim
2011-09-28 4:51 ` KAMEZAWA Hiroyuki
0 siblings, 2 replies; 14+ messages in thread
From: Johannes Weiner @ 2011-09-27 7:27 UTC (permalink / raw)
To: Andrew Morton
Cc: Kautuk Consul, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro, Andrew Morton
At one point, anonymous pages were supposed to go on the unevictable
list when no swap space was configured, and the idea was to manually
rescue those pages after adding swap and making them evictable again.
But nowadays, swap-backed pages on the anon LRU list are not scanned
without available swap space anyway, so there is no point in moving
them to a separate list anymore.
The manual rescue could also be used in case pages were stranded on
the unevictable list due to race conditions. But the code has been
around for a while now and newly discovered bugs should be properly
reported and dealt with instead of relying on such a manual fixup.
In addition to the lack of a usecase, the sysfs interface to rescue
pages from a specific NUMA node has been broken since its
introduction, so it's unlikely that anybody ever relied on that.
This patch removes the functionality behind the sysctl and the
node-interface and emits a one-time warning when somebody tries to
access either of them.
Signed-off-by: Johannes Weiner <jweiner@redhat.com>
Reported-by: Kautuk Consul <consul.kautuk@gmail.com>
---
mm/vmscan.c | 84 +++++-----------------------------------------------------
1 files changed, 8 insertions(+), 76 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7502726..71b5616 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3397,66 +3397,12 @@ void scan_mapping_unevictable_pages(struct address_space *mapping)
}
-/**
- * scan_zone_unevictable_pages - check unevictable list for evictable pages
- * @zone - zone of which to scan the unevictable list
- *
- * Scan @zone's unevictable LRU lists to check for pages that have become
- * evictable. Move those that have to @zone's inactive list where they
- * become candidates for reclaim, unless shrink_inactive_zone() decides
- * to reactivate them. Pages that are still unevictable are rotated
- * back onto @zone's unevictable list.
- */
-#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */
-static void scan_zone_unevictable_pages(struct zone *zone)
+static void warn_scan_unevictable_pages(void)
{
- struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list;
- unsigned long scan;
- unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE);
-
- while (nr_to_scan > 0) {
- unsigned long batch_size = min(nr_to_scan,
- SCAN_UNEVICTABLE_BATCH_SIZE);
-
- spin_lock_irq(&zone->lru_lock);
- for (scan = 0; scan < batch_size; scan++) {
- struct page *page = lru_to_page(l_unevictable);
-
- if (!trylock_page(page))
- continue;
-
- prefetchw_prev_lru_page(page, l_unevictable, flags);
-
- if (likely(PageLRU(page) && PageUnevictable(page)))
- check_move_unevictable_page(page, zone);
-
- unlock_page(page);
- }
- spin_unlock_irq(&zone->lru_lock);
-
- nr_to_scan -= batch_size;
- }
-}
-
-
-/**
- * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages
- *
- * A really big hammer: scan all zones' unevictable LRU lists to check for
- * pages that have become evictable. Move those back to the zones'
- * inactive list where they become candidates for reclaim.
- * This occurs when, e.g., we have unswappable pages on the unevictable lists,
- * and we add swap to the system. As such, it runs in the context of a task
- * that has possibly/probably made some previously unevictable pages
- * evictable.
- */
-static void scan_all_zones_unevictable_pages(void)
-{
- struct zone *zone;
-
- for_each_zone(zone) {
- scan_zone_unevictable_pages(zone);
- }
+ printk_once(KERN_WARNING
+ "The scan_unevictable_pages sysctl/node-interface has been "
+ "disabled for lack of a legitimate use case. If you have "
+ "one, please send an email to linux-mm@kvack.org.\n");
}
/*
@@ -3469,11 +3415,8 @@ int scan_unevictable_handler(struct ctl_table *table, int write,
void __user *buffer,
size_t *length, loff_t *ppos)
{
+ warn_scan_unevictable_pages();
proc_doulongvec_minmax(table, write, buffer, length, ppos);
-
- if (write && *(unsigned long *)table->data)
- scan_all_zones_unevictable_pages();
-
scan_unevictable_pages = 0;
return 0;
}
@@ -3488,6 +3431,7 @@ static ssize_t read_scan_unevictable_node(struct sys_device *dev,
struct sysdev_attribute *attr,
char *buf)
{
+ warn_scan_unevictable_pages();
return sprintf(buf, "0\n"); /* always zero; should fit... */
}
@@ -3495,19 +3439,7 @@ static ssize_t write_scan_unevictable_node(struct sys_device *dev,
struct sysdev_attribute *attr,
const char *buf, size_t count)
{
- struct zone *node_zones = NODE_DATA(dev->id)->node_zones;
- struct zone *zone;
- unsigned long res;
- unsigned long req = strict_strtoul(buf, 10, &res);
-
- if (!req)
- return 1; /* zero is no-op */
-
- for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) {
- if (!populated_zone(zone))
- continue;
- scan_zone_unevictable_pages(zone);
- }
+ warn_scan_unevictable_pages();
return 1;
}
--
1.7.6.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages
2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner
2011-09-26 12:10 ` kautuk.c @samsung.com
2011-09-26 23:11 ` Andrew Morton
@ 2011-09-28 1:00 ` KOSAKI Motohiro
2 siblings, 0 replies; 14+ messages in thread
From: KOSAKI Motohiro @ 2011-09-28 1:00 UTC (permalink / raw)
To: jweiner
Cc: consul.kautuk, akpm, mel, minchan.kim, kamezawa.hiroyu, riel,
linux-mm, linux-kernel, lee.schermerhorn
(2011/09/26 20:29), Johannes Weiner wrote:
> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote:
>> write_scan_unavictable_node checks the value req returned by
>> strict_strtoul and returns 1 if req is 0.
>>
>> However, when strict_strtoul returns 0, it means successful conversion
>> of buf to unsigned long.
>>
>> Due to this, the function was not proceeding to scan the zones for
>> unevictable pages even though we write a valid value to the
>> scan_unevictable_pages sys file.
>
> Given that there is not a real reason for this knob (anymore) and that
> it apparently never really worked since the day it was introduced, how
> about we just drop all that code instead?
>
> Hannes
>
> ---
> From: Johannes Weiner <jweiner@redhat.com>
> Subject: mm: remove sysctl to manually rescue unevictable pages
>
> At one point, anonymous pages were supposed to go on the unevictable
> list when no swap space was configured, and the idea was to manually
> rescue those pages after adding swap and making them evictable again.
> But nowadays, swap-backed pages on the anon LRU list are not scanned
> without available swap space anyway, so there is no point in moving
> them to a separate list anymore.
>
> The manual rescue could also be used in case pages were stranded on
> the unevictable list due to race conditions. But the code has been
> around for a while now and newly discovered bugs should be properly
> reported and dealt with instead of relying on such a manual fixup.
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
About three years ago when we introduced unevictable pages feature,
we were worry about there are mlock, shmmem-lock abuse in the real
world and we broke such assumption. And we expected this knob help
to dig unevictable pages bug report. Briefly says, If this knob works
meangfully, our unevictable handling code or their driver code are buggy.
Fortunately, Such bug report was never happen. So, this knob finished
the role.
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: disable user interface to manually rescue unevictable pages
2011-09-27 7:27 ` [patch] mm: disable user interface " Johannes Weiner
@ 2011-09-28 2:14 ` Minchan Kim
2011-10-04 10:10 ` Johannes Weiner
2011-09-28 4:51 ` KAMEZAWA Hiroyuki
1 sibling, 1 reply; 14+ messages in thread
From: Minchan Kim @ 2011-09-28 2:14 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Kautuk Consul, Mel Gorman, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro, Andrew Morton
On Tue, Sep 27, 2011 at 09:27:14AM +0200, Johannes Weiner wrote:
> At one point, anonymous pages were supposed to go on the unevictable
> list when no swap space was configured, and the idea was to manually
> rescue those pages after adding swap and making them evictable again.
> But nowadays, swap-backed pages on the anon LRU list are not scanned
> without available swap space anyway, so there is no point in moving
> them to a separate list anymore.
>
> The manual rescue could also be used in case pages were stranded on
> the unevictable list due to race conditions. But the code has been
> around for a while now and newly discovered bugs should be properly
> reported and dealt with instead of relying on such a manual fixup.
>
> In addition to the lack of a usecase, the sysfs interface to rescue
> pages from a specific NUMA node has been broken since its
> introduction, so it's unlikely that anybody ever relied on that.
>
> This patch removes the functionality behind the sysctl and the
> node-interface and emits a one-time warning when somebody tries to
> access either of them.
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> Reported-by: Kautuk Consul <consul.kautuk@gmail.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
There is a nitpick at below but I don't care of it.
> ---
> mm/vmscan.c | 84 +++++-----------------------------------------------------
> 1 files changed, 8 insertions(+), 76 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 7502726..71b5616 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3397,66 +3397,12 @@ void scan_mapping_unevictable_pages(struct address_space *mapping)
>
> }
>
> -/**
> - * scan_zone_unevictable_pages - check unevictable list for evictable pages
> - * @zone - zone of which to scan the unevictable list
> - *
> - * Scan @zone's unevictable LRU lists to check for pages that have become
> - * evictable. Move those that have to @zone's inactive list where they
> - * become candidates for reclaim, unless shrink_inactive_zone() decides
> - * to reactivate them. Pages that are still unevictable are rotated
> - * back onto @zone's unevictable list.
> - */
> -#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */
> -static void scan_zone_unevictable_pages(struct zone *zone)
> +static void warn_scan_unevictable_pages(void)
> {
> - struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list;
> - unsigned long scan;
> - unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE);
> -
> - while (nr_to_scan > 0) {
> - unsigned long batch_size = min(nr_to_scan,
> - SCAN_UNEVICTABLE_BATCH_SIZE);
> -
> - spin_lock_irq(&zone->lru_lock);
> - for (scan = 0; scan < batch_size; scan++) {
> - struct page *page = lru_to_page(l_unevictable);
> -
> - if (!trylock_page(page))
> - continue;
> -
> - prefetchw_prev_lru_page(page, l_unevictable, flags);
> -
> - if (likely(PageLRU(page) && PageUnevictable(page)))
> - check_move_unevictable_page(page, zone);
> -
> - unlock_page(page);
> - }
> - spin_unlock_irq(&zone->lru_lock);
> -
> - nr_to_scan -= batch_size;
> - }
> -}
> -
> -
> -/**
> - * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages
> - *
> - * A really big hammer: scan all zones' unevictable LRU lists to check for
> - * pages that have become evictable. Move those back to the zones'
> - * inactive list where they become candidates for reclaim.
> - * This occurs when, e.g., we have unswappable pages on the unevictable lists,
> - * and we add swap to the system. As such, it runs in the context of a task
> - * that has possibly/probably made some previously unevictable pages
> - * evictable.
> - */
> -static void scan_all_zones_unevictable_pages(void)
> -{
> - struct zone *zone;
> -
> - for_each_zone(zone) {
> - scan_zone_unevictable_pages(zone);
> - }
> + printk_once(KERN_WARNING
> + "The scan_unevictable_pages sysctl/node-interface has been "
> + "disabled for lack of a legitimate use case. If you have "
> + "one, please send an email to linux-mm@kvack.org.\n");
> }
>
> /*
> @@ -3469,11 +3415,8 @@ int scan_unevictable_handler(struct ctl_table *table, int write,
> void __user *buffer,
> size_t *length, loff_t *ppos)
> {
> + warn_scan_unevictable_pages();
> proc_doulongvec_minmax(table, write, buffer, length, ppos);
> -
> - if (write && *(unsigned long *)table->data)
> - scan_all_zones_unevictable_pages();
> -
> scan_unevictable_pages = 0;
Nitpick:
Could we remove this resetting with zero?
--
Kinds regards,
Minchan Kim
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: disable user interface to manually rescue unevictable pages
2011-09-27 7:27 ` [patch] mm: disable user interface " Johannes Weiner
2011-09-28 2:14 ` Minchan Kim
@ 2011-09-28 4:51 ` KAMEZAWA Hiroyuki
1 sibling, 0 replies; 14+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-09-28 4:51 UTC (permalink / raw)
To: Johannes Weiner
Cc: Andrew Morton, Kautuk Consul, Mel Gorman, Minchan Kim,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro, Andrew Morton
On Tue, 27 Sep 2011 09:27:14 +0200
Johannes Weiner <jweiner@redhat.com> wrote:
> At one point, anonymous pages were supposed to go on the unevictable
> list when no swap space was configured, and the idea was to manually
> rescue those pages after adding swap and making them evictable again.
> But nowadays, swap-backed pages on the anon LRU list are not scanned
> without available swap space anyway, so there is no point in moving
> them to a separate list anymore.
>
> The manual rescue could also be used in case pages were stranded on
> the unevictable list due to race conditions. But the code has been
> around for a while now and newly discovered bugs should be properly
> reported and dealt with instead of relying on such a manual fixup.
>
> In addition to the lack of a usecase, the sysfs interface to rescue
> pages from a specific NUMA node has been broken since its
> introduction, so it's unlikely that anybody ever relied on that.
>
> This patch removes the functionality behind the sysctl and the
> node-interface and emits a one-time warning when somebody tries to
> access either of them.
>
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> Reported-by: Kautuk Consul <consul.kautuk@gmail.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: disable user interface to manually rescue unevictable pages
2011-09-28 2:14 ` Minchan Kim
@ 2011-10-04 10:10 ` Johannes Weiner
0 siblings, 0 replies; 14+ messages in thread
From: Johannes Weiner @ 2011-10-04 10:10 UTC (permalink / raw)
To: Minchan Kim
Cc: Andrew Morton, Kautuk Consul, Mel Gorman, KAMEZAWA Hiroyuki,
Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn,
KOSAKI Motohiro, Andrew Morton
On Wed, Sep 28, 2011 at 11:14:24AM +0900, Minchan Kim wrote:
> On Tue, Sep 27, 2011 at 09:27:14AM +0200, Johannes Weiner wrote:
> > At one point, anonymous pages were supposed to go on the unevictable
> > list when no swap space was configured, and the idea was to manually
> > rescue those pages after adding swap and making them evictable again.
> > But nowadays, swap-backed pages on the anon LRU list are not scanned
> > without available swap space anyway, so there is no point in moving
> > them to a separate list anymore.
> >
> > The manual rescue could also be used in case pages were stranded on
> > the unevictable list due to race conditions. But the code has been
> > around for a while now and newly discovered bugs should be properly
> > reported and dealt with instead of relying on such a manual fixup.
> >
> > In addition to the lack of a usecase, the sysfs interface to rescue
> > pages from a specific NUMA node has been broken since its
> > introduction, so it's unlikely that anybody ever relied on that.
> >
> > This patch removes the functionality behind the sysctl and the
> > node-interface and emits a one-time warning when somebody tries to
> > access either of them.
> >
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > Reported-by: Kautuk Consul <consul.kautuk@gmail.com>
> Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
Thanks, Minchan.
> > @@ -3469,11 +3415,8 @@ int scan_unevictable_handler(struct ctl_table *table, int write,
> > void __user *buffer,
> > size_t *length, loff_t *ppos)
> > {
> > + warn_scan_unevictable_pages();
> > proc_doulongvec_minmax(table, write, buffer, length, ppos);
> > -
> > - if (write && *(unsigned long *)table->data)
> > - scan_all_zones_unevictable_pages();
> > -
> > scan_unevictable_pages = 0;
>
> Nitpick:
> Could we remove this resetting with zero?
table->data = &scan_unevictable_pages, so I let it in such that the
generic sysctl stuff to process the input and clean up afterward is
complete.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-10-04 10:11 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-09-25 10:59 [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node Kautuk Consul
2011-09-26 9:20 ` KAMEZAWA Hiroyuki
2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner
2011-09-26 12:10 ` kautuk.c @samsung.com
2011-09-26 12:29 ` kautuk.c @samsung.com
2011-09-26 14:25 ` Johannes Weiner
2011-09-26 14:55 ` kautuk.c @samsung.com
2011-09-26 14:18 ` Johannes Weiner
2011-09-26 23:11 ` Andrew Morton
2011-09-27 7:27 ` [patch] mm: disable user interface " Johannes Weiner
2011-09-28 2:14 ` Minchan Kim
2011-10-04 10:10 ` Johannes Weiner
2011-09-28 4:51 ` KAMEZAWA Hiroyuki
2011-09-28 1:00 ` [patch] mm: remove sysctl " KOSAKI Motohiro
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).