* [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node @ 2011-09-25 10:59 Kautuk Consul 2011-09-26 9:20 ` KAMEZAWA Hiroyuki 2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner 0 siblings, 2 replies; 14+ messages in thread From: Kautuk Consul @ 2011-09-25 10:59 UTC (permalink / raw) To: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel Cc: linux-mm, linux-kernel, Kautuk Consul write_scan_unavictable_node checks the value req returned by strict_strtoul and returns 1 if req is 0. However, when strict_strtoul returns 0, it means successful conversion of buf to unsigned long. Due to this, the function was not proceeding to scan the zones for unevictable pages even though we write a valid value to the scan_unevictable_pages sys file. Changing this if check slightly to check for invalid value in buf as well as 0 value stored in res after successful conversion via strict_strtoul. In both cases, we do not perform the scanning of this node's zones. Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com> --- mm/vmscan.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index b55699c..73996e6 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3450,8 +3450,8 @@ static ssize_t write_scan_unevictable_node(struct sys_device *dev, unsigned long res; unsigned long req = strict_strtoul(buf, 10, &res); - if (!req) - return 1; /* zero is no-op */ + if (req || !res) + return 1; /* Invalid input or zero is no-op */ for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) { if (!populated_zone(zone)) -- 1.7.4.1 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node 2011-09-25 10:59 [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node Kautuk Consul @ 2011-09-26 9:20 ` KAMEZAWA Hiroyuki 2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner 1 sibling, 0 replies; 14+ messages in thread From: KAMEZAWA Hiroyuki @ 2011-09-26 9:20 UTC (permalink / raw) To: Kautuk Consul Cc: Andrew Morton, Mel Gorman, Minchan Kim, Rik van Riel, linux-mm, linux-kernel On Sun, 25 Sep 2011 16:29:40 +0530 Kautuk Consul <consul.kautuk@gmail.com> wrote: > write_scan_unavictable_node checks the value req returned by > strict_strtoul and returns 1 if req is 0. > > However, when strict_strtoul returns 0, it means successful conversion > of buf to unsigned long. > > Due to this, the function was not proceeding to scan the zones for > unevictable pages even though we write a valid value to the > scan_unevictable_pages sys file. > > Changing this if check slightly to check for invalid value > in buf as well as 0 value stored in res after successful conversion > via strict_strtoul. > In both cases, we do not perform the scanning of this node's zones. > > Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch] mm: remove sysctl to manually rescue unevictable pages 2011-09-25 10:59 [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node Kautuk Consul 2011-09-26 9:20 ` KAMEZAWA Hiroyuki @ 2011-09-26 11:29 ` Johannes Weiner 2011-09-26 12:10 ` kautuk.c @samsung.com ` (2 more replies) 1 sibling, 3 replies; 14+ messages in thread From: Johannes Weiner @ 2011-09-26 11:29 UTC (permalink / raw) To: Kautuk Consul Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote: > write_scan_unavictable_node checks the value req returned by > strict_strtoul and returns 1 if req is 0. > > However, when strict_strtoul returns 0, it means successful conversion > of buf to unsigned long. > > Due to this, the function was not proceeding to scan the zones for > unevictable pages even though we write a valid value to the > scan_unevictable_pages sys file. Given that there is not a real reason for this knob (anymore) and that it apparently never really worked since the day it was introduced, how about we just drop all that code instead? Hannes --- From: Johannes Weiner <jweiner@redhat.com> Subject: mm: remove sysctl to manually rescue unevictable pages At one point, anonymous pages were supposed to go on the unevictable list when no swap space was configured, and the idea was to manually rescue those pages after adding swap and making them evictable again. But nowadays, swap-backed pages on the anon LRU list are not scanned without available swap space anyway, so there is no point in moving them to a separate list anymore. The manual rescue could also be used in case pages were stranded on the unevictable list due to race conditions. But the code has been around for a while now and newly discovered bugs should be properly reported and dealt with instead of relying on such a manual fixup. Signed-off-by: Johannes Weiner <jweiner@redhat.com> --- drivers/base/node.c | 3 - include/linux/swap.h | 16 ------ kernel/sysctl.c | 7 --- mm/vmscan.c | 130 -------------------------------------------------- 4 files changed, 0 insertions(+), 156 deletions(-) diff --git a/drivers/base/node.c b/drivers/base/node.c index 9e58e71..b9d6e93 100644 --- a/drivers/base/node.c +++ b/drivers/base/node.c @@ -278,8 +278,6 @@ int register_node(struct node *node, int num, struct node *parent) sysdev_create_file(&node->sysdev, &attr_distance); sysdev_create_file_optional(&node->sysdev, &attr_vmstat); - scan_unevictable_register_node(node); - hugetlb_register_node(node); compaction_register_node(node); @@ -303,7 +301,6 @@ void unregister_node(struct node *node) sysdev_remove_file(&node->sysdev, &attr_distance); sysdev_remove_file_optional(&node->sysdev, &attr_vmstat); - scan_unevictable_unregister_node(node); hugetlb_unregister_node(node); /* no-op, if memoryless node */ sysdev_unregister(&node->sysdev); diff --git a/include/linux/swap.h b/include/linux/swap.h index b156e80..a6a9ee5 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -276,22 +276,6 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order) extern int page_evictable(struct page *page, struct vm_area_struct *vma); extern void scan_mapping_unevictable_pages(struct address_space *); -extern unsigned long scan_unevictable_pages; -extern int scan_unevictable_handler(struct ctl_table *, int, - void __user *, size_t *, loff_t *); -#ifdef CONFIG_NUMA -extern int scan_unevictable_register_node(struct node *node); -extern void scan_unevictable_unregister_node(struct node *node); -#else -static inline int scan_unevictable_register_node(struct node *node) -{ - return 0; -} -static inline void scan_unevictable_unregister_node(struct node *node) -{ -} -#endif - extern int kswapd_run(int nid); extern void kswapd_stop(int nid); #ifdef CONFIG_CGROUP_MEM_RES_CTLR diff --git a/kernel/sysctl.c b/kernel/sysctl.c index 4f057f9..0d66092 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -1325,13 +1325,6 @@ static struct ctl_table vm_table[] = { .extra2 = &one, }, #endif - { - .procname = "scan_unevictable_pages", - .data = &scan_unevictable_pages, - .maxlen = sizeof(scan_unevictable_pages), - .mode = 0644, - .proc_handler = scan_unevictable_handler, - }, #ifdef CONFIG_MEMORY_FAILURE { .procname = "memory_failure_early_kill", diff --git a/mm/vmscan.c b/mm/vmscan.c index 7502726..c99a097 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3396,133 +3396,3 @@ void scan_mapping_unevictable_pages(struct address_space *mapping) } } - -/** - * scan_zone_unevictable_pages - check unevictable list for evictable pages - * @zone - zone of which to scan the unevictable list - * - * Scan @zone's unevictable LRU lists to check for pages that have become - * evictable. Move those that have to @zone's inactive list where they - * become candidates for reclaim, unless shrink_inactive_zone() decides - * to reactivate them. Pages that are still unevictable are rotated - * back onto @zone's unevictable list. - */ -#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */ -static void scan_zone_unevictable_pages(struct zone *zone) -{ - struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list; - unsigned long scan; - unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE); - - while (nr_to_scan > 0) { - unsigned long batch_size = min(nr_to_scan, - SCAN_UNEVICTABLE_BATCH_SIZE); - - spin_lock_irq(&zone->lru_lock); - for (scan = 0; scan < batch_size; scan++) { - struct page *page = lru_to_page(l_unevictable); - - if (!trylock_page(page)) - continue; - - prefetchw_prev_lru_page(page, l_unevictable, flags); - - if (likely(PageLRU(page) && PageUnevictable(page))) - check_move_unevictable_page(page, zone); - - unlock_page(page); - } - spin_unlock_irq(&zone->lru_lock); - - nr_to_scan -= batch_size; - } -} - - -/** - * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages - * - * A really big hammer: scan all zones' unevictable LRU lists to check for - * pages that have become evictable. Move those back to the zones' - * inactive list where they become candidates for reclaim. - * This occurs when, e.g., we have unswappable pages on the unevictable lists, - * and we add swap to the system. As such, it runs in the context of a task - * that has possibly/probably made some previously unevictable pages - * evictable. - */ -static void scan_all_zones_unevictable_pages(void) -{ - struct zone *zone; - - for_each_zone(zone) { - scan_zone_unevictable_pages(zone); - } -} - -/* - * scan_unevictable_pages [vm] sysctl handler. On demand re-scan of - * all nodes' unevictable lists for evictable pages - */ -unsigned long scan_unevictable_pages; - -int scan_unevictable_handler(struct ctl_table *table, int write, - void __user *buffer, - size_t *length, loff_t *ppos) -{ - proc_doulongvec_minmax(table, write, buffer, length, ppos); - - if (write && *(unsigned long *)table->data) - scan_all_zones_unevictable_pages(); - - scan_unevictable_pages = 0; - return 0; -} - -#ifdef CONFIG_NUMA -/* - * per node 'scan_unevictable_pages' attribute. On demand re-scan of - * a specified node's per zone unevictable lists for evictable pages. - */ - -static ssize_t read_scan_unevictable_node(struct sys_device *dev, - struct sysdev_attribute *attr, - char *buf) -{ - return sprintf(buf, "0\n"); /* always zero; should fit... */ -} - -static ssize_t write_scan_unevictable_node(struct sys_device *dev, - struct sysdev_attribute *attr, - const char *buf, size_t count) -{ - struct zone *node_zones = NODE_DATA(dev->id)->node_zones; - struct zone *zone; - unsigned long res; - unsigned long req = strict_strtoul(buf, 10, &res); - - if (!req) - return 1; /* zero is no-op */ - - for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) { - if (!populated_zone(zone)) - continue; - scan_zone_unevictable_pages(zone); - } - return 1; -} - - -static SYSDEV_ATTR(scan_unevictable_pages, S_IRUGO | S_IWUSR, - read_scan_unevictable_node, - write_scan_unevictable_node); - -int scan_unevictable_register_node(struct node *node) -{ - return sysdev_create_file(&node->sysdev, &attr_scan_unevictable_pages); -} - -void scan_unevictable_unregister_node(struct node *node) -{ - sysdev_remove_file(&node->sysdev, &attr_scan_unevictable_pages); -} -#endif -- 1.7.6.2 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages 2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner @ 2011-09-26 12:10 ` kautuk.c @samsung.com 2011-09-26 12:29 ` kautuk.c @samsung.com 2011-09-26 14:18 ` Johannes Weiner 2011-09-26 23:11 ` Andrew Morton 2011-09-28 1:00 ` [patch] mm: remove sysctl " KOSAKI Motohiro 2 siblings, 2 replies; 14+ messages in thread From: kautuk.c @samsung.com @ 2011-09-26 12:10 UTC (permalink / raw) To: Johannes Weiner Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote: > On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote: >> write_scan_unavictable_node checks the value req returned by >> strict_strtoul and returns 1 if req is 0. >> >> However, when strict_strtoul returns 0, it means successful conversion >> of buf to unsigned long. >> >> Due to this, the function was not proceeding to scan the zones for >> unevictable pages even though we write a valid value to the >> scan_unevictable_pages sys file. > > Given that there is not a real reason for this knob (anymore) and that > it apparently never really worked since the day it was introduced, how > about we just drop all that code instead? > > Hannes > > --- > From: Johannes Weiner <jweiner@redhat.com> > Subject: mm: remove sysctl to manually rescue unevictable pages > > At one point, anonymous pages were supposed to go on the unevictable > list when no swap space was configured, and the idea was to manually > rescue those pages after adding swap and making them evictable again. > But nowadays, swap-backed pages on the anon LRU list are not scanned > without available swap space anyway, so there is no point in moving > them to a separate list anymore. Is this code only for anonymous pages ? It seems to look at all pages in the zone both file as well as anon. > > The manual rescue could also be used in case pages were stranded on > the unevictable list due to race conditions. But the code has been > around for a while now and newly discovered bugs should be properly > reported and dealt with instead of relying on such a manual fixup. What you say seems to be all right for anon pages, but what about file pages ? I'm not sure about how this could happen, but what if some file-system caused a file cache page to be set to evictable or reclaimable without actually removing that page from the unevictable list ? > > Signed-off-by: Johannes Weiner <jweiner@redhat.com> > --- > drivers/base/node.c | 3 - > include/linux/swap.h | 16 ------ > kernel/sysctl.c | 7 --- > mm/vmscan.c | 130 -------------------------------------------------- > 4 files changed, 0 insertions(+), 156 deletions(-) > > diff --git a/drivers/base/node.c b/drivers/base/node.c > index 9e58e71..b9d6e93 100644 > --- a/drivers/base/node.c > +++ b/drivers/base/node.c > @@ -278,8 +278,6 @@ int register_node(struct node *node, int num, struct node *parent) > sysdev_create_file(&node->sysdev, &attr_distance); > sysdev_create_file_optional(&node->sysdev, &attr_vmstat); > > - scan_unevictable_register_node(node); > - > hugetlb_register_node(node); > > compaction_register_node(node); > @@ -303,7 +301,6 @@ void unregister_node(struct node *node) > sysdev_remove_file(&node->sysdev, &attr_distance); > sysdev_remove_file_optional(&node->sysdev, &attr_vmstat); > > - scan_unevictable_unregister_node(node); > hugetlb_unregister_node(node); /* no-op, if memoryless node */ > > sysdev_unregister(&node->sysdev); > diff --git a/include/linux/swap.h b/include/linux/swap.h > index b156e80..a6a9ee5 100644 > --- a/include/linux/swap.h > +++ b/include/linux/swap.h > @@ -276,22 +276,6 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order) > extern int page_evictable(struct page *page, struct vm_area_struct *vma); > extern void scan_mapping_unevictable_pages(struct address_space *); > > -extern unsigned long scan_unevictable_pages; > -extern int scan_unevictable_handler(struct ctl_table *, int, > - void __user *, size_t *, loff_t *); > -#ifdef CONFIG_NUMA > -extern int scan_unevictable_register_node(struct node *node); > -extern void scan_unevictable_unregister_node(struct node *node); > -#else > -static inline int scan_unevictable_register_node(struct node *node) > -{ > - return 0; > -} > -static inline void scan_unevictable_unregister_node(struct node *node) > -{ > -} > -#endif > - > extern int kswapd_run(int nid); > extern void kswapd_stop(int nid); > #ifdef CONFIG_CGROUP_MEM_RES_CTLR > diff --git a/kernel/sysctl.c b/kernel/sysctl.c > index 4f057f9..0d66092 100644 > --- a/kernel/sysctl.c > +++ b/kernel/sysctl.c > @@ -1325,13 +1325,6 @@ static struct ctl_table vm_table[] = { > .extra2 = &one, > }, > #endif > - { > - .procname = "scan_unevictable_pages", > - .data = &scan_unevictable_pages, > - .maxlen = sizeof(scan_unevictable_pages), > - .mode = 0644, > - .proc_handler = scan_unevictable_handler, > - }, > #ifdef CONFIG_MEMORY_FAILURE > { > .procname = "memory_failure_early_kill", > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 7502726..c99a097 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -3396,133 +3396,3 @@ void scan_mapping_unevictable_pages(struct address_space *mapping) > } > > } > - > -/** > - * scan_zone_unevictable_pages - check unevictable list for evictable pages > - * @zone - zone of which to scan the unevictable list > - * > - * Scan @zone's unevictable LRU lists to check for pages that have become > - * evictable. Move those that have to @zone's inactive list where they > - * become candidates for reclaim, unless shrink_inactive_zone() decides > - * to reactivate them. Pages that are still unevictable are rotated > - * back onto @zone's unevictable list. > - */ > -#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */ > -static void scan_zone_unevictable_pages(struct zone *zone) > -{ > - struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list; > - unsigned long scan; > - unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE); > - > - while (nr_to_scan > 0) { > - unsigned long batch_size = min(nr_to_scan, > - SCAN_UNEVICTABLE_BATCH_SIZE); > - > - spin_lock_irq(&zone->lru_lock); > - for (scan = 0; scan < batch_size; scan++) { > - struct page *page = lru_to_page(l_unevictable); > - > - if (!trylock_page(page)) > - continue; > - > - prefetchw_prev_lru_page(page, l_unevictable, flags); > - > - if (likely(PageLRU(page) && PageUnevictable(page))) > - check_move_unevictable_page(page, zone); > - > - unlock_page(page); > - } > - spin_unlock_irq(&zone->lru_lock); > - > - nr_to_scan -= batch_size; > - } > -} > - > - > -/** > - * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages > - * > - * A really big hammer: scan all zones' unevictable LRU lists to check for > - * pages that have become evictable. Move those back to the zones' > - * inactive list where they become candidates for reclaim. > - * This occurs when, e.g., we have unswappable pages on the unevictable lists, > - * and we add swap to the system. As such, it runs in the context of a task > - * that has possibly/probably made some previously unevictable pages > - * evictable. > - */ > -static void scan_all_zones_unevictable_pages(void) > -{ > - struct zone *zone; > - > - for_each_zone(zone) { > - scan_zone_unevictable_pages(zone); > - } > -} > - > -/* > - * scan_unevictable_pages [vm] sysctl handler. On demand re-scan of > - * all nodes' unevictable lists for evictable pages > - */ > -unsigned long scan_unevictable_pages; > - > -int scan_unevictable_handler(struct ctl_table *table, int write, > - void __user *buffer, > - size_t *length, loff_t *ppos) > -{ > - proc_doulongvec_minmax(table, write, buffer, length, ppos); > - > - if (write && *(unsigned long *)table->data) > - scan_all_zones_unevictable_pages(); > - > - scan_unevictable_pages = 0; > - return 0; > -} > - > -#ifdef CONFIG_NUMA > -/* > - * per node 'scan_unevictable_pages' attribute. On demand re-scan of > - * a specified node's per zone unevictable lists for evictable pages. > - */ > - > -static ssize_t read_scan_unevictable_node(struct sys_device *dev, > - struct sysdev_attribute *attr, > - char *buf) > -{ > - return sprintf(buf, "0\n"); /* always zero; should fit... */ > -} > - > -static ssize_t write_scan_unevictable_node(struct sys_device *dev, > - struct sysdev_attribute *attr, > - const char *buf, size_t count) > -{ > - struct zone *node_zones = NODE_DATA(dev->id)->node_zones; > - struct zone *zone; > - unsigned long res; > - unsigned long req = strict_strtoul(buf, 10, &res); > - > - if (!req) > - return 1; /* zero is no-op */ > - > - for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) { > - if (!populated_zone(zone)) > - continue; > - scan_zone_unevictable_pages(zone); > - } > - return 1; > -} > - > - > -static SYSDEV_ATTR(scan_unevictable_pages, S_IRUGO | S_IWUSR, > - read_scan_unevictable_node, > - write_scan_unevictable_node); > - > -int scan_unevictable_register_node(struct node *node) > -{ > - return sysdev_create_file(&node->sysdev, &attr_scan_unevictable_pages); > -} > - > -void scan_unevictable_unregister_node(struct node *node) > -{ > - sysdev_remove_file(&node->sysdev, &attr_scan_unevictable_pages); > -} > -#endif > -- > 1.7.6.2 > > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages 2011-09-26 12:10 ` kautuk.c @samsung.com @ 2011-09-26 12:29 ` kautuk.c @samsung.com 2011-09-26 14:25 ` Johannes Weiner 2011-09-26 14:18 ` Johannes Weiner 1 sibling, 1 reply; 14+ messages in thread From: kautuk.c @samsung.com @ 2011-09-26 12:29 UTC (permalink / raw) To: Johannes Weiner Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro On Mon, Sep 26, 2011 at 5:40 PM, kautuk.c @samsung.com <consul.kautuk@gmail.com> wrote: > On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote: >> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote: >>> write_scan_unavictable_node checks the value req returned by >>> strict_strtoul and returns 1 if req is 0. >>> >>> However, when strict_strtoul returns 0, it means successful conversion >>> of buf to unsigned long. >>> >>> Due to this, the function was not proceeding to scan the zones for >>> unevictable pages even though we write a valid value to the >>> scan_unevictable_pages sys file. >> >> Given that there is not a real reason for this knob (anymore) and that >> it apparently never really worked since the day it was introduced, how >> about we just drop all that code instead? >> >> Hannes >> >> --- >> From: Johannes Weiner <jweiner@redhat.com> >> Subject: mm: remove sysctl to manually rescue unevictable pages >> >> At one point, anonymous pages were supposed to go on the unevictable >> list when no swap space was configured, and the idea was to manually >> rescue those pages after adding swap and making them evictable again. >> But nowadays, swap-backed pages on the anon LRU list are not scanned >> without available swap space anyway, so there is no point in moving >> them to a separate list anymore. > > Is this code only for anonymous pages ? > It seems to look at all pages in the zone both file as well as anon. > >> >> The manual rescue could also be used in case pages were stranded on >> the unevictable list due to race conditions. But the code has been >> around for a while now and newly discovered bugs should be properly >> reported and dealt with instead of relying on such a manual fixup. > > What you say seems to be all right for anon pages, but what about file > pages ? > I'm not sure about how this could happen, but what if some file-system caused > a file cache page to be set to evictable or reclaimable without > actually removing > that page from the unevictable list ? What I would like to also add is that while the transition of an anon page from and to the unevictable lists is straight-forward, should we make the same assumption about file cache pages ? I am not sure about this, but could a file-system cause this kind of a problem independent of the mlocking behaviour of a user-mode app ? > >> >> Signed-off-by: Johannes Weiner <jweiner@redhat.com> >> --- >> drivers/base/node.c | 3 - >> include/linux/swap.h | 16 ------ >> kernel/sysctl.c | 7 --- >> mm/vmscan.c | 130 -------------------------------------------------- >> 4 files changed, 0 insertions(+), 156 deletions(-) >> >> diff --git a/drivers/base/node.c b/drivers/base/node.c >> index 9e58e71..b9d6e93 100644 >> --- a/drivers/base/node.c >> +++ b/drivers/base/node.c >> @@ -278,8 +278,6 @@ int register_node(struct node *node, int num, struct node *parent) >> sysdev_create_file(&node->sysdev, &attr_distance); >> sysdev_create_file_optional(&node->sysdev, &attr_vmstat); >> >> - scan_unevictable_register_node(node); >> - >> hugetlb_register_node(node); >> >> compaction_register_node(node); >> @@ -303,7 +301,6 @@ void unregister_node(struct node *node) >> sysdev_remove_file(&node->sysdev, &attr_distance); >> sysdev_remove_file_optional(&node->sysdev, &attr_vmstat); >> >> - scan_unevictable_unregister_node(node); >> hugetlb_unregister_node(node); /* no-op, if memoryless node */ >> >> sysdev_unregister(&node->sysdev); >> diff --git a/include/linux/swap.h b/include/linux/swap.h >> index b156e80..a6a9ee5 100644 >> --- a/include/linux/swap.h >> +++ b/include/linux/swap.h >> @@ -276,22 +276,6 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order) >> extern int page_evictable(struct page *page, struct vm_area_struct *vma); >> extern void scan_mapping_unevictable_pages(struct address_space *); >> >> -extern unsigned long scan_unevictable_pages; >> -extern int scan_unevictable_handler(struct ctl_table *, int, >> - void __user *, size_t *, loff_t *); >> -#ifdef CONFIG_NUMA >> -extern int scan_unevictable_register_node(struct node *node); >> -extern void scan_unevictable_unregister_node(struct node *node); >> -#else >> -static inline int scan_unevictable_register_node(struct node *node) >> -{ >> - return 0; >> -} >> -static inline void scan_unevictable_unregister_node(struct node *node) >> -{ >> -} >> -#endif >> - >> extern int kswapd_run(int nid); >> extern void kswapd_stop(int nid); >> #ifdef CONFIG_CGROUP_MEM_RES_CTLR >> diff --git a/kernel/sysctl.c b/kernel/sysctl.c >> index 4f057f9..0d66092 100644 >> --- a/kernel/sysctl.c >> +++ b/kernel/sysctl.c >> @@ -1325,13 +1325,6 @@ static struct ctl_table vm_table[] = { >> .extra2 = &one, >> }, >> #endif >> - { >> - .procname = "scan_unevictable_pages", >> - .data = &scan_unevictable_pages, >> - .maxlen = sizeof(scan_unevictable_pages), >> - .mode = 0644, >> - .proc_handler = scan_unevictable_handler, >> - }, >> #ifdef CONFIG_MEMORY_FAILURE >> { >> .procname = "memory_failure_early_kill", >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index 7502726..c99a097 100644 >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -3396,133 +3396,3 @@ void scan_mapping_unevictable_pages(struct address_space *mapping) >> } >> >> } >> - >> -/** >> - * scan_zone_unevictable_pages - check unevictable list for evictable pages >> - * @zone - zone of which to scan the unevictable list >> - * >> - * Scan @zone's unevictable LRU lists to check for pages that have become >> - * evictable. Move those that have to @zone's inactive list where they >> - * become candidates for reclaim, unless shrink_inactive_zone() decides >> - * to reactivate them. Pages that are still unevictable are rotated >> - * back onto @zone's unevictable list. >> - */ >> -#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */ >> -static void scan_zone_unevictable_pages(struct zone *zone) >> -{ >> - struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list; >> - unsigned long scan; >> - unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE); >> - >> - while (nr_to_scan > 0) { >> - unsigned long batch_size = min(nr_to_scan, >> - SCAN_UNEVICTABLE_BATCH_SIZE); >> - >> - spin_lock_irq(&zone->lru_lock); >> - for (scan = 0; scan < batch_size; scan++) { >> - struct page *page = lru_to_page(l_unevictable); >> - >> - if (!trylock_page(page)) >> - continue; >> - >> - prefetchw_prev_lru_page(page, l_unevictable, flags); >> - >> - if (likely(PageLRU(page) && PageUnevictable(page))) >> - check_move_unevictable_page(page, zone); >> - >> - unlock_page(page); >> - } >> - spin_unlock_irq(&zone->lru_lock); >> - >> - nr_to_scan -= batch_size; >> - } >> -} >> - >> - >> -/** >> - * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages >> - * >> - * A really big hammer: scan all zones' unevictable LRU lists to check for >> - * pages that have become evictable. Move those back to the zones' >> - * inactive list where they become candidates for reclaim. >> - * This occurs when, e.g., we have unswappable pages on the unevictable lists, >> - * and we add swap to the system. As such, it runs in the context of a task >> - * that has possibly/probably made some previously unevictable pages >> - * evictable. >> - */ >> -static void scan_all_zones_unevictable_pages(void) >> -{ >> - struct zone *zone; >> - >> - for_each_zone(zone) { >> - scan_zone_unevictable_pages(zone); >> - } >> -} >> - >> -/* >> - * scan_unevictable_pages [vm] sysctl handler. On demand re-scan of >> - * all nodes' unevictable lists for evictable pages >> - */ >> -unsigned long scan_unevictable_pages; >> - >> -int scan_unevictable_handler(struct ctl_table *table, int write, >> - void __user *buffer, >> - size_t *length, loff_t *ppos) >> -{ >> - proc_doulongvec_minmax(table, write, buffer, length, ppos); >> - >> - if (write && *(unsigned long *)table->data) >> - scan_all_zones_unevictable_pages(); >> - >> - scan_unevictable_pages = 0; >> - return 0; >> -} >> - >> -#ifdef CONFIG_NUMA >> -/* >> - * per node 'scan_unevictable_pages' attribute. On demand re-scan of >> - * a specified node's per zone unevictable lists for evictable pages. >> - */ >> - >> -static ssize_t read_scan_unevictable_node(struct sys_device *dev, >> - struct sysdev_attribute *attr, >> - char *buf) >> -{ >> - return sprintf(buf, "0\n"); /* always zero; should fit... */ >> -} >> - >> -static ssize_t write_scan_unevictable_node(struct sys_device *dev, >> - struct sysdev_attribute *attr, >> - const char *buf, size_t count) >> -{ >> - struct zone *node_zones = NODE_DATA(dev->id)->node_zones; >> - struct zone *zone; >> - unsigned long res; >> - unsigned long req = strict_strtoul(buf, 10, &res); >> - >> - if (!req) >> - return 1; /* zero is no-op */ >> - >> - for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) { >> - if (!populated_zone(zone)) >> - continue; >> - scan_zone_unevictable_pages(zone); >> - } >> - return 1; >> -} >> - >> - >> -static SYSDEV_ATTR(scan_unevictable_pages, S_IRUGO | S_IWUSR, >> - read_scan_unevictable_node, >> - write_scan_unevictable_node); >> - >> -int scan_unevictable_register_node(struct node *node) >> -{ >> - return sysdev_create_file(&node->sysdev, &attr_scan_unevictable_pages); >> -} >> - >> -void scan_unevictable_unregister_node(struct node *node) >> -{ >> - sysdev_remove_file(&node->sysdev, &attr_scan_unevictable_pages); >> -} >> -#endif >> -- >> 1.7.6.2 >> >> > -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages 2011-09-26 12:29 ` kautuk.c @samsung.com @ 2011-09-26 14:25 ` Johannes Weiner 2011-09-26 14:55 ` kautuk.c @samsung.com 0 siblings, 1 reply; 14+ messages in thread From: Johannes Weiner @ 2011-09-26 14:25 UTC (permalink / raw) To: kautuk.c @samsung.com Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro On Mon, Sep 26, 2011 at 05:59:39PM +0530, kautuk.c @samsung.com wrote: > On Mon, Sep 26, 2011 at 5:40 PM, kautuk.c @samsung.com > <consul.kautuk@gmail.com> wrote: > > On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote: > >> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote: > >>> write_scan_unavictable_node checks the value req returned by > >>> strict_strtoul and returns 1 if req is 0. > >>> > >>> However, when strict_strtoul returns 0, it means successful conversion > >>> of buf to unsigned long. > >>> > >>> Due to this, the function was not proceeding to scan the zones for > >>> unevictable pages even though we write a valid value to the > >>> scan_unevictable_pages sys file. > >> > >> Given that there is not a real reason for this knob (anymore) and that > >> it apparently never really worked since the day it was introduced, how > >> about we just drop all that code instead? > >> > >> Hannes > >> > >> --- > >> From: Johannes Weiner <jweiner@redhat.com> > >> Subject: mm: remove sysctl to manually rescue unevictable pages > >> > >> At one point, anonymous pages were supposed to go on the unevictable > >> list when no swap space was configured, and the idea was to manually > >> rescue those pages after adding swap and making them evictable again. > >> But nowadays, swap-backed pages on the anon LRU list are not scanned > >> without available swap space anyway, so there is no point in moving > >> them to a separate list anymore. > > > > Is this code only for anonymous pages ? > > It seems to look at all pages in the zone both file as well as anon. > > > >> > >> The manual rescue could also be used in case pages were stranded on > >> the unevictable list due to race conditions. But the code has been > >> around for a while now and newly discovered bugs should be properly > >> reported and dealt with instead of relying on such a manual fixup. > > > > What you say seems to be all right for anon pages, but what about file > > pages ? > > I'm not sure about how this could happen, but what if some file-system caused > > a file cache page to be set to evictable or reclaimable without > > actually removing > > that page from the unevictable list ? > > What I would like to also add is that while the transition of an anon > page from and > to the unevictable lists is straight-forward, should we make the same assumption > about file cache pages ? We should make no assumptions if our code base is open source :-) > I am not sure about this, but could a file-system cause this kind of a problem > independent of the mlocking behaviour of a user-mode app ? Currently, I only see shmem and ramfs meddling with unevictability outside of mlock and they both look correct to me. I'd say that if a filesystem required this knob and user-intervention for the VM to behave correctly, it needs fixing. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages 2011-09-26 14:25 ` Johannes Weiner @ 2011-09-26 14:55 ` kautuk.c @samsung.com 0 siblings, 0 replies; 14+ messages in thread From: kautuk.c @samsung.com @ 2011-09-26 14:55 UTC (permalink / raw) To: Johannes Weiner Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro On Mon, Sep 26, 2011 at 7:55 PM, Johannes Weiner <jweiner@redhat.com> wrote: > On Mon, Sep 26, 2011 at 05:59:39PM +0530, kautuk.c @samsung.com wrote: >> On Mon, Sep 26, 2011 at 5:40 PM, kautuk.c @samsung.com >> <consul.kautuk@gmail.com> wrote: >> > On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote: >> >> On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote: >> >>> write_scan_unavictable_node checks the value req returned by >> >>> strict_strtoul and returns 1 if req is 0. >> >>> >> >>> However, when strict_strtoul returns 0, it means successful conversion >> >>> of buf to unsigned long. >> >>> >> >>> Due to this, the function was not proceeding to scan the zones for >> >>> unevictable pages even though we write a valid value to the >> >>> scan_unevictable_pages sys file. >> >> >> >> Given that there is not a real reason for this knob (anymore) and that >> >> it apparently never really worked since the day it was introduced, how >> >> about we just drop all that code instead? >> >> >> >> Hannes >> >> >> >> --- >> >> From: Johannes Weiner <jweiner@redhat.com> >> >> Subject: mm: remove sysctl to manually rescue unevictable pages >> >> >> >> At one point, anonymous pages were supposed to go on the unevictable >> >> list when no swap space was configured, and the idea was to manually >> >> rescue those pages after adding swap and making them evictable again. >> >> But nowadays, swap-backed pages on the anon LRU list are not scanned >> >> without available swap space anyway, so there is no point in moving >> >> them to a separate list anymore. >> > >> > Is this code only for anonymous pages ? >> > It seems to look at all pages in the zone both file as well as anon. >> > >> >> >> >> The manual rescue could also be used in case pages were stranded on >> >> the unevictable list due to race conditions. But the code has been >> >> around for a while now and newly discovered bugs should be properly >> >> reported and dealt with instead of relying on such a manual fixup. >> > >> > What you say seems to be all right for anon pages, but what about file >> > pages ? >> > I'm not sure about how this could happen, but what if some file-system caused >> > a file cache page to be set to evictable or reclaimable without >> > actually removing >> > that page from the unevictable list ? >> >> What I would like to also add is that while the transition of an anon >> page from and >> to the unevictable lists is straight-forward, should we make the same assumption >> about file cache pages ? > > We should make no assumptions if our code base is open source :-) > >> I am not sure about this, but could a file-system cause this kind of a problem >> independent of the mlocking behaviour of a user-mode app ? > > Currently, I only see shmem and ramfs meddling with unevictability > outside of mlock and they both look correct to me. > > I'd say that if a filesystem required this knob and user-intervention > for the VM to behave correctly, it needs fixing. Yes I agree. Otherwise that kind of defeats the overall purpose of open-source. :) Thanks for review and the info. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages 2011-09-26 12:10 ` kautuk.c @samsung.com 2011-09-26 12:29 ` kautuk.c @samsung.com @ 2011-09-26 14:18 ` Johannes Weiner 1 sibling, 0 replies; 14+ messages in thread From: Johannes Weiner @ 2011-09-26 14:18 UTC (permalink / raw) To: kautuk.c @samsung.com Cc: Andrew Morton, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro On Mon, Sep 26, 2011 at 05:40:52PM +0530, kautuk.c @samsung.com wrote: > On Mon, Sep 26, 2011 at 4:59 PM, Johannes Weiner <jweiner@redhat.com> wrote: > > On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote: > >> write_scan_unavictable_node checks the value req returned by > >> strict_strtoul and returns 1 if req is 0. > >> > >> However, when strict_strtoul returns 0, it means successful conversion > >> of buf to unsigned long. > >> > >> Due to this, the function was not proceeding to scan the zones for > >> unevictable pages even though we write a valid value to the > >> scan_unevictable_pages sys file. > > > > Given that there is not a real reason for this knob (anymore) and that > > it apparently never really worked since the day it was introduced, how > > about we just drop all that code instead? > > > > Hannes > > > > --- > > From: Johannes Weiner <jweiner@redhat.com> > > Subject: mm: remove sysctl to manually rescue unevictable pages > > > > At one point, anonymous pages were supposed to go on the unevictable > > list when no swap space was configured, and the idea was to manually > > rescue those pages after adding swap and making them evictable again. > > But nowadays, swap-backed pages on the anon LRU list are not scanned > > without available swap space anyway, so there is no point in moving > > them to a separate list anymore. > > Is this code only for anonymous pages ? > It seems to look at all pages in the zone both file as well as anon. The code scans both, but the usecase I described was moving swapbacked pages from the unevictable list after configuring swap space. > > The manual rescue could also be used in case pages were stranded on > > the unevictable list due to race conditions. But the code has been > > around for a while now and newly discovered bugs should be properly > > reported and dealt with instead of relying on such a manual fixup. > > What you say seems to be all right for anon pages, but what about file > pages ? > I'm not sure about how this could happen, but what if some file-system caused > a file cache page to be set to evictable or reclaimable without > actually removing > that page from the unevictable list ? Currently, unevictable file pages come from ramfs, shmem, and mlock. ramfs never makes them evictable. shmem, after making an inode evictable again, scans the that inode's address_space and rescues the pages it finds. munlock synchroneously rescues the pages vma's range. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages 2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner 2011-09-26 12:10 ` kautuk.c @samsung.com @ 2011-09-26 23:11 ` Andrew Morton 2011-09-27 7:27 ` [patch] mm: disable user interface " Johannes Weiner 2011-09-28 1:00 ` [patch] mm: remove sysctl " KOSAKI Motohiro 2 siblings, 1 reply; 14+ messages in thread From: Andrew Morton @ 2011-09-26 23:11 UTC (permalink / raw) To: Johannes Weiner Cc: Kautuk Consul, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro, Andrew Morton On Mon, 26 Sep 2011 13:29:45 +0200 Johannes Weiner <jweiner@redhat.com> wrote: > On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote: > > write_scan_unavictable_node checks the value req returned by > > strict_strtoul and returns 1 if req is 0. > > > > However, when strict_strtoul returns 0, it means successful conversion > > of buf to unsigned long. > > > > Due to this, the function was not proceeding to scan the zones for > > unevictable pages even though we write a valid value to the > > scan_unevictable_pages sys file. > > Given that there is not a real reason for this knob (anymore) and that > it apparently never really worked since the day it was introduced, how > about we just drop all that code instead? > Yes, let's remove it if at all possible. However, to be nice to people I do think we should emit a once-per-boot printk when someone tries to use it, then remove it for real at least a couple of kernel cycles later. Just in case someone's script or tuning app is trying to open that procfs file. > > --- > From: Johannes Weiner <jweiner@redhat.com> > Subject: mm: remove sysctl to manually rescue unevictable pages > > At one point, anonymous pages were supposed to go on the unevictable > list when no swap space was configured, and the idea was to manually > rescue those pages after adding swap and making them evictable again. > But nowadays, swap-backed pages on the anon LRU list are not scanned > without available swap space anyway, so there is no point in moving > them to a separate list anymore. > > The manual rescue could also be used in case pages were stranded on > the unevictable list due to race conditions. But the code has been > around for a while now and newly discovered bugs should be properly > reported and dealt with instead of relying on such a manual fixup. > > Signed-off-by: Johannes Weiner <jweiner@redhat.com> The changelog failed to note that the sysctl doesn't actually *work*. This is a pretty strong argument for removing it ;) Also, a reported-by:Kautuk would have been nice. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* [patch] mm: disable user interface to manually rescue unevictable pages 2011-09-26 23:11 ` Andrew Morton @ 2011-09-27 7:27 ` Johannes Weiner 2011-09-28 2:14 ` Minchan Kim 2011-09-28 4:51 ` KAMEZAWA Hiroyuki 0 siblings, 2 replies; 14+ messages in thread From: Johannes Weiner @ 2011-09-27 7:27 UTC (permalink / raw) To: Andrew Morton Cc: Kautuk Consul, Mel Gorman, Minchan Kim, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro, Andrew Morton At one point, anonymous pages were supposed to go on the unevictable list when no swap space was configured, and the idea was to manually rescue those pages after adding swap and making them evictable again. But nowadays, swap-backed pages on the anon LRU list are not scanned without available swap space anyway, so there is no point in moving them to a separate list anymore. The manual rescue could also be used in case pages were stranded on the unevictable list due to race conditions. But the code has been around for a while now and newly discovered bugs should be properly reported and dealt with instead of relying on such a manual fixup. In addition to the lack of a usecase, the sysfs interface to rescue pages from a specific NUMA node has been broken since its introduction, so it's unlikely that anybody ever relied on that. This patch removes the functionality behind the sysctl and the node-interface and emits a one-time warning when somebody tries to access either of them. Signed-off-by: Johannes Weiner <jweiner@redhat.com> Reported-by: Kautuk Consul <consul.kautuk@gmail.com> --- mm/vmscan.c | 84 +++++----------------------------------------------------- 1 files changed, 8 insertions(+), 76 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 7502726..71b5616 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -3397,66 +3397,12 @@ void scan_mapping_unevictable_pages(struct address_space *mapping) } -/** - * scan_zone_unevictable_pages - check unevictable list for evictable pages - * @zone - zone of which to scan the unevictable list - * - * Scan @zone's unevictable LRU lists to check for pages that have become - * evictable. Move those that have to @zone's inactive list where they - * become candidates for reclaim, unless shrink_inactive_zone() decides - * to reactivate them. Pages that are still unevictable are rotated - * back onto @zone's unevictable list. - */ -#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */ -static void scan_zone_unevictable_pages(struct zone *zone) +static void warn_scan_unevictable_pages(void) { - struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list; - unsigned long scan; - unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE); - - while (nr_to_scan > 0) { - unsigned long batch_size = min(nr_to_scan, - SCAN_UNEVICTABLE_BATCH_SIZE); - - spin_lock_irq(&zone->lru_lock); - for (scan = 0; scan < batch_size; scan++) { - struct page *page = lru_to_page(l_unevictable); - - if (!trylock_page(page)) - continue; - - prefetchw_prev_lru_page(page, l_unevictable, flags); - - if (likely(PageLRU(page) && PageUnevictable(page))) - check_move_unevictable_page(page, zone); - - unlock_page(page); - } - spin_unlock_irq(&zone->lru_lock); - - nr_to_scan -= batch_size; - } -} - - -/** - * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages - * - * A really big hammer: scan all zones' unevictable LRU lists to check for - * pages that have become evictable. Move those back to the zones' - * inactive list where they become candidates for reclaim. - * This occurs when, e.g., we have unswappable pages on the unevictable lists, - * and we add swap to the system. As such, it runs in the context of a task - * that has possibly/probably made some previously unevictable pages - * evictable. - */ -static void scan_all_zones_unevictable_pages(void) -{ - struct zone *zone; - - for_each_zone(zone) { - scan_zone_unevictable_pages(zone); - } + printk_once(KERN_WARNING + "The scan_unevictable_pages sysctl/node-interface has been " + "disabled for lack of a legitimate use case. If you have " + "one, please send an email to linux-mm@kvack.org.\n"); } /* @@ -3469,11 +3415,8 @@ int scan_unevictable_handler(struct ctl_table *table, int write, void __user *buffer, size_t *length, loff_t *ppos) { + warn_scan_unevictable_pages(); proc_doulongvec_minmax(table, write, buffer, length, ppos); - - if (write && *(unsigned long *)table->data) - scan_all_zones_unevictable_pages(); - scan_unevictable_pages = 0; return 0; } @@ -3488,6 +3431,7 @@ static ssize_t read_scan_unevictable_node(struct sys_device *dev, struct sysdev_attribute *attr, char *buf) { + warn_scan_unevictable_pages(); return sprintf(buf, "0\n"); /* always zero; should fit... */ } @@ -3495,19 +3439,7 @@ static ssize_t write_scan_unevictable_node(struct sys_device *dev, struct sysdev_attribute *attr, const char *buf, size_t count) { - struct zone *node_zones = NODE_DATA(dev->id)->node_zones; - struct zone *zone; - unsigned long res; - unsigned long req = strict_strtoul(buf, 10, &res); - - if (!req) - return 1; /* zero is no-op */ - - for (zone = node_zones; zone - node_zones < MAX_NR_ZONES; ++zone) { - if (!populated_zone(zone)) - continue; - scan_zone_unevictable_pages(zone); - } + warn_scan_unevictable_pages(); return 1; } -- 1.7.6.2 -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [patch] mm: disable user interface to manually rescue unevictable pages 2011-09-27 7:27 ` [patch] mm: disable user interface " Johannes Weiner @ 2011-09-28 2:14 ` Minchan Kim 2011-10-04 10:10 ` Johannes Weiner 2011-09-28 4:51 ` KAMEZAWA Hiroyuki 1 sibling, 1 reply; 14+ messages in thread From: Minchan Kim @ 2011-09-28 2:14 UTC (permalink / raw) To: Johannes Weiner Cc: Andrew Morton, Kautuk Consul, Mel Gorman, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro, Andrew Morton On Tue, Sep 27, 2011 at 09:27:14AM +0200, Johannes Weiner wrote: > At one point, anonymous pages were supposed to go on the unevictable > list when no swap space was configured, and the idea was to manually > rescue those pages after adding swap and making them evictable again. > But nowadays, swap-backed pages on the anon LRU list are not scanned > without available swap space anyway, so there is no point in moving > them to a separate list anymore. > > The manual rescue could also be used in case pages were stranded on > the unevictable list due to race conditions. But the code has been > around for a while now and newly discovered bugs should be properly > reported and dealt with instead of relying on such a manual fixup. > > In addition to the lack of a usecase, the sysfs interface to rescue > pages from a specific NUMA node has been broken since its > introduction, so it's unlikely that anybody ever relied on that. > > This patch removes the functionality behind the sysctl and the > node-interface and emits a one-time warning when somebody tries to > access either of them. > > Signed-off-by: Johannes Weiner <jweiner@redhat.com> > Reported-by: Kautuk Consul <consul.kautuk@gmail.com> Reviewed-by: Minchan Kim <minchan.kim@gmail.com> There is a nitpick at below but I don't care of it. > --- > mm/vmscan.c | 84 +++++----------------------------------------------------- > 1 files changed, 8 insertions(+), 76 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 7502726..71b5616 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -3397,66 +3397,12 @@ void scan_mapping_unevictable_pages(struct address_space *mapping) > > } > > -/** > - * scan_zone_unevictable_pages - check unevictable list for evictable pages > - * @zone - zone of which to scan the unevictable list > - * > - * Scan @zone's unevictable LRU lists to check for pages that have become > - * evictable. Move those that have to @zone's inactive list where they > - * become candidates for reclaim, unless shrink_inactive_zone() decides > - * to reactivate them. Pages that are still unevictable are rotated > - * back onto @zone's unevictable list. > - */ > -#define SCAN_UNEVICTABLE_BATCH_SIZE 16UL /* arbitrary lock hold batch size */ > -static void scan_zone_unevictable_pages(struct zone *zone) > +static void warn_scan_unevictable_pages(void) > { > - struct list_head *l_unevictable = &zone->lru[LRU_UNEVICTABLE].list; > - unsigned long scan; > - unsigned long nr_to_scan = zone_page_state(zone, NR_UNEVICTABLE); > - > - while (nr_to_scan > 0) { > - unsigned long batch_size = min(nr_to_scan, > - SCAN_UNEVICTABLE_BATCH_SIZE); > - > - spin_lock_irq(&zone->lru_lock); > - for (scan = 0; scan < batch_size; scan++) { > - struct page *page = lru_to_page(l_unevictable); > - > - if (!trylock_page(page)) > - continue; > - > - prefetchw_prev_lru_page(page, l_unevictable, flags); > - > - if (likely(PageLRU(page) && PageUnevictable(page))) > - check_move_unevictable_page(page, zone); > - > - unlock_page(page); > - } > - spin_unlock_irq(&zone->lru_lock); > - > - nr_to_scan -= batch_size; > - } > -} > - > - > -/** > - * scan_all_zones_unevictable_pages - scan all unevictable lists for evictable pages > - * > - * A really big hammer: scan all zones' unevictable LRU lists to check for > - * pages that have become evictable. Move those back to the zones' > - * inactive list where they become candidates for reclaim. > - * This occurs when, e.g., we have unswappable pages on the unevictable lists, > - * and we add swap to the system. As such, it runs in the context of a task > - * that has possibly/probably made some previously unevictable pages > - * evictable. > - */ > -static void scan_all_zones_unevictable_pages(void) > -{ > - struct zone *zone; > - > - for_each_zone(zone) { > - scan_zone_unevictable_pages(zone); > - } > + printk_once(KERN_WARNING > + "The scan_unevictable_pages sysctl/node-interface has been " > + "disabled for lack of a legitimate use case. If you have " > + "one, please send an email to linux-mm@kvack.org.\n"); > } > > /* > @@ -3469,11 +3415,8 @@ int scan_unevictable_handler(struct ctl_table *table, int write, > void __user *buffer, > size_t *length, loff_t *ppos) > { > + warn_scan_unevictable_pages(); > proc_doulongvec_minmax(table, write, buffer, length, ppos); > - > - if (write && *(unsigned long *)table->data) > - scan_all_zones_unevictable_pages(); > - > scan_unevictable_pages = 0; Nitpick: Could we remove this resetting with zero? -- Kinds regards, Minchan Kim -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: disable user interface to manually rescue unevictable pages 2011-09-28 2:14 ` Minchan Kim @ 2011-10-04 10:10 ` Johannes Weiner 0 siblings, 0 replies; 14+ messages in thread From: Johannes Weiner @ 2011-10-04 10:10 UTC (permalink / raw) To: Minchan Kim Cc: Andrew Morton, Kautuk Consul, Mel Gorman, KAMEZAWA Hiroyuki, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro, Andrew Morton On Wed, Sep 28, 2011 at 11:14:24AM +0900, Minchan Kim wrote: > On Tue, Sep 27, 2011 at 09:27:14AM +0200, Johannes Weiner wrote: > > At one point, anonymous pages were supposed to go on the unevictable > > list when no swap space was configured, and the idea was to manually > > rescue those pages after adding swap and making them evictable again. > > But nowadays, swap-backed pages on the anon LRU list are not scanned > > without available swap space anyway, so there is no point in moving > > them to a separate list anymore. > > > > The manual rescue could also be used in case pages were stranded on > > the unevictable list due to race conditions. But the code has been > > around for a while now and newly discovered bugs should be properly > > reported and dealt with instead of relying on such a manual fixup. > > > > In addition to the lack of a usecase, the sysfs interface to rescue > > pages from a specific NUMA node has been broken since its > > introduction, so it's unlikely that anybody ever relied on that. > > > > This patch removes the functionality behind the sysctl and the > > node-interface and emits a one-time warning when somebody tries to > > access either of them. > > > > Signed-off-by: Johannes Weiner <jweiner@redhat.com> > > Reported-by: Kautuk Consul <consul.kautuk@gmail.com> > Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Thanks, Minchan. > > @@ -3469,11 +3415,8 @@ int scan_unevictable_handler(struct ctl_table *table, int write, > > void __user *buffer, > > size_t *length, loff_t *ppos) > > { > > + warn_scan_unevictable_pages(); > > proc_doulongvec_minmax(table, write, buffer, length, ppos); > > - > > - if (write && *(unsigned long *)table->data) > > - scan_all_zones_unevictable_pages(); > > - > > scan_unevictable_pages = 0; > > Nitpick: > Could we remove this resetting with zero? table->data = &scan_unevictable_pages, so I let it in such that the generic sysctl stuff to process the input and clean up afterward is complete. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: disable user interface to manually rescue unevictable pages 2011-09-27 7:27 ` [patch] mm: disable user interface " Johannes Weiner 2011-09-28 2:14 ` Minchan Kim @ 2011-09-28 4:51 ` KAMEZAWA Hiroyuki 1 sibling, 0 replies; 14+ messages in thread From: KAMEZAWA Hiroyuki @ 2011-09-28 4:51 UTC (permalink / raw) To: Johannes Weiner Cc: Andrew Morton, Kautuk Consul, Mel Gorman, Minchan Kim, Rik van Riel, linux-mm, linux-kernel, Lee Schermerhorn, KOSAKI Motohiro, Andrew Morton On Tue, 27 Sep 2011 09:27:14 +0200 Johannes Weiner <jweiner@redhat.com> wrote: > At one point, anonymous pages were supposed to go on the unevictable > list when no swap space was configured, and the idea was to manually > rescue those pages after adding swap and making them evictable again. > But nowadays, swap-backed pages on the anon LRU list are not scanned > without available swap space anyway, so there is no point in moving > them to a separate list anymore. > > The manual rescue could also be used in case pages were stranded on > the unevictable list due to race conditions. But the code has been > around for a while now and newly discovered bugs should be properly > reported and dealt with instead of relying on such a manual fixup. > > In addition to the lack of a usecase, the sysfs interface to rescue > pages from a specific NUMA node has been broken since its > introduction, so it's unlikely that anybody ever relied on that. > > This patch removes the functionality behind the sysctl and the > node-interface and emits a one-time warning when somebody tries to > access either of them. > > Signed-off-by: Johannes Weiner <jweiner@redhat.com> > Reported-by: Kautuk Consul <consul.kautuk@gmail.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [patch] mm: remove sysctl to manually rescue unevictable pages 2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner 2011-09-26 12:10 ` kautuk.c @samsung.com 2011-09-26 23:11 ` Andrew Morton @ 2011-09-28 1:00 ` KOSAKI Motohiro 2 siblings, 0 replies; 14+ messages in thread From: KOSAKI Motohiro @ 2011-09-28 1:00 UTC (permalink / raw) To: jweiner Cc: consul.kautuk, akpm, mel, minchan.kim, kamezawa.hiroyu, riel, linux-mm, linux-kernel, lee.schermerhorn (2011/09/26 20:29), Johannes Weiner wrote: > On Sun, Sep 25, 2011 at 04:29:40PM +0530, Kautuk Consul wrote: >> write_scan_unavictable_node checks the value req returned by >> strict_strtoul and returns 1 if req is 0. >> >> However, when strict_strtoul returns 0, it means successful conversion >> of buf to unsigned long. >> >> Due to this, the function was not proceeding to scan the zones for >> unevictable pages even though we write a valid value to the >> scan_unevictable_pages sys file. > > Given that there is not a real reason for this knob (anymore) and that > it apparently never really worked since the day it was introduced, how > about we just drop all that code instead? > > Hannes > > --- > From: Johannes Weiner <jweiner@redhat.com> > Subject: mm: remove sysctl to manually rescue unevictable pages > > At one point, anonymous pages were supposed to go on the unevictable > list when no swap space was configured, and the idea was to manually > rescue those pages after adding swap and making them evictable again. > But nowadays, swap-backed pages on the anon LRU list are not scanned > without available swap space anyway, so there is no point in moving > them to a separate list anymore. > > The manual rescue could also be used in case pages were stranded on > the unevictable list due to race conditions. But the code has been > around for a while now and newly discovered bugs should be properly > reported and dealt with instead of relying on such a manual fixup. > > Signed-off-by: Johannes Weiner <jweiner@redhat.com> About three years ago when we introduced unevictable pages feature, we were worry about there are mlock, shmmem-lock abuse in the real world and we broke such assumption. And we expected this knob help to dig unevictable pages bug report. Briefly says, If this knob works meangfully, our unevictable handling code or their driver code are buggy. Fortunately, Such bug report was never happen. So, this knob finished the role. Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/ Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2011-10-04 10:11 UTC | newest] Thread overview: 14+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-09-25 10:59 [PATCH 1/1] vmscan.c: Invalid strict_strtoul check in write_scan_unevictable_node Kautuk Consul 2011-09-26 9:20 ` KAMEZAWA Hiroyuki 2011-09-26 11:29 ` [patch] mm: remove sysctl to manually rescue unevictable pages Johannes Weiner 2011-09-26 12:10 ` kautuk.c @samsung.com 2011-09-26 12:29 ` kautuk.c @samsung.com 2011-09-26 14:25 ` Johannes Weiner 2011-09-26 14:55 ` kautuk.c @samsung.com 2011-09-26 14:18 ` Johannes Weiner 2011-09-26 23:11 ` Andrew Morton 2011-09-27 7:27 ` [patch] mm: disable user interface " Johannes Weiner 2011-09-28 2:14 ` Minchan Kim 2011-10-04 10:10 ` Johannes Weiner 2011-09-28 4:51 ` KAMEZAWA Hiroyuki 2011-09-28 1:00 ` [patch] mm: remove sysctl " KOSAKI Motohiro
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).