diff for duplicates of <20130205014308.GG2610@blaptop>

diff --git a/a/1.txt b/N1/1.txt
index f8ab5bf..09f8367 100644
--- a/a/1.txt
+++ b/N1/1.txt
@@ -72,3 +72,98 @@ Could you apply this and remove [2]?
 Otherwise, should I wait for Luigi?
 
 [2] mm: prevent addition of pages to swap if may_writepage is unset
+
+>From 72cdf4159427c1ecdbd21a40b8bd1f13d5b8d5e2 Mon Sep 17 00:00:00 2001
+From: Minchan Kim <minchan@kernel.org>
+Date: Mon, 21 Jan 2013 10:52:22 +0900
+Subject: [PATCH] mm: Use up free swap space before reaching OOM kill
+
+Recently, Luigi reported there are lots of free swap space when
+OOM happens. It's easily reproduced on zram-over-swap, where
+many instance of memory hogs are running and laptop_mode is enabled.
+He said there was no problem when he disabled laptop_mode.
+The problem when I investigate problem is following as.
+
+Assumption for easy explanation: There are no page cache page in system
+because they all are already reclaimed.
+
+1. try_to_free_pages disable may_writepage when laptop_mode is enabled.
+2. shrink_inactive_list isolates victim pages from inactive anon lru list.
+3. shrink_page_list adds them to swapcache via add_to_swap but it doesn't
+ pageout because sc->may_writepage is 0 so the page is rotated back into
+ inactive anon lru list. The add_to_swap made the page Dirty by SetPageDirty.
+4. 3 couldn't reclaim any pages so do_try_to_free_pages increase priority and
+ retry reclaim with higher priority.
+5. shrink_inactlive_list try to isolate victim pages from inactive anon lru list
+ but got failed because it try to isolate pages with ISOLATE_CLEAN mode but
+ inactive anon lru list is full of dirty pages by 3 so it just returns
+ without any reclaim progress.
+6. do_try_to_free_pages doesn't set may_writepage due to zero total_scanned.
+ Because sc->nr_scanned is increased by shrink_page_list but we don't call
+ shrink_page_list in 5 due to short of isolated pages.
+
+Above loop is continued until OOM happens.
+The problem didn't happen before [1] was merged because old logic's
+isolatation in shrink_inactive_list was successful and tried to call
+shrink_page_list to pageout them but it still ends up failed to page out
+by may_writepage. But important point is that sc->nr_scanned was increased
+although we couldn't swap out them so do_try_to_free_pages could set
+may_writepages.
+
+Since [1] was introduced, it's not a good idea any more to depends on
+only the number of scanned pages for setting may_writepage. So this patch
+adds new trigger point of setting may_writepage as below DEF_PRIOIRTY - 2
+which is used to show the significant memory pressure in VM so it's good
+fit for our purpose which would be better to lose power saving or clickety
+rather than OOM killing.
+
+[1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]
+
+Reported-by: Luigi Semenzato <semenzato@google.com>
+[Rik is ok if the patch works for Luigi]
+Not-yet-Acked-by: Rik van Riel <riel@redhat.com>
+Signed-off-by: Minchan Kim <minchan@kernel.org>
+---
+ mm/vmscan.c | 15 ++++++++++-----
+ 1 file changed, 10 insertions(+), 5 deletions(-)
+
+diff --git a/mm/vmscan.c b/mm/vmscan.c
+index d75c1ec..4fb3a6d 100644
+--- a/mm/vmscan.c
++++ b/mm/vmscan.c
+@@ -2204,6 +2204,13 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
+ 			goto out;
+ 
+ 		/*
++		 * If we're getting trouble reclaiming, start doing
++		 * writepage even in laptop mode.
++		 */
++		if (sc->priority < DEF_PRIORITY - 2)
++			sc->may_writepage = 1;
++
++		/*
+ 		 * Try to write back as many pages as we just scanned. This
+ 		 * tends to cause slow streaming writers to write data to the
+ 		 * disk smoothly, at the dirtying rate, which is nice. But
+@@ -2774,12 +2781,10 @@ loop_again:
+ 			}
+ 
+ 			/*
+-			 * If we've done a decent amount of scanning and
+-			 * the reclaim ratio is low, start doing writepage
+-			 * even in laptop mode
++			 * If we're getting trouble reclaiming, start doing
++			 * writepage even in laptop mode.
+ 			 */
+-			if (total_scanned > SWAP_CLUSTER_MAX * 2 &&
+-			 total_scanned > sc.nr_reclaimed + sc.nr_reclaimed / 2)
++			if (sc.priority < DEF_PRIORITY - 2)
+ 				sc.may_writepage = 1;
+ 
+ 			if (zone->all_unreclaimable) {
+-- 
+1.8.1.1
+
+-- 
+Kind regards,
+Minchan Kim
diff --git a/a/content_digest b/N1/content_digest
index f5f3d83..2220dd9 100644
--- a/a/content_digest
+++ b/N1/content_digest
@@ -13,7 +13,8 @@
  "To\0Rik van Riel <riel@redhat.com>"
  Luigi Semenzato <semenzato@google.com>
  " Andrew Morton <akpm@linux-foundation.org>\0"
- "Cc\0linux-mm@kvack.org"
+ "Cc\0Andrew Morton <akpm@linux-foundation.org>"
+ linux-mm@kvack.org
  linux-kernel@vger.kernel.org
  Dan Magenheimer <dan.magenheimer@oracle.com>
  Sonny Rao <sonnyrao@google.com>
@@ -95,6 +96,101 @@
  "Could you apply this and remove [2]?\n"
  "Otherwise, should I wait for Luigi?\n"
  "\n"
- [2] mm: prevent addition of pages to swap if may_writepage is unset
+ "[2] mm: prevent addition of pages to swap if may_writepage is unset\n"
+ "\n"
+ ">From 72cdf4159427c1ecdbd21a40b8bd1f13d5b8d5e2 Mon Sep 17 00:00:00 2001\n"
+ "From: Minchan Kim <minchan@kernel.org>\n"
+ "Date: Mon, 21 Jan 2013 10:52:22 +0900\n"
+ "Subject: [PATCH] mm: Use up free swap space before reaching OOM kill\n"
+ "\n"
+ "Recently, Luigi reported there are lots of free swap space when\n"
+ "OOM happens. It's easily reproduced on zram-over-swap, where\n"
+ "many instance of memory hogs are running and laptop_mode is enabled.\n"
+ "He said there was no problem when he disabled laptop_mode.\n"
+ "The problem when I investigate problem is following as.\n"
+ "\n"
+ "Assumption for easy explanation: There are no page cache page in system\n"
+ "because they all are already reclaimed.\n"
+ "\n"
+ "1. try_to_free_pages disable may_writepage when laptop_mode is enabled.\n"
+ "2. shrink_inactive_list isolates victim pages from inactive anon lru list.\n"
+ "3. shrink_page_list adds them to swapcache via add_to_swap but it doesn't\n"
+ " pageout because sc->may_writepage is 0 so the page is rotated back into\n"
+ " inactive anon lru list. The add_to_swap made the page Dirty by SetPageDirty.\n"
+ "4. 3 couldn't reclaim any pages so do_try_to_free_pages increase priority and\n"
+ " retry reclaim with higher priority.\n"
+ "5. shrink_inactlive_list try to isolate victim pages from inactive anon lru list\n"
+ " but got failed because it try to isolate pages with ISOLATE_CLEAN mode but\n"
+ " inactive anon lru list is full of dirty pages by 3 so it just returns\n"
+ " without any reclaim progress.\n"
+ "6. do_try_to_free_pages doesn't set may_writepage due to zero total_scanned.\n"
+ " Because sc->nr_scanned is increased by shrink_page_list but we don't call\n"
+ " shrink_page_list in 5 due to short of isolated pages.\n"
+ "\n"
+ "Above loop is continued until OOM happens.\n"
+ "The problem didn't happen before [1] was merged because old logic's\n"
+ "isolatation in shrink_inactive_list was successful and tried to call\n"
+ "shrink_page_list to pageout them but it still ends up failed to page out\n"
+ "by may_writepage. But important point is that sc->nr_scanned was increased\n"
+ "although we couldn't swap out them so do_try_to_free_pages could set\n"
+ "may_writepages.\n"
+ "\n"
+ "Since [1] was introduced, it's not a good idea any more to depends on\n"
+ "only the number of scanned pages for setting may_writepage. So this patch\n"
+ "adds new trigger point of setting may_writepage as below DEF_PRIOIRTY - 2\n"
+ "which is used to show the significant memory pressure in VM so it's good\n"
+ "fit for our purpose which would be better to lose power saving or clickety\n"
+ "rather than OOM killing.\n"
+ "\n"
+ "[1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]\n"
+ "\n"
+ "Reported-by: Luigi Semenzato <semenzato@google.com>\n"
+ "[Rik is ok if the patch works for Luigi]\n"
+ "Not-yet-Acked-by: Rik van Riel <riel@redhat.com>\n"
+ "Signed-off-by: Minchan Kim <minchan@kernel.org>\n"
+ "---\n"
+ " mm/vmscan.c | 15 ++++++++++-----\n"
+ " 1 file changed, 10 insertions(+), 5 deletions(-)\n"
+ "\n"
+ "diff --git a/mm/vmscan.c b/mm/vmscan.c\n"
+ "index d75c1ec..4fb3a6d 100644\n"
+ "--- a/mm/vmscan.c\n"
+ "+++ b/mm/vmscan.c\n"
+ "@@ -2204,6 +2204,13 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,\n"
+ " \t\t\tgoto out;\n"
+ " \n"
+ " \t\t/*\n"
+ "+\t\t * If we're getting trouble reclaiming, start doing\n"
+ "+\t\t * writepage even in laptop mode.\n"
+ "+\t\t */\n"
+ "+\t\tif (sc->priority < DEF_PRIORITY - 2)\n"
+ "+\t\t\tsc->may_writepage = 1;\n"
+ "+\n"
+ "+\t\t/*\n"
+ " \t\t * Try to write back as many pages as we just scanned. This\n"
+ " \t\t * tends to cause slow streaming writers to write data to the\n"
+ " \t\t * disk smoothly, at the dirtying rate, which is nice. But\n"
+ "@@ -2774,12 +2781,10 @@ loop_again:\n"
+ " \t\t\t}\n"
+ " \n"
+ " \t\t\t/*\n"
+ "-\t\t\t * If we've done a decent amount of scanning and\n"
+ "-\t\t\t * the reclaim ratio is low, start doing writepage\n"
+ "-\t\t\t * even in laptop mode\n"
+ "+\t\t\t * If we're getting trouble reclaiming, start doing\n"
+ "+\t\t\t * writepage even in laptop mode.\n"
+ " \t\t\t */\n"
+ "-\t\t\tif (total_scanned > SWAP_CLUSTER_MAX * 2 &&\n"
+ "-\t\t\t total_scanned > sc.nr_reclaimed + sc.nr_reclaimed / 2)\n"
+ "+\t\t\tif (sc.priority < DEF_PRIORITY - 2)\n"
+ " \t\t\t\tsc.may_writepage = 1;\n"
+ " \n"
+ " \t\t\tif (zone->all_unreclaimable) {\n"
+ "-- \n"
+ "1.8.1.1\n"
+ "\n"
+ "-- \n"
+ "Kind regards,\n"
+ Minchan Kim
 
-5881ec8d54bca1b44daee4f379a00d140efffb52fd68b4031816269cd05f4464
+cf0a829756f34af0dc1d7c15ff42d93c2107a9a9af5fef469304d2821dc42b7c