From: Minchan Kim <minchan@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Dan Magenheimer <dan.magenheimer@oracle.com>,
Sonny Rao <sonnyrao@google.com>, Bryan Freed <bfreed@google.com>,
Hugh Dickins <hughd@google.com>, Rik van Riel <riel@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset
Date: Fri, 18 Jan 2013 08:36:42 +0900
Message-ID: <20130117233641.GA31368@blaptop>
In-Reply-To: <20130117142238.e32c46d5.akpm@linux-foundation.org>
On Thu, Jan 17, 2013 at 02:22:38PM -0800, Andrew Morton wrote:
> On Thu, 17 Jan 2013 09:53:14 +0900
> Minchan Kim <minchan@kernel.org> wrote:
>
> > Recently, Luigi reported that lots of swap space is still free when
> > OOM happens. It's easily reproduced on zram-over-swap, where
> > many instances of memory hogs are running and laptop_mode is enabled.
> > He said there was no problem when he disabled laptop_mode.
> >
> > What I found when I investigated the problem is as follows.
> >
> > Assumption for ease of explanation: there are no page cache pages in the
> > system because they have all been reclaimed already.
> >
> > 1. try_to_free_pages disables may_writepage when laptop_mode is enabled.
> > 2. shrink_inactive_list isolates victim pages from the inactive anon lru list.
> > 3. shrink_page_list adds them to swapcache via add_to_swap but doesn't
> > page them out because sc->may_writepage is 0, so the pages are rotated back
> > onto the inactive anon lru list. add_to_swap marked the pages dirty via SetPageDirty.
> > 4. Step 3 couldn't reclaim any pages, so do_try_to_free_pages increases the
> > priority and retries reclaim with higher priority.
> > 5. shrink_inactive_list tries to isolate victim pages from the inactive anon lru list
> > but fails because it isolates pages in ISOLATE_CLEAN mode and the
> > inactive anon lru list is full of pages dirtied in step 3, so it just returns
> > without any reclaim progress.
> > 6. do_try_to_free_pages doesn't set may_write due to zero total_scanned.
>
> s/may_write/may_writepage/
Thanks!
>
> > That is because sc->nr_scanned is increased by shrink_page_list, but we don't
> > call shrink_page_list in step 5 because no pages were isolated.
>
> This is the bug, is it not?
>
> In laptop mode, we still need to write out dirty swapcache at some
> point. An appropriate time to do this is when the scanning priority is
Yes, and exactly when that point arrives is really important. Currently it
depends on the number of pages scanned by shrink_page_list. That means we
must isolate victim pages from the inactive LRU list and call shrink_page_list
to increase sc->nr_scanned. Unfortunately, various filters are applied when
the VM tries to isolate victim pages, to reduce CPU consumption and LRU
churning, and they can prevent any pages from being isolated from the LRU list.
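A toy userspace model may make this dependency easier to see. It is only a sketch, not kernel code: toy_scan_control, toy_reclaim_enables_writepage and the filter_dirty flag are hypothetical names, nr_to_reclaim = 32 and the per-pass scan window (list size >> priority) are assumed for illustration, and the threshold of 1.5 * nr_to_reclaim is a simplified reading of do_try_to_free_pages.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12	/* assumed to match the kernel's starting priority */

/* Hypothetical stand-in for the few scan_control fields discussed here. */
struct toy_scan_control {
	unsigned long nr_to_reclaim;
	unsigned long nr_scanned;
	bool may_writepage;
};

/*
 * Models one direct reclaim attempt over an inactive anon list that holds
 * only dirty swapcache pages. filter_dirty = true models clean-only
 * isolation while may_writepage is unset; false models the older behaviour
 * where isolation succeeded. Returns whether may_writepage ever became set.
 */
static bool toy_reclaim_enables_writepage(unsigned long nr_dirty_anon,
					  bool filter_dirty)
{
	struct toy_scan_control sc = { .nr_to_reclaim = 32 };
	unsigned long total_scanned = 0;

	assert(nr_dirty_anon > 0);	/* the model assumes a non-empty list */

	for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
		/* The scan window grows as the priority value drops. */
		unsigned long want = nr_dirty_anon >> priority;
		/* Clean-only isolation skips every page on an all-dirty list. */
		unsigned long isolated =
			(filter_dirty && !sc.may_writepage) ? 0 : want;

		/* Only pages that reach the shrink step are counted. */
		sc.nr_scanned = isolated;
		total_scanned += sc.nr_scanned;

		if (total_scanned > sc.nr_to_reclaim + sc.nr_to_reclaim / 2)
			sc.may_writepage = true;
	}
	return sc.may_writepage;
}
```

In this model, toy_reclaim_enables_writepage(100000, true) never sets may_writepage because nothing is ever counted, while toy_reclaim_enables_writepage(100000, false) crosses the threshold on the second pass.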
> getting high. But it seems that this ISOLATE_CLEAN->total_scanned
Yes. I absolutely agree that the trigger point should depend on priority, NOT
on the number of scanned pages, and I already mentioned that to you:
https://lkml.org/lkml/2013/1/10/643
We already use this kind of heuristic in several places in the VM, e.g.
DEF_PRIORITY - 2. The reason I hesitated is that I think this patch should go
to the stable tree, so it should be really small and free of side effects. I
didn't want to change laptop_mode behavior heavily by changing the condition
that triggers may_writepage.
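For comparison, a priority-based trigger could be sketched as below. This is a hypothetical userspace model, not the proposed patch: toy_trigger, toy_first_writepage_priority, nr_to_reclaim = 32 and the 1.5 * nr_to_reclaim threshold are assumptions, and DEF_PRIORITY - 2 is simply the heuristic mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

#define DEF_PRIORITY 12	/* assumed to match the kernel's starting priority */

/* Hypothetical selector for the condition that enables may_writepage. */
enum toy_trigger {
	TOY_TRIGGER_SCANNED,	/* total scanned pages exceed a threshold */
	TOY_TRIGGER_PRIORITY,	/* priority drops below DEF_PRIORITY - 2 */
};

/*
 * Returns the priority value at which may_writepage would first be set,
 * or -1 if it never is. scanned_per_pass is the number of pages counted
 * in each pass; 0 models the case where isolation finds nothing.
 */
static int toy_first_writepage_priority(enum toy_trigger trigger,
					unsigned long scanned_per_pass)
{
	const unsigned long nr_to_reclaim = 32;	/* assumed reclaim target */
	unsigned long total_scanned = 0;

	assert(trigger == TOY_TRIGGER_SCANNED ||
	       trigger == TOY_TRIGGER_PRIORITY);

	for (int priority = DEF_PRIORITY; priority >= 0; priority--) {
		bool enable;

		total_scanned += scanned_per_pass;

		if (trigger == TOY_TRIGGER_PRIORITY)
			enable = priority < DEF_PRIORITY - 2;
		else
			enable = total_scanned >
				 nr_to_reclaim + nr_to_reclaim / 2;

		if (enable)
			return priority;
	}
	return -1;
}
```

With nothing scanned, the scan-count trigger never fires (the function returns -1), whereas the priority trigger fires at priority 9 no matter how many pages were counted. That independence from sc->nr_scanned is the point of the heuristic, and it is also why it would change when laptop_mode starts writing.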
> interaction is preventing that.
>
> (An enhancement to laptop mode would be to opportunistically write out
> dirty swapcache in or around laptop_mode_timer_fn()).
It could, but that should be a separate patch, and the VM shouldn't rely ONLY
on laptop_mode_timer_fn, IMHO. The VM should have its own rule for reclaiming
pages, independent of laptop_mode's help, to prevent OOM kills.
>
> > The above loop continues until OOM happens.
> > The problem didn't happen before [1] was merged because the old logic's isolation
> > in shrink_inactive_list succeeded and went on to call shrink_page_list
> > to page them out, although pageout still failed because of may_writepage.
> > But the important point is that sc->nr_scanned was increased although we couldn't
> > swap them out, so do_try_to_free_pages could set may_writepage.
> > So this patch needs to go to the stable tree although it's a band-aid.
> > Then, for the latest Linus tree, we should fix laptop_mode's fundamental
> > problem.
>
> Well. Perhaps we can do that now.
Okay. If you don't object to my suggestion, I will send patches next week.
Thanks for the review, Andrew!
>
> > [1] f80c067[mm: zone_reclaim: make isolate_lru_page() filter-aware]
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
--
Kind regards,
Minchan Kim