From: Minchan Kim <minchan@kernel.org>
To: Shaohua Li <shli@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Michael Kerrisk <mtk.manpages@gmail.com>,
linux-api@vger.kernel.org, Hugh Dickins <hughd@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
zhangyanfei@cn.fujitsu.com, Rik van Riel <riel@redhat.com>,
Mel Gorman <mgorman@suse.de>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Jason Evans <je@fb.com>, Daniel Micay <danielmicay@gmail.com>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Michal Hocko <mhocko@suse.cz>,
yalin.wang2010@gmail.com, "Wang,
Yalin" <Yalin.Wang@sonymobile.com>
Subject: Re: [PATCH 5/8] mm: move lazily freed pages to inactive list
Date: Thu, 5 Nov 2015 10:11:25 +0900 [thread overview]
Message-ID: <20151105011125.GG7357@bbox> (raw)
In-Reply-To: <20151104182047.GA116691@kernel.org>
On Wed, Nov 04, 2015 at 10:20:47AM -0800, Shaohua Li wrote:
> On Wed, Nov 04, 2015 at 09:53:42AM -0800, Shaohua Li wrote:
> > On Tue, Nov 03, 2015 at 09:52:23AM +0900, Minchan Kim wrote:
> > > On Fri, Oct 30, 2015 at 10:22:12AM -0700, Shaohua Li wrote:
> > > > On Fri, Oct 30, 2015 at 04:01:41PM +0900, Minchan Kim wrote:
> > > > > MADV_FREE is a hint that it's okay to discard pages if there is memory
> > > > > pressure and we use reclaimers(ie, kswapd and direct reclaim) to free them
> > > > > so there is no value keeping them in the active anonymous LRU so this
> > > > > patch moves them to inactive LRU list's head.
> > > > >
> > > > > This means that MADV_FREE-ed pages which were living on the inactive list
> > > > > are reclaimed first because they are more likely to be cold rather than
> > > > > recently active pages.
> > > > >
> > > > > An arguable issue for the approach would be whether we should put the page
> > > > > to the head or tail of the inactive list. I chose head because the kernel
> > > > > cannot make sure it's really cold or warm for every MADV_FREE usecase but
> > > > > at least we know it's not *hot*, so landing of inactive head would be a
> > > > > comprimise for various usecases.
> > > > >
> > > > > This fixes suboptimal behavior of MADV_FREE when pages living on the
> > > > > active list will sit there for a long time even under memory pressure
> > > > > while the inactive list is reclaimed heavily. This basically breaks the
> > > > > whole purpose of using MADV_FREE to help the system to free memory which
> > > > > is might not be used.
> > > >
> > > > My main concern is the policy how we should treat the FREE pages. Moving it to
> > > > inactive lru is definitionly a good start, I'm wondering if it's enough. The
> > > > MADV_FREE increases memory pressure and cause unnecessary reclaim because of
> > > > the lazy memory free. While MADV_FREE is intended to be a better replacement of
> > > > MADV_DONTNEED, MADV_DONTNEED doesn't have the memory pressure issue as it free
> > > > memory immediately. So I hope the MADV_FREE doesn't have impact on memory
> > > > pressure too. I'm thinking of adding an extra lru list and wartermark for this
> > > > to make sure FREE pages can be freed before system wide page reclaim. As you
> > > > said, this is arguable, but I hope we can discuss about this issue more.
> > >
> > > Yes, it's arguble. ;-)
> > >
> > > It seems the divergence comes from MADV_FREE is *replacement* of MADV_DONTNEED.
> > > But I don't think so. If we could discard MADV_FREEed page *anytime*, I agree
> > > but it's not true because the page would be dirty state when VM want to reclaim.
> >
> > There certainly are other usage cases, but even your patch log mainly describes
> > the jemalloc usage case, which uses MADV_DONTNEED.
> >
> > > I'm also against with your's suggestion which let's discard FREEed page before
> > > system wide page reclaim because system would have lots of clean cold page
> > > caches or anonymous pages. In such case, reclaiming of them would be better.
> > > Yeb, it's really workload-dependent so we might need some heuristic which is
> > > normally what we want to avoid.
> > >
> > > Having said that, I agree with you we could do better than the deactivation
> > > and frankly speaking, I'm thinking of another LRU list(e.g. tentatively named
> > > "ezreclaim LRU list"). What I have in mind is to age (anon|file|ez)
> > > fairly. IOW, I want to percolate ez-LRU list reclaiming into get_scan_count.
> > > When the MADV_FREE is called, we could move hinted pages from anon-LRU to
> > > ez-LRU and then If VM find to not be able to discard a page in ez-LRU,
> > > it could promote it to acive-anon-LRU which would be very natural aging
> > > concept because it mean someone touches the page recenlty.
> > >
> > > With that, I don't want to bias one side and don't want to add some knob for
> > > tuning the heuristic but let's rely on common fair aging scheme of VM.
> > >
> > > Another bonus with new LRU list is we could support MADV_FREE on swapless
> > > system.
> > >
> > > >
> > > > Or do you want to push this first and address the policy issue later?
> > >
> > > I believe adding new LRU list would be controversial(ie, not trivial)
> > > for maintainer POV even though code wouldn't be complicated.
> > > So, I want to see problems in *real practice*, not any theoritical
> > > test program before diving into that.
> > > To see such voice of request, we should release the syscall.
> > > So, I want to push this first.
> >
> > The memory pressure issue isn't just in artificial test. In jemalloc, there is
> > a knob (lg_dirty_mult) to control the rate memory should be purged (using
> > MADV_DONTNEED). We already had several reports in our production environment
> > changing the knob can cause extra memory usage (and swap and so on). If
> > jemalloc uses MADV_FREE, jemalloc will not purge any memory, which is equivent
> > to disable current MADV_DONTNEED (eg, lg_dirty_mult = -1). I'm sure this will
> > cause the similar issue, eg (extram memory usage, swap). That said I don't
> > object to push this first, but the memory pressue issue can happen in real
> > production, I hope it's not ignored.
>
> I think the question is if application uses MADV_DONTNEED originally, how much
> better if we replace it to MADV_FREE compared to just delete the MADV_DONTNEED,
> considering anonymous memory is hard to be reclaimed currently.
So, the question from my side is application will use MADV_FREE
as replacement of MADV_DONTNEED without any tune or modification?
At least, I'd like to know jemalloc if they have a plan.
>
> Thanks,
> Shaohua
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-11-05 1:11 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-10-30 7:01 [PATCH 0/8] MADV_FREE support Minchan Kim
2015-10-30 7:01 ` [PATCH 1/8] mm: support madvise(MADV_FREE) Minchan Kim
2015-10-30 16:49 ` Shaohua Li
[not found] ` <20151030164937.GA44946-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-11-03 0:10 ` Minchan Kim
2015-10-30 7:01 ` [PATCH 2/8] mm: define MADV_FREE for some arches Minchan Kim
2015-10-30 7:01 ` [PATCH 3/8] arch: uapi: asm: mman.h: Let MADV_FREE have same value for all architectures Minchan Kim
[not found] ` <1446188504-28023-4-git-send-email-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-11-02 0:08 ` Hugh Dickins
2015-11-03 2:32 ` Minchan Kim
2015-11-03 2:36 ` Minchan Kim
2015-11-03 3:36 ` David Miller
2015-11-03 4:31 ` Minchan Kim
2015-10-30 7:01 ` [PATCH 4/8] mm: free swp_entry in madvise_free Minchan Kim
2015-10-30 12:28 ` Michal Hocko
[not found] ` <20151030122814.GA23627-2MMpYkNvuYDjFM9bn6wA6Q@public.gmane.org>
2015-11-03 0:53 ` Minchan Kim
2015-10-30 7:01 ` [PATCH 5/8] mm: move lazily freed pages to inactive list Minchan Kim
[not found] ` <1446188504-28023-6-git-send-email-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-10-30 17:22 ` Shaohua Li
[not found] ` <20151030172212.GB44946-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-11-03 0:52 ` Minchan Kim
2015-11-04 8:15 ` Michal Hocko
2015-11-04 17:53 ` Shaohua Li
2015-11-04 18:20 ` Shaohua Li
2015-11-05 1:11 ` Minchan Kim [this message]
[not found] ` <20151104175342.GA98327-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-11-05 1:03 ` Minchan Kim
2015-11-04 20:55 ` Johannes Weiner
[not found] ` <20151104205504.GA9927-druUgvl0LCNAfugRpC6u6w@public.gmane.org>
2015-11-04 21:48 ` Daniel Micay
[not found] ` <563A7D21.6040505-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-11-04 22:55 ` Johannes Weiner
2015-11-04 23:36 ` Daniel Micay
[not found] ` <563A9681.3070102-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-11-04 23:49 ` Daniel Micay
2015-10-30 7:01 ` [PATCH 6/8] mm: lru_deactivate_fn should clear PG_referenced Minchan Kim
[not found] ` <1446188504-28023-7-git-send-email-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-10-30 12:47 ` Michal Hocko
2015-11-03 1:10 ` Minchan Kim
2015-11-04 8:22 ` Michal Hocko
2015-10-30 7:01 ` [PATCH 7/8] mm: clear PG_dirty to mark page freeable Minchan Kim
[not found] ` <1446188504-28023-8-git-send-email-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-10-30 12:55 ` Michal Hocko
2015-10-30 7:01 ` [PATCH 8/8] mm: mark stable page dirty in KSM Minchan Kim
[not found] ` <1446188504-28023-1-git-send-email-minchan-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
2015-11-01 4:51 ` [PATCH 0/8] MADV_FREE support David Rientjes
2015-11-01 6:29 ` Daniel Micay
[not found] ` <5635B159.8030307-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
2015-11-03 2:23 ` Minchan Kim
2015-11-04 20:19 ` David Rientjes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151105011125.GG7357@bbox \
--to=minchan@kernel.org \
--cc=Yalin.Wang@sonymobile.com \
--cc=akpm@linux-foundation.org \
--cc=danielmicay@gmail.com \
--cc=hannes@cmpxchg.org \
--cc=hughd@google.com \
--cc=je@fb.com \
--cc=kirill@shutemov.name \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@suse.cz \
--cc=mtk.manpages@gmail.com \
--cc=riel@redhat.com \
--cc=shli@kernel.org \
--cc=yalin.wang2010@gmail.com \
--cc=zhangyanfei@cn.fujitsu.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).