From: Shaohua Li <shli@kernel.org>
To: Minchan Kim <minchan@kernel.org>
Cc: "Michael Kerrisk (man-pages)" <mtk.manpages@gmail.com>,
Michal Hocko <mhocko@suse.cz>,
Andrew Morton <akpm@linux-foundation.org>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-api@vger.kernel.org, Hugh Dickins <hughd@google.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Rik van Riel <riel@redhat.com>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Mel Gorman <mgorman@suse.de>, Jason Evans <je@fb.com>,
zhangyanfei@cn.fujitsu.com,
"Kirill A. Shutemov" <kirill@shutemov.name>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH v17 1/7] mm: support madvise(MADV_FREE)
Date: Fri, 6 Feb 2015 10:29:18 -0800
Message-ID: <20150206182918.GA2290@kernel.org>
In-Reply-To: <20150206055103.GA13244@blaptop>
On Fri, Feb 06, 2015 at 02:51:03PM +0900, Minchan Kim wrote:
> Hi Shaohua,
>
> On Thu, Feb 05, 2015 at 04:33:11PM -0800, Shaohua Li wrote:
> >
> > Hi Minchan,
> >
> > Sorry to jump into this thread so late, and sorry if some of this was discussed
> > before. I'm interested in this patch, so I tried it here, using a simple test with
>
> No problem at all. Interest always wins over ignorance.
>
> > jemalloc. Obviously this can improve performance when there is no memory
> > pressure. Did you try a setup with memory pressure?
>
> Sure, but it was not a huge-memory system like yours.
Yes, I wanted to check the behavior under memory pressure, so I chose such a test.
> > In my test, jemalloc maps a 61G vma and uses about 32G of memory without
> > MADV_FREE. If MADV_FREE is enabled, jemalloc uses the whole 61G because
> > madvise doesn't reclaim the unused memory. If I disable swap (tweak your patch
>
> Yes, IIUC, jemalloc replaces MADV_DONTNEED with MADV_FREE completely.
Right.
> > slightly to make it work without swap), I get OOM. If swap is enabled, my
>
> You mean you modified the anon aging logic so that it works even without swap?
> If so, I have no idea why OOM happens. I'd expect it to free all of the freeable
> pages during aging, so although the system would stall more, I wouldn't expect
> OOM. Anyway, for MADV_FREE with no swap, we should consider more things
> about anonymous aging.
In the patch, MADV_FREE is disabled and falls back to MADV_DONTNEED if no swap
is enabled. Our production environment doesn't enable swap, so I deleted the
'no swap' check to make MADV_FREE always enabled regardless of whether swap is
enabled. I didn't change anything else. With that change, I saw OOM
immediately. So we definitely have an aging issue: the pages aren't reclaimed
fast enough.
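
For reference, the userspace side of such a fallback could look roughly like
the sketch below. This is not code from the patch or from jemalloc:
purge_range() and enum purge_result are hypothetical names, and falling back on
any MADV_FREE failure is an assumed policy. It tries MADV_FREE where the
headers define it and otherwise uses MADV_DONTNEED.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Which advice purge_range() ended up applying (hypothetical enum). */
enum purge_result { PURGE_FAILED = -1, PURGE_LAZY = 0, PURGE_EAGER = 1 };

/*
 * Hypothetical allocator helper: tell the kernel that [addr, addr + len)
 * no longer holds useful data. addr must be page-aligned.
 */
static enum purge_result purge_range(void *addr, size_t len)
{
#ifdef MADV_FREE
	/* Lazy free: pages stay resident until reclaim decides to drop them. */
	if (madvise(addr, len, MADV_FREE) == 0)
		return PURGE_LAZY;
	/* Assumed policy: on any failure, retry below with MADV_DONTNEED. */
#endif
	/* Eager free: private anonymous pages are dropped now and read back as zero. */
	if (madvise(addr, len, MADV_DONTNEED) == 0)
		return PURGE_EAGER;
	return PURGE_FAILED;
}
```

With PURGE_LAZY the pages stay resident until reclaim gets to them, which is
why memory usage can grow to the whole mapping in my test; with PURGE_EAGER
they are given back right away.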
> > system is totally stalled because of swap activity. Without MADV_FREE,
> > everything is OK. Since we definitely don't want to waste too much memory,
> > a system under memory pressure is normal, so it sounds like MADV_FREE will
> > introduce big trouble here.
> >
> > Did you think about moving the MADV_FREE pages to the head of the inactive
> > LRU, so they can be reclaimed easily?
>
> I think that's desirable if the page lived in the active LRU.
> The reason I didn't do that is the volatile ranges system call, which
> was the motivation for MADV_FREE in my mind.
> At the last LSF/MM, there was concern about data hotness.
> Some users want to keep a page at its current LRU position, others want to
> handle it as cold (tail of inactive list), warm (head of inactive list), or
> hot (head of active list), for example.
> The vrange syscall was only about volatility and did not depend on page
> hotness, so my decision was not to change the LRU order and to add a new
> hotness advice later if we need it.
>
> However, MADV_FREE's main customers are allocators and, AFAIK, they want
> to replace MADV_DONTNEED with MADV_FREE, so I think such pages are really cold.
> But we can't be sure, so the head of the inactive list is a good compromise.
> Another concern about the tail of the inactive list is that there could be
> plenty of pages there that were written back asynchronously in a
> previous reclaim pass but not yet reclaimed, because they could not be
> freed in the softirq context of writeback. It means we would end up
> freeing pages that could still become part of the working set ahead of
> pages the VM has already decided to evict.
Yes, they are definitely cold pages. I think we should make sure the
MADV_FREE pages are reclaimed before other pages, at least in the anon
LRU list, though it might be difficult to determine whether we should reclaim
writeback pages or MADV_FREE pages first.
Thanks,
Shaohua