From: Johannes Weiner <hannes@cmpxchg.org>
To: Shaohua Li <shli@fb.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Kernel-team@fb.com, mhocko@suse.com, minchan@kernel.org,
hughd@google.com, riel@redhat.com, mgorman@techsingularity.net
Subject: Re: [RFC 0/6]mm: add new LRU list for MADV_FREE pages
Date: Tue, 31 Jan 2017 13:59:49 -0500
Message-ID: <20170131185949.GA5037@cmpxchg.org>
In-Reply-To: <cover.1485748619.git.shli@fb.com>

Hi Shaohua,

On Sun, Jan 29, 2017 at 09:51:17PM -0800, Shaohua Li wrote:
> We are trying to use MADV_FREE in jemalloc and have found several issues.
> Until they are solved, jemalloc can't use the MADV_FREE feature.
> - Doesn't support systems without swap enabled. If swap is off, we can't age
> anonymous pages, or can't do so efficiently. And since MADV_FREE pages are
> mixed with other anonymous pages, we can't reclaim MADV_FREE pages. In the
> current implementation, MADV_FREE falls back to MADV_DONTNEED when swap is
> not enabled. But in our environment, a lot of machines don't enable swap,
> which prevents our setup from using MADV_FREE.
> - Increases memory pressure. Page reclaim is biased toward reclaiming file
> pages over anonymous pages. This doesn't make sense for MADV_FREE pages,
> because those pages can be freed easily and refilled with only a slight
> penalty. Even if page reclaim didn't favor file pages, there would still be
> an issue, because MADV_FREE pages and other anonymous pages are mixed
> together. To reclaim a MADV_FREE page, we probably must scan a lot of other
> anonymous pages, which is inefficient. In our tests, we usually see OOMs with
> MADV_FREE enabled and none without it.

Fully agreed, the anon LRU is a bad place for these pages.

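For readers following along, the allocator-side usage under discussion can be sketched roughly as below. This is only an illustration, not jemalloc's actual code: release_range() is a hypothetical helper name, and treating EINVAL as "MADV_FREE not supported here" is an assumption about how a caller might choose to fall back.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not taken from jemalloc): tell the kernel that the
 * page-aligned range [addr, addr + len) no longer holds useful data.
 *
 * MADV_FREE lets the kernel reclaim the pages lazily; a later write to a
 * page cancels the pending free for that page. If the running kernel or
 * the libc headers do not know MADV_FREE, fall back to MADV_DONTNEED,
 * which drops the pages immediately.
 *
 * Returns 0 on success, -1 on error with errno set by madvise().
 */
int release_range(void *addr, size_t len)
{
	assert(addr != NULL && len > 0);
#ifdef MADV_FREE
	if (madvise(addr, len, MADV_FREE) == 0)
		return 0;
	/* Assumption: EINVAL here means MADV_FREE is unsupported. */
	if (errno != EINVAL)
		return -1;
#endif
	return madvise(addr, len, MADV_DONTNEED);
}
```

A caller would pass a range it obtained from an anonymous private mmap(). After the call, the old contents must be treated as undefined until the range is written again.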
> For the first two issues, introducing a new LRU list for MADV_FREE pages could
> solve the issues. We can directly reclaim MADV_FREE pages without writing them
> out to swap, so the first issue could be fixed. If only MADV_FREE pages are in
> the new list, page reclaim can easily reclaim such pages without interference
> of file or anonymous pages. The memory pressure issue will disappear.

Do we actually need a new page flag and a special LRU for them? These
pages are basically like clean cache pages at that point. What do you
think about clearing their PG_swapbacked flag on MADV_FREE and moving
them to the inactive file list? The way isolate+putback works should
not even need much modification, something like clear_page_mlock().

When the reclaim scanner finds anon && dirty && !swapbacked, it can
again set PG_swapbacked and goto keep_locked to move the page back
into the anon LRU to get reclaimed according to swapping rules.

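The state transitions suggested above can be modelled in plain user-space C, as a rough sketch. Nothing here is kernel code: struct model_page, the enum values, and both functions are hypothetical stand-ins. The model also assumes the page is treated as clean at MADV_FREE time, which is one reading of "basically like clean cache pages".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the page state bits named in the proposal. */
struct model_page {
	bool anon;        /* anonymous page */
	bool dirty;       /* written to again after MADV_FREE */
	bool swapbacked;  /* models PG_swapbacked */
};

enum model_lru { MODEL_LRU_ANON, MODEL_LRU_INACTIVE_FILE };

enum model_action {
	MODEL_RECLAIM,            /* drop the page without any swap I/O */
	MODEL_KEEP_MOVE_TO_ANON,  /* the "goto keep_locked" case */
	MODEL_NORMAL_PATH         /* not a lazily freed page */
};

/* MADV_FREE side: clear swapbacked and queue on the inactive file list. */
enum model_lru model_madv_free(struct model_page *page)
{
	assert(page->anon);
	page->swapbacked = false;
	page->dirty = false;  /* assumption: treated as clean from here on */
	return MODEL_LRU_INACTIVE_FILE;
}

/* Reclaim side: decide what to do with one page found by the scanner. */
enum model_action model_reclaim_one(struct model_page *page)
{
	if (page->anon && !page->swapbacked) {
		if (page->dirty) {
			/* Reused since MADV_FREE: ordinary anon page again. */
			page->swapbacked = true;
			return MODEL_KEEP_MOVE_TO_ANON;
		}
		return MODEL_RECLAIM;
	}
	return MODEL_NORMAL_PATH;
}
```

In this model a page that stays untouched after MADV_FREE is simply dropped, while a page that was written again gets PG_swapbacked back and returns to the anon LRU, so no separate flag or list is involved.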
> For the third issue, we can add a separate RSS count for MADV_FREE pages. The
> count will be increased in the madvise syscall and decreased in page reclaim
> (e.g., unmap). One issue is activate_page(). A MADV_FREE page can be promoted
> to an active page there, but there is no mm_struct context at that point.
> Iterating over VMAs there sounds too silly. The patchset doesn't fix this
> issue yet. Hopefully somebody can share a hint on how to fix it.

This problem also goes away if we use the file LRUs.