From: Minchan Kim <minchan.kim@gmail.com>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>, Elladan <elladan@eskimo.com>,
Nick Piggin <npiggin@suse.de>,
Johannes Weiner <hannes@cmpxchg.org>,
Christoph Lameter <cl@linux-foundation.org>,
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Peter Zijlstra <peterz@infradead.org>,
Rik van Riel <riel@redhat.com>, "tytso@mit.edu" <tytso@mit.edu>,
"linux-mm@kvack.org" <linux-mm@kvack.org>
Subject: Re: [PATCH 2/3] vmscan: make mapped executable pages the first class citizen
Date: Sun, 17 May 2009 09:38:30 +0900 [thread overview]
Message-ID: <28c262360905161738o2ec8b0cg6bfb40b40fc048fa@mail.gmail.com> (raw)
In-Reply-To: <20090516092858.GA12104@localhost>
On Sat, May 16, 2009 at 6:28 PM, Wu Fengguang <fengguang.wu@intel.com> wrote:
> [trivial update on comment text, according to Rik's comment]
>
> --
> vmscan: make mapped executable pages the first class citizen
>
> Protect referenced PROT_EXEC mapped pages from being deactivated.
>
> PROT_EXEC(or its internal presentation VM_EXEC) pages normally belong to some
> currently running executables and their linked libraries, they shall really be
> cached aggressively to provide good user experiences.
>
> Thanks to Johannes Weiner for the advice to reuse the VMA walk in
> page_referenced() to get the PROT_EXEC bit.
>
>
> [more details]
>
> ( The consequences of this patch will have to be discussed together with
> Rik van Riel's recent patch "vmscan: evict use-once pages first". )
>
> ( Some of the good points and insights are taken into this changelog.
> Thanks to all the involved people for the great LKML discussions. )
>
> the problem
> -----------
>
> For a typical desktop, the most precious working set is composed of
> *actively accessed*
> (1) memory mapped executables
> (2) and their anonymous pages
> (3) and other files
> (4) and the dcache/icache/.. slabs
> while the least important data are
> (5) infrequently used or use-once files
>
> For a typical desktop, one major problem is busty and large amount of (5)
> use-once files flushing out the working set.
>
> Inside the working set, (4) dcache/icache have already been too sticky ;-)
> So we only have to care (2) anonymous and (1)(3) file pages.
>
> anonymous pages
> ---------------
> Anonymous pages are effectively immune to the streaming IO attack, because we
> now have separate file/anon LRU lists. When the use-once files crowd into the
> file LRU, the list's "quality" is significantly lowered. Therefore the scan
> balance policy in get_scan_ratio() will choose to scan the (low quality) file
> LRU much more frequently than the anon LRU.
>
> file pages
> ----------
> Rik proposed to *not* scan the active file LRU when the inactive list grows
> larger than active list. This guarantees that when there are use-once streaming
> IO, and the working set is not too large(so that active_size < inactive_size),
> the active file LRU will *not* be scanned at all. So the not-too-large working
> set can be well protected.
>
> But there are also situations where the file working set is a bit large so that
> (active_size >= inactive_size), or the streaming IOs are not purely use-once.
> In these cases, the active list will be scanned slowly. Because the current
> shrink_active_list() policy is to deactivate active pages regardless of their
> referenced bits. The deactivated pages become susceptible to the streaming IO
> attack: the inactive list could be scanned fast (500MB / 50MBps = 10s) so that
> the deactivated pages don't have enough time to get re-referenced. Because a
> user tend to switch between windows in intervals from seconds to minutes.
>
> This patch holds mapped executable pages in the active list as long as they
> are referenced during each full scan of the active list. Because the active
> list is normally scanned much slower, they get longer grace time (eg. 100s)
> for further references, which better matches the pace of user operations.
>
> Therefore this patch greatly prolongs the in-cache time of executable code,
> when there are moderate memory pressures.
>
> before patch: guaranteed to be cached if reference intervals < I
> after patch: guaranteed to be cached if reference intervals < I+A
> (except when randomly reclaimed by the lumpy reclaim)
> where
> A = time to fully scan the active file LRU
> I = time to fully scan the inactive file LRU
>
> Note that normally A >> I.
>
> side effects
> ------------
>
> This patch is safe in general, it restores the pre-2.6.28 mmap() behavior
> but in a much smaller and well targeted scope.
>
> One may worry about some one to abuse the PROT_EXEC heuristic. But as
> Andrew Morton stated, there are other tricks to getting that sort of boost.
>
> Another concern is the PROT_EXEC mapped pages growing large in rare cases,
> and therefore hurting reclaim efficiency. But a sane application targeted for
> large audience will never use PROT_EXEC for data mappings. If some home made
> application tries to abuse that bit, it shall be aware of the consequences.
> If it is abused to scale of 2/3 total memory, it gains nothing but overheads.
>
> CC: Elladan <elladan@eskimo.com>
> CC: Nick Piggin <npiggin@suse.de>
> CC: Johannes Weiner <hannes@cmpxchg.org>
> CC: Christoph Lameter <cl@linux-foundation.org>
> CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Acked-by: Peter Zijlstra <peterz@infradead.org>
> Acked-by: Rik van Riel <riel@redhat.com>
> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
next prev parent reply other threads:[~2009-05-17 0:38 UTC|newest]
Thread overview: 69+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-05-16 9:00 [PATCH 0/3] make mapped executable pages the first class citizen Wu Fengguang
2009-05-16 9:00 ` [PATCH 1/3] vmscan: report vm_flags in page_referenced() Wu Fengguang
2009-05-16 13:17 ` Johannes Weiner
2009-05-16 13:37 ` Rik van Riel
2009-05-17 0:35 ` Minchan Kim
2009-05-17 1:36 ` Minchan Kim
2009-05-17 1:58 ` Wu Fengguang
2009-05-16 9:00 ` [PATCH 2/3] vmscan: make mapped executable pages the first class citizen Wu Fengguang
2009-05-16 9:28 ` Wu Fengguang
2009-05-16 13:20 ` Johannes Weiner
2009-05-17 0:38 ` Minchan Kim [this message]
2009-05-18 14:46 ` Christoph Lameter
2009-05-19 3:27 ` Wu Fengguang
2009-05-19 4:41 ` KOSAKI Motohiro
2009-05-19 4:44 ` KOSAKI Motohiro
2009-05-19 4:48 ` Wu Fengguang
2009-05-19 5:09 ` Wu Fengguang
2009-05-19 6:27 ` Wu Fengguang
2009-05-19 6:25 ` Wu Fengguang
2009-05-20 11:20 ` Andi Kleen
2009-05-20 14:32 ` Wu Fengguang
2009-05-20 14:47 ` Andi Kleen
2009-05-20 14:56 ` Wu Fengguang
2009-05-20 15:38 ` Wu Fengguang
2009-06-08 12:14 ` Nai Xia
2009-06-08 12:46 ` Wu Fengguang
2009-06-08 15:02 ` Nai Xia
2009-06-08 7:39 ` Wu Fengguang
2009-06-08 7:51 ` KOSAKI Motohiro
2009-06-08 7:56 ` Wu Fengguang
2009-06-08 17:18 ` Nai Xia
2009-06-09 6:44 ` Wu Fengguang
2009-05-19 7:15 ` Wu Fengguang
2009-05-19 7:20 ` KOSAKI Motohiro
2009-05-19 7:49 ` Wu Fengguang
2009-05-19 8:06 ` KOSAKI Motohiro
2009-05-19 8:53 ` Wu Fengguang
2009-05-19 12:28 ` KOSAKI Motohiro
2009-05-20 1:44 ` Wu Fengguang
2009-05-20 1:59 ` KOSAKI Motohiro
2009-05-20 2:31 ` Wu Fengguang
2009-05-20 2:58 ` KOSAKI Motohiro
2009-05-19 13:24 ` Rik van Riel
2009-05-19 15:55 ` KOSAKI Motohiro
2009-05-19 6:39 ` Pekka Enberg
2009-05-19 6:56 ` KOSAKI Motohiro
2009-05-19 7:44 ` Peter Zijlstra
2009-05-19 8:05 ` Pekka Enberg
2009-05-19 8:12 ` Wu Fengguang
2009-05-19 8:14 ` Pekka Enberg
2009-05-19 13:14 ` Rik van Riel
2009-05-16 9:00 ` [PATCH 3/3] vmscan: merge duplicate code in shrink_active_list() Wu Fengguang
2009-05-16 13:39 ` Johannes Weiner
2009-05-16 13:47 ` Wu Fengguang
2009-05-16 14:35 ` Rik van Riel
2009-05-17 1:24 ` Minchan Kim
2009-05-16 14:56 ` [PATCH 0/3] make mapped executable pages the first class citizen Peter Zijlstra
2009-06-17 21:11 ` Jesse Barnes
2009-06-17 21:37 ` Jesse Barnes
2009-06-18 1:25 ` Wu Fengguang
2009-06-18 16:33 ` Jesse Barnes
2009-06-19 9:00 ` Wu, Fengguang
2009-06-19 9:04 ` Peter Zijlstra
2009-06-19 9:32 ` Wu Fengguang
2009-06-19 16:43 ` Jesse Barnes
2009-07-04 1:27 ` Roger WANG
2009-07-06 17:38 ` Jesse Barnes
-- strict thread matches above, loose matches on Subject: below --
2009-05-17 2:23 Wu Fengguang
2009-05-17 2:23 ` [PATCH 2/3] vmscan: " Wu Fengguang
2009-05-19 8:59 ` Wu Fengguang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=28c262360905161738o2ec8b0cg6bfb40b40fc048fa@mail.gmail.com \
--to=minchan.kim@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux-foundation.org \
--cc=elladan@eskimo.com \
--cc=fengguang.wu@intel.com \
--cc=hannes@cmpxchg.org \
--cc=kosaki.motohiro@jp.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=npiggin@suse.de \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).