From: Wu Fengguang <fengguang.wu@intel.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>, Elladan <elladan@eskimo.com>,
	Nick Piggin <npiggin@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Rik van Riel <riel@redhat.com>, "tytso@mit.edu" <tytso@mit.edu>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"minchan.kim@gmail.com" <minchan.kim@gmail.com>
Subject: Re: [PATCH 2/3] vmscan: make mapped executable pages the first class citizen
Date: Tue, 19 May 2009 15:49:25 +0800
Message-ID: <20090519074925.GA690@localhost>
In-Reply-To: <20090519161756.4EE4.A69D9226@jp.fujitsu.com>

On Tue, May 19, 2009 at 03:20:19PM +0800, KOSAKI Motohiro wrote:
> > On Tue, May 19, 2009 at 12:41:38PM +0800, KOSAKI Motohiro wrote:
> > > Hi
> > > 
> > > Thanks for the great work.
> > > 
> > > 
> > > > SUMMARY
> > > > =======
> > > > The patch decreases the number of major faults from 50 to 3 during 10% cache hot reads.
> > > > 
> > > > 
> > > > SCENARIO
> > > > ========
> > > > The test scenario is to do 100000 pread(size=110 pages, offset=(i*100) pages),
> > > > where 10% of the pages will be activated:
> > > > 
> > > >         for i in `seq 0 100 10000000`; do echo $i 110;  done > pattern-hot-10
> > > >         iotrace.rb --load pattern-hot-10 --play /b/sparse
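The access pattern above can be checked with a small sketch. It assumes the pattern file holds "offset length" pairs in pages, as the scenario text describes; the helper names build_pattern and hot_fraction are hypothetical, and the range is scaled down from 10000000 so the check stays cheap. Each read covers 110 pages but advances by only 100, so the last 10 pages of each read are touched again by the next one.

```python
from collections import Counter


def build_pattern(last_offset, step=100, length=110):
    """Return (offset, length) pairs in pages, mirroring `seq 0 step last_offset`."""
    return [(off, length) for off in range(0, last_offset + 1, step)]


def hot_fraction(pattern):
    """Fraction of distinct pages that are read more than once."""
    hits = Counter()
    for off, length in pattern:
        for page in range(off, off + length):
            hits[page] += 1
    hot = sum(1 for count in hits.values() if count > 1)
    return hot / len(hits)


# Scaled-down pattern (assumption: the ratio does not depend on the file size).
pattern = build_pattern(100_000)
print(len(pattern), round(hot_fraction(pattern), 3))
```

With this layout roughly one page in ten is read twice, which is where the "10% cache hot" figure comes from.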
> > > 
> > > 
> > > Where can I download iotrace.rb?
> > > 
> > > 
> > > > and monitor /proc/vmstat during the time. The test box has 2G memory.
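A minimal sketch of how such /proc/vmstat snapshots could be taken, assuming the file keeps its usual "name value" line format. The helper names (parse_vmstat, read_vmstat, format_row) are hypothetical and this is not the tool used for the numbers in this thread; a counter missing on a given kernel is printed as 0.

```python
FIELDS = ("nr_mapped", "nr_active_file", "nr_inactive_file",
          "pgmajfault", "pgdeactivate", "pgfree")


def parse_vmstat(text, fields=FIELDS):
    """Parse 'name value' lines and keep only the requested counters."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in fields:
            stats[parts[0]] = int(parts[1])
    return stats


def read_vmstat(path="/proc/vmstat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


def format_row(label, stats, fields=FIELDS):
    """Format one snapshot as a table row; missing counters print as 0."""
    cells = "".join(f"{stats.get(name, 0):>17}" for name in fields)
    return f"{label + ':':<10}{cells}"


if __name__ == "__main__":
    try:
        print(format_row("begin", read_vmstat()))
    except OSError as err:
        print(f"cannot read /proc/vmstat: {err}")
```

Calling read_vmstat() once shortly after the read starts, once just before it stops, and once after the working set is restored would give the three rows of a table like the ones in the analysis.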
> > > > 
> > > > 
> > > > ANALYSIS
> > > > ========
> > > > 
> > > > I carried out two runs on freshly booted console mode 2.6.29 with the VM_EXEC
> > > > patch, and fetched the vmstat numbers at
> > > > 
> > > > (1) begin:   shortly after the big read IO starts;
> > > > (2) end:     just before the big read IO stops;
> > > > (3) restore: the big read IO stops and the zsh working set restored
> > > > 
> > > >         nr_mapped   nr_active_file nr_inactive_file       pgmajfault     pgdeactivate           pgfree
> > > > begin:       2481             2237             8694              630                0           574299
> > > > end:          275           231976           233914              633           776271         20933042
> > > > restore:      370           232154           234524              691           777183         20958453
> > > > 
> > > > begin:       2434             2237             8493              629                0           574195
> > > > end:          284           231970           233536              632           771918         20896129
> > > > restore:      399           232218           234789              690           774526         20957909
> > > > 
> > > > and another run on 2.6.30-rc4-mm with the VM_EXEC logic disabled:
> > > 
> > > I don't think this is a proper comparison.
> > > You need one of the following comparisons; otherwise we introduce a lot of guesswork into the analysis.
> > > 
> > >  - 2.6.29 with and without VM_EXEC patch
> > >  - 2.6.30-rc4-mm with and without VM_EXEC patch
> > > 
> > > 
> > > > 
> > > > begin:       2479             2344             9659              210                0           579643
> > > > end:          284           232010           234142              260           772776         20917184
> > > > restore:      379           232159           234371              301           774888         20967849
> > > > 
> > > > The numbers show that
> > > > 
> > > > - The startup pgmajfault of 2.6.30-rc4-mm is merely 1/3 that of 2.6.29.
> > > >   I'd attribute that improvement to the mmap readahead improvements :-)
> > > > 
> > > > - The pgmajfault increment during the file copy is 633-630=3 vs 260-210=50.
> > > >   That's a huge improvement - which means with the VM_EXEC protection logic,
> > > >   active mmap pages are pretty safe even under partially cache hot streaming IO.
> > > > 
> > > > - When the active:inactive file LRU size ratio reaches 1:1, their scan rate ratio is 1:20.8
> > > >   under 10% cache hot IO (computed with the formula Dpgdeactivate:Dpgfree).
> > > >   That roughly means the active mmap pages get 20.8 times more chances to get
> > > >   re-referenced and stay in memory.
> > > > 
> > > > - The absolute nr_mapped drops considerably to 1/9 during the big IO, and the
> > > >   dropped pages are mostly inactive ones. The patch has almost no impact in
> > > >   this aspect, that means it won't unnecessarily increase memory pressure.
> > > >   (In contrast, your 20% mmap protection ratio will keep them all, and
> > > >   therefore eliminate the extra 41 major faults to restore working set
> > > >   of zsh etc.)
> > 
> > More results on X desktop, kernel 2.6.30-rc4-mm:
> > 
> >         nr_mapped   nr_active_file nr_inactive_file       pgmajfault     pgdeactivate           pgfree
> > 
> > VM_EXEC protection ON:
> > begin:       9740             8920            64075              561                0           678360
> > end:          768           218254           220029              565           798953         21057006
> > restore:      857           218543           220987              606           799462         21075710
> > restore X:   2414           218560           225344              797           799462         21080795
> > 
> > VM_EXEC protection OFF:
> > begin:       9368             5035            26389              554                0           633391
> > end:          770           218449           221230              661           646472         17832500
> > restore:     1113           218466           220978              710           649881         17905235
> > restore X:   2687           218650           225484              947           802700         21083584
> > 
> > The added "restore X" row was taken after the IO, once I had switched back and
> > forth between the urxvt and firefox windows to restore their working set. I
> > cannot explain why the absolute nr_mapped grows larger at the end of the VM_EXEC
> > OFF case. Maybe it's because urxvt is the foreground window during the first run,
> > and firefox is the foreground window during the second run?
> > 
> > Like the console mode, the absolute nr_mapped drops considerably - to 1/13 of
> > the original size - during the streaming IO.
> > 
> > The delta of pgmajfault is 3 vs 107 during IO, or 236 vs 393 during the whole
> > process.
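The deltas quoted above can be recomputed from the X desktop table with a short sketch. The values are copied from the posted rows; the names SNAPSHOTS, majfault_delta and mapped_shrink are hypothetical. Note that with the rows as posted, the VM_EXEC ON delta during IO works out to 565-561=4, slightly different from the 3 quoted in the text, while the other three figures match.

```python
# (nr_mapped, pgmajfault) pairs copied from the X desktop table in this mail.
SNAPSHOTS = {
    "VM_EXEC on": {"begin": (9740, 561), "end": (768, 565), "restore X": (2414, 797)},
    "VM_EXEC off": {"begin": (9368, 554), "end": (770, 661), "restore X": (2687, 947)},
}


def majfault_delta(run, start, stop):
    """Major faults taken between two snapshots of the same run."""
    return SNAPSHOTS[run][stop][1] - SNAPSHOTS[run][start][1]


def mapped_shrink(run):
    """How many times smaller nr_mapped is at 'end' than at 'begin'."""
    return SNAPSHOTS[run]["begin"][0] / SNAPSHOTS[run]["end"][0]


for run in SNAPSHOTS:
    print(run,
          majfault_delta(run, "begin", "end"),        # during the IO
          majfault_delta(run, "begin", "restore X"),  # whole process
          round(mapped_shrink(run), 1))
```

The nr_mapped ratio comes out near 12.7 with protection on and 12.2 with it off, consistent with the "to 1/13 of the original size" remark.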
> 
> hmmm.
> 
> About 100 page faults doesn't match Elladan's problem, I think.
> Perhaps we missed some additional condition needed to reproduce it?

Elladan's case is not the point of this test.
Elladan's IO is use-once, so probably not a caching problem at all.

This test case is specifically devised to confirm whether this patch
works as expected. Conclusion: it does.

Thanks,
Fengguang

