From: Wu Fengguang <fengguang.wu@intel.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>, Elladan <elladan@eskimo.com>,
Nick Piggin <npiggin@suse.de>,
Johannes Weiner <hannes@cmpxchg.org>,
Peter Zijlstra <peterz@infradead.org>,
Rik van Riel <riel@redhat.com>, "tytso@mit.edu" <tytso@mit.edu>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"minchan.kim@gmail.com" <minchan.kim@gmail.com>
Subject: Re: [PATCH 2/3] vmscan: make mapped executable pages the first class citizen
Date: Tue, 19 May 2009 13:09:32 +0800
Message-ID: <20090519050932.GB8769@localhost>
In-Reply-To: <20090519133422.4ECC.A69D9226@jp.fujitsu.com>
[-- Attachment #1: Type: text/plain, Size: 4860 bytes --]
On Tue, May 19, 2009 at 12:41:38PM +0800, KOSAKI Motohiro wrote:
> Hi
>
> Thanks for great works.
>
>
> > SUMMARY
> > =======
> > The patch decreases the number of major faults from 50 to 3 during 10% cache hot reads.
> >
> >
> > SCENARIO
> > ========
> > The test scenario is to do 100000 pread(size=110 pages, offset=(i*100) pages),
> > where 10% of the pages will be activated:
> >
> > for i in `seq 0 100 10000000`; do echo $i 110; done > pattern-hot-10
> > iotrace.rb --load pattern-hot-10 --play /b/sparse
>
>
> Where can I download iotrace.rb?

It is attached. It depends on a few Ruby libraries.

> > and monitor /proc/vmstat during the time. The test box has 2G memory.
> >
> >
> > ANALYSIS
> > ========
> >
> > I carried out two runs on a freshly booted, console-mode 2.6.29 with the
> > VM_EXEC patch, and fetched the vmstat numbers at
> >
> > (1) begin: shortly after the big read IO starts;
> > (2) end: just before the big read IO stops;
> > (3) restore: after the big read IO stops and the zsh working set is restored
> >
> >            nr_mapped  nr_active_file  nr_inactive_file  pgmajfault  pgdeactivate    pgfree
> > begin:          2481            2237              8694         630             0    574299
> > end:             275          231976            233914         633        776271  20933042
> > restore:         370          232154            234524         691        777183  20958453
> >
> > begin:          2434            2237              8493         629             0    574195
> > end:             284          231970            233536         632        771918  20896129
> > restore:         399          232218            234789         690        774526  20957909
> >
> > and another run on 2.6.30-rc4-mm with the VM_EXEC logic disabled:
>
> I don't think this is a proper comparison.
> You need one of the following comparisons; otherwise we introduce a lot of
> guesswork into the analysis.
>
> - 2.6.29 with and without VM_EXEC patch
> - 2.6.30-rc4-mm with and without VM_EXEC patch

I think it doesn't matter that much when it comes to "relative" numbers.
But anyway, I guess you want to try a more typical desktop ;)
Unfortunately, Xorg is currently broken on my test box.

> >
> > begin:          2479            2344              9659         210             0    579643
> > end:             284          232010            234142         260        772776  20917184
> > restore:         379          232159            234371         301        774888  20967849
> >
> > The numbers show that
> >
> > - The startup pgmajfault of 2.6.30-rc4-mm is merely 1/3 that of 2.6.29.
> >   I'd attribute that improvement to the mmap readahead improvements :-)
> >
> > - The pgmajfault increment during the file copy is 633-630=3 vs 260-210=50.
> >   That's a huge improvement, which means that with the VM_EXEC protection
> >   logic, active mmap pages are pretty safe even under partially cache hot
> >   streaming IO.
> >
> > - When the active:inactive file LRU size reaches 1:1, their scan rate ratio
> >   is 1:20.8 under 10% cache hot IO (computed with the formula
> >   Dpgdeactivate:Dpgfree). That roughly means the active mmap pages get
> >   20.8 times more chances to get re-referenced and stay in memory.
> >
> > - The absolute nr_mapped drops considerably, to 1/9, during the big IO, and
> >   the dropped pages are mostly inactive ones. The patch has almost no impact
> >   in this aspect, which means it won't unnecessarily increase memory pressure.
> >   (In contrast, your 20% mmap protection ratio would keep them all, and
> >   therefore eliminate the extra 41 major faults needed to restore the
> >   working set of zsh etc.)
>
> I'm surprised by this.
> Why doesn't your patch protect mapped pages from streaming IO?

It only protects the *active* mapped pages, as expected.
But yes, the active percentage is much lower than expected :-)

> I would really like to reproduce this myself; please tell me how.

OK.

Firstly:

	for i in `seq 0 100 10000000`; do echo $i 110; done > pattern-hot-10
	dd if=/dev/zero of=/tmp/sparse bs=1M count=1 seek=1024000

Then boot into desktop and run concurrently:

	iotrace.rb --load pattern-hot-10 --play /tmp/sparse
	vmmon nr_mapped nr_active_file nr_inactive_file pgmajfault pgdeactivate pgfree
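
For readers without the Ruby libraries, the replay step can be approximated in C. This is only a minimal sketch and is not taken from iotrace.rb: the helper name replay_pattern() is hypothetical, and it assumes that each pattern line is "offset size" in units of 4096-byte pages, as the scenario description (110 pages per read, 450560 bytes) suggests.

```c
#define _XOPEN_SOURCE 700	/* for pread(2) and fileno(3) */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 4096-byte pages (110 pages = 450560 bytes). */
#define PAGE_BYTES 4096L

/*
 * Hypothetical helper, not taken from iotrace.rb: replay "offset size"
 * lines (both in pages) from pattern as pread() calls on fd.
 * Returns the number of reads issued, or -1 on a negative offset,
 * a non-positive size, an allocation failure or an I/O error.
 */
static long replay_pattern(FILE *pattern, int fd)
{
	long offset_pages, size_pages, nr_reads = 0;

	while (fscanf(pattern, "%ld %ld", &offset_pages, &size_pages) == 2) {
		size_t len;
		char *buf;

		if (offset_pages < 0 || size_pages <= 0)
			return -1;
		len = (size_t)size_pages * PAGE_BYTES;
		buf = malloc(len);
		if (buf == NULL)
			return -1;
		/* A short read (for example at EOF) is fine; only errors are fatal. */
		if (pread(fd, buf, len, (off_t)offset_pages * PAGE_BYTES) < 0) {
			free(buf);
			return -1;
		}
		free(buf);
		nr_reads++;
	}
	return nr_reads;
}
```

A caller would open pattern-hot-10 with fopen() and the sparse file with open(), then pass both to replay_pattern(). The offsets reach roughly 40GB, so on systems with a 32-bit off_t this needs -D_FILE_OFFSET_BITS=64.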

Note that I created the sparse file on btrfs, which happens to be
very slow at reading sparse files:

	151.194384MB/s 284.198252s 100001x 450560b --load pattern-hot-10 --play /b/sparse

In that case, the inactive list is rotated at about 250MB/s,
so a full scan of it takes about 3.5 seconds, while a full scan
of the active file list takes about 77 seconds.
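
The figures above can be cross-checked with a little arithmetic. The sketch below assumes 4096-byte pages and that "MB" means 2^20 bytes; the function names are made up for illustration. The list size used in the usage note is the nr_inactive_file value from the first "end" sample.

```c
#define PAGE_BYTES 4096.0		/* assumed page size */
#define MIB (1024.0 * 1024.0)		/* assumed meaning of "MB" */

/* Throughput in MiB/s for nr_reads reads of bytes_per_read bytes each. */
static double throughput_mib_s(double nr_reads, double bytes_per_read,
			       double seconds)
{
	return nr_reads * bytes_per_read / MIB / seconds;
}

/* Seconds needed to cycle through an LRU list of nr_pages at rate_mib_s. */
static double full_scan_seconds(double nr_pages, double rate_mib_s)
{
	return nr_pages * PAGE_BYTES / MIB / rate_mib_s;
}
```

throughput_mib_s(100001, 450560, 284.198252) comes to about 151.19, matching the reported rate. full_scan_seconds(233914, 250) comes to about 3.65, in the same ballpark as the "about 3.5 seconds" estimate. Scaling that by the 1:20.8 scan ratio gives roughly 76 seconds for the active file list.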

The source code for both of the above tools is attached.

Thanks,
Fengguang

[-- Attachment #2: iotrace.rb --]
[-- Type: application/x-ruby, Size: 8999 bytes --]
[-- Attachment #3: vmmon.c --]
[-- Type: text/x-csrc, Size: 2410 bytes --]
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/time.h>

/* Raw output is the default here, so -r currently changes nothing. */
static int raw = 1;
static int delay = 1;		/* seconds between samples */
static int nr_fields;
static char **fields;
static FILE *f;

/*
 * Re-read /proc/vmstat. Each requested field is matched as a prefix, and
 * the values of all matching counters are summed into values[i].
 */
static void acquire(long *values)
{
	char buf[1024];

	rewind(f);
	memset(values, 0, nr_fields * sizeof(*values));
	while (fgets(buf, sizeof(buf), f)) {
		int i;

		for (i = 0; i < nr_fields; i++) {
			char *p;

			if (strncmp(buf, fields[i], strlen(fields[i])))
				continue;
			p = strchr(buf, ' ');
			if (p == NULL) {
				fprintf(stderr, "vmmon: error parsing /proc\n");
				exit(1);
			}
			values[i] += strtoul(p, NULL, 10);
			break;
		}
	}
}

/* Print one row: absolute values in raw mode, else per-second deltas. */
static void display(long *new_values, long *prev_values,
		    unsigned long long usecs)
{
	int i;

	for (i = 0; i < nr_fields; i++) {
		if (raw)
			printf(" %16ld", new_values[i]);
		else {
			long long diff;
			double ddiff;

			ddiff = new_values[i] - prev_values[i];
			ddiff *= 1000000;
			ddiff /= usecs;
			diff = ddiff;
			printf(" %16lld", diff);
		}
	}
	printf("\n");
}

/* Sleep for one interval, take a sample and print it. */
static void do1(long *prev_values)
{
	struct timeval start;
	struct timeval end;
	long long usecs;
	long new_values[nr_fields];

	gettimeofday(&start, NULL);
	sleep(delay);
	gettimeofday(&end, NULL);
	acquire(new_values);
	usecs = end.tv_sec - start.tv_sec;
	usecs *= 1000000;
	usecs += end.tv_usec - start.tv_usec;
	display(new_values, prev_values, usecs);
	memcpy(prev_values, new_values, nr_fields * sizeof(*prev_values));
}

static void heading(void)
{
	int i;

	printf("\n");
	for (i = 0; i < nr_fields; i++)
		printf(" %16s", fields[i]);
	printf("\n");
}

/* Sample forever, repeating the heading every 24 rows. */
static void doit(void)
{
	int line = 0;
	long prev_values[nr_fields];

	acquire(prev_values);
	for ( ; ; ) {
		if (line == 0)
			heading();
		do1(prev_values);
		line++;
		if (line == 24)
			line = 0;
	}
}

static void usage(void)
{
	fprintf(stderr, "usage: vmmon [-r] [-d N] field [field ...]\n");
	fprintf(stderr, " -d N : delay N seconds\n");
	fprintf(stderr, " -r : show raw numbers instead of diff\n");
	exit(1);
}

int main(int argc, char *argv[])
{
	int c;

	while ((c = getopt(argc, argv, "rd:")) != -1) {
		switch (c) {
		case 'r':
			raw = 1;
			break;
		case 'd':
			delay = strtol(optarg, NULL, 10);
			break;
		default:
			usage();
		}
	}
	/* Require at least one field and a positive sampling interval. */
	if (optind == argc || delay < 1)
		usage();
	nr_fields = argc - optind;
	fields = argv + optind;
	f = fopen("/proc/vmstat", "r");
	if (f == NULL) {
		perror("vmmon: /proc/vmstat");
		exit(1);
	}
	doit();
	exit(0);
}