From: Jens Axboe <axboe@suse.de>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Andrew Morton <akpm@osdl.org>,
linux-kernel@vger.kernel.org, npiggin@suse.de,
linux-mm@kvack.org
Subject: Re: Lockless page cache test results
Date: Wed, 26 Apr 2006 21:15:58 +0200
Message-ID: <20060426191557.GA9211@suse.de>
In-Reply-To: <Pine.LNX.4.64.0604261144290.3701@g5.osdl.org>

On Wed, Apr 26 2006, Linus Torvalds wrote:
>
>
> On Wed, 26 Apr 2006, Andrew Morton wrote:
>
> > Jens Axboe <axboe@suse.de> wrote:
> > >
> > > Once per page, it's basically exercising the generic_file_splice_read()
> > > path. Basically X number of "clients" open the same file, and fill those
> > > pages into a pipe using splice. The output end of the pipe is then
> > > spliced to /dev/null to toss it away again.
> >
> > OK. That doesn't sound like something which a real application is likely
> > to do ;)
>
> True, but on the other hand, it does kind of "distill" one (small) part of
> something that real apps _are_ likely to do.
>
> The whole 'splice to /dev/null' part can be seen as totally irrelevant,
> but at the same time a way to ignore all the other parts of normal page
> cache usage (ie the other parts of page cache usage tend to be the "map it
> into user space" or the actual "memcpy_to/from_user()" or the "TCP send"
> part).
>
> The question, of course, is whether the part that remains (the actual page
> lookup) is important enough to matter, once it is part of a bigger chain
> in a real application.
>
> In other words, the splice() thing is just a way to isolate one part of a
> chain that is usually much more involved, and micro-benchmark just that
> one part.

Nick called it a find_get_page() micro benchmark, which is pretty much
spot on. So naturally it shows the absolute best side of the lockless
page cache, but that is also very interesting. The /dev/null output can
just be seen as an "infinitely" fast output method, both from a
throughput and a lightweight POV.

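For reference, the per-client loop described in the quoted text above
can be sketched in userspace roughly as follows. This is a minimal
sketch and not the actual test program behind these numbers: the helper
name splice_file_to_devnull(), the 64 KiB chunk size and the use of
SPLICE_F_MOVE are assumptions made purely for illustration.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

/* Assumed chunk size; matches the default pipe capacity on most systems. */
#define CHUNK (64 * 1024)

/*
 * Hypothetical helper: splice the whole file at 'path' into a pipe, then
 * splice the pipe into /dev/null. Returns the number of bytes moved, or
 * -1 on error. No file data is copied to or from user space.
 */
static long long splice_file_to_devnull(const char *path)
{
    int pfd[2] = { -1, -1 };
    long long total = -1;
    int in = open(path, O_RDONLY);
    int out = open("/dev/null", O_WRONLY);

    if (in < 0 || out < 0 || pipe(pfd) < 0)
        goto done;

    total = 0;
    for (;;) {
        /* Fill the pipe with page cache pages from the file. */
        ssize_t n = splice(in, NULL, pfd[1], NULL, CHUNK, SPLICE_F_MOVE);

        if (n < 0) {
            total = -1;
            break;
        }
        if (n == 0)
            break; /* end of file */

        /* Drain everything just queued in the pipe into /dev/null. */
        while (n > 0) {
            ssize_t m = splice(pfd[0], NULL, out, NULL, (size_t)n,
                               SPLICE_F_MOVE);

            if (m <= 0) {
                total = -1;
                goto done;
            }
            n -= m;
            total += m;
        }
    }

done:
    if (pfd[0] >= 0)
        close(pfd[0]);
    if (pfd[1] >= 0)
        close(pfd[1]);
    if (in >= 0)
        close(in);
    if (out >= 0)
        close(out);
    return total;
}
```

Running several such loops concurrently against the same, already
cached file would approximate the N-client case: nearly all of the
remaining work is then page lookup and reference counting, which is
what the micro benchmark is meant to isolate.
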
> It would be interesting to see where doing gang-lookup moves the target,
> but on the other hand, with smaller files (and small files are still
> common), gang lookup isn't going to help as much.

With a 16-page gang lookup in splice, the top profile entries for the
4-client case (which is now at 4GiB/sec instead of 3) are:

samples   %        symbol name
30396     36.7217  __do_page_cache_readahead
25843     31.2212  find_get_pages_contig
 9699     11.7174  default_idle

Even disregarding that readahead contender that could probably be made a
little more clever, we are still spending an awful lot of time in the
page lookup. I didn't mention this before, but the get_page/put_page
overhead is also a lot smaller with the lockless patches.
--
Jens Axboe