From: Wu Fengguang <fengguang.wu@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
linux-fsdevel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
Jens Axboe <jens.axboe@oracle.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Rik van Riel <riel@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH 5/8] readahead: add /debug/readahead/stats
Date: Tue, 29 Nov 2011 11:23:23 +0800 [thread overview]
Message-ID: <20111129032323.GC19506@localhost> (raw)
In-Reply-To: <20111121152958.e4fd76d4.akpm@linux-foundation.org>
On Mon, Nov 21, 2011 at 03:29:58PM -0800, Andrew Morton wrote:
> On Mon, 21 Nov 2011 17:18:24 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>
> > The accounting code will be compiled in by default (CONFIG_READAHEAD_STATS=y),
> > and will remain inactive unless enabled explicitly with either boot option
> >
> > readahead_stats=1
> >
> > or through the debugfs interface
> >
> > echo 1 > /debug/readahead/stats_enable
>
> It's unfortunate that these two things have different names.
Yes unfortunately.
> I'd have thought that the debugfs knob was sufficient - no need for the
> boot option.
The boot option intents to catch the boot time readaheads.
However it's not that big deal, I'll drop the boot option.
> > The added overheads are two readahead_stats() calls per readahead.
> > Which is trivial costs unless there are concurrent random reads on
> > super fast SSDs, which may lead to cache bouncing when updating the
> > global ra_stats[][]. Considering that normal users won't need this
> > except when debugging performance problems, it's disabled by default.
> > So it looks reasonable to keep this debug code simple rather than trying
> > to improve its scalability.
>
> I may be wrong, but I don't think the CPU cost of this code matters a
> lot. People will rarely turn it on and disk IO is a lot slower than
> CPU actions and it's waaaaaaay more important to get high-quality info
> about readahead than it is to squeeze out a few CPU cycles.
Agreed in general.
> > @@ -51,6 +62,182 @@ EXPORT_SYMBOL_GPL(file_ra_state_init);
> >
> > #define list_to_page(head) (list_entry((head)->prev, struct page, lru))
> >
> > +#ifdef CONFIG_READAHEAD_STATS
> > +#include <linux/seq_file.h>
> > +#include <linux/debugfs.h>
> > +
> > +static u32 readahead_stats_enable __read_mostly;
> > +
> > +static int __init config_readahead_stats(char *str)
> > +{
> > + int enable = 1;
> > + get_option(&str, &enable);
> > + readahead_stats_enable = enable;
> > + return 0;
> > +}
> > +early_param("readahead_stats", config_readahead_stats);
>
> Why use early_param() rather than plain old __setup()?
Heh it's a no-brain copy from other code ;)
Anyway, the readahead_stats boot parameter will be dropped.
> > +enum ra_account {
> > + /* number of readaheads */
> > + RA_ACCOUNT_COUNT, /* readahead request */
> > + RA_ACCOUNT_EOF, /* readahead request covers EOF */
> > + RA_ACCOUNT_CHIT, /* readahead request covers some cached pages */
>
> I don't like chit :) "cache_hit" would be better. Or just "hit".
Yeah it's not good. I renamed it to RA_ACCOUNT_CACHE_HIT.
> > + RA_ACCOUNT_ASIZE, /* readahead async size */
Also renamed that to RA_ACCOUNT_ASYNC_SIZE.
> > + RA_ACCOUNT_ACTUAL, /* readahead actual IO size */
> > + /* end mark */
> > + RA_ACCOUNT_MAX,
> > +};
> > +
> >
> > ...
> >
> > +static void readahead_event(struct address_space *mapping,
> > + pgoff_t offset,
> > + unsigned long req_size,
> > + unsigned int ra_flags,
> > + pgoff_t start,
> > + unsigned int size,
> > + unsigned int async_size,
> > + unsigned int actual)
> > +{
> > +#ifdef CONFIG_READAHEAD_STATS
> > + if (readahead_stats_enable) {
> > + readahead_stats(mapping, offset, req_size, ra_flags,
> > + start, size, async_size, actual);
> > + readahead_stats(mapping, offset, req_size,
> > + RA_PATTERN_ALL << READAHEAD_PATTERN_SHIFT,
> > + start, size, async_size, actual);
> > + }
> > +#endif
> > +}
>
> The stub should be inlined, methinks. The overhead of evaluating and
> preparing eight arguments is significant. I don't think the compiler
> is yet smart enough to save us.
The parameter list actually becomes even out of control when doing the
bit fields:
+ readahead_event(mapping, offset, req_size,
+ ra->pattern, ra->for_mmap, ra->for_metadata,
+ ra->start + ra->size >= eof,
+ ra->start, ra->size, ra->async_size, actual);
So I end up passing file_ra_state around. The added cost is, I'll have
to dynamically create a file_ra_state for the fadvise case, which
should be acceptable since it's a cold path.
> >
> > ...
> >
> > --- linux-next.orig/Documentation/kernel-parameters.txt 2011-11-21 17:08:38.000000000 +0800
> > +++ linux-next/Documentation/kernel-parameters.txt 2011-11-21 17:08:51.000000000 +0800
> > @@ -2251,6 +2251,12 @@ bytes respectively. Such letter suffixes
> > This default max readahead size may be overrode
> > in some cases, notably NFS, btrfs and software RAID.
> >
> > + readahead_stats[=0|1]
> > + Enable/disable readahead stats accounting.
> > +
> > + It's also possible to enable/disable it after boot:
> > + echo 1 > /sys/kernel/debug/readahead/stats_enable
>
> Can the current setting be read back?
Yes. This is possible:
echo 0 > /sys/kernel/debug/readahead/stats_enable
Thanks,
Fengguang
next prev parent reply other threads:[~2011-11-29 3:23 UTC|newest]
Thread overview: 49+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-11-21 9:18 [PATCH 0/8] readahead stats/tracing, backwards prefetching and more Wu Fengguang
2011-11-21 9:18 ` [PATCH 1/8] block: limit default readahead size for small devices Wu Fengguang
2011-11-21 10:00 ` Christoph Hellwig
2011-11-21 11:24 ` Wu Fengguang
2011-11-21 12:47 ` Andi Kleen
2011-11-21 14:46 ` Jeff Moyer
2011-11-21 22:52 ` Andrew Morton
2011-11-22 14:23 ` Jeff Moyer
2011-11-23 12:18 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 2/8] readahead: make default readahead size a kernel parameter Wu Fengguang
2011-11-21 10:01 ` Christoph Hellwig
2011-11-21 11:35 ` Wu Fengguang
2011-11-24 22:28 ` Jan Kara
2011-11-25 0:36 ` Dave Chinner
2011-11-28 2:39 ` Wu Fengguang
2011-11-30 13:04 ` Christian Ehrhardt
2011-11-30 13:29 ` Wu Fengguang
2011-11-30 16:09 ` Jan Kara
2011-11-21 13:16 ` Namhyung Kim
2011-11-21 13:24 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 3/8] readahead: replace ra->mmap_miss with ra->ra_flags Wu Fengguang
2011-11-21 11:04 ` Steven Whitehouse
2011-11-21 11:42 ` Wu Fengguang
2011-11-21 23:01 ` Andrew Morton
2011-11-23 12:47 ` Wu Fengguang
2011-11-23 20:31 ` Andrew Morton
2011-11-29 3:42 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 4/8] readahead: record readahead patterns Wu Fengguang
2011-11-21 23:19 ` Andrew Morton
2011-11-29 2:40 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 5/8] readahead: add /debug/readahead/stats Wu Fengguang
2011-11-21 14:17 ` Andi Kleen
2011-11-22 14:14 ` Wu Fengguang
2011-11-21 23:29 ` Andrew Morton
2011-11-21 23:32 ` Andi Kleen
2011-11-29 3:23 ` Wu Fengguang [this message]
2011-11-29 4:49 ` Andrew Morton
2011-11-29 6:41 ` Wu Fengguang
2011-11-29 12:29 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 6/8] readahead: add debug tracing event Wu Fengguang
2011-11-21 14:01 ` Steven Rostedt
2011-11-21 9:18 ` [PATCH 7/8] readahead: basic support for backwards prefetching Wu Fengguang
2011-11-21 23:33 ` Andrew Morton
2011-11-29 3:08 ` Wu Fengguang
2011-11-21 9:18 ` [PATCH 8/8] readahead: dont do start-of-file readahead after lseek() Wu Fengguang
2011-11-21 23:36 ` Andrew Morton
2011-11-22 14:18 ` Wu Fengguang
2011-11-21 9:56 ` [PATCH 0/8] readahead stats/tracing, backwards prefetching and more Christoph Hellwig
2011-11-21 12:00 ` Wu Fengguang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20111129032323.GC19506@localhost \
--to=fengguang.wu@intel.com \
--cc=a.p.zijlstra@chello.nl \
--cc=akpm@linux-foundation.org \
--cc=andi@firstfloor.org \
--cc=jens.axboe@oracle.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mingo@elte.hu \
--cc=riel@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).