From: Andrew Morton <akpm@linux-foundation.org>
To: Wu Fengguang <fengguang.wu@intel.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
<linux-fsdevel@vger.kernel.org>, Ingo Molnar <mingo@elte.hu>,
Jens Axboe <jens.axboe@oracle.com>,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Rik van Riel <riel@redhat.com>,
LKML <linux-kernel@vger.kernel.org>,
Andi Kleen <andi@firstfloor.org>
Subject: Re: [PATCH 4/8] readahead: record readahead patterns
Date: Mon, 21 Nov 2011 15:19:19 -0800
Message-ID: <20111121151919.4b76a475.akpm@linux-foundation.org>
In-Reply-To: <20111121093846.510441032@intel.com>
On Mon, 21 Nov 2011 17:18:23 +0800
Wu Fengguang <fengguang.wu@intel.com> wrote:
> Record the readahead pattern in ra_flags and extend the ra_submit()
> parameters, to be used by the next readahead tracing/stats patches.
>
> 7 patterns are defined:
>
> pattern                  readahead for
> -----------------------------------------------------------
> RA_PATTERN_INITIAL       start-of-file read
> RA_PATTERN_SUBSEQUENT    trivial sequential read
> RA_PATTERN_CONTEXT       interleaved sequential read
> RA_PATTERN_OVERSIZE      oversize read
> RA_PATTERN_MMAP_AROUND   mmap fault
> RA_PATTERN_FADVISE       posix_fadvise()
> RA_PATTERN_RANDOM        random read
It would be useful to spell out in full detail what an "interleaved
sequential read" is, and why a read is considered "oversized", etc.
The 'enum readahead_pattern' definition site would be a good place for
this.
> Note that random reads will be recorded in file_ra_state now.
> This won't deteriorate cache bouncing because the ra->prev_pos update
> in do_generic_file_read() already pollutes the data cache, and
> filemap_fault() will stop calling into us after MMAP_LOTSAMISS.
>
> --- linux-next.orig/include/linux/fs.h 2011-11-20 20:10:48.000000000 +0800
> +++ linux-next/include/linux/fs.h 2011-11-20 20:18:29.000000000 +0800
> @@ -951,6 +951,39 @@ struct file_ra_state {
>
> /* ra_flags bits */
> #define READAHEAD_MMAP_MISS 0x000003ff /* cache misses for mmap access */
> +#define READAHEAD_MMAP 0x00010000
Why leave a gap?
And what is READAHEAD_MMAP anyway?
> +#define READAHEAD_PATTERN_SHIFT 28
Why 28?
> +#define READAHEAD_PATTERN 0xf0000000
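
One way to make the layout self-explanatory would be to derive the mask from a named field width, so that 28 is visibly "32 minus 4 bits". The following is only a sketch under assumptions: READAHEAD_PATTERN_BITS is a hypothetical name that does not appear in the patch, ra_flags is assumed to be a 32-bit unsigned int, and the other two values are copied from the hunk above. The compile-time checks confirm that the derived numbers match the patch and that the fields do not overlap.

```c
#include <assert.h>	/* static_assert (C11) */

/* Values copied from the quoted hunk. */
#define READAHEAD_MMAP_MISS	0x000003ffu	/* bits 0-9: mmap miss counter */
#define READAHEAD_MMAP		0x00010000u	/* bit 16 */

/* Hypothetical: name the field width instead of hard-coding 28.
 * Assumes ra_flags is a 32-bit unsigned int. */
#define READAHEAD_PATTERN_BITS	4
#define READAHEAD_PATTERN_SHIFT	(32 - READAHEAD_PATTERN_BITS)
#define READAHEAD_PATTERN \
	(((1u << READAHEAD_PATTERN_BITS) - 1) << READAHEAD_PATTERN_SHIFT)

/* The derived values are the same numbers the patch hard-codes. */
static_assert(READAHEAD_PATTERN_SHIFT == 28, "pattern shift changed");
static_assert(READAHEAD_PATTERN == 0xf0000000u, "pattern mask changed");

/* The three fields must not share any bit. */
static_assert((READAHEAD_MMAP_MISS & READAHEAD_MMAP) == 0,
	      "miss counter overlaps READAHEAD_MMAP");
static_assert((READAHEAD_PATTERN &
	       (READAHEAD_MMAP_MISS | READAHEAD_MMAP)) == 0,
	      "pattern field overlaps the low flags");
```

With checks like these, renumbering the flags fails the build instead of silently corrupting the pattern field.
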
> +
> +/*
> + * Which policy makes decision to do the current read-ahead IO?
> + */
> +enum readahead_pattern {
> + RA_PATTERN_INITIAL,
> + RA_PATTERN_SUBSEQUENT,
> + RA_PATTERN_CONTEXT,
> + RA_PATTERN_MMAP_AROUND,
> + RA_PATTERN_FADVISE,
> + RA_PATTERN_OVERSIZE,
> + RA_PATTERN_RANDOM,
> + RA_PATTERN_ALL, /* for summary stats */
> + RA_PATTERN_MAX
> +};
Again, the behaviour is all undocumented. I see from the code that
multiple flags can be set at the same time. So afaict a file can be
marked RANDOM and SUBSEQUENT at the same time, which seems oxymoronic.
This reader wants to know what the implications of this are - how the
code chooses, prioritises and acts. But this code doesn't tell me.
> +static inline unsigned int ra_pattern(unsigned int ra_flags)
> +{
> + unsigned int pattern = ra_flags >> READAHEAD_PATTERN_SHIFT;
OK, no masking is needed because the code silently assumes that arg
`ra_flags' came out of an ra_state.ra_flags and it also silently
assumes that no higher bits are used in ra_state.ra_flags.
That's a bit of a handgrenade - if someone redoes the flags
enumeration, the code will explode.
> + return min_t(unsigned int, pattern, RA_PATTERN_ALL);
> +}
<scratches head>
What the heck is that min_t() doing in there?
> +static inline void ra_set_pattern(struct file_ra_state *ra,
> + unsigned int pattern)
> +{
> + ra->ra_flags = (ra->ra_flags & ~READAHEAD_PATTERN) |
> + (pattern << READAHEAD_PATTERN_SHIFT);
> +}
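
To see how the two helpers behave together, here is a small userspace sketch. Its assumptions: struct file_ra_state is reduced to the single ra_flags member, min_t() is replaced by a plain conditional with the same effect, and a mask is added to the getter. The enum and macro values are copied from the quoted hunk. As quoted, ra_set_pattern() clears the whole 4-bit field before storing the new value, so in this sketch a later call replaces the earlier pattern rather than OR-ing into it. The clamp maps any stored value above RA_PATTERN_ALL onto RA_PATTERN_ALL. The patch does not say why; one possible reason (an assumption) is to keep the result safe to use as an index into an array of RA_PATTERN_MAX entries.

```c
#include <assert.h>	/* static_assert (C11) */

/* Values copied from the quoted hunk. */
#define READAHEAD_MMAP_MISS	0x000003ffu
#define READAHEAD_PATTERN_SHIFT	28
#define READAHEAD_PATTERN	0xf0000000u

enum readahead_pattern {
	RA_PATTERN_INITIAL,
	RA_PATTERN_SUBSEQUENT,
	RA_PATTERN_CONTEXT,
	RA_PATTERN_MMAP_AROUND,
	RA_PATTERN_FADVISE,
	RA_PATTERN_OVERSIZE,
	RA_PATTERN_RANDOM,
	RA_PATTERN_ALL,		/* for summary stats */
	RA_PATTERN_MAX
};

/* Reduced stand-in for the kernel structure: only the member used here. */
struct file_ra_state {
	unsigned int ra_flags;
};

/* Every enum value must fit into the 4-bit pattern field. */
static_assert((unsigned int)(RA_PATTERN_MAX - 1) <=
	      (READAHEAD_PATTERN >> READAHEAD_PATTERN_SHIFT),
	      "enum readahead_pattern no longer fits in the pattern field");

static inline unsigned int ra_pattern(unsigned int ra_flags)
{
	unsigned int pattern =
		(ra_flags & READAHEAD_PATTERN) >> READAHEAD_PATTERN_SHIFT;

	/* Plain-C equivalent of min_t(unsigned int, pattern, RA_PATTERN_ALL). */
	return pattern < (unsigned int)RA_PATTERN_ALL ?
		pattern : (unsigned int)RA_PATTERN_ALL;
}

static inline void ra_set_pattern(struct file_ra_state *ra,
				  unsigned int pattern)
{
	/* Clear the old pattern, keep all other bits, store the new one. */
	ra->ra_flags = (ra->ra_flags & ~READAHEAD_PATTERN) |
		       (pattern << READAHEAD_PATTERN_SHIFT);
}
```

Setting RA_PATTERN_RANDOM and then RA_PATTERN_SUBSEQUENT on the same state leaves only SUBSEQUENT recorded, and the low READAHEAD_MMAP_MISS bits are untouched by either call.
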
>
> /*
> * Don't do ra_flags++ directly to avoid possible overflow:
>
> ...
>