From: Ram Pai <linuxram@us.ibm.com>
To: Fengguang Wu <fengguang.wu@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Suparna Bhattacharya <suparna@in.ibm.com>,
John Tran <jbtran@ca.ibm.com>, Mike Sullivan <mksully@us.ibm.com>,
Chris Mason <mason@suse.com>,
Steven Pratt <slpratt@austin.ibm.com>,
Badari Pulavarty <badari@us.ibm.com>,
Mingming Cao <mcao@us.ibm.com>,
Linux Kernel <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH -mm] readahead: partial sendfile fix
Date: Mon, 12 Feb 2007 11:49:10 -0800 [thread overview]
Message-ID: <1171309751.2891.70.camel@ram.us.ibm.com> (raw)
In-Reply-To: <20070210014049.GA11269@mail.ustc.edu.cn>
On Sat, 2007-02-10 at 09:40 +0800, Fengguang Wu wrote:
> Enable readahead to handle partially done read requests, e.g.
>
> sendfile(188, 1921, [1478592], 19553028) = 37440
> sendfile(188, 1921, [1516032], 19515588) = 28800
> sendfile(188, 1921, [1544832], 19486788) = 37440
> sendfile(188, 1921, [1582272], 19449348) = 14400
> sendfile(188, 1921, [1596672], 19434948) = 37440
> sendfile(188, 1921, [1634112], 19397508) = 37440
>
> In the above strace log,
> - some lighttpd is doing _sequential_ reading
> - every sendfile() returns with only _partial_ work done
>
> page_cache_readahead() expects that if it returns @next_index, it will
> be
> called exactly at @next_index next time. That's not true here. So the
> pattern
> will be falsely recognized as a random read trace.
>
> Also documented in "Linux AIO Performance and Robustness for
> Enterprise
> Workloads" section 3.5:
>
> sendfile(fd, 0, 2GB, fd2) = 8192,
> tells readahead about up to 128KB of the read
> sendfile(fd, 8192, 2GB - 8192, fd2) = 8192,
> tells readahead about 8KB - 132KB of the read
> sendfile(fd, 16384, 2GB - 16384, fd2) = 8192,
> tells readahead about 16KB-140KB of the read
> ...
> This confuses the readahead logic about the I/O pattern which
> appears
> to be 0-128K, 8K-132K, 16K-140K instead of clear sequentiality
> from
> 0-2GB that is really appropriate.
>
> Retry based AIO shares the same read pattern and readahead problem.
> In this case, simply disabling readahead on restarted aio is not a
> good option:
> we still need to call into readahead in the rare case of (req_size >
> ra_max).
The solution you proposed seems kludgy to me. If you determine that the
its a restarted aio, then start reading from where readahead had left
reading from earlier. To me a simple fix is:
- if (unlikely(aio_restarted()))
- next_index = last_index; /* Avoid repeat readahead */
+ if (unlikely(aio_restarted()))
+ next_index = min(prev_index+1, last_index);
No?
RP
>
> Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
> ---
> mm/filemap.c | 3 ---
> mm/readahead.c | 9 +++++++++
> 2 files changed, 9 insertions(+), 3 deletions(-)
>
> --- linux-2.6.20-rc6-mm3.orig/mm/readahead.c
> +++ linux-2.6.20-rc6-mm3/mm/readahead.c
> @@ -581,6 +581,15 @@ page_cache_readahead(struct address_spac
> int sequential;
>
> /*
> + * A previous read request is partially completed,
> + * causing the retried/continued read calls into us
> prematurely.
> + */
> + if (ra->start < offset &&
> + offset < ra->prev_page &&
> + ra->prev_page < ra->ahead_start +
> ra->ahead_size)
> + goto out;
> +
> + /*
> * We avoid doing extra work and bogusly perturbing the
> readahead
> * window expansion logic.
> */
> --- linux-2.6.20-rc6-mm3.orig/mm/filemap.c
> +++ linux-2.6.20-rc6-mm3/mm/filemap.c
> @@ -915,9 +915,6 @@ void do_generic_mapping_read(struct addr
> if (!isize)
> goto out;
>
> - if (unlikely(aio_restarted()))
> - next_index = last_index; /* Avoid repeat readahead */
> -
> end_index = (isize - 1) >> PAGE_CACHE_SHIFT;
> for (;;) {
> struct page *page;
>
next prev parent reply other threads:[~2007-02-12 19:50 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-02-10 1:40 [PATCH -mm] readahead: partial sendfile fix Fengguang Wu
2007-02-10 1:40 ` Fengguang Wu
2007-02-12 19:49 ` Ram Pai [this message]
2007-03-08 10:25 ` Fengguang Wu
2007-03-08 10:25 ` Fengguang Wu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1171309751.2891.70.camel@ram.us.ibm.com \
--to=linuxram@us.ibm.com \
--cc=akpm@linux-foundation.org \
--cc=badari@us.ibm.com \
--cc=fengguang.wu@gmail.com \
--cc=jbtran@ca.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mason@suse.com \
--cc=mcao@us.ibm.com \
--cc=mksully@us.ibm.com \
--cc=slpratt@austin.ibm.com \
--cc=suparna@in.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.