linux-fsdevel.vger.kernel.org archive mirror
From: Wu Fengguang <fengguang.wu@intel.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Hisashi Hifumi <hifumi.hisashi@oss.ntt.co.jp>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"jens.axboe@oracle.com" <jens.axboe@oracle.com>
Subject: Re: [PATCH] readahead:add blk_run_backing_dev
Date: Wed, 27 May 2009 12:36:01 +0800
Message-ID: <20090527043601.GA26361@localhost>
In-Reply-To: <20090527130358.689C.A69D9226@jp.fujitsu.com>

On Wed, May 27, 2009 at 12:06:12PM +0800, KOSAKI Motohiro wrote:
> > > Ah.  So it's likely to be some strange interaction with the RAID setup.
> > 
> > The normal case is, if page N becomes uptodate at time T(N), then
> > T(N) <= T(N+1) holds. But for RAID, the data arrival time depends on
> > the runtime status of the individual disks, which breaks that formula. So
> > in do_generic_file_read(), just after submitting the async readahead IO
> > request, the current page may well be uptodate, so the page won't be locked,
> > and the block device won't be implicitly unplugged:
> 
> Hifumi-san, Can you get blktrace data and confirm Wu's assumption?

To make the reasoning more obvious:

Assume we have just submitted a readahead IO request for pages N ~ N+M. Then
        T(N) <= T(N+1)
        T(N) <= T(N+2)
        T(N) <= T(N+3)
        ...
        T(N) <= T(N+M)   (M = readahead size)
So if the reader is going to block on any page in the above chunk,
it will block on page N first.

With RAID (and to some degree NFS) there is no strict ordering,
so the reader is more likely to block on some random page.

In the first case the effective async_size = M; in the second case
the effective async_size <= M. The larger the async_size, the deeper the
readahead pipeline, and hence the more low-level IO latency is hidden
from the application.
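
The argument above can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not kernel code: the helper names
first_blocking_page() and effective_async_size() are hypothetical, t[i]
stands for T(N+i), and the reader is assumed to inspect the whole chunk at a
single instant `now`. The effective async_size is taken to be the number of
pages still ahead of the page the reader blocks on.

```c
#include <stddef.h>

/*
 * Hypothetical model: t[i] is the time at which page N+i of a readahead
 * chunk becomes uptodate, for i = 0..m (so the chunk has m + 1 pages).
 *
 * Returns the index of the first page that a sequential reader arriving
 * at time `now` would block on, or m + 1 if every page is already uptodate.
 */
size_t first_blocking_page(const unsigned long *t, size_t m, unsigned long now)
{
	size_t i;

	for (i = 0; i <= m; i++)
		if (t[i] > now)		/* page N+i is not uptodate yet */
			return i;
	return m + 1;
}

/*
 * Assumed definition: the pages of the chunk that lie ahead of the page
 * the reader blocks on. Returns 0 when the reader does not block at all.
 */
size_t effective_async_size(const unsigned long *t, size_t m, unsigned long now)
{
	size_t k = first_blocking_page(t, m, now);

	return k > m ? 0 : m - k;
}
```

With in-order completion times {10, 11, 12, 13, 14} and now = 5, the reader
blocks on index 0 (page N) and the effective async_size is M = 4. With
made-up RAID-like times {3, 12, 4, 13, 5}, where one disk answers early and
the other late, the same reader blocks on index 1 and only 3 pages remain
ahead of it.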

Thanks,
Fengguang

> 
> > 
> >                 if (PageReadahead(page))
> >                         page_cache_async_readahead();
> >                 if (!PageUptodate(page))
> >                         goto page_not_up_to_date;
> >                 //...
> > page_not_up_to_date:
> >                 lock_page_killable(page);
> > 
> > 
> > Therefore explicit unplugging can help, so
> > 
> >         Acked-by: Wu Fengguang <fengguang.wu@intel.com> 
> > 
> > The only question is, shall we avoid the double unplug by doing this?
> > 
> > ---
> >  mm/readahead.c |   10 ++++++++++
> >  1 file changed, 10 insertions(+)
> > 
> > --- linux.orig/mm/readahead.c
> > +++ linux/mm/readahead.c
> > @@ -490,5 +490,15 @@ page_cache_async_readahead(struct addres
> >  
> >  	/* do read-ahead */
> >  	ondemand_readahead(mapping, ra, filp, true, offset, req_size);
> > +
> > +	/*
> > +	 * Normally the current page is !uptodate and lock_page() will be
> > +	 * immediately called to implicitly unplug the device. However this
> > +	 * is not always true for RAID configurations, where data does not
> > +	 * arrive strictly in submission order. In this case we need to
> > +	 * explicitly kick off the IO.
> > +	 */
> > +	if (PageUptodate(page))
> > +		blk_run_backing_dev(mapping->backing_dev_info, NULL);
> >  }
> >  EXPORT_SYMBOL_GPL(page_cache_async_readahead);
> 
> 

