From: Jens Axboe <jens.axboe@oracle.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Randy Dunlap <randy.dunlap@oracle.com>,
	linux-kernel@vger.kernel.org, hifumi.hisashi@oss.ntt.co.jp,
	Wu Fengguang <fengguang.wu@intel.com>
Subject: Re: mmotm 2009-06-02-16-11 uploaded (readahead)
Date: Tue, 9 Jun 2009 06:46:19 +0200
Message-ID: <20090609044619.GZ11363@kernel.dk>
In-Reply-To: <20090608213817.999143dd.akpm@linux-foundation.org>

On Mon, Jun 08 2009, Andrew Morton wrote:
> On Tue, 9 Jun 2009 05:59:16 +0200 Jens Axboe <jens.axboe@oracle.com> wrote:
> 
> > ...
> > > Doing a block-specific call from inside page_cache_async_readahead() is
> > > a bit of a layering violation - this may not be a block-backed
> > > filesystem at all.
> > > 
> > > otoh, perhaps blk_run_backing_dev() is wrongly named and defined in the
> > > wrong place.  Perhaps non-block-backed backing_devs want to implement
> > > an unplug-style function too?  In which case the whole thing should be
> > > renamed and moved outside blkdev.h.
> > > 
> > > If we don't want to do that, shouldn't backing_dev_info.unplug* be
> > > wrapped in #ifdef CONFIG_BLOCK?  And wasn't it a layering violation to
> > > put block-specific things into the backing_dev_info?
> > > 
> > > Jens, talk to me!
> > > 
> > > From the readahead POV: does it make sense to call the backing-dev's
> > > "unplug" function even if that isn't a block-based device?  Or was this
> > > just a weird block-device-only performance problem?  Hard to say.
> > 
> > Layering wise, I don't think it's that bad. It would have looked cleaner
> > to do:
> > 
> >         blk_run_address_space(mapping);
> > 
> > instead, but we would still need to make that available outside of
> > CONFIG_BLOCK as well.
> > 
> > What I don't like about the patch is that it's a heuristic, an "I poked
> > this and it made that faster" with nobody really understanding why.
> 
> Well.  I _think_ we understand it.  I'm not sure that we understand why
> it made scst faster though.

That's what I mean, the full effect isn't understood.

> > And
> > it's second guessing the block layer unplugging, so perhaps the real fix
> > should be going on there. Or perhaps it's just fine and this micro
> > optimization just helps this one case and that's great.
> > 
> > So ho humm, not terribly excited about it, but I guess we can shove it
> > in there for testing. But let's please use blk_run_address_space() and
> > add an empty stub for that.
> 
> But blk_anything() shouldn't be in the readahead code - readahead isn't
> specific to block-based devices!
> 
> y:/usr/src/25> egrep "blk|block" mm/readahead.c 
> #include <linux/blkdev.h>
>  * block layer to abandon the readahead if request allocation would block.
>  * force_page_cache_readahead() will ignore queue congestion and will block on
> y:/usr/src/25> 
> 
> 
> From a layering POV we should have some mapping_start_io(address_space
> *) which of course calls blk_run_address_space() if it's block-backed
> and calls <something else> if it's not block-backed.  Problem is, if
> the backing device is, say, NFS then we have no reason to believe that
> starting IO at this time is beneficial to NFS.

Well, if the patch is sane, then it should work for NFS as well. And if
NFS has a backing dev, then it should be the right thing to do. But it
is a bit... all over the map. It's not that different from thinking that
this patch will hurt other block backed cases too. So far we just have
two very different parts that get faster with this, for something as
simple as a sequential read.

blk_run_address_space() SHOULD just be named mapping_start_io() as 1)
that's what it does, and 2) it's not really a block layer function to
begin with.
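
To make the naming point concrete, here is a rough userspace sketch of
what a generic mapping_start_io() could look like. It is only a model
and not kernel code: the structs are cut-down stand-ins for the real
backing_dev_info and address_space, the unplug_calls counter, the
demo_* helpers and the int return value are made up for illustration,
and mapping_start_io() itself is just the name proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; not the real layouts. */
struct page;

struct backing_dev_info {
	/* Optional "start queued I/O now" hook; NULL if the device has none. */
	void (*unplug_io_fn)(struct backing_dev_info *bdi, struct page *page);
	int unplug_calls;	/* hypothetical counter, for the demo only */
};

struct address_space {
	struct backing_dev_info *backing_dev_info;
};

/*
 * Hypothetical generic entry point: kick the backing device if it
 * provides an unplug-style hook, otherwise do nothing.
 * Returns 1 if a hook was called, 0 if not (demo convenience only).
 */
static int mapping_start_io(struct address_space *mapping)
{
	struct backing_dev_info *bdi;

	if (!mapping)
		return 0;
	bdi = mapping->backing_dev_info;
	if (!bdi || !bdi->unplug_io_fn)
		return 0;
	bdi->unplug_io_fn(bdi, NULL);
	return 1;
}

/* Example hook for a block-backed device: just records the call. */
static void demo_unplug(struct backing_dev_info *bdi, struct page *page)
{
	(void)page;
	bdi->unplug_calls++;
}

/* Usage: one mapping with a hook, one without (a backing dev with no unplug). */
static int demo_count_started(void)
{
	struct backing_dev_info blockdev = { demo_unplug, 0 };
	struct backing_dev_info other = { NULL, 0 };
	struct address_space a = { &blockdev }, b = { &other };
	int started = mapping_start_io(&a) + mapping_start_io(&b);

	assert(blockdev.unplug_calls == 1);	/* hook ran exactly once */
	return started;
}
```

Nothing in the wrapper is block-specific: a caller such as the
readahead code only sees the mapping, and a backing dev without a hook
simply turns the call into a no-op.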

> But sure, the world wouldn't end if we put a block-specific IO hint in
> there.  It just isn't quite right.

Fully agree!

-- 
Jens Axboe

