linux-fsdevel.vger.kernel.org archive mirror
From: Dave Chinner <david@fromorbit.com>
To: Chris Mason <chris.mason@oracle.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"Loke, Chetan" <Chetan.Loke@netscout.com>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Andreas Dilger <adilger@dilger.ca>,
	Andrea Arcangeli <aarcange@redhat.com>, Jan Kara <jack@suse.cz>,
	Mike Snitzer <snitzer@redhat.com>,
	linux-scsi@vger.kernel.org, neilb@suse.de, dm-devel@redhat.com,
	Christoph Hellwig <hch@infradead.org>,
	linux-mm@kvack.org, Jeff Moyer <jmoyer@redhat.com>,
	Wu Fengguang <fengguang.wu@gmail.com>,
	Boaz Harrosh <bharrosh@panasas.com>,
	linux-fsdevel@vger.kernel.org, lsf-pc@lists.linux-foundation.org,
	"Darrick J.Wong" <djwong@us.ibm.com>
Subject: Re: [Lsf-pc] [dm-devel]  [LSF/MM TOPIC] a few storage topics
Date: Fri, 27 Jan 2012 09:38:01 +1100
Message-ID: <20120126223801.GA15102@dastard>
In-Reply-To: <20120125200613.GH15866@shiny>

On Wed, Jan 25, 2012 at 03:06:13PM -0500, Chris Mason wrote:
> On Wed, Jan 25, 2012 at 12:37:48PM -0600, James Bottomley wrote:
> > On Wed, 2012-01-25 at 13:28 -0500, Loke, Chetan wrote:
> > > > So there are two separate problems mentioned here.  The first is to
> > > > ensure that readahead (RA) pages are treated as more disposable than
> > > > accessed pages under memory pressure and then to derive a statistic for
> > > > futile RA (those pages that were read in but never accessed).
> > > > 
> > > > The first sounds really like its an LRU thing rather than adding yet
> > > > another page flag.  We need a position in the LRU list for never
> > > > accessed ... that way they're first to be evicted as memory pressure
> > > > rises.
> > > > 
> > > > The second is you can derive this futile readahead statistic from the
> > > > LRU position of unaccessed pages ... you could keep this globally.
> > > > 
> > > > Now the problem: if you trash all unaccessed RA pages first, you end up
> > > > with the situation of say playing a movie under moderate memory
> > > > pressure that we do RA, then trash the RA page then have to re-read to display
> > > > to the user resulting in an undesirable uptick in read I/O.
> > > > 
> > > > Based on the above, it sounds like a better heuristic would be to evict
> > > > accessed clean pages at the top of the LRU list before unaccessed clean
> > > > pages because the expectation is that the unaccessed clean pages will
> > > > be accessed (that's after all, why we did the readahead).  As RA pages age
> > > 
> > > Well, the movie example is one case where evicting unaccessed page may not be the right thing to do. But what about a workload that perform a random one-shot search?
> > > The search was done and the RA'd blocks are of no use anymore. So it seems one solution would hurt another.
> > 
> > Well not really: RA is always wrong for random reads.  The whole purpose
> > of RA is assumption of sequential access patterns.
> 
> Just to jump back, Jeff's benchmark that started this (on xfs and ext4):
> 
> 	- buffered 1MB reads get down to the scheduler in 128KB chunks
> 
> The really hard part about readahead is that you don't know what
> userland wants.  In Jeff's test, he's telling the kernel he wants 1MB
> ios and our RA engine is doing 128KB ios.
> 
> We can talk about scaling up how big the RA windows get on their own,
> but if userland asks for 1MB, we don't have to worry about futile RA, we
> just have to make sure we don't oom the box trying to honor 1MB reads
> from 5000 different procs.

Right - if we know the read request is larger than the RA window,
then we should ignore the RA window and just service the request in
a single bio. Well, at least, in chunks as large as the underlying
device will allow us to build....
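
To make that rule concrete, here is a rough userland model of it. This
is only a sketch, not the kernel's readahead code: split_read(),
ra_window and max_io are hypothetical names, and max_io stands in for
whatever the underlying device will allow in one IO.

```python
def split_read(offset, length, ra_window, max_io):
    """Return the (offset, size) chunks used to service one buffered read.

    Hypothetical model: ra_window is the readahead window size and max_io
    is the largest single IO the device allows, both in bytes.
    """
    if ra_window <= 0 or max_io <= 0:
        raise ValueError("ra_window and max_io must be positive")
    if length <= 0:
        return []

    if length > ra_window:
        # The caller asked for more than the RA window: ignore the window
        # and build IOs as large as the device limit permits.
        chunk = max_io
    else:
        # Small request: stay within the RA window (and the device limit).
        chunk = min(ra_window, max_io)

    chunks = []
    end = offset + length
    while offset < end:
        size = min(chunk, end - offset)
        chunks.append((offset, size))
        offset += size
    return chunks
```

With a 128KB window and a 512KB device limit, a 1MB read comes out as
two 512KB IOs in this model rather than eight 128KB ones. If the device
limit is 1MB or more, it is a single IO.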

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

