linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: Jan Kara <jack@suse.cz>, Andreas Dilger <adilger@dilger.ca>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"linux-scsi@vger.kernel.org" <linux-scsi@vger.kernel.org>,
	Mike Snitzer <snitzer@redhat.com>,
	"neilb@suse.de" <neilb@suse.de>,
	Christoph Hellwig <hch@infradead.org>,
	"dm-devel@redhat.com" <dm-devel@redhat.com>,
	fengguang.wu@gmail.com, Boaz Harrosh <bharrosh@panasas.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"lsf-pc@lists.linux-foundation.org"
	<lsf-pc@lists.linux-foundation.org>,
	Chris Mason <chris.mason@oracle.com>,
	"Darrick J.Wong" <djwong@us.ibm.com>
Subject: Re: [Lsf-pc] [dm-devel]  [LSF/MM TOPIC] a few storage topics
Date: Tue, 24 Jan 2012 22:08:23 +0100
Message-ID: <20120124210823.GE20650@quack.suse.cz>
In-Reply-To: <x49liowkeax.fsf@segfault.boston.devel.redhat.com>

On Tue 24-01-12 15:59:02, Jeff Moyer wrote:
> Jan Kara <jack@suse.cz> writes:
> > On Tue 24-01-12 15:13:40, Jeff Moyer wrote:
> >> Jan Kara <jack@suse.cz> writes:
> >> 
> >> > On Tue 24-01-12 14:14:14, Jeff Moyer wrote:
> >> >> Chris Mason <chris.mason@oracle.com> writes:
> >> >> 
> >> >> >> All three filesystems use the generic mpages code for reads, so they
> >> >> >> all get the same (bad) I/O patterns.  Looks like we need to fix this up
> >> >> >> ASAP.
> >> >> >
> >> >> > Can you easily run btrfs through the same rig?  We don't use mpages and
> >> >> > I'm curious.
> >> >> 
> >> >> The readahead code was to blame, here.  I wonder if we can change the
> >> >> logic there to not break larger I/Os down into smaller sized ones.
> >> >> Fengguang, doing a dd if=file of=/dev/null bs=1M results in 128K I/Os,
> >> >> when 128KB is the read_ahead_kb value.  Is there any heuristic you could
> >> >> apply to not break larger I/Os up like this?  Does that make sense?
> >> >   Well, not breaking up I/Os would be fairly simple as ondemand_readahead()
> >> > already knows how much do we want to read. We just trim the submitted I/O to
> >> > read_ahead_kb artificially. And that is done so that you don't trash page
> >> > cache (possibly evicting pages you have not yet copied to userspace) when
> >> > there are several processes doing large reads.
> >> 
> >> Do you really think applications issue large reads and then don't use
> >> the data?  I mean, I've seen some bad programming, so I can believe that
> >> would be the case.  Still, I'd like to think it doesn't happen.  ;-)
> >   No, I meant a cache thrashing problem. Suppose that we always readahead
> > as much as user asks and there are say 100 processes each wanting to read 4
> > MB.  Then you need to find 400 MB in the page cache so that all reads can
> > fit.  And if you don't have them, reads for process 50 may evict pages we
> > already preread for process 1, but process one didn't yet get to CPU to
> > copy the data to userspace buffer. So the read becomes wasted.
> 
> Yeah, you're right, cache thrashing is an issue.  In my tests, I didn't
> actually see the *initial* read come through as a full 1MB I/O, though.
> That seems odd to me.
  At first sight, yes. But buffered reading internally works page by page:
it looks at the first page it wants, sees that we don't have it in memory,
submits readahead (hence the 128 KB request), and then waits for that page
to become uptodate. Then, when we come to the end of the preread window
(tripping over the marked page), we submit another chunk of readahead...
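
The sequence above can be sketched as a toy model. This is only an
illustration, not the kernel's logic: the function name, the fixed window
size (no ramp-up), the 4 KB page size, and the choice to mark the last page
of each chunk are all assumptions made for the example.

```python
def simulate_buffered_read(request_kb, read_ahead_kb=128, page_kb=4):
    """Toy model: return the sizes (in KB) of the I/Os one read() submits.

    Assumptions (not taken from the kernel source): the readahead window is
    fixed at read_ahead_kb, the file ends where the request ends, and the
    marked page is the last page of each submitted chunk.
    """
    total_pages = request_kb // page_kb
    window_pages = max(1, read_ahead_kb // page_kb)
    cached = set()      # pages already read into the page cache
    marked = set()      # pages that trigger the next readahead chunk
    io_sizes_kb = []    # one entry per submitted I/O
    next_start = 0      # first page not yet covered by any submitted I/O

    def submit_readahead():
        nonlocal next_start
        n = min(window_pages, total_pages - next_start)
        if n <= 0:
            return      # nothing left to read
        pages = range(next_start, next_start + n)
        cached.update(pages)
        marked.add(pages[-1])
        io_sizes_kb.append(n * page_kb)
        next_start += n

    # The read is processed page by page, whatever size the caller asked for.
    for page in range(total_pages):
        if page not in cached:
            submit_readahead()      # miss: submit I/O, then wait for the page
        if page in marked:
            marked.discard(page)
            submit_readahead()      # tripped over the mark: next chunk
        # (here the page would be copied to the userspace buffer)
    return io_sizes_kb


print(simulate_buffered_read(1024))
# prints [128, 128, 128, 128, 128, 128, 128, 128]
```

In this model a 1 MB read with a 128 KB window never produces a single 1 MB
request. Only raising the window to the request size does, e.g.
simulate_buffered_read(1024, read_ahead_kb=1024) gives one 1024 KB I/O.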

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
