From: Andrea Arcangeli <aarcange@redhat.com>
To: Chris Mason <chris.mason@oracle.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"Loke, Chetan" <Chetan.Loke@netscout.com>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Andreas Dilger <adilger@dilger.ca>, Jan Kara <jack@suse.cz>,
	Mike Snitzer <snitzer@redhat.com>,
	linux-scsi@vger.kernel.org, neilb@suse.de, dm-devel@redhat.com,
	Christoph Hellwig <hch@infradead.org>,
	linux-mm@kvack.org, Jeff Moyer <jmoyer@redhat.com>,
	Wu Fengguang <fengguang.wu@gmail.com>,
	Boaz Harrosh <bharrosh@panasas.com>,
	linux-fsdevel@vger.kernel.org, lsf-pc@lists.linux-foundation.org,
	"Darrick J.Wong" <djwong@us.ibm.com>
Subject: Re: [Lsf-pc] [dm-devel]  [LSF/MM TOPIC] a few storage topics
Date: Wed, 25 Jan 2012 23:46:14 +0100	[thread overview]
Message-ID: <20120125224614.GM30782@redhat.com> (raw)
In-Reply-To: <20120125200613.GH15866@shiny>

On Wed, Jan 25, 2012 at 03:06:13PM -0500, Chris Mason wrote:
> We can talk about scaling up how big the RA windows get on their own,
> but if userland asks for 1MB, we don't have to worry about futile RA, we
> just have to make sure we don't oom the box trying to honor 1MB reads
> from 5000 different procs.

:) that's for sure if read() has a 1M buffer as destination. However,
even cp /dev/sda reads/writes through a 32kb buffer, so it's not so
common to read in 1M buffers.
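
(For the record, the userland side is more or less the loop below, so
the kernel rarely sees large reads whatever the readahead window is; a
sketch, not coreutils' actual code, with error and short-write
handling omitted:)

#include <unistd.h>

/*
 * cp-style copy loop: the data moves through a small fixed buffer,
 * so each read() the kernel sees is only 32k, regardless of how big
 * the readahead window is.
 */
static void copy_fd(int in_fd, int out_fd)
{
	char buf[32 * 1024];
	ssize_t n;

	while ((n = read(in_fd, buf, sizeof(buf))) > 0)
		write(out_fd, buf, n);
}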

But I would also prefer to stay on the simple side (on a side note, I
think we've already run out of page flags on 32bit, since I had to
nuke PG_buddy).

Overall I think the risk of the pages being evicted before they can
be copied to userland is quite minor. On a 16G system, 100 readers
all hitting the disk at the same time with 1M of readahead each would
still only create 100M of memory pressure... So it'd surely be ok:
100M is less than what kswapd always keeps free, for example. Think
of a 4TB system. Especially since a fixed 128k has been ok so far on
a 1G system.

If we really want to be more dynamic than a setting at boot depending
on ram size, we could limit it to a fraction of freeable memory
(using similar math to determine_dirtyable_memory, maybe recomputing
it over time, but not too frequently, to reduce the overhead). If
there's no freeable memory, keep it low. If that math says 1G is
freeable (and we assume the readahead hit rate is near 100%), raise
the maximum readahead to 1M even if the total ram is only 1G. So we
allow up to 1000 readers before we even start recycling readahead
pages.
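
Roughly something like this (a sketch only: determine_dirtyable_memory()
is the real helper in mm/page-writeback.c, but ra_max_pages,
ra_update_max() and the ~0.1% ratio are made up for illustration):

/*
 * Sketch, not real kernel code: scale the readahead cap to a
 * fraction of freeable memory, recomputed from time to time rather
 * than on every read to keep the overhead down.
 */
static unsigned long ra_max_pages = 32;	/* 128k with 4k pages */

static void ra_update_max(void)
{
	unsigned long freeable = determine_dirtyable_memory();

	/*
	 * ~0.1% of freeable memory, clamped between 128k and 1M:
	 * with 1G freeable this gives 256 pages = 1M.
	 */
	ra_max_pages = clamp(freeable / 1024, 32UL, 256UL);
}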

I doubt the complexity of tracking exactly how many pages get
recycled before they're copied to userland would be worth it;
besides, it'd be 0% for 99% of systems and workloads.

Way more important is to have feedback on the readahead hits, to be
sure that when readahead is raised to the maximum the hit rate is
near 100%, and to fall back to lower readahead if we don't get that
hit rate. But that's not a VM problem; it's purely a readahead issue.
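
Concretely, the feedback could look something like this (hypothetical
sketch: struct file_ra_state and its ra_pages field are real, but the
hit/miss accounting and the thresholds are made up):

/*
 * Hypothetical feedback, not the actual readahead code: keep the
 * window large only while nearly every readahead page is consumed
 * before eviction, otherwise back off. The hit/miss counters would
 * have to be accounted in the page cache.
 */
static void ra_adjust_window(struct file_ra_state *ra,
			     unsigned long hits, unsigned long misses)
{
	if (hits >= 100 * (misses + 1))
		/* near-100% hit rate: let the window grow to the max */
		ra->ra_pages = min_t(unsigned long,
				     ra->ra_pages * 2, ra_max_pages);
	else
		/* pages getting evicted unused: halve the window */
		ra->ra_pages = max_t(unsigned long, ra->ra_pages / 2, 8);
}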

The actual VM pressure side of it sounds like a minor issue if the
hit rate of the readahead cache is close to 100%.

The config option is also ok with me, but I think it'd be nicer to
set it at boot depending on ram size (one less option to configure
manually, and zero overhead).
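
The boot-time sizing itself could be as trivial as this (again a
sketch with made-up names; the thresholds are just the numbers
discussed above):

/*
 * Boot-time sizing sketch (hypothetical): pick the default readahead
 * once from the amount of RAM instead of a config option.
 */
static void ra_set_boot_default(unsigned long total_ram_pages)
{
	unsigned long ram_mb = total_ram_pages >> (20 - PAGE_SHIFT);

	/* 128k on a 1G box, scaling up to 1M at 8G of RAM and above */
	ra_max_pages = clamp(ram_mb / 32, 32UL, 256UL);
}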


