From: Dave Chinner <david@fromorbit.com>
To: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>,
	Balbir Singh <balbir@linux.vnet.ibm.com>,
	linux-mm@kvack.org, akpm@linux-foundation.org, npiggin@kernel.dk,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org,
	kamezawa.hiroyu@jp.fujitsu.com, Mel Gorman <mel@csn.ul.ie>,
	Minchan Kim <minchan.kim@gmail.com>
Subject: Re: [PATCH 0/3] Unmapped page cache control (v5)
Date: Mon, 4 Apr 2011 10:19:36 +1000
Message-ID: <20110404001936.GL6957@dastard>
In-Reply-To: <20110403183229.AE4C.A69D9226@jp.fujitsu.com>

On Sun, Apr 03, 2011 at 06:32:16PM +0900, KOSAKI Motohiro wrote:
> > On Fri, Apr 01, 2011 at 10:17:56PM +0900, KOSAKI Motohiro wrote:
> > > > > But, I agree that now we have to concern slightly large VM change perhaps
> > > > > (or perhaps not). Ok, it's good opportunity to fill out some thing.
> > > > > Historically, Linux MM has "free memory are waste memory" policy, and It
> > > > > worked completely fine. But now we have a few exceptions.
> > > > >
> > > > > 1) RT, embedded and finance systems. They really hope to avoid reclaim
> > > > >    latency (ie avoid foreground reclaim completely) and they can accept
> > > > >    to make slightly much free pages before memory shortage.
> > > > 
> > > > In general we need a mechanism to ensure we can avoid reclaim during
> > > > critical sections of application. So some way to give some hints to the
> > > > machine to free up lots of memory (/proc/sys/vm/drop_caches is far too
> > > > drastic) may be useful.
> > > 
> > > Exactly.
> > > I've heard multiple times this request from finance people. And I've also 
> > > heard the same request from bullet train control software people recently.
> > 
[...]
> > Fundamentally, if you just switch off memory reclaim to avoid the
> > latencies involved with direct memory reclaim, then all you'll get
> > instead is ENOMEM because there's no memory available and none will be
> > reclaimed. That's even more fatal for the system than doing reclaim.
> 
> You have two level oversight.
> 
> Firstly, *ALL* RT application need to cooperate applications, kernel, 
> and other various system level daemons. That's no specific issue of 
> this topic. OK, *IF* RT application run egoistic, a system may hang 
> up easily even through a mere simple busy loop, yes. But, who wants to do so?

Sure - that's RT-101. I think I have a good understanding of these
principles after spending 7 years of my life working on wide-area
distributed real-time control systems (think city-scale water and
electricity supply).

> Secondly, You misparsed "avoid direct reclaim" paragraph. We don't talk
> about "avoid direct reclaim even if system memory is no enough", We talk
> about "avoid direct reclaim by preparing before". 

I don't think I misparsed it. I am addressing the "avoid direct
reclaim by preparing before" principle directly. The problem with it
is that just enlarging the free memory pool doesn't guarantee future
allocation success when there are other concurrent allocations
occurring. IOWs, if you don't _reserve_ the free memory for the
critical area in advance then there is no guarantee it will be
available when needed by the critical section.

A simple example is the radix tree node preallocation code, which
guarantees that inserts succeed while holding a spinlock. If just
relying on free memory were sufficient, then GFP_ATOMIC allocations
would be all that is necessary. However, even that isn't sufficient,
as even the GFP_ATOMIC reserved pool can be exhausted by other
concurrent GFP_ATOMIC allocations. Hence preallocation is required
before entering the critical section to guarantee success in all
cases.
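
The preallocate-then-lock pattern can be sketched in userspace C as
below. This is only an illustration under stated assumptions, not the
kernel code: the names (struct locked_list, node_prealloc(),
list_insert_prealloc(), list_insert()) are hypothetical, a linked list
stands in for the radix tree, malloc() stands in for a blocking kernel
allocation, and a C11 atomic_flag stands in for the spinlock. In the
kernel, the corresponding helpers are radix_tree_preload() and
radix_tree_preload_end().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tree node; not a kernel structure. */
struct node {
    int key;
    struct node *next;
};

/* A list protected by a minimal C11 spinlock (illustrative only). */
struct locked_list {
    atomic_flag lock;
    struct node *head;
    size_t count;
};

static void list_init(struct locked_list *l)
{
    atomic_flag_clear(&l->lock);
    l->head = NULL;
    l->count = 0;
}

/*
 * Step 1: allocate outside the critical section. Any blocking or
 * failure happens here, where it can still be handled.
 */
static struct node *node_prealloc(void)
{
    return malloc(sizeof(struct node));
}

/*
 * Step 2: insert under the lock using the node reserved in step 1.
 * No allocation happens while the lock is held, so this cannot fail.
 */
static void list_insert_prealloc(struct locked_list *l, struct node *n,
                                 int key)
{
    assert(n != NULL);
    n->key = key;

    while (atomic_flag_test_and_set_explicit(&l->lock,
                                             memory_order_acquire))
        ; /* spin until the lock is free */
    n->next = l->head;
    l->head = n;
    l->count++;
    atomic_flag_clear_explicit(&l->lock, memory_order_release);
}

/* Returns 0 on success, -1 if the preallocation failed. */
static int list_insert(struct locked_list *l, int key)
{
    struct node *n = node_prealloc();

    if (n == NULL)
        return -1; /* fail before entering the critical section */
    list_insert_prealloc(l, n, key);
    return 0;
}
```

The point of the split is that the only step that can fail or stall
runs before the lock is taken, and the memory it obtained is owned by
the caller, so concurrent allocators cannot take it away.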

And to state the obvious: doing allocation before the critical
section will trigger reclaim if necessary so there is no need to
have the application trigger reclaim.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
