linux-fsdevel.vger.kernel.org archive mirror
From: Mel Gorman <mel@csn.ul.ie>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Christoph Hellwig <hch@infradead.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Dave Chinner <david@fromorbit.com>,
	Chris Mason <chris.mason@oracle.com>,
	Nick Piggin <npiggin@suse.de>, Rik van Riel <riel@redhat.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: [PATCH 14/14] fs,xfs: Allow kswapd to writeback pages
Date: Mon, 5 Jul 2010 15:16:40 +0100
Message-ID: <20100705141640.GD13780@csn.ul.ie>
In-Reply-To: <20100702152643.36019b4e.kamezawa.hiroyu@jp.fujitsu.com>

On Fri, Jul 02, 2010 at 03:26:43PM +0900, KAMEZAWA Hiroyuki wrote:
> On Thu, 1 Jul 2010 11:30:32 +0100
> Mel Gorman <mel@csn.ul.ie> wrote:
> > > memcg shouldn't
> > > depends on it. If so, memcg should depends on some writeback-thread (as kswapd).
> > > ok.
> > > 
> > > Then, my concern here is that which kswapd we should wake up and how it can stop.
> > 
> > And also what the consequences are of kswapd being occupied with containers
> > instead of the global lists for a time.
> > 
>
> yes, we may have to add a thread or workqueue for memcg for isolating workloads.
> 

Possibly, and I would imagine the closer it is to kswapd behaviour the
better, but I must warn that I am not very familiar with the behaviour of
large numbers of memcgs entering reclaim.

> > A slightly greater concern is that clean pages can be temporarily "lost"
> > on the cleaning list. If a direct reclaimer moves pages to the LRU_CLEANING
> > list, it's no longer considering those pages even if a flusher thread
> > happened to clean those pages before kswapd had a chance. Let's say under
> > heavy memory pressure a lot of pages are being dirtied and encountered on
> > the LRU list. They move to LRU_CLEANING where dirty balancing starts making
> > sure they get cleaned but are no longer being reclaimed.
> > 
> > Of course, I might be wrong but it's not a trivial direction to take.
> > 
> 
> I hope dirty_ratio et al. may help us. But I agree this "hiding" can cause
> issue.
> IIRC, someone wrote a patch to prevent too many threads enter vmscan..
> such kinds of work may be necessary.
> 

Using systemtap, I have found that, in global reclaim at least, the ratio of
dirty to clean pages is not a problem. What does appear to be a problem is
that pages are reaching the end of the inactive file list while still dirty.
I haven't formulated a theory as to why yet; maybe it's because dirty
balancing is cleaning new pages first? Right now, I believe dirty_ratio is
working as expected but old dirty pages are a problem.

> > > <SNIP>
> > > @@ -2275,7 +2422,9 @@ static int kswapd(void *p)
> > >  		prepare_to_wait(&pgdat->kswapd_wait, &wait, TASK_INTERRUPTIBLE);
> > >  		new_order = pgdat->kswapd_max_order;
> > >  		pgdat->kswapd_max_order = 0;
> > > -		if (order < new_order) {
> > > +		if (need_to_cleaning_node(pgdat)) {
> > > +			launder_pgdat(pgdat);
> > > +		} else if (order < new_order) {
> > >  			/*
> > >  			 * Don't sleep if someone wants a larger 'order'
> > >  			 * allocation
> > 
> > I see the direction you are thinking of but I have big concerns about clean
> > pages getting delayed for too long on the LRU_CLEANING pages before kswapd
> > puts them back in the right place. I think a safer direction would be for
> > memcg people to investigate Andrea's "switch stack" suggestion.
> > 
>
> Hmm, I may have to consider that. My concern is that IRQ's switch-stack works
> well just because no-task-switch in IRQ routine. (I'm sorry if I misunderstand.)
> 
> One possibility for memcg will be limit the number of reclaimers who can use
> __GFP_FS and use shared stack per cpu per memcg.
> 
> Hmm. yet another per-memcg memory shrinker may sound good. 2 years ago, I wrote
> a patch to do high-low-watermark memory shrinker thread for memcg.
>   
>   - limit
>   - high
>   - low
> 
> start memory reclaim/writeback when usage exceeds "high" and stop when it is
> below "low". Implementing this with thread pool can be a choice.
> 

Indeed, maybe something like a kswapd-memcg thread that is shared between
a configurable number of containers?
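
For illustration only, the high/low watermark behaviour quoted above could be
sketched in userspace C as below. This is not taken from any posted patch:
struct memcg_wmark, memcg_wmark_init() and memcg_wmark_update() are
hypothetical names, and the sketch only shows the hysteresis (start above
"high", keep going until below "low"), not who does the reclaim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-group state; not a real kernel structure. */
struct memcg_wmark {
	size_t limit;		/* hard limit for the group */
	size_t high;		/* background reclaim starts above this */
	size_t low;		/* background reclaim stops below this */
	bool reclaiming;	/* is background reclaim currently active? */
};

static void memcg_wmark_init(struct memcg_wmark *w, size_t limit,
			     size_t high, size_t low)
{
	/* The scheme only makes sense if low <= high <= limit. */
	assert(low <= high && high <= limit);
	w->limit = limit;
	w->high = high;
	w->low = low;
	w->reclaiming = false;
}

/*
 * Feed in the current usage and get back whether the background
 * reclaimer should be running. Between "low" and "high" the previous
 * decision is kept, which is what stops the thread from flapping.
 */
static bool memcg_wmark_update(struct memcg_wmark *w, size_t usage)
{
	if (!w->reclaiming && usage > w->high)
		w->reclaiming = true;
	else if (w->reclaiming && usage < w->low)
		w->reclaiming = false;
	return w->reclaiming;
}
```

A shared kswapd-memcg style thread would presumably call something like
memcg_wmark_update() for each container it serves and only do work for those
where it returns true.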

> 
> > In the meantime for my own series, memcg now treats dirty pages similar to
> > lumpy reclaim. It asks flusher threads to clean pages but stalls waiting
> > for those pages to be cleaned for a time. This is an untested patch on top
> > of the current series.
> > 
> 
> Wow...Doesn't this make memcg too slow ?

It depends heavily on how often dirty pages are being written back by direct
reclaim. It's not ideal but stalling briefly is better than crashing.
Ideally, the number of dirty pages encountered by direct reclaim would
be so small that it wouldn't matter, so I'm looking into that.

> Anyway, memcg should kick flusher
> threads..or something, needs other works, too.
> 

With this patch, the flusher threads get kicked when direct reclaim encounters
pages it cannot clean.
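
As a toy model of that behaviour (userspace C, not the patch itself), the
sketch below has direct reclaim skip dirty pages, kick a flusher and retry a
bounded number of times. Everything here is an assumption made for
illustration: struct toy_page, toy_kick_flusher(), toy_direct_reclaim(), the
per-kick budget and the max_stalls bound are hypothetical, and a "stall" is
modelled as simply letting the flusher run once.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page state for the toy model. */
struct toy_page {
	bool dirty;
	bool reclaimed;
};

/* Stand-in for a flusher thread: cleans at most 'budget' dirty pages. */
static size_t toy_kick_flusher(struct toy_page *pages, size_t n, size_t budget)
{
	size_t cleaned = 0;

	for (size_t i = 0; i < n && cleaned < budget; i++) {
		if (!pages[i].reclaimed && pages[i].dirty) {
			pages[i].dirty = false;
			cleaned++;
		}
	}
	return cleaned;
}

/*
 * Direct reclaim never writes a page itself. It reclaims clean pages,
 * and if it met dirty ones it kicks the flusher and tries again, up to
 * 'max_stalls' times. Returns the number of pages reclaimed.
 */
static size_t toy_direct_reclaim(struct toy_page *pages, size_t n,
				 size_t budget, int max_stalls)
{
	size_t reclaimed = 0;

	for (int pass = 0; pass <= max_stalls; pass++) {
		size_t nr_dirty = 0;

		for (size_t i = 0; i < n; i++) {
			if (pages[i].reclaimed)
				continue;
			if (pages[i].dirty) {
				nr_dirty++;	/* cannot clean it here */
				continue;
			}
			pages[i].reclaimed = true;
			reclaimed++;
		}

		if (nr_dirty == 0)
			break;			/* nothing left to wait for */
		if (pass < max_stalls)
			toy_kick_flusher(pages, n, budget);
	}
	return reclaimed;
}
```

With max_stalls set to 0 this degenerates to "skip every dirty page", which
is the case where reclaim can fail to make progress; a small non-zero bound
trades some latency for a better chance of freeing pages.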

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab


