Re: [patch]vmscan: make kswapd use a correct order

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Simon Kirby <sim@hostway.ca>
To: Mel Gorman <mel@csn.ul.ie>
Cc: Minchan Kim <minchan.kim@gmail.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Shaohua Li <shaohua.li@intel.com>, linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [patch]vmscan: make kswapd use a correct order
Date: Fri, 10 Dec 2010 15:42:06 -0800	[thread overview]
Message-ID: <20101210234206.GA30377@hostway.ca> (raw)
In-Reply-To: <20101210113245.GQ20133@csn.ul.ie>

On Fri, Dec 10, 2010 at 11:32:45AM +0000, Mel Gorman wrote:

> On Thu, Dec 09, 2010 at 03:44:52PM -0800, Simon Kirby wrote:
> > On Mon, Dec 06, 2010 at 12:03:42PM +0000, Mel Gorman wrote:
> > 
> > > But there is still potentially two problems here. The first was kswapd
> > > throwing out everything in zone normal. Even when fixed, there is
> > > potentially still too many pages being thrown out. The situation might
> > > be improved but not repaired.
> > 
> > Yes.
> > 
> > > > Let me clarify: On _another_ box, with 2.6.36 but without your patches
> > > > and without as much load or SSD devices, I forced slub to use order-0
> > > > except where order-1 was absolutely necessary (objects > 4096 bytes),
> > > > just to see what impact this had on free memory.  There was a change,
> > > > but still lots of memory left free.  I was trying to avoid confusion by
> > > > posting graphs from different machines, but here is that one just as a
> > > > reference: http://0x.ca/sim/ref/2.6.36/memory_stor25r_week.png
> > > > (I made the slub order adjustment on Tuesday, November 30th.)
> > > > The spikes are actually from mail nightly expunge/purge runs.  It seems
> > > > that minimizing the slub orders did remove the large free spike that
> > > > was happening during mailbox compaction runs (nightly), and overall there
> > > > was a bit more memory used on average, but it definitely didn't "fix" it. 
> > > 
> > > Ok, but it's still evidence that lumpy reclaim is still the problem here. This
> > > should be "fixed" by reclaim/compaction which has less impact and frees
> > > fewer pages than lumpy reclaim. If necessary, I can backport this to 2.6.36
> > > for you to verify. There is little chance the series would be accepted into
> > > -stable but you'd at least know that 2.6.37 or 2.6.38 would behave as expected.
> > 
> > Isn't lumpy reclaim supposed to _improve_ this situation by trying to
> > free contiguous stuff rather than shooting aimlessly until contiguous
> > pages appear? 
> 
> For lower orders like order-1 and order-2, it reclaims randomly before
> using lumpy reclaim as the assumption is that these lower pages free
> naturally.

Hmm.. We were looking were looking at some other servers' munin graphs,
and I seem to notice a correlation between high allocation rates (eg,
heavily loaded servers) and more memory being free.  I am wondering if
the problem isn't the choice of how to reclaim, but more an issue from
concurrent allocation calls.  Because (direct) reclaim isn't protected
from other allocations, it can fight with allocations that split back up
the orders, which might be increasing fragmentation.

The fragmentation and reaching watermarks does seem to be what is causing
a larger amount to stay free, once it _gets_ fragmented...

I was thinking of ways that it could hold pages while reclaiming, and
then free them all and allocate the request under a lock to avoid
colliding with other allocations.  I see shrink_active_list() almost
seems to have something like this with l_hold, but nothing cares about
watermarks down at that level.  The inactive list goes through a separate
routine, and only inactive uses lumpy reclaim.

> > Or is there some other point to it?  If this is the case,
> > maybe the issue is that lumpy reclaim isn't happening soon enough, so it
> > shoots around too much before it tries to look for lumpy stuff. 
> 
> It used to happen sooner but it ran into latency problems.

Latency from writeback or something?  I wonder if it would be worth
trying a cheap patch to try lumpy mode immediately, just to see how
things change.

> > In
> > 2.6.3[67], set_lumpy_reclaim_mode() only sets lumpy mode if sc->order >
> > PAGE_ALLOC_COSTLY_ORDER (>= 4), or if priority < DEF_PRIORITY - 2.
> > 
> > Also, try_to_compact_pages() bails without doing anything when order <=
> > PAGE_ALLOC_COSTLY_ORDER, which is the order I'm seeing problems at.  So,
> > without further chanegs, I don't see how CONFIG_COMPACTION or 2.6.37 will
> > make any difference, unless I'm missing some related 2.6.37 changes.
> 
> There is increasing pressure to use compaction for the lower orders as
> well. This problem is going to be added to the list of justifications :/

I figured perhaps this was skipped due to being expensive, or else why
wouldn't it just always happen for non-zero orders.  I do see a lot of
servers with 200,000 free order-0 pages and almost no order-1 or anything
bigger, so maybe this could help.  I could try to modify the test in
try_to_compact_pages() to if (!order || !may_enter_fs || !may_perform_io).

Simon-

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom policy in Canada: sign http://dissolvethecrtc.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2010-12-10 23:42 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-12-01  3:08 [patch]vmscan: make kswapd use a correct order Shaohua Li
2010-12-01  4:21 ` Minchan Kim
2010-12-01  5:42   ` Shaohua Li
2010-12-01  9:44 ` KOSAKI Motohiro
2010-12-01 15:58   ` Minchan Kim
2010-12-02  0:09     ` KOSAKI Motohiro
2010-12-02  0:29       ` KOSAKI Motohiro
2010-12-02  0:58         ` Minchan Kim
2010-12-02  0:19     ` Andrew Morton
2010-12-02  9:40       ` Mel Gorman
2010-12-02  0:29     ` Shaohua Li
2010-12-02  0:54       ` Minchan Kim
2010-12-02  1:05         ` Shaohua Li
2010-12-02  1:23           ` Minchan Kim
2010-12-02  1:36             ` Minchan Kim
2010-12-02  9:42               ` Mel Gorman
2010-12-02 15:25                 ` Minchan Kim
2010-12-02  2:39             ` Shaohua Li
2010-12-02  1:28       ` KOSAKI Motohiro
2010-12-02 10:12     ` Mel Gorman
2010-12-02 15:35       ` Minchan Kim
2010-12-02 15:42         ` Mel Gorman
2010-12-02 20:53           ` Simon Kirby
2010-12-03 12:00             ` Mel Gorman
2010-12-04 12:07               ` Simon Kirby
2010-12-06 12:03                 ` Mel Gorman
2010-12-09 23:44                   ` Simon Kirby
2010-12-10 11:32                     ` Mel Gorman
2010-12-10 23:42                       ` Simon Kirby [this message]
2010-12-14  9:52                         ` Mel Gorman
  -- strict thread matches above, loose matches on Subject: below --
2010-12-02 16:00 [PATCH] vmscan: " Minchan Kim
2010-12-03 12:11 ` Mel Gorman
2010-12-09 22:13 ` Andrew Morton
2010-12-10  3:53   ` Minchan Kim
2010-12-10 11:17   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20101210234206.GA30377@hostway.ca \
    --to=sim@hostway.ca \
    --cc=akpm@linux-foundation.org \
    --cc=kosaki.motohiro@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=mel@csn.ul.ie \
    --cc=minchan.kim@gmail.com \
    --cc=shaohua.li@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).