From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Andy Whitcroft <apw@shadowen.org>
Cc: Mel Gorman <mel@skynet.ie>,
	Nicolas Mailhot <nicolas.mailhot@laposte.net>,
	Christoph Lameter <clameter@sgi.com>,
	akpm@linux-foundation.org,
	Linux Memory Management List <linux-mm@kvack.org>
Subject: Re: [PATCH 1/2] Have kswapd keep a minimum order free other than order-0
Date: Fri, 18 May 2007 12:25:00 +1000
Message-ID: <464D0E7C.5050509@yahoo.com.au>
In-Reply-To: <464C48F1.3060903@shadowen.org>

Andy Whitcroft wrote:
> Nick Piggin wrote:

>>>order-0 alloc
>>>watermark hit => wake kswapd
>>>order-0 alloc            kswapd reclaiming order 0
>>>order-0 alloc            kswapd reclaiming order 0
>>>order-3 alloc => kick kswapd for order 3
>>>order-0 alloc            kswapd reclaiming order 0
>>>order-3 alloc            kswapd reclaiming order 0
>>>order-3 alloc            kswapd reclaiming order 0
>>>order-3 alloc => highorder mark hit, fail
>>>
>>>kswapd will keep reclaiming at order-0 until it completes a reclaim cycle,
>>>spots the new order and starts over again. So there is a potentially
>>>sizable window there where problems can hit. Right?
>>
>>Take a look at the code. wakeup_kswapd and __alloc_pages.
>>
>>First, assume the zone is above high watermarks for order-0 and order-1.
>>order-0 allocs...
>>order-1 low watermark hit => don't care, not allocing order-1
>>order-0 low watermark hit => wake kswapd reclaim order 0
>>order-1 alloc => wakeup_kswapd raises kswapd_max_order to 1
>>order-1 allocs continue to succeed until the min watermark is hit
>>order-1 *atomic* allocs continue until the atomic reserve is hit
>>order-1 memalloc allocs continue until no more order-1 pages left.
> 
> 
> This represents the ideal.  However we never consider the reserves at
> order-1 unless we get an order-1 allocation.  With lots of order-0
> allocations (the norm) we can run the order-1 availability well below
> even the atomic reserve without anyone noticing, while the total reserve
> is above the order-0 low watermark.

Yes, but my reply was addressing the misconception that kswapd never
has its reclaim-order updated while it is reclaiming for a lower order.
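
The order update described above can be sketched in userspace C. This is
only a toy model, loosely following what wakeup_kswapd() does with
kswapd_max_order; struct toy_pgdat, toy_wakeup_kswapd() and
toy_kswapd_take_order() are made-up names, and the watermark_ok argument
stands in for the real low-watermark check.

```c
#include <stdbool.h>

/* Toy stand-in for per-node state; this is NOT the kernel's pg_data_t. */
struct toy_pgdat {
	int kswapd_max_order;	/* highest order requested since kswapd last looked */
	bool kswapd_running;	/* true once kswapd has been kicked */
};

/*
 * Assumed model of the wakeup path: the recorded order is raised even
 * when kswapd is already busy reclaiming for a lower order.
 */
void toy_wakeup_kswapd(struct toy_pgdat *pgdat, int order, bool watermark_ok)
{
	if (watermark_ok)
		return;		/* still above the low watermark: nothing to do */
	if (pgdat->kswapd_max_order < order)
		pgdat->kswapd_max_order = order;
	pgdat->kswapd_running = true;	/* the real code wakes a wait queue */
}

/* kswapd picks up the pending order and resets it for the next cycle. */
int toy_kswapd_take_order(struct toy_pgdat *pgdat)
{
	int order = pgdat->kswapd_max_order;

	pgdat->kswapd_max_order = 0;
	return order;
}
```

In this model an order-1 wakeup that arrives while kswapd is working on
order 0 is not lost: it is recorded and seen the next time kswapd
refreshes its target order.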

It is by design that we don't make order-0 allocations notice order-1
watermarks, so if there is some problem with that, then that is what
should be changed, rather than randomly breaking the watermarking code.
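
To make the per-order behaviour concrete, here is a simplified userspace
sketch modelled on the shape of zone_watermark_ok(). It is an
approximation, not the kernel function: struct toy_zone, TOY_MAX_ORDER and
the toy_* helpers are invented for illustration, and lowmem_reserve and the
ALLOC_HIGH / ALLOC_HARDER adjustments are left out.

```c
#include <stdbool.h>

#define TOY_MAX_ORDER 4

/* Toy zone: nr_free[o] counts free blocks of 2^o pages. Not struct zone. */
struct toy_zone {
	unsigned long nr_free[TOY_MAX_ORDER];
};

/* Total free pages across all orders. */
long toy_free_pages(const struct toy_zone *z)
{
	long total = 0;

	for (int o = 0; o < TOY_MAX_ORDER; o++)
		total += (long)(z->nr_free[o] << o);
	return total;
}

/* Simplified watermark check for an allocation of the given order. */
bool toy_watermark_ok(const struct toy_zone *z, int order, long mark)
{
	long min = mark;
	long free_pages = toy_free_pages(z) - (1L << order) + 1;

	if (free_pages <= min)
		return false;
	for (int o = 0; o < order; o++) {
		/* Blocks smaller than the request cannot satisfy it. */
		free_pages -= (long)(z->nr_free[o] << o);
		/* Require fewer free pages at each higher order. */
		min >>= 1;
		if (free_pages <= min)
			return false;
	}
	return true;
}
```

With 100 free pages all held as order-0 blocks and a mark of 32, the
order-0 check passes while the order-1 check fails. An order-0 allocation
only ever runs the first test, so it never sees the order-1 shortfall.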


>  Here kswapd has been idle as there
> is only order-0 activity and we have sufficient of those.  THEN an
> order-1 comes in, we are below the order-1 low watermarks, we wake
> kswapd, and retry and discover we are below the atomic threshold and
> _fail_ the allocation.

And that is by design because we don't want to have order-1 pages free
if there are only order-0 allocations.

Anyway, atomic allocations are able to fail gracefully, in which case
kswapd will be kicked for next time. Non-atomic allocations can enter
direct reclaim, so it isn't the end of the world.


>>There really is (or should be) a proper watermarking system in place that
>>provides the right buffering for higher order allocations.
> 
> 
> I think that this is "should be", not "is".

Well you also said earlier that our problems are due to higher order
watermarks being too aggressive. So I think what is needed is to
actually work out what the real problem is first.


>>>I believe it failed to work due to a combination of kswapd reclaiming at
>>>the wrong order for a while and the fact that the watermarks are pretty
>>>aggressive when it comes to higher orders. I'm trying to think of
>>>alternative fixes but keep coming back to the current fix using
>>>!(alloc_flags & ALLOC_CPUSET) to allow !wait allocations to succeed if
>>>the memory is there and above min watermarks at order-0.
>>
>>kswapd reclaiming at the wrong order should be a bug. It should start
>>reclaiming at the right order as soon as an allocation (atomic or not)
>>goes through the "start reclaiming now" watermark.
>>
>>Now this is just looking at mainline code that has the kswapd_max_order,
>>and kswapd doesn't actually reclaim "at" any order -- it just uses the
>>kswapd_max_order to know when the required "stop reclaiming now" marks
>>have been hit. If lumpy reclaim is not reclaiming at the right order,
>>then it means it isn't refreshing from kswapd_max_order enough.
> 
> 
> Yes, I believe all of this is working as designed.  The problem is that
> we treat order-0 and order-1 allocations as independent.  We do not take
> into account that we split order-1's to make order-0.  We do not check
> the order-1 reserve for order 0 and so do not wake kswapd early enough.
> It is very hard, given the interdependent nature of the current
> calculation, to detect transitions at _other_ orders when we allocate at
> any specific order.

Breaking the watermark code then adding a ridiculous hack to pin the
reclaim order to the highest created kmem cache is the wrong way to
go about this.

There are a number of right ways to help with this problem you describe.
One would be to *raise* higher order watermarks. Another would be to
have some decaying check-this-order-watermark-on-alloc counter in the
zone.
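
One possible reading of the decaying-counter idea, sketched in userspace C.
Everything here is hypothetical: struct toy_zone_hint, its fields,
TOY_HINT_TTL and toy_order_to_check() do not exist in the kernel, and the
decay length is arbitrary. The intent is that a recent higher-order request
makes the next few smaller allocations check the higher-order watermark
too, after which the hint expires.

```c
/* Hypothetical per-zone state; neither field exists in struct zone. */
struct toy_zone_hint {
	int check_order;	/* highest order seen recently */
	int check_ttl;		/* allocations left before the hint expires */
};

#define TOY_HINT_TTL 3		/* arbitrary decay length, for illustration */

/*
 * Return the order whose watermark this allocation should be checked
 * against, updating the hint as a side effect.
 */
int toy_order_to_check(struct toy_zone_hint *h, int alloc_order)
{
	/* Smaller request: borrow the remembered order while it lasts. */
	if (alloc_order < h->check_order && h->check_ttl > 0) {
		int order = h->check_order;

		if (--h->check_ttl == 0)
			h->check_order = 0;
		return order;
	}
	/* A higher-order request at least as large refreshes the hint. */
	if (alloc_order > 0) {
		h->check_order = alloc_order;
		h->check_ttl = TOY_HINT_TTL;
	}
	return alloc_order;
}
```

After one order-1 request, the next three order-0 allocations are checked
against the order-1 watermark, and the fourth goes back to order 0.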

All this higher order allocation stuff had better _really_ be worth it...

-- 
SUSE Labs, Novell Inc.
