From: Rik van Riel <riel@redhat.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de>,
akpm@linux-foundation.org, mgorman@suse.de,
Valdis.Kletnieks@vt.edu, jirislaby@gmail.com, jslaby@suse.cz,
zkabelac@redhat.com, mm-commits@vger.kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
torvalds@linux-foundation.org
Subject: Re: [PATCH] mm,vmscan: free pages if compaction_suitable tells us to
Date: Sun, 25 Nov 2012 18:31:01 -0500
Message-ID: <50B2AA35.70803@redhat.com>
In-Reply-To: <20121125224433.GB2799@cmpxchg.org>

On 11/25/2012 05:44 PM, Johannes Weiner wrote:
> On Sun, Nov 25, 2012 at 01:29:50PM -0500, Rik van Riel wrote:
>> On Sun, 25 Nov 2012 17:57:28 +0100
>> Johannes Hirte <johannes.hirte@fem.tu-ilmenau.de> wrote:
>>
>>> With kernel 3.7-rc6 I've still problems with kswapd0 on my laptop
>>
>>> And this is most of the time. I've only observed this behavior on the
>>> laptop. Other systems don't show this.
>>
>> This suggests it may have something to do with small memory zones,
>> where we end up with the "funny" situation that the high watermark
>> (+ balance gap) for a particular zone is less than the low watermark
>> + 2<<order pages, which is the number of free pages required to keep
>> compaction_suitable happy.
>>
>> Could you try this patch?
>
> It's not quite enough because it's not reaching the conditions you
> changed, see analysis in https://lkml.org/lkml/2012/11/20/567

You are right, I forgot the preliminary loop in balance_pgdat().

> But even fixing it up (by adding the compaction_suitable() test in
> this preliminary scan over the zones and setting end_zone accordingly)
> is not enough because no actual reclaim happens at priority 12 in a
> small zone. So the number of free pages is not actually changing and
> the compaction_suitable() checks keep the loop going.

Indeed, it is a hairy situation. I tried to come up with a simple
patch, but apparently that is not enough...

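A minimal sketch of the small-zone arithmetic discussed above, as a
userspace model rather than kernel code. The struct, the helper names
and any numbers fed into them are hypothetical; the only parts taken
from this thread are the two quantities being compared: the high
watermark plus balance gap that kswapd aims for, and the low watermark
plus 2<<order pages that compaction_suitable wants to see free.

```c
#include <stdbool.h>

/* Hypothetical per-zone numbers, in pages (not read from a real zone). */
struct zone_wmarks {
	unsigned long low;         /* low watermark */
	unsigned long high;        /* high watermark */
	unsigned long balance_gap; /* extra margin kswapd adds on top of high */
};

/* Free pages kswapd aims for in this simplified model. */
static unsigned long kswapd_target(const struct zone_wmarks *w)
{
	return w->high + w->balance_gap;
}

/* Free pages the compaction_suitable free-page test asks for. */
static unsigned long compaction_need(const struct zone_wmarks *w, int order)
{
	return w->low + (2UL << order);
}

/*
 * True when a zone can reach kswapd's target and still be short of
 * what compaction needs, which is the "funny" small-zone case.
 */
static bool target_below_compaction_need(const struct zone_wmarks *w, int order)
{
	return kswapd_target(w) < compaction_need(w, order);
}
```

With made-up values low=40, high=60, balance_gap=40 and order 9, the
target is 100 pages while compaction would want 1064, so the zone
counts as balanced yet compaction stays unhappy. Larger zones, where
the high watermark dwarfs 2<<order, would not hit this case.
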
> The problem is fairly easy to reproduce, by the way. Just boot with
> mem=800M to have a relatively small lowmem reserve in the DMA zone.
> Fill it up with page cache, then allocate transparent huge pages.
>
> With your patch and my fix to the preliminary zone loop, there won't
> be any hung task warnings anymore because kswapd actually calls
> shrink_slab() and there is a rescheduling point in there, but it still
> loops forever.
>
> It also seems a bit aggressive to try to balance a small zone like DMA
> for a huge page when it's not a GFP_DMA allocation, but none of these
> checks actually take the classzone into account. Do we have any
> agreement over what this whole thing is supposed to be doing?

It is supposed to free memory, in order to:

1) allow allocations to succeed, and
2) balance memory pressure between zones

I think the compaction_suitable check in the final loop
over the zones is backwards.

We need to loop back to the start if compaction_suitable
returns COMPACT_SKIPPED for _every_ zone in the pgdat.

Does that sound reasonable?

I'll whip up a patch.

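The rule proposed above can be sketched roughly as follows. This is a
userspace model under stated assumptions, not the patch itself:
struct zone_model, model_compaction_suitable() and should_loop_again()
are hypothetical names, the enum is a local stand-in for the kernel's
result codes, and the suitability test is reduced to the free-page
check only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Local stand-in for the kernel's result codes; only two are modelled. */
enum compact_result { COMPACT_SKIPPED, COMPACT_CONTINUE };

/* Hypothetical, simplified view of one zone in the pgdat. */
struct zone_model {
	const char *name;
	unsigned long free_pages;
	unsigned long low_wmark;
};

/* Simplified stand-in for compaction_suitable: free-page test only. */
static enum compact_result
model_compaction_suitable(const struct zone_model *z, int order)
{
	unsigned long needed = z->low_wmark + (2UL << order);

	return z->free_pages >= needed ? COMPACT_CONTINUE : COMPACT_SKIPPED;
}

/*
 * Loop back to the start only when compaction would be skipped in
 * _every_ zone. One usable zone is enough to stop looping.
 */
static bool should_loop_again(const struct zone_model *zones, size_t nr,
			      int order)
{
	for (size_t i = 0; i < nr; i++) {
		if (model_compaction_suitable(&zones[i], order) != COMPACT_SKIPPED)
			return false;
	}
	return true;
}
```

In this model, a tiny DMA zone that can never free enough pages no
longer keeps the loop going by itself, as long as some other zone in
the node has room for compaction to run.
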
--
All rights reversed