From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>, Hugh Dickins <hughd@google.com>,
	Joonsoo Kim <js1304@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Mel Gorman <mgorman@suse.de>,
	David Rientjes <rientjes@google.com>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Hillf Danton <hillf.zj@alibaba-inc.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 0/3] OOM detection rework v4
Date: Tue, 1 Mar 2016 19:14:08 +0100
Message-ID: <56D5DBF0.2020004@suse.cz>
In-Reply-To: <20160301133846.GF9461@dhcp22.suse.cz>

On 03/01/2016 02:38 PM, Michal Hocko wrote:
> $ grep compact /proc/vmstat
> compact_migrate_scanned 113983
> compact_free_scanned 1433503
> compact_isolated 134307
> compact_stall 128
> compact_fail 26
> compact_success 102
> compact_kcompatd_wake 0
>
> So the whole load has done direct compaction only 128 times during
> that test. This doesn't sound like much to me:
> $ grep allocstall /proc/vmstat
> allocstall 1061
>
> we entered the direct reclaim much more but most of the load will be
> order-0 so this might be still ok. So I've tried the following:
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 1993894b4219..107d444afdb1 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2910,6 +2910,9 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>  						mode, contended_compaction);
>  	current->flags &= ~PF_MEMALLOC;
>  
> +	if (order > 0 && order <= PAGE_ALLOC_COSTLY_ORDER)
> +		trace_printk("order:%d gfp_mask:%pGg compact_result:%lu\n", order, &gfp_mask, compact_result);
> +
>  	switch (compact_result) {
>  	case COMPACT_DEFERRED:
>  		*deferred_compaction = true;
>
> And the result was:
> $ cat /debug/tracing/trace_pipe | tee ~/trace.log
> gcc-8707 [001] .... 137.946370: __alloc_pages_direct_compact: order:2 gfp_mask:GFP_KERNEL_ACCOUNT|__GFP_NOTRACK compact_result:1
> gcc-8726 [000] .... 138.528571: __alloc_pages_direct_compact: order:2 gfp_mask:GFP_KERNEL_ACCOUNT|__GFP_NOTRACK compact_result:1
>
> this shows that order-2 memory pressure is not overly high in my
> setup. Both attempts ended up COMPACT_SKIPPED which is interesting.
>
> So I went back to 800M of hugetlb pages and tried again. It took ages
> so I have interrupted that after one hour (there was still no OOM). The
> trace log is quite interesting regardless:
> $ wc -l ~/trace.log
> 371 /root/trace.log
>
> $ grep compact_stall /proc/vmstat
> compact_stall 190
>
> so the compaction was still ignored more than actually invoked for
> !costly allocations:
> sed 's@.*order:\([[:digit:]]\).* compact_result:\([[:digit:]]\)@\1 \2@' ~/trace.log | sort | uniq -c
> 190 2 1
> 122 2 3
> 59 2 4
>
> #define COMPACT_SKIPPED 1
> #define COMPACT_PARTIAL 3
> #define COMPACT_COMPLETE 4
>
> that means that compaction is not even tried in half of the cases! This
> doesn't sound right to me, especially when we are talking about
> <= PAGE_ALLOC_COSTLY_ORDER requests which are implicitly nofail, because
> then we simply rely on the order-0 reclaim to automagically form higher
> blocks. This might indeed work when we retry many times but I guess this
> is not a good approach. It leads to excessive reclaim and the stall
> for allocation can be really large.
>
> One of the suspicious places is __compaction_suitable which does order-0
> watermark check (increased by 2<<order). I have put another trace_printk
> there and it clearly pointed out this was the case.

Yes, compaction has historically been quite careful to avoid making low-memory
conditions worse, and to avoid doing work if it doesn't look like it can
ultimately satisfy the allocation (so not having enough base pages means that
compacting them is considered pointless). This aspect of preventing
non-zero-order OOMs is somewhat unexpected :)
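
For reference, the order-0 gate you describe can be modelled in user space
roughly as below. This is only a sketch under assumptions: the function name
is made up, I'm assuming the base is the zone's low watermark, and the model
ignores lowmem_reserve and the alloc_flags adjustments that the real
zone_watermark_ok() applies. It is not the kernel code itself.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical user-space model of the order-0 check in
 * __compaction_suitable(). Returns true when compaction would be
 * allowed to proceed, false when it would be skipped.
 *
 * Assumption: the base watermark is the zone's low watermark, raised
 * by (2UL << order) to leave room for the temporary page copies made
 * during migration.
 */
bool model_order0_gate_passes(unsigned long free_pages,
			      unsigned long low_wmark, int order)
{
	unsigned long required;

	/* Sanity bound for the sketch only; real orders are small. */
	assert(order >= 0 && order < 11);

	required = low_wmark + (2UL << order);

	/* Simplified: free pages must strictly exceed the raised mark. */
	return free_pages > required;
}
```

So for an order-2 request the zone needs more than low_wmark + 8 free base
pages in this model, otherwise the attempt ends up as COMPACT_SKIPPED without
compaction doing any work.
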
> So I have tried the following:
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 4d99e1f5055c..7364e48cf69a 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1276,6 +1276,9 @@ static unsigned long __compaction_suitable(struct zone *zone, int order,
>  								alloc_flags))
>  		return COMPACT_PARTIAL;
>  
> +	if (order <= PAGE_ALLOC_COSTLY_ORDER)
> +		return COMPACT_CONTINUE;
> +
>  	/*
>  	 * Watermarks for order-0 must be met for compaction. Note the 2UL.
>  	 * This is because during migration, copies of pages need to be
>
> and retried the same test (without huge pages):
> $ time make -j20 > /dev/null
>
> real 8m46.626s
> user 14m15.823s
> sys 2m45.471s
>
> the time increased but I haven't checked how stable the result is.
>
> $ grep compact /proc/vmstat
> compact_migrate_scanned 139822
> compact_free_scanned 1661642
> compact_isolated 139407
> compact_stall 129
> compact_fail 58
> compact_success 71
> compact_kcompatd_wake 1
>
> $ grep allocstall /proc/vmstat
> allocstall 1665
>
> this is worse because we have scanned more pages for migration but the
> overall success rate was much smaller and the direct reclaim was invoked
> more. I do not have a good theory for that and will play with this some
> more. Maybe other changes are needed deeper in the compaction code.

I was under the impression that checks similar to compaction_suitable() were
also done in compact_finished(), to stop compacting if memory got low due to
parallel activity. But I guess that was a patch from Joonsoo that didn't get
merged.

My only other theory so far is that watermark checks fail in
__isolate_free_page() when we want to grab page(s) as migration targets. I would
suggest enabling all compaction tracepoints and the migration tracepoint. Looking
at the trace will hopefully get us there faster than adding one trace_printk()
per attempt.

Once we learn all the relevant places/checks, we can think about how to
communicate to them that this compaction attempt is "important" and should
continue as long as possible even in low-memory conditions. Maybe not just a
costly order check, but we also have alloc_flags or could add something to
compact_control, etc.
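
To illustrate the last idea, here is a purely hypothetical user-space sketch.
The struct, the flag and the helper are all invented for illustration; nothing
like this exists in compact_control today. The point is only that the caller
states the importance once and each low-memory check consults it, instead of
each check testing the order on its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct compact_control. */
struct cc_sketch {
	int order;		/* order of the allocation being served */
	bool must_try_hard;	/* invented flag: attempt is "important" */
};

/*
 * Hypothetical helper that a low-memory check could call. Returns true
 * when the check should give up on this compaction attempt.
 */
bool sketch_should_bail_out(const struct cc_sketch *cc,
			    bool order0_watermark_ok)
{
	assert(cc != NULL);

	/* Enough base pages: no reason to bail out. */
	if (order0_watermark_ok)
		return false;

	/* Memory is low: only "important" attempts keep going. */
	return !cc->must_try_hard;
}
```

The allocator slow path would set the flag for the requests it cannot afford
to fail, and the checks themselves would not need to know why.
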
> I will play with this some more but I would be really interested to hear
> whether this helped Hugh with his setup. Vlastimil, Joonsoo, does this
> even make sense to you?
>
>> I was only suggesting to allocate hugetlb pages, if you preferred
>> not to reboot with artificially reduced RAM. Not an issue if you're
>> booting VMs.
>
> Ohh, I see.
>
>