Re: [PATCH 02/10] mm, compaction: report compaction as contended only due to lock contention

From: Vlastimil Babka <vbabka@suse.cz>
To: Minchan Kim <minchan@kernel.org>
Cc: David Rientjes <rientjes@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Greg Thelen <gthelen@google.com>, Mel Gorman <mgorman@suse.de>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Michal Nazarewicz <mina86@mina86.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Christoph Lameter <cl@linux.com>, Rik van Riel <riel@redhat.com>
Subject: Re: [PATCH 02/10] mm, compaction: report compaction as contended only due to lock contention
Date: Fri, 20 Jun 2014 13:47:32 +0200	[thread overview]
Message-ID: <53A41F54.8000501@suse.cz> (raw)
In-Reply-To: <20140613024005.GA8704@gmail.com>

On 06/13/2014 04:40 AM, Minchan Kim wrote:
> On Thu, Jun 12, 2014 at 04:02:04PM +0200, Vlastimil Babka wrote:
>> On 06/12/2014 01:49 AM, Minchan Kim wrote:
>> >On Wed, Jun 11, 2014 at 02:22:30PM +0200, Vlastimil Babka wrote:
>> >>On 06/11/2014 03:10 AM, Minchan Kim wrote:
>> >>>On Mon, Jun 09, 2014 at 11:26:14AM +0200, Vlastimil Babka wrote:
>> >>>>Async compaction aborts when it detects zone lock contention or need_resched()
>> >>>>is true. David Rientjes has reported that in practice, most direct async
>> >>>>compactions for THP allocation abort due to need_resched(). This means that a
>> >>>>second direct compaction is never attempted, which might be OK for a page
>> >>>>fault, but khugepaged is intended to attempt a sync compaction in such cases,
>> >>>>and currently it won't.
>> >>>>
>> >>>>This patch replaces "bool contended" in compact_control with an enum that
>> >>>>distinguishes between aborting due to need_resched() and aborting due to lock
>> >>>>contention. This allows propagating the abort through all compaction functions
>> >>>>as before, but declaring the direct compaction as contended only when lock
>> >>>>contention has been detected.
>> >>>>
>> >>>>As a result, khugepaged will proceed with a second sync compaction as intended,
>> >>>>when the preceding async compaction aborted due to need_resched().
>> >>>
>> >>>You said "second direct compaction is never attempted, which might be OK
>> >>>for a page fault" and said "khugepaged is intended to attempt a sync compaction",
>> >>>so I feel you want to treat khugepaged specially, unlike other direct compaction
>> >>>(e.g., page fault).
>> >>
>> >>Well khugepaged is my primary concern, but I imagine there are other
>> >>direct compaction users besides THP page fault and khugepaged.
>> >>
>> >>>With this patch, direct compaction takes care only of lock contention, not
>> >>>rescheduling, which raises a question.
>> >>>
>> >>>Is it really okay not to consider need_resched in direct compaction?
>> >>
>> >>It still considers need_resched() to back off from async compaction.
>> >>It's only about signaling contended_compaction back to
>> >>__alloc_pages_slowpath(). There's this code executed after the
>> >>first, async compaction fails:
>> >>
>> >>/*
>> >>  * It can become very expensive to allocate transparent hugepages at
>> >>  * fault, so use asynchronous memory compaction for THP unless it is
>> >>  * khugepaged trying to collapse.
>> >>  */
>> >>if (!(gfp_mask & __GFP_NO_KSWAPD) || (current->flags & PF_KTHREAD))
>> >>         migration_mode = MIGRATE_SYNC_LIGHT;
>> >>
>> >>/*
>> >>  * If compaction is deferred for high-order allocations, it is because
>> >>  * sync compaction recently failed. If this is the case and the caller
>> >>  * requested a movable allocation that does not heavily disrupt the
>> >>  * system then fail the allocation instead of entering direct reclaim.
>> >>  */
>> >>if ((deferred_compaction || contended_compaction) &&
>> >>                                         (gfp_mask & __GFP_NO_KSWAPD))
>> >>         goto nopage;
>> >>
>> >>Both THP page fault and khugepaged use __GFP_NO_KSWAPD. The first
>> >>if() decides whether the second attempt will be sync (for
>> >>khugepaged) or async (page fault). The second if() decides that if
>> >>compaction was contended, then there won't be any second attempt
>> >>(and reclaim) at all. Counting need_resched() as contended in this
>> >>case is bad for khugepaged. Even for page fault it means no direct
>> >
>> >I agree khugepaged shouldn't take need_resched into account, or even lock
>> >contention, because running it is the result of the admin's decision.
>> >If it hurts system performance, the admin should adjust the khugepaged knobs.
>> >
>> >>reclaim and a second async compaction. David says need_resched()
>> >>occurs so often that it is a poor heuristic to decide this.
>> >
>> >But page fault is a bit different. Inherently, high-order allocation
>> >(i.e., above PAGE_ALLOC_COSTLY_ORDER) is fragile, so every caller
>> >should keep that in mind and prepare a fallback plan (e.g., 4K allocation),
>> >so direct reclaim/compaction should prioritize latency over
>> >success ratio.
>> 
>> Yes, it's a rather delicate balance. But the plan is now to try to
>> balance this differently than by using need_resched.
>> 
>> >If need_resched in the second attempt (i.e., synchronous compaction) is almost
>> >always true, it means the process has consumed its timeslice, so it shouldn't be
>> >greedy and should give the CPU to others.
>> 
>> Synchronous compaction uses cond_resched() so that's fine I think?
> 
> Sorry for not being clear. I'm posting this clarification before taking
> a holiday break. :)
> 
> When a THP page fault occurs and finds that rescheduling is needed during async
> direct compaction, it goes to "nopage" and falls back to a 4K page.
> That's fine with me.
> 
> Another topic: I couldn't find any cond_resched. Anyway, that could be
> another patch.
> 

Thanks for the explanation. I'll include a cond_resched() at the level of
try_to_compact_pages() where it fits better, so it's not necessary in the place you
suggested. This should solve the "don't be greedy" problem. I will not yet include
the "bail out for latency" part because we are now slowly moving towards removing
need_resched() as a condition for stopping compaction, and this would on the contrary
extend it to prevent direct reclaim as well. David's data suggests that compaction often
bails out due to need_resched(), so this would reduce the amount of direct reclaim and I
don't want to touch that area in this series :)
