public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Vlastimil Babka <vbabka@suse.cz>
To: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm:;
Subject: Re: [PATCH v5 07/14] mm, compaction: khugepaged should not give up due to need_resched()
Date: Tue, 29 Jul 2014 11:45:20 +0200	[thread overview]
Message-ID: <53D76D30.3000106@suse.cz> (raw)
In-Reply-To: <alpine.DEB.2.02.1407281748110.8998@chino.kir.corp.google.com>

On 07/29/2014 02:59 AM, David Rientjes wrote:
> On Mon, 28 Jul 2014, Vlastimil Babka wrote:
>
>> diff --git a/include/linux/compaction.h b/include/linux/compaction.h
>> index b2e4c92..60bdf8d 100644
>> --- a/include/linux/compaction.h
>> +++ b/include/linux/compaction.h
>> @@ -13,6 +13,14 @@
>>   /* The full zone was compacted */
>>   #define COMPACT_COMPLETE	4
>>
>> +/* Used to signal whether compaction detected need_sched() or lock contention */
>> +/* No contention detected */
>> +#define COMPACT_CONTENDED_NONE	0
>> +/* Either need_sched() was true or fatal signal pending */
>> +#define COMPACT_CONTENDED_SCHED	1
>> +/* Zone lock or lru_lock was contended in async compaction */
>> +#define COMPACT_CONTENDED_LOCK	2
>> +
>
> Make this an enum?

I tried originally, but then I would have to define it elsewhere 
(mm/internal.h I think) together with compact_control. I didn't think it 
was worth the extra pollution of shared header, when the return codes 
are also #define and we might still get rid of need_resched() one day.

>> @@ -223,9 +223,21 @@ static void update_pageblock_skip(struct compact_control *cc,
>>   }
>>   #endif /* CONFIG_COMPACTION */
>>
>> -static inline bool should_release_lock(spinlock_t *lock)
>> +static int should_release_lock(spinlock_t *lock)
>>   {
>> -	return need_resched() || spin_is_contended(lock);
>> +	/*
>> +	 * Sched contention has higher priority here as we may potentially
>> +	 * have to abort whole compaction ASAP. Returning with lock contention
>> +	 * means we will try another zone, and further decisions are
>> +	 * influenced only when all zones are lock contended. That means
>> +	 * potentially missing a lock contention is less critical.
>> +	 */
>> +	if (need_resched())
>> +		return COMPACT_CONTENDED_SCHED;
>> +	else if (spin_is_contended(lock))
>> +		return COMPACT_CONTENDED_LOCK;
>> +	else
>> +		return COMPACT_CONTENDED_NONE;
>
> I would avoid the last else statement and just return
> COMPACT_CONTENDED_NONE.

OK, checkpatch agrees.

>> @@ -1154,11 +1168,11 @@ static unsigned long compact_zone_order(struct zone *zone, int order,
>>   	INIT_LIST_HEAD(&cc.migratepages);
>>
>>   	ret = compact_zone(zone, &cc);
>> +	*contended = cc.contended;
>>
>>   	VM_BUG_ON(!list_empty(&cc.freepages));
>>   	VM_BUG_ON(!list_empty(&cc.migratepages));
>>
>> -	*contended = cc.contended;
>
> Not sure why, but ok :)

Oversight due to how this patch evolved :)

>>   	return ret;
>>   }
>>
>> @@ -1171,14 +1185,15 @@ int sysctl_extfrag_threshold = 500;
>>    * @gfp_mask: The GFP mask of the current allocation
>>    * @nodemask: The allowed nodes to allocate from
>>    * @mode: The migration mode for async, sync light, or sync migration
>> - * @contended: Return value that is true if compaction was aborted due to lock contention
>> + * @contended: Return value that determines if compaction was aborted due to
>> + *	       need_resched() or lock contention
>>    * @candidate_zone: Return the zone where we think allocation should succeed
>>    *
>>    * This is the main entry point for direct page compaction.
>>    */
>>   unsigned long try_to_compact_pages(struct zonelist *zonelist,
>>   			int order, gfp_t gfp_mask, nodemask_t *nodemask,
>> -			enum migrate_mode mode, bool *contended,
>> +			enum migrate_mode mode, int *contended,
>>   			struct zone **candidate_zone)
>>   {
>>   	enum zone_type high_zoneidx = gfp_zone(gfp_mask);
>> @@ -1188,6 +1203,9 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
>>   	struct zone *zone;
>>   	int rc = COMPACT_DEFERRED;
>>   	int alloc_flags = 0;
>> +	bool all_zones_lock_contended = true; /* init true for &= operation */
>> +
>> +	*contended = COMPACT_CONTENDED_NONE;
>>
>>   	/* Check if the GFP flags allow compaction */
>>   	if (!order || !may_enter_fs || !may_perform_io)
>> @@ -1201,13 +1219,20 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
>>   	for_each_zone_zonelist_nodemask(zone, z, zonelist, high_zoneidx,
>>   								nodemask) {
>>   		int status;
>> +		int zone_contended;
>>
>>   		if (compaction_deferred(zone, order))
>>   			continue;
>>
>>   		status = compact_zone_order(zone, order, gfp_mask, mode,
>> -						contended);
>> +							&zone_contended);
>>   		rc = max(status, rc);
>> +		/*
>> +		 * It takes at least one zone that wasn't lock contended
>> +		 * to turn all_zones_lock_contended to false.
>> +		 */
>> +		all_zones_lock_contended &=
>> +			(zone_contended == COMPACT_CONTENDED_LOCK);
>
> Eek, does this always work?  COMPACT_CONTENDED_LOCK is 0x2 and

(zone_contended == COMPACT_CONTENDED_LOCK) is a bool so it should work?

> all_zones_lock_contended is a bool initialized to true.  I'm fairly
> certain you'd get better code generation if you defined
> all_zones_lock_contended as
>
> 	int all_zones_lock_contended = COMPACT_CONTENDED_LOCK
>
> from the start and the source code would be more clear.

Yeah that would look nicer, thanks for the suggestion.

>>
>>   		/* If a normal allocation would succeed, stop compacting */
>>   		if (zone_watermark_ok(zone, order, low_wmark_pages(zone), 0,
>> @@ -1220,8 +1245,20 @@ unsigned long try_to_compact_pages(struct zonelist *zonelist,
>>   			 * succeeds in this zone.
>>   			 */
>>   			compaction_defer_reset(zone, order, false);
>> -			break;
>> -		} else if (mode != MIGRATE_ASYNC) {
>> +			/*
>> +			 * It is possible that async compaction aborted due to
>> +			 * need_resched() and the watermarks were ok thanks to
>> +			 * somebody else freeing memory. The allocation can
>> +			 * however still fail so we better signal the
>> +			 * need_resched() contention anyway.
>> +			 */
>
> Makes sense because the page allocator still tries to allocate the page
> even if this is returned, but it's not very clear in the comment.

OK

>> @@ -2660,15 +2660,36 @@ rebalance:
>>   	if (!(gfp_mask & __GFP_NO_KSWAPD) || (current->flags & PF_KTHREAD))
>>   		migration_mode = MIGRATE_SYNC_LIGHT;
>>
>> -	/*
>> -	 * If compaction is deferred for high-order allocations, it is because
>> -	 * sync compaction recently failed. In this is the case and the caller
>> -	 * requested a movable allocation that does not heavily disrupt the
>> -	 * system then fail the allocation instead of entering direct reclaim.
>> -	 */
>> -	if ((deferred_compaction || contended_compaction) &&
>> -						(gfp_mask & __GFP_NO_KSWAPD))
>> -		goto nopage;
>
> Hmm, this check will have unfortunately changed in the latest mmotm due to
> mm-thp-restructure-thp-avoidance-of-light-synchronous-migration.patch.

I think you were changing (and moving around) a different check so there 
would be merge conflicts but no semantic problem.

>> +	/* Checks for THP-specific high-order allocations */
>> +	if (gfp_mask & __GFP_NO_KSWAPD) {
>> +		/*
>> +		 * If compaction is deferred for high-order allocations, it is
>> +		 * because sync compaction recently failed. If this is the case
>> +		 * and the caller requested a THP allocation, we do not want
>> +		 * to heavily disrupt the system, so we fail the allocation
>> +		 * instead of entering direct reclaim.
>> +		 */
>> +		if (deferred_compaction)
>> +			goto nopage;
>> +
>> +		/*
>> +		 * In all zones where compaction was attempted (and not
>> +		 * deferred or skipped), lock contention has been detected.
>> +		 * For THP allocation we do not want to disrupt the others
>> +		 * so we fallback to base pages instead.
>> +		 */
>> +		if (contended_compaction == COMPACT_CONTENDED_LOCK)
>> +			goto nopage;
>> +
>> +		/*
>> +		 * If compaction was aborted due to need_resched(), we do not
>> +		 * want to further increase allocation latency, unless it is
>> +		 * khugepaged trying to collapse.
>> +		 */
>> +		if (contended_compaction == COMPACT_CONTENDED_SCHED
>> +			&& !(current->flags & PF_KTHREAD))
>> +			goto nopage;
>> +	}
>>
>>   	/* Try direct reclaim and then allocating */
>>   	page = __alloc_pages_direct_reclaim(gfp_mask, order,


  reply	other threads:[~2014-07-29  9:45 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-28 13:11 [PATCH v5 00/14] compaction: balancing overhead and success rates Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 01/14] mm, THP: don't hold mmap_sem in khugepaged when allocating THP Vlastimil Babka
2014-07-28 23:39   ` David Rientjes
2014-07-28 13:11 ` [PATCH v5 02/14] mm, compaction: defer each zone individually instead of preferred zone Vlastimil Babka
2014-07-28 23:59   ` David Rientjes
2014-07-29  9:02     ` Vlastimil Babka
2014-07-29  6:38   ` Joonsoo Kim
2014-07-29  9:12     ` Vlastimil Babka
2014-07-30 16:22       ` Vlastimil Babka
2014-08-01  8:51         ` Vlastimil Babka
2014-08-04  6:45           ` Joonsoo Kim
2014-07-28 13:11 ` [PATCH v5 03/14] mm, compaction: do not count compact_stall if all zones skipped compaction Vlastimil Babka
2014-07-29  0:04   ` David Rientjes
2014-07-28 13:11 ` [PATCH v5 04/14] mm, compaction: do not recheck suitable_migration_target under lock Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 05/14] mm, compaction: move pageblock checks up from isolate_migratepages_range() Vlastimil Babka
2014-07-29  0:29   ` David Rientjes
2014-07-29  9:27     ` Vlastimil Babka
2014-07-29 23:02       ` David Rientjes
2014-07-29 23:21         ` Kirill A. Shutemov
2014-07-29 23:51           ` David Rientjes
2014-07-30  9:27             ` Vlastimil Babka
2014-07-30  9:39         ` Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 06/14] mm, compaction: reduce zone checking frequency in the migration scanner Vlastimil Babka
2014-07-29  0:44   ` David Rientjes
2014-07-29  9:31     ` Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 07/14] mm, compaction: khugepaged should not give up due to need_resched() Vlastimil Babka
2014-07-29  0:59   ` David Rientjes
2014-07-29  9:45     ` Vlastimil Babka [this message]
2014-07-29 22:57       ` David Rientjes
2014-07-29  6:53   ` Joonsoo Kim
2014-07-29  7:31     ` David Rientjes
2014-07-29  8:27       ` Joonsoo Kim
2014-07-29  9:16         ` David Rientjes
2014-07-29  9:49       ` Vlastimil Babka
2014-07-29 22:53         ` David Rientjes
2014-07-30  9:08           ` Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 08/14] mm, compaction: periodically drop lock and restore IRQs in scanners Vlastimil Babka
2014-07-29  1:03   ` David Rientjes
2014-07-28 13:11 ` [PATCH v5 09/14] mm, compaction: skip rechecks when lock was already held Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 10/14] mm, compaction: remember position within pageblock in free pages scanner Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 11/14] mm, compaction: skip buddy pages by their order in the migrate scanner Vlastimil Babka
2014-07-29  1:05   ` David Rientjes
2014-07-28 13:11 ` [PATCH v5 12/14] mm: rename allocflags_to_migratetype for clarity Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 13/14] mm, compaction: pass gfp mask to compact_control Vlastimil Babka
2014-07-28 13:11 ` [PATCH v5 14/14] mm, compaction: try to capture the just-created high-order freepage Vlastimil Babka
2014-07-29  7:34   ` Joonsoo Kim
2014-07-29 15:34     ` Vlastimil Babka
2014-07-30  8:39       ` Joonsoo Kim
2014-07-30  9:56         ` Vlastimil Babka
2014-07-30 14:19           ` Joonsoo Kim
2014-07-30 15:05             ` Vlastimil Babka

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53D76D30.3000106@suse.cz \
    --to=vbabka@suse.cz \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox